Support for Hebrew in Web browsers

This note surveys support for the display of Hebrew script in several Web browsers. It covers display only; the ability to send Hebrew input via forms is not reviewed. It also surveys HTTP content negotiation features that may help a browser receive content it can process well.

Features tested

The following features were tested:

  1. Recognizing character encodings typically used for Hebrew
  2. Displaying Hebrew characters in a recognized encoding. These characters may be encoded directly or by HTML character references. This test includes only the 27 basic characters, namely the 22 letters and 5 final letters. Support for these characters is sufficient for displaying modern Hebrew, but not Yiddish, Ladino, or the taamei hamikra (cantillation marks) used in the Hebrew Bible.
  3. Applying the bi-directional algorithm. A browser that does not support the bi-directional algorithm can render correctly only visually encoded documents.
  4. Allowing overriding the bi-directional algorithm.
  5. Treatment of non-standard visual documents labeled with iso-8859-8
  6. Miscellaneous features, such as display of the title element and alternative text for images.
  7. Content negotiation: accept-language and accept-charset HTTP request headers.

You may also like to read the Very quick overview of standards related to Hebrew and the Web, for clarification of the above.
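
The 27 basic characters of item 2 can be pinned down precisely, because they occupy one contiguous Unicode range. The following is a minimal sketch that lists them and checks whether a text stays within that set. The names HEBREW_LETTERS, FINAL_LETTERS and is_basic_hebrew are my own and purely illustrative; the test documents were not generated this way.

```python
import unicodedata

# The basic Hebrew letters occupy one contiguous Unicode range,
# U+05D0 (alef) through U+05EA (tav): 27 code points in all.
HEBREW_LETTERS = [chr(cp) for cp in range(0x05D0, 0x05EA + 1)]

# The final forms carry "FINAL" in their Unicode character names.
FINAL_LETTERS = [c for c in HEBREW_LETTERS if "FINAL" in unicodedata.name(c)]


def is_basic_hebrew(text):
    """Return True if every character from the Hebrew block (U+0590..U+05FF)
    in text is one of the 27 basic letters.

    Vowel points, cantillation marks and the Yiddish ligatures also live in
    that block, so their presence makes the result False.
    """
    return all(c in HEBREW_LETTERS for c in text if "\u0590" <= c <= "\u05FF")
```

A word written with vowel points, or a Yiddish text using the double-vav ligature (U+05F0), fails the check, which matches the limitation stated in item 2.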

Browsers tested

All tests so far were conducted under English Windows 95 with Hebrew enabled. However, the results should be similar without Hebrew support in the operating system. I intend to put a test suite online to enable further testing by others.

MSIE 5.0
with "Language auto-selection" and "Hebrew text display support" installed.
Remark: The vendor claims the browser achieves the same results without a Hebrew-enabled operating system. This is a new feature in Internet Explorer; versions prior to 5.0 require Hebrew support from the operating system.
Tango 3.3.1 (This browser is no longer distributed)
with Hebrew module installed
Remark: The vendor claims the browser achieves the same results without the Hebrew-enabled add-on.
Lynx 2.8.1 and 2.7.2
Unix versions were also tested. One must use a terminal that can display Hebrew.
Netscape Communicator 4.51
Opera 3.51

Quick review of test results by browser

Internet Explorer 5.0

Internet Explorer has the best support for Hebrew among the browsers tested. It correctly displays Hebrew documents written according to the standards, and in addition makes educated guesses that allow it to render the majority of poorly written documents as their authors intended. It has rather annoying bugs concerning the title element and alternative text for images.

Internet Explorer also has a controversial feature: it places the scroll bar on the left in right-to-left documents. This is rather confusing, as users consider the scroll bar part of the browser's user interface and not part of the document. The confusion is increased because many Hebrew documents are encoded visually with left-to-right directionality, so some Hebrew documents have their scroll bars on the right.

Tango 3.3.1

Tango supports bi-directionality, and also renders visually encoded documents correctly if they are labeled with ISO-8859-8. However, it is quite far from following the standards. In particular, it does not support overriding the bi-directional algorithm.

The evaluation version that I tried in April 1999 was dated 1997, which indicates that development of this browser has effectively stopped. As a somewhat old browser, it does not support the div element, which many visually encoded documents use to align text to the right.

On the other hand, Tango has several advantages over Explorer. One is its ability to input Hebrew characters (in forms) even on non-Hebrew-aware systems (I did not test this feature). Another is its modest system requirements.

Lynx 2.8.1 and 2.7.2

The various versions of Lynx that I tested gave uniform results. Lynx does not support the bi-directional algorithm but renders visually encoded documents very well. Like the two browsers discussed above, its support for character references is excellent as well. Lynx also supports the Accept-Charset HTTP request header, which allows users to express a preference for visually encoded documents.

Unfortunately, many visually encoded documents in Hebrew use left-to-right tables to arrange text in columns. This scrambles the order of the text in Lynx, which shows the cells' content in sequence rather than side by side. Another inconvenience is that in visually encoded documents the tabbing order between links is incorrect for links that lie on the same line.

Netscape 4.51

Netscape's support for the Hebrew script is very disappointing. It does not recognize the ISO-8859-8 encoding and its variants, and does not support character references. Support for BiDi in Mozilla is in progress.

Visually encoded documents in ISO-8859-8 may be viewed in Netscape by using the "Hebrew hackers' font". This font pretends to be a Latin font but has glyphs of Hebrew characters, as if it were an ISO-8859-8 font.

This method has severe disadvantages, including the inability to correctly render accented Latin characters in a Hebrew document. Another great disadvantage is that "Hebrew hackers' fonts" are generally of poor quality and may cause damage to one's eyesight if used often.
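
The reason accented Latin characters are lost is that the Hebrew letters of ISO-8859-8 sit at byte values 0xE0 through 0xFA, exactly where ISO-8859-1 puts its characters from à to ú. The sketch below shows which Latin-1 characters a browser believes it is displaying when the font trick is in use. The function name is hypothetical and chosen only for illustration.

```python
def as_seen_with_latin_font(text):
    """Encode text as ISO-8859-8, then read the same bytes as ISO-8859-1.

    The result is what a browser that only knows Latin-1 thinks the
    document contains; the hackers' font then draws Hebrew glyphs for
    those Latin-1 characters.
    """
    return text.encode("iso8859_8").decode("latin-1")


# Visual "shalom" is irrelevant here; we only look at byte values.
# shin, lamed, vav, final mem:
SHALOM = "\u05e9\u05dc\u05d5\u05dd"
print(as_seen_with_latin_font(SHALOM))  # prints "ùìåí"
```

Conversely, a genuine "à" in the document is byte 0xE0 and is drawn as alef, so a page cannot mix Hebrew and accented Latin text under this scheme.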

Opera 3.51

Opera's support for Hebrew is non-existent. The "Hebrew hackers' font" trick works for it too. To toggle quickly between this font and the settings for Latin-based scripts, one can use a short user style sheet that assigns the hackers' font.

Test results by feature

n.a. stands for "not applicable"

Character encodings (charset parameter)

               Explorer 5.0   Tango 3.3.1   Lynx 2.7.2 and 2.8.1   Netscape 4.51   Opera 3.51
ISO-8859-1     yes            yes           yes                    yes             yes
ISO-8859-8     yes            yes           yes                    no              no
ISO-8859-8-i   yes            yes           no                     no              no
Windows-1255   yes            yes           yes                    no              no
UTF-8          yes            yes           yes                    yes             no

Explorer 5.0 and Tango 3.3.1 recognize all five character encodings tested. Lynx does not recognize ISO-8859-8-i, which is identical to ISO-8859-8 but is used to label logically encoded documents. Netscape 4.51 does not recognize any of the 8-bit encodings that allow single-byte coding of Hebrew characters, and is therefore quite useless for interpreting documents whose main language uses the Hebrew script. Opera 3.51, as an HTML 3.2 browser, supports only ISO-8859-1.
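
To illustrate what "recognizing" a charset label involves, here is a minimal sketch that reads the charset parameter of a Content-Type header and decodes a byte string accordingly. The table CODECS and the function decode_body are illustrative names of my own. Python has no alias for ISO-8859-8-i, so the sketch maps that label to the ISO-8859-8 byte table by hand, and it assumes ISO-8859-1 when no charset parameter is present.

```python
from email.message import Message

# Charset labels from the table above, mapped to Python codec names.
# ISO-8859-8-i shares its byte table with ISO-8859-8; only the implied
# ordering of the text (logical rather than visual) differs.
CODECS = {
    "iso-8859-1": "latin-1",
    "iso-8859-8": "iso8859_8",
    "iso-8859-8-i": "iso8859_8",
    "windows-1255": "cp1255",
    "utf-8": "utf-8",
}


def decode_body(content_type, body):
    """Decode body according to the charset parameter of content_type.

    Returns (text, is_visual), where is_visual is True only for the plain
    ISO-8859-8 label. A label missing from CODECS raises KeyError.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # Assumption: fall back to ISO-8859-1 when no charset is given.
    label = msg.get_content_charset("iso-8859-1")
    return body.decode(CODECS[label]), label == "iso-8859-8"
```

The same four bytes, F9 EC E5 ED, decode to the same four Hebrew letters under ISO-8859-8, ISO-8859-8-i and Windows-1255; what changes with the label is how the browser is expected to order them on screen.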

Hebrew character display in recognized encoding

              Explorer 5.0   Tango 3.3.1   Lynx 2.7.2 and 2.8.1   Netscape 4.51   Opera 3.51
direct        yes            yes           yes                    yes             n.a.
decimal       yes            yes           yes                    no              no
hexadecimal   yes            yes           yes                    no              no

Explorer 5.0, Tango 3.3.1, and Lynx support Hebrew characters both encoded directly, in the encodings that allow for that, and as numerical references. Netscape 4.51 supports directly encoded Hebrew characters in UTF-8. In the same encoding it also supports numerical references. Opera 3.51 does not support Hebrew characters at all.
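
For readers unfamiliar with the two forms of numerical reference, the following sketch converts text to decimal or hexadecimal character references, so that a Hebrew document can be written in plain ASCII. The helper name char_refs is my own; Python's html.unescape is used only to confirm that the references round-trip.

```python
import html


def char_refs(text, hexadecimal=False):
    """Replace every non-ASCII character with a numeric character reference.

    Decimal form looks like &#1488; and hexadecimal form like &#x5D0;
    (both denote the letter alef, U+05D0).
    """
    fmt = "&#x{:X};" if hexadecimal else "&#{:d};"
    return "".join(c if ord(c) < 128 else fmt.format(ord(c)) for c in text)


alef = "\u05d0"
print(char_refs(alef))                    # prints "&#1488;"
print(char_refs(alef, hexadecimal=True))  # prints "&#x5D0;"
```

Because the references name Unicode code points, they mean the same thing whatever charset the document is labeled with.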

Bi-Directionality

                                      Explorer 5.0   Tango 3.3.1
Applies bi-directional algorithm      yes            yes
Supports bi-directionality override   yes            no
Assumes ISO-8859-8 means visual       sometimes      yes

Only Explorer 5.0 and Tango 3.3.1 support the bi-directional algorithm. Explorer's implementation is more complete as it supports overriding the algorithm with the bdo element as well as with Unicode's BiDi override characters.
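
The two override mechanisms can be sketched as follows. The Unicode override characters are U+202D (left-to-right override) and U+202E (right-to-left override), each closed by U+202C (pop directional formatting); the HTML counterpart is the bdo element with a dir attribute. The function names below are hypothetical helpers, not part of any browser or library.

```python
import html

LRO = "\u202d"  # LEFT-TO-RIGHT OVERRIDE
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def override_chars(text, direction="ltr"):
    """Wrap text in Unicode BiDi override characters."""
    opener = {"ltr": LRO, "rtl": RLO}[direction]  # KeyError on bad input
    return opener + text + PDF


def override_bdo(text, direction="ltr"):
    """Wrap text in an HTML bdo element, escaping markup characters."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    return '<bdo dir="{}">{}</bdo>'.format(direction, html.escape(text))
```

Wrapping a visually ordered Hebrew line with a left-to-right override tells a browser that applies the bi-directional algorithm to leave the characters in the order in which they are stored.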

Many documents in Hebrew are written visually in ISO-8859-8 in a non-standard way, namely without explicitly overriding the bi-directional algorithm as the HTML specification requires. Explorer guesses the author's intentions. For example, if a document labeled with ISO-8859-8 starts with <html dir="rtl">, it will process the document according to the standards. Tango 3.3.1 always assumes that ISO-8859-8 means visual, and assigns left-to-right directionality to all characters. In documents labeled with ISO-8859-8 it partially honors the dir attribute, as far as alignment of text, position of list item markers, etc. are concerned.
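
To make the difference between visual and logical order concrete: in a visually encoded line the characters are stored in the order they appear on screen from left to right, so a line made up only of Hebrew letters and spaces is simply the reverse of its logical form. The sketch below handles that simplest case only and rejects anything else. It is not an implementation of the bi-directional algorithm, and the function name is my own.

```python
import unicodedata


def visual_to_logical(line):
    """Convert a visually ordered line to logical order.

    Only lines consisting of strong right-to-left characters (BiDi class
    "R", which includes the Hebrew letters) and spaces (class "WS") are
    accepted. Digits, Latin text and punctuation need the full
    bi-directional algorithm and raise ValueError here.
    """
    for c in line:
        if unicodedata.bidirectional(c) not in ("R", "WS"):
            raise ValueError("unsupported character: {!r}".format(c))
    return line[::-1]
```

Mixed lines are where the guessing described above becomes hard: a number or an English word inside a Hebrew line keeps its own left-to-right order, so a plain reversal would corrupt it.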

Miscellaneous

                      Explorer   Tango   Lynx   Netscape
<title>text</title>   no         yes     yes    no
<img alt="text">      no         yes     yes    yes
<link title="text">   n.a.       n.a.    yes    n.a.

Explorer 5.0 and Netscape 4.51 rely too much on the operating system in displaying the content of the title element. Explorer 5.0 has bugs in applying the bi-directional algorithm to alternative text for images.

Content negotiation

                  Explorer   Tango   Lynx   Netscape   Opera
Accept-Language   yes        yes     yes    yes        no
Accept-Charset    no         no      yes    no         no

With the exception of Opera 3.51, all browsers tested support the accept-language HTTP request header. Lynx allows for full configuration of this header. Tango uses iw as the language code for Hebrew, so users should add he as a user-defined code.
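
Since some browsers send iw, the older ISO 639 code for Hebrew, and others send he, a server that negotiates on language may want to treat the two alike. The following is a minimal sketch of such a parser under that assumption; the function name is hypothetical, and the treatment of q-values is simplified.

```python
def parse_accept_language(header):
    """Return language tags ordered by descending q-value.

    The obsolete tag "iw" is rewritten to "he". Entries with q=0 are
    dropped, and a malformed q-value is treated as 0.
    """
    entries = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        if tag == "iw" or tag.startswith("iw-"):
            tag = "he" + tag[2:]
        entries.append((tag, q))
    # sorted() is stable, so tags with equal q keep their header order.
    entries.sort(key=lambda e: e[1], reverse=True)
    result = []
    for tag, q in entries:
        if q > 0 and tag not in result:
            result.append(tag)
    return result
```

With this normalization, a request carrying "iw, en;q=0.5" and one carrying "he, en;q=0.5" both yield the list he, en.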

Among the browsers tested only Lynx supports configuration of the accept-charset request header. Users may use that to request documents encoded in ISO-8859-8 rather than in ISO-8859-8-i in order to show their preferences for visually encoded documents. Netscape sends an Accept-Charset header in requests but does not allow its configuration.
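
As an illustration of the preference described above, this sketch builds an Accept-Charset value that ranks ISO-8859-8 above ISO-8859-8-i. The function name and the particular q-values are my own choices for the example and do not reflect what Lynx sends by default.

```python
def accept_charset(preferences):
    """Build an Accept-Charset header value from (charset, q) pairs.

    A q-value of 1.0 is the default and is left implicit; other values
    are written as ";q=..." after the charset name.
    """
    parts = []
    for charset, q in preferences:
        if not 0.0 <= q <= 1.0:
            raise ValueError("q must be between 0 and 1")
        if q == 1.0:
            parts.append(charset)
        else:
            parts.append("{};q={:.3g}".format(charset, q))
    return ", ".join(parts)


# Prefer visually encoded documents, but accept logical ones too.
print(accept_charset([("iso-8859-8", 1.0), ("iso-8859-8-i", 0.5)]))
# prints "iso-8859-8, iso-8859-8-i;q=0.5"
```

A server that holds both a visual and a logical version of a document could use such a header to pick the one the browser renders best.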