>From: Lars Marius Garshol <[EMAIL PROTECTED]>

>This reminds me: does anyone have any pointers to information on how
>to convert visually encoded text (especially HTML, but also other
>formats) to Unicode?

There are programs that do it on the fly for Hebrew. The best one, which I 
have used myself, is HebTML, available as a free download from 
http://www.billy.co.il . The author has been working with me on testing a 
new version that supports Unicode. However, I use this app much less than 
before, because the Hebrew Internet is rapidly making the transition from 
visual to logical ordering. With IE 5.x and Mozilla supporting logical 
Hebrew, the years-old visual order is on the way out.

The conversion of visual to logical text in BiDi scripts is straightforward 
in principle: check the BiDi property of each character, and reverse every 
run of RTL characters. That means Hebrew letters reverse their order, while 
digits and Latin letters stay the same. Things get more complicated, 
however, when hyphens, paired punctuation and telephone numbers appear. You 
need a smart converter for that.
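
To make the simple case concrete, here is a minimal sketch in Python. It is 
my own illustration, not how HebTML works. It assumes an LTR base direction, 
treats only the BiDi classes "R" and "AL" as RTL, and lets whitespace between 
two RTL characters join their runs so that multi-word Hebrew comes out in the 
right word order. The function name visual_to_logical is hypothetical. It 
deliberately does nothing about hyphens, paired punctuation or telephone 
numbers, which is exactly where a smart converter is needed.

```python
import unicodedata

# Strong right-to-left BiDi classes: "R" (e.g. Hebrew), "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def is_rtl(ch):
    return unicodedata.bidirectional(ch) in RTL_CLASSES


def is_space(ch):
    # Assumption: only whitespace ("WS") may sit inside an RTL run.
    return unicodedata.bidirectional(ch) == "WS"


def visual_to_logical(line):
    """Naive sketch: reverse each run of RTL characters, leave the rest."""
    out = []
    i, n = 0, len(line)
    while i < n:
        if not is_rtl(line[i]):
            # Digits, Latin letters, punctuation: copied unchanged.
            out.append(line[i])
            i += 1
            continue
        # Grow the run over RTL characters and the spaces between them;
        # 'end' always points just past the last RTL character seen.
        j, end = i, i + 1
        while j < n and (is_rtl(line[j]) or is_space(line[j])):
            if is_rtl(line[j]):
                end = j + 1
            j += 1
        out.append(line[i:end][::-1])
        i = end
    return "".join(out)


# The Hebrew word "shalom" stored in visual order, followed by Latin and digits.
visual = "\u05dd\u05d5\u05dc\u05e9 abc 123"
logical = visual_to_logical(visual)
```

Here the four Hebrew letters are reversed into logical order, while " abc 123" 
passes through untouched.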

In essence, visually ordered Hebrew is a kludge for supporting Hebrew on 
platforms that weren't designed for it. In other words, it is an adaptation 
of Hebrew text to monodirectional LTR platforms. With logical ordering, the 
onus of handling directionality passes from the author of the text to the 
software that displays it.

--

Shlomi Tal
שלומי טל

