>From: Lars Marius Garshol <[EMAIL PROTECTED]>
>This reminds me: does anyone have any pointers to information on how
>to convert visually encoded text (especially HTML, but also other
>formats) to Unicode?

There are programs that do it on the fly for Hebrew. The best one I have used myself is HebTML, available as a free download from http://www.billy.co.il. The author has been working with me on testing a new version that supports Unicode.

However, I use this app much less than before, because the Hebrew Internet is rapidly making the transition from visual to logical ordering. With IE 5.x and Mozilla supporting logical Hebrew, the years-old visual order is on the way out.

In the simple case, converting visual to logical text in BiDi scripts is straightforward: check the BiDi property of each character, and if it is RTL, reverse. That means Hebrew letters reverse their order, while digits and Latin letters keep theirs. Things get more complicated, however, when hyphens, paired punctuation and telephone numbers appear. You need a smart converter for that.

In essence, visually ordered Hebrew is a kludge for supporting Hebrew on platforms that weren't designed for it. In other words, it is an adaptation of Hebrew text to monodirectional LTR platforms. On modern platforms, the onus of handling directionality passes to the software.

-- 
Shlomi Tal שלומי טל
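
For illustration only, the naive rule described above (reverse the Hebrew, leave digits and Latin letters alone) might be sketched in Python roughly as follows. This is not how HebTML works, which I cannot speak for; it is one reading of the rule, assuming each line of input is a single visually ordered line in an RTL paragraph. The names visual_to_logical_line and LTR_CLASSES are made up for this example. It deliberately ignores the hard cases mentioned above: hyphens, paired punctuation (which would also need mirroring), telephone numbers, and Latin phrases of more than one word.

```python
import unicodedata

# Bidi classes whose runs keep their original order (assumption for this
# sketch): strong left-to-right letters, European digits, Arabic digits.
LTR_CLASSES = {"L", "EN", "AN"}


def is_ltr(ch):
    """True if the character's Unicode BiDi class is one we keep in LTR order."""
    return unicodedata.bidirectional(ch) in LTR_CLASSES


def visual_to_logical_line(line):
    """Naively convert one visually ordered RTL line to logical order.

    Step 1: reverse the whole line, so RTL letters (and the order of the
            words) come out in reading order.
    Step 2: each run of Latin letters or digits was reversed too, so flip
            every such run back to its original order.
    """
    chars = list(reversed(line))
    out = []
    i = 0
    while i < len(chars):
        if is_ltr(chars[i]):
            # Collect the maximal run of LTR characters and restore it.
            j = i
            while j < len(chars) and is_ltr(chars[j]):
                j += 1
            out.extend(reversed(chars[i:j]))
            i = j
        else:
            # RTL letters, spaces and punctuation stay in reversed position.
            out.append(chars[i])
            i += 1
    return "".join(out)


# Escapes are used so the source itself is not subject to BiDi display.
visual = "2000 \u05ea\u05e0\u05e9"          # as displayed, left to right
logical = visual_to_logical_line(visual)   # Hebrew word first, then 2000
```

Reversing the whole line first, rather than reversing each Hebrew word in place, is what keeps the order of the words right when a line holds more than one Hebrew word.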