On Mon, May 20, 2002, Matitiahu Allouche wrote about "Re: official hebrew in Linux-IL mailing lists?":
> According to Unicode, the paragraph embedding level is computed anew for
> each block. A block is delimited by the start/end of text, and by Block
> Separators. There are no Block Separators within ISO-8859 code pages. It
> would be up to applications to recompute the direction for each line, or
> sentence, or paragraph or whatever units make sense for them.
This was exactly my point. When you have an iso-8859-8-i email, what are
"blocks"? If the mail reader and the mail writer don't agree on the same
definition of a block, there can be problems.

My bidiv heuristics are as follows: a new "block" starts at an empty line.
All the lines in a block are given the same base direction, determined by
the first character that has a direction in the first line of the block.
If none of the characters of that first line has a direction, I use the
previous block's direction for that line, and continue to the next line.

These heuristics are necessary for sensibly formatting email (or other
plain text) that might contain blocks of English text, such as headers,
signatures, included code, and so on.

I have no idea what heuristics Microsoft Outlook uses, for example.

-- 
Nadav Har'El                        | Monday, May 20 2002, 9 Sivan 5762
[EMAIL PROTECTED]                   |-----------------------------------------
Phone: +972-53-245868, ICQ 13349191 |This '|' is not a pipe.
http://nadav.harel.org.il           |
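For readers who want to experiment, the block heuristics described in this message can be sketched roughly as below. This is an illustration only, not bidiv's actual code: the function names, the choice of Python, the default direction for text before any directional character is seen, and the decision to treat whitespace-only lines as empty are all assumptions. "Has a direction" is interpreted here as having a strong Unicode bidi class (L, R or AL).

```python
import unicodedata

LTR, RTL = "ltr", "rtl"


def first_strong_direction(line):
    """Return LTR or RTL for the first strongly directional character, else None."""
    for ch in line:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return LTR
        if bidi_class in ("R", "AL"):
            return RTL
    return None  # only digits, punctuation, spaces, etc.


def line_directions(lines, default=LTR):
    """Assign a base direction to every line, using empty lines as block breaks.

    `default` is an assumption: it is used until some block has a direction.
    """
    result = []
    previous = default  # direction of the most recent block that had one
    current = None      # direction of the current block, once determined
    for line in lines:
        if not line.strip():
            # An empty line ends the block (whitespace-only counts as empty here).
            if current is not None:
                previous = current
            current = None
            result.append(previous)
            continue
        if current is None:
            # Block direction not determined yet: try this line.
            current = first_strong_direction(line)
        # Lines with no directional character fall back to the previous block.
        result.append(current if current is not None else previous)
    return result


if __name__ == "__main__":
    hebrew = "\u05e9\u05dc\u05d5\u05dd"  # a Hebrew word, written with escapes
    sample = ["Hello world", hebrew, "", hebrew + " world", "abc", "", "123", "abc"]
    for text, direction in zip(sample, line_directions(sample)):
        print(direction, repr(text))
```

In this sketch a line such as "123" at the start of a block inherits the previous block's direction, and the block's own direction is fixed by the first later line that contains a directional character.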