Re: [emacs-bidi] UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos) (fwd)
behdad, who is going to study after finishing this mail.

-- Forwarded message --
Date: Sat, 11 Oct 2003 15:54:31 -0400
From: Eli Zaretskii [EMAIL PROTECTED]
To: Behdad Esfahbod [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [emacs-bidi] UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)

> Date: Sat, 11 Oct 2003 04:15:13 -0400
> From: Behdad Esfahbod [EMAIL PROTECTED]
>
> Is it true that your implementation of the Unicode Bidi algorithm does not follow UTR#9 with respect to handling the dash? Just wanted to make sure this is not true; otherwise, please consider following the standard.

Handa-san is currently trying to plug the sequential implementation of UAX#9 that I wrote into the Emacs display code. The code I wrote renders H-5 as -5H, as per UAX#9. One needs to type H-{RLM}5 to get the H-5 result that most Hebrew users want.

I guess we will need to get used to typing RLM and LRM in similar situations, since we must be UAX#9 compliant, and since UAX#9 results in such madness in quite a few cases like this, sigh.

= To unsubscribe, send mail to [EMAIL PROTECTED] with the word unsubscribe in the message body, e.g., run the command echo unsubscribe | mail [EMAIL PROTECTED]
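The H-{RLM}5 workaround described above can also be applied mechanically. The following is a minimal sketch in Python, not anything from the Emacs code: the function name and the regular expression are assumptions made for illustration, and the pattern only covers the basic Hebrew letters U+05D0..U+05EA followed by HYPHEN-MINUS and an ASCII digit.

```python
import re

RLM = "\u200f"  # U+200F RIGHT-TO-LEFT MARK

# Hypothetical pattern: a Hebrew letter (alef..tav), a HYPHEN-MINUS,
# and then (as a lookahead) an ASCII digit.
_HEBREW_HYPHEN_DIGIT = re.compile(r"([\u05d0-\u05ea])-(?=[0-9])")


def add_rlm_after_hyphen(text: str) -> str:
    """Insert an RLM after a hyphen that joins a Hebrew letter to a digit."""
    return _HEBREW_HYPHEN_DIGIT.sub(lambda m: m.group(1) + "-" + RLM, text)
```

The sketch only changes the logical string; how the result is displayed still depends on the bidi implementation that renders it.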
UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)
On Sat, 04 Oct 2003 15:01:04 +0200, Shachar Shemesh [EMAIL PROTECTED] wrote:
> Eran Tromer wrote:
>> OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.
>
> I'm not sure how to tackle this particular problem. I think the best place to fix it would be at the root of the problem - the Unicode BiDi algorithm. I *think* I have a reasonably portable solution to this issue. I guess it's time to register with another forum

This is a known issue with Unicode BiDi. It arises because we use the - character for both minus and hyphen. When one wants to connect letters with numbers, one is using a HYPHEN and wants it to appear as 5-word. When one wants to write a negative number, one uses a MINUS SIGN and would like it to appear as -5 word. The Unicode wise men have ignored the first case (or require the use of a special Hebrew MAKAF).

I pointed out this and some other problems at the m17n2000 conference. See http://www.m17n.org/m17n2000_all_but_registration/proceedings/ehud/ (slide no. 10). I proposed my solution (slides 11-15), and this algorithm was implemented by Kenichi Handa in his Emacs-BiDi (see the notes on http://www.m17n.org/emacs-bidi/ ).

Ehud.

-- 
 Ehud Karni           Tel: +972-3-7966-561  /\
 Mivtach - Simon      Fax: +972-3-7966-667  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D http://www.keyserver.net/    Better Safe Than Sorry
Re: UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)
On 2003/10/08 19:58, Ehud Karni wrote:
> This is a known issue with Unicode BiDi. It arises because we use the - character for both minus and hyphen. When one wants to connect letters with numbers, one is using a HYPHEN and wants it to appear as 5-word. When one wants to write a negative number, one uses a MINUS SIGN and would like it to appear as -5 word. The Unicode wise men have ignored the first case (or require the use of a special Hebrew MAKAF).

Won't a regular U+2010 HYPHEN (instead of the U+05BE maqaf) do the job, proper Hebrew typography aside? I've tested it on fribidi and it's rendered correctly in both LTR and RTL context.

> I proposed my solution (slides 11-15) and this algorithm was implemented by Kenichi Handa in his Emacs-BiDi (see notes on http://www.m17n.org/emacs-bidi/ ).

As you note, your algorithm is incompatible with Unicode's. All means are valid for converting legacy text, but there's a strong case for insisting that all newly created text must be rendered correctly by the standard algorithm.

This, of course, leaves open the problem of distinguishing the two types of texts. It may be easy when importing a file, since you know its type, but what do you do with a HYPHEN-MINUS when pasting from the clipboard?

Maybe the right strategy is to convert all U+002D HYPHEN-MINUS to either U+2010 HYPHEN or U+2212 MINUS SIGN upon import/paste/keypress (via appropriate heuristics), so that HYPHEN-MINUS never occurs in the output. Here the only breakage is for legacy/external texts for which the heuristic fails.

Eran
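The conversion strategy proposed above could look roughly like the sketch below. The heuristic itself is an assumption chosen for illustration (a letter before the character means hyphen, a following digit otherwise means minus); the message does not specify which heuristics to use, and the function name is hypothetical. Cases the heuristic cannot decide are left as HYPHEN-MINUS here, so unlike the proposal it does not guarantee that U+002D never survives.

```python
HYPHEN = "\u2010"  # U+2010 HYPHEN
MINUS = "\u2212"   # U+2212 MINUS SIGN


def replace_hyphen_minus(text: str) -> str:
    """Replace each U+002D with HYPHEN or MINUS SIGN where the context decides."""
    out = []
    for i, ch in enumerate(text):
        if ch != "-":
            out.append(ch)
            continue
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if prev.isalpha():
            # A letter (Hebrew or otherwise) precedes: treat as a hyphen.
            out.append(HYPHEN)
        elif nxt.isdigit():
            # No letter before, digit after: treat as a minus sign.
            out.append(MINUS)
        else:
            # Undecided: keep the original character.
            out.append(ch)
    return "".join(out)
```

Running this at import or paste time touches the user's text, which is exactly the design question raised in the thread; the sketch takes no position on whether that is acceptable.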
Re: UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)
Eran Tromer wrote on 2003-10-08:
> As you note, your algorithm is incompatible with Unicode's. All means are valid for converting legacy text, but there's a strong case for insisting that all newly created text must be rendered correctly by the standard algorithm.
>
> This, of course, leaves open the problem of distinguishing the two types of texts. It may be easy when importing a file since you know its type, but what do you do with a HYPHEN-MINUS when pasting from the clipboard?
>
> Maybe the right strategy is to convert all U+002D HYPHEN-MINUS to either U+2010 HYPHEN or to U+2212 MINUS SIGN upon import/paste/keypress (via appropriate heuristics), so that HYPHEN-MINUS never occurs in the output. Here the only breakage is for legacy/external texts for which the heuristic fails.

I think modifying pasted text is wrong. Instead you should fix the text from which you copy, by whatever means needed.

-- 
Beni Cherniavsky [EMAIL PROTECTED]
Re: OpenOffice BiDi kudos
Diego Iastrubni wrote on 2003-10-07:
> , 5 2003, 12:54, Beni Cherniavsky:
>> Shift-minus produces not simply a hyphen but 05BE;HEBREW PUNCTUATION MAQAF, which is even better because it looks different from a western hyphen (a maqaf is at the top of the characters) and AFAIK, it's the correct character to use between a letter and a number (and also as a hyphen between hebrew letters).
>
> here on Mandrake 9.2 SHIFT - will produce an underscore _. Just like in english. how can you reproduce this?

setxkbmap -layout us,il -variant ,lyx

-- 
Beni Cherniavsky [EMAIL PROTECTED]

"WEP was broken on every conceivable level, and on several inconceivable levels." -- Shachar Shemesh in linux-il
Re: OpenOffice BiDi kudos
, 5 2003, 12:54, Beni Cherniavsky:
> Shift-minus produces not simply a hyphen but 05BE;HEBREW PUNCTUATION MAQAF, which is even better because it looks different from a western hyphen (a maqaf is at the top of the characters) and AFAIK, it's the correct character to use between a letter and a number (and also as a hyphen between hebrew letters).

here on Mandrake 9.2 SHIFT - will produce an underscore _. Just like in english. how can you reproduce this?

> So, please, in all editors, convert a minus to a maqaf if the preceding character is a hebrew character (like in geresh), and the problem will practically go away (and your documents will look better).

-- 
diego, 11 Tishrey 5764

Please avoid sending me Word or PowerPoint attachments. See http://www.fsf.org/philosophy/no-word-attachments.html
Re: OpenOffice BiDi kudos
Is it possible to set paragraph direction when in a non-Hebrew locale? (I.e., when in LOCALE en_US, even when I force OOwriter to display the paragraph direction button, it is grayed out. When I start OO in a Hebrew locale, it defaults all paragraphs to RTL, which is not what I want.)

Arie

-- 
It is absurd to seek to give an account of the matter to a man who cannot himself give an account of anything; for insofar as he is already like this, such a man is no better than a vegetable.
-- Book IV of Aristotle's Metaphysics
Re: OpenOffice BiDi kudos
Arie Folger wrote:
> Is it possible to set paragraph direction when in a non Hebrew locale? (i.e., when in LOCALE en_US, even when I force OOwriter to display the paragraph direction button, it is grayed out. When I start OO in a Hebrew locale, it defaults all paragraphs to RTL, which is not what I want.)
>
> Arie

Go to tools/options. In the resulting dialog, select language settings/languages. Ask to enable CTL (Complex Text Layout), and choose the CTL language to be Hebrew. You will then be able to perform Hebrew editing, regardless of your locale.

Shachar

-- 
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/
Re: OpenOffice BiDi kudos
On Saturday, Oct 4, 2003, at 15:01 Asia/Jerusalem, Shachar Shemesh wrote:
> I'm not sure how to tackle this particular problem. I think the best place to fix it would be at the root of the problem - the Unicode BiDi algorithm. I *think* I have a reasonably portable solution to this issue. I guess it's time to register with another forum

There was a thread about this not long ago at the w3c i18n list:

http://lists.w3.org/Archives/Public/www-international/2003JulSep/0084.html
Re: OpenOffice BiDi kudos
Shachar Shemesh wrote on 2003-10-04:
> Tzafrir Cohen wrote:
>> It is not a bug. It's a feature (standard conformance).
>
> Well, maybe it's a feature of OpenOffice. It's still a bug in the standard.

But what about the cases when you do want a negative number?

> Being as it is that there is no legacy way of producing a hyphen, the Unicode standard must accept that minus is used instead. Any attempt to insist on it being correct in the face of real life is simply absurd.

There is no legacy way of producing many other symbols, which should not drive Unicode. Let the cruft die as quick as possible ;).

The real point is that typographically you should use the maqaf rather than western hyphens after hebrew letters (can somebody confirm this?), so people should switch to maqaf anyway. I can attest from my experience with geresh that converting a typed minus to the maqaf if the preceding character is a Hebrew character does the right thing 99% of the time.

-- 
Beni Cherniavsky [EMAIL PROTECTED]
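The input rule described above (a typed minus becomes a maqaf when the preceding character is Hebrew) can be sketched as follows. This is an illustration only, not geresh's actual code: the function names are hypothetical, and "Hebrew character" is assumed to mean the letters U+05D0..U+05EA.

```python
MAQAF = "\u05be"  # U+05BE HEBREW PUNCTUATION MAQAF


def is_hebrew_letter(ch: str) -> bool:
    """True for the basic Hebrew letters alef..tav (final forms included)."""
    return "\u05d0" <= ch <= "\u05ea"


def type_char(buffer: str, key: str) -> str:
    """Append a typed key to the buffer, turning '-' into a maqaf after Hebrew."""
    if key == "-" and buffer and is_hebrew_letter(buffer[-1]):
        return buffer + MAQAF
    return buffer + key
```

Because the rule looks only at the character before the cursor, a negative number typed directly after a Hebrew letter would also get a maqaf; that is the rare case the thread says is acceptable to miss.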
Re: OpenOffice BiDi kudos
Beni Cherniavsky wrote:
> Shachar Shemesh wrote on 2003-10-04:
>> Tzafrir Cohen wrote:
>>> It is not a bug. It's a feature (standard conformance).
>>
>> Well, maybe it's a feature of OpenOffice. It's still a bug in the standard.
>
> But what about the cases when you do want a negative number?

Those will rarely be right next to a Hebrew character.

Shachar
Re: OpenOffice BiDi kudos
On Sun, Oct 05, 2003 at 09:30:43AM +0200, Arie Folger wrote:
> Is it possible to set paragraph direction when in a non Hebrew locale? (ie., when in LOCALE en_US, even when I force OOwriter to display the paragraph direction button, it is grayed out. When I start OO in a Hebrew locale, it defaults all paragraphs to RTL, which is not what I want.)
>
> Arie

This works for version 1.1 and might work for earlier versions: you need to enter the Options-Languages menu, enable CTL and choose Hebrew from the drop-down list.

-- 
"Cut your own wood and it will warm you twice"

Regards, Yoni Rabkin
Re: OpenOffice BiDi kudos
On 2003/10/04 13:44, Eran Tromer wrote:
> OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.
> http://www.openoffice.org/issues/show_bug.cgi?id=19848
> What's the proper way to handle this? Using hebrew hyphens or something of the sorts?

The following are my conclusions from the discussion here, as well as the following threads:

http://bugzilla.mozilla.org/show_bug.cgi?id=73251#c32
http://lists.w3.org/Archives/Public/www-international/2003JulSep/0084.html
http://mozilla.org.il/board/viewtopic.php?p=1790#1790

I have cross-posted this summary to the OpenOffice IssueZilla [sic] at http://www.openoffice.org/issues/show_bug.cgi?id=19848
You can point out my (undoubtedly numerous and grave) errors there, but please don't spam it unnecessarily.

I see two practical alternatives to solving the problem.

1. Break compatibility with the Unicode algorithm. Starting with Office 2000, Microsoft uses a different algorithm that fixes this problem (I'm not aware of any other deviation from Unicode) -- use that instead.

-or-

2. a. During text input, use heuristics to produce an encoding that's rendered as desired. In the case of hebrew+minus+digit, instead of a plain HYPHEN-MINUS insert some appropriate Unicode sequence such as RLE+(HYPHEN-MINUS)+PDF or RLE+(NON-BREAKING HYPHEN)+PDF (see note below).
   b. Do something smart about those sequences during editing (e.g., treat them as one logical character).
   c. In the MS Office import filters, add RLE+PDF where necessary so as to simulate Microsoft's algorithm.
   d. Likewise, kludge the MS Office output filters as necessary.

Both seem rather horrible, but so is the current situation. The hebrew+hyphen+digit pattern occurs in many (perhaps most) Hebrew documents, so its being rendered incorrectly in legacy documents is a major issue.

As for new documents, entering a space between the minus and the number is unsatisfactory, since the result is typographically appalling, especially if the space induces a line break.

A couple of notes on 2.a. above:

The sequence (HYPHEN-MINUS)+LRM can be used in RTL context, but breaks things in LTR context.

Arguably, the Right Thing is to use the single character U+05BE (HEBREW PUNCTUATION MAQAF). Alas, this seems impractical, as the character is misrendered or missing in most fonts. Also, Maqaf is not represented on keyboards and is missing from the iso8859-8 charset (though it's present in windows-1255). Moreover, the widespread use of HYPHEN-MINUS instead of the Maqaf character has virtually eliminated the latter from common texts -- it seems to be perceived as a quaint historical quirk that is bearable in professional typesetting, but would look quite strange in (say) everyday correspondence.

Regards,
Eran
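Option 2.a above, together with a crude form of 2.b, might be sketched as below. The regular expression, the function names, and the choice to restrict the match to the letters U+05D0..U+05EA followed by an ASCII digit are all assumptions for illustration; the summary does not prescribe a particular heuristic.

```python
import re

RLE = "\u202b"  # U+202B RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # U+202C POP DIRECTIONAL FORMATTING
NB_HYPHEN = "\u2011"  # U+2011 NON-BREAKING HYPHEN

# Hypothetical heuristic: a HYPHEN-MINUS directly between a Hebrew letter
# and an ASCII digit.
_BETWEEN_HEBREW_AND_DIGIT = re.compile(r"(?<=[\u05d0-\u05ea])-(?=[0-9])")


def wrap_hyphen(text: str, hyphen: str = "-") -> str:
    """2.a: replace the matched hyphen with RLE + hyphen + PDF."""
    return _BETWEEN_HEBREW_AND_DIGIT.sub(RLE + hyphen + PDF, text)


def unwrap_hyphen(text: str) -> str:
    """2.b (crude): collapse a wrapped hyphen back to one HYPHEN-MINUS."""
    for h in ("-", NB_HYPHEN):
        text = text.replace(RLE + h + PDF, "-")
    return text
```

Passing `hyphen=NB_HYPHEN` gives the RLE+(NON-BREAKING HYPHEN)+PDF variant. An editor following 2.b would more likely treat the three code points as a single unit for cursor movement and deletion rather than rewrite the text, so `unwrap_hyphen` is only a stand-in for that idea.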
Re: OpenOffice BiDi kudos
On 2003/10/06 01:38, Eran Tromer wrote:
> 2. a. During text input, use heuristics to produce an encoding that's rendered as desired. In the case of hebrew+minus+digit, instead of a plain HYPHEN-MINUS insert some appropriate Unicode sequences such as RLE+(HYPHEN-MINUS)+PDF or RLE+(NON-BREAKING HYPHEN)+PDF (see note below).

On second thought, what's wrong with a plain U+2011 (NON-BREAKING HYPHEN)?

Eran
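One way to compare the candidate characters discussed in this thread is to look up their bidirectional classes, since the class is what the bidi algorithm acts on. The sketch below uses Python's standard `unicodedata` module; the helper name is hypothetical. Note that the values come from the Unicode version bundled with the Python interpreter, which may differ from the Unicode data in use when these messages were written.

```python
import unicodedata

# Characters proposed in the thread as the separator between a Hebrew
# letter and a number.
CANDIDATES = ["\u002d", "\u2010", "\u2011", "\u2212", "\u05be"]


def bidi_classes(chars):
    """Map each character's Unicode name to its bidirectional class."""
    return {unicodedata.name(c): unicodedata.bidirectional(c) for c in chars}


if __name__ == "__main__":
    for name, cls in bidi_classes(CANDIDATES).items():
        print(f"{cls:>3}  {name}")
```

A character whose class is a neutral takes its direction from its surroundings, while a strong right-to-left character such as the maqaf always stays with the Hebrew text; which of these behaves acceptably in a given renderer is still something to test, as was done with fribidi earlier in the thread.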
Re: OpenOffice BiDi kudos
At 21:40 03.10.2003 -0400, Behdad Esfahbod wrote:
> On Fri, 3 Oct 2003, Alexander Maryanovsky wrote:
>> Hi everyone,
>>
>> Just wanted to pass my thanks (in the hope they're listening) to anyone and everyone involved with OpenOffice's new BiDi support. It's absolutely perfect as far as I can see. Brackets, mixed RTL and LTR, mixed RTL and numbers, dashes etc. etc. - all work exactly as expected. Just love it.
>
> Col. Where can I find the patch?
>
>> Alexander (aka Sasha) Maryanovsky.

Just get OpenOffice 1.1 and look for BiDi in the help.

Alexander Maryanovsky.
Re: OpenOffice BiDi kudos
On 2003/10/03 19:44, Alexander Maryanovsky wrote:
> It's absolutely perfect as far as I can see. [...] mixed RTL and numbers, dashes etc. etc. - all work exactly as expected.

OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.

http://www.openoffice.org/issues/show_bug.cgi?id=19848

What's the proper way to handle this? Using hebrew hyphens or something of the sorts?

Eran
Re: OpenOffice BiDi kudos
Alexander Maryanovsky wrote:
> Hi everyone,
>
> Just wanted to pass my thanks (in the hope they're listening) to anyone and everyone involved with OpenOffice's new BiDi support. It's absolutely perfect as far as I can see. Brackets, mixed RTL and LTR, mixed RTL and numbers, dashes etc. etc. - all work exactly as expected. Just love it.
>
> Alexander (aka Sasha) Maryanovsky.

Most of the kudos should go to Shoshana Forbes, who did a wonderful job of nagging the Sun developers into fixing the problems.

Shachar

-- 
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/
Re: OpenOffice BiDi kudos
Eran Tromer wrote:
> OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.
> http://www.openoffice.org/issues/show_bug.cgi?id=19848
> What's the proper way to handle this? Using hebrew hyphens or something of the sorts?
>
> Eran

I'm not sure how to tackle this particular problem. I think the best place to fix it would be at the root of the problem - the Unicode BiDi algorithm. I *think* I have a reasonably portable solution to this issue. I guess it's time to register with another forum

I'll also note that the entire BiDi editing experience is still awaiting revamping. That is work that is supposed to happen with Sun directly, possibly involving other parties as well. I promised to write a paper, but it is falling behind. Sorry everybody. Once I have something to show, I'll be sure to let everyone (here, whatsup, ivrix, arabeyes) know so that proper feedback will be processed. We'll try to get it accepted as a standard, so that everyone gets to do Hebrew editing in a consistent manner.

Shachar

-- 
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/
Re: OpenOffice BiDi kudos
On Sat, Oct 04, 2003 at 02:44:05PM +0300, Eran Tromer wrote:
> On 2003/10/03 19:44, Alexander Maryanovsky wrote:
>> It's absolutely perfect as far as I can see. [...] mixed RTL and numbers, dashes etc. etc. - all work exactly as expected.
>
> OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.
> http://www.openoffice.org/issues/show_bug.cgi?id=19848
> What's the proper way to handle this? Using hebrew hyphens or something of the sorts?

IMO, the correct way of handling it is MS Word-like auto-correction. The word processor should detect patterns like HYPHEN-MINUS BETWEEN WORDS and change it to a UNICODE DASH character, and HYPHEN-MINUS BETWEEN NUMBERS and change it to a UNICODE MINUS character. Not all patterns have a single solution, so it requires some thinking through. I think it's the Right Way(tm).

Another solution is to look at the keyboard language and add direction control characters.
Re: OpenOffice BiDi kudos
On Fri, Oct 03, 2003 at 06:44:38PM +0200, Alexander Maryanovsky wrote:
> Hi everyone,
>
> Just wanted to pass my thanks (in the hope they're listening) to anyone and everyone involved with OpenOffice's new BiDi support. It's absolutely perfect as far as I can see. Brackets, mixed RTL and LTR, mixed RTL and numbers, dashes etc. etc. - all work exactly as expected. Just love it.
>
> Alexander (aka Sasha) Maryanovsky.

I've just today installed OpenOffice with Hebrew for the wife and must say thank you to all that had a hand in this.

-- 
"Cut your own wood and it will warm you twice"

Regards, Yoni Rabkin
Re: OpenOffice BiDi kudos
On Sat, 4 Oct 2003, Eran Tromer wrote:
> On 2003/10/03 19:44, Alexander Maryanovsky wrote:
>> It's absolutely perfect as far as I can see. [...] mixed RTL and numbers, dashes etc. etc. - all work exactly as expected.
>
> OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.
> http://www.openoffice.org/issues/show_bug.cgi?id=19848
> What's the proper way to handle this? Using hebrew hyphens or something of the sorts?

In addition to what Ilya wrote:

It is not a bug. It's a feature (standard conformance).

Anyway, try the keyboard variant lyx (available on XFree = 4.3, 'setxkbmap -variant ,lyx us,il' ), and then press shift-y to get an RLM character. Type one after the minus.

Better still: shift-minus should give a hyphen on that variant. This avoids the problem in the first place.

-- 
Tzafrir Cohen
mailto:[EMAIL PROTECTED]
http://www.technion.ac.il/~tzafrir
Re: OpenOffice BiDi kudos
Tzafrir Cohen wrote:
> On Sat, 4 Oct 2003, Eran Tromer wrote:
>> On 2003/10/03 19:44, Alexander Maryanovsky wrote:
>>> It's absolutely perfect as far as I can see. [...] mixed RTL and numbers, dashes etc. etc. - all work exactly as expected.
>>
>> OOe 1.1 seems to have the usual hebrew-hyphen-number problem (H-5 renders as H5-), which necessitates typing of the logically incorrect H5- and causes bad importing of newer MS Word documents.
>> http://www.openoffice.org/issues/show_bug.cgi?id=19848
>> What's the proper way to handle this? Using hebrew hyphens or something of the sorts?
>
> In addition to what Ilya wrote:
>
> It is not a bug. It's a feature (standard conformance).

Well, maybe it's a feature of OpenOffice. It's still a bug in the standard.

Being as it is that there is no legacy way of producing a hyphen, the Unicode standard must accept that minus is used instead. Any attempt to insist on it being correct in the face of real life is simply absurd.

If standard purity was what Unicode mandated, they should have only defined 22 characters for Hebrew, like they did for Arabic (28). Put the final forms in some god-forsaken place, and have the display engine render them. They didn't do that, because real life dictates the fact that, practically, people use 27 characters for Hebrew, and changing that would break far too many existing applications.

Shachar

-- 
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/
Re: OpenOffice BiDi kudos
On Sat, 4 Oct 2003, Shachar Shemesh wrote:
> If standard purity was what Unicode mandated, they should have only defined 22 characters for Hebrew, like they did for Arabic (28). Put the final forms in some god-forsaken place, and have the display engine render them. They didn't do that, because real life dictates the fact that, practically, people use 27 characters for Hebrew, and changing that would break far too many existing applications.

Correction: Hebrew is not strict about using final forms for characters at the end of words. Examples: abbreviations (roshei teivot) and words with special pronunciation (Mubarak is an example which I have in mind).

The hyphen problem is not really a problem in the Unicode standard, but one of the text editors, which do not automatically convert a minus sign into a hyphen when the context expects it.

--- Omer

My opinions, as expressed in this E-mail message, are mine alone. They do not represent the official policy of any organization with which I may be affiliated in any way.
WARNING TO SPAMMERS: at http://www.zak.co.il/spamwarning.html
Re: OpenOffice BiDi kudos
Omer Zak wrote:
> On Sat, 4 Oct 2003, Shachar Shemesh wrote:
>> If standard purity was what Unicode mandated, they should have only defined 22 characters for Hebrew, like they did for Arabic (28). Put the final forms in some god-forsaken place, and have the display engine render them. They didn't do that, because real life dictates the fact that, practically, people use 27 characters for Hebrew, and changing that would break far too many existing applications.
>
> Correction: Hebrew is not strict about using final forms for characters at the end of words. Examples: abbreviations (roshei teivot) and words with special pronunciation (Mubarak is an example which I have in mind).
>
> The hyphen problem is not really a problem in the Unicode standard, but of the text editors which do not automatically convert minus sign into hyphen when the context expects it.

AND legacy text, AND existing implementations, AND the fact that the hyphen doesn't actually appear in Unicode's Hebrew encoding, etc. etc. It is gracious enough to offer us RLM and LRM, but not, say, RLE/LRE/PDF. This forces us to use a modifier where the standard claims we should use the right character.

The ISO-8859-8 is, I think, a badly constructed codepage. What's the point of providing Yen, pound and cent symbols, but not NIS? I'm sorry, Windows-1255 is actually better.

Shachar

-- 
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/
Re: OpenOffice BiDi kudos
On Sat, 4 Oct 2003, Shachar Shemesh wrote:
> Alexander Maryanovsky wrote:
>> Hi everyone,
>>
>> Just wanted to pass my thanks (in the hope they're listening) to anyone and everyone involved with OpenOffice's new BiDi support. It's absolutely perfect as far as I can see. Brackets, mixed RTL and LTR, mixed RTL and numbers, dashes etc. etc. - all work exactly as expected. Just love it.
>>
>> Alexander (aka Sasha) Maryanovsky.
>
> Most of the kudos should go to Shoshana Forbes, who did a wonderful job of nagging the Sun developers into fixing the problems.

And to the Sun developers for actually fixing them. ;-)

Regards,
Shlomi Fish

-- 
Shlomi Fish [EMAIL PROTECTED]
Home Page: http://t2.technion.ac.il/~shlomif/

"Writing a BitKeeper replacement is probably easier at this point than getting its license changed." Matt Mackall on OFTC.net #offtopic.