Re: [emacs-bidi] UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos) (fwd)

2003-10-11 Thread Behdad Esfahbod


behdad,
who is going to study after finishing this mail.

-- Forwarded message --
Date: Sat, 11 Oct 2003 15:54:31 -0400
From: Eli Zaretskii [EMAIL PROTECTED]
To: Behdad Esfahbod [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [emacs-bidi] UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi
kudos)

 Date: Sat, 11 Oct 2003 04:15:13 -0400
 From: Behdad Esfahbod [EMAIL PROTECTED]

 Is it true that your implementation of Unicode Bidi algorithm
 does not follow the UTR#9, with respect to handling dash?  Just
 wanted to make sure this is not true, otherwise, please consider
 following the standard.

Handa-san is currently trying to plug the sequential implementation of
UAX#9 that I wrote into the Emacs display code.  The code I wrote
renders H-5 as -5H, as per UAX#9.  One needs to type H-{RLM}5
to get the H-5 result that most Hebrew users want.
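
For readers who want to see what the RLM changes: UAX#9 resolves direction from each character's bidi class, and the RLM is a strong right-to-left character placed between the hyphen and the digit. Below is a minimal sketch using Python's standard unicodedata module; it is not the Emacs code discussed above. The Hebrew letter HE is an arbitrary stand-in for the "H" in the example, and the classes printed come from whichever Unicode version the interpreter bundles, so the class shown for HYPHEN-MINUS may not match what applied in 2003.

```python
import unicodedata

# Characters that come up in this thread. HEBREW LETTER HE stands in for the
# "H" used in the examples (an arbitrary choice for illustration).
CHARS = {
    "HEBREW LETTER HE": "\u05d4",
    "HYPHEN-MINUS": "\u002d",
    "DIGIT FIVE": "5",
    "RIGHT-TO-LEFT MARK": "\u200f",
    "LEFT-TO-RIGHT MARK": "\u200e",
}


def bidi_classes(text):
    """Return the Bidi_Class of each character in text, in logical order."""
    return [unicodedata.bidirectional(ch) for ch in text]


if __name__ == "__main__":
    for name, ch in CHARS.items():
        print(f"U+{ord(ch):04X} {name}: {unicodedata.bidirectional(ch)}")
    # Logical order: Hebrew letter, hyphen-minus, RLM, digit.
    print(bidi_classes("\u05d4-\u200f5"))
```

This only inspects character properties; it does not run the reordering algorithm itself.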

I guess we will need to get used to typing RLM and LRM in similar
situations, since we must be UAX#9 compliant, and since UAX#9 results
in such madness in quite a few cases like this, sigh.


=
To unsubscribe, send mail to [EMAIL PROTECTED] with
the word unsubscribe in the message body, e.g., run the command
echo unsubscribe | mail [EMAIL PROTECTED]



UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)

2003-10-08 Thread Ehud Karni

On Sat, 04 Oct 2003 15:01:04 +0200, Shachar Shemesh [EMAIL PROTECTED] wrote:

 Eran Tromer wrote:

  OOe 1.1 seems to have the usual hebrew-hyphen-number problem
  (H-5 renders as H5-), which necessitates typing of the logically
  incorrect H5- and causes bad importing of newer MS Word documents.

 I'm not sure how to tackle this particular problem. I think the best
 place to fix it would be at the root of the problem - the Unicode BiDi
 algorithm. I *think* I have a reasonably portable solution to this issue.

 I guess it's time to register with another forum

This is a known issue with Unicode BiDi. It arises because we use the -
character for both minus and hyphen. When one wants to connect letters
with numbers one is using a HYPHEN and wants it to appear as 5-word.
When one wants to write a negative number one uses a MINUS SIGN and
would like it to appear as -5 word. The Unicode wise men have ignored
the 1st case (or require the use of a special Hebrew MAKAF). I have
pointed out this and some other problems at the m17n2000 conference (see
http://www.m17n.org/m17n2000_all_but_registration/proceedings/ehud/ ,
slide no. 10). I proposed my solution (slides 11-15) and this
algorithm was implemented by Kenichi Handa in his Emacs-BiDi (see
notes on http://www.m17n.org/emacs-bidi/ ).

Ehud.


- --
 Ehud Karni   Tel: +972-3-7966-561  /\
 Mivtach - Simon  Fax: +972-3-7966-667  \ /  ASCII Ribbon Campaign
 Insurance agencies   (USA) voice mail and   X   Against   HTML   Mail
 http://www.mvs.co.il  FAX:  1-815-5509341  / \
 GnuPG: 98EA398D http://www.keyserver.net/Better Safe Than Sorry



Re: UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)

2003-10-08 Thread Eran Tromer
On 2003/10/08 19:58, Ehud Karni wrote:

 This is a known issue with Unicode BiDi. It arises because we use the -
 character for both minus and hyphen. When one wants to connect letters
 with numbers one is using a HYPHEN and wants it to appear as 5-word.
 When one wants to write a negative number one uses a MINUS SIGN and
 would like it to appear as -5 word. The Unicode wise men have ignored
 the 1st case (or require the use of a special Hebrew MAKAF).

Won't a regular U+2010 HYPHEN (instead of the U+05BE maqaf) do the job,
proper Hebrew typography aside? I've tested it on fribidi and it's
rendered correctly in both LTR and RTL context.

 I proposed my solution (slides 11-15) and this
 algorithm was implemented by Kenichi Handa in his Emacs-BiDi (see
 notes on http://www.m17n.org/emacs-bidi/ ).

As you note, your algorithm is incompatible with Unicode's.
All means are valid for converting legacy text, but there's a strong
case for insisting that all newly created text must be rendered
correctly by the standard algorithm.

This, of course, leaves open the problem of distinguishing the two types 
of texts. It may be easy when importing a file since you know its type, 
but what do you do with a HYPHEN-MINUS when pasting from the clipboard?

Maybe the right strategy is to convert all U+002D HYPHEN-MINUS to either 
U+2010 HYPHEN or to U+2212 MINUS SIGN upon import/paste/keypress (via 
appropriate heuristics), so that HYPHEN-MINUS never occurs in the 
output. Here the only breakage is for legacy/external texts for which 
the heuristic fails.
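
A minimal sketch of such an import/paste filter, to make the idea concrete. The rule used here is an assumption for illustration only and is not taken from any existing implementation: a hyphen-minus directly followed by a digit and not directly preceded by a letter or digit is treated as a minus sign, and anything else as a hyphen. The function name is hypothetical.

```python
HYPHEN_MINUS = "\u002d"
HYPHEN = "\u2010"
MINUS_SIGN = "\u2212"


def replace_hyphen_minus(text):
    """Replace every U+002D with U+2010 or U+2212 using a naive heuristic.

    Assumed rule (illustrative only): a hyphen-minus directly followed by a
    digit and not directly preceded by a letter or digit is a minus sign;
    every other hyphen-minus is a hyphen.
    """
    out = []
    for i, ch in enumerate(text):
        if ch != HYPHEN_MINUS:
            out.append(ch)
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        looks_like_minus = next_ch.isdigit() and not prev_ch.isalnum()
        out.append(MINUS_SIGN if looks_like_minus else HYPHEN)
    return "".join(out)
```

Note that this rule turns the subtraction in "5-3" into a hyphen, which is exactly the kind of case where such a heuristic fails.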

  Eran





Re: UTR#9 - Unicode BiDi (was Re: OpenOffice BiDi kudos)

2003-10-08 Thread Beni Cherniavsky
Eran Tromer wrote on 2003-10-08:

 As you note, your algorithm is incompatible with Unicode's.
 All means are valid for converting legacy text, but there's a strong
 case for insisting that all newly created text must be rendered
 correctly by the standard algorithm.

 This, of course, leaves open the problem of distinguishing the two types
 of texts. It may be easy when importing a file since you know its type,
 but what do you do with a HYPHEN-MINUS when pasting from the clipboard?

 Maybe the right strategy is to convert all U+002D HYPHEN-MINUS to either
 U+2010 HYPHEN or to U+2212 MINUS SIGN upon import/paste/keypress (via
 appropriate heuristics), so that HYPHEN-MINUS never occurs in the
 output. Here the only breakage is for legacy/external texts for which
 the heuristic fails.

I think modifying pasted text is wrong.  Instead you should fix the
text from which you copy, by whatever means needed.

-- 
Beni Cherniavsky [EMAIL PROTECTED]





Re: OpenOffice BiDi kudos

2003-10-07 Thread Beni Cherniavsky
Diego Iastrubni wrote on 2003-10-07:

  On 5 [...] 2003, 12:54, Beni Cherniavsky wrote:
  Shift-minus produces not simply a hyphen but 05BE;HEBREW PUNCTUATION
  MAQAF, which is even better because it looks different from a western
  hyphen (a maqaf is at the top of the characters) and AFAIK, it's the
  correct character to use between a letter and a number (and also as a
  hyphen between hebrew letters).
 here on Mandrake 9.2 SHIFT - will produce an underscore _. Just like in
 english.

 how can you reproduce this?

setxkbmap -layout us,il -variant ,lyx

-- 
Beni Cherniavsky [EMAIL PROTECTED]

WEP was broken on every conceivable level, and on several
inconceivable levels.  -- Shachar Shemesh in linux-il




Re: OpenOffice BiDi kudos

2003-10-06 Thread Diego Iastrubni
 On 5 [...] 2003, 12:54, Beni Cherniavsky wrote:
 Shift-minus produces not simply a hyphen but 05BE;HEBREW PUNCTUATION
 MAQAF, which is even better because it looks different from a western
 hyphen (a maqaf is at the top of the characters) and AFAIK, it's the
 correct character to use between a letter and a number (and also as a
 hyphen between hebrew letters).
here on Mandrake 9.2 SHIFT - will produce an underscore _. Just like in 
english.

how can you reproduce this?

 So, please, in all editors, convert a minus to a maqaf if the
 preceding character is a hebrew character (like in geresh), and the
 problem will practically go away (and your documents will look
 better).

-- 

diego, 11 Tishrey 5764

Please avoid sending me Word or PowerPoint attachments.
See http://www.fsf.org/philosophy/no-word-attachments.html






Re: OpenOffice BiDi kudos

2003-10-05 Thread Arie Folger
Is it possible to set paragraph direction when in a non Hebrew locale? (i.e., 
when in LOCALE en_US, even when I force OOwriter to display the paragraph 
direction button, it is grayed out. When I start OO in a Hebrew locale, it 
defaults all paragraphs to RTL, which is not what I want.)

Arie
-- 
It is absurd to seek to give an account of the matter to a man 
who cannot himself give an account of anything; for insofar as
he is already like this, such a man is no better than a vegetable.
   -- Book IV of Aristotle's Metaphysics





Re: OpenOffice BiDi kudos

2003-10-05 Thread Shachar Shemesh
Arie Folger wrote:

 Is it possible to set paragraph direction when in a non Hebrew locale? (i.e.,
 when in LOCALE en_US, even when I force OOwriter to display the paragraph
 direction button, it is grayed out. When I start OO in a Hebrew locale, it
 defaults all paragraphs to RTL, which is not what I want.)

 Arie

Go to Tools/Options.
In the resulting dialog, select Language Settings/Languages.
Enable CTL (Complex Text Layout), and choose Hebrew as the CTL
language. You will then be able to perform Hebrew editing, regardless
of your locale.

 Shachar

--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: OpenOffice BiDi kudos

2003-10-05 Thread Shoshannah Forbes
On Saturday, Oct 4, 2003, at 15:01 Asia/Jerusalem, Shachar Shemesh
wrote:

 I'm not sure how to tackle this particular problem. I think the best
 place to fix it would be at the root of the problem - the Unicode BiDi
 algorithm. I *think* I have a reasonably portable solution to this
 issue.

 I guess it's time to register with another forum

There was a thread about this not long ago at the w3c i18n list:
http://lists.w3.org/Archives/Public/www-international/2003JulSep/0084.html




Re: OpenOffice BiDi kudos

2003-10-05 Thread Beni Cherniavsky
Shachar Shemesh wrote on 2003-10-04:

 Tzafrir Cohen wrote:

 It is not a bug. It is a feature (standard conformance).
 
 Well, maybe it's a feature of OpenOffice. It's still a bug in the standard.

But what about the cases when you do want a negative number?

 Being as it is that there is no legacy way of producing a hyphen, the
 Unicode standard must accept that minus is used instead. Any attempt to
 insist on it being correct in the face of real life is simply absurd.

There is no legacy way of producing many other symbols, which should
not drive Unicode.  Let the cruft die as quick as possible ;).

The real point is that typographically you should use the maqaf rather
than western hyphens after Hebrew letters (can somebody confirm
this?), so people should switch to maqaf anyway.  I can attest from my
experience with geresh that converting a typed minus to the maqaf if
the preceding character is a Hebrew character does the right thing 99%
of the time.
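
The conversion rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the geresh code: "Hebrew character" is taken to mean the letters U+05D0 through U+05EA, and the function names are hypothetical.

```python
HYPHEN_MINUS = "\u002d"
MAQAF = "\u05be"  # HEBREW PUNCTUATION MAQAF


def is_hebrew_letter(ch):
    """True for the Hebrew letters alef..tav (U+05D0..U+05EA), finals included."""
    return "\u05d0" <= ch <= "\u05ea"


def minus_to_maqaf(text):
    """Turn a hyphen-minus into a maqaf when it directly follows a Hebrew letter."""
    out = []
    for ch in text:
        if ch == HYPHEN_MINUS and out and is_hebrew_letter(out[-1]):
            out.append(MAQAF)
        else:
            out.append(ch)
    return "".join(out)
```

A hyphen-minus that follows a Hebrew letter carrying vowel points is left alone by this sketch, since the preceding character is then a point rather than a letter.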

-- 
Beni Cherniavsky [EMAIL PROTECTED]





Re: OpenOffice BiDi kudos

2003-10-05 Thread Shachar Shemesh
Beni Cherniavsky wrote:

 Shachar Shemesh wrote on 2003-10-04:

  Tzafrir Cohen wrote:

   It is not a bug. It is a feature (standard conformance).

  Well, maybe it's a feature of OpenOffice. It's still a bug in the standard.

 But what about the cases when you do want a negative number?

Those will rarely be right next to a Hebrew character.

 Shachar





Re: OpenOffice BiDi kudos

2003-10-05 Thread John Rabkin
On Sun, Oct 05, 2003 at 09:30:43AM +0200, Arie Folger wrote:
 Is it possible to set paragraph direction when in a non Hebrew locale? (ie., 
 when in LOCALE en_US, even when I force OOwriter to display the paragraph 
 direction button, it is grayed out. When I start OO in a Hebrew locale, it 
 defaults all paragraphs to RTL, which is not what I want.)
 
 Arie
 -- 
 It is absurd to seek to give an account of the matter to a man 
 who cannot himself give an account of anything; for insofar as
 he is already like this, such a man is no better than a vegetable.
-- Book IV of Aristotle's Metaphysics
 
 
 

This works for version 1.1 and might work for earlier versions:
You need to enter the Options-Languages menu, enable CTL and choose Hebrew from the
drop-down list.

-- 
Cut your own wood and it will warm you twice
Regards, Yoni Rabkin




Re: OpenOffice BiDi kudos

2003-10-05 Thread Eran Tromer
On 2003/10/04 13:44, Eran Tromer wrote:

 OOe 1.1 seems to have the usual hebrew-hyphen-number problem
 (H-5 renders as H5-), which necessitates typing of the logically
 incorrect H5- and causes bad importing of newer MS Word documents.

   http://www.openoffice.org/issues/show_bug.cgi?id=19848

 What's the proper way to handle this? Using hebrew hyphens or
 something of the sorts?

The following are my conclusions from the discussion here, as well as
the following threads.
http://bugzilla.mozilla.org/show_bug.cgi?id=73251#c32
http://lists.w3.org/Archives/Public/www-international/2003JulSep/0084.html
http://mozilla.org.il/board/viewtopic.php?p=1790#1790

I have cross-posted this summary to the OpenOffice IssueZilla [sic] at
  http://www.openoffice.org/issues/show_bug.cgi?id=19848
You can point out my (undoubtedly numerous and grave) errors there, but 
please don't spam it unnecessarily.

I see two practical alternatives to solving the problem.

1. Break compatibility with the Unicode algorithm. Starting with
Office 2000, Microsoft uses a different algorithm that fixes this
problem (I'm not aware of any other deviation from Unicode) -- use
that instead.
-or-

2. a. During text input, use heuristics to produce an encoding that's
rendered as desired. In the case of hebrew+minus+digit, instead of a
plain HYPHEN-MINUS insert some appropriate Unicode sequences such as
RLE+(HYPHEN-MINUS)+PDF or RLE+(NON-BREAKING HYPHEN)+PDF (see note below).
   b. Do something smart about those sequences during editing (e.g.,
treat them as one logical character).
   c. In the MS Office import filters, add RLE+PDF where necessary so
as to simulate Microsoft's algorithm.
   d. Likewise, kludge the MS Office output filters as necessary.

Both seem rather horrible, but so is the current situation. The
hebrew+hyphen+digit pattern occurs in many (perhaps most) Hebrew
documents, so its being rendered incorrectly in legacy documents is a
major issue. As for new documents, entering a space between the minus
and the number is unsatisfactory since the result is typographically
appalling, especially if the space induces a line break.

A couple of notes on 2.a. above:

The sequence (HYPHEN-MINUS)+LRM can be used in RTL context, but breaks
things in LTR context.

Arguably, the Right Thing is to use the single character U+05BE
(HEBREW PUNCTUATION MAQAF). Alas, this seems impractical as the
character is misrendered or missing in most fonts. Also, Maqaf is not
represented on keyboards and is missing from the iso8859-8 charset
(though it's present in windows-1255). Moreover, the widespread use of
HYPHEN-MINUS instead of the Maqaf character has virtually eliminated
the latter from common texts -- it seems to be perceived as a quaint
historical quirk that is bearable in professional typesetting, but
would look quite strange in (say) everyday correspondence.

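
To make option 2.a. concrete, here is a minimal sketch of the RLE+(HYPHEN-MINUS)+PDF variant as a text filter. It is an assumption-laden illustration, not OpenOffice code: the function name is hypothetical, "hebrew" is taken to mean the letters U+05D0 through U+05EA, and "digit" means ASCII 0-9.

```python
import re

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# A hyphen-minus with a Hebrew letter directly before it and an ASCII digit
# directly after it. Lookarounds keep the neighbours out of the match.
_HEBREW_MINUS_DIGIT = re.compile(r"(?<=[\u05d0-\u05ea])-(?=[0-9])")


def wrap_hyphen(text):
    """Wrap the hyphen-minus in hebrew+minus+digit with RLE ... PDF."""
    return _HEBREW_MINUS_DIGIT.sub(RLE + "-" + PDF, text)
```

Applying the filter twice is harmless, because a wrapped hyphen is preceded by RLE rather than by a Hebrew letter and no longer matches.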
Regards,
  Eran




Re: OpenOffice BiDi kudos

2003-10-05 Thread Eran Tromer
On 2003/10/06 01:38, Eran Tromer wrote:

 2. a. During text input, use heuristics to produce an encoding that's
 rendered as desired. In the case of hebrew+minus+digit, instead of a
 plain HYPHEN-MINUS insert some appropriate Unicode sequences such as
 RLE+(HYPHEN-MINUS)+PDF or RLE+(NON-BREAKING HYPHEN)+PDF (see note below).

On second thought, what's wrong with a plain U+2011 (NON-BREAKING HYPHEN)?

  Eran





Re: OpenOffice BiDi kudos

2003-10-04 Thread Alexander Maryanovsky
At 21:40 03.10.2003 -0400, Behdad Esfahbod wrote:

 On Fri, 3 Oct 2003, Alexander Maryanovsky wrote:

  Hi everyone,

  Just wanted to pass my thanks (in the hope they're listening) to anyone and
  everyone involved with OpenOffice's new BiDi support. It's absolutely
  perfect as far as I can see. Brackets, mixed RTL and LTR, mixed RTL and
  numbers, dashes etc. etc. - all work exactly as expected. Just love it.

 Cool.  Where can I find the patch?

  Alexander (aka Sasha) Maryanovsky.

Just get OpenOffice 1.1 and look for BiDi in the help.

Alexander Maryanovsky.



Re: OpenOffice BiDi kudos

2003-10-04 Thread Eran Tromer
On 2003/10/03 19:44, Alexander Maryanovsky wrote:

 It's absolutely perfect as far as I can see.
 [...] mixed RTL and numbers, dashes etc. etc. -
 all work exactly as expected.
OOe 1.1 seems to have the usual hebrew-hyphen-number problem
(H-5 renders as H5-), which necessitates typing of the logically 
incorrect H5- and causes bad importing of newer MS Word documents.

  http://www.openoffice.org/issues/show_bug.cgi?id=19848

What's the proper way to handle this? Using hebrew hyphens or 
something of the sorts?

  Eran





Re: OpenOffice BiDi kudos

2003-10-04 Thread Shachar Shemesh
Alexander Maryanovsky wrote:

 Hi everyone,

 Just wanted to pass my thanks (in the hope they're listening) to
 anyone and everyone involved with OpenOffice's new BiDi support. It's
 absolutely perfect as far as I can see. Brackets, mixed RTL and LTR,
 mixed RTL and numbers, dashes etc. etc. - all work exactly as
 expected. Just love it.

 Alexander (aka Sasha) Maryanovsky.

Most of the kudos should go to Shoshannah Forbes, who did a wonderful job
of nagging the Sun developers into fixing the problems.

 Shachar

--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: OpenOffice BiDi kudos

2003-10-04 Thread Shachar Shemesh
Eran Tromer wrote:

 OOe 1.1 seems to have the usual hebrew-hyphen-number problem
 (H-5 renders as H5-), which necessitates typing of the logically
 incorrect H5- and causes bad importing of newer MS Word documents.

   http://www.openoffice.org/issues/show_bug.cgi?id=19848

 What's the proper way to handle this? Using hebrew hyphens or
 something of the sorts?

   Eran

I'm not sure how to tackle this particular problem. I think the best 
place to fix it would be at the root of the problem - the Unicode BiDi 
algorithm. I *think* I have a reasonably portable solution to this issue.

I guess it's time to register with another forum

I'll also note that the entire BiDi editing experience is still awaiting 
revamping. That is work that is supposed to happen with Sun directly, 
possibly involving other parties as well. I promised to write a paper, 
but it is falling behind. Sorry everybody.

Once I have something to show, I'll be sure to let everyone (here, 
whatsup, ivrix, arabeyes) know so that proper feedback will be 
processed. We'll try to get it accepted as a standard, so that everyone 
gets to do Hebrew editing in a consistent manner.

 Shachar
--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: OpenOffice BiDi kudos

2003-10-04 Thread Ilya Konstantinov
On Sat, Oct 04, 2003 at 02:44:05PM +0300, Eran Tromer wrote:
 On 2003/10/03 19:44, Alexander Maryanovsky wrote:
 
  It's absolutely perfect as far as I can see.
  [...] mixed RTL and numbers, dashes etc. etc. -
  all work exactly as expected.
 
 OOe 1.1 seems to have the usual hebrew-hyphen-number problem
 (H-5 renders as H5-), which necessitates typing of the logically 
 incorrect H5- and causes bad importing of newer MS Word documents.
 
   http://www.openoffice.org/issues/show_bug.cgi?id=19848
 
 What's the proper way to handle this? Using hebrew hyphens or 
 something of the sorts?

IMO, the correct way of handling it is MS Word-like auto-correction.
The word processor should detect patterns like HYPHEN-MINUS BETWEEN
WORDS and change it to a UNICODE DASH character, HYPHEN-MINUS BETWEEN
NUMBERS and change it to a UNICODE MINUS character. Not all patterns
have a single solution, so it requires some thinking through. I think
it's the Right Way(tm).

Another solution is to see the keyboard language and add direction
control characters.




Re: OpenOffice BiDi kudos

2003-10-04 Thread John Rabkin
On Fri, Oct 03, 2003 at 06:44:38PM +0200, Alexander Maryanovsky wrote:
 Hi everyone,
 
 Just wanted to pass my thanks (in the hope they're listening) to anyone and 
 everyone involved with OpenOffice's new BiDi support. It's absolutely 
 perfect as far as I can see. Brackets, mixed RTL and LTR, mixed RTL and 
 numbers, dashes etc. etc. - all work exactly as expected. Just love it.
 
 
 Alexander (aka Sasha) Maryanovsky.
 
 
 

I've just today installed OpenOffice with Hebrew for the wife and must
say thank you to all that had a hand in this.

-- 
Cut your own wood and it will warm you twice
Regards, Yoni Rabkin




Re: OpenOffice BiDi kudos

2003-10-04 Thread Tzafrir Cohen
On Sat, 4 Oct 2003, Eran Tromer wrote:

 On 2003/10/03 19:44, Alexander Maryanovsky wrote:

   It's absolutely perfect as far as I can see.
   [...] mixed RTL and numbers, dashes etc. etc. -
   all work exactly as expected.

 OOe 1.1 seems to have the usual hebrew-hyphen-number problem
 (H-5 renders as H5-), which necessitates typing of the logically
 incorrect H5- and causes bad importing of newer MS Word documents.

http://www.openoffice.org/issues/show_bug.cgi?id=19848

 What's the proper way to handle this? Using hebrew hyphens or
 something of the sorts?

In addition to what Ilya wrote:

It is not a bug. It is a feature (standard conformance). Anyway, try the
keyboard variant lyx (available on XFree >= 4.3, 'setxkbmap -variant
,lyx us,il' ), and then press shift-y to get an RLM character. Type one
after the minus.
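
For text that already exists, the same RLM fix could be applied mechanically instead of by hand. The following is a minimal sketch under stated assumptions: the function name is hypothetical, Hebrew letters are taken to be U+05D0 through U+05EA, and only ASCII digits are considered.

```python
import re

RLM = "\u200f"  # RIGHT-TO-LEFT MARK

# A hyphen-minus directly after a Hebrew letter and directly before a digit.
_HEBREW_MINUS_DIGIT = re.compile(r"(?<=[\u05d0-\u05ea])-(?=[0-9])")


def add_rlm_after_minus(text):
    """Insert an RLM after the minus in hebrew+minus+digit sequences."""
    return _HEBREW_MINUS_DIGIT.sub("-" + RLM, text)
```

Running it a second time changes nothing, since the minus is then followed by the RLM rather than by a digit.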

Better still: shift-minus should give a hyphen on that variant. This
avoids the problem in the first place.

-- 
Tzafrir Cohen
mailto:[EMAIL PROTECTED]
http://www.technion.ac.il/~tzafrir





Re: OpenOffice BiDi kudos

2003-10-04 Thread Shachar Shemesh
Tzafrir Cohen wrote:

 On Sat, 4 Oct 2003, Eran Tromer wrote:

  On 2003/10/03 19:44, Alexander Maryanovsky wrote:

   It's absolutely perfect as far as I can see.
   [...] mixed RTL and numbers, dashes etc. etc. -
   all work exactly as expected.

  OOe 1.1 seems to have the usual hebrew-hyphen-number problem
  (H-5 renders as H5-), which necessitates typing of the logically
  incorrect H5- and causes bad importing of newer MS Word documents.

    http://www.openoffice.org/issues/show_bug.cgi?id=19848

  What's the proper way to handle this? Using hebrew hyphens or
  something of the sorts?

 In addition to what Ilya wrote:

 It is not a bug. It is a feature (standard conformance).

Well, maybe it's a feature of OpenOffice. It's still a bug in the standard.

Being as it is that there is no legacy way of producing a hyphen, the 
Unicode standard must accept that minus is used instead. Any attempt to 
insist on it being correct in the face of real life is simply absurd.

If standard purity was what Unicode mandated, they should have only 
defined 22 characters for Hebrew, like they did for Arabic (28). Put the 
final forms in some god-forsaken place, and have the display engine 
render them. They didn't do that, because real life dictates the fact 
that, practically, people use 27 characters for Hebrew, and changing 
that would break far too many existing applications.

Shachar

--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: OpenOffice BiDi kudos

2003-10-04 Thread Omer Zak

On Sat, 4 Oct 2003, Shachar Shemesh wrote:

 If standard purity was what Unicode mandated, they should have only
 defined 22 characters for Hebrew, like they did for Arabic (28). Put the
 final forms in some god-forsaken place, and have the display engine
 render them. They didn't do that, because real life dictates the fact
 that, practically, people use 27 characters for Hebrew, and changing
 that would break far too many existing applications.

Correction:
Hebrew is not strict about using final forms for characters at end of
words.  Examples:  abbreviations (roshei teivot) and words with special
pronunciation (Mubarak is an example which I have in mind).

The hyphen problem is not really a problem in the Unicode standard, but of
the text editors which do not automatically convert minus sign into hyphen
when the context expects it.
 --- Omer
My opinions, as expressed in this E-mail message, are mine alone.
They do not represent the official policy of any organization with which
I may be affiliated in any way.
WARNING TO SPAMMERS:  at http://www.zak.co.il/spamwarning.html





Re: OpenOffice BiDi kudos

2003-10-04 Thread Shachar Shemesh
Omer Zak wrote:

 On Sat, 4 Oct 2003, Shachar Shemesh wrote:

  If standard purity was what Unicode mandated, they should have only
  defined 22 characters for Hebrew, like they did for Arabic (28). Put the
  final forms in some god-forsaken place, and have the display engine
  render them. They didn't do that, because real life dictates the fact
  that, practically, people use 27 characters for Hebrew, and changing
  that would break far too many existing applications.

 Correction:
 Hebrew is not strict about using final forms for characters at end of
 words.  Examples:  abbreviations (roshei teivot) and words with special
 pronunciation (Mubarak is an example which I have in mind).

 The hyphen problem is not really a problem in the Unicode standard, but of
 the text editors which do not automatically convert minus sign into hyphen
 when the context expects it.

AND legacy text, AND existing implementations, AND the fact that the 
hyphen doesn't actually appear in Unicode's Hebrew encoding, etc. etc. 
It is gracious enough to offer us RLM and LRM, but not, say, 
RLE/LRE/PDF. This forces us to use a modifier where the standard claims 
we should use the right character.

ISO-8859-8 is, I think, a badly constructed codepage. What's the
point of providing Yen, pound and cent symbols, but not NIS? I'm
sorry, Windows-1255 is actually better.

 Shachar

--
Shachar Shemesh
Open Source integration consultant
Home page & resume - http://www.shemesh.biz/




Re: OpenOffice BiDi kudos

2003-10-04 Thread Shlomi Fish
On Sat, 4 Oct 2003, Shachar Shemesh wrote:

 Alexander Maryanovsky wrote:

  Hi everyone,
 
  Just wanted to pass my thanks (in the hope they're listening) to
  anyone and everyone involved with OpenOffice's new BiDi support. It's
  absolutely perfect as far as I can see. Brackets, mixed RTL and LTR,
  mixed RTL and numbers, dashes etc. etc. - all work exactly as
  expected. Just love it.
 
 
  Alexander (aka Sasha) Maryanovsky.
 
 
 Most of the kudos should go to Shoshannah Forbes, who did a wonderful job
 of nagging the Sun developers into fixing the problems.


And to the Sun developers for actually fixing them. ;-)

Regards,

Shlomi Fish


 Shachar

 --
 Shachar Shemesh
 Open Source integration consultant
 Home page & resume - http://www.shemesh.biz/







--
Shlomi Fish[EMAIL PROTECTED]
Home Page: http://t2.technion.ac.il/~shlomif/

Writing a BitKeeper replacement is probably easier at this point than getting
its license changed.

Matt Mackall on OFTC.net #offtopic.

