Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-05-02 Thread Philippe Verdy
The email was sent from Gmail on its webmail, French edition. Maybe Gmail is causing this; this is not expected, and I don't know why Gmail transforms the text to ISO 8859-1 (without breaking the text without notice, it could have used windows-1252, which has completely superseded ISO 8859-1 along

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-05-01 Thread Richard Wordingham
On Thu, 24 Apr 2014 17:19:57 -0700 Asmus Freytag asm...@ix.netcom.com wrote: On this side show, Philippe finally is correct, because I received his message without ASCII-i-fication; he cc'd me directly, and I never saw the mangled text. It's a bit embarrassing for a Unicode mail list to not

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-05-01 Thread Asmus Freytag
This has seen off-line discussion with the mail manager and we're good. A./ On 5/1/2014 3:44 PM, Richard Wordingham wrote: On Thu, 24 Apr 2014 17:19:57 -0700 Asmus Freytag asm...@ix.netcom.com wrote: On this side show, Philippe finally is correct, because I received his message without

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-25 Thread Ilya Zakharevich
On Wed, Apr 23, 2014 at 06:15:44PM -0700, Asmus Freytag wrote: On 4/23/2014 4:41 PM, Ilya Zakharevich wrote: GREED) Given any close-delimiter marked as “non-matching”, its pre-context does not contain any open-delimiter which could match it. Here pre-context

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Asmus Freytag
On 4/23/2014 7:37 PM, Philippe Verdy wrote: Thanks for the clear reply, now I know that my example in a prior message would work appropriately with UBA: This is an [«] ARABIC EXAMPLE [»] for demonstration only. Because: - the opening guillemet is not stripped out of the context stack when

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Eli Zaretskii
Date: Thu, 24 Apr 2014 00:28:50 -0700 From: Asmus Freytag asm...@ix.netcom.com CC: k...@unicode.org, Eli Zaretskii e...@gnu.org, James Clark j...@jclark.com, unicode Unicode Discussion unicode@unicode.org On 4/23/2014 7:37 PM, Philippe Verdy wrote: Thanks for the clear reply, now I

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Philippe Verdy
2014-04-24 16:39 GMT+02:00 Eli Zaretskii e...@gnu.org: In addition, assuming that by guillemets Philippe means U+00AB and U+00BB, guillemet is THE correct name, even in English. guillemot comes from an old typographical error. If you don't want this term in English you can still use double angle

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Eli Zaretskii
From: Philippe Verdy verd...@wanadoo.fr Date: Thu, 24 Apr 2014 17:11:23 +0200 Cc: Asmus Freytag asm...@ix.netcom.com, Ilya Zakharevich nospam-ab...@ilyaz.org, k...@unicode.org, James Clark j...@jclark.com, unicode Unicode Discussion unicode@unicode.org In addition, assuming that

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Asmus Freytag
On 4/24/2014 8:20 AM, Eli Zaretskii wrote: So nothing (at least not the reason of the GC which is just an intermediate but incomplete helper) forbids the guillemets to be listed in BidiBrackets.txt. They don't satisfy the conditions for that. From BidiBrackets.txt: Philippe is incorrect once
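
A rough way to see why the guillemets fall outside the set, as a sketch only: it assumes the derivation rule stated in the header of BidiBrackets.txt (opener with General_Category Ps, closer with Pe, both with Bidi_Class ON and Bidi_Mirrored=Yes). The function name is hypothetical, and the Bidi_Mirroring_Glyph condition is left out because Python's unicodedata module does not expose it.

```python
import unicodedata


def satisfies_bracket_conditions(opener, closer):
    """Check the property conditions assumed above for a candidate pair.

    Hypothetical helper: it does not consult BidiBrackets.txt itself and
    omits the Bidi_Mirroring_Glyph check.
    """
    return (
        unicodedata.category(opener) == "Ps"         # open punctuation
        and unicodedata.category(closer) == "Pe"     # close punctuation
        and unicodedata.bidirectional(opener) == "ON"
        and unicodedata.bidirectional(closer) == "ON"
        and unicodedata.mirrored(opener) == 1
        and unicodedata.mirrored(closer) == 1
    )


# Parentheses are Ps/Pe; U+00AB and U+00BB are Pi/Pf (initial/final quote).
print(satisfies_bracket_conditions("(", ")"))
print(satisfies_bracket_conditions("\u00ab", "\u00bb"))
```

The guillemets are mirrored and have Bidi_Class ON, so it is the General_Category condition alone that excludes them in this sketch.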

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Asmus Freytag
On 4/24/2014 7:39 AM, Eli Zaretskii wrote: This is _*incorrect*_, see the text in blue/bold in the definition copied below. The second bullet in item 3 of the second second-level bullet of the third top-level bullet of BD16 clearly says that all elements that are above the matched element are

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Philippe Verdy
2014-04-24 17:20 GMT+02:00 Eli Zaretskii e...@gnu.org: From: Philippe Verdy verd...@wanadoo.fr Date: Thu, 24 Apr 2014 17:11:23 +0200 Cc: Asmus Freytag asm...@ix.netcom.com, Ilya Zakharevich nospam-ab...@ilyaz.org, k...@unicode.org, James Clark j...@jclark.com, unicode Unicode

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Doug Ewell
Re: Unclear text in the UBA (UAX#9) of Unicode 6.3 Philippe Verdy verdy underscore p at wanadoo dot fr wrote: [...] And at least your original message used and transliterations, not the actual characters. No I used the «» characters exactly like here. I absolutely never use the ASCII

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-24 Thread Asmus Freytag
on tracking this side issue. A./ On 4/24/2014 12:41 PM, Doug Ewell wrote: Re: Unclear text in the UBA (UAX#9) of Unicode 6.3 Philippe Verdy verdy underscore p at wanadoo dot fr wrote: [...] And at least your original message used and transliterations, not the actual characters. No I used

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Ilya Zakharevich
On Tue, Apr 22, 2014 at 09:06:27AM -0700, Asmus Freytag wrote: if you read UAX#9, the way the algorithm works is by pushing openers on a stack, then, on finding the first closer, going down the stack and attempting to locate a match, then, on finding a match, discarding any enclosed openers,

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Eli Zaretskii
Date: Tue, 22 Apr 2014 14:17:44 -0700 From: Ilya Zakharevich nospam-ab...@ilyaz.org Cc: asm...@ix.netcom.com, verd...@wanadoo.fr, k...@unicode.org, unicode@unicode.org, j...@jclark.com On Tue, Apr 22, 2014 at 07:08:56PM +0300, Eli Zaretskii wrote: Sorry, I do not see any definition

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Eli Zaretskii
Date: Wed, 23 Apr 2014 00:35:02 -0700 From: Ilya Zakharevich nospam-ab...@ilyaz.org Cc: Eli Zaretskii e...@gnu.org, k...@unicode.org, unicode Unicode Discussion unicode@unicode.org, James Clark j...@jclark.com On Tue, Apr 22, 2014 at 09:06:27AM -0700, Asmus Freytag wrote: if you read

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Asmus Freytag
On 4/23/2014 12:35 AM, Ilya Zakharevich wrote: On Tue, Apr 22, 2014 at 09:06:27AM -0700, Asmus Freytag wrote: if you read UAX#9, the way the algorithm works is by pushing openers on a stack, then, on finding the first closer, going down the stack and attempting to locate a match, then, on

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Ilya Zakharevich
On Wed, Apr 23, 2014 at 09:21:04AM -0700, Asmus Freytag wrote: a parsing is good if it satisfies all conditions below: 0) Some delimiters in the string are marked as “non-matching”; the rest is broken into disjoint “matched” pairs; MATCH) A “matched” pair consists of an

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Ilya Zakharevich
On Wed, Apr 23, 2014 at 06:25:53PM +0300, Eli Zaretskii wrote: I see nothing in your definition that is significantly different from our attempts. It does feel more complex, mainly because you have many more conditions, combining which in one's mind might not be easy at first reading.

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Asmus Freytag
On 4/23/2014 4:41 PM, Ilya Zakharevich wrote: On Wed, Apr 23, 2014 at 09:21:04AM -0700, Asmus Freytag wrote: a parsing is good if it satisfies all conditions below: 0) Some delimiters in the string are marked as “non-matching”; the rest is broken into disjoint “matched” pairs;

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-23 Thread Philippe Verdy
Thanks for the clear reply, now I know that my example in a prior message would work appropriately with UBA: This is an [«] ARABIC EXAMPLE [»] for demonstration only. Because: - the opening guillemet is not stripped out of the context stack when the first closing bracket is matched with the

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Asmus Freytag
On 4/21/2014 8:32 PM, Ilya Zakharevich wrote: On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote: Here's the text I supplied, with numbers added for discussion. It definitely needs some editing, but the point of the exercise would be to see what: 1. A bracket pair is a pair of

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Mark Davis ☕️
We try not to do that. There are some known holes, like RBNF. If you know of others please file a ticket. {phone} On Apr 21, 2014 9:18 PM, Doug Ewell d...@ewellic.org wrote: From: Asmus Freytag asmusf at ix dot netcom dot com wrote: In general, I heartily dislike specifications that just

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Ilya Zakharevich
On Mon, Apr 21, 2014 at 11:25:05PM -0700, Asmus Freytag wrote: On 4/21/2014 8:32 PM, Ilya Zakharevich wrote: On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote: Here's the text I supplied, with numbers added for discussion. It definitely needs some editing, but the point of the

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Eli Zaretskii
Date: Mon, 21 Apr 2014 23:25:05 -0700 From: Asmus Freytag asm...@ix.netcom.com Cc: verd...@wanadoo.fr, k...@unicode.org, Eli Zaretskii e...@gnu.org, James Clark j...@jclark.com, unicode Unicode Discussion unicode@unicode.org And I think I can even invent an example which I

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Asmus Freytag
On 4/22/2014 2:19 AM, Ilya Zakharevich wrote: I think the crucial problem is with 1( 2[ 3( 4] 5) 5b] 6) I have two possible interpretations: one matches 2 with 5b, another leaves 2 unmatched. Ilya, if you read UAX#9, the way the algorithm works is by pushing openers on a stack,
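
The stack procedure described in this thread (push openers; on a closer, walk down the stack looking for a match; on a match, discard the matched opener and any openers enclosed above it) can be sketched as follows. This is a simplified illustration, not the UAX#9 reference code: the PAIRS table is a hypothetical ASCII stand-in for BidiBrackets.txt, and the sketch ignores the stack size limit, canonical equivalents, and the restriction to a single isolating run sequence.

```python
# Hypothetical stand-in for the bracket data; real data is in BidiBrackets.txt.
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def match_brackets(text):
    """Return sorted (open_pos, close_pos) pairs found by the stack walk."""
    stack = []   # entries are (expected closer, position of opener)
    pairs = []
    for pos, ch in enumerate(text):
        if ch in PAIRS:
            stack.append((PAIRS[ch], pos))
        elif ch in CLOSERS:
            # Search from the top of the stack downwards for a matching opener.
            for depth in range(len(stack) - 1, -1, -1):
                if stack[depth][0] == ch:
                    pairs.append((stack[depth][1], pos))
                    # Pop the match and every opener enclosed above it.
                    del stack[depth:]
                    break
            # No match: the closer is left unmatched and the stack is unchanged.
    return sorted(pairs)


# Same bracket shape as the example above, with 0-based positions.
print(match_brackets("([(])])"))  # [(0, 4), (1, 3)]
```

Under this sketch the second opener pairs with the first "]" that follows it, the enclosed "(" is discarded unmatched, and the last two closers find nothing left on the stack.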

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Eli Zaretskii
Date: Mon, 21 Apr 2014 20:32:15 -0700 From: Ilya Zakharevich nospam-ab...@ilyaz.org Cc: verd...@wanadoo.fr, k...@unicode.org, Eli Zaretskii e...@gnu.org, unicode Unicode Discussion unicode@unicode.org, James Clark j...@jclark.com Sorry, I do not see any definition here. Just a

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Asmus Freytag
On 4/22/2014 9:02 AM, Eli Zaretskii wrote: an resolve it, so we match 1) and 6). But that's wrong, isn't it? Yes, brain fart. I agree, but let me try to say the same more concisely: A bracket pair is a pair of an opening paired bracket and a closing paired bracket characters

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Eli Zaretskii
Date: Tue, 22 Apr 2014 09:06:27 -0700 From: Asmus Freytag asm...@ix.netcom.com CC: Eli Zaretskii e...@gnu.org, k...@unicode.org, unicode Unicode Discussion unicode@unicode.org, James Clark j...@jclark.com I believe that your scheme does not match the PBA in that it assumes that

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Eli Zaretskii
Date: Tue, 22 Apr 2014 09:52:43 -0700 From: Asmus Freytag asm...@ix.netcom.com CC: nospam-ab...@ilyaz.org, verd...@wanadoo.fr, k...@unicode.org, j...@jclark.com, unicode@unicode.org I agree, but let me try to say the same more concisely: A bracket pair is a pair of an opening

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Asmus Freytag
On 4/22/2014 10:11 AM, Eli Zaretskii wrote: Date: Tue, 22 Apr 2014 09:52:43 -0700 From: Asmus Freytag asm...@ix.netcom.com CC: nospam-ab...@ilyaz.org, verd...@wanadoo.fr, k...@unicode.org, j...@jclark.com, unicode@unicode.org I agree, but let me try to say the same more concisely: A

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Ilya Zakharevich
On Tue, Apr 22, 2014 at 07:08:56PM +0300, Eli Zaretskii wrote: Sorry, I do not see any definition here. Just a collection of words which looks like a definition, but only locally… Any definition is just a collection of words, of course. Can you tell what is missing from this collection

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-22 Thread Asmus Freytag
On 4/22/2014 2:17 PM, Ilya Zakharevich wrote: On Tue, Apr 22, 2014 at 07:08:56PM +0300, Eli Zaretskii wrote: Sorry, I do not see any definition here. Just a collection of words which looks like a definition, but only locally… Any definition is just a collection of words, of course. Can you

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/20/2014 6:54 PM, James Clark wrote: On Mon, Apr 21, 2014 at 2:58 AM, Asmus Freytag asm...@ix.netcom.com wrote: On 4/20/2014 3:24 AM, Eli Zaretskii wrote: Would someone please help understand the following subtleties and obscure language in the UBA

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Eli Zaretskii
Date: Sun, 20 Apr 2014 12:58:23 -0700 From: Asmus Freytag asm...@ix.netcom.com On 4/20/2014 3:24 AM, Eli Zaretskii wrote: Would someone please help understand the following subtleties and obscure language in the UBA document found at http://www.unicode.org/reports/tr9/? Thanks in

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Eli Zaretskii
From: James Clark j...@jclark.com Date: Mon, 21 Apr 2014 08:54:34 +0700 Cc: Eli Zaretskii e...@gnu.org, unicode@unicode.org, Kenneth Whistler k...@unicode.org X6. For all types besides B, BN, RLE, LRE, RLO, LRO, PDF, RLI, LRI, FSI, and PDI: . Set the current

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Eli Zaretskii
Date: Sun, 20 Apr 2014 23:03:20 -0700 From: Asmus Freytag asm...@ix.netcom.com CC: Eli Zaretskii e...@gnu.org, unicode@unicode.org, Kenneth Whistler k...@unicode.org Note that the current embedding level is not changed by this rule. What does this last sentence mean by

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/21/2014 1:33 AM, Eli Zaretskii wrote: Date: Sun, 20 Apr 2014 23:03:20 -0700 From: Asmus Freytag asm...@ix.netcom.com CC: Eli Zaretskii e...@gnu.org, unicode@unicode.org, Kenneth Whistler k...@unicode.org Note that the current embedding level is not changed by this rule.

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/21/2014 12:55 AM, Eli Zaretskii wrote: in some places, I concur with you that the wording could be improved and that such improved wording should be proposed to the UTC (or its editorial committee) for incorporation into a future update. How do we do that? You file a problem report using

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Philippe Verdy
There are some cases where these rules will not be clear enough. Look at the following where overlaps do occur; but directionality still matters: This is an [«] example [»] for demonstration only. There are two parsings possible if you just consider a hierarchic layout where overlaps are disabled:

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
Philippe, I fail to understand how your post contributes to the topic. The issue was unclear wording of the specification, not deficiencies in the UBA or the PBA in general. Let's keep this discussion limited to issues of wording for the *existing* specification. Feel free to start a new

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Doug Ewell
From: Asmus Freytag asmusf at ix dot netcom dot com wrote: In general, I heartily dislike specifications that just narrate a particular implementation... I agree completely. I see this with CLDR as well; there is a more or less implicit assumption that I will be using ICU to implement whatever

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Philippe Verdy
It is on topic because the proposed description attempts to explain how paired brackets should match and how this will then affect the rendering in bidirectional contexts. This is exactly the kind of thing that is difficult because the proposed description assumes that paired brackets are

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/21/2014 11:23 AM, Philippe Verdy wrote: It is on topic because the proposed description attempts to explain how paired brackets should match and how this will then affect the rendering in bidirectional contexts. This is exactly the kind of thing that is difficult because the proposed

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/21/2014 11:14 AM, Doug Ewell wrote: From: Asmus Freytag asmusf at ix dot netcom dot com wrote: In general, I heartily dislike specifications that just narrate a particular implementation... I agree completely. I see this with CLDR as well; there is a more or less implicit assumption that

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Philippe Verdy
My intent was not to demonstrate a bug in the algorithm, I have not even claimed that, but to make sure that (less common) usages of paired brackets that do not obey a pure hierarchy (because these notations use different types of brackets, they are not ambiguous) but still preserve their left

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/21/2014 1:54 PM, Philippe Verdy wrote: My intent was not to demonstrate a bug in the algorithm, I have not even claimed that, but to make sure that (less common) usages of paired brackets that do not obey a pure hierarchy (because these notations use different types of brackets, they

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Ilya Zakharevich
On Mon, Apr 21, 2014 at 02:44:14PM -0700, Asmus Freytag wrote: On 4/21/2014 1:54 PM, Philippe Verdy wrote: My intent was not to demonstrate a bug in the algorithm, I have not even claimed that, but to make sure that (less common) usages of paired brackets that do not obey a pure hierarchy

RE: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Whistler, Ken
Ilya noted: [Below, I completely ignore BIDI part of the specification, and concentrate ONLY on the parens match. I do not understand why this question is interlaced with BIDI determination; I trust that it is.] Actually, it is, because the bracket-matching is really only

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
Ilya, I appreciate your taking the time to take apart Philippe's message. That aspect of it was not obvious to me. A./ PS: more comments below On 4/21/2014 4:41 PM, Ilya Zakharevich wrote: On Mon, Apr 21, 2014 at 02:44:14PM -0700, Asmus Freytag wrote: On 4/21/2014 1:54 PM, Philippe Verdy

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Asmus Freytag
On 4/21/2014 5:44 PM, Whistler, Ken wrote: So one may ask: what will be the result of the CURRENT UNICODE parsing applied to Philippe’s example? This is an [«] example [»] for demonstration only. That is easily answered. Let's crank up the bidi reference code with a shorter example

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-21 Thread Ilya Zakharevich
On Mon, Apr 21, 2014 at 06:08:12PM -0700, Asmus Freytag wrote: Here's the text I supplied, with numbers added for discussion. It definitely needs some editing, but the point of the exercise would be to see what: 1. A bracket pair is a pair of characters consisting of an opening

Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-20 Thread Eli Zaretskii
Would someone please help understand the following subtleties and obscure language in the UBA document found at http://www.unicode.org/reports/tr9/? Thanks in advance. 1. In paragraph 3.1.2, near its very end, we have this sentence (with my emphasis): As rule X10 will specify, an isolating

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-20 Thread Asmus Freytag
On 4/20/2014 3:24 AM, Eli Zaretskii wrote: Would someone please help understand the following subtleties and obscure language in the UBA document found at http://www.unicode.org/reports/tr9/? Thanks in advance. Eli, I've tried to give you some explanations - in some places, I concur with

Re: Unclear text in the UBA (UAX#9) of Unicode 6.3

2014-04-20 Thread James Clark
On Mon, Apr 21, 2014 at 2:58 AM, Asmus Freytag asm...@ix.netcom.com wrote: On 4/20/2014 3:24 AM, Eli Zaretskii wrote: Would someone please help understand the following subtleties and obscure language in the UBA document found at http://www.unicode.org/reports/tr9/? Thanks in advance.