6.3.0 Bidi implementation snag

2013-10-18 Thread Loren Brichter
I'm finishing up an implementation of the 6.3.0 Bidi algorithm and hit a snag: 
currently passing all tests but one.

After much careful re-re-reading of the spec, I can't quite figure out where 
this thing is going wrong and would really appreciate some advice, or perhaps 
word from someone who knows that this test case is in fact incorrect (unlikely 
as that may be).

The particular test case is this:
0661 0009 0028 0662 0029;2;0;2 0 1 2 1;0 1 4 3 2
(line 184 of BidiCharacterTest.txt)
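
For readers following along, here is a minimal sketch of how a line like this can be split into its fields. The field meanings (code points; paragraph direction, where 2 means auto; resolved paragraph level; resolved levels, with "x" for removed characters; visual order) are as I understand them from the header comments of BidiCharacterTest.txt. The names `BidiCharacterTest` and `parse_test_line` are made up for illustration.

```python
from typing import List, NamedTuple, Optional


class BidiCharacterTest(NamedTuple):
    code_points: List[int]
    paragraph_direction: int       # 0 = LTR, 1 = RTL, 2 = auto
    paragraph_level: int           # resolved paragraph embedding level
    levels: List[Optional[int]]    # None where the file has "x"
    visual_order: List[int]        # logical indices, left to right


def parse_test_line(line: str) -> BidiCharacterTest:
    """Split one semicolon-separated test line into typed fields."""
    fields = [f.strip() for f in line.split(";")]
    if len(fields) != 5:
        raise ValueError("expected 5 fields, got %d" % len(fields))
    cps, direction, para_level, levels, order = fields
    return BidiCharacterTest(
        code_points=[int(cp, 16) for cp in cps.split()],
        paragraph_direction=int(direction),
        paragraph_level=int(para_level),
        levels=[None if lv == "x" else int(lv) for lv in levels.split()],
        visual_order=[int(i) for i in order.split()],
    )


test = parse_test_line("0661 0009 0028 0662 0029;2;0;2 0 1 2 1;0 1 4 3 2")
print(test.levels)        # the expected levels quoted below
```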

The correct levels are [2 0 1 2 1]; however, my implementation resolves them to 
[2 1 1 2 1]: a single incorrect level, for the Segment Separator at index 1.

Below is a trace of the algorithm (apologies if this is not visible on your end 
in a monospaced font, it is much more readable that way). In stage [N1] the S 
at index 1 becomes R. This seems correct to me because the previous character 
at index 0 is AN (interpreted as R according to the spec) and the following 
character at index 2 is an R. Both preceding and following characters are the 
same direction, so the S becomes an R as well.
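
The N1 reasoning above can be sketched roughly as follows. This is a simplified illustration, not the code from my implementation: it works on a single isolating run sequence given as a list of bidi class names, and the helper names (`strong_direction`, `resolve_n1`) are hypothetical.

```python
# Classes that N1 treats as neutral-or-isolate (NI).
NI = {"B", "S", "WS", "ON", "FSI", "LRI", "RLI", "PDI"}


def strong_direction(cls):
    """Direction a class contributes for N1; EN and AN count as R."""
    if cls == "L":
        return "L"
    if cls in ("R", "EN", "AN"):
        return "R"
    return None


def resolve_n1(classes, sos, eos):
    """Give each run of NIs the direction of its surroundings when both
    sides agree. sos/eos stand in at the edges of the run sequence."""
    out = list(classes)
    n = len(out)
    i = 0
    while i < n:
        if out[i] not in NI:
            i += 1
            continue
        j = i
        while j < n and out[j] in NI:   # find the end of the NI run
            j += 1
        before = strong_direction(out[i - 1]) if i > 0 else sos
        after = strong_direction(out[j]) if j < n else eos
        if before is not None and before == after:
            for k in range(i, j):
                out[k] = before
        i = j
    return out


# Classes after N0 for the failing test, paragraph level 0 (sos = eos = L).
print(resolve_n1(["AN", "S", "R", "AN", "R"], "L", "L"))
```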

At stage [Resolving_Implicit_Levels], the R (previously S) at index 1 has an 
even level (0), so its level is increased by one (0 -> 1). I'm not entirely sure 
why the test has the level staying at 0, unless I missed something else 
completely unrelated to these two stages.
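
As a sanity check on this step, here is a small sketch of rules I1 and I2 as I read them in UAX #9. The function name `resolve_implicit_levels` is just for illustration, and the inputs are the resolved classes and levels from the trace below.

```python
def resolve_implicit_levels(classes, levels):
    """Apply I1 (even levels) and I2 (odd levels) to resolved classes."""
    out = []
    for cls, level in zip(classes, levels):
        if level % 2 == 0:
            # I1: R goes up one level, AN and EN go up two.
            if cls == "R":
                level += 1
            elif cls in ("AN", "EN"):
                level += 2
        else:
            # I2: L, EN and AN go up one level.
            if cls in ("L", "EN", "AN"):
                level += 1
        out.append(level)
    return out


# Resolved classes after N1, all at embedding level 0.
print(resolve_implicit_levels(["AN", "R", "R", "AN", "R"], [0, 0, 0, 0, 0]))
```

With these inputs the sketch gives 2 1 1 2 1, the same levels my implementation reports, so the difference from the expected result does not come from this stage.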

Any advice would be greatly appreciated, thanks very much!

Loren

---

TRACE:

[Initialization+The_Paragraph_Level]
       pe: 0
    index:    0    1    2    3    4
character: 0661 0009 0028 0662 0029
 matching:    -    -    -    -    -
bidiclass:   AN    S   ON   AN   ON
embedding:    0    0    0    0    0

[Explicit_Levels_and_Directions]
 matching:    -    -    -    -    -
bidiclass:   AN    S   ON   AN   ON
embedding:    0    0    0    0    0

[Preparations_for_Implicit_Processing]
 matching:    -    -    -    -    -
bidiclass:   AN    S   ON   AN   ON
embedding:    0    0    0    0    0

[Resolving_Weak_Types]
 matching:    -    -    -    -    -
bidiclass:   AN    S   ON   AN   ON
embedding:    0    0    0    0    0

[N0]
 matching:    -    -    4    -    -
bidiclass:   AN    S    R   AN    R
embedding:    0    0    0    0    0

[N1]
 matching:    -    -    4    -    -
bidiclass:   AN    R    R   AN    R
embedding:    0    0    0    0    0

[N2]
 matching:    -    -    4    -    -
bidiclass:   AN    R    R   AN    R
embedding:    0    0    0    0    0

[Resolving_Neutral_and_Isolate_Formatting_Types]
 matching:    -    -    4    -    -
bidiclass:   AN    R    R   AN    R
embedding:    0    0    0    0    0

[Resolving_Implicit_Levels]
 matching:    -    -    4    -    -
bidiclass:   AN    R    R   AN    R
embedding:    2    1    1    2    1



Re: 6.3.0 Bidi implementation snag

2013-10-18 Thread Loren Brichter
Ken,

Aha! Embarrassed to admit I completely missed that as I was considering 
reordering in a later stage. Looks like an easy fix, thanks so much for the 
help,

Loren

On Oct 18, 2013, at 2:37 PM, Whistler, Ken ken.whist...@sap.com wrote:

 Loren,
  
 Your implementation is fine through [Resolving_Implicit_Levels]. And rule I1
 *does* set the embedding level of the 0009 from 0 to 1.
  
 What you are missing is that rule L1 then *re*sets the level of 0009 back to
 the paragraph embedding level, i.e. 0. And that is how you get the expected
 result.
  
 Here is the relevant portion of the corresponding trace output from the
 bidiref implementation.
  
 HTH,
  
 --Ken
  
 Trace: Entering br_UBA_ResolveNeutralsByLevel [N2]
 Current State: 16
   Text:       0661 0009 0028 0662 0029
   Bidi_Class:   AN    R    R   AN    R
   Levels:        0    0    0    0    0
   Runs:LL
  
 Trace: Entering br_UBA_ResolveImplicitLevels [I1, I2]
 Current State: 17
   Text:       0661 0009 0028 0662 0029
   Bidi_Class:   AN    R    R   AN    R
   Levels:        2    1    1    2    1
   Runs:LL
  
 Trace: Entering br_UBA63_ResetWhitespaceLevels [L1]
 Current State: 18
   Text:       0661 0009 0028 0662 0029
   Bidi_Class:   AN    R    R   AN    R
   Levels:        2    0    1    2    1
   Runs:LL
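
The L1 reset Ken describes can be sketched along these lines. This is a simplified illustration for a single line of text, not the bidiref code: the name `reset_whitespace_levels` is hypothetical, and characters removed by X9 are ignored. One detail worth stressing is that L1 looks at the original bidi classes (here S for 0009), not the classes resolved by N1.

```python
SEPARATORS = {"S", "B"}                          # segment / paragraph separators
WHITESPACE_LIKE = {"WS", "FSI", "LRI", "RLI", "PDI"}


def reset_whitespace_levels(original_classes, levels, paragraph_level):
    """Rule L1 for one line: reset separators, plus any whitespace or
    isolate formatting characters before them or at the end of the line,
    to the paragraph embedding level."""
    out = list(levels)
    in_reset_run = True          # the end of the line starts a reset run
    for i in range(len(out) - 1, -1, -1):
        cls = original_classes[i]
        if cls in SEPARATORS:
            out[i] = paragraph_level
            in_reset_run = True
        elif cls in WHITESPACE_LIKE and in_reset_run:
            out[i] = paragraph_level
        else:
            in_reset_run = False
    return out


# Original classes of 0661 0009 0028 0662 0029 and the levels after I1/I2.
original = ["AN", "S", "ON", "AN", "ON"]
print(reset_whitespace_levels(original, [2, 1, 1, 2, 1], 0))
```

Walking the line backwards keeps the "preceding a separator" and "at the end of the line" cases in one pass; for the test case above it turns 2 1 1 2 1 into the expected 2 0 1 2 1.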
  
  
  