To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=59576





------- Additional comments from [EMAIL PROTECTED] Fri Jan 20 22:56:53 -0800 2006 -------
On Linux, you can see the problems in the first three lines of my test case.
(It's better to use Acrobat Reader to test, rather than evince, since evince has
some bugs.)

There are actually three problems:

a)  The first problem you can see in line 2.  In the cut-and-pasted text, the
single SARA AM (0E33) has turned into two SARA AMs.  What happens is that the
ICU layout engine decomposes SARA AM into NIKHAHIT (0E4D) and SARA AA (0E32).
The glyph to character mapping returned by ICU associates both the NIKHAHIT and
the SARA AA glyphs with the SARA AM character.
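
The effect described in (a) can be sketched in a few lines of Python.  This is
only a toy model, not ICU or OpenOffice.org code: the function names layout()
and extract_text() are hypothetical, and glyphs are represented by the code
points they depict, which is a simplification.

```python
# Toy model of problem (a). layout() and extract_text() are hypothetical
# names, not ICU or OpenOffice.org APIs; glyphs are modelled as code points.
NO_NEN = "\u0e19"
SARA_AM = "\u0e33"
NIKHAHIT = "\u0e4d"
SARA_AA = "\u0e32"


def layout(text):
    """Return (glyphs, glyph_to_char).

    SARA AM is decomposed into two glyphs, and both glyphs are mapped back
    to the index of the single SARA AM character they came from.
    """
    glyphs, glyph_to_char = [], []
    for index, char in enumerate(text):
        parts = [NIKHAHIT, SARA_AA] if char == SARA_AM else [char]
        for glyph in parts:
            glyphs.append(glyph)
            glyph_to_char.append(index)
    return glyphs, glyph_to_char


def extract_text(text, glyph_to_char):
    """Naive extraction: emit the mapped character once per glyph."""
    return "".join(text[index] for index in glyph_to_char)


source = NO_NEN + SARA_AM
glyphs, glyph_to_char = layout(source)
pasted = extract_text(source, glyph_to_char)
# The pasted text now contains SARA AM twice, once per glyph.
print(glyph_to_char, pasted.count(SARA_AM))
```

Because extraction works per glyph, any character that yields two glyphs with
the same character index shows up twice in the pasted text.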

b) The second problem you can see in line 1.  The last character on the line,
which is SARA AA in the PDF, has been turned into a SARA AM in the
cut-and-pasted text.  This happens because when the PDF writer implementation
sees the SARA AM character it creates an entry in the font with a glyph SARA AA
associated with Unicode character SARA AM; when it later sees the SARA AA
character, it reuses that font entry because it has the same SARA AA glyph,
even though this time the SARA AA glyph belongs to a SARA AA character.
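
The reuse described in (b) can be illustrated with a small sketch.  It assumes
a font subset that is keyed by glyph alone and keeps the first Unicode value it
was given; the class GlyphKeyedFont and the glyph name "glyph_sara_aa" are
hypothetical and do not come from the PDF writer sources.

```python
# Toy model of problem (b). GlyphKeyedFont and "glyph_sara_aa" are
# hypothetical names used for illustration only.
SARA_AM = "\u0e33"
SARA_AA = "\u0e32"


class GlyphKeyedFont:
    """Font subset that stores one Unicode value per glyph."""

    def __init__(self):
        self.unicode_for_glyph = {}

    def register(self, glyph, char):
        # The first registration wins: a later character that is drawn with
        # the same glyph silently reuses the existing entry.
        return self.unicode_for_glyph.setdefault(glyph, char)


font = GlyphKeyedFont()
first = font.register("glyph_sara_aa", SARA_AM)   # from a decomposed SARA AM
second = font.register("glyph_sara_aa", SARA_AA)  # from a real SARA AA
# The second lookup reports SARA AM, although the character was SARA AA.
print(first == SARA_AM, second == SARA_AM)
```

In this model the order of appearance decides which Unicode value a shared
glyph carries, which matches a SARA AA being pasted as SARA AM after a SARA AM
has been seen earlier.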

c) The third problem you can see in line 3.  The MAI THO (0E49) in the PDF has
turned into another SARA AM.  In this case the ICU layout engine decomposes the
SARA AM as before, then it swaps the MAI THO and NIKHAHIT glyphs: the three
characters NO NEN, MAI THO, SARA AM are mapped into four glyphs, NO NEN,
NIKHAHIT, MAI THO, SARA AA.  The glyph to character mapping generated by ICU is
[0 2 1 2], in other words it correctly and unambiguously associates the MAI THO
glyph with the MAI THO character.  However, IcuLayoutEngine::operator()
"smooths" this out to [0 2 2 2] as part of its cluster detection heuristics, so
you end up with three SARA AMs.
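
The following sketch reproduces the numbers in (c).  The actual heuristic in
IcuLayoutEngine::operator() is not shown in this report, so smooth() below is
an assumption: it forces the character indices to be non-decreasing with a
running maximum, which happens to turn [0 2 1 2] into [0 2 2 2].  The function
names are hypothetical.

```python
# Toy model of problem (c). smooth() is an assumed stand-in for the cluster
# detection heuristic; the real code may work differently.
from itertools import accumulate

NO_NEN = "\u0e19"
MAI_THO = "\u0e49"
SARA_AM = "\u0e33"


def smooth(glyph_to_char):
    """Make the character indices non-decreasing (running maximum)."""
    return list(accumulate(glyph_to_char, max))


def extract_text(text, glyph_to_char):
    """Naive extraction: emit the mapped character once per glyph."""
    return "".join(text[index] for index in glyph_to_char)


source = NO_NEN + MAI_THO + SARA_AM
icu_mapping = [0, 2, 1, 2]           # glyphs: NO NEN, NIKHAHIT, MAI THO, SARA AA
smoothed = smooth(icu_mapping)       # the MAI THO glyph loses its own character
pasted = extract_text(source, smoothed)
print(smoothed, pasted.count(SARA_AM))
```

With the unsmoothed mapping the MAI THO glyph still points at character 1;
after smoothing, the last three glyphs all point at the SARA AM character.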

Note that for the example in line 3 to work properly, when SARA AM is
decomposed the NIKHAHIT glyph in the PDF should not be associated with any
character, and the SARA AA glyph should be associated with the SARA AM
character.
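
A minimal sketch of the mapping suggested in this note, assuming that an
unassociated glyph can be represented by None and is simply skipped during
text extraction.  The helper extract_text() is hypothetical and only models
the idea; it is not the PDF writer's code.

```python
# Toy model of the suggested mapping: the NIKHAHIT glyph maps to nothing and
# the SARA AA glyph carries the SARA AM character. Names are hypothetical.
NO_NEN = "\u0e19"
MAI_THO = "\u0e49"
SARA_AM = "\u0e33"


def extract_text(text, glyph_to_char):
    """Emit the mapped character per glyph, skipping unassociated glyphs."""
    return "".join(text[index] for index in glyph_to_char if index is not None)


source = NO_NEN + MAI_THO + SARA_AM
# Glyph order after layout: NO NEN, NIKHAHIT, MAI THO, SARA AA.
suggested_mapping = [0, None, 1, 2]
pasted = extract_text(source, suggested_mapping)
# Each source character is emitted exactly once, in its original order.
print(pasted == source)
```

Under this mapping every character appears exactly once in the pasted text, so
the result matches the original three characters.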





---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

