[sw-issues] [Issue 69945] Accessible text implementation appears to use byte offsets rather than character offsets

williewalker Fri, 06 Oct 2006 06:10:44 -0700

To comment on the following update, log in, then open the issue:
http://www.openoffice.org/issues/show_bug.cgi?id=69945

------- Additional comments from [EMAIL PROTECTED] Fri Oct  6 06:10:44 -0700 2006 -------
Thanks for the explanation of the -1.  It's not something we've come across
before, so it was a bit surprising.  It makes sense now.

With respect to the 'â' character, one needs to dig just a little bit more.  I
converted the decimal values of the bytes in the sample output to hexadecimal,
and then created a string from them in the Python interpreter:

>>> a="\xe2\x97\x8f\x69\x74\x65\x6d"
>>> a
'\xe2\x97\x8fitem'

Now, when I treat this as a UTF-8 string and convert it to unicode:

>>> a.decode("UTF-8")

I get this:

u'\u25cfitem'

When I look up 25CF in my handy dandy "Unicode Standard, Version 2.0" book,
which I usually use as a foot rest, I find that it maps to "BLACK CIRCLE", which
is pretty much what the particular bullet in the test document is.  So...that
looks good.  I probably could have made this conversion easier in the sample
program by doing this:

    print "TEXT AT CARET IS (%s)" % string.decode("UTF-8")
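
As a rough sketch of why this matters for the offsets themselves: the same
seven bytes hold only five characters, so a byte offset and a character offset
disagree for everything after the bullet. This is written for Python 3 rather
than the Python 2 session above, and byte_offset_to_char_offset is a
hypothetical helper made up for illustration, not part of any accessibility
API.

```python
import unicodedata

# Python 3 sketch; the interpreter session above is Python 2.
raw = b"\xe2\x97\x8fitem"      # the bytes from the sample output
text = raw.decode("utf-8")     # bullet (U+25CF) followed by "item"


def byte_offset_to_char_offset(data, byte_offset):
    """Hypothetical helper: map a byte offset into UTF-8 data to a
    character offset by decoding only the bytes before it."""
    return len(data[:byte_offset].decode("utf-8"))


print(len(raw))                             # number of bytes
print(len(text))                            # number of characters
print(unicodedata.name(text[0]))            # Unicode name of the bullet
print(byte_offset_to_char_offset(raw, 3))   # "i" sits at byte 3
```

Here the "i" sits at byte offset 3 but character offset 1. The helper assumes
the byte offset falls on a character boundary; an offset in the middle of the
three-byte bullet would raise UnicodeDecodeError.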

I'm a bit confused about the terms "CWS atkbridge4" and "m186".  How do these
map to what we can get as a download?  


---------------------------------------------------------------------
Please do not reply to this automatically generated notification from
Issue Tracker. Please log onto the website and enter your comments.
http://qa.openoffice.org/issue_handling/project_issues.html#notification

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

