On 7 May 2015 at 02:07, Ross Moore ross.mo...@mq.edu.au wrote:
Hi David,
..
No disagreement to this.
OK:-)
In the current versions d835dc00 is two characters in luatex
and one character in xetex
as the implementation detail that xetex's underlying storage is mostly
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
points above 65535: a pair of them makes up one Unicode character;
however they're not meant to be
On 6 May 2015 at 23:04, Arthur Reutenauer
arthur.reutena...@normalesup.org wrote:
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
points above
The character itself, as bytes that is, is not wrong and users should be able
to create these.
But preferably through macros that ensure that they come correctly paired.
Placing two character tokens representing a surrogate pair should not,
though, magically turn itself into a single character.
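
For readers following the thread, the relationship between a surrogate pair and the single code point it stands for can be sketched in a few lines of Python. This is only the standard UTF-16 arithmetic, not code from XeTeX or LuaTeX; the function names are made up for illustration. It uses the d835dc00 pair and the value 65537 mentioned in this thread.

```python
# Illustrative sketch only: standard UTF-16 surrogate arithmetic.
# Function names are hypothetical; nothing here is taken from XeTeX or LuaTeX.

HIGH_START = 0xD800  # first high (leading) surrogate
LOW_START = 0xDC00   # first low (trailing) surrogate
LOW_END = 0xDFFF     # last low surrogate


def is_surrogate(cp):
    """Return True if cp is one of the 2048 surrogate code points."""
    return HIGH_START <= cp <= LOW_END


def combine_surrogates(high, low):
    """Map a (high, low) surrogate pair to the code point it encodes."""
    if not (HIGH_START <= high < LOW_START):
        raise ValueError("not a high surrogate: %#x" % high)
    if not (LOW_START <= low <= LOW_END):
        raise ValueError("not a low surrogate: %#x" % low)
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def split_code_point(cp):
    """Map a code point in U+10000..U+10FFFF to its surrogate pair."""
    if not (0x10000 <= cp <= 0x10FFFF):
        raise ValueError("code point has no surrogate pair: %#x" % cp)
    offset = cp - 0x10000
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


print(hex(combine_surrogates(0xD835, 0xDC00)))     # 0x1d400
print([hex(u) for u in split_code_point(65537)])   # ['0xd800', '0xdc01']
```

Requiring both halves in one call, and rejecting anything that is not a well-formed pair, mirrors the suggestion above that surrogates be produced through macros that guarantee correct pairing.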
Hi Arthur,
On 07/05/2015, at 8:04, Arthur Reutenauer arthur.reutena...@normalesup.org
wrote:
While working on these bugs, we also discussed how surrogate
characters were handled in XeTeX. Surrogate characters are the 2048
code points that are used in UTF-16 to encode characters with code
Hi David,
On 07/05/2015, at 9:26 AM, David Carlisle wrote:
The character itself, as bytes that is, is not wrong and users should be
able to create these.
But preferably through macros that ensure that they come correctly paired.
placing two character tokens representing a surrogate pair
On 4 May 2015 at 16:27, Jonathan Kew jfkth...@gmail.com wrote:
...
A fix for this bug, so that \string generates single Unicode characters
even for values above U+, is currently on the utf16-issues branch in
the XeTeX repository on sourceforge.[1]
A bug with characters above U+
On 23/4/15 20:59, David Carlisle wrote:
I can confirm that \string does convert character tokens
to two tokens giving the UTF-16 representation.
With the attached file luatex produces
90,33
34,33
233,33
233,33
65530,33
65537,33
65537,33
which is in each case the unicode value of the