On 11 May 2011, at 23:46, Arthur Reutenauer wrote:

>> That doesn't surprise me; I'd expect you to get the font's .notdef glyph 
>> (which might be a blank space, as in this example, or a box, or some other 
>> symbol).
> 
> Thanks for the explanation, that makes sense.
> 
>> What you want is a character that has a zero-width, invisible glyph; if the 
>> font supports any of the Unicode characters such as ZWNBSP or ZWNJ or WJ or 
>> CGJ, etc., that ought to work.
> 
>  Yes, that's what I thought too, but it doesn't provide a font-independent 
> solution.
> 
>> Or character 13 (CR) is a likely bet, too.
> 
>  Note that Mojca remarked that using character 10 (LF) produced the desired 
> result in that particular font (Abyssinica SIL).  Is there any reason why one 
> would prefer the former over the latter, or why either of these characters 
> would be a safer bet in general?  I would have thought that both of them, 
> being control characters (sort of), would precisely have no glyph in most 
> fonts; after all, who would want to set a glyph for a character that's 
> supposed to indicate the end of a line of text?

Hmm, looking at Microsoft's recommendations[1], it sounds like you should be 
aiming for glyph 1 (the null glyph, which is meant to have no contours and 
zero advance width). Character codes that should map to that glyph include 
U+0000 (null), U+0008 (backspace) and U+001D (group separator). They say that 
U+000D (CR) should have a positive advance width, which is not what you want, 
although I think I recall seeing somewhat different recommendations in the 
past, perhaps from Apple.

With U+000A (LF), there's a greater risk that it will map to .notdef and show 
up as a box, I think. This certainly used to be fairly common in TrueType 
fonts, and showed up as boxes at the start of each line when a DOS-originated 
text file with <CRLF> line-ends was loaded into a classic MacOS application 
that treated <CR> alone as the line ending, and didn't filter out the <LF> 
characters.

So to sum up, I think U+0000 "ought" to work if fonts carefully follow the MS 
recommendations; if it doesn't, other control-char codes are worth a try, but 
there's no guarantee that you'll find a universal, font-independent solution.
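
As a rough illustration of "worth a try", here is a minimal Python sketch that 
classifies candidate code points for one font. It is not tied to any real font: 
the cmap and advance-width tables below are made-up sample data, and the names 
classify() and zero_width_candidates() are hypothetical. For a real font you 
would have to extract the two tables first (for instance with a font library 
such as fontTools). The check only looks at the mapping and the advance width; 
it does not verify that the glyph has no contours.

```python
# Candidate code points mentioned in this thread (names are for display only).
CANDIDATES = {
    0x0000: "NULL",
    0x0008: "BACKSPACE",
    0x000A: "LF",
    0x000D: "CR",
    0x001D: "GROUP SEPARATOR",
    0x034F: "CGJ",
    0x200C: "ZWNJ",
    0x2060: "WJ",
    0xFEFF: "ZWNBSP",
}


def classify(cmap, advances, codepoint):
    """Classify one code point as 'unmapped', 'zero-width' or 'positive-width'.

    cmap:     dict mapping code point -> glyph name
    advances: dict mapping glyph name -> advance width in font units
    """
    glyph = cmap.get(codepoint)
    if glyph is None or glyph == ".notdef":
        # No usable mapping: a renderer would fall back to .notdef.
        return "unmapped"
    if advances[glyph] == 0:
        return "zero-width"
    return "positive-width"


def zero_width_candidates(cmap, advances):
    """Return the candidate code points that map to a zero-width glyph."""
    return sorted(
        cp for cp in CANDIDATES
        if classify(cmap, advances, cp) == "zero-width"
    )


# Made-up sample tables for a font that follows the MS recommendations.
sample_cmap = {
    0x0000: ".null",
    0x0008: ".null",
    0x001D: ".null",
    0x000D: "nonmarkingreturn",
    0x0020: "space",
}
sample_advances = {
    ".notdef": 500,
    ".null": 0,
    "nonmarkingreturn": 250,
    "space": 250,
}

for cp in zero_width_candidates(sample_cmap, sample_advances):
    print("U+%04X %s" % (cp, CANDIDATES[cp]))
```

With this sample data only U+0000, U+0008 and U+001D qualify; LF is unmapped 
and CR has a positive advance. A font that ignores the recommendations would 
give a different list, which is the font-dependence described above.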

JK

[1] http://www.microsoft.com/typography/otspec/recom.htm



