Re: [XeTeX] XeTeX and ignore sub substitution rules

2011-11-29 Thread Khaled Hosny
On Mon, Nov 28, 2011 at 02:10:23PM -0600, msk...@ansuz.sooke.bc.ca wrote:
 Here's a stripped-down example of the problem.  The attached OTF font
 contains rules in the clig feature saying that a b should be replaced
 by a B (i.e. the b is changed to upper case) except when it is followed
 by c.  For greater clarity, the feature file is also attached.  Testing
 in FontForge's metrics window requires me to manually turn on clig
 (which should be on by default) but with the feature turned on, the
 substitution and non-substitution happen as expected.
 
 When I run the attached .tex file through XeLaTeX with the attached font,
 aba becomes aBa as it should, but abc becomes aBc, whereas
 FontForge leaves it as abc (which I think is correct).  The ignore rule
 doesn't seem to be processed by XeTeX.
 
 Confirmed on a couple of different installations, but I'd be interested to
 hear whether it happens for anyone else.  Apostolos Syropoulos sent me a
 font using ignore rules and reported that it works correctly, but I haven't
 finished testing myself whether that one works for me.

The same here, but I get the expected result (aBa, abc) with LuaTeX as
well as HarfBuzz. I tested with another application using ICU layout
engine (fontmatrix) and got the same result as XeTeX, so I think it is
an ICU bug.

Regards,
 Khaled


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Detect, whether a font contains a certain character

2011-11-29 Thread Khaled Hosny
On Mon, Nov 28, 2011 at 09:06:12AM +0100, Heiko Oberdiek wrote:
 On Mon, Nov 28, 2011 at 07:19:48AM +0000, Jonathan Kew wrote:
 
  On 28 Nov 2011, at 06:59, Heiko Oberdiek wrote:
  
   On Mon, Nov 28, 2011 at 03:07:07PM +1030, Andrew Moschou wrote:
   
   2011/11/28 Zdenek Wagner zdenek.wag...@gmail.com
   
   Put it into an \hbox and measure its width (\wd). If the width is
   zero, the glyph does not exist.
   
   
   If the required glyph doesn't exist, wouldn't this measure the .notdef
   glyph?
   
   No,
  
  Yes, it would (and .notdef may of course have non-zero width).
 
 \catcode`\{=1
 \catcode`\}=2
 \catcode`\^=7
 \showboxdepth=1
 \showboxbreadth=1
 \tracingonline=1
 \font\rm=cmr10\relax
 \rm
 \setbox0=\hbox{\kern1pt^^^^018e}
 \showbox0
 \csname @@end\endcsname\end
 
 And where is the inserted .notdef glyph?

.notdef is inserted for native fonts; for TFM fonts (as in your
example), the old TeX behaviour of inserting nothing is retained.

Regards,
 Khaled




Re: [XeTeX] XeTeX and ignore sub substitution rules

2011-11-29 Thread mskala
On Tue, 29 Nov 2011, Khaled Hosny wrote:
 The same here, but I get the expected result (aBa, abc) with LuaTeX as
 well as HarfBuzz. I tested with another application using ICU layout
 engine (fontmatrix) and got the same result as XeTeX, so I think it is
 an ICU bug.

On further testing, I don't think Apostolos's font works with XeTeX
either.  It may appear to at first glance, just because the effect of the
ignore sub rules in that font is very subtle, but if I modify the
alternate glyphs to be more obviously different, it's clear that they are
being put in in cases where the rules say they shouldn't.

I think I've also figured out just what XeTeX (presumably ICU) is doing
wrong:  it is failing to move the glyph pointer ahead on a successful
match.  As a result, later rules in the lookup still have the chance to
match again on the output of earlier rules, and ignore sub rules have
no effect.
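
As an illustration only, the difference can be modelled in a few lines of Python. This is a toy sketch, not ICU's or HarfBuzz's actual code: the rule table is an assumed reading of the feature file attached to the original message (which is not reproduced here), and the names RULES, matches and apply_lookup are made up for this example.

```python
from typing import List, Optional, Tuple

# (backtrack glyph, target glyph, lookahead glyph, replacement).
# An empty string means "no context required"; replacement None models
# an "ignore sub" rule. Assumed equivalent of:
#   ignore sub a b' c;
#   sub a b' by B;
Rule = Tuple[str, str, str, Optional[str]]

RULES: List[Rule] = [
    ("a", "b", "c", None),
    ("a", "b", "", "B"),
]


def matches(glyphs: List[str], i: int, rule: Rule) -> bool:
    """Check whether the rule matches with its target glyph at position i."""
    back, target, ahead, _ = rule
    if glyphs[i] != target:
        return False
    if back and (i == 0 or glyphs[i - 1] != back):
        return False
    if ahead and (i + 1 >= len(glyphs) or glyphs[i + 1] != ahead):
        return False
    return True


def apply_lookup(text: str, rules: List[Rule], advance_on_match: bool = True) -> str:
    """Apply the rules at each glyph position.

    advance_on_match=True: the first matching rule wins and processing
    moves on to the next glyph (the behaviour expected here).
    advance_on_match=False: later rules are still tried after a match
    (the behaviour described above for XeTeX).
    """
    glyphs = list(text)
    for i in range(len(glyphs)):
        for rule in rules:
            if matches(glyphs, i, rule):
                if rule[3] is not None:
                    glyphs[i] = rule[3]
                if advance_on_match:
                    break
    return "".join(glyphs)


print(apply_lookup("aba", RULES), apply_lookup("abc", RULES))
print(apply_lookup("aba", RULES, False), apply_lookup("abc", RULES, False))
```

With advance_on_match=True the ignore rule shields the b in abc; with it set to False the second rule still fires and the b is replaced anyway, which is the symptom reported in this thread.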

I remember that you once commented that my Terrible Secret article was
wrong because I'd documented this behaviour, and this explains the
disagreement: I was documenting what I'd observed XeTeX to do. At the
time, I hadn't tested ignore sub rules and didn't realize that this was
incorrect behaviour by XeTeX and would be a problem for ignore sub
rules.  Some of my own code, both in that article and in my actual fonts,
depends on the "later rules see output of earlier rules" behaviour and
will have to be fixed, but there's no help for that; it's more important
to have ignore sub work.

I will attempt to navigate ICU's bug tracking system and submit the bug to
them.  I don't know if XeTeX's practice is to track updates of ICU, though.
Unfortunately, it appears that in the short term I have to not only do
without ignore sub, but also do without later rules seeing the output of
earlier rules, because I need my fonts to work both with widely-deployed
XeTeX and with correct implementations.
-- 
Matthew Skala
msk...@ansuz.sooke.bc.ca People before principles.
http://ansuz.sooke.bc.ca/




Re: [XeTeX] Detect, whether a font contains a certain character

2011-11-29 Thread Heiko Oberdiek
On Tue, Nov 29, 2011 at 07:40:13AM +0000, Jonathan Kew wrote:

 On 28 Nov 2011, at 08:06, Heiko Oberdiek wrote:
 
  \catcode`\{=1
  \catcode`\}=2
  \catcode`\^=7
  \showboxdepth=1
  \showboxbreadth=1
  \tracingonline=1
  \font\rm=cmr10\relax
  \rm
  \setbox0=\hbox{\kern1pt^^^^018e}
  \showbox0
  \csname @@end\endcsname\end
  
  And where is the inserted .notdef glyph?
 
 There won't be one with cmr10: that's a TFM font, so missing chars get 
 dropped, just like in standard TeX. But if \rm is a native truetype/opentype 
 font, it'll be there:
 
 \catcode`\{=1
 \catcode`\}=2
 \catcode`\^=7
 \showboxdepth=1
 \showboxbreadth=1
 \scrollmode
 \tracingonline=1
 \font\rm=Trebuchet MS
 \rm
 \setbox0=\hbox{\kern1pt^^^^018e}
 \showbox0
 \showthe\wd0
 \end
 
 --
 
 This is XeTeX, Version 3.1415926-2.3-0.9997.5 (TeX Live 2011)
  restricted \write18 enabled.
 entering extended mode
 (./x.tex
 Missing character: There is no ?? in font Trebuchet MS!
  \box0=
 \hbox(5.45789+0.0)x6.0
 .\kern 1.0
 .\rm ??
 
 ! OK.
 l.11 \showbox0
   
  6.0pt.
 l.12 \showthe\wd0
  
  )
 
 Which tells us that the width of .notdef in Trebuchet MS is 5pt, but tells
 us nothing (from within the document - the Missing character message
 tells us externally, of course) about the presence or absence of U+018E in
 this font.

Thanks for clarifying.

To summarize: the current best approach for testing whether a glyph
exists is the following algorithm, implemented in the
macro \IfXeTeXTextCharExists. I have added a local
\tracinglostchars=0 to suppress the warning in the .log file.

\catcode`\{=1
\catcode`\}=2
\catcode`\#=6
\catcode`\^=7
\showboxdepth=1
\showboxbreadth=1
\scrollmode
\tracingonline=1

%%% Begin %%%   
\def\IfXeTeXTextCharExists#1{%
  \begingroup
    % default: select the second ("does not exist") branch
    \long\def\next##1##2{##2}%
    % or in LaTeX: \let\next\@secondoftwo
    \ifnum\XeTeXfonttype\font>0 %
      % native font: a glyph index > 0 means the character is present
      \ifnum\XeTeXcharglyph`#1>0 %
        \long\def\next##1##2{##1}%
        % or in LaTeX: \let\next\@firstoftwo
      \fi
    \else
      % TFM font: a missing character leaves the kern as the last item
      \setbox0=\hbox{%
        \tracinglostchars=0 %
        \kern1sp#1%
        \expandafter
      }%
      \ifnum\lastkern=1 %
      \else
        \long\def\next##1##2{##1}%
        % or in LaTeX: \let\next\@firstoftwo
      \fi
    \fi
  \expandafter\endgroup
  \next
}
%%% End %%%

\def\Test#1#2{%
  \begingroup
    \font\test=#1\relax
    \test
    \IfXeTeXTextCharExists{#2}{%
      \immediate\write16{YES (\detokenize{#1/#2})}%
    }{%
      \immediate\write16{NO (\detokenize{#1/#2})}%
    }%
  \endgroup
}
\Test{Trebuchet MS}{A}
\Test{cmr10}{A}
\Test{Trebuchet MS}{^^^^018e}
\Test{cmr10}{^^^^018e}

\end

  the problem is rather that an existing glyph can have width zero
  (not likely in your case)

The algorithm doesn't look at the width, which avoids that problem.

 and that there is a warning in the .log file.

Solved by a local setting of \tracinglostchars=0.

  Or what do you suggest for a general test of glyph existence?
 
 For native Unicode fonts, as I said, use \XeTeXcharglyph.

I agree, see above.

 For TFM fonts, I
 don't think the question is particularly interesting or worthwhile.

 TFM fonts do not have a standard encoding, so querying them for a
 particular character code is meaningless - you have to know the encoding
 of the font you're using in order to do anything useful with it, in which
 case you should already know what characters it supports.

If TFM fonts are used, then the encoding and character code have to
be known. But that does not answer the question of whether the character
is available in the font. There are incomplete fonts; see the
subencodings of TS1.

Testing this revealed another general problem with glyph tests: a font might
not support a glyph but provide a funny replacement instead, so
this cannot be detected at TeX level, because the glyph exists.

Yours sincerely
  Heiko Oberdiek




Re: [XeTeX] Diacritics in color (was Re: XETEX cannot access OpenType features in PUA?)

2011-11-29 Thread Aleksandr Andreev
Ross Moore writes:

 Would you be so kind as to post the PDF from this? And where does one obtain 
 the font MezenetsUnicode ?

Mezenets Unicode is a font I'm developing for Znamenny neumatic
notation and it is available here:
http://www.ponomar.net/files/mezen_uni.ttf

Attempting to encode Znamenny Notation in Unicode seems to be a
mind-boggling task because each character is able to take several
diacritical marks, each of which is governed by its own position rules
and each of which can have its own color.

The attached image shows what I am and am not able to do right now in
XeTeX and LuaTeX.

The code was:

\documentclass{article}
\usepackage{fontspec}
\usepackage{xcolor}
\usepackage{xunicode}
\usepackage{luacolor}
\newfontface\moo{MezenetsUnicode}
\begin{document}
\Huge
\moo
 \\
\textcolor{red}{} \\
\textcolor{red}{} \\
\textcolor{red}{} \\
\textcolor{red}{} \\
 \\
\end{document}

In XeTeX I am using the color package instead of luacolor.

Now, I realize that comparing TeX to Firefox is like apples to
oranges, but I'm just giving the Firefox example to show what is
supposed to happen.

Aleks
attachment: znamenny.png



Re: [XeTeX] Diacritics in color (was Re: XETEX cannot access OpenType features in PUA?)

2011-11-29 Thread Philip TAYLOR



Aleksandr Andreev wrote:


Mezenets Unicode is a font I'm developing for Znamenny neumatic notation


Oohhh, this is exciting : [p]neumatic notation as in [p]neumes
and as in [p]neumatic music ?  Will this be a first for TeX, if
you succeed ?

Philip Taylor




Re: [XeTeX] Diacritics in color

2011-11-29 Thread Heiko Oberdiek
Hi Ross,

On Wed, Nov 30, 2011 at 06:23:54AM +1100, Ross Moore wrote:

 Hi Heiko,
 
 On 29/11/2011, at 9:29 AM, Heiko Oberdiek wrote:
 
  The same also works in XeLaTeX:
  
  \documentclass{minimal}
  \usepackage{fontspec}
  \usepackage{color}
  \pagestyle{empty}
  
  \begin{document}
  \fontsize{100pt}{100pt}\selectfont
  \noindent
  ^^^^00e4\\
  a^^^^0308\\
  a\textcolor{red}{^^^^0308}
  \end{document}
  
   stream
   q 1 0 0 1 72 769.89 cm 0 G 0 g 0 G 0 g 0 G 0 g BT /F1 99.626 Tf 0 -63.86
   Td[<00a0>]TJ 0 -99.63 Td[<001c00ee>]TJ 0 -99.62 Td[<001c>]TJ ET 1 0 0 RG 1
   0 0 rg BT /F1 99.626 Tf 45.73 -263.11 Td[<00ee>]TJ ET 0 G 0 g 0 G 0 g 0 G
   0 g Q
  
   endstream
 
 
 Yes, this is OK with lower-case letters.
 But try with uppers:
 
 \begin{document}
 \fontsize{100pt}{100pt}\selectfont
 \noindent
 ^^^^00e4\\
 a^^^^0308\\
 a\textcolor{red}{^^^^0308}\a
 U\textcolor{red}{^^^^0308}\U
 
 \end{document}

or
  ^^^^00c4\\
  A^^^^0308\\
  A\textcolor{red}{^^^^0308}

As you can see, this problem is not related to color,
both XeTeX and LuaTeX fail:

 Both the vertical and horizontal position are now wrong.

The Unicode input does not seem to be normalized in either XeTeX or LuaTeX,
nor does the accent seem to know its base letter. Manually fixing
the accent position remains the (only?) option.
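
For reference, the two input forms tested here (A followed by U+0308, and U+00C4) are canonically equivalent in Unicode. A minimal Python sketch using the standard unicodedata module shows what normalization of this input would mean; it only demonstrates the Unicode operation and says nothing about what either engine does internally.

```python
import unicodedata

decomposed = "A\u0308"   # LATIN CAPITAL LETTER A + COMBINING DIAERESIS
precomposed = "\u00c4"   # LATIN CAPITAL LETTER A WITH DIAERESIS

# NFC composes base + mark into the precomposed character;
# NFD splits the precomposed character back into base + mark.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

print(nfc == precomposed)          # True
print(nfd == decomposed)           # True
print(len(decomposed), len(nfc))   # 2 1
```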

Yours sincerely
  Heiko Oberdiek




Re: [XeTeX] Diacritics in color

2011-11-29 Thread Heiko Oberdiek
On Wed, Nov 30, 2011 at 10:58:23AM +1100, Ross Moore wrote:

 On 30/11/2011, at 10:32 AM, Heiko Oberdiek wrote:
 
  or
   ^^^^00c4\\
   A^^^^0308\\
   A\textcolor{red}{^^^^0308}
  
  As you can see, this problem is not related to color,
  both XeTeX and LuaTeX fail:
 
 With this font (Latin Modern) yes. I noticed this too.
 
 But try switching the font:
 
 \documentclass{minimal}
 \usepackage{fontspec}
 \usepackage{color}
 \pagestyle{empty}
 
 \setmainfont{Charis SIL}
 
 \begin{document}
 \fontsize{100pt}{100pt}\selectfont
 \noindent
 ^^^^00e4\\
 a^^^^0308\\
 a\textcolor{red}{^^^^0308}\a
 \hbox{U^^^^0308} U\textcolor{red}{^^^^0308}\U
 
  ^^^^00c4\\
  A^^^^0308\\
  A\textcolor{red}{^^^^0308}
 
 \end{document}
 
 
 Now it *does* depend upon having the color commands.

\documentclass{minimal}
\usepackage{ifluatex}
\usepackage{fontspec}
\ifluatex
  \usepackage{luacolor}
  \pdfobjcompresslevel=0
  \pdfcompresslevel=0
\else
  \usepackage{color}
\fi

\pagestyle{empty}

\setmainfont{CharisSIL-R.ttf}

\begin{document}
\fontsize{100pt}{100pt}\selectfont
\noindent
^^^^00c4\\ % (1)
A^^^^0308\\ % (2)
A\textcolor{red}{^^^^0308}\\ % (3)
A\textcolor{red}{\hbox{^^^^0308}} % (4)
\end{document}

* LuaTeX: A + U+0308 gets combined to one glyph U+00C4, the
  color attribute of the diaeresis vanishes and the result is
  black (3). In the last line (4) \hbox prevents the recombination
  and the diaeresis is red, but misplaced.
* XeTeX: The color special prevents the glyph recombination and
  U+0308 is processed separately without knowing the base character
  (3 and 4).

Thus the workaround would be to prevent the recombination and
to fix the placement manually.
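
The XeTeX case can be pictured with a small Python toy model. This is a sketch of the behaviour as described above, not of XeTeX's actual implementation: it assumes that every colour special ends the current text run and that each run is then shaped on its own. SPECIAL, split_runs and orphan_marks are names invented for this illustration.

```python
import unicodedata
from typing import List

SPECIAL = object()  # stands in for a colour \special node


def split_runs(nodes: list) -> List[str]:
    """Split a node list into text runs; every special ends the current run."""
    runs: List[str] = []
    current = ""
    for node in nodes:
        if node is SPECIAL:
            if current:
                runs.append(current)
            current = ""
        else:
            current += node
    if current:
        runs.append(current)
    return runs


def orphan_marks(nodes: list) -> List[str]:
    """Return combining marks that start a run and so have no base in it."""
    return [run[0] for run in split_runs(nodes)
            if unicodedata.combining(run[0]) != 0]


plain = ["A", "\u0308"]                       # like line (2)
coloured = ["A", SPECIAL, "\u0308", SPECIAL]  # like line (3)

print(orphan_marks(plain))          # []
print(len(orphan_marks(coloured)))  # 1
```

In the coloured case the diaeresis ends up alone in its run, so nothing in that run tells the shaper which base letter to position it against.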

Yours sincerely
  Heiko Oberdiek




Re: [XeTeX] Diacritics in color

2011-11-29 Thread Aleksandr Andreev
Heiko Oberdiek writes:

 * LuaTeX: A + U+0308 gets combined to one glyph U+00C4, the color 
 attribute of the diaeresis vanishes and the result is black (3).

On my machine, in LuaTeX (3) results in a correctly positioned *red*
diaeresis over a black A. (4) results in a red diaeresis with
incorrect positioning, both vertical and horizontal. I'm running
Ubuntu 11.10 and TeX Live 2011.

 XeTeX: The color special prevents the glyph recombination and U+0308 is 
 processed separately without knowing the base character (3 and 4)

Yes, I can confirm this also.

 Thus the workaround would be to prevent the recombination and to fix the 
 placement manually.

Would it be possible to read the positioning data out of GPOS in order
to do this manual placement? It could then be handled by a macro.

Aleks




Re: [XeTeX] Diacritics in color

2011-11-29 Thread Khaled Hosny
On Wed, Nov 30, 2011 at 02:09:47AM +0100, Heiko Oberdiek wrote:
 On Wed, Nov 30, 2011 at 10:58:23AM +1100, Ross Moore wrote:
 
  On 30/11/2011, at 10:32 AM, Heiko Oberdiek wrote:
  
   or
    ^^^^00c4\\
    A^^^^0308\\
    A\textcolor{red}{^^^^0308}
   
   As you can see, this problem is not related to color,
   both XeTeX and LuaTeX fail:
  
  With this font (Latin Modern) yes. I noticed this too.
  
  But try switching the font:
  
  \documentclass{minimal}
  \usepackage{fontspec}
  \usepackage{color}
  \pagestyle{empty}
  
  \setmainfont{Charis SIL}
  
  \begin{document}
  \fontsize{100pt}{100pt}\selectfont
  \noindent
  ^^^^00e4\\
  a^^^^0308\\
  a\textcolor{red}{^^^^0308}\a
  \hbox{U^^^^0308} U\textcolor{red}{^^^^0308}\U
  
   ^^^^00c4\\
   A^^^^0308\\
   A\textcolor{red}{^^^^0308}
  
  \end{document}
  
  
  Now it *does* depend upon having the color commands.
 
 \documentclass{minimal}
 \usepackage{ifluatex}
 \usepackage{fontspec}
 \ifluatex
   \usepackage{luacolor}
   \pdfobjcompresslevel=0
   \pdfcompresslevel=0
 \else
   \usepackage{color}
 \fi
 
 \pagestyle{empty}
 
 \setmainfont{CharisSIL-R.ttf}
 
 \begin{document}
 \fontsize{100pt}{100pt}\selectfont
 \noindent
 ^^^^00c4\\ % (1)
 A^^^^0308\\ % (2)
 A\textcolor{red}{^^^^0308}\\ % (3)
 A\textcolor{red}{\hbox{^^^^0308}} % (4)
 \end{document}
 
 * LuaTeX: A + U+0308 gets combined to one glyph U+00C4, the
   color attribute of the diaeresis vanishes and the result is
   black (3). In the last line (4) \hbox prevents the recombination
   and the diaeresis is red, but misplaced.

It seems Charis SIL composes accented glyphs, try Gentium Basic instead
(GenBasR.ttf).

Regards,
 Khaled




Re: [XeTeX] Diacritics in color

2011-11-29 Thread Khaled Hosny
On Wed, Nov 30, 2011 at 05:10:07AM +0200, Khaled Hosny wrote:
 It seems Charis SIL composes accented glyphs,

On closer inspection, this is not entirely true: Charis SIL has AAT tables
that, among other things, compose accented glyphs, but its OpenType
tables use mark positioning. LuaTeX shouldn't be executing AAT tables, so
that is a bug (a side effect of using FontForge internally, which
can read some AAT tables and present them in an OpenType-like format).

Regards,
 Khaled

