Re: [XeTeX] XeTeX bugs in bidirectional typesetting

2016-11-19 Thread Simon Cozens
On 20/11/2016 12:35, Vafa Khalighi wrote:
> For the past 10 years I have reported numerous issues to the LuaTeX 
> and etex teams without any results but it is more than enough. I am 
> not going to waste time and energy doing useless things.

Well, it's not necessarily useless. If both engines are using the same
model, then you have twice as many communities available to fix bugs.
And it means the bug fixes and bidi expertise can be shared between the
two communities. I would also recommend XeTeX moving to the LuaTeX model
- and then fixing it!

> I put my time and energy into developing an engine that really has a
>  working bidi model. an engine which is developed by a native speaker
>  and meets the needs of people with real documents.

XeTeX has a lot of advantages: OpenType support, a large set of
packages, and a mature community. It's a shame the bidi support is not
great; that is a known problem and there are not many people with the
expertise to make it work and do it well. As I understand it, the
problems with bidi were one of the reasons that Khaled stopped working
on XeTeX, which in a sense is a shame - he's exactly the sort of person
you need to get this right...

If you want an engine with a working bidi model, then you might want to
have a look at SILE. (https://github.com/simoncozens/sile) It uses the
Unicode bidi algorithm and so you get multilevel reordering without any
markup required. (See
https://github.com/simoncozens/sile/blob/master/examples/arabic.pdf) But
of course then you don't get the large set of packages and the mature
community...

S


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] New feature REQUEST for xetex

2016-02-22 Thread Simon Cozens
On 23/02/2016 13:54, Andrew Cunningham wrote:
> PDF/UA for instance leaves the question deliberately ambiguous.
> ActualText is the way to make the content accessible, but developers
> creating tools for PDF do not actually have to process the ActualText.

Yeah. (Sorry to keep banging the drum but) I've just done some tests
with SILE, which includes some support for tagged/accessible PDFs. Even
when the ActualText includes the correct Devanagari, I am still seeing
the same problems with cut-and-paste. I'm not sure what needs to be done
to get it right.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] New feature planned for xetex

2016-02-19 Thread Simon Cozens
On 19/02/2016 01:06, Jonathan Kew wrote:
> So... that was an interesting and thought-provoking suggestion, but at
> this point I think I'm inclined towards keeping the existing model. To
> me, it makes sense to think of these as increasing levels of support,
> rather than as independent features.

That's interesting, because when we implemented these two things in
SILE, we did it the other way around! Whole-run shaping came first, and
then space kerning came second.

There's a long and ponderous discussion of the implementation at
https://github.com/simoncozens/sile/issues/179 - but basically we first
sent the runs for shaping and then turned any spaces back into TeX-style
constant width glue nodes, and then afterwards we decided to rely on the
shaper's understanding of the space width.
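
To make that concrete, here's a rough sketch in Lua of the two approaches
(the node shapes and field names are invented for illustration, not SILE's
actual internals; the stretch/shrink ratios are picked arbitrarily):

  -- After the shaper has processed a whole run, either turn shaped spaces
  -- back into constant-width stretchable glue (the first approach), or
  -- trust the shaper's own advance for each space (the later approach).
  local function nodes_from_shaped_run(glyphs, fontSpaceWidth, trustShaper)
    local nodes = {}
    for _, g in ipairs(glyphs) do
      if g.isSpace then
        local w = trustShaper and g.advance or fontSpaceWidth
        nodes[#nodes + 1] = { type = "glue", width = w,
                              stretch = w / 2, shrink = w / 3 }
      else
        nodes[#nodes + 1] = { type = "glyph", gid = g.gid, width = g.advance }
      end
    end
    return nodes
  end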

S


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-14 Thread Simon Cozens
On 15/12/2015 04:02, Werner LEMBERG wrote:
> I guess this would need a complete rewrite of the ucharclasses package
> (so Michiel should answer :-), but yes, such an approach could solve
> the issue.
> Regardless of that, I think that \XeTeXcharclass should allow more
> than 256 registers.

Yeah, it shouldn't be necessary to rewrite everything to get around the
fact that TeX is an 8-bit system. I would try changing the bounds in
scan_char_class myself, but my build of XeTeX is... somewhat custom.

S



--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread Simon Cozens
On 14/12/2015 04:31, Jonathan Kew wrote:
> Probably, at least in principle; I don't remember the code offhand to
> know how easy/difficult this might be.

Here's the relevant code:

  p:=cur_chr; scan_usv_num;
  p:=p+cur_val;
  n:=sf_code(cur_val) mod @"10000;
  scan_optional_equals;
  scan_char_class;
  define(p,data,cur_val*@"10000 + n);

scan_char_class calls scan_int (which will scan a number up to
2147483647) and then ensures it is between 0 and 256. It's then scaled
up by << 16 and put into the table of equivalents by eq_define which
expects its final parameter to be a halfword. The maximum value of a
halfword is 1073741823, which I guess gives you a theoretical maximum of
16383 character classes.

It *might* be that if you up the maximum in scan_char_class it will all
just work right?
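
For what it's worth, the arithmetic behind that 16383 figure, as a little
Lua snippet (the names are mine, not from the WEB source):

  -- The class is packed into the upper bits of a halfword alongside the
  -- space factor code, mirroring define(p,data,cur_val*@"10000 + n).
  local HALFWORD_MAX = 2^30 - 1               -- 1073741823
  local function pack(class, sfcode)
    local packed = class * 0x10000 + sfcode
    assert(packed <= HALFWORD_MAX, "class won't fit in a halfword")
    return packed
  end
  print(math.floor(HALFWORD_MAX / 0x10000))   --> 16383, the theoretical ceiling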


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Using OpenType fonts of 2048 em-width glyphs with xelatex, using fontspec

2015-11-09 Thread Simon Cozens
On 04/11/2015 07:30, Simon Cozens wrote:
> OK, I'm happy to have a dig into this. (because it will probably affect
> my software as well... ;-) Hugo, could you send me the font and a test file?

I started to dig into this:

* The font has 2048 upem set in the head table (as one would expect).

* The FontMatrix in the CFF table is [0.000488281 0 0 0.000488281 0 0].

* dvipdfmx writes the font in Type 0C format.

* This bug was reported six years ago, and it was found that if the font
was used with plain tex and dvipdfmx (not xdvipdfmx), the font was
written in Type 1C format and all would be well.

Here's Jonathan Kew about the bug back then:
http://www.ntg.nl/pipermail/dev-luatex/2009-March/002420.html

And here is how luatex handles the situation:
http://www.ntg.nl/maps/40/07.pdf

I'm not sure what the answer is at this point. In this specific case,
converting the font to Type 1 should do it. In the general case, luatex
seems to (a) scale the font size by 1000/upem when writing PDF Tf
instructions for PS fonts and (b) reset the FontMatrix entry back to
[0.001 * * 0.001 * *] when outputting the PDF object containing the CFF
dictionary. It seems a bit kludgy, but I would be happy to have dvipdfmx do
that if that is the right thing to do. I would like some advice from
dvipdfmx-y people first, though.
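
For what it's worth, here's my rough reading of how those two adjustments
cancel out - a back-of-envelope check in Lua, not dvipdfmx or luatex code:

  -- Scale the Tf size by 1000/upem and pretend the FontMatrix is the
  -- standard 0.001; a glyph drawn at full em width still comes out at
  -- the size the document asked for.
  local upem, requested = 2048, 12
  local tfSize = requested * 1000 / upem    -- size written to the Tf operator
  local emGlyphUnits = 2048                 -- a glyph exactly one em wide
  print(emGlyphUnits * 0.001 * tfSize)      --> 12.0
  print(string.format("%.9f", 1 / upem))    --> 0.000488281, the original matrix entry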


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Using OpenType fonts of 2048 em-width glyphs with xelatex, using fontspec

2015-11-01 Thread Simon Cozens
On 02/11/2015 04:41, Hugo Roy wrote:
> The issue is that the fonts are in a 2048 em-wide PostScript size 
> instead of 1000 (I sometimes get a warning from FontForge that the 
> width should be 1000).

I think it's very unlikely that this is the problem per se - SIL
Gentium is a 2048 upem font, and I've used that with xetex and Preview
for years with no problems.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Fwd: [tex-live] XeTeX or xdvipdfmx Broken on Mac

2015-04-29 Thread Simon Cozens
On 29/04/2015 02:34, Jiang Jiang wrote:
> I just pushed a fix (r37097) that's supposed to fix the issue for
> fonts like Kohinoor Devanagari Light and HanziPen TC, which according
> to my limited testing seems to work.

Thanks - yes, this seems to work for me too! Hooray!

S



--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Simon Cozens
On 27/04/2015 19:39, Philip Taylor wrote:
 As to whether XML is a particularly good format not only here or for
 anything, all I can say is that in my experience we (humanity, that is)
 have not yet come up with anything better; LaTeX 2e, by explicitly
 permitting the conflation of form and content, fails abysmally in this
 respect (IMHO, of course).

For what it's worth, SILE's approach to this is to have pluggable input
parsers, shipping with an XML and a TeX-like parser by default.

People who use software tools to author their documents, or convert them
in from other sources, can use the XML syntax; people authoring by hand
can use the TeX-like syntax. The two syntaxes are isomorphic:

 <foo thing="wibble">bar</foo>
is equivalent to
 \foo[thing=wibble]{bar}
and is also equivalent to
 \begin[thing=wibble]{foo}bar\end{foo}

This means that if you have an XML document you want to typeset, you can
define processing expectations for its tags in an auxiliary class:

 SILE.registerCommand("foo", function(options, content)
   SILE.process(content)
   if options.thing == "wibble" then
     SILE.typeset(" (wobble)")
   end
 end)

 % Or even \define[command=foo]{\dowhatever{\process}}

and then load in the class on the command line; the upshot being you can
then feed the XML file directly to SILE without having to mess about
with XSLT or whatever.

I haven't tried creating tagged PDFs with SILE yet - there isn't support
for this in the libtexpdf library so it would mean messing about with
raw PDF specials (essentially what luatex was doing). I don't need the
functionality myself right now, so it's not a priority.

But if this is going to be a big deal, and it sounds like it might be,
then it could be worth adding specials for PDF tagging into libtexpdf
and dvipdfmx.

S


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Future *TeX [was: XeTeX maintenance]

2015-04-26 Thread Simon Cozens
On 26/04/2015 23:02, Karljurgen Feuerherm wrote:
> b) assuming a certain level of Xe(La)TeX competence at the 'presenting'
> level, what recommendations would experts on this list make to 'upping the
> ante', i.e. progress toward a more insider understanding of the software?

OK, I can answer this but I shall add a few other scattered thoughts as
well. :-)

First I also want to thank Khaled for his fantastic work in keeping
xetex maintained, bugs fixed, and users happy. Until recently I presumed
that Khaled was a professional software and/or typography person; I
didn't realise that xetex was essentially a free time project for him. I
was even more amazed by his dedication and professionalism.

Bit by bit, I seem to have picked up how TeX and xetex works, and I wish
I could help with the maintainership, but I just don't have the time at
present. (And given that I think the future of TeX is spelt SILE I
don't think it's appropriate for me to either. ;-) )

How to get to know xetex? I think the first step in moving from a user
to understanding the mechanics has to be Victor Eijkhout's TeX By Topic.
Either buy a hard copy or download it from
https://bitbucket.org/VictorEijkhout/tex-by-topic/src and read it over
four or five times. It's by far the best introduction to how the TeX
engine works.

After that you should be able to work your way through the TeXBook; read
that until you can understand the double-arrow sections.

From there, there are two directions you need to go in: the WEB program
for TeX, and the xetex extensions and all the related font handling code
and libraries that it uses.

As Joseph and Phillip have mentioned, WEB is not an easy thing to work
with, and WEB2C doesn't make it any better. But in a way there's nothing
you can do about that; TeX is the WEB source. A lot of the design
constraints of TeX in the early 80s don't apply any more; most of the
unpleasantness around WEB comes from the fact that memory is allocated
statically and that structures are hand-rolled with pointers and
offsets. Rewriting the whole thing in another language wouldn't be a
crazy idea (I've done it) and for long-term maintainability I think it's
essential - we can't go on with statically-allocated PASCAL code for
ever - but it would be a major operation outside of the bounds of
maintaining the current *TeX projects.

But the up side of that is that there's very little of the WEB code that
you actually need to mess with. Most of it Just Works and is never going
to need to change, and most of the time you can assume that if there's
a problem, it's with the xetex-specific bits, rather than with Knuth's TeX.

So after you have a conceptual understanding of how TeX works, the next
step is to run weave on source/texk/web2c/xetexdir/xetex.web [1] and
start reading. You can skim over parts 1-19, read the rest normally, and
focus most of your attention on parts 37-46. In particular, you want to
read over the parts which deal with native word nodes, which are
(basically) hboxes containing native font characters. Look up
native_word_node in the index at the back and read those sections.

Many of the XeTeX extensions call out from Pascal into C; these are
defined in the xetex.defines file. This is a bit tricky to match up
because WEB2C (I think) strips the underscores from the names in the WEB
file. So set_native_metrics in xetex.web gets turned into
setnativemetrics, which is defined by xetex.h as measure_native_node,
which you will find defined in XeTeX_ext.c - this is the key function
which, takes a Pascal memory region representing a native word node (a
bunch of Unicode characters), calls the font shaping functions on it,
and fills in the height, width, and depth of that node back into Pascal
so TeX can run its algorithms on it. Start your exploration of the C
sources from that routine, and follow all the function calls until you
understand what it's doing. At some point you will follow it down to the
harfbuzz interface in XeTeXLayoutInterface.cpp and the FontConfig
interface in XeTeXFontMgr_FC.cpp. (My feeling is that the
AAT/Mac-specific stuff is dead now, and at any rate it's easier to
understand FontConfig/harfbuzz anyway.)
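
In pseudo-Lua, the role of measure_native_node is roughly this (hypothetical
helper names, just to fix the idea before you read the real C):

  -- Shape the characters of a native word node and write the resulting
  -- dimensions back where TeX's paragraph builder expects to find them.
  -- 'shape' stands in for the HarfBuzz call; 'extents_of' sums the glyph
  -- advances and takes the largest ascender and descender.
  local function measure_native_node(node, shape, extents_of)
    local glyphs = shape(node.font, node.chars)
    node.glyphs = glyphs
    node.width, node.height, node.depth = extents_of(glyphs)
  end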

Once you get to that layer, you may be perplexed by the lack of
documentation for both harfbuzz and FontConfig. Hopefully my article at
http://www.simon-cozens.org/content/duffers-guide-fontconfig-and-harfbuzz might
help with this.

Finally, about the future of TeX. Obviously my view that a complete
rewrite is a good idea is going to be a minority report for a while yet,
so I'll stick to XeTeX and LuaTeX.

I don't know as much as I should about LuaTeX. For me, the point of
xetex is not just that it's a Unicode-compatible TeX, but also that it
supports native fonts well and that it both handles native OS fonts in a
simple way and supports shaping of complex scripts (the harfbuzz bit).
Hans has stated that LuaTeX will not include external font shaping in
the core, and Graham Douglas has done some 

Re: [XeTeX] Fwd: [tex-live] XeTeX or xdvipdfmx Broken on Mac

2015-04-26 Thread Simon Cozens
On 26/04/2015 23:18, Zdenek Wagner wrote:
> I pushed a fix in r37053, both ITF Devanagari and Kohinoor Devanagari
> works for me now.

I'm still seeing problems with Kohinoor Devanagari, even after the two
recent commits.

The problem I'm getting is that I'm looking for Kohinoor Devanagari
Light, which is index 4 in the cabinet.

At cidtype0.c:746, offset is 924, so cidtype0 calls cff_open(stream,
924, 4).

However, after reading the index of the TTC with idx =
cff_get_index(cff), (cff.c:105) idx->count is 1. This can't be right,
as there are definitely 4 fonts in the cabinet. We try to access index
n=4, and that's higher than idx->count so dvipdfmx blows up.

What I guess is happening is that, having gone to the right offset, you
are now looking at a font (Kohinoor Devanagari Light) within the cabinet.
From the point of view of that font, there are no subfonts, so index 4
is meaningless, and perhaps instead you should be calling
cff_open(stream, offset, 1). (This is only a guess.)
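
To illustrate why the index stops meaning anything once you're inside a
member font, here's roughly how a TTC ("cabinet") header is laid out - a
sketch in Lua with a made-up reader, not dvipdfmx code:

  -- Read the TrueType Collection header: 'ttcf' tag, version, numFonts,
  -- then one uint32 offset per member font. (Uses Lua 5.3's string.unpack.)
  local function read_ttc_offsets(f)
    assert(f:read(4) == "ttcf", "not a TrueType Collection")
    f:read(4)                                         -- major/minor version
    local numFonts = string.unpack(">I4", f:read(4))
    local offsets = {}
    for i = 1, numFonts do
      offsets[i] = string.unpack(">I4", f:read(4))    -- seek here for font i;
    end                                               -- past that point there
    return offsets                                    -- are no subfonts
  end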

There is something else which looks suspicious as well - in
CIDFont_type0_dofont we now have the code:

  if (sfont->type == SFNT_TYPE_TTC)
    offset = ttc_read_offset(sfont, CIDFont_get_opt_index(font));

  if ((sfont->type != SFNT_TYPE_TTC && sfont->type != SFNT_TYPE_POSTSCRIPT) ||
      sfnt_read_table_directory(sfont, offset) < 0)
    ERROR("Not a CFF/OpenType font (1)?");
  offset = sfnt_find_table_pos(sfont, "CFF ");

So offset is being read by ttc_read_offset for TTC fonts, and then it is
being unconditionally reassigned by sfnt_find_table_pos. That might be
right, but it's different to how the font is being loaded in
CIDFont_type0_open.

Simon


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Insert Smileys in XeLaTeX

2014-12-16 Thread Simon Cozens
On 16/12/2014 20:24, Marek Stepanek wrote:
> I put together a test file, based on the first example. I replaced the
> Linux Libertine O, which is not installed on my computer, with Lucida
> Grande. This font is displaying these smileys well in my text editor
> BBEdit and my Terminal too (with an other font: Menlo Regular), but not
> in the compiled XeLaTeX-file.

The problem is that in your terminal and BBEdit, the system is
performing font substitution; neither Menlo nor Lucida Grande contain
the emoji characters, so the system shows you the characters in a
different font. (Probably Apple Color Emoji)

For some reason I can't get xelatex to use Apple Color Emoji, but the
Symbola font (available from http://users.teilar.gr/~g1951d/Symbola.zip
) also contains those characters. Install that font, add in this line:

 \section{Smileys}

\fontspec{Symbola}

  -  -  -  -  -  -  -  -  ...

and it should all work.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Gentium Book / Book Basic + TeX Ligatures = memory leak

2014-11-12 Thread Simon Cozens
On 12/11/2014 17:07, Khaled Hosny wrote:
> I'll try to see what is going on here, might be a bug or just XeTeX
> trying hard to find the font from the incorrect name it is given.

Well, if it was that, then Gentium:Ligatures=TeX would suffer the same
problem. But it doesn't. Just changing the font causes the problem.

Also, it's showing classic signs of a memory leak - xetex gets
progressively slower until it's barely usable, and memory footprint
increases.

Can't reproduce on Linux though.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Gentium Book / Book Basic + TeX Ligatures = memory leak

2014-11-12 Thread Simon Cozens
On 12/11/2014 21:33, Herbert Schulz wrote:
> \setmainfont{Gentium Book Light}[Ligatures=TeX]
> although there is backward compatibility. No slow down then.

Confirmed:

\setmainfont{Gentium Basic}[Ligatures=TeX] % Fast
\setmainfont{Gentium:Ligatures=TeX} % Fast
\setmainfont{Gentium Basic:Ligatures=TeX} % Slow



--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


[XeTeX] Gentium Book / Book Basic + TeX Ligatures = memory leak

2014-11-11 Thread Simon Cozens
OK, this is an odd one. Here's my test document:

\documentclass{book}
\usepackage{fontspec}
\setromanfont{Gentium:Ligatures=TeX}
\usepackage{blindtext}

\begin{document}
\Blinddocument
\Blinddocument
\Blinddocument

\end{document}

It runs pretty well:

% time xelatex -no-pdf test
...
Output written on test.xdv (48 pages, 920024 bytes).
Transcript written on test.log.
xelatex -no-pdf test  1.96s user 0.10s system 98% cpu 2.083 total

Memory usage peaks at around 185M on this MacBook Pro. (I'm using
-no-pdf to eliminate xdvipdfmx as a source of any problems.)

If I change that font line to

\setromanfont{Gentium Book Basic:Ligatures=TeX}

Then this happens:

% time xelatex -no-pdf test
...
xelatex -no-pdf test  118.87s user 0.91s system 99% cpu 2:00.26 total

60x slowdown! Memory usage rises to 318M. Same happens with Gentium
Basic. Take off the 'Ligatures=TeX' and it's fine.

% xetex --version
XeTeX 3.14159265-2.6-0.1 (TeX Live 2014)


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


[XeTeX] SILE 0.9.0 is released

2014-09-01 Thread Simon Cozens

Hello all,
	This is not exactly xetex-related, although I think it will be of interest to 
xetex people. I have been working on a new typesetting system written from 
scratch in Lua.


	Being completely written in Lua means that the typesetter itself can be 
modified in interesting ways and solve problems which are difficult to solve 
in TeX. The first book typeset entirely in SILE has just been printed, and 
there is at least one project in progress which is typesetting the Bible (with 
complex annotations/footnotes/sidenotes/parallels etc.) using SILE.


Some more details are below.

## What is SILE?

SILE is a typesetting system. Its job is to produce beautiful printed 
documents. It’s similar to TeX, but with some ideas borrowed from InDesign, 
and written with modern technologies in mind.


## What can I do with it (that I can’t do with TeX)?

SILE allows you to

* Produce complex document layouts using frames.

* Easily extend the typesetting system in a high-level programming language. 
(Lua)

* Directly process XML to PDF without the use of XSL stylesheets.

* Typeset text on a grid.

## Getting and installing

SILE can be downloaded from [its home page][1], or directly from [the release 
page][2].


SILE is written in the Lua programming language, so you will need a Lua 
installation; it also relies on the Cairo and Pango libraries.


You will then need to run:

* `luarocks install stdlib lgi lpeg luaexpat inspect luaepnf luarepl cassowary`

Once your dependencies are installed, run

* `lua install.lua`

This will place the SILE libraries and executable in a sensible location.

Now try `sile examples/test.sil`.

## Finding out more

Please read the [full SILE manual][3] for more information about what SILE is 
and how it can help you.


## Why is this 0.9.0?

While this release is perfectly functional for typesetting complex documents, 
SILE has several technical and social goals that need to be accomplished 
before it can be considered 1.0. See the [roadmap][] for more information.


## Contact

Please report bugs and send patches and pull requests at the [github 
repository][4]. For questions, please contact the author, Simon Cozens 
si...@simon-cozens.org.


## License terms

SILE is distributed under the [MIT licence][5].

[1]: http://www.sile-typesetter.org/
[2]: https://github.com/simoncozens/sile/releases
[3]: 
https://raw.githubusercontent.com/simoncozens/sile/master/documentation/sile.pdf

[4]: https://github.com/simoncozens/sile
[5]: http://choosealicense.com/licenses/mit/
[roadmap]: https://github.com/simoncozens/sile/blob/master/ROADMAP


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


[XeTeX] Seeking short examples of complex renderings

2013-12-06 Thread Simon Cozens

Hello XeTeXers,

Sorry for a not-entirely-XeTeX-related request but I think there may be some 
merit in it for XeTeX in the future.


I have recently been toying around implementing my own layout engine, and have 
started to check that it works nicely with non-Roman scripts. This has already 
thrown up a few bugs in pango (which I'm using to do the shaping), and that's 
just with scripts that I can read. I am sure there are other problems in 
scripts I can't read.


So I thought it would be a useful thing for people like 
pango/xetex/graphite/harfbuzz/other layout and rendering tool developers to 
have a visual test suite, a collection of short strings which stress-test 
their engines in interesting ways: placement of composing characters, 
mandatory ligatures, mixed LTR/RTL, that sort of thing.


I have started putting a collection together but my own knowledge and 
experience is pretty limited. If you can contribute some short Unicode strings 
from a language you know which show an interesting rendering feature, I hope 
this will be something that can be beneficial to the text layout community as 
a whole.


The test suite at the moment is at
http://simoncozens.github.io/visual-testsuite/testsuite.html

and you can contribute via github at
http://github.com/simoncozens/visual-testsuite

(or just send me an email)

Thanks!
Simon


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Seeking short examples of complex renderings

2013-12-06 Thread Simon Cozens

On 06/12/2013 20:18, Khaled Hosny wrote:

> Both Pango and XeTeX use HarfBuzz which in turn can use Graphite, so I
> think HarfBuzz and Graphite are the proper places for these tests (and
> both already have test suites in place)


This is true, but there *are* higher level applications, like XeTeX and SILE 
and Gtk and Firefox and so on...



> Instead of comparing images, which can be affected by things unrelated
> to layout like hinting, it would be better to compare glyph IDs with or
> without glyph positioning, check HarfBuzz and Graphite test suites for
> examples.


...and so I don't think that having low-level tests obviates the need for 
high-level ones.


"My script is meant to look like *this* but your application renders it like
*that*" is as much a meaningful test as
f499fbc23865022234775c43503bba2e63978fe1.ttf:U+09B0,U+09CD,U+09A5,U+09CD,U+09AF,U+09C0:[gid1=0+1320|gid13=0+523|gid18=0+545]
- and possibly more accessible too.


And in fact it's precisely because, say, SILE uses Pango which uses Harfbuzz
which uses Graphite that it's useful to have an easy way to see who's getting it
wrong. If SILE messes up a rendering, I want to have some text I can throw at
pango-view to see if that gets it right. Sorting out the layers is important.


For instance, I see that Harfbuzz already has some tests for Hebrew vowel 
pointings, [*] but running pango-view on these tests produces erroneous 
output. So how should I describe the problem to Pango developers, other than 
by having a picture of what Pango *is* generating and what I think it *should* 
be generating...


...which is basically what I am putting together.

[*] or at least it has some files with some pointed Hebrew in it - but I don't 
see the test suite doing anything with it, nor do I see any expected shapings 
for any of the texts/ directory.



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] openType and xetex

2013-12-03 Thread Simon Cozens

On 04/12/2013 07:27, maxwell wrote:
> You'll perhaps pardon me if I don't understand that...

XeTeX is based on eTeX which is based on TeX, hence...

> 3.1415926  (I recognize this number, but?)

TeX version.

> 2.5(?)

eTeX version.

> 0.9999.3   (third, er fourth release of 0.9999)

XeTeX version.

> 2013060708 (date)

You got that one!

> Anyway, if the 0.9999.3 part is any indication, xetex is now using Harfbuzz,
> correct?


Yep.



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


[XeTeX] Contextual shaping

2013-11-27 Thread Simon Cozens

This is possibly a daft question, but...

In traditional TeX, character tokens are processed and put into boxes 
individually, with fairly primitive ligature tables. Obviously XeTeX doesn't 
do this, using Harfbuzz (or ICU or whatever) to do the shaping and layout.


My question is, if you're not showing individual characters to the shaping 
engine for it to consider, what defines how big a string of characters to 
shape at a time? Does XeTeX break at the word level and then shape a word, 
and if so what defines a word? (Chinese has no word breaks!) Or does it shape 
an entire paragraph of text at a time (!) and then box up the glyphs 
individually? Or...?


(I've tried starting at layoutChars in XeTeXLayoutInterface.cpp and working 
backwards but I can't understand where I end up: measure_native_node shapes a 
node, but what's a node?)



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] Contextual shaping

2013-11-27 Thread Simon Cozens

On 27/11/2013 21:46, Khaled Hosny wrote:

> measure_native_node is called by the WEB code (called set_native_metrics
> there), check xetex.web for collect_native:, that is where bulk of the
> work is done. Check also @Merge sequences of words using AAT.


Aha, I see it now, I think! Reading the WEB documentation for native_word_node 
helped.


So a run of letter characters in the same font are assembled into a 
native_word_node by collect_native, and then shaped by set_native_metrics. 
That makes sense - thanks!
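
If it helps anyone else following along, the rough idea in Lua (invented
field names, obviously not the WEB code):

  -- Consecutive letter characters in the same native font get merged into
  -- one word node; anything else (glue, a penalty, a font change) ends the
  -- run. Each word node is then handed to the shaper in a single call.
  local function collect_native(horizontalList)
    local out, word = {}, nil
    for _, item in ipairs(horizontalList) do
      if item.isLetter and word and word.font == item.font then
        word.text = word.text .. item.char
      elseif item.isLetter then
        word = { type = "nativeWord", font = item.font, text = item.char }
        out[#out + 1] = word
      else
        word = nil
        out[#out + 1] = item
      end
    end
    return out
  end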



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX 0.9999.0 released (OS X and png)

2013-03-28 Thread Simon Cozens
On 14/03/2013 20:56, Simon Cozens wrote:
> OS X packages work beautifully.

Except that PNG files don't get included in graphicx any more. JPEG is just
fine. Here's the build info:

Compiled with ICU version 51.1; using 51.1
Compiled with zlib version 1.2.7; using 1.2.7
Compiled with FreeType2 version 2.4.11; using 2.4.11
Compiled with Graphite2 version 1.2.1; using 1.2.1
Compiled with HarfBuzz version 0.9.13; using 0.9.13
Using Mac OS X Core Text, Cocoa & ImageIO frameworks

Should there be a libpng in there? (I don't know.) Or should this be covered
by the Mac OS Image frameworks?



--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX 0.9999.0 released

2013-03-14 Thread Simon Cozens
On 14/03/2013 01:03, Mojca Miklavec wrote:
> The binaries for many platforms (windows excluded at the moment) have
> been submitted to TLContrib now. You can use
> tlmgr --repository http://tlcontrib.metatex.org/2012 update --all
> for example.

OS X packages work beautifully.

Thank you very much, Khaled, (and Jiang Jiang and possibly others!) for all
your hard work making these important changes.

S


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] XeTeX 0.9999.0 released

2013-03-14 Thread Simon Cozens
On 14/03/2013 22:11, George N. White III wrote:
>> OS X packages work beautifully.
> Which OS X version please?

10.8.2 here.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] The future of XeTeX

2012-08-01 Thread Simon Cozens

On 31/07/2012 18:06, Keith J. Schultz wrote:
> Lua(La)TeX is a move in this direction. Modernizing TeX!!

Well, yes and no.

The problem with all the TeX engines, the elephant in the room that nobody's 
talking about, is TeX itself. Don't get me wrong, I think it's a great piece 
of code, and it's done fantastic service over the past thirty years, I use it 
every day, and I don't see myself using anything else in the near future, but...


Let's just say there's a good reason why there isn't a great deal of code 
that's still in use after thirty years.


When Don Knuth wrote TeX, he needed an embeddable programming language, a font 
specification system, a font library, an output format. Absolutely none of 
those things existed at the time, so he wrote all of them from scratch: the 
TeX language, MetaFont, TFM, DVI. Given that he was pioneering and had nothing 
else to draw upon, he did pretty damned well, but over time they have not 
exactly turned out to be the best choices.


*TeX development since 1982 has essentially been a bunch of disparate 
projects, each trying to rip out something that Don did and replace it with 
something more sensible instead. So we had NFSS and virtual fonts to remedy 
the deficiencies of MetaFont; then we had pstex and pdftex to remedy the 
deficiencies of DVI; xetex to remedy the deficiencies of TFM; luatex to remedy 
the deficiencies of the TeX language.


They've all been great hacks, but they've all been hacks.

My feeling is that it's time to accept the principle of "one to throw away"
and finally put TeX82 out to pasture. Now we are blessed with a set of
technologies which have proved themselves, which give great results on modern
systems and have support for problems which were not even on the agenda thirty
years ago. Just take your favourite scripting language, your favourite shaping
engine, and your favourite output engine, stick the Knuth-Plass box-and-glue
model, justification engine and page builder in the middle, glue them all
together, and call it something new.


Simon


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] The future of XeTeX

2012-07-30 Thread Simon Cozens

On 31/07/2012 05:05, Adam Twardoch (List) wrote:

> As I've written on this list previously, integrating HarfBuzz into XeTeX
> (as an alternative to the existing engines, i.e. ICU Layout, Graphite 1
> and ATSUI) would be very desirable.


I have been looking at XeTeX guts (after the discussion of shifting from ATSUI 
to Core Text), and also Harfbuzz guts for another typesetting project, (still 
under wraps at the moment but more details soon!) and would be very happy to 
give some time to make this happen. Unfortunately this month is insane for me 
but I should have some time in September/October.



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] ATSUI/Core Text/etc. Re: Minimalist TeX?

2012-05-23 Thread Simon Cozens

On 23/05/2012 17:31, Peter Dyballa wrote:

> On 23.05.2012 at 03:36, Simon Cozens wrote:
>
>> I'm coming very late to this party but could someone explain why Core Text
>> would be the best replacement here?
>
> ATSUI is 32-bit only in a 64-bit world


I understand that ATSUI needs to go; that wasn't the question I asked. Why 
replace it with another OSX-only API when others are available?



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


[XeTeX] ATSUI/Core Text/etc. Re: Minimalist TeX?

2012-05-22 Thread Simon Cozens

On 19/05/2012 03:22, Mojca Miklavec wrote:

> I might be wrong, but I think that currently a much more awaited patch
> would be one replacing ATSUI library calls with Core Text. I guess
> that would be way less work than replacing PDF library and bring way
> more benefits to the community.


I'm coming very late to this party but could someone explain why Core Text 
would be the best replacement here? It seems like swapping out one OS-specific 
library for another, when there are platform-independent text layout libraries 
available that could be used instead. (I don't know any XeTeX internals yet so 
I don't know what it uses on non-OSX platforms already.)


Why not replace ATSUI with e.g. SIL graphite or Pango/Harfbuzz?


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] xdCJK how to mix occasional Japanese with Chinese and English

2012-04-23 Thread Simon Cozens

On 23/04/2012 22:26, jon wrote:

> If I don't use the xeCJK package, the Japanese renders correctly in Ume
> Mincho, but the Chinese doesn't.


In which case here's a daft but working fix:

\newfontfamily \japanesefont{Ume Mincho}
\def\forjapanese{\japanesefont\makexeCJKinactive}


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] xdCJK [--xeCJK] how to mix occasional Japanese with Chinese and English

2012-04-23 Thread Simon Cozens

On 24/04/2012 11:32, jon wrote:

> Appears to be a handy way to temporarily disable a package in order to get
> something to work.


It is, but I've realised that it would obviously also turn off some of the 
more useful CJK things that xeCJK provides such as line-break handling. So if 
your little string of Japanese appears at the end of a line, it'll all go 
horribly wrong. I don't use xeCJK but when I'm mixing English and Japanese I 
use a macro like this:


\newfontinstance\japanesefont{Hiragino Mincho Pro}
\def\japanese{\XeTeXlinebreaklocale jp
\XeTeXlinebreakskip0pt plus 1pt
\japanesefont\small}

But since you are using xeCJK, (which is obviously much easier than marking up 
every single bit of Chinese!) Honda-san's suggestion is best.



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] [LeTeX] Spreadsheet::ParseExcel Error

2012-01-08 Thread Simon Cozens

On 08/01/2012 21:55, A u wrote:

> PS : I have texlive 2011, Active perl, and running on 2011 Macbook Pro


Not knowing the details, ActivePerl is probably your problem; my guess is that
it has its own location for storing modules. TeX will probably be calling the 
system Perl. I don't see much reason to use ActivePerl on OS X - Perl is 
already installed by default.


Try running this command:
 % perldoc -l Spreadsheet::ParseExcel

and compare the path with the paths that the perl run from TeX is searching. 
(/Library/Perl/Updates/5.10.0 
/System/Library/Perl/5.10.0/darwin-thread-multi-2level 
/System/Library/Perl/5.10.0 /Library/Perl/5.10.0/darwin-thread-multi-2level 
/Library/Perl/5.10.0 /Network/Library/Perl/5.10.0/darwin-thread-multi-2level 
/Network/Library/Perl/5.10.0 /Network/Library/Perl 
/System/Library/Perl/Extras/5.10.0/darwin-thread-multi-2level 
/System/Library/Perl/Extras/5.10.0 .)



--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] amsrefs and xeCJK don't play nice together

2011-07-19 Thread Simon Cozens
On 19/07/2011 10:26, Nathan Sidoli wrote:
> When I collaborate with Japanese colleagues on work in Japanese we always end
> up using pTeX, because the formatting of any of the CJK packages will never be
> satisfactory for a native reader.

Thanks. I've bitten the bullet and moved over to (u)p(la)tex. This also gives
me vertical typesetting, which I had been hoping for as well.

Of course, while that fixes my problem in this case, the incompatibility bug
between the two packages still remains.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


[XeTeX] amsrefs and xeCJK don't play nice together

2011-07-18 Thread Simon Cozens
Hello,

I'm trying to write an article in Japanese with some bibliographic 
elements
in English. I'm using xeCJK to automatically swap fonts between Japanese and
English without explicitly marking up the different languages, but it seems
that the xeCJK package eats some of the punctuation characters in my 
bibliography.

A minimal test case is attached; with the xeCJK package commented out, 
the
bibliography is correct. With it active, the commas between names and between
book title, publisher and address are all omitted for the Japanese entry.

All my TeXnical skills have atrophied through lack of use, but it seems 
like
the first element of BibSpec commands (e.g. the comma in
+{,} { }{address}
) is being eaten somewhere.

I guess the problem is probably within xecjk, because it messes around 
with
punctuation catcodes. (I have tried setting \punctstyle, but nothing helps.)
Since amsrefs seems to be the only way to do mixed-language UTF8
bibliographies nicely, it would be great if we could get this to work.

Thanks,
Simon


testcase.tex
Description: TeX document


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


[XeTeX] XeTeX's (old) vbox model

2011-02-02 Thread Simon Cozens
Hello,
This is a bit of an odd question, I think, but please humour me. I read 
from
an old mailing list message
(http://www.tug.org/pipermail/xetex/2005-March/002025.html) that XeTeX
formerly did not retrieve the height and depth of glyphs when building boxes.
I tried the code in that thread, and found that today it *is* certainly
getting those metrics - it was reporting height and depth correctly for a
range of glyphs I tried. So obviously there was a change there at some point.

However, I'm intrigued as to how, in the past, XeTeX put together 
vertical
lists, given that you would need to know height and depth measurements to
apply interline glue: (from the TeXBook)

Here are the exact rules by which TeX calculates the interline glue
between boxes: Assume that a new box of height h (not a rule box) is
about to be appended to the bottom of the current vertical list, and
let \prevdepth = p, \lineskiplimit = l, \baselineskip = (b plus y
minus z). If p <= -1000 pt, no interline glue is added. Otherwise
if b - p - h >= l, the interline glue `(b - p - h) plus y minus z'
will be appended just above the new box. Otherwise the \lineskip
glue will be appended. Finally, \prevdepth is set to the depth of
the new box.
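
In other words (my paraphrase of the rule above as a little Lua function,
with every dimension in the same units, e.g. points):

  -- p = \prevdepth, h = height of the incoming box, l = \lineskiplimit,
  -- \baselineskip = b plus y minus z, lineskip = the \lineskip glue.
  local function interline_glue(p, h, b, y, z, l, lineskip)
    if p <= -1000 then                     -- \prevdepth of -1000pt suppresses it
      return nil
    elseif b - p - h >= l then
      return { width = b - p - h, stretch = y, shrink = z }
    else
      return lineskip
    end
  end

which of course only works if you know h, hence the question.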

Did it simply use a fixed height and depth per font size, essentially 
using a
fixed leading model? If so, how did it calculate those values? And how does it
do it now - I'm presuming it gets metrics through FreeType; does it use
control boxes (FT_Glyph_Get_CBox) or the exact bounding box
(FT_Outline_Get_BBox)?

(The reason I want to know is somewhat arcane, and somewhat 
embarrassing.
I've accidentally written a typesetting system, and it's a fun thing to hack
on and I want to keep toying with it. Although it's probably unnecessary, I
believe it's not totally pointless because it's embedded in a high-level
programming language and so it's very easily scriptable. And my goodness I'm
learning a lot about how TrueType/OpenType works. The slightly longer story is
at http://www.simon-cozens.org/content/typesetting-perl)

Thanks,
Simon


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex