from:"Joseph Wright"

Re: [XeTeX] I want to commit to xetex, how to make a pull request?

2022-12-13 Thread Joseph Wright

Ah, right.

This looks like cross-engine work, which mainly happens in the TL repo.
It will need to be ported back to the SourceForge repo at some stage I
guess.

Personally, I'd just use the TL repo and leave that sort of admin to
Karl or the people with write access on the SourceForge repo.

Joseph

On 13/12/2022 11:18, Tuff Contender wrote:

Ummm stupid of me. The correct url is
https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/xetexdir/xetex.ch?r1=55885&r2=57724
, a commit on 2021-02-13.

On Tue, Dec 13, 2022 at 7:06 PM Joseph Wright <
joseph.wri...@morningstar2.co.uk> wrote:

On 13/12/2022 11:03, Tuff Contender wrote:

On Sat, Dec 10, 2022 at 6:22 PM Joseph Wright <
joseph.wri...@morningstar2.co.uk> wrote:

On 09/12/2022 19:07, Tuff Contender wrote:

On 08/12/2022 09:22, Tuff Contender wrote:

The code on XeTeX - Unicode-based TeX / Code / [bc89c7] (sourceforge.net)
<https://sourceforge.net/p/xetex/code/ci/master/tree/> seems not up to
date, since the last modification is on 2020-01-20.
[image: image.png]
Where should I submit a merge request to?

What makes you think it's not up-to-date? (Other than some TL version
strings, I imagine this is the same code as in TL, etc.) That said, I'd
likely look to send a patch in the first instance to TL, as it tends to
be the place that any changes actually happen.

I viewed the page https://sourceforge.net/p/xetex/code/ci/master/tree/
and found the last change is on 2020-01-20, here's the snapshot

https://tug.org/pipermail/xetex/attachments/20221208/db8bad7f/attachment-0001.png

Sure: I only meant that as far as I know, there have been no changes in
XeTeX since then. I was wondering why you thought there might be.

Joseph

Sorry for not getting the idea.

The last significant commit in `xetex.ch` is

https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/tex.ch?r1=63916&r2=64547

on 2022-09-29, which is much later than the one on sf.

That's tex.ch, not xetex?

Joseph

Re: [XeTeX] I want to commit to xetex, how to make a pull request?

2022-12-13 Thread Joseph Wright


On 13/12/2022 11:03, Tuff Contender wrote:

On Sat, Dec 10, 2022 at 6:22 PM Joseph Wright <
joseph.wri...@morningstar2.co.uk> wrote:


On 09/12/2022 19:07, Tuff Contender wrote:


On 08/12/2022 09:22, Tuff Contender wrote:

The code on XeTeX - Unicode-based TeX / Code / [bc89c7] (sourceforge.net)
<https://sourceforge.net/p/xetex/code/ci/master/tree/> seems not up to
date, since the last modification is on 2020-01-20.
[image: image.png]
Where should I submit a merge request to?


What makes you think it's not up-to-date? (Other than some TL version
strings, I imagine this is the same code as in TL, etc.) That said, I'd
likely look to send a patch in the first instance to TL, as it tends to
be the place that any changes actually happen.

I viewed the page https://sourceforge.net/p/xetex/code/ci/master/tree/
and found the last change is on 2020-01-20, here's the snapshot


https://tug.org/pipermail/xetex/attachments/20221208/db8bad7f/attachment-0001.png

Sure: I only meant that as far as I know, there have been no changes in
XeTeX since then. I was wondering why you thought there might be.

Joseph

Sorry for not getting the idea.


The last significant commit in `xetex.ch` is
https://www.tug.org/svn/texlive/trunk/Build/source/texk/web2c/tex.ch?r1=63916&r2=64547
on 2022-09-29, which is much later than the one on sf.




That's tex.ch, not xetex?

Joseph

Re: [XeTeX] I want to commit to xetex, how to make a pull request?

2022-12-10 Thread Joseph Wright


On 09/12/2022 19:07, Tuff Contender wrote:


On 08/12/2022 09:22, Tuff Contender wrote:

The code on XeTeX - Unicode-based TeX / Code / [bc89c7] (sourceforge.net)
 seems not up to
date, since the last modification is on 2020-01-20.
[image: image.png]
Where should I submit a merge request to?


What makes you think it's not up-to-date? (Other than some TL version
strings, I imagine this is the same code as in TL, etc.) That said, I'd
likely look to send a patch in the first instance to TL, as it tends to
be the place that any changes actually happen.

I viewed the page https://sourceforge.net/p/xetex/code/ci/master/tree/
and found the last change is on 2020-01-20, here's the snapshot
https://tug.org/pipermail/xetex/attachments/20221208/db8bad7f/attachment-0001.png


Sure: I only meant that as far as I know, there have been no changes in 
XeTeX since then. I was wondering why you thought there might be.


Joseph

Re: [XeTeX] I want to commit to xetex, how to make a pull request?

2022-12-08 Thread Joseph Wright


On 08/12/2022 09:22, Tuff Contender wrote:

The code on XeTeX - Unicode-based TeX / Code / [bc89c7] (sourceforge.net)
 seems not up to
date, since the last modification is on 2020-01-20.
[image: image.png]
Where should I submit a merge request to?


What makes you think it's not up-to-date? (Other than some TL version 
strings, I imagine this is the same code as in TL, etc.) That said, I'd 
likely look to send a patch in the first instance to TL, as it tends to 
be the place that any changes actually happen.



Still there are some other problems related to commits to the project.
1. It is advised to make patches in the `xetex.ch` file, not `xetex.web`.
But I can find changes in both `*tex.web` and `*tex.ch`. So which one am I
supposed to modify?


I think both to keep them in line with each other.


2. In `.ch` files, what's those after "@x" and "@y", are they comments? How
do I write them?


Yes, they are there for humans to understand what's going on.
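
To illustrate the layout being asked about: a WEB change file is a sequence of hunks, each delimited by lines that start with @x, @y and @z. The lines between @x and @y must match the .web source, the lines between @y and @z replace them, and whatever follows a marker on the same line is free text for human readers. Below is a minimal sketch in Python that splits such a file into hunks. The function name and the sample hunk are made up for illustration and are not taken from xetex.ch.

```python
def parse_change_file(text):
    """Split WEB change-file text into hunks.

    Each hunk is a dict holding the free-text comment after @x, the lines
    to match in the .web file, and the lines that replace them.
    """
    hunks = []
    state = None          # None (between hunks), "old" or "new"
    current = None
    for line in text.splitlines():
        marker = line[:2].lower()
        if marker == "@x" and state is None:
            current = {"comment": line[2:].strip(), "old": [], "new": []}
            state = "old"
        elif marker == "@y" and state == "old":
            state = "new"
        elif marker == "@z" and state == "new":
            hunks.append(current)
            state = None
        elif state is not None:
            current[state].append(line)
        # Lines outside any @x...@z hunk are ignored.
    return hunks


# Hypothetical hunk, for illustration only.
sample = """\
@x [1] example: change the banner
@d banner=='This is TeX'
@y
@d banner=='This is MyTeX'
@z
"""

hunks = parse_change_file(sample)
print(hunks[0]["comment"])
```

The point of the sketch is only that the text after @x never takes part in matching, which is why it can carry a section number or a short note.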

What changes are you considering suggesting?

Joseph

Re: [XeTeX] Uppercase in Armenian

2022-05-01 Thread Joseph Wright


On 30/04/2022 23:52, David Carlisle wrote:

Something like this, I think.

[image: image.png]

\documentclass{article}
\usepackage{polyglossia}

\setdefaultlanguage{armenian}
\setmainfont{DejaVu Sans}
\ExplSyntaxOn
\let\tuppercase\text_uppercase:n
\ExplSyntaxOff
\pagestyle{empty}
\begin{document}
Երևան $\rightarrow$ \uppercase{Երևան}

Երևան $\rightarrow$ \tuppercase{Երևան}

\end{document}

David


The next expl3 release will include hy-x-yiwn as a language setting,
allowing


   \newcommand\tuppercase{\text_uppercase:nn{hy-x-yiwn}}

in David's example - this variant will use the alternative mapping.

Joseph

Re: [XeTeX] Uppercase in Armenian

2022-05-01 Thread Joseph Wright


On 01/05/2022 13:10, Jonathan Kew wrote:

Hi Zdeněk,

Checking the Unicode character database[1], U+0587 is listed as having a 
*compatibility* decomposition to <0565,0582> (not 0587):


0587;ARMENIAN SMALL LIGATURE ECH YIWN;Ll;0;L;<compat> 0565 0582;;;;N;;;;;

Likewise, the SpecialCasing.txt file[2] that defines case mappings other 
than simple 1:1 substitutions shows the same decomposition for the 
uppercase form:


0587; 0587; 0535 0582; 0535 0552; # ARMENIAN SMALL LIGATURE ECH YIWN

So if I understand correctly, what \text_uppercase:n is doing is simply 
implementing what the Unicode standard defines.
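
The two data entries quoted above can be cross-checked against the copy of the Unicode Character Database that ships with Python. This is only a sanity check of the standard's data, not a description of how expl3 stores or applies it.

```python
import unicodedata

ech_yiwn = "\u0587"  # ARMENIAN SMALL LIGATURE ECH YIWN

# Compatibility decomposition, as recorded in UnicodeData.txt.
decomposition = unicodedata.decomposition(ech_yiwn)
print(decomposition)

# Full (one-to-many) uppercase mapping, as recorded in SpecialCasing.txt.
upper_points = [f"{ord(c):04X}" for c in ech_yiwn.upper()]
print(upper_points)
```

The uppercase result is two code points, 0535 and 0552, which is the default behaviour the thread is discussing.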


If this isn't the appropriate behavior, at least for some locales, I 
believe that will need custom programming at some level, but I don't 
know enough about it to get into any details.


Indeed: we will add support for alternative casing for Armenian to 
\text_uppercase:nn shortly.


Joseph

Re: [XeTeX] Colour specials for XeTeX

2020-08-07 Thread Joseph Wright


On 07/08/2020 16:43, morris roger wrote:

There are much simpler ways of adding colour; see
https://ctan.org/pkg/do-it-yourself-tex
where I include examples using opmac
Roger H-F,
Ottawa


That's still wrappers around the same specials.

Joseph

Re: [XeTeX] A LaTeX Unicode initialization desire/question/suggestion

2020-01-13 Thread Joseph Wright


On 13/01/2020 03:41, Doug McKenna wrote:

| load-unicode-data handles some of the reading, but there is additional
| reading  (see l3unicode.dtx) that is in expl3.sty (in current xelatex
| formats) but will be preloaded in future releases and in the current
| xelatex-dev release as noted above.


I tried looking at, e.g., l3unicode.dtx, and it's still using TeX (or 
impenetrable LaTeX3 kernel language built on top) to parse the official Unicode 
data files.


For performance reasons, we had to make that part a bit more complex 
than it was originally: at present, it's run during every LuaTeX/XeTeX 
run, and that is a bit of an issue. It's one of the reasons we want to 
pre-load expl3 and dump it into the format.



It's hard for me to imagine how any of that isn't at least an order of magnitude slower 
than scanning through a mere 20K block of bytes with a machine pointer in C, and 
installing into all pertinent character mapping tables every piece of information that 
XeTeX says it's interested in on a per character or per character range basis.  When I 
use the term "preloaded" I'm not talking about parsing anything inside TeX's 
virtual machine using the TeX language (or whatever's built on top of it).


It's not absolutely as fast as it can be in TeX, but it's close. (For 
LuaTeX, a Lua reader would of course be possible and likely faster, but 
then we'd have two code paths to worry about.)


David's point was that the Unicode data is not needed only for the TeX 
internal tables for \uccode, \lccode, \catcode (possibly others). It's 
also needed to cover other Unicode concepts that TeX doesn't know, and 
so have to be coded at the macro level. For example, Unicode case 
changing is not a one-one operation. For the majority of codepoints, one 
can use the TeX \lccode/\uccode values (and avoid needing to hold them 
in TeX macros). Most of this information is in the relatively small file 
SpecialCasing.txt, but there is also the information one needs from 
UnicodeData.txt to cover titlecasing. We did consider 'pre-extracting' 
that data, but it made relatively little difference during a normal TeX 
run, and leaves open the risk of mismatched files. A 'bigger' data set 
required is NFD mappings: they are needed to handle for example Greek 
case changing. TeX doesn't know about NFD, so again one needs some data, 
which again comes from UnicodeData.txt, and again needs to be stored 
somewhere that's not 'pre-defined'.
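
As a small illustration of why \lccode/\uccode alone are not enough, here is a sketch using Python's built-in Unicode data rather than anything from expl3. It shows one case mapping that is not one-to-one and one NFD decomposition of an accented Greek letter.

```python
import unicodedata

# One code point can uppercase to several: U+00DF has no single-character
# uppercase form in the default mappings.
upper_word = "stra\u00dfe".upper()
print(upper_word)

# NFD splits a precomposed Greek letter into base letter plus combining
# accent, which is the form accent-aware case changing has to work on.
decomposed = unicodedata.normalize("NFD", "\u03ad")  # EPSILON WITH TONOS
nfd_points = [f"{ord(c):04X}" for c in decomposed]
print(nfd_points)
```

Neither result can be expressed as a single entry in a per-character table, so the data has to live somewhere at the macro level.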



| A tex primitive that controls a macro set seems to be reversing the
| natural layering, you could test for \jsboxversion (or whatever you
| have) or test that the lccode of some character is already non zero
| or... several other possibilities without introducing a primitive
| here.


The point is that it *isn't* a TeX primitive.  The idea is that it would be a 
primitive specific only to those engines that initialize their character 
mapping tables (\catcode, \lccode, \uccode, etc.) when the interpreter is 
created/launched/whatever, before it ever executes any TeX source code as a 
virtual machine.  My point is that testing for the existence of \Umathcode is 
an inappropriate test for that condition.


Er, it's a primitive, no? Or would it be set up as a macro that was 
pre-defined by the engine?



But when your engine is just a library linked into another program that lives for a 
long time, perhaps measured in days, and when the user is running multiple jobs from 
the same program, then there ought to be a way to load the format from its source 
code >once<, and have it live in the engine's memory even while job after job 
is executing on top, with a clean-up after each job ends.  This is, after all, 
completely conformant with everyday use of TeX (edit...run job...edit...run job...), 
not to mention every other computer language.  I'm pretty sure that I've architected 
my code to allow this, although it's untested for now.  One step at a time.


Years ago, Jonathan Fine wrote a TeX daemon that could stay running, 
relying on the fact that DVI files don't need to be closed (unlike PDF 
ones). That requires avoiding \end, and he could only support plain TeX 
as that means disabling \csname, so no environments. I assume you are 
not thinking of a 'permanently running TeX job' in that sense?



| As noted above, with latex-dev releases you are still going to need
| the unicode data files to be read using tex macros.


Are these files read more than once, and if so, why?  If not, I don't 
understand why I'm still going to need to read them.


l3unicode reads each one once, as noted above to populate macro data 
storage. Presumably you are not worried about LuaTeX, so don't have to 
think about font loaders (which also need Unicode info, and which is 
handled by LuaTeX in Lua code).



| To be in the core tex macros we would need to have the engine
| incorporated into texlive so that it could be tested as part of our
| test suite and continuous integration tests.


That doesn't make sense to me.

Re: [XeTeX] [EXT] A LaTeX Unicode initialization desire/question/suggestion

2020-01-13 Thread Joseph Wright

On 13/01/2020 03:41, Doug McKenna wrote:

Phil Taylor wrote:

| So because JSBox is required/designed to incorporate all of XeTeX's
| features, it must (by definition) implement/provide \Umathcode.

Just to be clear, JSBox can eventually incorporate all of XeTeX's features
(primitives), but does not do so now. It doesn't even incorporate pdfTeX's
features, but it is set up to. I'm merely adding XeTeX features as necessary to
get the LaTeX macro library installed and then typeset a LaTeX document
containing no Unicode at all. The problem is that somewhere in the LaTeX format
initialization the ability to recognize a Unicode character (as opposed to a
UTF-8 byte sequence) is equated with the assumption that it's being run under
XeTeX, and that therefore at least some of XeTeX's features are there and can
be relied upon at format initialization time.

At present, there are two engines that implement \Umathcode, etc., 'in
the wild', XeTeX and LuaTeX, and they have (over time) come to an agreed
position on what core features are available at the macro level. (For
example, originally XeTeX called its new primitives \XeTeX... but they
got renamed to \U... to match LuaTeX.)

They have quite a lot of differences too, but a core subset of features
is available with both, and that comes about as they offer \Umathcode.
Almost all of the tests in LaTeX look for the relevant primitive, so for
example when we want \Uchar we look for it. However, there are as you
note a few places where finding \Umathcode is by far the easiest marker.

It's quite possible to add additional tests to the core code, provided
there is a spec or at least some notes on what's available. (For
example, (u)pTeX for a long time had no docs in English, so things were
tricky. But there is now a basic manual there to allow those of us who
do not know Japanese to offer at least some basic support.)

| But could not JSbox perform (or simulate) the following :

| \let \Umathschar = \Umathchar % use British spelling as synonym
| \let \Umathchar = \undefined % inhibit "load-unicode-data.tex"'s special treatment of engines that implement \Umathchar
| \input load-unicode-data % since it would seem that you cannot simply skip this step
| \let \Umathchar = \Umathschar % restore canonical meaning of \Umathchar

It could, but it's not my code that's issuing "\input load-unicode-data". The reading of
"load-unicode-data.tex" is embedded within my version of LaTeX's own initialization code,
and there's no guarantee that elsewhere in that code there isn't some dependence on \Umathchar that
such a re-definition might interfere with. LaTeX's code has several tests that rely on whether
|\Umathchar| is defined or not, and even in the latest versions, it is declared that \Umathchar
existence is the official way to test. Indeed, the latest official comments, as David Carlisle
brought to my attention in this thread, declare that \Umathchar existence testing is the current
way to go in all sorts of places.

I think you mean \Umathcode :)

Each place that uses Unicode features does test for this primitive; if
it exists, we have to date been able to assume that a few additional
primitives are also available (e-TeX, \Uchar, \Umathchardef), but mainly
it tells us that we can allocate \lccode and \uccode beyond 255.

Here is perhaps a slightly better hack:

If it's acceptable as the very first executable line in latex.ltx (or other format source
files) to test the catcode value of `{ to determine whether a format has already been
loaded or not, then it should be acceptable within "load-unicode-data.tex" (or
the like) to include a similar test to determine whether to proceed with the TeX parse of
the Unicode data, or to bail because it's presumable that the tables are already
initialized. For example, the first non-8-bit Unicode character is:

0100;LATIN CAPITAL LETTER A WITH MACRON;Lu;0;L;0041 0304;;;;N;LATIN CAPITAL
LETTER A MACRON;;;0101;

It is safe, I think, to assume that this Unicode character will forever be
classified as an uppercase letter (with a lowercase mapping value of U+0101).
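
For readers unfamiliar with the file layout, the record above can be pulled apart as follows. This is a minimal sketch assuming the standard 15-field layout of UnicodeData.txt, with the record written out with all of its empty fields; the field names are informal labels chosen here, not identifiers from any TeX file.

```python
FIELD_NAMES = (
    "code", "name", "general_category", "combining_class", "bidi_class",
    "decomposition", "decimal", "digit", "numeric", "mirrored",
    "unicode_1_name", "iso_comment", "uppercase", "lowercase", "titlecase",
)


def parse_record(line):
    """Split one UnicodeData.txt line into its 15 named fields."""
    values = line.rstrip("\n").split(";")
    if len(values) != len(FIELD_NAMES):
        raise ValueError(
            f"expected {len(FIELD_NAMES)} fields, got {len(values)}"
        )
    return dict(zip(FIELD_NAMES, values))


record = parse_record(
    "0100;LATIN CAPITAL LETTER A WITH MACRON;Lu;0;L;0041 0304;;;;N;"
    "LATIN CAPITAL LETTER A MACRON;;;0101;"
)
print(record["general_category"], record["lowercase"])
```

The proposed check amounts to asking whether the engine's lowercase table already holds the value in the `lowercase` field for this one code point.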

The test at the start of latex.ltx is about making sure we are in IniTeX
mode: I'm not sure I'd choose to do that today, but the test is
long-standing. For load-unicode-data, the idea was partly that there was
really no issue about checking: unlike formats, that might have hidden
stuff, here all we are trying to do is get to a known position. That
links to the second reason I'm slightly wary of a test. As-written,
load-unicode-data ensures that the \lccode, \uccode and \catcode tables
are in a state *known to the macro layer*. I know it's slightly strange
to you, but as a macro programmer I can't 'know' what different engine
devs might do/change, and I certainly don't know exactly what version of
UnicodeData.txt you are working from. By doing initialisation without
checking, I can be sure that we are on a known Unicode version.

To b

Re: [XeTeX] How much time to build LaTeX format for XeTeX

2019-12-05 Thread Joseph Wright


On 06/12/2019 03:15, Doug McKenna wrote:

Given all the parsing of the Unicode character data files during INITEX, and 
all the inputting and creation of the hyphenation trees, how much CPU time 
elapses while building the XeTeX format file for LaTeX?  I'm going to assume 
that the writing out of the format at the final \dump command is negligible, 
though I don't really know.

- Doug McKenna



'Not very long': of the order of seconds. On my i5 Dell XPS, using an 
Ubuntu VM on top of Windows 10 "time fmtutil-sys -byfmt xelatex" gives


real    0m2.154s
user    0m1.314s
sys     0m0.094s

My native system is slower: this is a known issue to do with file access 
on Windows. LuaTeX is about 0.5s faster, I guess largely as it does not 
load hyphenation patterns.


The LaTeX dev formats, which pre-load more data, take a little longer: 
for XeLaTeX-dev


real    0m3.186s
user    0m2.244s
sys     0m0.090s

Joseph

Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-12-02 Thread Joseph Wright


On 02/12/2019 17:52, Doug McKenna wrote:

Joseph -

A similar ambiguity occurs later in the README.md file.  It says

- \Umathcode for all letters as TeX class 7 (var)

Does "letters" mean those code points on the TeX side with \catcode 11, or 
those Unicode code points labeled with 'L' in UnicodeData.txt?

If the former, then combining marks (Unicode 'M') should be entered into 
\Umathcode as TeX class 7; if the latter, then presumably not, though it's not 
clear why a math variable name can't have a combining mark.

- Doug McKenna



The former: I've clarified.

Joseph

Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-12-01 Thread Joseph Wright


On 02/12/2019 05:56, Doug McKenna wrote:

- \lccode and/or \uccode for non-letter code points
   for which an upper or lower case mapping is given

The problem with this is that earlier, it is stated that all combining mark 
code points (class code starting with 'M' in the UnicodeData.txt file) are to 
be considered letters (\catcode set to 11).  So there's an ambiguity here that 
needs clearing up.  Does the above apply to combining mark code points or not?


You've read something in that is not in the README ;)

The file says

  - `\catcode` 11 for all combining marks (Unicode class "M")

where I've very deliberately kept the TeX 'side' as what *actually 
happens* (catcode-11), not said they are 'treated as letters', or similar.


I will clarify that 'letter' here means a codepoint with Unicode 
character class "L", and is not linked to the TeX catcode.



It may be that none of the combining marks in the data file have any case 
mappings, but there's no guarantee that is true.  So the question is, if a 
combining mark has an uppercase or lowercase mapping, does that get installed 
in \lccode and/or \uccode?


Yes, or at least would be the case in principle: all code points with 
upper/lower/title properties are set up.



Also, there's a confusing typo ("can"?) in

- \lccode and \uccode for all of class "Lt" (title
   case letters) to the lower can upper case mappings
   (or if not given to the code point itself)

Should "can' be "and/or"?


It is 'and': you need to set lccode and uccode for these code points.
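
The three README rules discussed in this message can be read as a small decision procedure. The sketch below models only those three rules as quoted here; it is not the logic of load-unicode-data.tex, and the function name and return format are invented for illustration.

```python
def tex_settings(code_point, category, upper=None, lower=None):
    """Return the table entries the quoted README rules imply for one code point.

    category is the Unicode general category (for example "Lt" or "Mn");
    upper and lower are the case mappings from UnicodeData.txt, or None.
    Anything else the real loader does is deliberately left out.
    """
    settings = {}
    if category.startswith("M"):
        # Combining marks get catcode 11.
        settings["catcode"] = 11
    if category == "Lt":
        # Title-case letters: both codes are set, falling back to the
        # code point itself when no mapping is given.
        settings["lccode"] = lower if lower is not None else code_point
        settings["uccode"] = upper if upper is not None else code_point
    elif not category.startswith("L"):
        # Non-letters (Unicode class other than "L") with a case mapping.
        if lower is not None:
            settings["lccode"] = lower
        if upper is not None:
            settings["uccode"] = upper
    return settings


print(tex_settings(0x0301, "Mn"))
print(tex_settings(0x01C5, "Lt", upper=0x01C4, lower=0x01C6))
print(tex_settings(0x2170, "Nl", upper=0x2160))
```

Note that a combining mark falls under the non-letter rule as well, so a mark with a case mapping would receive both catcode 11 and the relevant \lccode or \uccode.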

Joseph

Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-11-27 Thread Joseph Wright


On 28/11/2019 00:16, Ross Moore wrote:

If by ignoring you mean removing the character entirely, then that is surely 
not best at all.

Most  N Class (Normal) characters would be simply of the default  \mathord  
class.


That is already the case: it's where IniTeX starts off, chars are 
mathord. So 'nothing to do here'. Also note that some of this 
information is already set from the main Unicode file: it tells us which 
chars are letters.



I’d expect others to be mapped instead into a macro that corresponds to 
something that TeX does support.
e.g.
  space characters for  thinspace, 2-em space, etc.  in  U+2000 – U+200A
can expand into things like:   \, \; \> \quad \qquad  etc.  ( even to 
constructions like  \mskip1mu )


That's not a generic IniTeX thing, I'm afraid. The Unicode data loaders 
are explicitly about setting up the basic data in Unicode TeX engines 
that's held in (primitive) tables. Creating macros is the job of the 
'rest' of the format. Here, presumably you are thinking of making chars 
math-active: that's well out-of-scope for the loader.



After all, this is essentially what happens when pdfTeX reads raw Unicode input.


pdfTeX reads bytes, there's not really much comparison. In IniTeX mode, 
there is not much happening with UTF-8 and pdfTeX: perhaps you are 
thinking of with LaTeX?


Joseph

Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-11-27 Thread Joseph Wright


On 28/11/2019 01:26, Doug McKenna wrote:

Ross wrote:


| If by ignoring you mean removing the character entirely, then that is surely 
not best at all.
|
| Most N Class (Normal) characters would be simply of the default \mathord 
class.


The parsing code in load-unicode-math-classes.tex installs values in the 
\Umathcode table that comport with some rule, which without too much of a close 
look seems to me to be whether the character code math class read from 
MathClass.txt is one of the eight possibilities that parsing code pays 
attention to, out of the 15 possible ones in the file. Therefore it appears to 
me that all entries in MathClass.txt that are marked with, for instance, 'N', 
are ignored with respect to installing any entry in the \Umathcode table.

It may be that such characters in MathClass.txt marked with 'N' take on the 
\mathord attribute by default when TeX finds them within math mode, I'm not 
sure without looking at its code.

Doug McKenna


The loader is intended for use in IniTeX mode and so relies on the 
defaults. As you say, characters are already \mathord unless actively 
set to something else.


Joseph

Re: [XeTeX] Math class initialization in Unicde-aware engine

2019-11-27 Thread Joseph Wright


On 27/11/2019 23:20, Doug McKenna wrote:

Another question about Unicode-aware TeX engine (e.g., XeTeX) initialization 
files.

The Unicode Consortium provides a file, MathClass.txt, e.g.,

./texmf-dist/tex/generic/unicode-data/MathClass.txt

It contains a list of lines (and comments).  Field 0 of an entry line is a 
Unicode code point or a range of code points, and field 1 is a single ASCII 
character that declares the Unicode math class to which the code point or range 
of code points belongs.

Comments in that file say that there are (currently) 15 different Unicode math 
class codes:

#   N - Normal - includes all digits and symbols requiring only one form
#   A - Alphabetic
#   B - Binary
#   C - Closing - usually paired with opening delimiter
#   D - Diacritic
#   F - Fence - unpaired delimiter (often used as opening or closing)
#   G - Glyph_Part - piece of large operator
#   L - Large - n-ary or large operator, often takes limits
#   O - Opening - usually paired with closing delimiter
#   P - Punctuation
#   R - Relation - includes arrows
#   S - Space
#   U - Unary - operators that are only unary
#   V - Vary - operators that can be unary or binary depending on context
#   X - Special - characters not covered by other classes

During XeTeX format initialization, the file load-unicode-math-classes.tex in 
that same directory is executed, in order to declare to the engine which 
Unicode code points belong to which TeX math classes.  The comments in that 
file say that the classes it pays attention to are those with the following 
Unicode math codes:

% This file parses MathClass.txt, provided by the Unicode Consortium, and sets
% up the following mapping between Unicode classes and TeX math types
% - "L" (large)       \mathop
% - "B" (binary)      \mathbin
% - "V" (vary)        \mathbin
% - "R" (relation)    \mathrel
% - "O" (opening)     \mathopen
% - "C" (closing)     \mathclose
% - "P" (punctuation) \mathpunct
% - "A" (alphabetic)  \mathalpha

That means that there are 7 other Unicode math classes that are unaccounted for.

Unfortunately, the documentation/comments don't say what happens to entries 
having these other Unicode math codes (N, D, F, G, S, U, and X).  Are they 
completely ignored, or are they mapped to one of the other eight codes that 
matches what TeX is interested in or only capable of handling?

I can imagine that the space character, given Unicode math class 'S' in MathClass.txt, is 
ignored during this parse.  But what happens to the '¬' character (U+00AC) ("NOT 
SIGN"), which is assigned 'U' (Unary Operator).  Surely the logical not sign is not 
being ignored during initialization of a Unicode-aware engine, yet the comments in 
load-unicode-math-classes.tex don't say one way or the other, and it appears to me that 
the parsing code is ignoring it.

The ReadMe.md file



is also deficient in answering this question.

TIA,


Er, I thought the README was reasonably clear, ah well!

The other Unicode math classes don't really map directly to TeX ones, so 
they are currently ignored. Suggestions for improvements here are of 
course welcome.
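
To make "currently ignored" concrete, here is a sketch of the selection logic in Python. It assumes entries of the form `code;class` with ranges written `first..last` and `#` starting a comment. The sample lines are illustrative rather than copied from MathClass.txt, and the function is not a translation of load-unicode-math-classes.tex.

```python
# Mapping stated in the comments of load-unicode-math-classes.tex.
TEX_MATH_TYPE = {
    "L": "mathop",
    "B": "mathbin",
    "V": "mathbin",
    "R": "mathrel",
    "O": "mathopen",
    "C": "mathclose",
    "P": "mathpunct",
    "A": "mathalpha",
}


def math_types(lines):
    """Yield (first, last, tex_type) for entries whose class TeX can use.

    Entries in any other class (N, D, F, G, S, U, X) are skipped, so those
    code points keep whatever default the engine starts with.
    """
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split(";")
        code_range, math_class = fields[0].strip(), fields[1].strip()
        tex_type = TEX_MATH_TYPE.get(math_class)
        if tex_type is None:
            continue
        first, _, last = code_range.partition("..")
        yield int(first, 16), int(last or first, 16), tex_type


sample = [
    "# illustrative lines in the MathClass.txt layout",
    "0020;S",
    "00AC;U",
    "2211;L",
    "2264..2265;R",
]
entries = list(math_types(sample))
for entry in entries:
    print(entry)
```

With this input only the "L" and "R" entries produce output; the space (S) and the not sign (U) fall through untouched, which matches the behaviour described above.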


Joseph

Re: [XeTeX] Lowercase Unicode code points in hyphenation patterns

2019-11-24 Thread Joseph Wright


On 24/11/2019 19:42, Joseph Wright wrote:
This has of course come up before, and I'd like to add to the expl3 case 
changers. However, I've not been able to track down any formal statement 
on the case mappings: are they in the UCD, some official publication, ...?


Joseph


Found the appropriate .xml files in the CLDR: see attached.

I plan to make some revisions to the expl3 case changer over the next 
month or two: I'll likely incorporate this information.


Joseph

# Copyright (C) 2011-2013, Apple Inc. and others. All Rights Reserved.
# Remove \0301 following Greek, with possible intervening 0308 marks.
::NFD();
# For uppercasing (not titlecasing!) remove all greek accents from greek letters.
# This is done in two groups, to account for canonical ordering.
[:Greek:] [^[:ccc=Not_Reordered:][:ccc=Above:]]*? { [\u0313\u0314\u0301\u0300\u0306\u0342\u0308\u0304] → ;
[:Greek:] [^[:ccc=Not_Reordered:][:ccc=Iota_Subscript:]]*? { \u0345 → ;
::NFC();
::Any-Upper();
			
# Special case for final form of sigma.
::NFD();
# C is preceded by a sequence consisting of a cased letter and then zero or more case-ignorable characters,
# and C is not followed by a sequence consisting of zero or more case-ignorable characters and then a cased letter.
# 03A3; 03C2; 03A3; 03A3; Final_Sigma; # GREEK CAPITAL LETTER SIGMA
# With translit rules, easiest is to handle the negative condition first, mapping in that case to the regular sigma.
Σ } [:case-ignorable:]* [:cased:] → σ;
[:cased:] [:case-ignorable:]* { Σ → ς;
::Any-Lower;
::NFC();
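
For anyone who wants to experiment with the two rule sets above outside ICU, the following Python sketch approximates them. It is a simplification: it drops the listed combining marks after any Greek base letter and does not reproduce the canonical-ordering conditions in the rules. The function name is invented, and the final-sigma line simply relies on Python's own lowercasing.

```python
import unicodedata

# Combining marks removed by the uppercasing rules quoted above.
STRIPPED_MARKS = set(
    "\u0313\u0314\u0301\u0300\u0306\u0342\u0308\u0304\u0345"
)


def greek_upper(text):
    """Uppercase text, dropping the listed accents that follow Greek letters."""
    out = []
    after_greek = False
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            if after_greek and ch in STRIPPED_MARKS:
                continue  # drop the accent
        else:
            after_greek = unicodedata.name(ch, "").startswith("GREEK")
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out)).upper()


upper_name = greek_upper("\u0388\u03bb\u03b5\u03bd\u03b1")   # "Έλενα"
print(upper_name)

# Python's str.lower() already applies the Final_Sigma condition.
lower_word = "\u039f\u0394\u039f\u03a3".lower()               # "ΟΔΟΣ"
print(lower_word)
```

Accents on non-Greek letters are left alone, since the stripping only applies while the most recent base character is Greek.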

Re: [XeTeX] Lowercase Unicode code points in hyphenation patterns

2019-11-24 Thread Joseph Wright


On 24/11/2019 18:40, Apostolos Syropoulos via XeTeX wrote:


   On Sunday, November 24, 2019, 4:21:32 AM GMT+2, David Carlisle wrote:

  >the lccode tables are set by the macro layer not the engine code, it
  >reads in The Unicode consortium data file
  >tex/generic/unicode-data/UnicodeData.txt
  >and sets the lccode values and catcode values according to the data there.
  >
  >see
  >
  >tex/generic/unicode-data/load-unicode-data.tex

Of course these tables are all wrong, but this is another problem. For example,
this table specifies that the capital form of έ is Έ, which is wrong because
uppercase letters do not get accents, except when they start a sentence or the
name of a person (e.g., Έλενα). Since the Unicode Consortium is not going to
change this, I have added the correct \uccodes and \lccodes in xgreek.sty.

Regards,
A.S.


This has of course come up before, and I'd like to add to the expl3 case 
changers. However, I've not been able to track down any formal statement 
on the case mappings: are they in the UCD, some official publication, ...?


Joseph

Re: [XeTeX] [tex-live] Primitive parity, \expanded and \Ucharcat

2018-06-18 Thread Joseph Wright


On 13/05/2018 13:36, Jonathan Kew wrote:

On 13/05/2018 13:15, Joseph Wright wrote:

On 13/05/2018 12:23, Jonathan Kew wrote:

On 13/05/2018 10:57, Joseph Wright wrote:

Hello all,

Modulo any issues that show up in testing, all of the above is now 
done and on my GitHub fork 
(https://github.com/josephwright/texlive-source/tree/Ucharcat: this 
branch has 'all the stuff' on it).


I know that https://github.com/texjporg/tex-jp-build already has a 
branch for \expanded. What's the best way to request 'officially' 
that the changes go into pdfTeX/XeTeX? I can send a .diff to the 
pdfTeX dev list, and put in a pull request on SourceForge for XeTeX, 
if that's best.


Thanks for working on these things, Joseph.

For xetex, a pull request would be the best approach, I think; or if 
it's feasible to do separate PRs for each feature, that would 
probably make reviewing and tracking the changes easier.


Is there documentation of the added features available somewhere, so 
we can more accurately understand what we're thinking of adding? Thanks!


JK


Excellent: I'll start on putting something together later today.

Do you want all PRs against master or can they be 'chained'? Adding 
primitives, it's easier if you are working knowing which others have 
been created.


"Chained" should be fine, I expect; I doubt there'd be any reason we'll 
want to take a later one but decide against an earlier one.


At which point perhaps a single PR is just as good, as long as each feature 
is a separate commit so that it comes in manageable chunks. From a quick 
glance at your fork, it looks like that's how it would naturally appear. 
So, feel free to do whichever seems easiest for you.


JK


Hello Jonathan,

Have you been able to look at my merge requests? We are now moving to 
using \expanded for other engines: it's going into pdfTeX and (u)pTeX, 
and is already in LuaTeX. Ideally, we'd like to avoid XeTeX being 'left 
behind'.


Joseph


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex

Re: [XeTeX] [tex-live] Primitive parity, \expanded and \Ucharcat

2018-05-13 Thread Joseph Wright


On 13/05/2018 12:23, Jonathan Kew wrote:

On 13/05/2018 10:57, Joseph Wright wrote:

Hello all,

Modulo any issues that show up in testing, all of the above is now 
done and on my GitHub fork 
(https://github.com/josephwright/texlive-source/tree/Ucharcat: this 
branch has 'all the stuff' on it).


I know that https://github.com/texjporg/tex-jp-build already has a 
branch for \expanded. What's the best way to request 'officially' that 
the changes go into pdfTeX/XeTeX? I can send a .diff to the pdfTeX dev 
list, and put in a pull request on SourceForge for XeTeX, if that's best.


Thanks for working on these things, Joseph.

For xetex, a pull request would be the best approach, I think; or if 
it's feasible to do separate PRs for each feature, that would probably 
make reviewing and tracking the changes easier.


Is there documentation of the added features available somewhere, so we 
can more accurately understand what we're thinking of adding? Thanks!


JK


Excellent: I'll start on putting something together later today.

Do you want all PRs against master or can they be 'chained'? Adding 
primitives, it's easier if you are working knowing which others have 
been created.


Joseph



Re: [XeTeX] Primitive parity, \expanded and \Ucharcat

2018-05-13 Thread Joseph Wright


On 03/05/2018 22:38, Joseph Wright wrote:

Hello all,

In adding features to expl3, the LaTeX team have been making use of a 
variety of 'new' (post-e-TeX) 'utility' primitives in various engines. 
Almost always these originate in pdfTeX and have migrated to other 
engines, but are not in any way tied to PDF output, etc. Depending on 
the exact engine in use, some or all of these primitives may be 
unavailable, and that then limits macro-level features.


It seems sensible long-term to have cross-engine features stay 'in sync' 
with each other. In particular, (u)pTeX has picked up a number of pdfTeX 
features, meaning that XeTeX often is the most 'limited' engine. The 
team would like, if possible, to have a common feature set in all 
engines in this regard. At the same time, there are a few 'bits and 
pieces' that make sense to raise at the same time. I'll lay out the 
various areas below.


Doing the work here is non-trivial, but luckily there is an automated 
build system available via GitHub which is allowing us (me/David 
Carlisle) to do some testing. I'm building up patches in various 
branches at https://github.com/josephwright/texlive-source: assuming 
these look good, I'll merge them as required and send diff files to 
where/whoever is best. The branches on GitHub should hopefully have 
clear names for what they address.


The areas we are keen to look at are as follows:

- 'pdfutils': (u)pTeX has picked up a number of pdfTeX primitives, and
   a subset have made their way into XeTeX too. However, XeTeX is still
   missing several, most notably an expandable RNG. We are part-way
   through working out patches to add the rest to XeTeX (RNG is done,
   file data and timer to do)

- banners: pdfTeX and LuaTeX have \banner, other engines lack
   that. The banner includes TeX version and details of the TeX system,
   so is potentially useful. Adding this to (u)pTeX/XeTeX looks
   straightforward: still to do.

- \expanded: This was slated for pdfTeX 1.50, which has never
   appeared, but the primitive is useful as it allows 'function-like'
   expandable macros. We can see this being very useful for simplifying
   some macro code, and in many ways it feels like an e-TeX primitive.
   The GitHub expanded branch adds it to pdfTeX/XeTeX/(u)pTeX

- Allowing \Ucharcat (XeTeX) to make \active tokens: this was raised
   recently on the XeTeX list, but fits here as we've put a branch
   together to show it works

It's likely I'll finish the outstanding patches by the weekend. Note 
that at present each feature addition is in a separate Git branch, so to 
add all of them I'll have to do a little tidying up: that will happen 
once I know which of these suggestions are useful.


Feedback most welcome.

Joseph


Hello all,

Modulo any issues that show up in testing, all of the above is now done 
and on my GitHub fork 
(https://github.com/josephwright/texlive-source/tree/Ucharcat: this 
branch has 'all the stuff' on it).


I know that https://github.com/texjporg/tex-jp-build already has a 
branch for \expanded. What's the best way to request 'officially' that 
the changes go into pdfTeX/XeTeX? I can send a .diff to the pdfTeX dev 
list, and put in a pull request on SourceForge for XeTeX, if that's best.


Regards,

Joseph



Re: [XeTeX] Primitive parity, \expanded and \Ucharcat

2018-05-04 Thread Joseph Wright


On 03/05/2018 22:38, Joseph Wright wrote:

- 'pdfutils': (u)pTeX has picked up a number of pdfTeX primitives, and
   a subset have made their way into XeTeX too. However, XeTeX is still
   missing several, most notably an expandable RNG. We are part-way
   through working out patches to add the rest to XeTeX (RNG is done,
   file data and timer to do)


To be clear, the full set of primitives here is

- \pdfrandomseed
- \pdfsetrandomseed
- \pdfuniformdeviate
- \pdfnormaldeviate

- \pdfresettimer
- \pdfelapsedtime

- \pdffilesize
- \pdffilemoddate
- \pdffiledump
- \pdfcreationdate

of which the first set is done (a working branch for XeTeX).
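
For orientation, a sketch of how the timer and file primitives in the second
and third groups behave in pdfTeX, where they already exist. It is meant to be
run with pdftex, and the file inspected is simply the job's own source file
(an arbitrary choice for the example):

\pdfresettimer
% ... some work to be timed would go here ...
\message{Elapsed: \the\pdfelapsedtime\space scaled seconds (1/65536 s)}
\message{Size of \jobname.tex: \pdffilesize{\jobname.tex} bytes}
\message{Last modified: \pdffilemoddate{\jobname.tex}}
\message{First 8 bytes (hex): \pdffiledump length 8 {\jobname.tex}}
\bye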

Joseph



Re: [XeTeX] [tex-live] Primitive parity, \expanded and \Ucharcat

2018-05-04 Thread Joseph Wright


Hello Norbert,


I'll merge them as required and send diff files to where/whoever is best.


For (u)ptex a pull request at https://github.com/texjporg/tex-jp-build
might be useful. This is the main development area for all Japanese TeX
engine stuff.


I see that the \expanded code has already been picked up in a branch 
there: https://github.com/texjporg/tex-jp-build/tree/expanded. So I'm 
guessing that is likely to happen.


At the moment what I'm aiming to do is get everything in one place so it 
can be reviewed, etc., and commented on. Particularly in the case of 
\expanded, the wider plan only works if there is general 
(pdfTeX/XeTeX/(u)pTeX) agreement on taking the patch.


Once we have that agreement, putting in pull requests, diffs, etc. 
against the 'right' places should be easy enough (at least in the sense 
I'm happy to sort it).



them I'll have to do a little tidying up: that will happen once I know which
of these suggestions are useful.


I think all are fine.


I don't imagine there is anything particularly controversial, but there 
is also the technical business (I'm no WEB expert).


Joseph



[XeTeX] Primitive parity, \expanded and \Ucharcat

2018-05-03 Thread Joseph Wright


Hello all,

In adding features to expl3, the LaTeX team have been making use of a 
variety of 'new' (post-e-TeX) 'utility' primitives in various engines. 
Almost always these originate in pdfTeX and have migrated to other 
engines, but are not in any way tied to PDF output, etc. Depending on 
the exact engine in use, some or all of these primitives may be 
unavailable, and that then limits macro-level features.


It seems sensible long-term to have cross-engine features stay 'in sync' 
with each other. In particular, (u)pTeX has picked up a number of pdfTeX 
features, meaning that XeTeX often is the most 'limited' engine. The 
team would like, if possible, to have a common feature set in all 
engines in this regard. At the same time, there are a few 'bits and 
pieces' that make sense to raise at the same time. I'll lay out the 
various areas below.


Doing the work here is non-trivial, but luckily there is an automated 
build system available via GitHub which is allowing us (me/David 
Carlisle) to do some testing. I'm building up patches in various 
branches at https://github.com/josephwright/texlive-source: assuming 
these look good, I'll merge them as required and send diff files to 
where/whoever is best. The branches on GitHub should hopefully have 
clear names for what they address.


The areas we are keen to look at are as follows:

- 'pdfutils': (u)pTeX has picked up a number of pdfTeX primitives, and
  a subset have made their way into XeTeX too. However, XeTeX is still
  missing several, most notably an expandable RNG. We are part-way
  through working out patches to add the rest to XeTeX (RNG is done,
  file data and timer to do)

- banners: pdfTeX and LuaTeX have \banner, other engines lack
  that. The banner includes TeX version and details of the TeX system,
  so is potentially useful. Adding this to (u)pTeX/XeTeX looks
  straightforward: still to do.

- \expanded: This was slated for pdfTeX 1.50, which has never
  appeared, but the primitive is useful as it allows 'function-like'
  expandable macros. We can see this being very useful for simplifying
  some macro code, and in many ways it feels like an e-TeX primitive.
  The GitHub expanded branch adds it to pdfTeX/XeTeX/(u)pTeX

- Allowing \Ucharcat (XeTeX) to make \active tokens: this was raised
  recently on the XeTeX list, but fits here as we've put a branch
  together to show it works
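
To make the 'function-like' point concrete, here is a minimal sketch for an
engine that provides \expanded; the macro names \first, \second and \wrap are
made up for the example:

\def\first{\second}
\def\second{c}
\def\wrap#1{[#1]}
% \expanded fully expands its argument in one expansion step, so a single
% \expandafter is enough to hand \wrap an already-expanded argument.
\expandafter\wrap\expanded{{\first}}% \wrap receives "c", not "\first"
\bye

Without \expanded the same effect needs an \edef (which is not expandable) or
a chain of \expandafter tailored to the number of expansion steps.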

It's likely I'll finish the outstanding patches by the weekend. Note 
that at present each feature addition is in a separate Git branch, so to 
add all of them I'll have to do a little tidying up: that will happen 
once I know which of these suggestions are useful.


Feedback most welcome.

Joseph



Re: [XeTeX] Allowing Ucharcat to produce active characters

2018-04-18 Thread Joseph Wright


On 18/04/2018 16:08, Bruno Le Floch wrote:

Hello,

I suggest allowing \Ucharcat to produce active characters.  See
three-line patch attached.  This would allow active characters to be
produced expandably in all engines (pdfTeX, LuaTeX, XeTeX, pTeX,
upTeX).  My code makes

 \Ucharcat `~ 13
 \expandafter\show\Ucharcat `~ 13
 \edef\foo{\expandafter\noexpand\Ucharcat `~ 13 }

run the code of the active ~ as if it had been typed directly, then show
its meaning, then do the equivalent of \def\foo{~}.

Bruno


In case anyone wonders: only XeTeX has \Ucharcat. In LuaTeX we can make 
char tokens from the 'Lua side', so are unrestricted in terms of 
catcode. In pdfTeX and (u)pTeX, assuming we are only dealing with the 
8-bit range (upTeX) it's feasible to pre-generate all combinations and 
use expandable macros to output the tokens.
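
For reference, a minimal XeTeX sketch of what \Ucharcat does with a catcode
that is already permitted; the macro name \hashother is hypothetical:

% \Ucharcat <char code> <catcode> expands to a single character token.
% Here: the character # with category code 12 ("other").
\edef\hashother{\Ucharcat`\# 12 }
\show\hashother % displays the token stored in the macro
\bye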


Joseph




Re: [XeTeX] Problem involving \includegraphics after texlive update on Debian Testing

2017-07-07 Thread Joseph Wright

On 07/07/2017 12:54, Johann Spies wrote:
> After a recent upgrade of texlive to 2017.20170629-1 on Debian I am
> experiencing a problem compiling a longstanding document and I can
> replicate the problem with the following code:
> 
> \documentclass[12pt,a4paper]{article}
> \usepackage{fontspec} % Gebruik met xelatex
> \usepackage{graphicx} % Gebruik met xelatex
> \usepackage[hyperindex=true,colorlinks=true,bookmarks]{hyperref}
> \usepackage{colortbl}
> \setmainfont[Ligatures=TeX,Mapping=tex-text]{Linux Libertine O}
> \begin{document}
> 
> 
> 
> \includegraphics{fruits.jpg}
> 
> \end{document}
> 
> %%% Local Variables: 
> %%% mode: latex
> %%% TeX-engine: xetex
> %%% TeX-master: t
> %%% End: 
> 
> 
> results in
> 
> ERROR: Undefined control sequence.
> 
> --- TeX said ---
> \Ginclude@bmp #1->\Gin@log 
>{<#1>}\bgroup \def \@tempa {!}\special 
> {pdf:image...l.11 \includegraphics{fruits.jpg}
>  
> --- HELP ---
> TeX encountered an unknown command name. You probably misspelled the
> name. If this message occurs when a LaTeX command is being processed,
> the command is probably in the wrong place---for example, the error
> can be produced by an \item command that's not inside a list-making
> environment. The error can also be caused by a missing \documentclass
> command.
> 
> You can replace "fruits.jpg" with any jpg to replicate the problem.
> 
> It will help me if someone on this list can identify what is causing
> this.
> 
> Regards
> Johann

Could you add \listfiles to your input and post the resulting *File
list* from the .log?
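
For example, with \listfiles placed before \documentclass in a cut-down
version of the test file (only the package needed to load the image is kept
here, as an assumption about what is relevant):

\listfiles
\documentclass[12pt,a4paper]{article}
\usepackage{graphicx}
\begin{document}
\includegraphics{fruits.jpg}
\end{document}

The list appears near the end of the .log, under the heading *File List*.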

Joseph




Re: [XeTeX] Using tikz with plain XeTeX

2017-05-13 Thread Joseph Wright

On 13/05/2017 12:49, Philip Taylor wrote:
> 
> 
> John Was wrote:
>> Even if PS-Tricks and Tikz do clash, it doesn't seem to be PS-Tricks 
>> specifically that's causing this issue (I've tried commenting it out) - 
>> suspicion currently falls on Edmac, which I use for cropmarks and sometimes 
>> for other purposes (e.g. automatic line numbering of texts when required). I 
>> don't really mind doing without tikz (at least for now), but it would be 
>> good to know the cause of the weird behaviour!
> Then I think you will have to strip down your fault-provoking code to 
> something manageable, John; "necessary and sufficient" is the key -- you have 
> provided the sufficient, now it is surely incumbent on you to strip it down to 
> the necessary if others are to be able to help you in finite time.
> 
> Philip Taylor

A minimal example is

\input ulem.sty
\input tikz

\tikzpicture
  \path[draw=red] (0,0) -- (1,1) -- (2,1) circle (10pt);
\endtikzpicture
\bye

with the first piece of text pointing to ifpdf: the issue is not limited
to TikZ. (It doesn't help though that TikZ's emulation of a minimal
LaTeX set up isn't 'self-contained': the load order cannot be reversed
here.)

This allows us to isolate the issue: ulem.sty does

\expandafter\ifx\csname ProvidesPackage\endcsname \relax

which leaves \ProvidesPackage as \relax in plain (there is no grouping).
That's an issue for any code that tests 'quickly' for \ProvidesPackage,
for example in ifpdf.sty

\ifx\ProvidesPackage\undefined

The most obvious solution is to get rid of the problematic definition:

\input ulem.sty
\let\ProvidesPackage\undefined
\input tikz

\tikzpicture
  \path[draw=red] (0,0) -- (1,1) -- (2,1) circle (10pt);
\endtikzpicture
\bye
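
The side effect is easy to reproduce in isolation. A small plain TeX sketch,
using a made-up name \foo that is undefined at the start:

% \csname...\endcsname turns an undefined name into \relax as a side effect.
\expandafter\ifx\csname foo\endcsname\relax \fi
% A later 'quick' test against \undefined now takes the "defined" branch.
\ifx\foo\undefined
  \message{\string\foo\space is undefined}
\else
  \message{\string\foo\space is now \meaning\foo}
\fi
\bye

With e-TeX available, \ifcsname foo\endcsname performs the existence test
without defining anything.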





Joseph




Re: [XeTeX] Using tikz with plain XeTeX

2017-05-13 Thread Joseph Wright

On 13/05/2017 11:53, John Was wrote:
> Dear All
> 
> Apologies if this is the wrong list (but I’ve always found participants here 
> very helpful!).
> 
> I have been sent some tikz code for diagrams to be included in a forthcoming 
> article.  The author uses a version of LaTeX but tikz should work OK in plain 
> (Xe)TeX, I think – though I haven’t tried it for a number of years.  Oddly 
> enough, when I invoke tikz with:
> 
> \input tikz
> 
> the package does load, and a simple drawing works:
> 
> \tikzpicture
> \path[draw=red] (0,0) -- (1,1) -- (2,1) circle (10pt);
> \endtikzpicture
> 
> (pasted from a stackexchange discussion of a different matter).
> 
> BUT, before the drawing I get six lines of info in the output (the sort of 
> thing I’d expect in the log), viz.:
> 
> pgfrcs[2010/10/25 v2.10 (rcs-revision 1.24)]
> pgf[2008/01/15 (rcs-revision 1.10)]
> pgfsys[2010/06/30 v2.10 (rcs-revision 1.37)]
> pgfcore[2010/04/11 v2.10 (rcs-revision 1.7)]
> pgffor[2010/03/23 v2.10 (rcs-revision 1.18)]
> tikz[2010/10/13 v2.10 (rcs-revision 1.76)]
> 
> It also messes up my crop marks and running headlines in subsequent pages, 
> but I suspect that could be rectified by invoking other \inputs in a 
> different order (I include edmac and pstricks at the start).  I can manage 
> without tikz if necessary (the worst-case scenario would be redrawing with 
> pstricks), but it would be good to know at least that I can use tikz in 
> future without these unwanted half-dozen lines coming into the output.  It’s 
> a powerful package that I’ve always meant to learn.
> 
> Best
> 
> 
> John

TikZ is certainly loadable with plain. Could you give more details of
your TeX system or perhaps a log for the simple file

\input tikz
\tikzpicture
\path[draw=red] (0,0) -- (1,1) -- (2,1) circle (10pt);
\endtikzpicture
\bye

I get the 'expected' output with both TL'16 final and TL'17 pretesting.

Joseph




Re: [XeTeX] [tex-live] xetex.def

2017-05-10 Thread Joseph Wright

On 09/05/2017 23:13, Karl Berry wrote:
> what is the reason for having two .def files here.
> 
> I can't imagine an insurmountable technical reason for having two
> independent .def files these days. Indeed, it would seem highly
> desirable to me to merge them, with conditional parts as needed. It sure
> was a pain to be applying changes to both independently, when I was
> the one doing that.

'Yes'

> One of the reasons I was so happy to turn them over
> to you guys :).

Well the ideas originate from the team ... you'll see for expl3 we've
gone back to 'one definite source' as the number of drivers is nowadays
small and predictable.

> As I expect you know, they currently exist separately because of their
> historical development. xetex.def was based on dvipdfmx.def at the time
> of creating XeTeX. And that was reasonable during active XeTeX
> development. And so it has continued to the present day.  But nowadays,
> when dvipdfmx and xdvipdfmx themselves have been (sort of) merged
> (thanks always to Khaled ...), merging the .def files too seems good.

OK, I'll probably work on this but not before TL'17 release: somewhat
risky and not something I'd want to put on the DVD. (I will send the
latest update to CTAN to fix the issue concerning scaling of links.)

Joseph




Re: [XeTeX] [tex-live] xetex.def

2017-05-09 Thread Joseph Wright

On 09/05/2017 14:27, Joseph Wright wrote:
> So the question is what are the _essentials_ of the difference: really
> it's about 'how much difference do we need to look after in the .def files'.

I should add that the question arose as dvipdfmx.def and xetex.def
currently use different approaches to colour. The reasons are I think
historical: xdv2pdf didn't support the dvipdfmx approach, but xdvipdfmx
does. For maintenance *today* it would be clearer if we had a simple way
of knowing which parts have to be different between the dvipdfmx and
xetex drivers.

Joseph


Re: [XeTeX] [tex-live] xetex.def

2017-05-09 Thread Joseph Wright

On 09/05/2017 14:19, Akira Kakuto wrote:
> Dear Joseph,
> 
>> Following a bug report for (x)dvipdfmx box scaling, we are taking a
>> look at xetex.def and dvipdfmx.def to fix that and related issues. This
>> raises a question: what is the reason for having two .def files here? A
>> quick test suggests that XeTeX (xdvipdfmx) can happily use dvipdfmx.def
>> with the exception of a few lines at the end of the file: those could
>> easily be made conditional.
> 
> I'm not familiar with the drivers, but I think that the independent
> xetex.def is definitely needed.
> 
> I think that images png, jpg, pdf, are efficiently embedded in XeTeX,
> probably by using primitives, while dvipdfmx requires an external
> program extractbb to obtain sizes of the images.
> 
> For example,
> 
> %
> % xelatex test.tex   (xetex.def)
> %
> \documentclass[12pt]{article}
> \usepackage{graphicx}
> \usepackage{pdfpages}
> \begin{document}
> \includepdf[pages={1-9}]{xtst.pdf}
> \end{document}
> 
> is far faster than
> 
> %
> % xelatex test.tex   (dvipdfmx.def)
> %
> \documentclass[12pt,dvipdfmx]{article}
> \usepackage{graphicx}
> \usepackage{pdfpages}
> \begin{document}
> \includepdf[pages={1-9}]{xtst.pdf}
> \end{document}
> 
> Best,
> Akira

Image inclusion is one I'd looked at, and certainly there is some
benefit from using the primitive for bounding-box lookup. However, that
doesn't mean that the entire .def files have to be different: for
example, we might pull them 'back together' in a .dtx and have only the
image inclusion bit varying. Other operations (scaling, rotation, colour
support, ...) seem to be addressable using common code, and indeed final
image inclusion (as opposed to BB extraction) could be done using a
common path for the shared data types (probably though as the BB lookup
is separate one should stick to the primitives in XeTeX).

So the question is what are the _essentials_ of the difference: really
it's about 'how much difference do we need to look after in the .def files'.

Joseph


[XeTeX] xetex.def

2017-05-09 Thread Joseph Wright

Hello all,

Following a bug report for (x)dvipdfmx box scaling, we are taking a
look at xetex.def and dvipdfmx.def to fix that and related issues. This
raises a question: what is the reason for having two .def files here? A
quick test suggests that XeTeX (xdvipdfmx) can happily use dvipdfmx.def
with the exception of a few lines at the end of the file: those could
easily be made conditional.
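
A rough sketch of what such a conditional section could look like in a shared
driver file. The branch contents are placeholders, and the use of \ifdefined
assumes an e-TeX-enabled format (true for current LaTeX formats):

\ifdefined\XeTeXversion
  % XeTeX (xdvipdfmx): XeTeX-specific lines would go here
  \typeout{Driver: XeTeX-specific section active}
\else
  % dvipdfmx after a DVI-producing engine
  \typeout{Driver: dvipdfmx-only section active}
\fi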

Reading over the comments, I see some about the older non-xdvipdfmx
drivers for XeTeX, but these are as far as I know no longer in use
(particularly for anyone likely to use an updated .def file). Are there
any particular reasons that XeTeX needs a separate driver today?

Joseph



Re: [XeTeX] TeX--XeT and OpenType fonts

2017-03-01 Thread Joseph Wright

On 01/03/2017 12:14, Jonathan Kew wrote:
> On 01/03/2017 11:59, Joseph Wright wrote:
>> Hello all,
>>
>> With example
>>
>> \font\OTtenrm="[lmroman10-regular.otf]/OT"
>> \OTtenrm
>> \TeXXeTstate=1
>> \beginR
>> abc
>> \endR
>> \bye
>>
>> the output is LTR with TL'16. Is this a known issue?
> 
> Yes, this is expected behavior. The TeX--XeT direction controls (\beginR
> etc) control the ordering of words within a line, etc. (slightly more
> accurately, the direction in which nodes in an hlist progress), but do
> not override the inherent directionality of Unicode characters, so "abc"
> is still a sequence of three strong-LTR letters and they stay in their
> left-to-right order.
> 
> (However, if you try
> 
>   \beginR
>   abc def
>   \endR
> 
> I'd expect you to get output that reads "def abc" because the two words
> are ordered RTL, even though each of them remains LTR internally.)
> 
> This is why it is possible -- for better or worse -- to do something like
> 
> ...english text {\arabfont العربي} more english
> 
> in a xetex document and have the isolated Arabic word appear with
> correct (internal) RTL directionality, without having to explicitly
> surround it with \beginR...\endR (although for a multi-word Arabic
> phrase that would be necessary); the RTL-ness of the characters controls
> their behavior within the word, despite the TeX direction remaining LTR.
> 
> Currently, there isn't an option to make the TeX-level direction
> override the Unicode character directionality (comparable to the CSS
> property "unicode-bidi:bidi-override;"). Perhaps that would occasionally
> be useful, though people haven't exactly been clamouring for it AFAIK.
> 
> JK

Thanks: all clear.

My guess is for the rare 'override' case one would probably do something
at the macro level in any case (kerning is all wrong to start with).
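
The two-word variant mentioned above can be tried as a complete file. Per the
explanation, the expectation is that the words swap places while each word
stays LTR internally:

\font\OTtenrm="[lmroman10-regular.otf]/OT"
\OTtenrm
\TeXXeTstate=1
\beginR
abc def
\endR
\bye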

Joseph




Re: [XeTeX] TeX--XeT and OpenType fonts

2017-03-01 Thread Joseph Wright

On 01/03/2017 12:08, Joseph Wright wrote:
> On 01/03/2017 12:06, Philip Taylor wrote:
>> What happens if you replace the \endR with a blank line to cause the 
>> paragraph to end, Joseph [1] ?
>> Philip Taylor
> 
> Box ends up starting on the right but text itself is still LTR.

BTW, that's the lack of an explicit \endR not the \par: if you simply
force a paragraph nothing alters. Not that this impacts on the question ...

Joseph




Re: [XeTeX] TeX--XeT and OpenType fonts

2017-03-01 Thread Joseph Wright

On 01/03/2017 12:06, Philip Taylor wrote:
> What happens if you replace the \endR with a blank line to cause the 
> paragraph to end, Joseph [1] ?
> Philip Taylor

Box ends up starting on the right but text itself is still LTR.

I suspect this is not normally noticed as my guess is HarfBuzz 'does its
own thing' with placing the glyphs based on their codepoint (and thus
'natural' LTR/RTL properties), and thus entirely ignores TeX--XeT.
However, it would be useful to know that is correct.

Joseph


[XeTeX] TeX--XeT and OpenType fonts

2017-03-01 Thread Joseph Wright

Hello all,

With example

\font\OTtenrm="[lmroman10-regular.otf]/OT"
\OTtenrm
\TeXXeTstate=1
\beginR
abc
\endR
\bye

the output is LTR with TL'16. Is this a known issue?

Joseph



Re: [XeTeX] [tex-live] Random number primitives

2016-11-14 Thread Joseph Wright

On 14/11/2016 07:39, Akira Kakuto wrote:
> Dear Joseph,
> 
>> - \(pdf)uniformdeviate
>> - \(pdf)normaldeviate
>> - \(pdf)randomseed
>> - \(pdf)setrandomseed
>> (LuaTeX drops the 'pdf' part of the names.)
> 
> H. Kitagawa, the author of eptex, has added
> \pdfuniformdeviate
> \pdfnormaldeviate
> \pdfrandomseed
> \pdfsetrandomseed
> in eptex and euptex (r42506 in TL).
> 
> Best,
> Akira

Hi Akira,

Thanks for letting me know: we'll certainly add the random functionality
now!

Joseph




Re: [XeTeX] Random number primitives

2016-11-13 Thread Joseph Wright

On 13/11/2016 15:49, Karljürgen Feuerherm wrote:
> Well… does it have to be either/or, anyhow? Taking Apostolos’ last point, why 
> not have a switch?
> 
> It makes sense to allow for cross-platform compatibility, but there’s no 
> reason to think that *nobody* would appreciate an improved algorithm….
> 
> K

I'm not *against* an improved approach, but the work needed for 'a new
implementation in pdfTeX, LuaTeX, XeTeX, ...' is greater than 'add
more-or-less the current implementation from pdfTeX and LuaTeX to XeTeX,
...'.

Joseph


Re: [XeTeX] Random number primitives

2016-11-13 Thread Joseph Wright

On 13/11/2016 13:04, Apostolos Syropoulos wrote:
>> to track the seed). The usefulness of pseudo-random numbers has come up
>> a few times recently, and so we'd like to address this. (Expandable
>> floating point evaluation is pretty handy as an end user!)
>  
> 
> Can you please elaborate on the usefulness of pseudo-random numbers in
> a typesetting engine? I think it is a good thing to add features but
> those features should be added for some good reason.

Indeed.

I've seen a variety of use cases for pseudo-random values, in particular
two which come up reasonably often. The first is selecting entries from
a larger 'pool' of values, for example for creating test papers. ('Use 5
out of the 20 questions I've written, picked at random'.) The second,
probably more common, case is in creating figures (for example using
TikZ). Depending on what one is representing, an element of
(pseudo)randomness is useful.

These use cases are beyond what one might call the 'classical' idea of what
TeX is for, in the sense one could do them (and other uses) by hand or
using another tool and import into TeX. However, the programmable nature
of TeX attracts use in these ways. One can generate pseudo-random
numbers at the macro level, most obviously in the pgf package, and this
facility is used by many people. (My own use case for random values, in
creating some figures, uses the pgf implementation.) However, the need
to track the seed value to allow a pseudo-random sequence of values
means that any macro-based implementation is necessarily non-expandable.
The experience of the team with \fp_eval:n, the expandable FPU of the
expl3 bundle, suggests that expandable calculations are useful. In that
context, it would be nice to be able to offer some random value
abilities. (As noted at the start of this thread, that is already
possible in pdfTeX and LuaTeX and we will likely add something to the
FPU which currently will work with those engines only.)
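
As a sketch of the 'questions from a pool' case using the existing pdfTeX
primitives (to be run with pdftex; the seed value and the macro name \pick are
arbitrary choices for the example):

\pdfsetrandomseed 2016 % fixed seed, so reruns give the same sequence
% \pdfuniformdeviate <n> expands to an integer in the range 0 to n-1,
% so adding 1 gives a question number between 1 and 20.
\def\pick{\the\numexpr\pdfuniformdeviate 20 + 1\relax}
\message{Questions: \pick, \pick, \pick, \pick, \pick}
\bye

Because \pick works purely by expansion it can be used inside \message, \edef
or \write; this simple version does not prevent the same number being drawn
twice.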

Joseph


Re: [XeTeX] [tex-live] Random number primitives

2016-11-13 Thread Joseph Wright

On 12/11/2016 23:50, Nelson H. F. Beebe wrote:
> Joseph Wright  writes on
> Sat, 12 Nov 2016 21:45:37 +:
> 
>>> Both pdfTeX and LuaTeX include a series of primitives that expose a
>>> lower-level pseudo-random number generator (I assume from C: there is
>>> very little actual code in the pdfTeX WEB source to implement these).
> 
> Since all *TeX engines on Unix(-like) systems these days are built on
> C code originally translated from the Pascal sources, it should be
> possible to supply an interface in all such engines to a C-library
> random-number generator.
> 
> Because the historical rand() is often platform-dependent, and poor,
> I'd recommend the POSIX drand48() family, which is available on all
> systems these days (and I can supply portable code if you feel the
> need for it).  It repairs some of the defects of 32-bit linear
> congruential generators, and should be able to deliver the same stream
> of random numbers from a given starting seed on all platforms.
> 
> It is imperative that users be able to supply an initial seed, because
> otherwise, it is not possible to generate independent streams of
> random numbers of successive runs, such as might be needed for
> multiple simulations.
> 
> However, the default seed should always be a constant, rather than one
> dependent on time, process-id, or other internal data; that way,
> successive runs are reproducible.  Unreproducible output may make
> debugging impossible.
> 
> If you want to improve the quality of the generator beyond what
> drand48() produces, contact me offlist for details of a simple
> extension that costs almost nothing extra, yet dramatically lengthens
> the period and reduces correlations between the output random numbers.

pdfTeX and LuaTeX *already have primitives* that generate random
numbers: I'm no C/WEB/... programmer but from pdftex.web I think it's
likely to be using rand() ultimately. Getting the same result from
pdfTeX and other engines on the same platform is what seems to me to be
important.

Note that the reason for asking about this is to allow pseudo-random
numbers for the type of thing that does make sense from inside a TeX
run. Examples are adding 'interest' in graphics, picking 'm from n
questions' in a list, etc. Being able to set the seed is useful (and
indeed implemented in pdfTeX/LuaTeX), but highly statistically
satisfying randomness is not. (I've never tested the pgf macro-based
implementation, but it's likely to be a balance between having some
'randomness' and being reasonable in TeX macros.)

Joseph


[XeTeX] Random number primitives

2016-11-12 Thread Joseph Wright

Hello all,

As many people will know, the LaTeX team have developed an expandable
FPU as part of expl3. That gets quite a bit of use, but one area we
can't currently address is random numbers. The pgf bundle has a
pseudo-random number generator, but that can't be expandable (you need
to track the seed). The usefulness of pseudo-random numbers has come up
a few times recently, and so we'd like to address this. (Expandable
floating point evaluation is pretty handy as an end user!)

Both pdfTeX and LuaTeX include a series of primitives that expose a
lower-level pseudo-random number generator (I assume from C: there is
very little actual code in the pdfTeX WEB source to implement these).
That gives us a way of providing expandable random numbers at the macro
level, but at present will be limited to those engines. As this is
something of an 'extra' (not core functionality), we will at present
accept that it's not doable in XeTeX/e-(u)pTeX (and issue an error if
requested), but it would be handy if the functionality could find its
way into those engines.

For reference, the 'full set' of primitives in this area is

- \(pdf)uniformdeviate
- \(pdf)normaldeviate
- \(pdf)randomseed
- \(pdf)setrandomseed

(LuaTeX drops the 'pdf' part of the names.)
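
As a rough sketch of how these primitives might be used from plain TeX:
the wrapper names \myuniform, \mysetseed and \dieroll are made up for
illustration, and the engine test assumes that e-TeX's \ifdefined is
available in the format.

```tex
% Plain TeX sketch; intended for pdftex or luatex.
% \myuniform, \mysetseed and \dieroll are hypothetical names.
\ifdefined\pdfuniformdeviate
  \let\myuniform\pdfuniformdeviate   % pdfTeX names
  \let\mysetseed\pdfsetrandomseed
\else
  \ifdefined\uniformdeviate
    \let\myuniform\uniformdeviate    % LuaTeX names
    \let\mysetseed\setrandomseed
  \else
    \errmessage{No random number primitives in this engine}
  \fi
\fi
\mysetseed 12345 % fix the seed so that runs are reproducible
% \myuniform 6 expands to an integer from 0 to 5, so it works in \edef
\edef\dieroll{\number\numexpr 1 + \myuniform 6\relax}
\message{Die roll: \dieroll}
\bye
```

Because the deviate primitive is expandable, the value can be produced
inside \edef without any assignment. That is the property that makes it
suitable for use inside an expandable FPU.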

Joseph



Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Joseph Wright

On 01/02/2016 09:37, Philip Taylor wrote:
> 
> 
> Akira Kakuto wrote:
> 
>>> Some interesting (and new, and unexpected) diagnostics, Akira-san; as
>>> far as I can tell, no PDF was produced :
>>
>> You have to replace "all" included binaries, by saving the old ones.
>> Note that size of xdvipdfmx.exe,  which is a wrapper of dvipdfmx.dll, 
>> is 1536 bytes.
> 
> Akira-san :  I did just that.  I copied the entire "...\bin\win32"
> directory to ...\bin\win32-old, then overwrote all files in
> ...\bin\win32 with the corresponding files from your ZIP file.
> 
> I will repeat the process just to ensure that I made no errors and then
> report back.
> 
> Philip Taylor

Works for me replacing xetex.exe, dvipdfmx.dll and adding icudt56.dll
(not present in stock TL2015). (System TL2015, updated this morning,
those files + luatex.dll from the W32TeX dev version.)

Joseph




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Joseph Wright

On 01/02/2016 09:00, Philip Taylor wrote:
> 
> 
> Akira Kakuto wrote:
> 
>> You can test the new experimental XeTeX on win32 by
>> http://members2.jcom.home.ne.jp/wt1357ak/xetex-exp-w32.zip
> 
> /Domo arigato gozaimasu/, Akira-san.  Downloaded and installed, just
> re-building formats before commencing testing.
> 
> Philip Taylor

Indeed: all working here, e.g.

\XeTeXcharclass6=16384 %

with

This is XeTeX, Version 3.14159265-2.6-0.3 (TeX Live 2016/W32TeX/dev)

Thanks Akira for this and the LuaTeX builds: very useful.

Joseph




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Joseph Wright

On 31/01/2016 18:31, Philip Taylor wrote:
> Just as TeX has \maxdimen, it would be useful if derivatives of TeX such
> as XeTeX could add analogous environmental enquiries such as
> \maxXeTeXcharclass (or, less uglily but also less meaningfully,
> \XeTeXmaxcharclass).

\maxdimen isn't a primitive (though it's in the plain format).
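
For reference, plain.tex sets \maxdimen up as an ordinary dimen
register, so an analogous constant could in principle be provided at
the format level without any engine change. In the sketch below the
name \XeTeXmaxcharclass and the value 255 are assumptions for
illustration only (255 being the highest class under the 256-entry
limit discussed in this thread).

```tex
% As in plain.tex (already done by the plain format):
\newdimen\maxdimen \maxdimen=16383.99999pt
% Hypothetical analogue for character classes; name and value assumed:
\newcount\XeTeXmaxcharclass \XeTeXmaxcharclass=255
```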

Joseph




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Joseph Wright

On 31/01/2016 18:07, Philip Taylor wrote:
> 
> 
> Jonathan Kew wrote:
> 
>> Before this gets merged to the master source, though, some testing would
>> be appreciated -- obviously, this will currently require rebuilding
>> xetex from the git source branch.
> 
> I use XeTeX on a daily basis, Jonathan ('tho XeTeXcharclass far less
> frequently) and would be happy to test your version on my production
> suites, but in order to do so I would require a Win64 (or Win32) build.
> 
> Philip Taylor

Hopefully Akira Kakuto will do that for W32TeX: he's done LuaTeX
v0.85/0.87/0.88 binaries.

Joseph




Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread Joseph Wright

On 13/12/2015 07:04, Werner LEMBERG wrote:
> 
> [XeTeX 3.14159265-2.6-0.2 (TeX Live 2015)]
> 
> 
> Folks,
> 
> 
> I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
> style file maps Unicode character blocks to character classes, and
> I've hit the 256 entry limit of \XeTeXcharclass...
> 
> Any chance to extend it to 16 bits?
> 
> 
> Werner

I've been looking at Unicode classes recently :-) Exactly what
sub-division are you going for? There are several Unicode values that
seem to be important for 'full classification'.

Joseph




Re: [XeTeX] \(pdf)mdfivesum

2015-07-10 Thread Joseph Wright

On 10/07/2015 10:37, Akira Kakuto wrote:
> Dear Joseph,
> 
>> I have a request for a new primitive in XeTeX, not directly related to
>> typesetting but I think useful. To understand why I'm asking, a bit of
>> background would be useful.
> 
> The XeTeX in the latest TeX Live repository has
> a new primitive \pdfmdfivesum imported from pdfTeX.
> However the name and the implementation itself, are
> still volatile.
> 
> Best regards,
> Akira

Thanks: hope it was not too much effort.

I'll have to get on with what I was thinking it was useful for now!
--
Joseph Wright




Re: [XeTeX] \(pdf)mdfivesum

2015-07-02 Thread Joseph Wright

On 02/07/2015 05:54, msk...@ansuz.sooke.bc.ca wrote:
> If MD5 is necessary for compatibility with some existing standard, so be
> it; but it's not secure anymore and it shouldn't be used in any new design
> where there's a concern about possible deliberate tampering, as opposed to
> accidental errors.  SHA1 is deprecated, too.  I think SHA256 is the
> current "best practice."

Depends what you are using it for. Collisions are possible in MD5 so
it's no longer suitable for cryptographic applications. Here, however,
we are talking about avoiding the more prosaic issues of people having
not-quite matching sources. (We are *not* talking about signing
documents.) For the use case I have in mind MD5 will happily do the job.
--
Joseph Wright




Re: [XeTeX] \(pdf)mdfivesum

2015-07-01 Thread Joseph Wright

On 01/07/2015 19:39, Apostolos Syropoulos wrote:
>>
>> We can happily generate that file using pdfTeX (\pdfmdfivesum primitive)
>> or LuaTeX (using Lua code), but not using XeTeX. That's not a big issue
>> but the need for an MD5 sum gives me an idea which would need support in
>> XeTeX.
>>
> 
> The (Xe)TeX language has been designed not for system programming and I
> wonder why people would like to make it a system's programming language.
> A better idea would be to use Perl or Python or even Ruby, which are
> widely available.

TeX systems are essentially self-contained: on a Windows system with TeX
Live or MiKTeX installed one can nowadays assume Lua as well as TeX but
nothing else. Moreover, I'm thinking specifically of a use case linked
to the process of document preparation itself.
--
Joseph Wright




[XeTeX] \(pdf)mdfivesum

2015-07-01 Thread Joseph Wright

Hello all,

I have a request for a new primitive in XeTeX, not directly related to
typesetting but I think useful. To understand why I'm asking, a bit of
background would be useful.

The LaTeX team have recently taken over looking after catcode/charcode
info for the Unicode engines from the previous rather diffuse situation.
As part of that, we were asked to ensure that the derived data was
traceable and so have included the MD5 sum of the source files in the
new unicode-letters.def file.

We can happily generate that file using pdfTeX (\pdfmdfivesum primitive)
or LuaTeX (using Lua code), but not using XeTeX. That's not a big issue
but the need for an MD5 sum gives me an idea which would need support in
XeTeX.

LaTeX offers \listfiles to help us track down package version issues but
this fails if files have been locally modified or don't have
date/version info. It would therefore be useful to have a system that
can ensure that files match, which is where MD5 sums come in. One can
imagine arranging that every file \input (or \read) has the MD5 sum
calculated as part of document typesetting: this is not LaTeX-specific.
This data could then be available as an additional file listing to help
track problems. However, to be truly useful this would need to work with
all three major engines, and currently XeTeX is out. I'd therefore like
to ask that \pdfmdfivesum (or perhaps just \mdfivesum) is added to XeTeX.
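
A minimal sketch of the idea, for pdfTeX only (where \pdfmdfivesum
already exists). The macro name \inputwithsum and the file name
story.tex are placeholders for illustration, not part of any existing
package, and the file is assumed to be one that the engine can find.

```tex
% Plain TeX sketch for pdfTeX, which provides \pdfmdfivesum.
% \inputwithsum is a hypothetical name; story.tex is a placeholder.
\def\inputwithsum#1{%
  % Record the checksum in the log file, then read the file as usual.
  \immediate\write-1{MD5 (#1) = \pdfmdfivesum file {#1}}%
  \input #1
}
% Without the "file" keyword the sum is taken over the text itself.
\message{MD5 of the string hello: \pdfmdfivesum{hello}}
\inputwithsum{story.tex}
\bye
```

The primitive is expandable, so the sum can be written out or compared
without any assignments. Hooking this into every \input or \read would
be the job of the format or a package.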



There are a small number of other 'utility' primitives in pdfTeX/LuaTeX
(some in the latter as Lua code emulation) that might also be looked at
at the same time (see
http://chat.stackexchange.com/transcript/message/22496265#22496265):

 - \pdfcreationdate
 - \pdfescapestring
 - \pdfescapename
 - \pdfescapehex
 - \pdfunescapehex
 - \pdfuniformdeviate
 - \pdfnormaldeviate
 - \pdffilemoddate
 - \pdffilesize
 - \pdffiledump
 - \pdfrandomseed
 - \pdfsetrandomseed

most of which are not related to PDF output and which may have good use
cases. I am specifically *not* asking for any of these to be added here
but note this list as it *may* be that the work may be closely related.
--
Joseph Wright



Re: [XeTeX] Case changing for Greek

2015-05-07 Thread Joseph Wright

On 07/05/2015 15:02, Jonathan Kew wrote:
> On 7/5/15 13:22, Joseph Wright wrote:
>> Included in that 'standard' set up is the final sigma rule for Greek
>> text. For performance reasons that code has been set up to assume that a
>> sigma is final if it is followed by a space, a control sequence or a
>> character from the list
>>
>>  ) ] } . : ; , ! ? ' "
> 
> Would it be feasible to define this negatively instead -- something like
> "a sigma is final if it is NOT followed by another letter"?

Possibly yes: I guess in the TeX context a catcode-based test would work
reasonably well. I'll explore that.
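
A rough sketch of what such a catcode-based test could look like in
plain TeX. This is not the expl3 implementation: it uses \futurelet and
so is not expandable, it ignores the lone-sigma refinement, and the
names \lcsigma, \picksigma and \nexttoken are made up for illustration.
It also assumes that Greek letters have catcode 11 and that a font with
Greek glyphs is already selected.

```tex
% Plain TeX sketch for XeTeX or LuaTeX with UTF-8 input.
% \lcsigma, \picksigma and \nexttoken are hypothetical names.
\def\lcsigma{\futurelet\nexttoken\picksigma}
\def\picksigma{%
  \ifcat\noexpand\nexttoken a%
    σ% a letter (catcode 11) follows: medial form
  \else
    ς% anything else follows: final form
  \fi
}
```

Note that TeX skips spaces after a control word, so in a test file a
word-final use needs to be written as \lcsigma{} (or be followed
directly by punctuation) for the non-letter to be seen.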

> A possible refinement is that a lone sigma, neither preceded nor
> followed by another letter, should probably be lowercased as σ rather
> than ς.

One that needs input from a Greek speaker!

> To see the result of what we implemented for Firefox, you can try
> loading a testcase such as
> 
>   data:text/html;charset=utf-8,
> ΣΑΒ ΑΣΒ ΑΒΣ Σ ΣΣΣ (Σ)
> 
> in the browser, which displays it as "σαβ ασβ αβς σ σσς (σ)". (And I
> notice Chrome and Safari have the same behavior, too.)

Much the same as we do.
--
Joseph Wright





Re: [XeTeX] Case changing for Greek

2015-05-07 Thread Joseph Wright

On 07/05/2015 14:26, Jonathan Kew wrote:
> FWIW, we've done some work on this in Mozilla in the past few years, to
> provide language-appropriate behavior for CSS features like
> text-transform:uppercase and font-variant:small-caps. You might like to
> review the discussion in bug reports such as
> 
>   https://bugzilla.mozilla.org/show_bug.cgi?id=231162
>   https://bugzilla.mozilla.org/show_bug.cgi?id=307039
>   https://bugzilla.mozilla.org/show_bug.cgi?id=740120
>   https://bugzilla.mozilla.org/show_bug.cgi?id=740477
> 
> In particular, bug 307039 has a lot to say about uppercasing Greek. The
> details of actual code patches will obviously not be relevant, but the
> comments describing desired/implemented behavior may be helpful.

Thanks for that: looks useful. Will need to read properly and digest.

I am left wondering why this is not addressed in SpecialCasing.txt!
--
Joseph Wright




Re: [XeTeX] Case changing for Greek

2015-05-07 Thread Joseph Wright

On 07/05/2015 14:23, Nikos Platis wrote:
> 2015-05-07 15:22 GMT+03:00 Joseph Wright :
> 
>> For performance reasons that code has been set up to assume that a
>> sigma is final if it is followed by a space, a control sequence or a
>> character from the list
>>
>> ) ] } . : ; , ! ? ' "
>>
>>
> I would add to this list the dashes, "anoteleia", the greek closing quote
> "»".
> On the contrary, the english question mark "?" would not belong to a greek
> text.

Thanks for the additions. Per Unicode, the final sigma rule applies to
all text using Greek chars, not just text in Greek, so a sentence in
English finishing with a Greek word should presumably apply the rule to
that word, hence having "?" in my list (I am aware that Greek uses ";"
to indicate a question).
--
Joseph Wright





Re: [XeTeX] Case changing for Greek

2015-05-07 Thread Joseph Wright

On 07/05/2015 13:40, Philip Taylor wrote:
> 
> 
> Joseph Wright wrote:
> 
>> For performance reasons that code has been set up to assume that a
>> sigma is final if it is followed by a space, a control sequence or a
>> character from the list
>>
>> ) ] } . : ; , ! ? ' "
> 
> The inclusion of "a control sequence" worries me; may I ask why you do
> not propose to expand the control sequence (if expandable) or ascertain
> its equivalence (if \let, for example) in order to predicate the
> assessment as to whether or not the sigma is final on the expansion /
> equivalence of the control sequences as would seem at first sight to be
> required.

As the code here is for expl3, there is an assumption that such input is
either fully-expandable or engine protected. As such, application of
\edef in a preceding step will do the same without having to put in the
rather complex loops one needs otherwise (the code is expandable so
cannot itself include assignments). Note also that the code we have is
explicitly intended for 'text': it seems unlikely that real user text will
intermix stored characters and literal ones and indeed defining the
logic either way could be questionable.

If it becomes clear that the current approach does not work then
alternatives can be considered. We don't have much in the way of
use-cases at present beyond ones we've thought of ourselves.
--
Joseph Wright


[XeTeX] Case changing for Greek

2015-05-07 Thread Joseph Wright

Hello all,

The question of case changing in Greek has come up in another thread.
Whilst the details here aren't XeTeX (or even TeX) specific, given the
interest by members of the list I hope I can take advantage to ask about
the area.

For work on LaTeX3/expl3 we've put together an approach to case changing
in XeTeX (and LuaTeX) that is not tied to a 1-1 mapping.

One of the design ideas behind the code was to allow a way to tackle
context- and language-dependent changes. At the same time, to date we
have used the Unicode docs to define case mappings. Thus the 'standard'
mappings follow those in UnicodeData.txt (1-1 lower/title/upper) and
SpecialCasing.txt (more complex cases).

Included in that 'standard' set up is the final sigma rule for Greek
text. For performance reasons that code has been set up to assume that a
sigma is final if it is followed by a space, a control sequence or a
character from the list

) ] } . : ; , ! ? ' "

Other potential additions are welcome as is testing of what we have
done. (There seem to be a lot of edge cases. For example, what happens
if a sigma is immediately followed by a number, say in a computational
identifier?)

What has not been covered at all to date is any special handling of
accents. As indicated in the other thread, it seems that the handling of
accents in Greek is non-trivial. Notably, we have an implementation
which separates out title case from upper case and have the idea of
language-dependent mappings. Thus it would be perfectly possible to have
logic 'Retain accents on the first letter of a word when title casing;
remove them when upper casing'. Similarly, I wonder if there are
differences in practice related to the nature of the text: modern
writing vs. historical text, etc. Again, this can be added if there is a
clear set of rules to follow.

Detailed information is most welcome.
--
Joseph Wright



Re: [XeTeX] Σχετ: Re: Assignment of codes (particularly \catcode) based on Unicode data

2015-05-07 Thread Joseph Wright

On 07/05/2015 10:56, Jonathan Kew wrote:
> On 7/5/15 09:34, Philip Taylor wrote:
>>
>>
>> Apostolos Syropoulos wrote:
>>
>>> The only mark that remains when making all capitals is the dieresis
>>> (dialytika). All other vanish. This is common knowledge for people who
>>> speak and write Greek.
>>
>> Well, this is not the opinion of (for example) Dr Charalambos Dendrinos,
>> a native Greek speaker and Director of the Hellenic Institute.  This is
>> why I asked whether it was a universally-agreed truism or simply a
>> matter of opinion, and in view of the fact that both Dr Dendrinos (in
>> private correspondence) and Julian Bradfield (on this list) have offered
>> the alternative perspective to your own, it would seem to be a matter of
>> opinion rather than one of fact.  If you look at the opening folio of
>> George Etheridge's Encomium on Henry VIII, addressed to Elizabeth I :
>>
>> 
>> http://hellenic-institute.rhul.ac.uk/research/Etheridge/Electronic-Edition/
>>
>>
>> you will see a number of Greek majuscules with either psilí or daseîa,
>> including the very combination under discussion (GREEK CAPITAL LETTER
>> EPSILON WITH PSILI, on line 2), suggesting that the combination of
>> breathing and majuscule was common at that time.
> 
> I think there may be some confusion as to exactly what this discussion
> is about. Certainly, "the combination of breathing and majuscule" occurs
> in mixed-case polytonic text, as shown in your example. However,
> Apostolos is (I think) addressing the case of all-uppercase text, in
> which case the usual practice is to drop all marks except dieresis.
> 
> See, for example, http://unicode.org/udhr/d/udhr_ell_polytonic.html;
> note the presence of breathing marks on initial capitals within the
> text, but note also their complete absence in the ALL-CAPS title.
> 
> So if a lower-to-uppercase mapping is used just to Capitalize Initial
> Letters, it perhaps should not discard breathing marks; but if it is
> used to turn a passage of text into ALL UPPERCASE, then it probably
> should discard them.
> 
> But things are actually trickier than that. AIUI, the most correct
> polytonic UPPERCASE transform for "μάιος" would be "ΜΑΪΟΣ" -- not only
> is the accent on ά gone, but ι has acquired a dieresis and become Ϊ.
> 
> The \uccode/\lccode tables in (Xe)TeX cannot fully capture this, no
> matter what code assignments are chosen; neither can the per-character
> properties in Unicode. It requires a more powerful approach to case
> transforms.
> 
> So I still maintain that the default code values assigned in formats
> such as xe(la)tex should be based directly on the Unicode properties. It
> would be great to have a Greek package that implements proper Greek
> uppercasing, but this level of language- and orthography-specific
> behavior does not belong in the base format.

Indeed, whilst not what I was after here (which as you say is about
defaults for the formats), in the expl3 code I've written for case
changing the idea of positional dependence is built in. There's no
question that the TeX 1-1 mapping for case changing is not applicable to
many situations, not just the case of Greek text. I'll ask a separate
question about Greek case mapping for the expl3 context later on as it
seems to have people's attention.
--
Joseph Wright





Re: [XeTeX] Assignment of codes (particularly \catcode) based on Unicode data

2015-05-06 Thread Joseph Wright

On 06/05/2015 21:06, David Carlisle wrote:
> On 6 May 2015 at 20:15, Philip Taylor  wrote:
>>
>>
>> Apostolos Syropoulos wrote:
>>
>>> It seems to me that most people have no idea what Unicode is and what is
>>> really involved.
>>
>> OK, so if we restrict the Universe of Discourse to the set of native
>> Hellenic speakers who know what Unicode is, know the importance of being
>> able to use it to identify the correct upper case of (for example)
>> 'GREEK SMALL LETTER EPSILON WITH PSILI', and hold an informed opinion on
>> the matter, would you expect that 100% of these would agree that the
>> uppercase is 'GREEK LETTER EPSILON' and not 'GREEK LETTER EPSILON WITH
>> PSILI', or would you expect that some percentage (perhaps small) would
>> hold the opposite point of view ?
>>
>> ** Phil.
>>
> 
> I don't think that's the right question. Even if everyone, including
> the Unicode technical committee,
> agreed some properties are incorrect for some characters, it isn't
> clear we should change
> them at this level.
> 
> I think that unicode-letters.def makes most sense as a
> fully automated representation of the UCD data files in TeX syntax.
> 
> That way everyone knows what data is in there.
> 
> Individual language packages have far fewer characters to worry about
> and can over-ride
> the base settings where appropriate.

Indeed: provided hyphenation is correct then we are OK. (LuaTeX of
course is rather more flexible there than XeTeX.)
--
Joseph Wright




Re: [XeTeX] Assignment of codes (particularly \catcode) based on Unicode data

2015-05-06 Thread Joseph Wright

On 06/05/2015 16:04, Apostolos Syropoulos wrote:
> Hello,
> 
> I checked a bit the file and I have noticed that 
> 
> 
> \L 1F10 1F18 1F10 % 
> 
> while xgreek.sty defines 
> 
> 
> \global\lccode"1F10="1F10 \global\uccode"1F10="0395
> 
> You see the uppercase of 'GREEK SMALL LETTER EPSILON WITH PSILI'
> is 'GREEK LETTER EPSILON' and not 'GREEK LETTER EPSILON WITH PSILI'.
> 
> Some time ago I reported this to the Unicode people and they told me
> something like "we cannot change it now" (I do not remember the exact
> wording but the essence remains the same.) Naturally, all \lccodes and
> \uccodes for Greek letters are wrong and I suspect many more are wrong. 

This is slightly at a tangent from my original question (whether we are
processing the Unicode data in the right way), but is worth
consideration. It also has some impact on expl3 code related to case
changing (which does not use \lccode/\uccode).

I guess one could imagine deviating from the Unicode data but there are
issues. First, the current position is at least easy to explain. Second,
the current approach is the same position taken by I guess many other
pieces of software, so is cross-compatible with other stuff. Third, as a
non-Greek I can't comment on the technical correctness of what you say!
Is there some place I could see this discussed in detail? (I'm a bit
confused as to what 'GREEK CAPITAL LETTER EPSILON WITH PSILI' represents
if it's not the upper case of 'GREEK SMALL LETTER EPSILON WITH PSILI': I
notice in xgreek you map U+1F18 to U+0395 for upper casing and U+1F10
for lower casing.)
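
To make the difference concrete, here is a small plain TeX sketch. The
\L macro below is a simplified stand-in for the one used in the data
file: the argument order (code point, upper case, lower case) is
inferred from the quoted line, and the real definition may well do
more. The group keeps plain TeX's own \L intact.

```tex
% Plain TeX sketch for XeTeX or LuaTeX.
\begingroup
% Simplified, assumed form of the data-file macro:
% \L <code point> <upper case> <lower case>
\def\L #1 #2 #3 {%
  \global\catcode"#1=11
  \global\uccode"#1="#2
  \global\lccode"#1="#3
}
\L 1F10 1F18 1F10 % Unicode data: upper case of U+1F10 is U+1F18
\endgroup
\message{uccode from Unicode data: \the\uccode"1F10}% 7960, i.e. "1F18
\uccode"1F10="0395 % xgreek: upper case of U+1F10 is U+0395
\message{uccode as in xgreek: \the\uccode"1F10}% 917, i.e. "0395
\bye
```

Either way \uppercase can only map one character to one character,
which is why a language package overriding the default is as far as the
\uccode mechanism can go.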
--
Joseph Wright



Re: [XeTeX] Assignment of codes (particularly \catcode) based on Unicode data

2015-05-06 Thread Joseph Wright

On 06/05/2015 15:09, Jonathan Kew wrote:
> On 6/5/15 14:14, Joseph Wright wrote:
> 
>> Based on the current files, we have a block to set \XeTeXcharclass,
>> which only applies to XeTeX. The logic followed in that code is that
>> characters in the file LineBreak.txt which have class "ID" (ideographs)
>> not only set the \XeTeXcharclass class to 1 but also set the \catcode of
>> the code point to 11. That leads to a difference between the two Unicode
>> engines. My current feeling is that the data file should split this
>> process such that the category code change applies to both XeTeX and
>> LuaTeX, with the XeTeX-specific code separate. Does this make sense and
>> indeed does the current assignment make sense?
>>
> 
> ISTM that the most appropriate (default) \catcode for characters with
> class ID is clearly letter (11), and would suggest that LuaTeX should
> follow XeTeX in this.

Well for LaTeX at least the team get to make the call here and I think
we will pull everything into line.

> So yes, splitting out the XeTeX-specific code and having LuaTeX share
> the catcode assignments makes sense.

OK, if there are no objections I have a plan on this (I'll actually keep
all of the data, I think, and alter the assignment code).

> After all, if users can write control sequences such as
> 
>   \hello
>   \halló
>   \Здравствуйте
>   \ሰላም
>   \सलाम
> 
> they should equally well be able to write
> 
>   \你好
>   \こんにちわ
> 
> and have each of these treated as single control sequences, too. This
> will not work if category ID characters are given catcode 12.

Entirely reasonable.
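
A minimal sketch of the point, with the catcodes set explicitly so that
it does not depend on what the format has already done (UTF-8 input is
assumed):

```tex
% Plain TeX sketch for XeTeX or LuaTeX with UTF-8 input.
\catcode`\你=11 \catcode`\好=11 % letters, so \你好 is one control word
\def\你好{hello}
\你好
% With catcode 12 instead, \你好 would be read as the control
% symbol \你 followed by the character 好.
\bye
```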

> If you're making improvements to unicode-letters.def, I would suggest
> also adding a section that assigns catcode 15 (invalid) to the code
> values "D800 - "DFFF (i.e. the UTF-16 surrogates, which should never be
> used in isolation as characters).

Noted: easy enough to add.
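
Something along the following lines would do it. This is a sketch using
the plain TeX \loop and a scratch count register, not the code that
ended up in the data file.

```tex
% Plain TeX sketch for XeTeX or LuaTeX.
\begingroup
\count255="D800
\loop
  \global\catcode\count255=15 % invalid character
  \ifnum\count255<"DFFF
    \advance\count255 by 1
\repeat
\endgroup
```

The assignment comes before the test, so both end points "D800 and
"DFFF are covered. The group keeps the change to \count255 local while
\global makes the catcode settings stick.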
--
Joseph Wright





[XeTeX] Assignment of codes (particularly \catcode) based on Unicode data

2015-05-06 Thread Joseph Wright

Hello all,

As some people will have seen, the LaTeX team have recently integrated
setting of codes (\catcode, \lccode, etc.) for the entire Unicode range
into the kernel when XeTeX/LuaTeX are in use. This is not a functional
change for end users but does mean that the team now have some control
over these important settings. Notably, the new data file we have
created (unicode-letters.def) is compatible with plain TeX and works
with both XeTeX and LuaTeX. We are therefore hopeful that it will
prove useful not only to LaTeX users but also to those using
plain-based formats.

For the initial pass we have adopted the settings applied by
unicode-letters.tex (XeTeX)/luatex-unicode-letters.tex (LuaTeX) as-is.
We have constructed a new (TeX) script to generate this data from the
raw Unicode data files.

Most of the settings are straight-forward and shared between XeTeX and
LuaTeX. For example, characters marked by Unicode as letters have
\catcode 11, \lccode and \uccode are set up based on case relationships,
etc. However, we would like to raise one area that may need revision.

Based on the current files, we have a block to set \XeTeXcharclass,
which only applies to XeTeX. The logic followed in that code is that
characters in the file LineBreak.txt which have class "ID" (ideographs)
not only set the \XeTeXcharclass class to 1 but also set the \catcode of
the code point to 11. That leads to a difference between the two Unicode
engines. My current feeling is that the data file should split this
process such that the category code change applies to both XeTeX and
LuaTeX, with the XeTeX-specific code separate. Does this make sense and
indeed does the current assignment make sense?

We are very keen to hear about any other logic changes that may be
required in the data file. This is a complex area and we have at present
done little other than copy the current logic.
--
Joseph Wright



Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Joseph Wright

On 28/04/2015 00:48, Douglas McKenna wrote:
>> That isn't at all clear, I don't see any evidence that the equivalent 
>> &#; notation in XML is getting less used. For runs of natural language 
>> text then clearly using character data directly makes more sense
>>  but to get specific symbols accessing by code point often makes sense. 
>>
>> To get a math bold A, It is much easier to tell someone to enter ^1d400 
>> than to tell them how
>> to enter 𝐀 in whatever system they are using.
> 
> Of course, every Unicode reference on the web would be referring to it as 
> U+1D400.  Sigh.
> 
> Anyway, duly noted.  Except in the future, "whatever system they are using" 
> will likely be able to handle the UTF-8 character as direct input, just like 
> it appears (in my email reader) above as a math bold A.

Well yes and no. Whilst editors, viewers, etc. can be expanded to cover
the entire Unicode range, no one font will cover the entire spectrum. At
the same time, most keyboards are only ever going to have ~100 keys. So
whilst for the main language of a document Unicode makes sense, saying
that you have to find the correct Unicode code point for everything is
not so convenient. (One can arrange different key bindings to flip
between, which could be used for, say, the math-mode bold A business,
but that may or may not be easier than just typing \mathbf{A} or
whatever, depending on the use case.)
--
Joseph Wright





Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Joseph Wright

On 27/04/2015 08:43, Ross Moore wrote:
> Hi Joseph,
> 
> On 27/04/2015, at 4:19 PM, Joseph Wright wrote:
> 
>> On 27/04/2015 00:22, Ross Moore wrote:
>>>> But of course that doesn't address the problem for LaTeXt users until
>>>> someone writes a suitable/comparable package (maybe someone did
>>>> already, I didn't try to follow).
>>>
>>> I have coding for much of what is needed, using the modified pdfTeX.
>>> But there is a lot that still needs to be added; e.g. PDF’s table model,
>>> References, footnotes, etc.
>>
>> Somewhat away from the original topic, but it strikes me that building a
>> tagged PDF is going to be much more problematic at the macro layer than
>> at the engine level: is that fair? 
> 
> Certainly one needs help at the engine level, to build the tree
> structures: what is a parent/child of what else.

Yes, I didn't mean that engine support isn't required, but that some of
the more complex concepts are probably at the macro layer. You know a
lot more about this than I do, but I assume that there is more to tagged
PDFs than sectioning (which is relatively easy to define). For example,
as a chemist I'd guess one has to worry about chemical formulae and
about reference numbers to compounds. (We tend to give the latter in
bold and they commonly refer to graphics representing the structures.
That looks very tricky to me to express in a tagged form!)

> But macros are needed to determine where new structure starts
> and finishes.
> Think  \section  and friends, list environments, \item  etc.

Yes, those elements seem relatively clear. As I've noted in another
reply, ConTeXt MkIV has moved to a more XML-like \startitem ...
\stopitem construct as the preferred way to deal with (here) items, I
guess in part as that makes such things easier. As a user that's
slightly more tricky: I'd say that the idea that one item is ended by
the start of the next is pretty clear :-)

> Indicators must go in at a high level, before these are decomposed
> into the content:  letters, font-switches, etc.

Again, understood and I think reasonably clear for the macro level.

> In short, determining where structure is to be found is *much* harder
> at the engine level; but doing the book-keeping to preserve that
> structure, once known, is definitely easier when done at that level.

Makes sense. One can imagine constructing a tree at the macro level but
as there still needs to be some tagging I guess it doesn't help. (Can
the latter be done using \specials?)

> Philip Taylor is correct in thinking that such things can be
> better controlled in XML. But there the author has to put in
> the extra verbose markup for themselves --- hopefully with help
> from some kind of interface.
> However, that can involve a pretty steep learning curve anyway.

> Word has had styles for decades, but how many authors actually
> make proper use of them?  e.g. linking one style to another,
> setting space before & after, rather than just using newlines,
> and inserting space runs instead of setting tabs.
> How many even know of the difference between   and
> Shift-  (or is it Option- ) ?

:-)

> The point of (La)TeX is surely to allow the human author
> to not worry too much about detailed structure, but still allow
> sufficient hints (via the choice of environments and macros used)
> that most things should be able to be worked out.

That's the plan, I guess.
--
Joseph Wright


Re: [XeTeX] XeTeX maintenance

2015-04-27 Thread Joseph Wright

On 27/04/2015 07:35, Philip Taylor wrote:
> Going even further off-topic, but pursuing this one aspect of the
> thread, is there not only one real problem :  the need to educate users
> to cease marking up their documents in raw (La)TeX syntax, and instead
> to express them in well-formed XML ?  I have just finished typesetting
> (using [plain] XeTeX) a 544pp book marked up entirely in XML, and whilst
> I have made no efforts to generate PDF/UA, I am convinced that the task
> of so doing (assuming that the necessary primitives are or were
> available in XeTeX) would have been 1/1000 of the effort needed to do so
> had the book been marked up in traditional (La)TeX syntax with its usual
> accompanying conflation of form and content.

As Ross says in a parallel message, XML raises different issues and is
not a panacea. For a start, we can ask if XML is a particularly good
format, not only here but for anything (there's a blog post by Linus
Torvalds suggesting the answer is 'no'!). Assuming XML is at some level
a good plan, that still doesn't make it a good plan for the end user nor
ensure that the end user will stick to logical structures. There's also
the business that TeX is useful because sometimes we do need some visual
adjustment or programming element.

LaTeX2e is already not bad for structure if used in the right way, and
ConTeXt MkIV has gone further along an XML-like road without using this
as the native syntax (\startsection/\stopsection for example), and of
course plain users can define similar structures (indeed without the
constraints that LaTeX has of needing not to break things).
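
As a sketch of that last point for plain (the macro bodies are purely
illustrative; only the names follow ConTeXt):

  % explicit start/stop markers give a place to hang structure later
  \def\startsection#1{\bigbreak\noindent{\bf #1}\par\nobreak\medskip}
  \def\stopsection{\par\bigskip}
  \startsection{Introduction}
  Some text.
  \stopsection
  \bye

Nothing here produces tagging by itself; the point is only that both ends
of the section are marked explicitly in the source.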
--
Joseph Wright


Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Joseph Wright

On 27/04/2015 01:05, Douglas McKenna wrote:
> Joseph Wright wrote:
> 
>> \def\"{0}\expandafter\def\csname^^^^^00022\endcsname{1}
>> \ifnum\"=0 \message{tex82}\else\message{newstuff}\fi
> 
> When I implemented a Unicode escape sequence extension using double-caret 
> notation in the JSBox TeX-language interpreter I've been working on (which is 
> all 21-bit Unicode internally, all the time, but can be configured at 
> run-time to be 8-bit input only), I was unaware of what XeTeX had 
> implemented, so I just used
> 
> ^^uxxxx (for 16-bit, BMP codes)
> ^^Uxxxxxx (for all 21-bit Unicode code points)
> 
> Seemed straightforward enough.

XeTeX conventions have been picked up by LuaTeX on this, and there's
been some 'feedback' from LuaTeX to XeTeX to give us some
standardisation for Unicode primitives/syntax (admittedly with bugs, but
that's a different point). I'd hope that any future Unicode TeX-like
systems would also pick up on the model used by XeTeX/LuaTeX.

> Given that the number of TeX input files using ^^u is likely minuscule, and
> the number of those that follow the ^^u or ^^U with four or six hex digits is 
> even smaller, it seemed like a worthwhile benefit vs. cost, 
> compatibility-wise.  Maybe there's something I've not thought out well.

I didn't mean that there would be many real-world docs with this issue.
I was trying to point out that it's almost impossible to imagine that a
Unicode TeX-like engine could be used as a drop-in replacement for the
current 8-bit ones (pdfTeX most obviously), so when we talk about 'the
future' we have to mean 'for documents written assuming Unicode' rather
than 'for all existing TeX documents'. (For mathematicians the latter
point is very important.)

> This discussion I just found is both pertinent and frightening, I suppose:
> 
> http://stackroulette.com/tex/62725/the-notation-in-various-engines

That's a (questionable) reuse of the info from

http://tex.stackexchange.com/questions/62725/the-notation-in-various-engines

Note that the discussion is editable (wiki-like) and to my knowledge is
still correct as-is. There are some tricky issues in XeTeX, particularly
related to non-BMP chars, partly because working out what should happen
here has been a work-in-progress.
--
Joseph Wright


Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Joseph Wright

On 27/04/2015 00:22, Ross Moore wrote:
>> But of course that doesn't address the problem for LaTeX users until
>> someone writes a suitable/comparable package (maybe someone did
>> already, I didn't try to follow).
> 
> I have coding for much of what is needed, using the modified pdfTeX.
> But there is a lot that still needs to be added; e.g. PDF’s table model,
> References, footnotes, etc. 

Somewhat away from the original topic, but it strikes me that building a
tagged PDF is going to be much more problematic at the macro layer than
at the engine level: is that fair? Deciding what elements of a document
are 'structure' is hard, and in 'real' documents it's not unusual to see
a lot of input that's more about appearance than structure. That of
course isn't limited to TeX: I suspect anyone trying to generate tagged
output has the same concern (users do odd things).
--
Joseph Wright





Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Joseph Wright

On 26/04/2015 12:16, Philip Taylor wrote:
> 
> 
> Joseph Wright wrote:
> 
>> See for example details in
>> http://tex.stackexchange.com/questions/86/what-are-the-incompatibilities-of-pdftex-xetex-and-luatex
>> for places where there are edge cases. The most obvious would be that
>> XeTeX requires the xdvipdfmx back-end (so differences at the \special
>> level), 
> 
> Yes, I accept that, but to the user (as I have argued elsewhere), XeTeX
> subsumes 'xdvipdfmx' -- the fact that they are, historically, two
> separate pieces of software and are separately maintained is a sad fact
> of life but not one that the user of XeTeX should be required to consider.

Still requires changes in a document, particularly one written for
pdfTeX in PDF mode (certainly for plain: for LaTeX of course this is
more transparent).

> but a simple piece of code
>>
>> \def\"{0}\expandafter\def\csname^^^^^00022\endcsname{1}
>> \ifnum\"=0 \message{tex82}\else\message{newstuff}\fi
>>
>> (ConTeXt wiki) gives different results with TeX90 and XeTeX due to
>> different treatment of more than two ^^ (catcode 7) in a row.
> 
> OK, agreed: by adding support for wider characters, some breakages will,
> almost of necessity occur, but I would respectfully argue that these are
> pathological cases that will not impact real-world documents.

My point though is that neither XeTeX nor indeed any other Unicode
TeX-like engine can be used as a direct replacement for an 8-bit engine:
contrast the fact that the standard engine for TeX Live is nowadays
pdfTeX used as a direct drop-in replacement for TeX90 (with the
exception of using "tex", which is Knuth's TeX unaltered). As such,
whilst new documents may be written using a Unicode engine, pdfTeX will
remain vital.

All that said, I am keen that some way is found to continue to work on
XeTeX. The problem is that WEB code is *hard* to work with!
--
Joseph Wright




Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Joseph Wright

On 26/04/2015 12:00, Philip Taylor wrote:
> 
> 
> Joseph Wright wrote:
> 
>> The problem as always is not so much money as people.
> 
> Yes, I do appreciate that, but sometimes money is also an obstacle ("do
> I work on X, which will help keep a roof over my head, or on XeTeX,
> which may bring me fame but which may also result in my eviction ?").
> 
>> [Also, you do know about LuaTeX, yes? ;-)
> 
> Yes, of course, but I see it as an evolutionary dead-end, much as I
> would wish to see it otherwise.
> 
>> More seriously, XeTeX isn't a drop-in replacement for TeX90/pdfTeX.]
> 
> I have yet to find a legacy document which behaves differently (legacy
> Plain TeX, that is, not legacy LaTeX); if you can point me at one, I
> should be interested to experience the differences for myself.
> 
> ** Phil.

See for example details in
http://tex.stackexchange.com/questions/86/what-are-the-incompatibilities-of-pdftex-xetex-and-luatex
for places where there are edge cases. The most obvious would be that
XeTeX requires the xdvipdfmx back-end (so differences at the \special
level), but a simple piece of code

\def\"{0}\expandafter\def\csname^^^^^00022\endcsname{1}
\ifnum\"=0 \message{tex82}\else\message{newstuff}\fi

(ConTeXt wiki) gives different results with TeX90 and XeTeX due to
different treatment of more than two ^^ (catcode 7) in a row.
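
Spelling out why the two engines disagree (a reading of the tokenisation
rules for the five-caret form ^^^^^00022, not something traced through the
sources):

  % TeX90 only knows ^^X and ^^xx, so ^^^^^00022 is read as
  %   ^^^   -> the character with code "1E  ("5E - "40)
  %   ^^00  -> the character with code "00
  %   022   -> three ordinary digits
  % which names some other control sequence and leaves \" as 0.
  %
  % XeTeX sees five carets followed by five hex digits and reads the
  % single character U+0022, so \" itself is redefined to 1.
  \def\"{0}\expandafter\def\csname^^^^^00022\endcsname{1}
  \message{\ifnum\"=0 8-bit reading\else extended reading\fi}
  \bye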
--
Joseph Wright




Re: [XeTeX] XeTeX maintenance

2015-04-26 Thread Joseph Wright

On 26/04/2015 11:47, Philip Taylor wrote:
> To my mind, XeTeX /is/ the future of TeX.  The days of entering
> "français" as "fran\c cais" are surely numbered, and it has never been
> possible to enter "العربية", "ελληνικά" or "עברית" (etc) in an analogous
> way.  Therefore, is it not time to petition the TUG Board to adopt XeTeX
> as a formal TUG project, and to allocate adequate funding to ensure not
> only its continued existence but its continued development, at least
> until such time as a clearly superior alternative not only emerges but
> becomes adopted as the /de facto/ replacement for TeX ?
> 
> Philip Taylor

The problem as always is not so much money as people. [Also, you do know
about LuaTeX, yes? ;-) More seriously, XeTeX isn't a drop-in replacement
for TeX90/pdfTeX.]
--
Joseph Wright





Re: [XeTeX] XeTeX maintenance

2015-04-25 Thread Joseph Wright

On 25/04/2015 18:31, Khaled Hosny wrote:
> Due to lack of time, skills and motivation, I’ll be no longer able to
> work on XeTeX, so I’m stepping down as a maintainer.
> 
> Regards,
> Khaled

Hello Khaled,

Thanks for all of your work on XeTeX: it has been very much appreciated.

Joseph





Re: [XeTeX] printing of characters above "FFFF with \string \meaning (and potentially \Uchar)

2015-04-23 Thread Joseph Wright

On 23/04/2015 14:07, David Carlisle wrote:
> Last year I asked about the possibility of adding \Uchar copied from luatex.
> 
> http://tug.org/pipermail/xetex/2014-May/025260.html
> 
> Bruno suggested a possible implementation, and I finally got round to
> trying that
> adjusted for the sources as in the texlive 2015 pretest tree (diff attached)
> 
> This seems to work fine for characters below "FFFF
> but fails for non BMP characters above that.
> 
> See the attached xetexuchar.tex file and the log produced by
> luatex and (patched) xetex.
> 
> It just uses the same print_char routine as \string so I thought I'd test
> that.
> See the file nonbmp.tex (which can be used with a non-patched xetex)
> 
> As can be seen with the attached logs this works with luatex with
> \string on U+1D538 producing a single character, but with xetex it produces
> two (presumably the UTF-16 surrogate pair, although I didn't check that).
> 
> Is my reading of this file correct and \string and \meaning are turning
> U+1D538  into two characters, and if so does anyone have a suggestion
> of the best place this should be attacked in the source?
> 
> 
> David

Obviously the non-BMP issue needs to be tackled, but I wonder if \Uchar
could be added in any case. It would bring functionality in this area
closer to LuaTeX and presumably the high chars business can be viewed as
a separate issue.
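
For reference, the arithmetic behind the surrogate-pair guess, as a sketch
using e-TeX's \numexpr (the macro names are made up for the illustration):

  % UTF-16 surrogate pair for U+1D538, done with \numexpr
  \edef\offset{\number\numexpr "1D538 - "10000\relax}
  % \numexpr division rounds, so bias the numerator to get a floor
  \edef\high{\number\numexpr "D800 + (2*\offset - 1023)/2048\relax}
  \edef\low{\number\numexpr "DC00 + \offset - (\high - "D800)*1024\relax}
  \message{high=\high, low=\low}% should be 55349 and 56632, i.e. "D835 and "DD38
  \bye

Two characters with those codes in the log would be consistent with
\string printing the UTF-16 halves rather than the code point itself.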
--
Joseph Wright




Re: [XeTeX] additional beginL endL nodes in math

2015-04-15 Thread Joseph Wright

On 16/04/2015 01:21, Vafa Khalighi wrote:
> And since when
> LaTeX has a package/ or test files that uses \beginL \endL?

The kernel doesn't use any directional features. However, it does have a
large set of tests, some of which cover math mode: any change to the
\showoutput result will show up. To date, pdfTeX and XeTeX have given
identical output for the same (7-bit) input if specials have not been
involved (LuaTeX does not as it uses Omega-like directional data).
Having TeX-XeT active changes this. As David has already said, we will
find a way to work with this, probably duplicating the test file output
for pdfTeX/XeTeX but perhaps normalising (if I can think of a way to do
it!).
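
A quick way to look at the nodes in question, as a sketch (no output is
listed here, since it depends on the engine and version):

  \TeXXeTstate=1
  \tracingonline=1 \showboxbreadth=100 \showboxdepth=100
  \setbox0=\hbox{left $a+b$ right}
  \showbox0  % compare the node list from pdfTeX and XeTeX
  \bye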
--
Joseph Wright



Re: [XeTeX] Bug with color \specials?

2014-06-24 Thread Joseph Wright

On 24/06/2014 14:18, Ulrike Fischer wrote:
> Am Tue, 24 Jun 2014 21:53:26 +0900 schrieb Akira Kakuto:
> 
>> Thus problems are in pgfsys-dvipdfmx.def
>> and pgfsys-xetex.def.
> 
> I can avoid the problem by redefining this two commands of a current
> pgfsys-dvipdfmx.def:

[snip]

> But the binary (x)dvipdfmx must be involved somehow too: Even if I
> use all the files from texlive2014 with miktex I don't get the error
> there. 

Which files are we talking about here: just pgf-related ones or also
xetex.def/dvipdfmx.def/dvipdfmx.cfg/...?
-- 
Joseph Wright




Re: [XeTeX] XeTeX/xdvipdfmx or the driver bug with eps images

2014-05-28 Thread Joseph Wright

On 28/05/2014 16:14, Akira Kakuto wrote:
> Dear Vafa Karen-Pahlav
> 
>> w.eps is taken from LaTeX graphics companion examples;
>> therefore I do not think there is anything wrong with the image itself.
>>
>> What is wrong?
> 
> It is sufficient to change the header of w.eps
> from
> %!PS-Adobe-2.0
> to
> %!PS-Adobe-2.0 EPSF-2.0
> in order to tell Ghostscript that w.eps is an
> eps file.
> 
> Please try, then you will obtain an expected pdf.
> 
> Thanks,
> Akira

All true, but both latex + dvips and  pdflatex produce the expected
output, as do latex + dvipdfmx or xelatex with the older driver set up.
-- 
Joseph Wright




Re: [XeTeX] XeTeX/xdvipdfmx or the driver bug with eps images

2014-05-28 Thread Joseph Wright

On 28/05/2014 05:24, Vafa Karen-Pahlav wrote:
> Hi list
> 
> It seems that xelatex has problems including eps images; please see the
> attached example and the provided eps image.
> 
> latex+dvips+pstopdf: image is included inside the \fbox
> 
> xetex recent versions: image is outside of \fbox
> 
> xetex old versions: (the one coded by Jonathan Kew): ok, image is included
> inside \fbox.
> 
> w.eps is taken from LaTeX graphics companion examples; therefore I do not
> think there is anything wrong with the image itself.
> 
> What is wrong?
> 
> I also have experienced some strange problems with recent versions of
> xetex; include an image in a document on Windows and the result is
> perfectly fine but you try to compile the same document on a different
> operating system, then images are placed strangely (i.e. the image width
> exceeds the textwidth and is placed on the right or left hand side). This
> issue is very annoying and existed for few years now. I try to send some
> minimal example for this later today.

Initial analysis: http://tex.stackexchange.com/questions/180766. It
looks to me like this is not XeTeX-specific but is linked to a change in
(x)dvipdfmx and related config files:
http://tug.org/svn/texlive?view=revision&revision=30175 (versions of
xetex.def before this do not give the issue).

As noted by Herbert Voss, w.eps is really a PS file. However, it's
treated differently by dvips than (x)dvipdfmx, and that's not expected!
-- 
Joseph Wright




Re: [XeTeX] Extended ^^ notation and \scantokens

2014-05-05 Thread Joseph Wright

On 05/05/2014 10:55, Qing Lee wrote:
> It seems to be related to the following tickets:
> 
> http://sourceforge.net/p/xetex/bugs/79/
> http://sourceforge.net/p/xetex/bugs/80/
> http://sourceforge.net/p/xetex/bugs/88/
> 
> Qing Lee

OK, so issue(s) are known: I'd been trying to get a short demo together
for that last one!

Guess for the moment I'll just have to put up!
-- 
Joseph Wright




Re: [XeTeX] ^^J in the plain XeTeX format

2014-05-05 Thread Joseph Wright

On 05/05/2014 08:05, Joseph Wright wrote:
> Hello all,
> 
> Doing some experiments on writing to the log, I find that the XeTeX
> format shows different behaviour from other formats with respect to ^^J.
> Trying
> 
> \immediate\write-1{Hello^^Jworld}
> \bye
> 
> with pdfTeX or LuaTeX gives two lines in the log
> 
> Hello
> world
> 
> but with XeTeX gives
> 
> Hello^^Jworld
> 
> LaTeX and ConTeXt (MkII) show identical behaviour for XeTeX and pdfTeX
> (and LuaTeX in the LaTeX case).
> 
> Anyone know why this is, and if it's deliberate?

Further to this, I note that the plain format doesn't set \newlinechar
^^J, so Knuth's TeX gives the same behaviour as XeTeX. Arguably
therefore an issue with pdfTeX/LuaTeX: will raise in appropriate place(s).
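
For anyone who wants the two-line result whatever the engine, a minimal
sketch is to set \newlinechar explicitly, as the LaTeX format does:

  \newlinechar=`\^^J
  \immediate\write-1{Hello^^Jworld}
  \bye

With \newlinechar set, the line break is made by \write itself rather than
depending on how the engine prints character 10.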
-- 
Joseph Wright




[XeTeX] Extended ^^ notation and \scantokens

2014-05-05 Thread Joseph Wright

Hello all,

Experimenting with \scantokens for generating characters from the
charcodes, the following issue comes up in XeTeX. For the test file

\show ^^^^^10400
\show ^^^^^^010400
\def\gobble#1{}
\showtokens\expandafter{%
  \romannumeral-`\q\expandafter\expandafter\expandafter\gobble
\expandafter\string\csname
\scantokens{^^^^^10400\noexpand}\endcsname%
}
\showtokens\expandafter{%
  \romannumeral-`\q\expandafter\expandafter\expandafter\gobble
\expandafter\string\csname
\scantokens{^^^^^^010400\noexpand}\endcsname%
}

the \show statements work fine but the \scantokens versions don't. This
is not limited to the rather odd setup above (used so \showtokens is
applicable): with \everyeof{\noexpand} and a suitable set of \write
statements you see the same in a temporary file.
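
A shorter variant of the \everyeof route, using \edef rather than \write
(a sketch only; \x is just a scratch macro and the five-caret form of
U+10400 is assumed):

  \begingroup
    \everyeof{\noexpand}% avoids a "File ended" error inside the \edef
    \endlinechar=-1 % no trailing space from the pseudo-file's line end
    \edef\x{\scantokens{^^^^^10400}}%
    \show\x
  \endgroup
  \bye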

It seems that something is up once you get to five hexadecimal digits:
bug in XeTeX?
--
Joseph Wright




[XeTeX] ^^J in the plain XeTeX format

2014-05-05 Thread Joseph Wright

Hello all,

Doing some experiments on writing to the log, I find that the XeTeX
format shows different behaviour from other formats with respect to ^^J.
Trying

\immediate\write-1{Hello^^Jworld}
\bye

with pdfTeX or LuaTeX gives two lines in the log

Hello
world

but with XeTeX gives

Hello^^Jworld

LaTeX and ConTeXt (MkII) show identical behaviour for XeTeX and pdfTeX
(and LuaTeX in the LaTeX case).

Anyone know why this is, and if it's deliberate?
-- 
Joseph Wright




Re: [XeTeX] The arcs package

2013-08-25 Thread Joseph Wright

On 25/08/2013 16:00, Joseph Wright wrote:
> Nothing to do with XeTeX: it's due to relsize and shows up with a demo
> for pdfTeX. Inside arcs.sty you find
> 
> \let \rs@size@warning = \@gobbletwo
> \relsize{-10}%
> 
> but in the latest relsize (dated 2013-03-29) \rs@size@warning takes
> three arguments. Thus arcs needs adjusting (probably should just leave
> the relsize code alone): I've CC'd the arcs author.

Message to the arcs author bounced. I'll raise this on c.t.t.: the
package is LPPL so a change can be sorted out if he can't be found.
-- 
Joseph Wright



Re: [XeTeX] The arcs package

2013-08-25 Thread Joseph Wright

On 25/08/2013 15:42, Arash Zeini wrote:
> Hello,
> 
> Since the upgrade to TeX Live 2013, the arcs package behaves strangely. It
> draws the desired arc under the respective characters, but a string like
> "5.0pt" will always precede the characters with the arc. Has anyone else
> noticed this problem?
> 
> I have tried this MWE with two different fonts on two computers running
> Debian unstable and a "vanilla" TL 2013:
> 
> \documentclass[a4paper,12pt]{article}
> 
> \usepackage{xltxtra}
> 
> \setromanfont[Mapping=tex-text]{Junicode}
> \usepackage{arcs}
> 
> \begin{document}
> An underarc: \underarc{ab}. And now an overarc: \overarc{ab}.
> 
> \end{document}
> 
> Best wishes,
> Arash

Nothing to do with XeTeX: it's due to relsize and shows up with a demo
for pdfTeX. Inside arcs.sty you find

\let \rs@size@warning = \@gobbletwo
\relsize{-10}%

but in the latest relsize (dated 2013-03-29) \rs@size@warning takes
three arguments. Thus arcs needs adjusting (probably should just leave
the relsize code alone): I've CC'd the arcs author.
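
To spell out the mismatch (a sketch only: \arcs@gobblethree is a made-up
name, and how arcs should best be patched has not been checked):

  \makeatletter
  % \@gobbletwo swallows two arguments, so a third one handed to
  % \rs@size@warning is left behind in the input and presumably typeset
  \long\def\arcs@gobblethree#1#2#3{}% hypothetical three-argument gobbler
  \let\rs@size@warning=\arcs@gobblethree
  \makeatother

That would fit the stray "5.0pt" in the report, though not touching the
relsize internals at all is probably the cleaner fix.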
-- 
Joseph Wright



Re: [XeTeX] Error of xeCJK and fonspec (Bokov Gleb - First Message)

2013-05-31 Thread Joseph Wright

On 31/05/2013 13:52, Peter Dyballa wrote:
> 
> Am 31.05.2013 um 14:10 schrieb Bruno Le Floch:
> 
>> This is caused by fontspec, I'd say, due to somewhat recent changes in
>> the expl3 supporting package: \c_keys_code_root_tl was renamed
>> \c__keys_code_root_tl at some point, to reflect its internal nature,
>> and fontspec should not be using it.
> 
> Fontspec [2013/03/16 v2.3a Font selection for XeLaTeX and LuaLaTeX] requires 
> {expl3}[2011/09/05] and seems to work alright for me with expl3.sty
> 2013/03/14 v4469. These files don't make use of \c__keys_code_root_tl or 
> \c_keys_code_root_tl, on my system it's only l3keys.sty2013/02/24 v4461, 
> that has \c__keys_code_root_tl. A final update might be useful… (since TeX 
> Live has become stable a month ago)

All correct: there was a change, it was a while ago and provided you
have up-to-date fontspec and expl3 packages everything should work.
-- 
Joseph Wright



Re: [XeTeX] First message: Xelatex, pstricks and Mountain Lion

2013-01-14 Thread Joseph Wright

On 14/01/2013 15:10, François Boone wrote:
> Hi,
> 
> I am on Macbook pro, 2012, with Mountain Lion, 10.8.2.
> My texlive 2012 is up to date : i update it with Tex Live Utility.
> 
> I have a problem:
> This is my document:
> 
> \listfiles
> \documentclass{minimal}
> \usepackage{pstricks}
> \begin{document}
> \begin{pspicture}(0,0)(10cm,2cm)
> \psline[linewidth=2pt,linecolor=red](0,0)(10,2)
> \end{pspicture}
> \end{document}
> 
> When I xelatex this simple example, I obtain a one blank page pdf file.
> 
> In console, I have this message:
> (./ecm.aux) [1] (./ecm.aux)gs requires X11.  Please visit 
> http://support.apple.com/kb/HT5293 for more information.
> 
> ** WARNING ** Filtering file via command -->rungs -q -dNOPAUSE -dBATCH 
> -dNOSAFER -sPAPERSIZE=a0 -sDEVICE=pdfwrite -dCompatibilityLevel=1.5 
> -dAutoFilterGrayImages=false -dGrayImageFilter=/FlateEncode 
> -dAutoFilterColorImages=false -dColorImageFilter=/FlateEncode 
> -sOutputFile='/var/folders/tx/jz3ygk150jldyb_xjg26tbr4gn/T//dvipdfmx.yQDiuc2o'
>  '/var/folders/tx/jz3ygk150jldyb_xjg26tbr4gn/T//dvipdfmx.Y8pRsQwx' -c 
> quit<-- failed.
> ** WARNING ** Image format conversion for PSTricks failed.
> ** WARNING ** Interpreting special command pst: (ps:) failed.
> ** WARNING ** >> at page="1" position="(91.9253, 663.307)" (in PDF)
> ** WARNING ** >> xxx "pst:  tx@Dict begin STP newpath /ArrowA { moveto } def 
> /ArrowB "
> ** WARNING ** 5 memory objects still allocated
> You may want to report this to te...@tug.org
> 
> I had a long talk with Herbert Voss this last WE and we don't find a 
> solution. Herbert works on PC, so it is difficult to find the problem on Mac.
> If someone can help me...
> 
> Thank-you
> François

Have you tried installing XQuartz? (Apple no longer include X11 with ML,
so you have to get your own from the community spin-out.)
-- 
Joseph Wright



Re: [XeTeX] bidi.sty for plain XeTeX

2012-12-22 Thread Joseph Wright

On 21/12/2012 22:24, John Was wrote:
> I'm not an Arabist but have occasionally had to typeset articles in
> plain XeTex using Arabic, and all I have in my file header is:
> 
> \TeXXeTstate=1 % this turns e-TeX's bidi functionality on
> \def\intextarab#1{{\arabic {\beginR #1\endR}}}
> 
> I define \arabic as a call to my Arabic font (the definition of \arabic
> changes according to whether  I'm in main text, footnote text, or
> extract text).  To achieve Arabic I just give \intextarab{ARABIC TEXT
> HERE}. That works fine for bits of Arabic embedded in English (or other
> left-to-right) text in the same paragraph.  For separate Arabic
> paragraphs you really just need
> \beginR
> 
> and at the end
> 
> \endR
> 
> 
> There are no doubt slicker ways of doing things, but that gave me good
> output first time round so I stuck with it!
> 
> 
> John

For an entire document, you need to worry about \everypar and
\parindent, for example

  \TeXXeTstate = 1 %
  \newbox\indentbox
  \everypar{%
    \setbox\indentbox=\lastbox
    \beginR
    \box\indentbox
  }

as \beginR is an hmode command.
-- 
Joseph Wright


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex

Re: [XeTeX] bidi.sty for plain XeTeX

2012-12-22 Thread Joseph Wright

On 22/12/2012 08:48, Philip TAYLOR wrote:
> 
> 
> heer wrote:
>> Joseph,
>>
>> I have here Vafa Khalighi's documentation for his bidi package. It has a 
>> section on using his bidi package with plain TeX.  He says the bidi package 
>> is loaded with the command \input bidi, but that command doesn't work (at 
>> least with my 2009 version of TeX Live) because xetex can't find the bidi 
>> file.  If I use the command \input bidi.sty xetex finds the bidi.sty file 
>> for xelatex not xetex. That's why I was wondering whether there wasn't 
>> another bidi.sty file available for plain xetex.
>> Perhaps there are other bidi packages available that I do not know of.
> 
> Probably this one, Nicholas :
> 
> d:/TeX/Live/2011/texmf-dist/tex/latex/bidi/bidi.tex
> 
> Philip Taylor

Indeed,

  \input bidi
  Some text
  \bye

works on my system using XeTeX.
-- 
Joseph Wright



Re: [XeTeX] bidi.sty for plain XeTeX

2012-12-21 Thread Joseph Wright

On 21/12/2012 21:52, heer wrote:
> 
> Is there a bidi.sty file for plain XeTeX or only for XeLateX? I'd
> like to be able to use Arabic script in plain XeTeX.
> 
> Nicholas

The bidi docs include instructions for using it with plain XeTeX
(although I'd imagine plain users would tend to 'roll their own').
-- 
Joseph Wright



Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-05 Thread Joseph Wright

On 05/12/2012 12:35, Zdenek Wagner wrote:
> The book is written in Czech but contains words and sometimes even
> sentences in Hindi and Urdu. The lines are often broken within the
> Urdu text.

So mainly constructs of the form

  blah blah blah \beginR halb halb halb\endR blah blah blah

or similar?

> Since the book contains more than 300 pictures, almost all
> paragraphs require \parshape. 

Paragraph shape is a somewhat separate issue :-) (In the sense that I
have other worries that come up there without bringing in RTL!)
-- 
Joseph Wright



Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-05 Thread Joseph Wright

On 05/12/2012 12:14, Jonathan Kew wrote:
> On 5/12/12 11:56, Joseph Wright wrote:
> 
>> Right, so some more thinking required here. The question is what is
>> sensible for new content: from what you say about TeX--XeT and bidi,
>> using pdfTeX/XeTeX is not currently to be recommended for RTL work
> 
> IMO, this is an overly general and somewhat misleading "blanket
> statement". The TeX--XeT model has its limitations, certainly, but there
> are plenty of RTL documents for which it is perfectly adequate. Not
> every document requires mixed-direction math, or \special-based colour
> and hyperlinks wrapping across lines.
> 
> Back when I was doing typesetting on behalf of authors and publishers in
> Pakistan and elsewhere, we used (TeX--XeT-based) XeTeX with considerable
> success and without feeling at all constrained by its bidi shortcomings.
> But then, we were producing "traditional" printed books full of text,
> not colourful, hyperlinked PDFs full of math.
> 
> JK
Hello Jonathan,

It was not my intention to criticise XeTeX: I'm trying to get a feel for
the issues in RTL work and to understand what people who use the TeX
tools in this area find works (or does not). I'm also trying to work out
how the different engines work when you abstract to the package layer,
i.e. to what extent you can create macros which hide the complexities or
which do not enforce particular engine requirements. All of the input is
very useful.
--
Joseph Wright



Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-05 Thread Joseph Wright

On 05/12/2012 08:54, Vafa Khalighi wrote:
> 
>>
>> At the moment, I'm looking specifically at what we need to worry about
>> at a low level. For example, the current expl3 code does not take any
>> notice of direction, which is probably right for something like \hbox:n
>> (follow whatever is going on around it), but should be documented and
>> deliberate, not just something we've ignored. So what's important at
>> this stage is much more the concepts than trying to write any code,
>> although any thoughts on what is required for RTL support at the 'base
>> level' are of course welcome.
> 
> 
> For the boxes in luatex you can change directions: \hbox dir TRT{...}

I was thinking more at the level of something like \hboxR/\hboxL as
defined by bidi, plus perhaps some form of test similar to \ifmmode.
Then again, I have no idea what is needed beyond certain small contexts
(for example ensuring LTR for units).
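
A rough sketch of the 'small context' case, assuming the TeX--XeT
primitives are in use. The name \ensureLTR is hypothetical: it is not
part of bidi or expl3, just an illustration of forcing LTR for a
number with a unit inside RTL material.

```latex
\documentclass{article}
\TeXXeTstate=1 % enable \beginL, \endL, \beginR, \endR
% Hypothetical helper: typeset the argument left-to-right whatever the
% surrounding direction. \leavevmode is needed because the primitives
% are only valid in horizontal mode.
\newcommand*\ensureLTR[1]{\leavevmode\beginL #1\endL}
\begin{document}
\beginR txet LTR \ensureLTR{10\,km} txet erom\endR
\end{document}
```

This says nothing about the harder cases (boxes, \parshape, specials),
which is where the real design questions are.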

>> For pdfTeX that's not an issue: I doubt very many people use pdfTeX for
>> RTL. 
> 
> Well, there are two groups of people. The first group use ArabTeX, which does
> not make any use of TeX--XeT and works with Knuth's TeX too. The second group
> are Hebrew and Arabic users, some of whom still use babel.
> 
>> XeTeX is a bit more 'interesting': I guess the existence of bidi
>> means that people are using XeTeX for 'real life' RTL work, despite
>> limitations.
> 
> Considering bidi has improved the situations and made things cleaner and 
> simpler, yes.

Right, so some more thinking required here. The question is what is
sensible for new content: from what you say about TeX--XeT and bidi,
using pdfTeX/XeTeX is not currently to be recommended for RTL work
(although I guess this would change if XeTeX switches to the Omega
approach).
-- 
Joseph Wright



Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-04 Thread Joseph Wright

On 05/12/2012 00:53, Vafa Khalighi wrote:
>> The point about specials is one I
>> guess I'll look at by reading the bidi code and doing some tests.
> 
> The bidi package does not patch \special. It only makes changes in packages
> like color, xcolor and hyperref, and this makes them work only in very
> limited cases. For example, if you use the color package in RTL with the bidi
> package, color works correctly only if your colored text stays on one line.

OK, does sound a bit limited.

>> As you might guess, my interest here stems from some LaTeX3
>> discussions, and one issue I'm trying to understand is whether the
>> TeX--XeT approach is really one that is sensible to try to support,
>> given the fact that the Omega approach exists. 
> 
> 
> My advice: do not waste time on TeX--XeT; it's useless. I have spent four
> years developing the bidi package using TeX--XeT and I can tell you that it
> has many bugs/limitations.

At the moment, I'm looking specifically at what we need to worry about
at a low level. For example, the current expl3 code does not take any
notice of direction, which is probably right for something like \hbox:n
(follow whatever is going on around it), but should be documented and
deliberate, not just something we've ignored. So what's important at
this stage is much more the concepts than trying to write any code,
although any thoughts on what is required for RTL support at the 'base
level' are of course welcome.

What you say fits in with what I'd already suspected: for RTL work,
we'd be better only supporting one set of primitives, the Omega ones.
For pdfTeX that's not an issue: I doubt very many people use pdfTeX for
RTL. XeTeX is a bit more 'interesting': I guess the existence of bidi
means that people are using XeTeX for 'real life' RTL work, despite
limitations.
-- 
Joseph Wright



Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-04 Thread Joseph Wright

On 05/12/2012 00:45, Vafa Khalighi wrote:
> Yes, the limitations of TeX--XeT are:
> 
>   * Only four primitives (\beginR, \endR, \beginL, \endL) are provided, which
>     makes typesetting RTL documents very hard and complicated.
>   * The primitives above only work in horizontal mode.
>   * There is no way to typeset an RTL tabular; the only approach is to put
>     the tabular inside an RTL box, which itself introduces lots of problems.
>   * \special does not work properly in RTL mode.
>   * There is no way to change the direction of boxes, and even if you do so
>     by a trick, the order of the TOC or anything that has to do with \write
>     at shipout time comes out wrong.
>   * Left/right skips do not get reversed in RTL, so you have to swap them,
>     and this is not always the case, e.g. \vbox inside \hbox.
>   * \parshape is not reversed in RTL mode, so you have to do some macro
>     programming, and this is not always the case, e.g. \vbox inside \hbox.
>   * There is no tool for controlling the equation number; only
>     \predisplaydirection is provided, and it is buggy in RTL.
>   * 

Very useful list :-) (I knew some of these, but it's nice to have them
collected up.)
-- 
Joseph Wright



Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-04 Thread Joseph Wright

On 04/12/2012 15:55, Khaled Hosny wrote:
> IMO TeX--XeT in all its incarnations has always been a hack to get RTL
> with as little modification as possible. Two main limitations I'm
> concerned about are the broken handling of specials (you can not get
> coloring or hyperlinks in RTL text without macro hacks with limited
> functionality) and lack of RTL math. On the other hand Omega's
> directionality code is more sophisticated and requires less adaptations
> at macro side (check the size of bidi package on how things can go wild
> with TeX--XeT), and it should allow for proper vertical typesetting in
> XeTeX as bonus.

I'd picked up the math mode point (although I'm not 100% sure on how
this works out with numerals, which don't always seem to be reversed
compared to the Latin script order). The point about specials is one I
guess I'll look at by reading the bidi code and doing some tests.

As you might guess, my interest here stems from some LaTeX3
discussions, and one issue I'm trying to understand is whether the
TeX--XeT approach is really one that is sensible to try to support,
given the fact that the Omega approach exists. Certainly if you move
XeTeX from TeX--XeT to the Omega approach (with the LuaTeX fixes, I
guess), then this will be an easier thing to think about!

>> A slightly wider question which this leads me to: do I take it that
>> getting some (minor) additions to XeTeX might be possible? There was
>> some discussion last week about a few pdfTeX primitives that might be
>> useful.
> 
> Patches are always welcomed of course :) Right now work is mainly on
> layout side (replacing the abandoned ICU LayoutEngine with HarfBuzz,
> there is even a chance of replacing ATS/ATSU with Core Text, and old
> SilGraphite with Graphite2 engine) and further polishing of OpenType
> math.

Jonathan K. will tell you that my ability to write XeTeX patches is very
limited :-) I was thinking much less complicated than layout engines
(which all sounds very good!). I'll see what makes sense and perhaps be
in contact again.
--
Joseph Wright


Re: [XeTeX] [luatex] Info on direction primitives/implementation

2012-12-04 Thread Joseph Wright

[Moved from the LuaTeX list as this moves me to XeTeX!]

On 04/12/2012 14:33, Khaled Hosny wrote:
> On Tue, Dec 04, 2012 at 01:52:19PM +0000, Joseph Wright wrote:
>> Very useful, thanks. Point about detail understood: at the moment I'm
>> trying to get my head around the entire area, and to see how the LuaTeX
>> version contrasts with the pdfTeX/XeTeX approach.
> 
> In the last few days I have been fancying the idea of scrapping TeX--XeT
> from XeTeX and replacing it with code from Aleph (with LuaTeX
> modifications backported to it), but no work has been done so far (and
> probably never will, I always underestimate how hard things are).
> 
> Regards,
> Khaled

Hello Khaled,

That suggests that TeX--XeT is not providing the tools required to do a
decent job in XeTeX: is that a fair reading? (I'd guess that there were
reasons for the Omega/Aleph/LuaTeX move from TeX--XeT to other
approaches, but as a non-expert was not sure how to read it.)

A slightly wider question which this leads me to: do I take it that
getting some (minor) additions to XeTeX might be possible? There was
some discussion last week about a few pdfTeX primitives that might be
useful.
-- 
Joseph Wright



Re: [XeTeX] troubles with \setcounter in Tex 2012

2012-12-03 Thread Joseph Wright

On 02/12/2012 18:31, Andrey Klebanov wrote:
> Dear all, 
> 
> since my very recent update to the new TeX distribution (TeX 2012) I'm facing
> a problem with setting counter values using arithmetic expressions such as 2-1.
> A minimal example looks like this:
> 
> \documentclass{memoir}
> \begin{document}
> \newcounter{b}
> \setcounter{b}{2-1}
> \theb
> \end{document}
> 
> which in my case sets \value{b} to 2 and prints -1 (see the attached
> file), instead of setting \value{b} to 1. (I need it in order to set up a
> connection between two counters, like \setcounter{b}{\thea-2} etc.)
> I would be extremely thankful for any support, since this issue completely 
> destroys my workflow.
> 
> thanks in advance 
> Andrey

You need to load the calc package: perhaps memoir loaded it in earlier versions?
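
A minimal sketch of what I mean, based on your example. The second
counter a and its value are only there for illustration:

```latex
\documentclass{memoir}
% calc redefines \setcounter (and related commands) so that the value
% argument may be an arithmetic expression.
\usepackage{calc}
\begin{document}
\newcounter{a}
\newcounter{b}
\setcounter{a}{3}
% \value{a} gives the numeric value directly, which is safer than \thea
% if the printed form of the counter is ever changed.
\setcounter{b}{\value{a}-2}
\theb % typesets 1
\end{document}
```

Without calc, the kernel \setcounter reads only the leading number, and
the rest of the expression is left over to be typeset, which matches
what you are seeing.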
-- 
Joseph Wright



Re: [XeTeX] xelatex,polyglossia,biblatex conflict

2012-08-24 Thread Joseph Wright

On 24/08/2012 12:37, Haines Brown wrote:
> I'm migrating from LaTeX to XeLaTeX with TL 2011 on a Linux system.
> 
>   \documentclass[12pt]{article}
>   \usepackage{xltxtra}
>   \usepackage[backend=biber,style=authoryear,sorting=nyt]{biblatex} 
>   \usepackage[style=authoryear,sorting=nyt]{biblatex} 
>   \usepackage{xunicode}
>   \usepackage{fontspec}
>   \usepackage{csquotes}
>   %  \usepackage{polyglossia}
>   \addbibresource{bib}
>   ...
> 
> I would like to use the polyglossia package, but if I add it, I get 
> an Undefined control sequence error.
> 
> Haines Brown

Without a bit more detail it's hard to say, but I would strongly suggest
using TL2012. Development of both biblatex and polyglossia is active,
and so issues can and do get fixed over time.
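
If the problem is still there after updating, a cut-down test file is
the best way to narrow it down. Something along these lines is what I
would try first. This is only a sketch: the load order is the one
usually suggested rather than a confirmed fix for your error, and
refs.bib is a hypothetical file name standing in for your own database.

```latex
\documentclass[12pt]{article}
\usepackage{fontspec}
% Load polyglossia before csquotes and biblatex so that they can pick
% up the document language.
\usepackage{polyglossia}
\setdefaultlanguage{english}
\usepackage{csquotes}
\usepackage[backend=biber,style=authoryear,sorting=nyt]{biblatex}
\addbibresource{refs.bib} % hypothetical file name
\begin{document}
\nocite{*}
\printbibliography
\end{document}
```

If that works, add your other packages back one at a time until the
Undefined control sequence error returns.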
-- 
Joseph Wright



Re: [XeTeX] Announcing: LuaTeX support in Polyglossia

2012-08-22 Thread Joseph Wright

On 22/08/2012 15:05, Zdenek Wagner wrote:
> Hi Arthur,
> 
> I found some time to test your new Polyglossia with lualatex. However,
> the URL does not work now, it worked a few days ago but I had no time
> to download it. Any help?

There's a letter missing: try https://github.com/reutenauer/polyglossia
-- 
Joseph Wright



Re: [XeTeX] XeLaTeX and SIunitx

2012-06-12 Thread Joseph Wright

On 12/06/2012 15:10, Philip TAYLOR wrote:
> 
> Tobias Schoel wrote:
>> What does normalise mean with angstrom and ohm?
> 
> Perhaps as per
> http://en.wikipedia.org/wiki/Unicode_equivalence#Normalization
> Philip Taylor

Indeed: normalisation is a way of dealing with differences in logical
meaning where the symbols used are identical. For siunitx, I have to
balance meaning with the likelihood of the symbol appearing in the
output at all. Using the normalised characters means that you have
the best chance of getting the visually correct output, while still
being able to search using the UTF-8 characters correctly.
--
Joseph Wright



Re: [XeTeX] XeLaTeX and SIunitx

2012-06-11 Thread Joseph Wright

On 11/06/2012 21:57, Joseph Wright wrote:
> Hello all,
> 
> Taking a look back over the code, I already have some auto-detection in
> for picking up UTF-8 symbols when the correct engine is in use.
> 
> I've revised this a bit for the next release (v2.5d, on CTAN tomorrow),
> so that all of the 'problematic' symbols are covered in what seems to be
> the best way possible. Nothing happens unless appropriate support
> (fontspec/unicode-math) is loaded. If it is, then you get the following
> symbols:
> 
>  - Ångström        u+00c5 (u+212b normalises here)
>  - Degree Celsius  u+00b0 + "C" (u+2103 is a compatibility character)
>  - Micro           u+00b5 (u+03bc is wrong)
>  - Ohm             u+03a9 (u+2126 normalises here)
> 
>  - Degree          u+00b0
>  - Arc minute      u+2032 (requires unicode-math)
>  - Arc second      u+2033 (requires unicode-math)

Further to that, if you look through the source of siunitx you may
wonder where this takes place as the symbols are not there. To avoid any
issues with pdfTeX, this is all done using \char and an appropriate
constant in each case.
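
The general idea can be sketched as follows. The macro names here are
hypothetical and are not the ones used inside siunitx, and the default
fontspec font is assumed to contain both glyphs:

```latex
\documentclass{article}
\usepackage{fontspec} % XeLaTeX or LuaLaTeX only
% The source file stays pure ASCII: each symbol is stored as a numeric
% constant and only turned into a character with \char at point of use.
% These \chardef lines must not be executed under pdfTeX, where values
% above 255 are rejected.
\chardef\myohmcode="03A9   % GREEK CAPITAL LETTER OMEGA
\chardef\mymicrocode="00B5 % MICRO SIGN
\newcommand*\myohm{\char\myohmcode}
\newcommand*\mymicro{\char\mymicrocode}
\begin{document}
10\,k\myohm{} and 5\,\mymicro m
\end{document}
```

In the package itself the definitions are only made once the engine
and font support have been detected, so pdfTeX never sees them.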
-- 
Joseph Wright



Re: [XeTeX] XeLaTeX and SIunitx

2012-06-11 Thread Joseph Wright

Hello all,

Taking a look back over the code, I already have some auto-detection in
for picking up UTF-8 symbols when the correct engine is in use.

I've revised this a bit for the next release (v2.5d, on CTAN tomorrow),
so that all of the 'problematic' symbols are covered in what seems to be
the best way possible. Nothing happens unless appropriate support
(fontspec/unicode-math) is loaded. If it is, then you get the following
symbols:

 - Ångström        u+00c5 (u+212b normalises here)
 - Degree Celsius  u+00b0 + "C" (u+2103 is a compatibility character)
 - Micro           u+00b5 (u+03bc is wrong)
 - Ohm             u+03a9 (u+2126 normalises here)

 - Degree          u+00b0
 - Arc minute      u+2032 (requires unicode-math)
 - Arc second      u+2033 (requires unicode-math)
-- 
Joseph Wright



Re: [XeTeX] "Minimalist" TeX?

2012-05-16 Thread Joseph Wright

On 16/05/2012 17:18, Khaled Hosny wrote:
>> TeX is more than one program, so it's not as simple as grabbing 'source
>> for the binary TeX' and compiling it.
>>
>> For a minimal compilable set up, maybe take a look at KerTeX:
>> http://www.kergis.com/en/kertex.html
> 
> Which does not include XeTeX :)

Oh yes, license and library issues: I forgot :-)
-- 
Joseph Wright



Re: [XeTeX] "Minimalist" TeX?

2012-05-16 Thread Joseph Wright

On 16/05/2012 05:38, C Y wrote:
> I have compiled xetex from the latest Git sources on sourceforge, and the 
> build appears to have been successful.
> 
> Does the sourceforge Git repo of xetex produce a working (albeit minimal) TeX 
> once compilation is complete?  (It didn't seem to in my quick test, but it's 
> quite possible I didn't do something right environment wise...)  If not, is 
> there documentation anywhere of what constitutes the minimal set of files 
> that will allow an average LaTeX document to be typeset?
> 
> My interest is in building a "Minimalist" subset of TeX in situations where a 
> system installation isn't present, but I've not had much luck locating 
> documentation describing what constitutes a minimal-yet-functional subset of 
> the TeX Live distribution.  Has anybody documented such a subset?
> 
> Thanks,
> CY

TeX is more than one program, so it's not as simple as grabbing 'source
for the binary TeX' and compiling it.

For a minimal compilable set up, maybe take a look at KerTeX:
http://www.kergis.com/en/kertex.html
-- 
Joseph Wright


