Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor


Akira Kakuto wrote:

> You can test the new experimental XeTeX on win32 by
> http://members2.jcom.home.ne.jp/wt1357ak/xetex-exp-w32.zip

/Domo arigato gozaimasu/, Akira-san.  Downloaded and installed, just
re-building formats before commencing testing.

Philip Taylor


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor


Akira Kakuto wrote:

> You can test the new experimental XeTeX on win32 by
> http://members2.jcom.home.ne.jp/wt1357ak/xetex-exp-w32.zip

Some interesting (and new, and unexpected) diagnostics, Akira-san; as
far as I can tell, no PDF was produced :

This is XeTeX, Version 3.14159265-2.6-0.3 (TeX Live 2016/W32TeX/dev)
(preloaded format=xetex)
 restricted \write18 enabled.
entering extended mode
(./Master.tex (./B5-Master.tex A5-Master processing is now complete
(./Captions.tex) (./Image-float.tex) (./Catalogue.tex
(./collating-sequence.tex
) (./XML-Elements-CW.tex XML-Elements-CW processing complete)
(./XML-Elements-MA.tex XML-Elements-MA processing complete)
(./XML-Elements-PT.tex XML-Elements-PT processing complete)
(./XML-Elements-Shared.tex XML-Elements-Shared processing complete)
(d:/TeX/Live/2014/texmf-dist/tex/eplain/eplain.tex)
(d:/TeX/Live/2014/texmf-dist/tex/plain/graphics-pln/miniltx.tex)
(d:/TeX/Live/2014/texmf-dist/tex/latex/graphics/color.sty
(d:/TeX/Live/2014/texmf-dist/tex/latex/latexconfig/color.cfg)
(d:/TeX/Live/2014/texmf-dist/tex/xelatex/xetex-def/xetex.def)))
(./Prelims.tex
(../Dynamic-content/Master.aux) [1] [2] [3] [4]
(../Dynamic-content/Master.toc
\tocentry {T A B L E\ \ \ O F\ \ \ C O N T E N T S}{\hlidxpage{}{5}}
(../Dynamic-content/TOC.aux About to process TOC.aux)))
(./Inline-images.tex)
(./XML-list.tex [5fwrite: Broken pipe
Unable to update the static FcBlanks: 0x0600
Unable to update the static FcBlanks: 0x0601
Unable to update the static FcBlanks: 0x0602
Unable to update the static FcBlanks: 0x0603
Unable to update the static FcBlanks: 0x06dd
Unable to update the static FcBlanks: 0x070f
Unable to update the static FcBlanks: 0x2028
Unable to update the static FcBlanks: 0x2029
Unable to update the static FcBlanks: 0xfff9
Unable to update the static FcBlanks: 0xfffa
Unable to update the static FcBlanks: 0xfffb
xetex.exe:




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor


Zdenek Wagner wrote:

> \maxdimen is a maximum dimension value, not a maximum number of \dimen
> registers.

Yes, I agree Zdeněk (I remembered that, even if I did not remember that
it is defined in Plain.TeX rather than a primitive !); I should have
suggested that the new environmental enquiry be called
\maxintercharclass, I suppose, to make its meaning more transparent.

Just as \maxdimen indicates that a <dimen> value may not exceed
\maxdimen, so \maxintercharclass would indicate that an <intercharclass>
value may not exceed \maxintercharclass.

** Phil.




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Joseph Wright
On 01/02/2016 09:00, Philip Taylor wrote:
> 
> 
> Akira Kakuto wrote:
> 
>> You can test the new experimental XeTeX on win32 by
>> http://members2.jcom.home.ne.jp/wt1357ak/xetex-exp-w32.zip
> 
> /Domo arigato gozaimasu/, Akira-san.  Downloaded and installed, just
> re-building formats before commencing testing.
> 
> Philip Taylor

Indeed: all working here, e.g.

\XeTeXcharclass6=16384 %

with

This is XeTeX, Version 3.14159265-2.6-0.3 (TeX Live 2016/W32TeX/dev)

Thanks Akira for this and the LuaTeX builds: very useful.

Joseph





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Joseph Wright
On 01/02/2016 09:37, Philip Taylor wrote:
> 
> 
> Akira Kakuto wrote:
> 
>>> Some interesting (and new, and unexpected) diagnostics, Akira-san; as
>>> far as I can tell, no PDF was produced :
>>
>> You have to replace "all" included binaries, by saving the old ones.
>> Note that size of xdvipdfmx.exe,  which is a wrapper of dvipdfmx.dll, 
>> is 1536 bytes.
> 
> Akira-san :  I did just that.  I copied the entire "...\bin\win32"
> directory to ...\bin\win32-old, then overwrote all files in
> ...\bin\win32 with the corresponding files from your ZIP file.
> 
> I will repeat the process just to ensure that I made no errors and then
> report back.
> 
> Philip Taylor

Works for me replacing xetex.exe, dvipdfmx.dll and adding icudt56.dll
(not present in stock TL2015). (System TL2015, updated this morning,
those files + luatex.dll from the W32TeX dev version.)

Joseph





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor


Zdenek Wagner wrote:

> Isn't it possible that you have 64bit binaries elsewhere and your system
> prefers them? Just guessing, I am not a Windows user.

I wish I had, Zdeněk, but unfortunately TeX for Windows does not yet
come in a 64-bit version as far as I know.

** Phil.




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor
Just generating a minimal ZIP file containing everything necessary to
reproduce the problem, Akira-san.  Only non-standard requirement will be
to set the output path for XeTeX to ../Dynamic-content

** Phil.

Akira Kakuto wrote:

> There may be a bug in the dvipdfmx.dll.
> Is it possible for me to study Master.pdf?




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Akira Kakuto

Dear Philip,

> Regret this will take some time to upload to Dropbox; it is circa 1Gb in size.


I may not be able to test 1Gb size file.
Please recover the old binaries. Sorry.

Best,
Akira





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Akira Kakuto

Dear Philip,

Please try
copy  xdvipdfmx.exe  extractbb.exe
in the bin/win32 dir.

Best,
Akira





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Akira Kakuto

Dear Philip,


> (./Master.tex (./B5-Master.tex A5-Master processing is now complete [1] ) )
> Error -1073741515 (driver return code) generating output;
> file ../Dynamic-content/Master.pdf may not be valid.
> SyncTeX written on ../Dynamic-content/Master.synctex.gz.
> Transcript written on ../Dynamic-content/Master.log.


There may be a bug in the dvipdfmx.dll.
Is it possible for me to study Master.pdf?

Best,
Akira





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor
OK, not a problem Akira-san; I have just realised that 99% of the
previous MN-WE were images that are not referenced when the compilation
aborts, so I am just re-building with the minimal two images necessary
to allow compilation to proceed to abortion.  Estimated zipfile size is
10Mb now.

** Phil.

Akira Kakuto wrote:

> I may not be able to test 1Gb size file.
> Please recover the old binaries. Sorry.
> 
> Best,
> Akira
> 




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Akira Kakuto

> Some interesting (and new, and unexpected) diagnostics, Akira-san; as
> far as I can tell, no PDF was produced :

You have to replace "all" included binaries, by saving the old ones.
Note that size of xdvipdfmx.exe,  which is a wrapper of dvipdfmx.dll,
is 1536 bytes.

Best,
Akira





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Zdenek Wagner
Isn't it possible that you have 64bit binaries elsewhere and your system
prefers them? Just guessing, I am not a Windows user.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

2016-02-01 10:37 GMT+01:00 Philip Taylor :

>
>
> Akira Kakuto wrote:
>
> >> Some interesting (and new, and unexpected) diagnostics, Akira-san; as
> >> far as I can tell, no PDF was produced :
> >
> > You have to replace "all" included binaries, by saving the old ones.
> > Note that size of xdvipdfmx.exe,  which is a wrapper of dvipdfmx.dll,
> > is 1536 bytes.
>
> Akira-san :  I did just that.  I copied the entire "...\bin\win32"
> directory to ...\bin\win32-old, then overwrote all files in
> ...\bin\win32 with the corresponding files from your ZIP file.
>
> I will repeat the process just to ensure that I made no errors and then
> report back.
>
> Philip Taylor
>


Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Jonathan Kew

On 1/2/16 10:25, David Carlisle wrote:

> Thanks for the test sources,
>
> It all seems to work for me (texlive 2015/cygwin 64 build), but..
>
> I do wonder if this change is going in the right direction.
>
> The main problem with the char classes is not the overall number. In
> fact, since what matters when specifying code is the boundary between
> different classes rather than the classes themselves, there are now
> around 300 million such boundaries that could be specified, which
> seems more than enough!
>
> The main problem is that each character can only be in one class, which
> means that it is very hard to use these for any generic code. If you
> have already classified characters by (say) line breaking properties and
> then another package wants to classify by Unicode block, or by default
> writing direction, then the only way to handle that is to enumerate all
> the intersecting properties and assign a unique character class to
> each intersection. This leads to a combinatorial explosion in the number
> of boundary tokens that need to be specified. Where you may have had a
> single specification for the boundary between LTR and RTL, if you also
> want to classify each Unicode block you need separate classes for LTR
> and RTL characters in each block, and then need to specify the same
> boundary tokens for all the possible changes of LTR in one block
> followed by RTL in another.
>
> That limitation of course has always been there, but increasing the
> number of classes available highlights it more strongly.


You're right, of course; this is a limitation of the concept as 
currently implemented.


In practice, I suppose I don't expect there to be all that many "generic 
purposes" for which intercharclass is really a useful tool. For example, 
it's hard to see how it could work well for bidi issues, because of the 
problem of resolving neutral characters -- especially run-initial neutrals.




> Would it be impossibly difficult to extend the concept so that a
> character takes a list of character classes so that you can classify
> characters in more than one way without needing impossibly many
> character classes to do that?


There would be two aspects to this: first, extending the character class 
storage so as to allow a list rather than a single number. Currently, 
it's stashed in the upper part of the word where sfcode already lives, 
making the implementation very simple and cheap.


And second, checking for the existence of a token list for the current 
boundary would become significantly more expensive. Currently, we just 
combine the two classes at the boundary to get a single 32-bit number, 
and do a simple lookup (in a sparse array) to see if there's anything 
defined. With class lists, we'd need to do this for each of the classes 
in the two lists -- i.e. m * n sparse-array lookups. Or perhaps go at it 
from the other direction: iterate over a list of defined transitions, 
and check whether each of them applies.
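
To make the cost argument concrete, here is a minimal macro-level sketch.
The packing formula (left class in the high 16 bits, right class in the
low 16 bits) and the name \pairkey are assumptions for illustration only;
the engine's actual encoding is not given in this thread.

```tex
% ASSUMPTION: illustrative packing, not the engine's documented formula.
% \pairkey is a hypothetical macro; \numexpr is the e-TeX primitive.
\def\pairkey#1#2{\the\numexpr(#1)*65536+(#2)\relax}

% One class per character: one key, hence one sparse-array lookup.
\message{\pairkey{1}{3}}% writes 65539 to the terminal

% Class lists {1,2} and {3,4}: m * n = 4 keys would have to be probed.
\message{\pairkey{1}{3} \pairkey{1}{4} \pairkey{2}{3} \pairkey{2}{4}}
\bye
```

Even the largest value discussed here, \pairkey{16384}{16384}, stays
below TeX's integer limit of 2147483647.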


Oh, and if there are multiple matches at a given boundary, what happens? 
Using an imaginary extension to support lists:


  \XeTeXintercharclasses `A = { 1, 2 }
  \XeTeXintercharclasses `B = { 3, 4 }

  \XeTeXinterchartoks 1 3 = { foo }
  \XeTeXinterchartoks 1 4 = { bar }
  \XeTeXinterchartoks 2 3 = { xyzzy }
  \XeTeXinterchartoks 2 4 = { plugh }

What happens at the boundary in "AB"? Should it depend on the numerical 
values of the classes, or the order in which the transitions were 
specified, or what?


(I'm not saying the idea is a bad one; I can imagine it might be quite 
useful. But I can also imagine it getting a bit hairy..)


JK





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread David Carlisle
Thanks for the test sources,

It all seems to work for me (texlive 2015/cygwin 64 build), but..

I do wonder if this change is going in the right direction.

The main problem with the char classes is not the overall number. In fact,
since what matters when specifying code is the boundary between different
classes rather than the classes themselves, there are now around
300 million such boundaries that could be specified, which seems more than
enough!

The main problem is that each character can only be in one class, which
means that it is very hard to use these for any generic code. If you have
already classified characters by (say) line breaking properties and then
another package wants to classify by Unicode block, or by default writing
direction, then the only way to handle that is to enumerate all the
intersecting properties and assign a unique character class to each
intersection. This leads to a combinatorial explosion in the number of
boundary tokens that need to be specified. Where you may have had a single
specification for the boundary between LTR and RTL, if you also want to
classify each Unicode block you need separate classes for LTR and RTL
characters in each block, and then need to specify the same boundary tokens
for all the possible changes of LTR in one block followed by RTL in another.

That limitation of course has always been there, but increasing the number
of classes available highlights it more strongly.
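
The growth described above can be worked through with small numbers. The
figures below are illustrative assumptions (d = 2 writing directions and a
hypothetical b = 300 Unicode blocks), not counts taken from Unicode; the
last line only checks the "around 300 million" estimate against the
proposed limit of 16384 classes.

```latex
% d = number of directions, b = number of blocks (both hypothetical).
\[
  \mbox{classes needed} = d \times b = 2 \times 300 = 600,
  \qquad
  \mbox{ordered class pairs} = (d \times b)^2 = 360\,000 .
\]
% Direction alone needs only d^2 = 4 ordered pairs, but the single
% LTR-to-RTL rule must now be repeated once per ordered pair of blocks:
\[
  b^2 = 300^2 = 90\,000 .
\]
% Boundaries available with 16384 classes:
\[
  2^{14} \times 2^{14} = 2^{28} = 268\,435\,456 \approx 3 \times 10^{8} .
\]
```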

Would it be impossibly difficult to extend the concept so that a character
takes a list of character classes so that you can classify characters in
more than one way without needing impossibly many character classes to do
that?

Sorry for sounding ungrateful for the extension:-)

David




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Bruno Le Floch
On 2/1/16, David Carlisle  wrote:
> On 1 February 2016 at 10:53, Jonathan Kew  wrote:
>> On 1/2/16 10:25, David Carlisle wrote:
>>
>>> y.
>>>
>>
>> You're right, of course; this is a limitation of the concept as currently
>> implemented.
>>
>> In practice, I suppose I don't expect there to be all that many "generic
>> purposes" for which intercharclass is really a useful tool. For example,
>> it's hard to see how it could work well for bidi issues, because of the
>> problem of resolving neutral characters -- especially run-initial
>> neutrals.
>>
>
> Yes I hesitated about using bidi as the example there, but couldn't think
> of many other generally applicable things:-)
>
>>
>>
>>> Would it be impossibly difficult to extend the concept so that a
>>> character takes a list of character classes so that you can classify
>>> characters in more than one way without needing impossibly many
>>> character classes to do that?
>>>
>>
>> There would be two aspects to this: first, extending the character class
>> storage so as to allow a list rather than a single number. Currently,
>> it's
>> stashed in the upper part of the word where sfcode already lives, making
>> the implementation very simple and cheap.
>>
>> And second, checking for the existence of a token list for the current
>> boundary would become significantly more expensive.
>
>
> Yes I suspected as much, perhaps it's a non starter. If the extended number
> as in the current test branch is "still cheap" then perhaps that's the way
> to go although character classes always seem like they are almost a
> solution to a problem but never quite powerful enough.
>
>
>> Currently, we just combine the two classes at the boundary to get a
>> single
>> 32-bit number, and do a simple lookup (in a sparse array) to see if
>> there's
>> anything defined. With class lists, we'd need to do this for each of the
>> classes in the two lists -- i.e. m * n sparse-array lookups. Or perhaps
>> go
>> at it from the other direction: iterate over a list of defined
>> transitions,
>> and check whether each of them applies.
>>
>
> Makes sense.
>
>
>>
>> Oh, and if there are multiple matches at a given boundary, what happens?
>> Using an imaginary extension to support lists:
>>
>>   \XeTeXintercharclasses `A = { 1, 2 }
>>   \XeTeXintercharclasses `B = { 3, 4 }
>>
>>   \XeTeXinterchartoks 1 3 = { foo }
>>   \XeTeXinterchartoks 1 4 = { bar }
>>   \XeTeXinterchartoks 2 3 = { xyzzy }
>>   \XeTeXinterchartoks 2 4 = { plugh }
>>
>> What happens at the boundary in "AB"? Should it depend on the numerical
>> values of the classes, or the order in which the transitions were
>> specified, or what?
>>
>> (I'm not saying the idea is a bad one; I can imagine it might be quite
>> useful. But I can also imagine it getting a bit hairy..)
>>
>> JK
>
> Yes it certainly wasn't a fully worked proposal, but I thought it worth
> commenting while you were looking at that area of the code.
>
> David

Even with the current intercharclass one could write a package to
implement proposals such as David's (allowing whatever ordering of
transitions people want).  Such a package would define all transitions
to run the same code (including transitions with non-character
primitives), which would test the next token using \futurelet and save
its character code (or other info) in a global variable, say,
\lastchar.  At every step, one can use \lastchar and the next token to
decide what transition to do using whatever rules the package author
thinks of.  Major drawback: kerning is lost.
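
The \futurelet step mentioned above can be tried in isolation. This is only
a sketch of the peeking mechanism in plain TeX, not of a complete package;
\peek, \inspect and \nexttoken are hypothetical names, and recovering the
character code (rather than the meaning) of the next token would need
further work.

```tex
% \peek looks at the next token without consuming it, then runs \inspect.
\def\peek{\futurelet\nexttoken\inspect}
% \inspect reports what was seen; a real package would decide here
% which transition code to run.
\def\inspect{\message{[next: \meaning\nexttoken]}}

\peek B% writes "[next: the letter B]" to the terminal, then typesets B
\bye
```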

Note that this can be used to implement bidi too: just collect neutral
characters rather than leaving them right away in the output.

Not saying that's the right way to do it, but it could be made to work.

Bruno




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread David Carlisle
On 1 February 2016 at 10:53, Jonathan Kew  wrote:

> On 1/2/16 10:25, David Carlisle wrote:
>
>> y.
>>
>
> You're right, of course; this is a limitation of the concept as currently
> implemented.
>
> In practice, I suppose I don't expect there to be all that many "generic
> purposes" for which intercharclass is really a useful tool. For example,
> it's hard to see how it could work well for bidi issues, because of the
> problem of resolving neutral characters -- especially run-initial neutrals.
>

Yes I hesitated about using bidi as the example there, but couldn't think
of many other generally applicable things:-)

>
>
>> Would it be impossibly difficult to extend the concept so that a
>> character takes a list of character classes so that you can classify
>> characters in more than one way without needing impossibly many
>> character classes to do that?
>>
>
> There would be two aspects to this: first, extending the character class
> storage so as to allow a list rather than a single number. Currently, it's
> stashed in the upper part of the word where sfcode already lives, making
> the implementation very simple and cheap.
>
> And second, checking for the existence of a token list for the current
> boundary would become significantly more expensive.


Yes I suspected as much, perhaps it's a non starter. If the extended number
as in the current test branch is "still cheap" then perhaps that's the way
to go although character classes always seem like they are almost a
solution to a problem but never quite powerful enough.


> Currently, we just combine the two classes at the boundary to get a single
> 32-bit number, and do a simple lookup (in a sparse array) to see if there's
> anything defined. With class lists, we'd need to do this for each of the
> classes in the two lists -- i.e. m * n sparse-array lookups. Or perhaps go
> at it from the other direction: iterate over a list of defined transitions,
> and check whether each of them applies.
>

Makes sense.


>
> Oh, and if there are multiple matches at a given boundary, what happens?
> Using an imaginary extension to support lists:
>
>   \XeTeXintercharclasses `A = { 1, 2 }
>   \XeTeXintercharclasses `B = { 3, 4 }
>
>   \XeTeXinterchartoks 1 3 = { foo }
>   \XeTeXinterchartoks 1 4 = { bar }
>   \XeTeXinterchartoks 2 3 = { xyzzy }
>   \XeTeXinterchartoks 2 4 = { plugh }
>
> What happens at the boundary in "AB"? Should it depend on the numerical
> values of the classes, or the order in which the transitions were
> specified, or what?
>
> (I'm not saying the idea is a bad one; I can imagine it might be quite
> useful. But I can also imagine it getting a bit hairy..)


Yes it certainly wasn't a fully worked proposal, but I thought it worth
commenting while you were looking at that area of the code.



>
>
> JK
>
>
David




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Philip Taylor


Jonathan Kew wrote:

> What happens at the boundary in "AB"? Should it depend on the numerical
> values of the classes, or the order in which the transitions were
> specified, or what?

"Clearly" it should not depend on the numerical values of the classes,
since in reality the precedence may have only a partial ordering :

(1 > 2 > 3 > 1)

but the use of a comma does not suggest ordering.  Therefore I would
suggest a variant syntax :

  \XeTeXintercharclasses `A = {1; 2}
  \XeTeXintercharclasses `B = {3; 4}

where the semi-colon is an Algol-68 GOON (= "go on") delimiter
indicating serial elaboration rather than collateral.

Given the modified syntax proposed above, what happens at the boundary
in "AB" would appear to me to be as follows :

[1]  \XeTeXintercharclasses `A = {1; 2}
[1]  \XeTeXintercharclasses `B = {3; 4}

[2]  \XeTeXinterchartoks 1 3 = {foo}
[2]  \XeTeXinterchartoks 1 4 = {bar}
[2]  \XeTeXinterchartoks 2 3 = {xyzzy}
[2]  \XeTeXinterchartoks 2 4 = {plugh}

Ruleset [1] states that 1-transitions are to be considered before
2-transitions when an "A" is encountered; similarly, the same ruleset
states that 3-transitions are to be considered before 4-transitions when
a "B" is encountered.  Thus when we hit an "AB" sequence, the
interchartoks will be inserted in the order {foo}{bar}{xyzzy}{plugh}.
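
The ordering described above can be modelled with ordinary macros. This is
a sketch only: \settoks, \usetoks, \classboundary and the helper names are
invented for illustration and are not XeTeX primitives, and plain comma
lists stand in for the proposed {1; 2} syntax.

```tex
% Store the tokens for a (left,right) class pair in a control sequence
% whose name encodes the pair.
\def\settoks#1#2#3{\expandafter\def\csname ict:#1:#2\endcsname{#3}}
% Run the stored tokens if the pair has any; otherwise do nothing.
\def\usetoks#1#2{%
  \expandafter\ifx\csname ict:#1:#2\endcsname\relax
  \else \csname ict:#1:#2\endcsname
  \fi}

% \classboundary{left list}{right list}: the outer loop runs over the
% left classes, the inner loop over the right classes, both in list order.
\def\classboundary#1#2{\def\rightlist{#2}\scanleft#1,\liststop,}
\def\scanleft#1,{%
  \ifx\liststop#1\else
    \def\leftclass{#1}%
    \expandafter\scanright\rightlist,\liststop,%
    \expandafter\scanleft
  \fi}
\def\scanright#1,{%
  \ifx\liststop#1\else
    \usetoks{\leftclass}{#1}%
    \expandafter\scanright
  \fi}

\settoks{1}{3}{foo }
\settoks{1}{4}{bar }
\settoks{2}{3}{xyzzy }
\settoks{2}{4}{plugh }
\classboundary{1,2}{3,4}% typesets foo, bar, xyzzy, plugh in that order
\bye
```

Swapping the two loops would give foo, xyzzy, bar, plugh instead, which is
one way to see that the choice of ordering is a policy decision.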

N'est-ce pas ?

Philip Taylor





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-02-01 Thread Arthur Reutenauer
On Sun, Jan 31, 2016 at 05:33:49PM +, Jonathan Kew wrote:
> Arthur, if you have a chance to look at this and see if I missed anything
> obvious, that would be great - thanks.

  No obvious issue as far as I can tell, apart from the comment change I
just pushed.

Best,

Arthur




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Joseph Wright
On 31/01/2016 18:07, Philip Taylor wrote:
> 
> 
> Jonathan Kew wrote:
> 
>> Before this gets merged to the master source, though, some testing would
>> be appreciated -- obviously, this will currently require rebuilding
>> xetex from the git source branch.
> 
> I use XeTeX on a daily basis, Jonathan ('tho XeTeXcharclass far less
> frequently) and would be happy to test your version on my production
> suites, but in order to do so I would require a Win64 (or Win32) build.
> 
> Philip Taylor

Hopefully Akira Kakuto will do that for W32TeX: he's done LuaTeX
v0.85/0.87/0.88 binaries.

Joseph





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Philip Taylor


Jonathan Kew wrote:

> Before this gets merged to the master source, though, some testing would
> be appreciated -- obviously, this will currently require rebuilding
> xetex from the git source branch.

I use XeTeX on a daily basis, Jonathan ('tho XeTeXcharclass far less
frequently) and would be happy to test your version on my production
suites, but in order to do so I would require a Win64 (or Win32) build.

Philip Taylor




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Jonathan Kew

On 13/12/15 07:04, Werner LEMBERG wrote:


> [XeTeX 3.14159265-2.6-0.2 (TeX Live 2015)]
>
> Folks,
>
> I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
> style file maps Unicode character blocks to character classes, and
> I've hit the 256 entry limit of \XeTeXcharclass...
>
> Any chance to extend it to 16 bits?



Returning to this: I've just pushed an experimental branch 
'more-classes' to the git repository at 
http://sourceforge.net/projects/xetex/ that is intended to increase the 
\XeTeXcharclass limit to 16384 (where 16384 will be the "ignored" value, 
formerly 256, and 16383 will be the "boundary" value, formerly 255; so 
for general allocation purposes the maximum class number becomes 16382).


If we go ahead with this change, it will require updates to any 
packages/documents that rely on the special values 255 and 256, and the 
\newXeTeXintercharclass allocator will need to be told the new limit.
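
For readers unfamiliar with the special values, a minimal plain XeTeX
sketch of a document that relies on the "boundary" class follows. Class
number 4 and the name \boundaryclass are illustrative assumptions (the
sketch also assumes class 4 is not otherwise in use); the value 255 is the
current boundary class and would become 16383 on the experimental branch
described above.

```tex
% Hypothetical helper name; set to 16383 on the experimental branch.
\chardef\boundaryclass=255
\XeTeXinterchartokenstate=1 % enable inter-character token insertion

% Put the ASCII digits 0-9 into class 4 (an arbitrary choice).
\newcount\digit
\digit=`\0
\loop
  \XeTeXcharclass\digit=4
  \ifnum\digit<`\9 \advance\digit by 1
\repeat

% Switch to italics when entering a run of digits, from ordinary
% characters (class 0) or from a boundary such as a space ...
\XeTeXinterchartoks 0 4 = {\begingroup\it}
\XeTeXinterchartoks \boundaryclass 4 = {\begingroup\it}
% ... and switch back when leaving it.
\XeTeXinterchartoks 4 0 = {\endgroup}
\XeTeXinterchartoks 4 \boundaryclass = {\endgroup}

Room 101 and flat 7b.
\bye
```

Only the \chardef line mentions the special value, so a package written
this way would need a single change if the limit moves.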


Before this gets merged to the master source, though, some testing would 
be appreciated -- obviously, this will currently require rebuilding 
xetex from the git source branch.


Arthur, if you have a chance to look at this and see if I missed 
anything obvious, that would be great - thanks.


JK





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Akira Kakuto

> I use XeTeX on a daily basis, Jonathan ('tho XeTeXcharclass far less
> frequently) and would be happy to test your version on my production
> suites, but in order to do so I would require a Win64 (or Win32) build.


You can test the new experimental XeTeX on win32 by
http://members2.jcom.home.ne.jp/wt1357ak/xetex-exp-w32.zip

The file will be removed in due time.

Best,
Akira





Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Philip Taylor
Just as TeX has \maxdimen, it would be useful if derivatives of TeX such
as XeTeX could add analogous environmental enquiries such as
\maxXeTeXcharclass (or, less uglily but also less meaningfully,
\XeTeXmaxcharclass).

Philip Taylor

Jonathan Kew wrote:

> If/when the change happens officially, there'll be a new \XeTeXrevision
> value, I guess. So macros could check that.




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Philip Taylor


Joseph Wright wrote:

> \maxdimen isn't a primitive (though it's in the plain format).

Oh.  Never knew that (or if I did, I forgot it long ago).  Thank you,
Joseph.

** Phil.




Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Zdenek Wagner
\maxdimen is a maximum dimension value, not a maximum number of \dimen
registers.

Zdeněk Wagner
http://ttsm.icpf.cas.cz/team/wagner.shtml
http://icebearsoft.euweb.cz

2016-01-31 19:57 GMT+01:00 Philip Taylor :

>
>
> Joseph Wright wrote:
>
> > \maxdimen isn't a primitive (though it's in the plain format).
>
> Oh.  Never knew that (or if I did, I forgot it long ago).  Thank you,
> Joseph.
>
> ** Phil.
>


Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Jonathan Kew

On 31/1/16 17:56, Zdenek Wagner wrote:

> Hi Jonathan,
>
> can my macro test something so that I can decide whether to use 255 or
> 16383? For some time both can exist in distros and on users' computers
> that have not been updated, so I would like to make my packages more robust.


If/when the change happens officially, there'll be a new \XeTeXrevision 
value, I guess. So macros could check that.
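
A minimal sketch of such a check follows. The threshold \firstbigrevision
is a placeholder: the thread does not say which release, if any, will ship
the larger limit, so the value shown is hypothetical and must be replaced
once it is known. The names \boundaryclass and \ignoredclass are likewise
invented for this example, and an experimental binary whose revision has
not been bumped would not be detected this way.

```tex
% ASSUMPTION: hypothetical threshold, replace with the real revision.
\def\firstbigrevision{0.99999}

% \the\XeTeXversion\XeTeXrevision expands to a decimal such as 0.99992,
% which can be compared as a dimension.
\ifdim\the\XeTeXversion\XeTeXrevision pt<\firstbigrevision pt
  \chardef\boundaryclass=255   % current engines
  \chardef\ignoredclass=256
\else
  \chardef\boundaryclass=16383 % values from the 'more-classes' branch
  \chardef\ignoredclass=16384
\fi
\message{boundary class: \number\boundaryclass}
\bye
```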


JK




> Zdeněk Wagner
> http://ttsm.icpf.cas.cz/team/wagner.shtml
> http://icebearsoft.euweb.cz
>
> 2016-01-31 18:33 GMT+01:00 Jonathan Kew:
>
>> On 13/12/15 07:04, Werner LEMBERG wrote:
>>
>>> [XeTeX 3.14159265-2.6-0.2 (TeX Live 2015)]
>>>
>>> Folks,
>>>
>>> I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
>>> style file maps Unicode character blocks to character classes, and
>>> I've hit the 256 entry limit of \XeTeXcharclass...
>>>
>>> Any chance to extend it to 16 bits?
>>
>> Returning to this: I've just pushed an experimental branch
>> 'more-classes' to the git repository at
>> http://sourceforge.net/projects/xetex/ that is intended to increase
>> the \XeTeXcharclass limit to 16384 (where 16384 will be the
>> "ignored" value, formerly 256, and 16383 will be the "boundary"
>> value, formerly 255; so for general allocation purposes the maximum
>> class number becomes 16382).
>>
>> If we go ahead with this change, it will require updates to any
>> packages/documents that rely on the special values 255 and 256, and
>> the \newXeTeXintercharclass allocator will need to be told the new
>> limit.
>>
>> Before this gets merged to the master source, though, some testing
>> would be appreciated -- obviously, this will currently require
>> rebuilding xetex from the git source branch.
>>
>> Arthur, if you have a chance to look at this and see if I missed
>> anything obvious, that would be great - thanks.
>>
>> JK


Re: [XeTeX] not enough \XeTeXcharclass registers

2016-01-31 Thread Joseph Wright
On 31/01/2016 18:31, Philip Taylor wrote:
> Just as TeX has \maxdimen, it would be useful if derivatives of TeX such
> as XeTeX could add analogous environmental enquiries such as
> \maxXeTeXcharclass (or, less uglily but also less meaningfully,
> \XeTeXmaxcharclass).

\maxdimen isn't a primitive (though it's in the plain format).

Joseph





Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-14 Thread Simon Cozens
On 15/12/2015 04:02, Werner LEMBERG wrote:
> I guess this would need a complete rewrite of the ucharclasses package
> (so Michiel should answer :-), but yes, such an approach could solve
> the issue.
> Regardless of that, I think that \XeTeXcharclass should allow more
> than 256 registers.

Yeah, it shouldn't be necessary to rewrite everything to get around the
fact that TeX is an 8-bit system. I would try changing the bounds in
scan_char_class myself, but my build of XeTeX is... somewhat custom.

S



--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-14 Thread Werner LEMBERG

[please CC me, I'm not subscribed to the XeTeX mailing list]

> > I'm updating the `ucharclasses.sty' to completely cover Unicode.
> > This style file maps Unicode character blocks to character classes,
> > and I've hit the 256 entry limit of \XeTeXcharclass...
> >
> > Any chance to extend it to 16 bits?
>
> Don't you only need to allocate a class to a block for which a
> transition is specified, so 256 should only be a problem if you need
> to specify that many transitions?

I guess this would need a complete rewrite of the ucharclasses package
(so Michiel should answer :-), but yes, such an approach could solve
the issue.

Regardless of that, I think that \XeTeXcharclass should allow more
than 256 registers.


Werner


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-14 Thread Mike "Pomax" Kamermans
It's certainly possible to limit the number of default transitions, but
with Unicode ever expanding, and unrealistic as it might seem, it's
really just a matter of time until someone actually needs more classes
than are currently available.


There is an obvious optimization, as per Jonathan's suggestion, to
collapse multiple blocks into scripts instead, which brings down the
number of classes the package requires. But also remember that
ucharclasses is just one package; if someone uses additional packages
that rely on allocating classes to achieve some goal, then having only
255 (minus the xetex-reserved classes) may stop being enough much sooner
than if classes are only used for the transition triggers that
ucharclasses offers.


- Pomax

On 12/14/2015 11:02 AM, Werner LEMBERG wrote:

I guess this would need a complete rewrite of the ucharclasses package
(so Michiel should answer :-), but yes, such an approach could solve
the issue.

Regardless of that, I think that \XeTeXcharclass should allow more
than 256 registers.


 Werner




--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-14 Thread Mike "Pomax" Kamermans

That sounds pretty encouraging!

- Pomax

On 12/13/2015 6:11 PM, Simon Cozens wrote:

Here's the relevant code:

   p:=cur_chr; scan_usv_num;
   p:=p+cur_val;
   n:=sf_code(cur_val) mod @"10000;
   scan_optional_equals;
   scan_char_class;
   define(p,data,cur_val*@"10000 + n);

scan_char_class calls scan_int (which will scan a number up to
2147483647) and then ensures it is between 0 and 256. It's then scaled
up by << 16 and put into the table of equivalents by eq_define which
expects its final parameter to be a halfword. The maximum value of a
halfword is 1073741823, which I guess gives you a theoretical maximum of
16383 character classes.

It *might* be that if you up the maximum in scan_char_class it will all
just work right?


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread Joseph Wright
On 13/12/2015 07:04, Werner LEMBERG wrote:
> 
> [XeTeX 3.14159265-2.6-0.2 (TeX Live 2015)]
> 
> 
> Folks,
> 
> 
> I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
> style file maps Unicode character blocks to character classes, and
> I've hit the 256 entry limit of \XeTeXcharclass...
> 
> Any chance to extend it to 16 bits?
> 
> 
> Werner

I've been looking at Unicode classes recently :-) Exactly what
sub-division are you going for? There are several Unicode values that
seem to be important for 'full classification'.

Joseph



--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread Simon Cozens
On 14/12/2015 04:31, Jonathan Kew wrote:
> Probably, at least in principle; I don't remember the code offhand to
> know how easy/difficult this might be.

Here's the relevant code:

  p:=cur_chr; scan_usv_num;
  p:=p+cur_val;
  n:=sf_code(cur_val) mod @"10000;
  scan_optional_equals;
  scan_char_class;
  define(p,data,cur_val*@"10000 + n);

scan_char_class calls scan_int (which will scan a number up to
2147483647) and then ensures it is between 0 and 256. It's then scaled
up by << 16 and put into the table of equivalents by eq_define which
expects its final parameter to be a halfword. The maximum value of a
halfword is 1073741823, which I guess gives you a theoretical maximum of
16383 character classes.

It *might* be that if you up the maximum in scan_char_class it will all
just work right?
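
The arithmetic above can be checked with a short sketch. It is written in
Python rather than WEB, and the names pack, unpack and MAX_HALFWORD are
made up for illustration; it only assumes what is described above, namely
that the class is shifted up by 16 bits inside a halfword whose maximum
value is 1073741823.

```python
# Sketch only: models the packing described above, not the actual XeTeX code.
MAX_HALFWORD = 1073741823      # maximum halfword value quoted above (2**30 - 1)
SHIFT = 16                     # the class is scaled up by << 16 (times 65536)
LOW_MASK = (1 << SHIFT) - 1    # the low 16 bits keep the existing value n


def pack(char_class, n):
    """Combine a class number with the 16-bit value n: class * 65536 + n."""
    if char_class < 0:
        raise ValueError("class must be non-negative")
    if not 0 <= n <= LOW_MASK:
        raise ValueError("n must fit in 16 bits")
    value = (char_class << SHIFT) + n
    if value > MAX_HALFWORD:
        raise OverflowError("packed value does not fit in a halfword")
    return value


def unpack(value):
    """Split a packed value back into (class, n)."""
    return value >> SHIFT, value & LOW_MASK


# Largest class whose packed value still fits in a halfword.
max_class = MAX_HALFWORD >> SHIFT
print(max_class)  # 16383
```

Under these assumptions pack(16383, 65535) is exactly the halfword
maximum, and pack(16384, 0) already overflows, which matches the
theoretical limit estimated above.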


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread Mike "Pomax" Kamermans
Getting those up to 16 bits would be fantastic. There is some PR
activity on https://github.com/Pomax/ucharclasses right now to get
things fitting Unicode 8; if that's hitting the limitations of XeTeX,
then that feels like something worth having a look at.


-- Pomax

On 12/12/2015 11:04 PM, Werner LEMBERG wrote:

[XeTeX 3.14159265-2.6-0.2 (TeX Live 2015)]


Folks,


I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
style file maps Unicode character blocks to character classes, and
I've hit the 256 entry limit of \XeTeXcharclass...

Any chance to extend it to 16 bits?


 Werner




--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread Mike "Pomax" Kamermans


> But my first reaction to the question is to ask whether Unicode
> character blocks are the most useful thing to map. They're rather
> arbitrary, with things like Latin script split up across numerous
> blocks largely due to historical accident.
>
> Mapping the Script property to char classes would seem much more
> useful, IMO.


Yeah, there's some discussion in
https://github.com/Pomax/ucharclasses/pull/12 about consolidating
blocks with their "...extended..." companions, although even that
might just be passing the buck. Unicode seems to have picked up the
pace, landing new versions with more and more languages (as well as
non-languages), which means more and more blocks, many of which count
as new scripts. So it might simply be a matter of a few years until we
cross the charclass limit again, even if we switch to scripts rather
than pure blocks.
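
To make the blocks-versus-scripts trade-off concrete, here is a rough
Python sketch. The table is a tiny hand-picked sample, and treating each
block as belonging to exactly one script is a simplification (the Script
property is really assigned per character); classes_by_script is a
made-up helper, not anything in ucharclasses.

```python
# Sketch only: a tiny hand-picked sample, not the Unicode data files.
# Assumption: each block maps to a single script, which is a simplification.
BLOCK_TO_SCRIPT = {
    "Basic Latin": "Latin",
    "Latin-1 Supplement": "Latin",
    "Latin Extended-A": "Latin",
    "Latin Extended-B": "Latin",
    "Greek and Coptic": "Greek",
    "Greek Extended": "Greek",
    "Cyrillic": "Cyrillic",
    "Cyrillic Supplement": "Cyrillic",
}


def classes_by_script(block_to_script):
    """Give every script one class number, shared by all of its blocks."""
    script_class = {}
    block_class = {}
    for block, script in block_to_script.items():
        if script not in script_class:
            # Start at 1 so that class 0 stays free as the default class.
            script_class[script] = len(script_class) + 1
        block_class[block] = script_class[script]
    return block_class


block_class = classes_by_script(BLOCK_TO_SCRIPT)
# Number of blocks versus number of distinct classes actually needed.
print(len(BLOCK_TO_SCRIPT), len(set(block_class.values())))  # 8 3
```

In this sample eight blocks need only three classes, but every new
script still costs one class, so the limit is postponed rather than
removed.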


- Pomax


--
Subscriptions, Archive, and List information, etc.:
 http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-13 Thread David Carlisle
On 13 December 2015 at 07:04, Werner LEMBERG wrote:
>
> [XeTeX 3.14159265-2.6-0.2 (TeX Live 2015)]
>
>
> Folks,
>
>
> I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
> style file maps Unicode character blocks to character classes, and
> I've hit the 256 entry limit of \XeTeXcharclass...
>
> Any chance to extend it to 16 bits?
>
>
> Werner
>

Don't you only need to allocate a class to a block for which a
transition is specified, so 256 should only be a problem if you need
to specify that many transitions?
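
A rough sketch of that idea, in Python for brevity: a class number is
handed out only when a block first appears in a transition. The
LazyClassAllocator name and the usable range 1..254 are assumptions made
for illustration (0 being the default class and 255/256 the special
values at the time of writing); this is not how ucharclasses.sty is
actually written.

```python
# Sketch only: hypothetical allocator, not how ucharclasses.sty is written.
class LazyClassAllocator:
    def __init__(self, first=1, limit=254):
        # Assumed usable range: 0 is the default class, and 255 and 256
        # are special values in XeTeX at the time of this thread.
        self.next_free = first
        self.limit = limit
        self.assigned = {}

    def class_for(self, block):
        """Return the class for a block, allocating one on first use."""
        if block not in self.assigned:
            if self.next_free > self.limit:
                raise RuntimeError("out of character classes")
            self.assigned[block] = self.next_free
            self.next_free += 1
        return self.assigned[block]


alloc = LazyClassAllocator()
# Only blocks that take part in a transition ever receive a class.
transitions = [
    ("Basic Latin", "Greek and Coptic"),
    ("Greek and Coptic", "Basic Latin"),
]
for source, target in transitions:
    alloc.class_for(source)
    alloc.class_for(target)
print(len(alloc.assigned))  # 2
```

With this scheme the limit is only reached once a document names more
blocks in its transitions than there are free classes, however many
blocks Unicode defines.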

David


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex


Re: [XeTeX] not enough \XeTeXcharclass registers

2015-12-12 Thread Philip Taylor


Werner LEMBERG wrote:

> I'm updating the `ucharclasses.sty' to completely cover Unicode.  This
> style file maps Unicode character blocks to character classes, and
> I've hit the 256 entry limit of \XeTeXcharclass...
> 
> Any chance to extend it to 16 bits?

16 bits, Sir ?  What do you think this is, the 21st century ?  8 bits
(and at most five computers [1]) are more than enough for this world,
and will be for a very long time ...

** Phil.

[1] Variously attributed to T J Watson, Douglas Hartree & Howard Aiken.


--
Subscriptions, Archive, and List information, etc.:
  http://tug.org/mailman/listinfo/xetex