Re: [tex4ht] two versions of unicode.4ht

2016-08-03 Thread Ulrike Fischer
Michal,

> Quoting Ulrike Fischer (2016-08-03 18:02:41)
>> 
>> > There are quite a lot of unicode.4hf versions generated from
>> > tex4ht-fonts-4hf.tex:
>> 
>> Yes I know. I'm not wondering about this.
>> 
>> But why do I have two in the iso8859/1/charset folder?
>> 
>> Only "iso8859/1" has a "uni" subfolder in the charset folder with an
>> additional unicode.4hf.
>> 
>> E.g. compare in your list iso8859/1 with iso8859/2:
>> 
>> 2 versions here:
>> 
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf
>> 
>> but 3 versions here:
>> 
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
>>  ^^^ odd
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf
>> 

> I don't really understand how tex4ht selects unicode.4hf files. On my
> machine, it always selects the one in the charset subdir; I don't even
> know how it is possible to select one in another subdir. Presumably by
> editing tex4ht.env.

Well, the general principle is easy.

In the env-file there are blocks e.g.


i/tex4ht/ht-fonts/symbol/!
i/tex4ht/ht-fonts/unicode/!
i/tex4ht/ht-fonts/ascii/!
i/tex4ht/ht-fonts/alias/!


and with the -c option you choose such a block: -cunihtf will use
the unihtf block, and -csymhtf the symhtf block.
The default block seems to be the fallback.

The problem is the finer details: MiKTeX seems to need two subfolder
levels, while TeX Live seems to look only in the next subfolder.




-- 
Kind regards
Ulrike Fischer
mailto:ne...@nililand.de




Re: [tex4ht] two versions of unicode.4ht

2016-08-03 Thread Michal Hoftich
Quoting Ulrike Fischer (2016-08-03 18:02:41)
> 
> > There are quite a lot of unicode.4hf versions generated from
> > tex4ht-fonts-4hf.tex:
> 
> Yes I know. I'm not wondering about this.
> 
> But why do I have two in the iso8859/1/charset folder?
> 
> Only "iso8859/1" has a "uni" subfolder in the charset folder with an
> additional unicode.4hf.
> 
> E.g. compare in your list iso8859/1 with iso8859/2:
> 
> 2 versions here:
> 
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf
> 
> but 3 versions here:
> 
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
>  ^^^ odd
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf
> 

I don't really understand how tex4ht selects unicode.4hf files. On my
machine, it always selects the one in the charset subdir; I don't even
know how it is possible to select one in another subdir. Presumably by
editing tex4ht.env. So it is definitely strange that in MiKTeX it selects
the unicode.4hf in the charset/uni dir. Does it also work this way in
TeX Live on Windows? Maybe there is an old version of tex4ht.exe?

Michal



Re: [tex4ht] two versions of unicode.4ht

2016-08-03 Thread Ulrike Fischer
Hello Michal,

>> I found two versions of unicode.4hf in
>> 
>> \ht-fonts\iso8859\1 
>> 
>> one in
>> 
>>   D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\
>> 
>> the other in
>> 
>>   D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\uni\
>> 
>> Their content is not identical, the one in charset has two extra
>> lines:
>> 
>> 'fi' ''  'fi'   ''
>> 'fl' ''  'fl'   ''
>> 
>> I'm not quite sure if both are really from the texlive installation
>> -- perhaps one of them remained from a test I did to compare the
>> location with the one from miktex, but I mention it anyway just in
>> case. Also I would like to know which one is the correct one. 

> There are quite a lot of unicode.4hf versions generated from
> tex4ht-fonts-4hf.tex:

Yes I know. I'm not wondering about this.

But why do I have two in the iso8859/1/charset folder?

Only "iso8859/1" has a "uni" subfolder in the charset folder with an
additional unicode.4hf.

E.g. compare in your list iso8859/1 with iso8859/2:

2 versions here:

> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf

but 3 versions here:

> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
 ^^^ odd
> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf





> It seems that the issue someone had on TeX.sx with MiKTeX [1] is that the
> wrong `unicode.4hf` file is used: it can't find the one in the `unicode`
> dir, so the one in `iso8859/1` is used instead, which results in a file
> with declared `utf-8` encoding but characters in `iso8859` encoding.

> I am not sure what the issue is here. It seems that the .4hf files are
> in the correct places, but tex4ht can't find them.

That's a bug in MiKTeX.

Somehow the ! in e.g.

i~/tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/!

is not correctly interpreted and doesn't work if the subfolder is
exactly one level down, so files in

texmf/tex4ht/ht-fonts/unicode/charset are not found, while
texmf/tex4ht/ht-fonts/unicode/charset/uni works.
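To check on a given installation how many levels below a search root each unicode.4hf sits, something like the following could be used. It is only a diagnostic sketch: the function name `find_with_depth` is made up for illustration, and it says nothing about how tex4ht or MiKTeX actually resolve the trailing `!`.

```python
from pathlib import Path


def find_with_depth(root, filename="unicode.4hf"):
    """Return sorted (depth, path) pairs for every `filename` below `root`.

    depth 1: the file is in a direct subfolder, e.g. root/charset/
    depth 2: the file is two levels down, e.g. root/charset/uni/
    """
    root = Path(root)
    hits = []
    for path in root.rglob(filename):
        # Number of folders between the search root and the file itself.
        depth = len(path.relative_to(root).parts) - 1
        hits.append((depth, path))
    return sorted(hits)
```

Calling `find_with_depth` on the local `ht-fonts/unicode` folder would list the files that sit exactly one level down (the case reported as failing above) separately from the deeper ones.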



-- 
Kind regards
Ulrike Fischer
mailto:ne...@nililand.de




Re: [tex4ht] two versions of unicode.4ht

2016-08-03 Thread Michal Hoftich
Hi Ulrike,

> I found two versions of unicode.4hf in 
> 
> \ht-fonts\iso8859\1 
> 
> one in
> 
>   D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\
> 
> the other in
> 
>   D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\uni\
> 
> Their content is not identical, the one in charset has two extra
> lines:
> 
> 'fi' ''  'fi'   ''
> 'fl' ''  'fl'   ''
> 
> I'm not quite sure if both are really from the texlive installation
> -- perhaps one of them remained from a test I did to compare the
> location with the one from miktex, but I mention it anyway just in
> case. Also I would like to know which one is the correct one. 

There are quite a lot of unicode.4hf versions generated from
tex4ht-fonts-4hf.tex:

tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/win/1251/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/utf8/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/gbk/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/symbol/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/viscii/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/viqr/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/html-speech/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/charset/mnemonic/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/charset/native/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/cp1256/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/ooffice/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/gb2312/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/koi/8r/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/5/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/5/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/6/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/6/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/7/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/jsml/charset/unicode.4hf

If I understand it correctly, tex4ht searches the directories specified in
the font sections of tex4ht.env. These sections can be selected with the
`-c` option of the tex4ht command. The default section is


i~/tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/ascii/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/alias/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/!


I think that the first unicode.4hf file found is used, but I am not
sure what mechanism is used to locate it.  By default,

ht-fonts/iso8859/1/html/charset/unicode.4hf

is used on my machine; with -cunihtf it is 

ht-fonts/unicode/html/charset/unicode.4hf

I don't understand why 

ht-fonts/unicode/charset/unicode.4hf

is not used instead.
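One way to picture the "first file found wins" idea is the toy model below. It is purely hypothetical (the function `first_match` and its `reverse` switch are invented for illustration and are not taken from the tex4ht sources); it only shows that, under such a rule, the result depends on the order of the roots and on the order in which subfolders are visited.

```python
import os


def first_match(roots, filename="unicode.4hf", reverse=False):
    """Return the first `filename` found, trying `roots` in the given order.

    Assumption: each root is searched recursively (as a trailing `!`
    suggests) and subfolders are visited in sorted or reverse-sorted order.
    """
    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Sorting in place controls the order in which os.walk descends.
            dirnames.sort(reverse=reverse)
            if filename in filenames:
                return os.path.join(dirpath, filename)
    return None
```

In this model, a root containing both `charset/unicode.4hf` and `html/charset/unicode.4hf` yields the first with `reverse=False` and the second with `reverse=True`, so a different visiting order alone would be enough to explain a surprising pick.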

Anyway, unicode.4hf files are important for output encodings other
than utf-8, as they specify the character codes to which the Unicode
entities specified in the DVI file should be transformed. With the -utf8
option, tex4ht outputs in utf-8 encoding and unicode.4hf is used only to
output some characters as named entities, for example, or the "fi"
ligature as literal "fi".

It seems that the issue someone had on TeX.sx with MiKTeX [1] is that the
wrong `unicode.4hf` file is used: it can't find the one in the `unicode`
dir, so the one in `iso8859/1` is used instead, which results in a file
with declared `utf-8` encoding but characters in `iso8859` encoding.

I am not sure what the issue is here. It seems that the .4hf files are
in the correct places, but tex4ht can't find them.

The last thing I've found is that we don't generate the .4hf files from
the sources at the moment; there is no target for tex4ht-fonts-4hf.tex
in the Makefile. It is also such a huge file that compilation fails
with capacity exceeded. It can be compiled with LuaLaTeX, though. The
generated files seem to be incorrect, as they include a copyright notice
at the beginning and tex4ht complains about incorrect entries.

Best regards,
Michal

[1] http://tex.stackexchange.com/q/322164/2891



[tex4ht] two versions of unicode.4ht

2016-08-03 Thread Ulrike Fischer
I found two versions of unicode.4hf in 

\ht-fonts\iso8859\1 

one in

  D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\

the other in

  D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\uni\

Their content is not identical, the one in charset has two extra
lines:

'fi' ''  'fi'   ''
'fl' ''  'fl'   ''

I'm not quite sure if both are really from the texlive installation
-- perhaps one of them remained from a test I did to compare the
location with the one from miktex, but I mention it anyway just in
case. Also I would like to know which one is the correct one. 
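
For anyone who wants to repeat the comparison, a minimal sketch along these lines should list the lines present in only one of the two files. The helper name `extra_lines` is made up, and reading the files as latin-1 is an assumption chosen so that any byte sequence can be read without errors.

```python
import difflib
from pathlib import Path


def extra_lines(file_a, file_b, encoding="latin-1"):
    """Return (only_in_a, only_in_b): lines found in just one of the files."""
    a = Path(file_a).read_text(encoding=encoding).splitlines()
    b = Path(file_b).read_text(encoding=encoding).splitlines()
    only_a, only_b = [], []
    for line in difflib.ndiff(a, b):
        if line.startswith("- "):      # line occurs only in file_a
            only_a.append(line[2:])
        elif line.startswith("+ "):    # line occurs only in file_b
            only_b.append(line[2:])
    return only_a, only_b
```

Passing the unicode.4hf from `charset\` as `file_a` and the one from `charset\uni\` as `file_b` should then report the two extra lines in the first list, if the files differ only as described.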



-- 
Ulrike Fischer 
http://www.troubleshooting-tex.de/