Re: [tex4ht] two versions of unicode.4ht
Michal,

> Quoting Ulrike Fischer (2016-08-03 18:02:41)
>>
>> > There is quite a lot of unicode.4hf versions generated from
>> > tex4ht-fonts-4hf.tex:
>>
>> Yes I know. I'm not wondering about this.
>>
>> But why do I have two in the iso8859/1/charset folder?
>>
>> Only "iso8859/1" has a "uni" subfolder in the charset folder with an
>> additional unicode.4hf.
>>
>> E.g. compare in your list iso8859/1 with iso8859/2:
>>
>> 2 versions here:
>>
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf
>>
>> but 3 versions here:
>>
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
>>                                                      ^^^ odd
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
>> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf
>>
> I don't really understand how tex4ht selects unicode.4hf files. On my
> machine, it always selects the one in charset subdir, I don't even know
> how it is possible to select one in another subdir. Surely with
> tex4ht.env edit.

Well, the general principle is easy. In the env file there are blocks, e.g.

  i/tex4ht/ht-fonts/symbol/!
  i/tex4ht/ht-fonts/unicode/!
  i/tex4ht/ht-fonts/ascii/!
  i/tex4ht/ht-fonts/alias/!

and with the -c option you choose such a block: -cunihtf uses the unihtf block, and -csymhtf the symhtf block. The default block seems to be the fallback.

The problem is in the finer details: MiKTeX seems to need two subfolder levels, while TeX Live seems to look only in the next subfolder.

-- 
Best regards,
Ulrike Fischer
mailto:ne...@nililand.de
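The block selection described above can be sketched in a few lines of Python. This is only an illustration of the principle, not how tex4ht itself is implemented: the `<name> ... </name>` block syntax, the sample entries in `SAMPLE_ENV`, and the helper `font_dirs` are assumptions made for the example, and the whole-block fallback to `default` is a simplification of what the thread describes.

```python
import re

# Hypothetical excerpt: the <name> ... </name> syntax and these entries are
# assumptions for illustration, not a copy of a real tex4ht.env.
SAMPLE_ENV = """\
<default>
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/!
</default>
<unihtf>
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/alias/!
</unihtf>
"""


def font_dirs(env_text, block="default"):
    """Return the 'i' entries of the named block, or of 'default' if the block is missing."""
    blocks = {}
    for name, body in re.findall(r"<(\w+)>\n(.*?)</\1>", env_text, flags=re.S):
        # Keep only lines starting with 'i' and drop that one-letter prefix.
        blocks[name] = [line[1:] for line in body.splitlines() if line.startswith("i")]
    return blocks.get(block, blocks.get("default", []))


if __name__ == "__main__":
    # Corresponds to running with -cunihtf in this simplified model.
    for entry in font_dirs(SAMPLE_ENV, "unihtf"):
        print(entry)
```

In this model, asking for a block that does not exist returns the entries of the default block, which mirrors the fallback behaviour mentioned above.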
Re: [tex4ht] two versions of unicode.4ht
Quoting Ulrike Fischer (2016-08-03 18:02:41)
>
> > There is quite a lot of unicode.4hf versions generated from
> > tex4ht-fonts-4hf.tex:
>
> Yes I know. I'm not wondering about this.
>
> But why do I have two in the iso8859/1/charset folder?
>
> Only "iso8859/1" has a "uni" subfolder in the charset folder with an
> additional unicode.4hf.
>
> E.g. compare in your list iso8859/1 with iso8859/2:
>
> 2 versions here:
>
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf
>
> but 3 versions here:
>
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
>                                                      ^^^ odd
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
> > tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf
>

I don't really understand how tex4ht selects unicode.4hf files. On my machine, it always selects the one in the charset subdir; I don't even know how it is possible to select one in another subdir. Surely by editing tex4ht.env.

So it is definitely strange that in MiKTeX it selects the unicode.4hf in the charset/uni dir. Does it also work this way in TeX Live on Windows? Maybe there is an old version of tex4ht.exe?

Michal
Re: [tex4ht] two versions of unicode.4ht
Hello Michal,

>> I found two versions of unicode.4hf in
>>
>> \ht-fonts\iso8859\1
>>
>> one in
>>
>> D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\
>>
>> the other in
>>
>> D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\uni\
>>
>> Their content is not identical, the one in charset has two extra
>> lines:
>>
>> 'fi' '' 'fi' ''
>> 'fl' '' 'fl' ''
>>
>> I'm not quite sure if both are really from the texlive installation
>> -- perhaps one of them remained from a test I did to compare the
>> location with the one from miktex, but I mention it anyway just in
>> case. Also I would like to know which one is the correct one.

> There is quite a lot of unicode.4hf versions generated from
> tex4ht-fonts-4hf.tex:

Yes, I know. I'm not wondering about this.

But why do I have two in the iso8859/1/charset folder?

Only "iso8859/1" has a "uni" subfolder in the charset folder with an additional unicode.4hf.

E.g. compare in your list iso8859/1 with iso8859/2:

2 versions here:

> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf

but 3 versions here:

> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
                                                     ^^^ odd
> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
> tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf

> It seems that issue someone had on TeX.sx with Miktex [1] is that wrong
> `unicode.4hf` file is used, it can't find the one in `unicode` dir and
> instead the one in `iso8859/1` is used, which results in file with
> declared `utf-8` encoding, but characters in `iso8859` encoding.

> I am not sure what is the issue here. It seems that the .4hf files are
> in correct places, but tex4ht can't find them.

That's a bug in MiKTeX. Somehow the ! in e.g.

  i~/tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/!

is not correctly interpreted and doesn't work if the subfolder is exactly one level down, so files in texmf/tex4ht/ht-fonts/unicode/charset are not found, while texmf/tex4ht/ht-fonts/unicode/charset/uni works.

-- 
Best regards,
Ulrike Fischer
mailto:ne...@nililand.de
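To make the symptom concrete, here is a small Python model of a recursive lookup below a directory marked with `!`. It only imitates the behaviour reported above and is not MiKTeX's or tex4ht's actual search code: the helpers `find_file` and `make_tree`, and the `min_depth` parameter used to mimic "one level down is skipped", are hypothetical names introduced for the illustration.

```python
import os
import tempfile


def find_file(root, filename, min_depth=1):
    """Walk root top-down; return the first match at least min_depth levels below root."""
    root = os.path.abspath(root)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if filename in filenames and depth >= min_depth:
            return os.path.join(dirpath, filename)
    return None


def make_tree(base, relative_dirs, filename="unicode.4hf"):
    """Create empty copies of filename in each of the given subdirectories of base."""
    for rel in relative_dirs:
        target = os.path.join(base, *rel.split("/"))
        os.makedirs(target, exist_ok=True)
        open(os.path.join(target, filename), "w").close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_tree(tmp, ["charset", "charset/uni"])
        # Expected behaviour: the copy one level down is found.
        print(find_file(tmp, "unicode.4hf", min_depth=1))
        # Model of the reported symptom: only the copy two levels down is seen.
        print(find_file(tmp, "unicode.4hf", min_depth=2))
```

With `min_depth=2` the copy in `charset` is ignored and only the one in `charset/uni` is returned, which is the pattern described for the `unicode` directory.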
Re: [tex4ht] two versions of unicode.4ht
Hi Ulrike,

> I found two versions of unicode.4hf in
>
> \ht-fonts\iso8859\1
>
> one in
>
> D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\
>
> the other in
>
> D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\uni\
>
> Their content is not identical, the one in charset has two extra
> lines:
>
> 'fi' '' 'fi' ''
> 'fl' '' 'fl' ''
>
> I'm not quite sure if both are really from the texlive installation
> -- perhaps one of them remained from a test I did to compare the
> location with the one from miktex, but I mention it anyway just in
> case. Also I would like to know which one is the correct one.

There are quite a lot of unicode.4hf versions generated from tex4ht-fonts-4hf.tex:

tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/win/1251/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/utf8/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/gbk/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/symbol/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/viscii/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/viqr/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/html-speech/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/charset/mnemonic/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/charset/native/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/cp1256/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/ooffice/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/gb2312/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/koi/8r/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/2/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/5/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/5/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/uni/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/6/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/6/html/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/7/charset/unicode.4hf
tex4ht.dir/texmf/tex4ht/ht-fonts/jsml/charset/unicode.4hf

If I understand it correctly, tex4ht searches the directories specified in the font sections of tex4ht.env. These sections can be selected with the `-c` option of the tex4ht command. The default section is

i~/tex4ht.dir/texmf/tex4ht/ht-fonts/iso8859/1/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/ascii/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/alias/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/mozilla/!
i~/tex4ht.dir/texmf/tex4ht/ht-fonts/unicode/!

I think the first unicode.4hf file that is found is used, but I am not sure about the mechanism used to locate it. By default,

ht-fonts/iso8859/1/html/charset/unicode.4hf

is used on my machine; with -cunihtf it is

ht-fonts/unicode/html/charset/unicode.4hf

I don't understand why ht-fonts/unicode/charset/unicode.4hf is not used instead.

Anyway, unicode.4hf files are important for output encodings other than utf-8, as they specify the character codes to which the Unicode entities in the DVI file should be transformed. With the -utf8 option, tex4ht writes its output in utf-8 and unicode.4hf is used only to output some characters as named entities, for example, or the "fi" ligature as a literal "fi".

It seems that the issue someone had on TeX.sx with MiKTeX [1] is that the wrong `unicode.4hf` file is used: tex4ht can't find the one in the `unicode` dir and uses the one in `iso8859/1` instead, which results in a file with a declared `utf-8` encoding but characters in `iso8859` encoding.

I am not sure what the issue is here. It seems that the .4hf files are in the correct places, but tex4ht can't find them.

The last thing I've found is that we don't generate the .4hf files from the sources at the moment; there is no target for tex4ht-fonts-4hf.tex in the Makefile. The file is also so huge that the compilation fails with "capacity exceeded". It can be compiled with LuaLaTeX, though. The generated files seem to be incorrect, as they include a copyright notice at the beginning and tex4ht complains about incorrect entries.

Best regards,
Michal

[1] http://tex.stackexchange.com/q/322164/2891
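The "first file found wins" idea from this message can be tried out with a short Python sketch. It is an assumption-laden model, not tex4ht's real lookup: the helper names `all_copies` and `first_found` are made up for the example, and the order inside one directory tree is simply alphabetical here, which does not reproduce the observation above that the copy in `html/charset` is picked in practice.

```python
from pathlib import Path


def all_copies(root, filename="unicode.4hf"):
    """List every copy of filename below root, as sorted POSIX-style relative paths."""
    return sorted(p.relative_to(root).as_posix() for p in Path(root).rglob(filename))


def first_found(search_roots, filename="unicode.4hf"):
    """Return the first copy found when the roots are searched in the given order.

    Assumption: roots are tried in order, as the entries of a font section
    would be; the order within one root (alphabetical) is an arbitrary choice.
    """
    for root in search_roots:
        hits = sorted(Path(root).rglob(filename))
        if hits:
            return hits[0]
    return None


if __name__ == "__main__":
    base = Path("tex4ht.dir/texmf/tex4ht/ht-fonts")  # path taken from the listing above
    if base.is_dir():
        print("\n".join(all_copies(base)))
        print(first_found([base / "iso8859" / "1", base / "unicode"]))
```

Swapping the order of the roots passed to `first_found` shows how a lookup that fails in the `unicode` directory could silently end up with the `iso8859/1` copy.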
[tex4ht] two versions of unicode.4ht
I found two versions of unicode.4hf in

\ht-fonts\iso8859\1

one in

D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\

the other in

D:\texlive\2016\texmf-dist\tex4ht\ht-fonts\iso8859\1\charset\uni\

Their content is not identical; the one in charset has two extra lines:

'fi' '' 'fi' ''
'fl' '' 'fl' ''

I'm not quite sure if both are really from the TeX Live installation -- perhaps one of them remained from a test I did to compare the location with the one from MiKTeX, but I mention it anyway just in case. Also, I would like to know which one is the correct one.

-- 
Ulrike Fischer
http://www.troubleshooting-tex.de/
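A comparison like the one described above can be reproduced with a few lines of Python. This is just one possible way to do it: the helper name `extra_lines` is hypothetical, and reading the files as latin-1 is an assumption chosen so that any byte sequence decodes without error.

```python
def extra_lines(path_a, path_b, encoding="latin-1"):
    """Return the lines that occur in file A but not in file B, in A's order."""
    with open(path_b, encoding=encoding) as fb:
        seen = {line.rstrip("\n") for line in fb}
    with open(path_a, encoding=encoding) as fa:
        return [line.rstrip("\n") for line in fa if line.rstrip("\n") not in seen]


if __name__ == "__main__":
    import sys

    # Usage: python compare_4hf.py charset/unicode.4hf charset/uni/unicode.4hf
    if len(sys.argv) == 3:
        for line in extra_lines(sys.argv[1], sys.argv[2]):
            print(line)
```

Running it once in each direction shows whether one file is a strict superset of the other or whether both have entries the other lacks.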