Werner LEMBERG <[EMAIL PROTECTED]> writes:
/../
Thank you for the answers. I managed to find my way around better
with the help of my rather old edition of Japanese Information
Processing, the Unicode mapping files on ftp.unicode.org, and the
CJK documentation and source code.
>> Question 3: in essence, what I am now faced with is
>> updating/expanding the unicode subfonts with the help of the files
>> you advised me to look at.
>
> Not the subfonts itself, but the entries in the .fdx files so that the
> lines refer to the correct subfont and glyph index positions.
OK, here is what I have done up to now:
1. I use c42min.fd and JISdnp.enc to work out what the original JIS
point was from the DNP symbol subfont glyph position. I know the
JIS is encoded in EUC (or DNP) not in JIS encoding, but I don't
know exactly which EUC. BTW, EUC is also known as UJIS, that is,
Unixized JIS.
2. I still could not work out exactly which EUC is used, so I assumed
the complete two-byte format EUC for now, based on the fact that
the first byte values seem to match the example you gave me (A1A1,
assumed to be unbreaking space). This form is apparently not
commonly encountered, according to my old reference. How times have
changed :-)
3. Now, I realize that the conversion that offsets the second byte by
160 and uses A1 as the first byte is in fact the KUTEN <-> EUC
conversion! That is, whatever the DNP coding might be elsewhere
(I did not check), it appears to be the KUTEN index encoding for at
least the sy subfont. That is very helpful for lookup in tables!
Subtracting 160 from each byte of A1A1 gives 01 01, which is row 1
(KU) and cell 01 (TEN) in the KUTEN index system. Going from KUTEN
to JIS, on the other hand, means adding decimal 32 to each byte.
4. Thus armed, I wrote a set of shell scripts that take the c42min.fd
file as input and output the JIS, KUTEN, and EUC code points in
decimal and hexadecimal, along with the mapping to UTF-8. The
JIS0208.txt file from ftp.unicode.org has a JIS column but no EUC
column. Tips on how to do this more easily would be much appreciated.
5. Next, I set the Unicode subfont name using the information in
the Unicode.sfd file. The subfont name is in hexadecimal; however,
the second byte of the Unicode code point needs to be converted to
decimal for the .fdx file, so:
%% attempt to make a unicode c70min.fdx file
\def\fileversion{4.6.0}
\def\filedate{2005/08/11}
\ProvidesFile{c70min.fdx}[\filedate\space\fileversion]
\CJKvdef{rotate}{}
\CJKvdef{offset}{.5em}
%% key format: series/shape/subfont name (HEX)/glyph position (DEC)
\CJKvdef{m/n/30/1}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{1}\hss}}
\CJKvdef{m/n/30/2}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{2}\hss}}
\CJKvdef{m/n/ff/12}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{12}\hss}}
\CJKvdef{m/n/ff/14}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{14}\hss}}
\CJKvdef{m/n/30/252}{\CJKsymbolsimple{252}}
\CJKvdef{m/n/30/28}{\CJKsymbolsimple{28}}
\CJKvdef{m/n/20/38}{\CJKsymbolsimple{38}}
\CJKvdef{m/n/20/37}{\CJKsymbolsimple{37}}
\CJKvdef{m/n/ff/8}{\CJKsymbolsimple{8}}
\CJKvdef{m/n/ff/9}{\CJKsymbolsimple{9}}
\CJKvdef{m/n/30/20}{\CJKsymbolsimple{20}}
\CJKvdef{m/n/30/21}{\CJKsymbolsimple{21}}
\CJKvdef{m/n/ff/59}{\CJKsymbolsimple{59}}
\CJKvdef{m/n/ff/61}{\CJKsymbolsimple{61}}
\CJKvdef{m/n/ff/91}{\CJKsymbolsimple{91}}
\CJKvdef{m/n/ff/93}{\CJKsymbolsimple{93}}
\CJKvdef{m/n/30/8}{\CJKsymbolsimple{8}}
\CJKvdef{m/n/30/9}{\CJKsymbolsimple{9}}
\CJKvdef{m/n/30/10}{\CJKsymbolsimple{10}}
\CJKvdef{m/n/30/11}{\CJKsymbolsimple{11}}
\CJKvdef{m/n/30/12}{\CJKsymbolsimple{12}}
\CJKvdef{m/n/30/13}{\CJKsymbolsimple{13}}
\CJKvdef{m/n/30/14}{\CJKsymbolsimple{14}}
\CJKvdef{m/n/30/15}{\CJKsymbolsimple{15}}
\CJKvlet{bx/n/30/1} {m/n/30/1}
\CJKvlet{bx/n/30/2} {m/n/30/2}
\CJKvlet{bx/n/ff/12} {m/n/ff/12}
\CJKvlet{bx/n/ff/14} {m/n/ff/14}
\CJKvlet{bx/n/30/252}{m/n/30/252}
\CJKvlet{bx/n/30/28} {m/n/30/28}
\CJKvlet{bx/n/20/38} {m/n/20/38}
\CJKvlet{bx/n/20/37} {m/n/20/37}
\CJKvlet{bx/n/ff/8} {m/n/ff/8}
\CJKvlet{bx/n/ff/9} {m/n/ff/9}
\CJKvlet{bx/n/30/20} {m/n/30/20}
\CJKvlet{bx/n/30/21} {m/n/30/21}
\CJKvlet{bx/n/ff/59} {m/n/ff/59}
\CJKvlet{bx/n/ff/61} {m/n/ff/61}
\CJKvlet{bx/n/ff/91} {m/n/ff/91}
\CJKvlet{bx/n/ff/93} {m/n/ff/93}
\CJKvlet{bx/n/30/8} {m/n/30/8}
\CJKvlet{bx/n/30/9} {m/n/30/9}
\CJKvlet{bx/n/30/10} {m/n/30/10}
\CJKvlet{bx/n/30/11} {m/n/30/11}
\CJKvlet{bx/n/30/12} {m/n/30/12}
\CJKvlet{bx/n/30/13} {m/n/30/13}
\CJKvlet{bx/n/30/14} {m/n/30/14}
\CJKvlet{bx/n/30/15} {m/n/30/15}
%% original contents of the c42min.fdx file
%% \CJKvdef{m/n/sy/2}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{2}\hss}}
%% \CJKvdef{m/n/sy/3}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{3}\hss}}
%% \CJKvdef{m/n/sy/4}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{4}\hss}}
%% \CJKvdef{m/n/sy/5}{\raise .55em \hbox to 1em {\kern -.6em \CJKsymbol{5}\hss}}
%% \CJKvdef{m/n/sy/28}{\CJKsymbolsimple{28}}
%% \CJKvdef{m/n/sy/33}{\CJKsymbolsimple{33}}
%% \CJKvdef{m/n/sy/36}{\CJKsymbolsimple{36}}
%% \CJKvdef{m/n/sy/37}{\CJKsymbolsimple{37}}
%% \CJKvdef{m/n/sy/42}{\CJKsymbolsimple{42}}
%% \CJKvdef{m/n/sy/43}{\CJKsymbolsimple{43}}
%% \CJKvdef{m/n/sy/44}{\CJKsymbolsimple{44}}
%% \CJKvdef{m/n/sy/45}{\CJKsymbolsimple{45}}
%% \CJKvdef{m/n/sy/46}{\CJKsymbolsimple{46}}
%% \CJKvdef{m/n/sy/47}{\CJKsymbolsimple{47}}
%% \CJKvdef{m/n/sy/48}{\CJKsymbolsimple{48}}
%% \CJKvdef{m/n/sy/49}{\CJKsymbolsimple{49}}
%% \CJKvdef{m/n/sy/50}{\CJKsymbolsimple{50}}
%% \CJKvdef{m/n/sy/51}{\CJKsymbolsimple{51}}
%% \CJKvdef{m/n/sy/52}{\CJKsymbolsimple{52}}
%% \CJKvdef{m/n/sy/53}{\CJKsymbolsimple{53}}
%% \CJKvdef{m/n/sy/54}{\CJKsymbolsimple{54}}
%% \CJKvdef{m/n/sy/55}{\CJKsymbolsimple{55}}
%% \CJKvdef{m/n/sy/56}{\CJKsymbolsimple{56}}
%% \CJKvdef{m/n/sy/57}{\CJKsymbolsimple{57}}
%% \CJKvlet{bx/n/sy/2}{m/n/sy/2}
%% \CJKvlet{bx/n/sy/3}{m/n/sy/3}
%% \CJKvlet{bx/n/sy/4}{m/n/sy/4}
%% \CJKvlet{bx/n/sy/5}{m/n/sy/5}
%% \CJKvlet{bx/n/sy/28}{m/n/sy/28}
%% \CJKvlet{bx/n/sy/33}{m/n/sy/33}
%% \CJKvlet{bx/n/sy/36}{m/n/sy/36}
%% \CJKvlet{bx/n/sy/37}{m/n/sy/37}
%% \CJKvlet{bx/n/sy/42}{m/n/sy/42}
%% \CJKvlet{bx/n/sy/43}{m/n/sy/43}
%% \CJKvlet{bx/n/sy/44}{m/n/sy/44}
%% \CJKvlet{bx/n/sy/45}{m/n/sy/45}
%% \CJKvlet{bx/n/sy/46}{m/n/sy/46}
%% \CJKvlet{bx/n/sy/47}{m/n/sy/47}
%% \CJKvlet{bx/n/sy/48}{m/n/sy/48}
%% \CJKvlet{bx/n/sy/49}{m/n/sy/49}
%% \CJKvlet{bx/n/sy/50}{m/n/sy/50}
%% \CJKvlet{bx/n/sy/51}{m/n/sy/51}
%% \CJKvlet{bx/n/sy/52}{m/n/sy/52}
%% \CJKvlet{bx/n/sy/53}{m/n/sy/53}
%% \CJKvlet{bx/n/sy/54}{m/n/sy/54}
%% \CJKvlet{bx/n/sy/55}{m/n/sy/55}
%% \CJKvlet{bx/n/sy/56}{m/n/sy/56}
%% \CJKvlet{bx/n/sy/57}{m/n/sy/57}
\endinput
6. I tested this using the long hyphen (character 30/252), and it
works, so I am convinced from a practical standpoint. However,
looking at CJKvert.sty I am confused, since \CJKsymbolsimple
takes only one argument, and I do not understand how it
distinguishes between subfonts from line to line. I don't see
anything about this in the style file, and from makeuniwada.pl I
can only tell that all the Unicode subfonts are already happily
created. Any comments?
7. If the above is right, it can be added to the gothic and maru .fdx
files as well, right?
8. Next, I would like to help in creating support for half-width
katakana in UTF-8 encoding too. Is this at all feasible at present?
Any advice much appreciated.
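
For what it is worth, the arithmetic in steps 3 to 5 can be sketched in
Python roughly as follows. This is only a sketch under my own
assumptions: the function names are made up for illustration, the input
is taken to be two-byte EUC, and each data line of JIS0208.TXT is
assumed to hold three hex columns (Shift-JIS, JIS, Unicode) followed by
a # comment.

```python
def euc_to_kuten(euc):
    """Split a two-byte EUC code such as 0xA1A1 into (ku, ten).

    Each byte is reduced by 160 (0xA0), as described in step 3.
    """
    hi, lo = euc >> 8, euc & 0xFF
    return hi - 0xA0, lo - 0xA0


def kuten_to_jis(ku, ten):
    """Build the two-byte JIS code by adding 32 (0x20) to each index."""
    return ((ku + 0x20) << 8) | (ten + 0x20)


def parse_jis0208_line(line):
    """Return (jis, unicode) for one data line of JIS0208.TXT.

    Comment-only and blank lines give None. The column layout
    (Shift-JIS, JIS, Unicode) is an assumption about the file format.
    """
    data = line.split("#", 1)[0].strip()
    fields = data.split()
    if len(fields) < 3:
        return None
    _sjis, jis, uni = (int(f, 16) for f in fields[:3])
    return jis, uni


def fdx_key(codepoint, series="m", shape="n"):
    """Format a Unicode code point as an .fdx key.

    The high byte becomes the subfont name in hex, the low byte the
    glyph position in decimal, as in step 5.
    """
    subfont = format(codepoint >> 8, "02x")
    glyph = codepoint & 0xFF
    return f"{series}/{shape}/{subfont}/{glyph}"
```

With these, EUC A1BC gives KUTEN 1-28 and JIS 0x213C, and U+30FC gives
the key m/n/30/252 that I used for the long hyphen test.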
Regards,
Gernot
--
開心 - 好運気 (Kai Xin - Hao Yun Qi)
Be happy and joyful - and share that joy with others