Update of bug #66653 (group groff):
Status: None => Need Info
_______________________________________________________
Follow-up Comment #21:
Hi Branden,
You are so close. :-)
The dummy node no longer truncates the asciified text, and Unicode characters
are converted to \[uXXXX] form. The only issue left is that composite input
characters (those that NFD decomposes) are not handled correctly. If you send
the asciified diversion to the document stream, the characters are completely
missing and a strange error is reported; if it is sent via device control, only
the base character is converted to \[uXXXX] (not \[uXXXX_XXXX]).
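To make the expected \[uXXXX_XXXX] form concrete, here is a minimal Python sketch of the naming I would expect for a composite character. It is only an illustration built on the standard unicodedata module, not how troff does it internally, and the helper names groff_name and asciify_text are made up for this example.

```python
import unicodedata

# Hypothetical helper: build a groff-style special character name from
# the NFD decomposition of a single character.
def groff_name(ch):
    decomposed = unicodedata.normalize("NFD", ch)
    codes = "_".join(f"{ord(c):04X}" for c in decomposed)
    return "\\[u" + codes + "]"

# Hypothetical helper: leave ASCII alone, replace everything else.
def asciify_text(text):
    return "".join(c if ord(c) < 128 else groff_name(c) for c in text)

# U+01CD and U+01CE are the two composite characters in the test string.
print(asciify_text("Sh\u01CDka Kh\u01CEn"))
# prints: Sh\[u0041_030C]ka Kh\[u0061_030C]n
```

The two names produced match the u0041_030C and u0061_030C entries in the Tinos-Regular font file quoted below.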
Using this:-
.ft Tinos-Regular
.sp 1i0
.ds khant "time to meet the ShǍka Khǎn
.box DIV
\*[khant]
.br
.box
.DIV
.asciify DIV
.DIV
.pdfbookmark 1 "\*[DIV]
.pdfbookmark 1 "\*[khant]
.\" in Tinos-Regular
.\" u0041_030C 722,859 2 814 uni01CD
.\" u0061_030C 443,696,10 2 815 uni01CE
Two errors are shown:-
[derij@pip build (master)]$ test-groff -Tpdf -F font -k G.trf -Z | egrep "^x|C"
troff:G.trf:10: warning: special character '\A' not defined
troff:G.trf:10: warning: special character '\a' not defined
x T pdf
x res 72000 1 1
x init
x X ps:exec [/Dest /pdf:bm1 /View [/FitH -67000 u] /DEST pdfmark
x X ps:exec [/Dest /pdf:bm1 /Title (time to meet the Sh\[u0041]ka Kh\[u0061]n) /Level 1 /OUT pdfmark
x font 40 Tinos-Regular
Cu0041_030C
Cu0061_030C
x X ps:exec [/Dest /pdf:bm2 /View [/FitH -91000 u] /DEST pdfmark
x X ps:exec [/Dest /pdf:bm2 /Title (time to meet the Sh\[u01CD]ka Kh\[u01CE]n) /Level 1 /OUT pdfmark
x trailer
x stop
I have also included a screenshot of the PDF produced. You can see the
missing characters in the document, and the same diversion used in the bookmark
has the base character only. The most interesting thing to note is that the
original string register \*[khant] retains the original input value when used
in a bookmark (I was super-pleased when you got that working).
As far as I know, the special character '\A' does not exist as a groff name; it
is not mentioned in groff_char. In Tinos the PostScript name is uni01CD, but
other fonts have the PostScript name as "Ahacek" (which is not in our glyph
list). I'm dealing with groff's problems with more modern fonts in my
forthcoming reply to bug #67244, which is taking an age to write as I keep
getting sidetracked into writing little utility programs to discover the
innards of TTF and OTF fonts.
If you want to experiment with a font you have installed, try:-
.ft U-TR
.sp 1i0
.ds khant "time to meet the Şhaka Khan
at the top of the file, and include "-Kutf8" on the command line.
This remaining problem with composite characters is not a show stopper; it
will only rear its ugly head when we add PDF features to ms (me?) in 1.25.
Cheers
Deri
_______________________________________________________
Reply to this item at:
<https://savannah.gnu.org/bugs/?66653>
_______________________________________________
Message sent via Savannah
https://savannah.gnu.org/
