On Friday, 23 June 2023 19:17:58 BST Robin Haberkorn wrote:
> Hello Peter,
>
> I am also now stumbling across Cyrillic-related issues with pdfmark. I am
> using ms for the time being. The bug also affects autogenerating link texts
> given via `.pdfhref L`.
> In the most simple case, preconv will turn your Cyrillic characters into
> escapes which are apparently not further interpreted by pdfmark (or
> anything that follows). I see text like "[u0421][u043F]..." in my outline.
>
> I believe that this is why you have .pdfmomclean in MOM. Do I understand
> correctly that this is supposed to turn the escapes back into Latin-1?
> This is presumably mainly the work of .asciify, which would be misnamed
> anyway. It does not work with Cyrillic at all, which doesn't surprise.
> That's also why you don't get "mojibake garbage" in the outline. None of the
> Cyrillic characters end up in intermediate output.
>
> It also explains why I previously had no problems with German Unicode
> characters (that was using MOM) - they can be converted back into Latin-1.
>
> Manually editing the ps:exec lines in the intermediate output and inserting
> Unicode characters there, does not produce the desired results, which is
> also not surprising.
>
> So it seems that the main problem really lies in grops and/or gropdf which
> should ideally work with the Unicode escapes produced by preconv.
> I am not sure if we would still need .pdfmomclean. But whatever useful stuff
> it currently does, it should probably be in pdfmark.tmac (and/or pdf.tmac?)
> instead.
>
> Best regards,
> Robin

Hi Robin,

The features you require are coming. This is an example of Russian with
bookmarks in Cyrillic. I'm afraid I don't know what it means and I have
forgotten where I got the text.

Cheers

Deri
Rus2.pdf
Description: Adobe PDF document