I'm having a hard time with pdfe.getstring(). What am I supposed to do if it returns a UTF-16 encoded string? How do I convert it to UTF-8?
Here is what I'm actually trying to do: I'm reading the /Contents of a Text-Annotation with pdfe.getstring(). The returned string happens to be UTF-16 encoded. Now I want to use this string to create a pdf_annot whatsit. Of course this doesn't work:

This is LuaTeX, Version 1.13.0 (TeX Live 2021/dev)
 restricted system commands enabled.
(./test.tex
! String contains an invalid utf-8 sequence.
l.17 }

I've attached an example to replicate this issue.

Andreas
\pdfvariable compresslevel = 0
\directlua{
doc = pdfe.open('foo.pdf')
page = pdfe.getpage(doc, 1)
annot = page.Annots[1]
contents = annot.Contents
% contents = pdfe.getstring(annot, 'Contents', true)
str = '/Subtype/Text/Contents(' .. contents .. ')'
local annot = node.new(node.id('whatsit'), node.subtype('pdf_annot'))
annot.width = tex.sp('100bp')
annot.height = tex.sp('50bp')
annot.depth = tex.sp('0bp')
annot.data = str
node.write(annot)
}
\bye
foo.pdf
Description: Adobe PDF document
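
For reference, the conversion asked about above can be sketched in plain Lua. This is only a sketch under assumptions: the string is taken to be UTF-16BE with a leading FE FF byte order mark (the usual form of a Unicode PDF text string), and the helper names utf8_encode, utf16be_to_utf8 and to_pdf_hex_string are made up for illustration and are not part of the pdfe library. The code uses backslashes, # and percent signs, so it is meant to live in a separate .lua file rather than inside \directlua.

```lua
-- Encode one Unicode code point as a UTF-8 byte sequence.
local function utf8_encode(cp)
  local floor, char = math.floor, string.char
  if cp < 0x80 then
    return char(cp)
  elseif cp < 0x800 then
    return char(0xC0 + floor(cp / 0x40), 0x80 + cp % 0x40)
  elseif cp < 0x10000 then
    return char(0xE0 + floor(cp / 0x1000),
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  else
    return char(0xF0 + floor(cp / 0x40000),
                0x80 + floor(cp / 0x1000) % 0x40,
                0x80 + floor(cp / 0x40) % 0x40,
                0x80 + cp % 0x40)
  end
end

-- Convert a UTF-16BE string that starts with the BOM FE FF to UTF-8.
-- Assumption: a string without that BOM is not UTF-16BE and is
-- returned unchanged. Unpaired surrogates are not validated.
local function utf16be_to_utf8(s)
  if s:sub(1, 2) ~= "\254\255" then
    return s
  end
  local out = {}
  local i, n = 3, #s
  while i + 1 <= n do
    local unit = s:byte(i) * 256 + s:byte(i + 1)
    i = i + 2
    -- A high surrogate followed by a low surrogate encodes one
    -- code point above U+FFFF.
    if unit >= 0xD800 and unit <= 0xDBFF and i + 1 <= n then
      local low = s:byte(i) * 256 + s:byte(i + 1)
      if low >= 0xDC00 and low <= 0xDFFF then
        unit = 0x10000 + (unit - 0xD800) * 0x400 + (low - 0xDC00)
        i = i + 2
      end
    end
    out[#out + 1] = utf8_encode(unit)
  end
  return table.concat(out)
end

-- Alternative: keep the original bytes and write them as a PDF hex
-- string, which is pure ASCII.
local function to_pdf_hex_string(s)
  local out = {}
  for i = 1, #s do
    out[i] = string.format("%02X", s:byte(i))
  end
  return "<" .. table.concat(out) .. ">"
end
```

With these helpers, the data string in the test file could be built either as '/Subtype/Text/Contents(' .. utf16be_to_utf8(contents) .. ')' or as '/Subtype/Text/Contents' .. to_pdf_hex_string(contents). The hex variant is probably the safer of the two: it contains only ASCII, so it should not trigger the invalid UTF-8 error, it needs no escaping of parentheses or backslashes, and it leaves the text in the UTF-16BE form that PDF viewers expect for Unicode text strings. Whether a viewer displays raw UTF-8 in /Contents correctly is a separate question that this sketch does not settle.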
