I'm having a hard time with pdfe.getstring(). What am I supposed to do if it
returns a UTF-16 encoded string? How do I convert it to UTF-8?

Here is what I'm actually trying to do: I'm reading the /Contents of a Text
annotation with pdfe.getstring(). The returned string happens to be UTF-16
encoded. Now I want to use this string to create a pdf_annot whatsit. Of
course this doesn't work:

This is LuaTeX, Version 1.13.0 (TeX Live 2021/dev)
 restricted system commands enabled.
(./test.tex
! String contains an invalid utf-8 sequence.
l.17 }

I've attached an example to replicate this issue.

Andreas
\pdfvariable compresslevel = 0
\directlua{
  doc = pdfe.open('foo.pdf')
  page = pdfe.getpage(doc, 1)
  annot = page.Annots[1]

  contents = annot.Contents
  % contents = pdfe.getstring(annot, 'Contents', true)
  str = '/Subtype/Text/Contents(' .. contents .. ')'

  local annot = node.new(node.id('whatsit'), node.subtype('pdf_annot'))
  annot.width = tex.sp('100bp')
  annot.height = tex.sp('50bp')
  annot.depth = tex.sp('0bp')
  annot.data = str
  node.write(annot)
}
\bye

Attachment: foo.pdf
Description: Adobe PDF document
