On 24Aug2018 17:55, Peter Otten <__pete...@web.de> wrote:
> Albert-Jan Roskam wrote:
>> I have Ghostscript files with a table of contents (toc) and I would like
>> to use this info to generate a human-readable toc. The problem is: I can't
>> get the (nested) hierarchy right.
>>
>> import re
>> toc = """\
>> [ /PageMode /UseOutlines
>> /Page 1
>> /View [/XYZ null null 0]
>> /DOCVIEW pdfmark
>> [ /Title (Title page)
>> /Page 1
>> /View [/XYZ null null 0]
>> /OUT pdfmark
>> [ /Title (Document information)
>> /Page 2
>> /View [/XYZ null null 0]
>> /OUT pdfmark
>> [...]
>>
>> What is the best approach to do this?
>
> The best approach is probably to use some tool/library that understands
> postscript.
Just on this point: I disagree. IIRC, there's no such thing as '/Title' etc. in
PostScript itself; these will all be PostScript functions defined by whatever
made the document. So a generic tool won't have any way to extract semantics
like titles from a document.
The OP presumably has the specific output of a particular tool with this nice,
well-structured PostScript, so they need to write their own special-purpose
parser.
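As a rough illustration of what such a special-purpose parser might look like,
here is a minimal Python sketch. It rests on assumptions that are not in the
quoted excerpt: that nesting is expressed with a /Count key giving the number
of immediate children (as the pdfmark convention describes), that titles
contain no parentheses or the word "pdfmark", and that every entry ends in
"/OUT pdfmark". The sample data and the names parse_outline, nest and
format_toc are made up for the example.

```python
import re

# Made-up sample in the same shape as the quoted excerpt.
# ASSUMPTION: the /Count lines are not in the original excerpt.
SAMPLE = """\
[ /PageMode /UseOutlines
/Page 1
/View [/XYZ null null 0]
/DOCVIEW pdfmark
[ /Title (Title page)
/Page 1
/View [/XYZ null null 0]
/OUT pdfmark
[ /Title (Chapter 1)
/Count 2
/Page 3
/View [/XYZ null null 0]
/OUT pdfmark
[ /Title (Section 1.1)
/Page 3
/OUT pdfmark
[ /Title (Section 1.2)
/Page 5
/OUT pdfmark
[ /Title (Chapter 2)
/Page 7
/OUT pdfmark
"""


def parse_outline(text):
    """Return a flat list of (title, page, count) for each /OUT entry."""
    entries = []
    for chunk in text.split("pdfmark"):
        chunk = chunk.strip()
        if not chunk.endswith("/OUT"):
            continue  # skip /DOCVIEW and anything else that is not a bookmark
        title = re.search(r"/Title\s*\((.*?)\)", chunk, re.DOTALL)
        page = re.search(r"/Page\s+(\d+)", chunk)
        count = re.search(r"/Count\s+(-?\d+)", chunk)
        if title is None:
            continue
        entries.append((
            title.group(1),
            int(page.group(1)) if page else None,
            # A negative count is taken to mean "closed"; only the size matters here.
            abs(int(count.group(1))) if count else 0,
        ))
    return entries


def nest(entries):
    """Turn the flat list into (depth, title, page) using the child counts."""
    result = []
    remaining = []  # remaining[i]: children still expected at depth i + 1
    for title, page, count in entries:
        # Leave any level whose children have all been seen.
        while remaining and remaining[-1] == 0:
            remaining.pop()
        depth = len(remaining)
        if remaining:
            remaining[-1] -= 1  # this entry is one of the parent's children
        result.append((depth, title, page))
        if count:
            remaining.append(count)  # the next `count` entries are children
    return result


def format_toc(text):
    """Render an indented, human-readable table of contents."""
    lines = []
    for depth, title, page in nest(parse_outline(text)):
        lines.append(f"{'    ' * depth}{title} (page {page})")
    return "\n".join(lines)


print(format_toc(SAMPLE))
```

The stack of counters is what recovers the hierarchy: each entry with a
nonzero /Count opens a new level, and the level closes once that many children
have been consumed. If the real files mark nesting some other way, only nest()
would need to change.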
Cheers,
Cameron Simpson <c...@cskk.id.au>