[Podofo-users] Parsing content streams

Craig Ringer Thu, 08 Nov 2007 02:08:41 -0800

Hi

Re our conversation on parsing content streams, I'm beginning to think 
that doing much more than building a list might be nuking a fly.


According to the PDF reference:

"PDF has no concept of an operand stack as PostScript has. In PDF, all 
of the operands needed by an operator must immediately precede that 
operator. Operators do not return results, and operands cannot be left 
over when an operator finishes execution."

I already knew that was how it usually worked out in practice, but I 
didn't realise it was a hard rule; I'd always assumed there was a stack.

That means that a tree representation of a content stream would be very 
boring and simple - just:
                              [root]
                                |
           --------------------------------------------------
           |           |           |               |
      [operator]   [operator]   [operator]       [...]
       |       |                   |
    [operand] [operand]         [operand]

... which might be just as well represented by something like a 
list/vector of pairs, where each pair describes an operator and an array 
of zero or more PdfVariant arguments.
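
As a rough sketch of that shape (this is not PoDoFo code: `Operand` is 
only a stand-in for PdfVariant, and every type and function name below 
is made up for illustration), the whole structure fits in a couple of 
type aliases:

```cpp
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Stand-in for PdfVariant (an assumption for this sketch): just enough
// alternatives to show the shape of the container.
using Operand = std::variant<long, double, std::string>;

// One operation: the operator keyword plus the operands that preceded it,
// kept in the order they appeared in the stream.
using Operation = std::pair<std::string, std::vector<Operand>>;

// A whole content stream is then a flat sequence of operations.
using OperationList = std::vector<Operation>;

// Hand-built list for the stream "q 2.0 w 10 20 m 100 200 l S Q".
OperationList exampleOperations() {
    return {
        {"q", {}},
        {"w", {Operand{2.0}}},
        {"m", {Operand{10L}, Operand{20L}}},
        {"l", {Operand{100L}, Operand{200L}}},
        {"S", {}},
        {"Q", {}},
    };
}
```

Since no operand can outlive its operator, nothing here needs to nest 
any deeper than one level.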

Stream operations would seem to be equally simple - accumulate variants 
until you hit a keyword, then return the keyword and an array of 
arguments to it.

I've put together a quick reader based on Dom's code that can be used to 
read a content stream one operator at a time, returning a pair containing 
the string representation of the operator and a vector of PdfVariant 
operands. There's also a simple function to accumulate the lot of them 
if you want to read a whole stream at once.
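
To illustrate the idea (this is not the reader itself, just a 
self-contained sketch), the accumulate-until-keyword loop might look 
roughly like the following. It assumes whitespace-separated tokens and 
treats only numbers and /Names as operands, keeping them as raw text 
rather than PdfVariant; strings, arrays, dictionaries and inline images 
are ignored entirely. The names `readOperation`, `readAllOperations` and 
`looksLikeOperand` are hypothetical.

```cpp
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Operator keyword plus its operands. Operands stay as raw token text in
// this sketch; a real reader would hold PdfVariant values instead.
using Operation = std::pair<std::string, std::vector<std::string>>;

// Simplifying assumption: a token is an operand if it starts like a
// number (digit, sign or '.') or like a name ('/'). Anything else is
// treated as an operator keyword. Tokens from operator>> are never empty.
bool looksLikeOperand(const std::string& token) {
    const char c = token.front();
    return c == '/' || c == '+' || c == '-' || c == '.' ||
           (c >= '0' && c <= '9');
}

// Accumulate operands until a keyword turns up, then return the keyword
// together with the operands collected so far. Returns nullopt at end of
// input; operands with no operator after them are dropped.
std::optional<Operation> readOperation(std::istream& in) {
    Operation op;
    std::string token;
    while (in >> token) {
        if (looksLikeOperand(token)) {
            op.second.push_back(token);
        } else {
            op.first = token;
            return op;
        }
    }
    return std::nullopt;
}

// Read operations until the input runs out and collect them all.
std::vector<Operation> readAllOperations(std::istream& in) {
    std::vector<Operation> ops;
    while (auto op = readOperation(in)) {
        ops.push_back(std::move(*op));
    }
    return ops;
}
```

Feeding it a std::istringstream holding "q 2 w 10 20 m 100 200 l S Q" 
yields six operations, the third being "m" with operands "10" and "20".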

It's probably not the fastest way to do things, but it provides 
something to play with.

--
Craig Ringer

