Hi,

On Wednesday 07 November 2007, Sargrad, Dave wrote:
> "It would be absolutely wonderful if as part of your work you ended up
> writing even a rudimentary content stream parser that was self contained
> enough to be included in PoDoFo."
>
> Great. I would love to contribute a content stream parser. I don't quite
> know what this means yet, but perhaps we can talk about the proper API
> (from your perspective).
>
> With your mentoring I may be able to contribute a component to the podofo
> project that you find useful.

I fully agree with Craig here. I would love to see your work ending up in 
PoDoFo. 

>  "Looking at the attached PDF, I think it's safe to say you can handle a
> very restricted subset of PDF and still be OK. I begin to see why you're
> doing it the way you are. A content stream parser for that shouldn't be too
> hard to write at all by the looks."
>
> This was my impression/hope as well. I want to start simple, and yet put
> myself on a road to increasingly understand/use pdf.

I wonder whether it is even necessary to write your own content stream parser.
In my opinion PdfTokenizer::GetNextToken() could be a good start. It should
handle comments and keywords in PDF content streams quite well.

Maybe it makes sense to wrap PdfTokenizer into a PdfContentStreamTokenizer to 
make content stream parsing easier. But I think you will figure out the way 
to go quite fast.
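
To make the wrapper idea a bit more concrete, here is a rough sketch of what
such a class could look like. It is only an illustration under stated
assumptions: ContentStreamTokenizer, ClassifiedToken, TokenKind and NextTokenFn
are hypothetical names, and the callback is a stand-in for
PdfTokenizer::GetNextToken(), whose real signature is not assumed here. The
classification rule is deliberately crude (it looks only at the first
character of each token).

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <utility>

// Hypothetical token categories for this sketch; these are not PoDoFo types.
enum class TokenKind { Operand, Keyword };

struct ClassifiedToken {
    TokenKind kind;
    std::string text;
};

// Stand-in for the underlying tokenizer: fills `out` with the next raw token
// and returns false once the input is exhausted.
using NextTokenFn = std::function<bool(std::string&)>;

class ContentStreamTokenizer {
public:
    explicit ContentStreamTokenizer(NextTokenFn next) : next_(std::move(next)) {
        assert(next_ && "a token source is required");
    }

    // Reads one raw token and tags it as operand data or as a keyword.
    bool Next(ClassifiedToken& out) {
        std::string tok;
        if (!next_(tok) || tok.empty()) return false;
        out.text = tok;
        out.kind = LooksLikeOperand(tok) ? TokenKind::Operand : TokenKind::Keyword;
        return true;
    }

private:
    // Crude rule: numbers, names, strings, arrays, dictionaries and the
    // literals true/false/null count as operand data; anything else is
    // treated as a keyword (operator).
    static bool LooksLikeOperand(const std::string& tok) {
        const unsigned char c = static_cast<unsigned char>(tok[0]);
        if (std::isdigit(c) || c == '+' || c == '-' || c == '.') return true;
        if (c == '/' || c == '(' || c == '<' || c == '[') return true;
        if (c == ')' || c == '>' || c == ']') return true;
        return tok == "true" || tok == "false" || tok == "null";
    }

    NextTokenFn next_;
};
```

A caller would construct it with whatever produces raw tokens and then call
Next() in a loop until it returns false.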

>
> Now that you've seen the pdf files that I'm currently interested in and
> understand that I'm willing to put in the effort to "do this right", and to
> contribute something back to the community, please help me to understand
> the appropriate initial characteristics (API) of the "content stream
> parser".
Sure, if you have questions we will all be glad to help!

A PDF content stream consists of PDF data types (i.e. everything that can be
represented as a PdfVariant) and keywords. Keywords are operators like 'm'
or 'l' (moveto and lineto in PostScript terms). Usually a keyword,
or "operator", is preceded by its operands, which are typically numbers,
strings or names; content streams use postfix notation. Section "3.7.1
Content Streams" in the PDF Reference should give you an initial overview.
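
As a small illustration of that structure: in a content stream the operands
come first and the operator closes the group, so "100 200 m" means moveto
(100, 200). The sketch below collects operands until an operator shows up. It
is a toy, not PoDoFo code: Operation, IsSimpleOperand and GroupOperations are
made-up names, tokens are assumed to be separated by whitespace, and only
numbers and names are recognised as operands (no strings, arrays or
dictionaries).

```cpp
#include <cassert>
#include <cctype>
#include <sstream>
#include <string>
#include <vector>

// One operator together with the operands that preceded it.
struct Operation {
    std::string op;
    std::vector<std::string> operands;
};

// Treats a token as an operand if it starts like a number or a name.
// Strings, arrays and dictionaries are deliberately left out of this sketch.
static bool IsSimpleOperand(const std::string& tok) {
    assert(!tok.empty());
    const unsigned char c = static_cast<unsigned char>(tok[0]);
    return std::isdigit(c) || c == '-' || c == '+' || c == '.' || c == '/';
}

// Splits on whitespace and groups each operator with its pending operands.
// Operands that are never followed by an operator are dropped.
std::vector<Operation> GroupOperations(const std::string& content) {
    std::vector<Operation> result;
    std::vector<std::string> pending;
    std::istringstream in(content);
    std::string tok;
    while (in >> tok) {
        if (IsSimpleOperand(tok)) {
            pending.push_back(tok);
        } else {
            result.push_back(Operation{tok, pending});
            pending.clear();
        }
    }
    return result;
}
```

For the input "100 200 m 300 400 l S" this yields three operations: 'm' with
operands 100 and 200, 'l' with operands 300 and 400, and 'S' with none.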

best regards,
        Dom

>
>
> Thanks for all the help!
> Dave
>
> -----Original Message-----
> From: Craig Ringer [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, November 07, 2007 1:48 PM
> To: Sargrad, Dave
> Cc: [email protected]
> Subject: Re: [Podofo-users] trying to build podofo using visual studio
>
> Sargrad, Dave wrote:
> > I understand that podofo is not a renderer.. I want it specifically
> > because it does not render! We have some very specific rendering
> > requirements.
>
> Glad to hear it. I just wanted to make sure you weren't under any
> misapprehensions.
>
> > My first-use intent (relative to podofo) is to build a document viewer
> > that leverages the ogre renderer, and to use podofo to parse/interpret
> > the objects contained within the pdf file.
>
> That sounds fairly reasonable. Your main issue will be that at present
> PoDoFo won't help you parse content streams. You will need to parse content
> streams as part of rendering a PDF.
>
> It would be absolutely wonderful if as part of your work you ended up
> writing even a rudimentary content stream parser that was self contained
> enough to be included in PoDoFo. This would probably be wise anyway,
> rather than mixing your basic content stream parsing with rendering code. 
> I'd be happy to lend my limited time and more limited skill to help with
> that - and I'm one of those freaks who actually *likes* cleaning up and
> integrating code, fiddling with build systems, etc.
>
> Looking at your PDF after uncompressing it, you might be in an ideal
> situation for a simple content stream parser to be very handy.
>
> > I took a quick look at libpoppler and I believe it makes very specific
> > rendering assumptions, so my approach was to use podofo to interpret
> > the contents of the pdf file, and I would tackle all the higher level
> > rendering.
>
> Fair enough.
>
> > In my case rather than libpoppler, I will be using Ogre as the
> > rendering engine. I can guarantee that I don't yet fully appreciate
> > the work involved to get to the point where I can render the objects
> > that PoDoFo draws from any particular PDF file. I believe there is
> > quite a bit of work, and yet am hoping that I can prototype something
> > in fairly short order.
>
> I sincerely hope so. Rendering PDF is very complicated, and that's if you
> assume most PDF is correct (which is far from the truth). The number of
> versions and options doesn't help. That said, it's hardly impossible.
>
> If you can limit yourself to a certain subset of PDF, at least initially,
> things might be much easier. Some things are worth getting right from the
> start IMO (ICCBased & CMYK colour, for example) but I wouldn't be too
> surprised if you were able to get away without handling things like layer
> blend modes initially. Then again, for all I know they're easy with your
> rendering backend.
>
> Looking at the attached PDF, I think it's safe to say you can handle a very
> restricted subset of PDF and still be OK. I begin to see why you're doing
> it the way you are. A content stream parser for that shouldn't be too hard
> to write at all by the looks.
>
> %PDF-1.4
> %»°°« Layton Graphics, Inc. pdf++ library
> 1 0 obj
> << /CreationData (D:20070928141754) /Producer (dgn2pdf)
>
> I presume dgn is a custom format in use in your industry, and that you
> don't have access to the original files the PDFs were created from? (Just
> curious...).
>
> Speaking of "nice to have" ... I wish our stream interfaces were usable as
> iostream buffers so they could be processed with C++ stream operations.
> Unfortunately, implementing new iostreams interfaces is rather far from fun.
>
> --
> Craig Ringer
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Podofo-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/podofo-users



-- 
**********************************************************************
Dominik Seichter - [EMAIL PROTECTED]
KRename  - http://www.krename.net  - Powerful batch renamer for KDE
KBarcode - http://www.kbarcode.net - Barcode and label printing
PoDoFo - http://podofo.sf.net - PDF generation and parsing library
SchafKopf - http://schafkopf.berlios.de - Schafkopf, a card game,  for KDE
Alan - http://alan.sf.net - A Turing Machine in Java
**********************************************************************
