>> 2. PoDoFo will fail to parse a PDF file if there is a syntax error
>> in any dictionary.
>>     For example, in the following data there is a string in ( ) that
>> contains two unbalanced '('. PoDoFo will not parse the string correctly
>> and keeps reading until the end of the file.
>>     There is no tolerance for bad syntax in the library. I think these
>> bugs should be fixed.
>> "... << ... /Title (....(..(hello world....) >> endobj ... "
>>
> Well, you are right. PoDoFo currently accepts only valid PDF files. Only
> balanced parentheses may appear unescaped in a PDF string, so your file is
> broken.
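
To illustrate the rule, here is a minimal sketch of how a strict scanner
finds the end of a literal string. It is not PoDoFo's actual code;
find_literal_string_end is a hypothetical helper written for this example.
It counts nesting depth, skips backslash escapes, and reports failure if the
input ends before the parentheses balance, which is what happens with the
/Title string quoted above.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper for illustration only; not part of the PoDoFo API.
// data[start] must be the opening '(' of a PDF literal string.
// Returns the index one past the matching ')', or std::nullopt if the
// input ends before the parentheses balance.
std::optional<std::size_t> find_literal_string_end(std::string_view data,
                                                   std::size_t start = 0)
{
    if (start >= data.size() || data[start] != '(')
        return std::nullopt;

    int depth = 0;
    for (std::size_t i = start; i < data.size(); ++i) {
        const char c = data[i];
        if (c == '\\') {
            ++i;                 // skip the escaped character, e.g. "\(" or "\)"
        } else if (c == '(') {
            ++depth;             // unescaped '(' opens a nested level
        } else if (c == ')') {
            if (--depth == 0)
                return i + 1;    // matching close of the outermost '('
        }
    }
    return std::nullopt;         // ran off the end: unbalanced string
}
```

With the broken example, three '(' are opened and only one ')' follows, so
the depth never returns to zero and the scan runs to the end of the data.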

Personally I really like PoDoFo's strict parsing. It frustrates me
endlessly that Acrobat (even with add-ons like PitStop, because the
issue is really in the Adobe PDF library) parses PDF so loosely and
forgivingly; it helps further the creation of buggy apps ("it works with
Acrobat, so it must be fine") and increases the chances that something
will look OK in Acrobat, but RIP incorrectly or fail to RIP on the
platesetter.

> I currently have no time to add a fault-tolerant parser to PoDoFo, but I
> will of course accept code contributions. I think the fault-tolerant
> parser should be optional, so that PoDoFo has two parsers (or a property on
> PdfParser) to specify either that only valid PDFs are accepted or that
> PoDoFo should be more fault tolerant.

Yep. Fault tolerance is quite hard, and would be rather time consuming.
It's easy enough to discard any indirect object with a parse error
entirely, scanning until we see the next "n n obj" (and hoping there
isn't any within the broken stream/dictionary!). However, the result
isn't likely to be particularly useful, and what's written out again
will be entirely missing the broken object.
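
A rough sketch of that "skip to the next object" recovery, purely as an
illustration: find_next_object_header is a hypothetical helper, not an
existing PoDoFo function, and it assumes the whole file is already in memory
as a std::string. It just searches forward for the next "n n obj" header.

```cpp
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical helper for illustration only; not part of the PoDoFo API.
// Returns the offset of the next "<objnum> <gennum> obj" header at or
// after `from`, or std::nullopt if there is none.
std::optional<std::size_t> find_next_object_header(const std::string& data,
                                                   std::size_t from)
{
    static const std::regex header(R"(\b\d+\s+\d+\s+obj\b)");
    if (from > data.size())
        return std::nullopt;

    // When starting mid-buffer, let \b look at the preceding character so
    // we do not match in the middle of a number such as "12 0 obj".
    const auto flags = from > 0 ? std::regex_constants::match_prev_avail
                                : std::regex_constants::match_default;

    std::smatch m;
    const auto first = data.begin() + static_cast<std::ptrdiff_t>(from);
    if (!std::regex_search(first, data.end(), m, header, flags))
        return std::nullopt;
    return from + static_cast<std::size_t>(m.position(0));
}
```

A caller would invoke this with the offset where the parse error occurred
and resume parsing at the returned position. It has exactly the weakness
mentioned above: anything that looks like "n n obj" inside the broken
stream or dictionary will be mistaken for a real object header.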

More sophisticated fault tolerance that handles things like
unescaped/unbalanced parens, nested dictionaries with too few closing >>
delimiters, etc., is pretty hard and involves a lot of guesswork.

If you feel like contributing a fault-tolerant parser I'm sure it'd be
welcome if it was well written, but to my mind it's a very low priority.
Not that I'm doing much work on the library at the moment anyway...

>> We would appreciate it if your team could fix these bugs in the new
>> version of the PoDoFo code. Thanks very much!

PoDoFo is an open source project, so you are also free to fix bugs yourself and
submit patches to the mailing list. If any of these issues are really
high priority ones you might want to consider that; after all, currently
the people doing the work are Dom and Pierre, both of whom are doing it
in their spare time.

If you do submit proposed bug fixes, make sure to include a clear
explanation of what they fix, a test case, and if applicable a reference
to the appropriate part of the Adobe PDF specification.

--
Craig Ringer
