[ 
https://issues.apache.org/jira/browse/PDFBOX-183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin LeFebvre updated PDFBOX-183:
-----------------------------------

    Attachment: ConflictList.diff

The problem with the file referenced in this bug report is that the PDF 
contains multiple entries for certain objects. Previously, when PDFBox parsed 
the file and encountered a duplicate object, the new object completely 
replaced the old one. Although the spec says that files should not have this 
issue, most readers render the file without trouble because they use the 
information in the xref table to decide which objects to use. To fix this, we 
added code that handles these conflicts. During parsing, if we see a second 
instance of an object, we put that instance, its key, and its byte offset into 
a new conflictList of ConflictObjs. Once the rest of the file has been parsed 
and the xref information is available, we compare the byte offsets against it 
to decide whether each conflicting object should replace the one we saw 
originally. 
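
For readers who have not opened the attached diff, the approach can be 
sketched roughly as follows. This is a minimal, self-contained illustration 
and not the code in ConflictList.diff: only the names conflictList and 
ConflictObj come from the description above. The ConflictSketch class, its 
methods, the "objNum genNum" string key, and the plain Map standing in for the 
xref table are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the parser's object pool; not PDFBox API.
class ConflictSketch {

    // A duplicate object seen while parsing: its key, byte offset, and value.
    static final class ConflictObj {
        final String key;    // assumed form: "objectNumber generation"
        final long offset;   // byte offset where this instance starts
        final Object value;  // the parsed object itself

        ConflictObj(String key, long offset, Object value) {
            this.key = key;
            this.offset = offset;
            this.value = value;
        }
    }

    private final Map<String, Object> objects = new HashMap<>();
    private final List<ConflictObj> conflictList = new ArrayList<>();

    // Called for each object met during parsing. The first instance is
    // stored; later instances are deferred instead of overwriting it.
    void addObject(String key, long offset, Object value) {
        if (objects.containsKey(key)) {
            conflictList.add(new ConflictObj(key, offset, value));
        } else {
            objects.put(key, value);
        }
    }

    // Called after parsing, once the xref offsets are known. A deferred
    // instance replaces the original only if the xref points at its offset.
    void resolveConflicts(Map<String, Long> xrefOffsets) {
        for (ConflictObj c : conflictList) {
            Long expected = xrefOffsets.get(c.key);
            if (expected != null && expected == c.offset) {
                objects.put(c.key, c.value);
            }
        }
        conflictList.clear();
    }

    Object get(String key) {
        return objects.get(key);
    }
}
```

In this sketch, a duplicate whose offset does not match the xref entry is 
simply dropped, and an object with no xref entry keeps its first instance. 
Whether the actual patch makes the same choices is not stated above.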

> java.lang.NullPointerException in highlighter.generateXMLHig
> ------------------------------------------------------------
>
>                 Key: PDFBOX-183
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-183
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>            Priority: Minor
>         Attachments: ConflictList.diff
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1517476
> Originally submitted by nobody on 2006-07-05 05:11.
> Sample code :
> try
>    {
>     URL pdfURL = new URL( mPdfUrl );
>     
>     doc = PDDocument.load( pdfURL.openStream() );
>     PDFHighlighter highlighter = new PDFHighlighter();
>     highlighter.generateXMLHighlight( doc,
> mHighlightWords.split( " " ), fiw );
>      
>    }
>    catch (Exception e)
> Using ADLIB converted PDF ( see attach file )
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552832&aid=1517476&file_id=183934
> cv1.pdf (application/pdf), 109109 bytes
> pdf

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.