[ 
https://issues.apache.org/jira/browse/PDFBOX-161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-161.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.8.4
         Assignee: Andreas Lehmkühler

Tilman: I can't confirm that; the issue is still there, e.g. on page 54.

The problem is an unbalanced number of q/Q operators (save/restore graphics 
state) in the content stream. I fixed that in revisions 1557389 (trunk) and 
1557395 (1.8 branch). PDFBox now just skips the restore if there isn't 
anything to restore.

Maybe that self-healing mechanism should be optional so that preflight can 
detect such issues; see PDFBOX-1812 for details.
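
To illustrate the idea, here is a minimal sketch of a save/restore stack that 
skips an unbalanced restore, with an optional strict mode of the kind 
suggested above. This is not the code from the revisions mentioned; the class 
GraphicsStateStackSketch, its strict flag, and the use of a String as a 
stand-in for the graphics state are all hypothetical and exist only for this 
example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch only; these are not PDFBox classes. */
class GraphicsStateStackSketch {
    // Saved states; a String stands in for a real graphics state object.
    private final Deque<String> saved = new ArrayDeque<>();
    private final boolean strict;
    private String current = "initial";
    private int skippedRestores = 0;

    GraphicsStateStackSketch(boolean strict) {
        this.strict = strict;
    }

    void setCurrent(String state) { current = state; }
    String getCurrent() { return current; }
    int getSkippedRestores() { return skippedRestores; }

    /** Corresponds to the 'q' operator: remember the current state. */
    void save() {
        saved.push(current);
    }

    /** Corresponds to the 'Q' operator: go back to the last saved state. */
    void restore() {
        if (saved.isEmpty()) {
            if (strict) {
                // Assumed strict behaviour: report the unbalanced operator.
                throw new IllegalStateException("Q without matching q");
            }
            // Lenient behaviour: nothing to restore, so skip the operator.
            skippedRestores++;
            return;
        }
        current = saved.pop();
    }

    public static void main(String[] args) {
        GraphicsStateStackSketch stack = new GraphicsStateStackSketch(false);
        stack.save();
        stack.setCurrent("modified");
        stack.restore(); // balanced: state returns to "initial"
        stack.restore(); // unbalanced: skipped instead of throwing
        System.out.println(stack.getCurrent() + ", skipped=" + stack.getSkippedRestores());
    }
}
```

In this sketch a text extraction caller would construct the stack with 
strict set to false, while a validating caller would pass true so that the 
unbalanced operator surfaces as an error instead of being skipped silently.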

> java.util.EmptyStackException from PDFTextStripper.writeText
> ------------------------------------------------------------
>
>                 Key: PDFBOX-161
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-161
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>            Assignee: Andreas Lehmkühler
>            Priority: Minor
>             Fix For: 1.8.4, 2.0.0
>
>         Attachments: PDFBOX161-E12860v10P096469EAs.pdf
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1483833
> Originally submitted by gagravarr on 2006-05-08 07:05.
> I'm using PDFBox 0.7.2. On a certain document
> (http://www-wds.worldbank.org/external/default/WDSContentServer/IW3P/IB/2005/12/27/000160016_20051227181308/Rendered/PDF/E12860v10P096469EAs.pdf),
> when I execute the following code:
>  PDFParser pdfParser = new PDFParser( docStream );
>  pdfParser.parse();
>  PDDocument pdfDocument = pdfParser.getPDDocument();
>  StringWriter textWriter = new StringWriter();
>  PDFTextStripper textStripper = new PDFTextStripper();
>  textStripper.writeText(pdfDocument, textWriter);
> I get the following nasty stack trace from deep inside
> PDFBox:
> Exception in thread "main" java.util.EmptyStackException
>         at java.util.Stack.peek(Stack.java:79)
>         at java.util.Stack.pop(Stack.java:61)
>         at org.pdfbox.util.operator.GRestore.process(GRestore.java:65)
>         at org.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:494)
>         at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:207)
>         at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:160)
>         at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:355)
>         at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:268)
>         at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
