[
https://issues.apache.org/jira/browse/PDFBOX-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr resolved PDFBOX-4750.
-------------------------------------
Resolution: Fixed
All done now. Thanks for reporting!
> java.io.IOException: Error:Unknown type in content stream:COSNull{}
> -------------------------------------------------------------------
>
> Key: PDFBOX-4750
> URL: https://issues.apache.org/jira/browse/PDFBOX-4750
> Project: PDFBox
> Issue Type: Bug
> Components: Writing
> Affects Versions: 2.0.8, 2.0.18
> Reporter: tomas kochan
> Assignee: Tilman Hausherr
> Priority: Major
> Fix For: 2.0.19, 3.0.0 PDFBox
>
> Attachments: 01 - K17 - Was dahinter steckt - dsb.pdf,
> PDFBOX-4750-test.pdf, contentAllOperatorsOfCorruptedPage.txt
>
>
> When removing optional content delimited by the BDC and EMC operators from a
> specific document, we run into an error while writing the modified list of
> tokens into a PDStream.
> The code looks like this:
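> // 'document', 'page' and 'result' (the filtered token list) come from the surrounding code
> PDStream updatedStream = new PDStream(document);
> // try-with-resources flushes and closes the stream even if writing fails
> try (OutputStream out = updatedStream.getCOSObject().createRawOutputStream()) {
>     ContentStreamWriter tokenWriter = new ContentStreamWriter(out);
>     tokenWriter.writeTokens(result);
> }
> page.setContents(updatedStream);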
>
> The following exception occurs at line 'tokenWriter.writeTokens(result);' :
> java.io.IOException: Error:Unknown type in content stream:COSNull{}
> at org.apache.pdfbox.pdfwriter.ContentStreamWriter.writeObject(ContentStreamWriter.java:199)
> at org.apache.pdfbox.pdfwriter.ContentStreamWriter.writeObject(ContentStreamWriter.java:146)
> at org.apache.pdfbox.pdfwriter.ContentStreamWriter.writeObject(ContentStreamWriter.java:181)
> at org.apache.pdfbox.pdfwriter.ContentStreamWriter.writeTokens(ContentStreamWriter.java:109)
> at de.justiz.eip.pdf.tools.PdfContext.getOrRemoveOptionalTextContentfromPage(PdfContext.java:429)
> at de.justiz.eip.pdf.tools.paging.PagingInfoInterpreterPdfContext.removePagingInfo(PagingInfoInterpreterPdfContext.java:325)
>
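The report does not show the step that builds `result`. As a rough illustration of that kind of filter, here is a minimal stand-alone sketch that drops every BDC ... EMC section from a flat token list. It deliberately does not use PDFBox: `MarkedContentFilter`, `Op` and `dropMarkedContent` are hypothetical names, `Op` is only a stand-in for a parsed operator, and the reporter's actual code (which targets specific optional content groups) may well differ.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkedContentFilter {
    /** Hypothetical stand-in for a parsed content stream operator. */
    public static final class Op {
        public final String name;
        public Op(String name) { this.name = name; }
    }

    /**
     * Returns a copy of the tokens without any BDC ... EMC section.
     * Operands precede their operator in a content stream, so they are
     * buffered until the operator arrives and then kept or dropped with it.
     * Operands that are never followed by an operator are dropped.
     */
    public static List<Object> dropMarkedContent(List<Object> tokens) {
        List<Object> kept = new ArrayList<>();
        List<Object> operands = new ArrayList<>();
        int depth = 0; // nesting level inside a section that is being removed
        for (Object token : tokens) {
            if (!(token instanceof Op)) {
                operands.add(token);
                continue;
            }
            String name = ((Op) token).name;
            if (depth > 0) {
                // Inside a removed section: only track nesting, keep nothing.
                if (name.equals("BDC") || name.equals("BMC")) {
                    depth++;
                } else if (name.equals("EMC")) {
                    depth--;
                }
            } else if (name.equals("BDC")) {
                depth = 1; // start removing, including the BDC operands
            } else {
                kept.addAll(operands);
                kept.add(token);
            }
            operands.clear();
        }
        return kept;
    }
}
```

Tracking the nesting depth matters because marked-content sections can be nested, so the first EMC after a BDC is not necessarily the one that closes it.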
> After analyzing this, we found two issues:
> 1. We assume that the PDF document itself is corrupted. At one place it
> contains the operator BI, which according to the PDF Reference 1.7 begins an
> inline image object. This operator is not followed by an "ID" or "EI" operator.
> Extract from the list of tokens:
> next PDFOperator{Do}
> next COSFloat{0.016674607}
> next COSInt{0}
> next COSInt{0}
> next COSFloat{0.061831153}
> next COSFloat{0.070509767}
> next COSFloat{-0.302021403}
> next PDFOperator{cm}
> next PDFOperator{BI}
> next PDFOperator{Q}
> next PDFOperator{Q}
> next COSName{OC}
> next COSName{eAkteOptionalContent7}
> next PDFOperator{BDC}
> Moreover, the "DP" entry in the COSDictionary of the "BI" operator contains a
> COSArray with COSNull values. However, we assume that COSNull values are not
> forbidden in PDF content.
>
> COSDictionary{COSName{Interpolate}:true;COSName{W}:COSInt{35};COSName{H}:COSInt{26};COSName{CS}:COSName{RGB};COSName{BPC}:COSInt{8};COSName{F}:COSArray{[COSName{A85},COSName{DCT}]};COSName{DP}:COSArray{[COSNull{},COSNull{}]};}
> 2. Regardless of the questionable content in the PDF document (described
> above), the PDFBox API fails when storing these operators into the PDStream,
> because COSNull is not recognized in the method
> org.apache.pdfbox.pdfwriter.ContentStreamWriter.writeObject(Object)
>
> Our assumption is that the method "writeObject" fails to cover COSNull as a
> valid input. org.apache.pdfbox.cos.COSNull.NULL is a valid object that is
> widely used by the PDFBox API itself.
> In PDFBox 2.0.8 and also in 2.0.18, the method
> org.apache.pdfbox.pdfwriter.ContentStreamWriter.writeObject(Object) does not
> cover the COSNull case in its if/else conditions; instead it throws
> new IOException( "Error:Unknown type in content stream:" + o ).
>
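The missing branch described above can be illustrated with a small stand-alone model of a type-dispatching token writer. This is not the source of PDFBox's `ContentStreamWriter` and not the actual fix; `TokenWriterSketch`, `NullToken` and `render` are hypothetical names, and Java strings stand in for PDF names. The only assumption about PDF itself is that a null object is serialized as the keyword `null`.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class TokenWriterSketch {
    /** Hypothetical singleton standing in for a PDF null object. */
    public enum NullToken { NULL }

    /** Writes one token; every supported type needs its own branch. */
    public static void writeObject(Object o, OutputStream out) throws IOException {
        if (o instanceof Integer || o instanceof Boolean) {
            write(out, o + " ");
        } else if (o instanceof String) {
            write(out, "/" + o + " "); // strings model PDF names here
        } else if (o instanceof List) {
            write(out, "[");
            for (Object item : (List<?>) o) {
                writeObject(item, out); // arrays may contain null entries
            }
            write(out, "] ");
        } else if (o == NullToken.NULL) {
            write(out, "null "); // without this branch, nulls hit the error below
        } else {
            throw new IOException("Error:Unknown type in content stream:" + o);
        }
    }

    private static void write(OutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.ISO_8859_1));
    }

    /** Convenience helper: serializes a single token to a string. */
    public static String render(Object o) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writeObject(o, buffer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

In this model, rendering a list that holds two `NullToken.NULL` entries (like the "DP" array in the report) produces `[null null ] `, while an unsupported type such as a `Double` still ends in the "Unknown type" exception.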
> Could you confirm that the method writeObject contains a bug and should be
> corrected to also cover the COSNull object? If so, in which version can we
> expect the fix?
> Thank you
>
>