[
https://issues.apache.org/jira/browse/PDFBOX-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13495250#comment-13495250
]
Christian Czech commented on PDFBOX-1438:
-----------------------------------------
Hi Andreas,
here is my code:
    parser = new PDFStreamParser( cosStream );
    Iterator<Object> iter = parser.getTokenIterator();
    while( iter.hasNext() ) {
        Object next = iter.next();
        if( next instanceof COSObject ) {
            arguments.add( ((COSObject)next).getObject() );
        } else if( next instanceof PDFOperator ) {
            processOperator( (PDFOperator)next, arguments, nName );
        }
        ....

    protected void processOperator(PDFOperator operator, List arguments, String nName) throws IOException {
        String operation = operator.getOperation();
        if (operation.equals("Do")) {
            COSName objectName = (COSName) arguments.get(0);
            Map xobjects = getResources().getXObjects();
            PDXObject xobject = (PDXObject) xobjects.get(objectName.getName());
            if (xobject instanceof PDXObjectImage) {
                PDXObjectImage image = (PDXObjectImage) xobject;
                ....
                String extension = image.getSuffix();
                ....
                File outputFileTemp = new File(tempFileName);
                image.write2file(outputFileTemp);
            }
            ......
As for the result, please see the additional attachments.
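
The excerpt above elides part of the token loop, so it is not visible whether plain operands (such as the COSName before "Do") are collected and whether the argument list is cleared after each operator. Since processOperator reads arguments.get(0), a list that is never cleared would always yield the first image name. The following is a minimal stand-alone sketch of that operand-handling pattern, not PDFBox code: the class OperandLoopSketch, the Operator stand-in, and the string tokens are hypothetical and only illustrate the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OperandLoopSketch {

    /** Hypothetical stand-in for an operator token such as "Do". */
    static final class Operator {
        final String name;

        Operator(String name) {
            this.name = name;
        }
    }

    /**
     * Walks the tokens in order and returns one "operator[operands]" string
     * per operator, so the operands seen by each operator can be inspected.
     */
    static List<String> run(List<Object> tokens) {
        List<String> calls = new ArrayList<>();
        List<Object> arguments = new ArrayList<>();
        for (Object token : tokens) {
            if (token instanceof Operator) {
                calls.add(((Operator) token).name + arguments);
                // Start a fresh list so the next operator sees only its own operands.
                arguments = new ArrayList<>();
            } else {
                // Every non-operator token is an operand of the next operator.
                arguments.add(token);
            }
        }
        return calls;
    }

    public static void main(String[] args) {
        List<Object> tokens = Arrays.asList(
                "/Im0", new Operator("Do"),
                "/Im1", new Operator("Do"));
        System.out.println(run(tokens)); // prints [Do[/Im0], Do[/Im1]]
    }
}
```

Without the reset, the second "Do" in this sketch would receive both names and get(0) would return "/Im0" again. If the elided part of the real loop already handles both cases, this does not apply.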
> Problems with Image Extraction from PDF
> ---------------------------------------
>
> Key: PDFBOX-1438
> URL: https://issues.apache.org/jira/browse/PDFBOX-1438
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.7.1
> Environment: Windows XP
> Reporter: Christian Czech
> Attachments: Korrespondenz.PDF
>
>
> PDFBox doesn't extract images from PDF documents correctly