[
https://issues.apache.org/jira/browse/PDFBOX-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496350#comment-13496350
]
Andreas Lehmkühler commented on PDFBOX-1438:
--------------------------------------------
Your code looks good to me, although it might be easier to use the
ExtractImages class. [1]
The result is as expected. The PDF contains 2 images (one on each page) and
both are extracted. The remaining content consists of many lines, curves and
boxes, which are vector drawing operations and can't be extracted as images.
A possible workaround may be to convert each page to an image using
PDFToImage [2]. But the result would include the 2 small images as well.
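
As a rough sketch of both approaches side by side (this is not the code from
the issue, and it assumes the PDFBox 1.x API as used by the ExtractImages and
PDFToImage tools; the output file name prefixes "embedded-" and "page-" are
made up for illustration):

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Map;
import javax.imageio.ImageIO;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDResources;
import org.apache.pdfbox.pdmodel.graphics.xobject.PDXObjectImage;

public class ExtractOrRender {
    public static void main(String[] args) throws IOException {
        // Default to the attachment name from the issue; pass another path as args[0].
        File input = new File(args.length > 0 ? args[0] : "Korrespondenz.PDF");
        PDDocument document = PDDocument.load(input);
        try {
            // getAllPages() returns an untyped list in PDFBox 1.x.
            List<?> pages = document.getDocumentCatalog().getAllPages();
            int pageNo = 0;
            for (Object o : pages) {
                PDPage page = (PDPage) o;
                pageNo++;

                // Approach 1: write out only the embedded image XObjects.
                // Lines, curves and boxes are not image objects and are skipped.
                PDResources resources = page.getResources();
                if (resources != null) {
                    Map<String, PDXObjectImage> images = resources.getImages();
                    if (images != null) {
                        for (Map.Entry<String, PDXObjectImage> e : images.entrySet()) {
                            // write2file appends a suffix matching the image type.
                            e.getValue().write2file("embedded-" + pageNo + "-" + e.getKey());
                        }
                    }
                }

                // Approach 2: render the whole page, vector content included.
                BufferedImage rendered = page.convertToImage();
                ImageIO.write(rendered, "png", new File("page-" + pageNo + ".png"));
            }
        } finally {
            document.close();
        }
    }
}
```

With the attached PDF, the first approach should produce the 2 embedded
images, while the second produces one bitmap per page that contains the
drawing as well as the small images.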
[1]
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/ExtractImages.java
[2]
http://svn.apache.org/repos/asf/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/PDFToImage.java
> Problems with Image Extraction from PDF
> ---------------------------------------
>
> Key: PDFBOX-1438
> URL: https://issues.apache.org/jira/browse/PDFBOX-1438
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.7.1
> Environment: Windows XP
> Reporter: Christian Czech
> Attachments: Korrespondenz_000.jpg, Korrespondenz_001.jpg,
> Korrespondenz.PDF
>
>
> PDFBox doesn't extract images from the PDF document correctly