[jira] [Created] (PDFBOX-3926) ExtractImages

JIRA Fri, 08 Sep 2017 01:31:22 -0700

Hasan Karaoğlu created PDFBOX-3926:
--------------------------------------


             Summary: ExtractImages 
                 Key: PDFBOX-3926
                 URL: https://issues.apache.org/jira/browse/PDFBOX-3926
             Project: PDFBox
          Issue Type: Improvement
            Reporter: Hasan Karaoğlu


Hi, I extract text from a PDF with the first command below, but it does not 
extract images. So I also use the ExtractImages command. How can we merge the 
two outputs so that the elements keep their original order on the page?


Extract Texts: (First command)
{code:java}
java -jar pdfbox.jar ExtractText -html {{inputFileName}} -startPage {{startPage}} -endPage {{endPage}} -encoding UTF-8 {{outputFileName}}
{code}
Extract Images: (Second command)

{code:java}
java -jar pdfbox-app.jar ExtractImages [OPTIONS] <inputfile>
{code}

For example, I run the first command and get an output.html file, but this 
file contains only the text parts of the page; there are no images. Then I run 
the second command and get the images as separate files. How can I merge these 
two separate outputs? The order of the elements on the page is important. 
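
To make the desired result concrete, here is a minimal sketch of the ordering 
step only. It assumes each text run and each image file can be tagged with its 
page number and its distance from the top of the page. The names PageMerger and 
Item are hypothetical and are not part of PDFBox, and the sketch does not show 
how to obtain those positions from the PDF.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PageMerger {

    /** One piece of page content: a text run or an extracted image file (hypothetical type). */
    static final class Item {
        final int page;        // 1-based page number
        final float top;       // assumed: distance from the top of the page, in points
        final boolean image;   // true if content is an image file name, false if it is text
        final String content;

        Item(int page, float top, boolean image, String content) {
            this.page = page;
            this.top = top;
            this.image = image;
            this.content = content;
        }
    }

    /** Escapes the characters that would otherwise break the generated HTML. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Sorts items by page, then top to bottom, and renders them as simple HTML. */
    static String toHtml(List<Item> items) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator.comparingInt((Item i) -> i.page)
                .thenComparingDouble(i -> i.top));

        StringBuilder html = new StringBuilder("<html><body>\n");
        for (Item item : sorted) {
            if (item.image) {
                html.append("<img src=\"").append(escape(item.content)).append("\">\n");
            } else {
                html.append("<p>").append(escape(item.content)).append("</p>\n");
            }
        }
        return html.append("</body></html>\n").toString();
    }
}
```

With items such as (page 1, top 100, text "Intro") and (page 1, top 300, image 
"img-1.png"), the text paragraph is written before the image tag, which is the 
kind of merged output I am looking for.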

 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

