[jira] [Closed] (PDFBOX-950) Null from PDF

Vladimir (JIRA) Wed, 13 Apr 2011 02:33:53 -0700

     [ 
https://issues.apache.org/jira/browse/PDFBOX-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Vladimir closed PDFBOX-950.
---------------------------

    Resolution: Invalid

Yes, I was trying to extract text from a PDF that contains only images.
Thanks.
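
For anyone who hits the same symptom: a PDF made only of scanned images has no text layer, so a text extractor has nothing to return. Below is a minimal sketch of a guard that a caller could run on the extractor's output before treating it as real content. The class and method names (ExtractedTextCheck, hasVisibleText) are hypothetical and are not part of PDFBox. It also assumes the output may be wrapped in HTML with a <body> element, as an HTML-producing extractor would typically do.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of PDFBox: checks extractor output for real text. */
final class ExtractedTextCheck {

    // Captures whatever sits between <body ...> and </body>, if present.
    private static final Pattern BODY =
            Pattern.compile("(?is)<body[^>]*>(.*?)</body>");

    // Matches anything that looks like a markup tag, e.g. <p> or </div>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");

    private ExtractedTextCheck() {
    }

    /**
     * Returns true when the output has at least one non-whitespace
     * character outside of markup tags (inside the body, if there is one).
     */
    static boolean hasVisibleText(String extracted) {
        if (extracted == null) {
            return false;
        }
        Matcher body = BODY.matcher(extracted);
        // Fall back to the whole string when there is no <body> element.
        String content = body.find() ? body.group(1) : extracted;
        String withoutTags = TAG.matcher(content).replaceAll(" ");
        return withoutTags.trim().length() > 0;
    }
}
```

A caller could then report "no text layer, OCR needed" when hasVisibleText(...) is false, instead of passing an empty result downstream. The check is deliberately crude: it does not decode HTML entities, so output consisting only of entities such as non-breaking spaces would still count as text.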

> Null from PDF
> -------------
>
>                 Key: PDFBOX-950
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-950
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.4.0
>         Environment: Windows XP [5.1.2600]
> java version "1.6.0_23"
> Java(TM) SE Runtime Environment (build 1.6.0_23-b05)
> Java HotSpot(TM) Client VM (build 19.0-b09, mixed mode, sharing)
>            Reporter: Vladimir
>
> http://www.uss.com/corp/investors/sec_filings/3Q-2010-Earnings-Release.pdf
> The file opens correctly in Foxit Reader.
> This code returns null:
> public static String getHtml(InputStream inputStream) {
>     PDDocument pdDocument = null;
>     String document = null;
>     try {
>         // Parse the raw stream into a document model.
>         PDFParser parser = new PDFParser(inputStream);
>         parser.parse();
>         pdDocument = parser.getPDDocument();
>         // StringUtil.UTF_8() is a helper that is not shown in this report.
>         PDFText2HTML pdf2html = new PDFText2HTML(StringUtil.UTF_8());
>         document = pdf2html.getText(pdDocument);
>     } catch (IOException e) {
>         // On failure the method falls through and returns null.
>         e.printStackTrace();
>     } finally {
>         if (pdDocument != null) {
>             try {
>                 // Closing the PDDocument also closes the underlying COSDocument.
>                 pdDocument.close();
>             } catch (IOException e) {
>                 e.printStackTrace();
>             }
>         }
>     }
>     return document;
> }
> <dependency>
>     <groupId>org.apache.pdfbox</groupId>
>     <artifactId>pdfbox</artifactId>
>     <version>1.4.0</version>
> </dependency>
> <dependency>
>     <groupId>org.bouncycastle</groupId>
>     <artifactId>bcprov-jdk15</artifactId>
>     <version>1.45</version>
> </dependency>
> <dependency>
>     <groupId>org.bouncycastle</groupId>
>     <artifactId>bcmail-jdk15</artifactId>
>     <version>1.45</version>
> </dependency>

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
