Have a look at the sources for ExtractText in the recent 1.7.0 release: it now extracts embedded PDFs as well.
http://svn.apache.org/repos/asf/pdfbox/branches/1.7/pdfbox/src/main/java/org/apache/pdfbox/ExtractText.java

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jun 13, 2012 at 8:58 AM, Czech, Christian <[email protected]> wrote:
> Hello,
>
> How can I extract embedded files from a PDF?
>
> Here is my source:
>
>     document = PDDocument.load(inputFile);
>
>     PDDocumentNameDictionary names = new PDDocumentNameDictionary(
>         document.getDocumentCatalog() );
>
>     PDEmbeddedFilesNameTreeNode embeddedTree = names.getEmbeddedFiles();
>
>     if (embeddedTree == null) {
>         System.out.println("Embedded files doesn't exist");
>     } else {
>         System.out.println("Size: " + embeddedTree.getKids().size());
>     }
>
> Thanks
>
> Christian
>
> ________________________________
>
> ELO Digital Office GmbH
> Registered office: Heilbronner Strasse 150, 70191 Stuttgart
> Phone: +49 711 806089-0, Fax: +49 711 806089-19, Web: www.elo.com
> Managing Directors: Karl Heinz Mosbach, Matthias Thiele
> BW-Bank, Account No. 2089782, Bank Code 600 501 01
> Commercial register: Stuttgart HRB 15059 - VAT ID: DE812471516
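
One thing that may be worth checking in the snippet quoted above: in a PDF name tree, a node holds its entries either directly (a Names array) or in child nodes (a Kids array), and the root may have only the former. Looking only at getKids() can therefore miss files, and the call may well return null. The block below is only a sketch of the traversal idea in plain Java. The Node class and the collect method are hypothetical stand-ins invented for illustration, not PDFBox classes or methods; with PDFBox you would presumably apply the same pattern to the node returned by getEmbeddedFiles().

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class NameTreeSketch {

    /** Hypothetical stand-in for a name tree node; this is NOT a PDFBox class. */
    static final class Node {
        final Map<String, byte[]> names; // entries stored on this node, or null
        final List<Node> kids;           // child nodes, or null

        Node(Map<String, byte[]> names, List<Node> kids) {
            this.names = names;
            this.kids = kids;
        }
    }

    /** Collects every name/content pair reachable from the given node. */
    static Map<String, byte[]> collect(Node node) {
        Map<String, byte[]> out = new LinkedHashMap<>();
        collectInto(node, out);
        return out;
    }

    private static void collectInto(Node node, Map<String, byte[]> out) {
        if (node == null) {
            return; // no tree at all: nothing embedded
        }
        if (node.names != null) {
            out.putAll(node.names); // entries stored directly on this node
        }
        if (node.kids != null) {
            for (Node kid : node.kids) {
                collectInto(kid, out); // descend into child nodes
            }
        }
    }

    public static void main(String[] args) {
        // Build a tiny tree: the root has only kids, the leaf has the names.
        Map<String, byte[]> leafNames = new LinkedHashMap<>();
        leafNames.put("attachment.pdf", new byte[] {0x25, 0x50, 0x44, 0x46});
        List<Node> kids = new ArrayList<>();
        kids.add(new Node(leafNames, null));
        Node root = new Node(null, kids);

        for (Map.Entry<String, byte[]> e : collect(root).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue().length + " bytes");
        }
    }
}
```

The null checks on both fields are the point of the sketch: neither the directly stored names nor the kids are assumed to be present on any given node.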

