Hamish,

Check out http://www.mail-archive.com/dspace-tech@lists.sourceforge.net/msg00182.html

The BouncyCastle provider that your second stack trace can't find is supposed to be an "optional dependency" of PDFBox, but it isn't mentioned in the PDFBox docs.
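
If it helps to confirm that the class really is missing before hunting for a jar, here is a minimal sketch of a classpath check. The provider class name is taken from the NoClassDefFoundError in the trace; the names `BouncyCastleCheck` and `isClassAvailable` are made up for illustration, and this assumes you run it with the same classpath that filter-media uses.

```java
// Sketch only: reports whether the optional BouncyCastle provider class
// is visible on the current classpath. Class and method names here are
// hypothetical; only PROVIDER_CLASS comes from the stack trace.
final class BouncyCastleCheck {

    // Class named in the NoClassDefFoundError (slashes written as dots).
    static final String PROVIDER_CLASS =
            "org.bouncycastle.jce.provider.BouncyCastleProvider";

    /** Returns true if the named class can be located by this class's loader. */
    static boolean isClassAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, BouncyCastleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Not on the classpath, or present but not loadable.
            return false;
        }
    }

    public static void main(String[] args) {
        if (isClassAvailable(PROVIDER_CLASS)) {
            System.out.println("BouncyCastle provider found on the classpath.");
        } else {
            System.out.println("BouncyCastle provider NOT found; "
                    + "PDFBox will likely fail on encrypted PDFs.");
        }
    }
}
```

If it reports the provider as not found, that would be consistent with the error you are seeing when PDFBox tries to decrypt a protected PDF.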

Scott.

>Date: Mon, 19 Feb 2007 15:51:35 +1030
>From: "Brett, Hamish" <[EMAIL PROTECTED]>
>Subject: [Dspace-tech] Filter-media error
>To: <dspace-tech@lists.sourceforge.net>
>Message-ID: <[EMAIL PROTECTED]>
>Content-Type: text/plain; charset="us-ascii"
>
>Hi,
>
>Ever since upgrading to 1.4.1, when I run filter-media I get the following error:
>
>ERROR filtering, skipping bitstream #1584 java.io.IOException: Invalid header signature; read 290763650945099227, expected -2226271756974174256
>java.io.IOException: Invalid header signature; read 290763650945099227, expected -2226271756974174256
>        at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:88)
>        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:83)
>        at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:48)
>        at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:97)
>        at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:155)
>        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:327)
>        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:296)
>        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:266)
>        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:234)
>        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:185)
>SKIPPED: bitstream 997 because '2000 J Thermal Stress (One-sided repair).PDF.txt' already exists
>SKIPPED: bitstream 2833 because 'iRoom at DSTO.pdf.txt' already exists
>SKIPPED: bitstream 2835 because 'Network Enabled Warfare4.pdf.txt' already exists
>SKIPPED: bitstream 2837 because 'DORC99-Lin-Zhang.PDF.txt' already exists
>SKIPPED: bitstream 2839 because 'icota98.pdf.txt' already exists
>Exception in thread "main" java.lang.NoClassDefFoundError: org/bouncycastle/jce/provider/BouncyCastleProvider
>        at org.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:905)
>        at org.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:489)
>        at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:204)
>        at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:149)
>        at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:110)
>        at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:155)
>        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:327)
>        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:296)
>        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:266)
>        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:234)
>        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:185)
>
>Any ideas?
>
>Thanks
>
>Hamish

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech