Hi Andrew,

Are you using DSpace version 1.5.x or 1.6?
Both versions use Apache's pdfbox 0.7.3 for filtering PDF files, which is an older version. The current one is version 1.0.0. I believe this newer version is capable of taking care of your PDF problems.

Below is a snippet of an earlier posting and mail exchanges:

>> E.g. the pdfbox used is version 0.7.3 and a couple of years old.
>> Noted in our production instance that more and more new pdf's are not
>> processed. Just trying the new 1.1.0 version and it seems to process
>> these pdf's without difficulty.
>>
>> You need the up-to-date versions of jempbox and fontbox too,
>> only updating the pdfbox is not enough, and depending on the input may
>> be the bouncy castle provider for the java version you are using.
>> See http://pdfbox.apache.org/
>>
>> Claudia

But DSpace 1.6 also has provision for XPDF, the details of which are outlined in the manual in the section on media filtering. I have tried this out and it solved issues with large PDF files as well as PDFs created using newer versions of the Acrobat engine.

Hope this will help,
Debashree

> We are experiencing problems with media filtering of PDF files added in our thesis digitisation project.
>
> A number of the files (perhaps 10%) will not filter, the command window just pauses for up to 15 minutes or so, then displays:
>
> "SKIPPED: bitstream 5698 (item: 10182/1780) because filtering was unsuccessful"
>
> No other error message or clue is given.
>
> I can see no common feature of the PDFs that won't filter - they can be b&w only or some colour, different PDF versions. Yes, they are all quite large files (10MB or larger), but not all files of this size are failing in this way.
>
> I find that if I split the file into parts and re-upload, they will then filter OK.
>
> Has anyone else experienced this and do you have a solution?
>
> Andrew White
> Information Technology Librarian
>
> George Forbes Memorial Library
> PO Box 64
> Lincoln University
> Lincoln 7647
> Christchurch, New Zealand
>
> p +64 3 321 8542 | f +64 3 325 2944
> e andrew.wh...@lincoln.ac.nz | w library.lincoln.ac.nz
>
> Lincoln University, Te Whare Wanaka o Aoraki
> New Zealand's Specialist Land Based University
>
> _______________________________________________
> DSpace-tech mailing list
> DSpace-tech@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/dspace-tech
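
P.S. Since the media filter only reports "SKIPPED" after a long pause, it may help to test the suspect PDFs outside DSpace first. The following is only a minimal sketch, assuming Python 3.7+ and that the `pdftotext` tool from XPDF is installed and on the PATH. The function name `try_extract` and the 120-second timeout are made up for illustration and are not part of DSpace; this does not reproduce what the DSpace media filter does internally, it just shows whether a standalone extractor finishes, fails, or hangs on a given file.

```python
import subprocess
from typing import Sequence, Tuple


def try_extract(pdf_path: str,
                cmd: Sequence[str] = ("pdftotext",),
                timeout_s: float = 120.0) -> Tuple[str, str]:
    """Run a text extractor on one PDF and classify the outcome.

    Returns (status, detail), where status is one of
    "ok", "timeout", "failed" or "missing".
    """
    # A text-file argument of "-" asks pdftotext to write to stdout.
    argv = list(cmd) + [pdf_path, "-"]
    try:
        proc = subprocess.run(argv, capture_output=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The extractor was still running when the time limit expired.
        return "timeout", f"no result after {timeout_s} s"
    except FileNotFoundError:
        # The extractor command itself is not installed / not on the PATH.
        return "missing", f"command not found: {argv[0]}"
    if proc.returncode != 0:
        # Pass along whatever the tool printed on stderr as the clue.
        return "failed", proc.stderr.decode(errors="replace").strip()
    return "ok", f"{len(proc.stdout)} bytes of text"
```

Calling `try_extract("thesis.pdf")` on each large file (the file name here is a placeholder) should separate the PDFs that time out from those that fail with an error message, which gives more of a clue than the SKIPPED line alone.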