As suggested by DSpace, filter-media is set to run nightly. As our instance has grown, so has the number of files that filter-media is not able to index. We have 60,738 items in our repository, and as of today, filter-media is not able to index 892 of them.

I'm trying to determine whether anything can be done so that as many of these 892 items as possible can be indexed. I have copied a portion of the filter-media output below. Could someone who better understands filter-media let me know if there is something that can be done?

Many thanks!
Jose
Applying Media Filters

ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/62012
    Bundle Name: ORIGINAL
    File Size: 58
    Checksum: a500810f390e82e2aead21d5220e7325 (MD5)
    Asset Store: 1
java.lang.IllegalArgumentException: Width (80) and height (0) cannot be <= 0
java.lang.IllegalArgumentException: Width (80) and height (0) cannot be <= 0
    at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:999)
    at java.awt.image.BufferedImage.<init>(BufferedImage.java:312)
    at org.dspace.app.mediafilter.JPEGFilter.getDestinationStream(JPEGFilter.java:161)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:674)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:575)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:525)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:493)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:432)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:363)

ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/69214
    Bundle Name: ORIGINAL
    File Size: 268039
    Checksum: 4e64d97f5a151819da52b095b1fef5d3 (MD5)
    Asset Store: 1
java.lang.NullPointerException
java.lang.NullPointerException
    at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
    at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
    at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
    at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
    at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:139)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:674)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:575)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:525)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:493)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:432)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:363)

ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/55391
    Bundle Name: ORIGINAL
    File Size: 473660
    Checksum: 3686c4d66884a89d81ddfe420a1b661b (MD5)
    Asset Store: 1
java.io.IOException: Unknown encoding for 'Identity-V'
java.io.IOException: Unknown encoding for 'Identity-V'
    at org.pdfbox.encoding.EncodingManager.getEncoding(EncodingManager.java:82)
    at org.pdfbox.pdmodel.font.PDFont.getEncoding(PDFont.java:612)
    at org.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:466)
    at org.pdfbox.util.PDFStreamEngine.showString(PDFStreamEngine.java:325)
    at org.pdfbox.util.operator.ShowText.process(ShowText.java:64)
    at org.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:452)
    at org.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:215)
    at org.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:174)
    at org.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:336)
    at org.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:259)
    at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
    at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:139)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:674)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:575)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:525)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:493)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:432)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:363)

ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/61991
    Bundle Name: ORIGINAL
    File Size: 58
    Checksum: a500810f390e82e2aead21d5220e7325 (MD5)
    Asset Store: 1
java.lang.IllegalArgumentException: Width (80) and height (0) cannot be <= 0
java.lang.IllegalArgumentException: Width (80) and height (0) cannot be <= 0
    at java.awt.image.DirectColorModel.createCompatibleWritableRaster(DirectColorModel.java:999)
    at java.awt.image.BufferedImage.<init>(BufferedImage.java:312)
    at org.dspace.app.mediafilter.JPEGFilter.getDestinationStream(JPEGFilter.java:161)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:674)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:575)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:525)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:493)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:432)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:363)

ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/50480
    Bundle Name: ORIGINAL
    File Size: 177152
    Checksum: af66e3bb52ebe7f1b4c9cc06fa9a6257 (MD5)
    Asset Store: 1
java.util.NoSuchElementException
java.util.NoSuchElementException
    at java.util.AbstractList$Itr.next(AbstractList.java:350)
    at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:150)
    at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:95)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:674)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:575)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:525)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:493)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:432)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:363)

ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/40280
    Bundle Name: ORIGINAL
    File Size: 90205
    Checksum: 2437377db51a8e3b9347c784b61906f9 (MD5)
    Asset Store: 1
java.io.IOException: expected='endobj' firstReadAttempt='endobj154' secondReadAttempt='0' org.pdfbox.io.pushbackinputstr...@122c9df
java.io.IOException: expected='endobj' firstReadAttempt='endobj154' secondReadAttempt='0' org.pdfbox.io.pushbackinputstr...@122c9df
    at org.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:502)
    at org.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:176)
    at org.pdfbox.pdmodel.PDDocument.load(PDDocument.java:707)
    at org.pdfbox.pdmodel.PDDocument.load(PDDocument.java:691)
    at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:138)
    at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:674)
    at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:575)
    at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:525)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:493)
    at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:432)
    at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:363)

ERROR filtering, skipping bitstream:
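To see which kinds of failure account for most of the 892, the output above could be tallied by media filter and exception type. The following is only a minimal Python sketch of that idea. It assumes the filter-media output has been captured as text, and that every failure starts with the "ERROR filtering, skipping bitstream:" marker shown above. The function names and the shortened SAMPLE text are illustrative and are not part of DSpace.

```python
import re
from collections import Counter

# Marker that precedes each failed bitstream in the output quoted above.
MARKER = "ERROR filtering, skipping bitstream:"

HANDLE_RE = re.compile(r"Item Handle:\s*(\S+)")
# First fully qualified exception class name, e.g. java.io.IOException.
EXCEPTION_RE = re.compile(r"\b((?:[a-z_]\w*\.)+[A-Z]\w*(?:Exception|Error))\b")
# The media filter class whose getDestinationStream() appears in the trace.
FILTER_RE = re.compile(
    r"org\.dspace\.app\.mediafilter\.(\w+)\.getDestinationStream"
)


def parse_failures(text):
    """Return a list of (handle, filter_name, exception_name) tuples."""
    failures = []
    # Everything before the first marker is preamble, so skip chunk 0.
    for chunk in text.split(MARKER)[1:]:
        handle = HANDLE_RE.search(chunk)
        if handle is None:
            continue  # truncated or empty entry, nothing to record
        exc = EXCEPTION_RE.search(chunk)
        filt = FILTER_RE.search(chunk)
        failures.append((
            handle.group(1),
            filt.group(1) if filt else "unknown",
            exc.group(1) if exc else "unknown",
        ))
    return failures


def tally(failures):
    """Count failures per (filter_name, exception_name) pair."""
    return Counter((filt, exc) for _, filt, exc in failures)


# Shortened sample built from two of the entries quoted above (illustrative).
SAMPLE = """\
Applying Media Filters
ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/62012
    File Size: 58
java.lang.IllegalArgumentException: Width (80) and height (0) cannot be <= 0
    at org.dspace.app.mediafilter.JPEGFilter.getDestinationStream(JPEGFilter.java:161)
ERROR filtering, skipping bitstream:
    Item Handle: 2027.42/69214
    File Size: 268039
java.lang.NullPointerException
    at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
    at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:139)
ERROR filtering, skipping bitstream:
"""

for (filt, exc), count in tally(parse_failures(SAMPLE)).most_common():
    print(f"{count:5d}  {filt:12s}  {exc}")
```

For a real run, the text of the saved output file would be passed to parse_failures() in place of SAMPLE. Grouping this way would show, for example, whether most failures come from PDFFilter or from a handful of malformed image files.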