Hi.

When Tika runs text extraction on an Excel file that is protected with a
password to open, it throws an exception like the ones in the output below.
(This is the password required to read the file, not the password required
to modify it.)
Is this a known problem?

Regards,
Shinichiro Abe.


Password to open (read password): '2'

Attachment: 2_yomitori.xls
Description: MS-Excel spreadsheet

Attachment: 2_yomitori.xlsx
Description: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

abe:target abe$ java -jar tika-app-0.9.jar 2_yomitori.xls
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@50fa70a4
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:107)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:302)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:91)
Caused by: org.apache.poi.EncryptedDocumentException: Default password is invalid for docId/saltData/saltHash
        at org.apache.poi.hssf.record.RecordFactoryInputStream$StreamEncryptionInfo.createDecryptingStream(RecordFactoryInputStream.java:101)
        at org.apache.poi.hssf.record.RecordFactoryInputStream.<init>(RecordFactoryInputStream.java:169)
        at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.genericProcessEvents(HSSFEventFactory.java:139)
        at org.apache.poi.hssf.eventusermodel.HSSFEventFactory.processEvents(HSSFEventFactory.java:106)
        at org.apache.tika.parser.microsoft.ExcelExtractor$TikaHSSFListener.processFile(ExcelExtractor.java:276)
        at org.apache.tika.parser.microsoft.ExcelExtractor.parse(ExcelExtractor.java:136)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:189)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        ... 5 more
abe:target abe$ java -jar tika-app-0.9.jar 2_yomitori.xlsx
Exception in thread "main" org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.microsoft.OfficeParser@4d480ea
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
        at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:107)
        at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:302)
        at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:91)
Caused by: java.lang.RuntimeException: Buffer underrun - requested 2 bytes but 0 was available
        at org.apache.poi.poifs.filesystem.DocumentInputStream.checkAvaliable(DocumentInputStream.java:202)
        at org.apache.poi.poifs.filesystem.DocumentInputStream.readUShort(DocumentInputStream.java:300)
        at org.apache.poi.poifs.filesystem.DocumentInputStream.readShort(DocumentInputStream.java:220)
        at org.apache.poi.poifs.crypt.EncryptionHeader.<init>(EncryptionHeader.java:58)
        at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:44)
        at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:209)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
        ... 5 more
abe:target abe$ 
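
In both runs the TikaException only wraps the real failure, so a caller has to look at the cause chain to see what happened. As a rough workaround sketch (not part of Tika or POI; the class CauseChain and the method hasCauseNamed are hypothetical names made up for illustration), the chain can be walked with the standard Throwable.getCause() and matched by class name, which avoids a compile-time dependency on POI. Note that this would only recognize the .xls case, where the cause is org.apache.poi.EncryptedDocumentException; the .xlsx case surfaces as a plain java.lang.RuntimeException and cannot be told apart this way.

```java
// Hypothetical helper, not part of Tika or POI: walks an exception's
// cause chain and reports whether any link is of the named class.
public final class CauseChain {
    // Upper bound on links visited, as a guard against cyclic cause chains.
    private static final int MAX_DEPTH = 32;

    private CauseChain() {}

    static boolean hasCauseNamed(Throwable error, String className) {
        int depth = 0;
        for (Throwable t = error; t != null && depth < MAX_DEPTH; t = t.getCause(), depth++) {
            // Check the class and its superclasses so subclasses also match.
            for (Class<?> c = t.getClass(); c != null; c = c.getSuperclass()) {
                if (c.getName().equals(className)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for a wrapped failure; a real caller would pass the
        // exception caught around the parse call instead.
        Throwable wrapped = new RuntimeException(
                "Unexpected RuntimeException", new IllegalStateException("stand-in cause"));
        System.out.println(hasCauseNamed(wrapped, "java.lang.IllegalStateException"));
    }
}
```

A caller could pass the caught exception together with the string "org.apache.poi.EncryptedDocumentException" and report "password protected" instead of printing the full trace.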
