[ https://issues.apache.org/jira/browse/TIKA-2818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavel Arnošt updated TIKA-2818:
-------------------------------
    Summary: RarParser throws EncryptedDocumentException only when whole archive is encrypted  (was: RarParser throws EncryptedDocumentException only when whole archiveis encrypted)

> RarParser throws EncryptedDocumentException only when whole archive is encrypted
> --------------------------------------------------------------------------------
>
>                 Key: TIKA-2818
>                 URL: https://issues.apache.org/jira/browse/TIKA-2818
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.20
>            Reporter: Pavel Arnošt
>            Priority: Minor
>         Attachments: rar4_encrypted_content_only.rar
>
>
> RarParser throws EncryptedDocumentException only if the whole archive is
> encrypted. If only individual files are encrypted, the parser fails with
> org.apache.tika.exception.TikaException: RarParser Exception:
>
> Caused by: org.apache.tika.exception.TikaException: RarParser Exception
>     at org.apache.tika.parser.pkg.RarParser.parse(RarParser.java:99)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:280)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:143)
>     at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:159)
>     at ... 43 more
> Caused by: com.github.junrar.exception.RarException: ioError
>     at com.github.junrar.Archive.getInputStream(Archive.java:525)
>     at org.apache.tika.parser.pkg.RarParser.parse(RarParser.java:81)
>     ... 48 more
> Caused by: com.github.junrar.exception.RarException: crcError
>     at com.github.junrar.Archive.doExtractFile(Archive.java:557)
>     at com.github.junrar.Archive.extractFile(Archive.java:498)
>     at com.github.junrar.Archive.getInputStream(Archive.java:523)
>     ... 49 more
>
> File encryption should be checked before trying to extract content on
> line 79, like this:
>
> FileHeader header = rar.nextFileHeader();
> if (header.isEncrypted()) {
>     throw new EncryptedDocumentException();
> }
> while (header != null && !Thread.currentThread().isInterrupted()) {
>
> Or maybe insert it into metadata with the
> TikaCoreProperties.TIKA_META_EXCEPTION_EMBEDDED_STREAM key? I don't know,
> but the current behaviour is not correct (parsing fails).
>
> Sample document is attached.
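
A note on the snippet proposed in the description: as written it calls header.isEncrypted() before the loop's null check and looks only at the first header, so an empty archive or an encrypted file further down the archive would presumably not be handled. Below is a minimal sketch of a per-entry, null-safe check. It is not the actual RarParser code: Header and FakeArchive are hypothetical stand-ins for junrar's FileHeader and Archive so that the example runs on its own, and it assumes nextFileHeader() returns null once the headers are exhausted, as the loop condition in the description suggests.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class EncryptedEntryCheck {

    /** Hypothetical stand-in for junrar's FileHeader: only what the sketch needs. */
    static final class Header {
        final String name;
        final boolean encrypted;

        Header(String name, boolean encrypted) {
            this.name = name;
            this.encrypted = encrypted;
        }

        boolean isEncrypted() {
            return encrypted;
        }
    }

    /** Hypothetical stand-in for junrar's Archive: hands out headers, then null. */
    static final class FakeArchive {
        private final Iterator<Header> headers;

        FakeArchive(List<Header> headers) {
            this.headers = headers.iterator();
        }

        Header nextFileHeader() {
            return headers.hasNext() ? headers.next() : null;
        }
    }

    /**
     * Walks every header and returns the name of the first encrypted entry,
     * or null if no entry is encrypted (this includes an empty archive).
     */
    static String firstEncryptedEntry(FakeArchive rar) {
        Header header = rar.nextFileHeader();
        while (header != null && !Thread.currentThread().isInterrupted()) {
            // The null check in the loop condition runs before isEncrypted().
            if (header.isEncrypted()) {
                return header.name;
            }
            header = rar.nextFileHeader();
        }
        return null;
    }

    public static void main(String[] args) {
        FakeArchive rar = new FakeArchive(Arrays.asList(
                new Header("readme.txt", false),
                new Header("secret.doc", true)));
        String encrypted = firstEncryptedEntry(rar);
        // A parser could throw EncryptedDocumentException here, or record the
        // failure in metadata instead; the sketch only reports what it found.
        System.out.println(encrypted == null
                ? "no encrypted entries"
                : "encrypted entry: " + encrypted);
    }
}
```

Returning the entry name instead of throwing leaves open the question raised in the description: the caller can either throw EncryptedDocumentException or store the failure under a metadata key and continue with the remaining entries.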