https://issues.apache.org/bugzilla/show_bug.cgi?id=53475
Priority: P2
Bug ID: 53475
Assignee: [email protected]
Summary: Support for more DOCX encryption versions
Severity: major
Classification: Unclassified
Reporter: [email protected]
Hardware: Macintosh
Status: NEW
Version: 3.8
Component: POIFS
Product: POI
Created attachment 29002
--> https://issues.apache.org/bugzilla/attachment.cgi?id=29002&action=edit
Encrypted word doc which crashes POI
PROBLEM
=======
When parsing password-protected OOXML Word files, the EncryptionInfo class has
explicit support for (versionMajor == 4 && versionMinor == 4 && encryptionFlags
== 0x40), while all other versions are treated the same. For some encrypted
DOCX documents this causes an exception:
java.lang.RuntimeException: Salt size != 16 !?
    at org.apache.poi.poifs.crypt.EncryptionVerifier.<init>(EncryptionVerifier.java:121)
    at org.apache.poi.poifs.crypt.EncryptionInfo.<init>(EncryptionInfo.java:66)
    at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:211)
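
To make the version check described above concrete, here is a minimal
standalone sketch. It is not POI code: the class EncryptionHeaderSketch and
its methods are hypothetical names used only for illustration. It assumes the
EncryptionInfo stream begins with versionMajor and versionMinor as 16-bit
little-endian values followed by 32-bit little-endian flags. It only reports
which branch a given header would take, which may help when working out which
version/flag combinations need their own handling.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper, not part of POI. Assumes the stream header layout is
// versionMajor (u16 LE), versionMinor (u16 LE), encryptionFlags (i32 LE).
public class EncryptionHeaderSketch {

    /** True only for the combination this report says is handled explicitly. */
    public static boolean isExplicitlySupported(int versionMajor, int versionMinor,
                                                int encryptionFlags) {
        return versionMajor == 4 && versionMinor == 4 && encryptionFlags == 0x40;
    }

    /** Decodes {versionMajor, versionMinor, encryptionFlags} from the first 8 bytes. */
    public static int[] readHeader(byte[] streamStart) {
        if (streamStart == null || streamStart.length < 8) {
            throw new IllegalArgumentException("need at least 8 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(streamStart).order(ByteOrder.LITTLE_ENDIAN);
        int major = buf.getShort() & 0xFFFF; // mask to treat the short as unsigned
        int minor = buf.getShort() & 0xFFFF;
        int flags = buf.getInt();
        return new int[] { major, minor, flags };
    }

    /** Human-readable summary of which code path the header would select. */
    public static String describe(byte[] streamStart) {
        int[] h = readHeader(streamStart);
        String path = isExplicitlySupported(h[0], h[1], h[2])
                ? "explicit 4.4 path"
                : "generic path (all other versions)";
        return "version " + h[0] + "." + h[1]
                + ", flags 0x" + Integer.toHexString(h[2]) + ": " + path;
    }

    public static void main(String[] args) {
        // Example header bytes chosen for illustration only.
        byte[] header = { 4, 0, 4, 0, 0x40, 0, 0, 0 };
        System.out.println(describe(header));
    }
}
```

Feeding in the first eight bytes of the EncryptionInfo stream from the
attached document would show which version and flags it actually carries.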
HOW TO REPRODUCE
================
Download Apache Tika 1.1
(http://www.apache.org/dyn/closer.cgi/tika/tika-app-1.1.jar) and run it on the
attached file:
java -jar tika-app-1.1.jar password-is-solrcell.docx
This triggers the exception. NOTE: Tika does not yet have an option to pass in
a password, but the crash occurs before decryption is even attempted.
SOLUTION
========
We need to work out which versions a document can have and which encryption
schemes to support. The following page explains the file formats and also
provides a .NET program for decryption (I have not had the chance to test it
on my example docx file, though):
http://www.lyquidity.com/devblog/?p=35