[ https://issues.apache.org/jira/browse/CONNECTORS-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16148866#comment-16148866 ]
Karl Wright commented on CONNECTORS-1450:
-----------------------------------------
The classes it can't find are present in poi-ooxml-schemas-3.15.jar, which is
distributed in connector-common-lib:
{code}
...
com/
com/microsoft/
com/microsoft/schemas/
com/microsoft/schemas/office/
com/microsoft/schemas/office/excel/
com/microsoft/schemas/office/excel/impl/
com/microsoft/schemas/office/office/
com/microsoft/schemas/office/office/impl/
com/microsoft/schemas/office/visio/
com/microsoft/schemas/office/visio/x2012/
com/microsoft/schemas/office/visio/x2012/main/
com/microsoft/schemas/office/visio/x2012/main/impl/
com/microsoft/schemas/office/x2006/
com/microsoft/schemas/office/x2006/digsig/
com/microsoft/schemas/office/x2006/digsig/impl/
com/microsoft/schemas/office/x2006/encryption/
com/microsoft/schemas/office/x2006/encryption/impl/
com/microsoft/schemas/office/x2006/keyEncryptor/
com/microsoft/schemas/office/x2006/keyEncryptor/certificate/
com/microsoft/schemas/office/x2006/keyEncryptor/certificate/impl/
com/microsoft/schemas/office/x2006/keyEncryptor/password/
com/microsoft/schemas/office/x2006/keyEncryptor/password/impl/
com/microsoft/schemas/vml/
com/microsoft/schemas/vml/impl/
...
{code}
This is a required dependency of poi-ooxml.jar:
{code}
[INFO] +- org.apache.poi:poi-ooxml:jar:3.9:test
[INFO] |  +- org.apache.poi:poi-ooxml-schemas:jar:3.9:test
[INFO] |  |  \- org.apache.xmlbeans:xmlbeans:jar:2.3.0:test
[INFO] |  \- dom4j:dom4j:jar:1.6.1:test
{code}
Since the class is present but apparently cannot be found, I have to
assume that the ooxml jar loads classes in a non-standard way and is
incompatible with ManifoldCF's class loader setup.
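One way to test this kind of hypothesis is to ask each class loader in play whether it can resolve the missing class. The following is only a minimal sketch of that check, not ManifoldCF or POI code: `ClassVisibility` and `isVisible` are hypothetical names introduced for illustration, and the isolated `URLClassLoader` merely stands in for "a loader that cannot see the application's jars". It uses only standard JDK calls.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical diagnostic helper; not part of ManifoldCF, POI, or Tika.
final class ClassVisibility {

    // Returns true if the named class resolves through the given loader
    // (including its parents). Static initializers are not run.
    static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Class to probe: pass the name from the stack trace as the first
        // argument. Defaults to this helper class itself.
        String name = args.length > 0 ? args[0] : ClassVisibility.class.getName();

        ClassLoader defining = ClassVisibility.class.getClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        // A loader with no URLs and a null parent delegates only to the
        // bootstrap loader, so it cannot see classes on the application path.
        try (URLClassLoader isolated = new URLClassLoader(new URL[0], null)) {
            System.out.println("defining loader: " + isVisible(name, defining));
            System.out.println("context loader:  " + isVisible(name, context));
            System.out.println("isolated loader: " + isVisible(name, isolated));
        }
    }
}
```

If a library resolves classes through a loader other than the one that holds the jar (for example the thread context loader, or a parent of the connector loader), the class can be present on disk and still fail a lookup like this, which would be consistent with the behavior described above.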
The solution has to be to move these jars (both poi-ooxml-schemas and xmlbeans)
"up a level" to the core classpath. The workaround is to use the Tika external
service instead.
This is a significant enough problem that I think we should consider a point
release to address it.
> Class not found stack trace coming from Tika parsing when visio file found
> --------------------------------------------------------------------------
>
> Key: CONNECTORS-1450
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1450
> Project: ManifoldCF
> Issue Type: Bug
> Components: Tika extractor
> Affects Versions: ManifoldCF 2.8
> Reporter: Karl Wright
> Assignee: Karl Wright
>
> The Tika Extractor runs into problems with Visio files. A stack trace shows
> that the issue is a class that cannot be loaded, which is apparently a
> dependency of Apache POI.