[ 
https://issues.apache.org/jira/browse/TIKA-1111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638883#comment-13638883
 ] 

Niels Beekman commented on TIKA-1111:
-------------------------------------

However, this was not enough for the parser, which was missing an import for
the org.w3c.dom package:

java.lang.NoClassDefFoundError
        at org.apache.xmlbeans.XmlBeans.class$(XmlBeans.java:43)
        at org.apache.xmlbeans.XmlBeans.buildNodeMethod(XmlBeans.java:195)
        at org.apache.xmlbeans.XmlBeans.buildNodeToCursorMethod(XmlBeans.java:232)
        at org.apache.xmlbeans.XmlBeans.<clinit>(XmlBeans.java:131)
        at org.openxmlformats.schemas.wordprocessingml.x2006.main.DocumentDocument$Factory.parse(Unknown Source)
        at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:134)
        at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
        at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:116)
        at org.apache.poi.xwpf.extractor.XWPFWordExtractor.<init>(XWPFWordExtractor.java:53)
        at org.apache.poi.extractor.ExtractorFactory.createExtractor(ExtractorFactory.java:180)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory.parse(OOXMLExtractorFactory.java:87)
        at org.apache.tika.parser.microsoft.ooxml.OOXMLParser.parse(OOXMLParser.java:82)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.tika.parser.ParsingReader$ParsingTask.run(ParsingReader.java:221)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.w3c.dom.Node
        at org.apache.felix.framework.ModuleImpl.findClassOrResourceByDelegation(ModuleImpl.java:814)
        at org.apache.felix.framework.ModuleImpl.access$100(ModuleImpl.java:61)
        at org.apache.felix.framework.ModuleImpl$ModuleClassLoader.loadClass(ModuleImpl.java:1733)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:169)
        ... 18 more

This appears to be the same issue as TIKA-1086.
                
> Class loading issues when running in OSGi environment
> -----------------------------------------------------
>
>                 Key: TIKA-1111
>                 URL: https://issues.apache.org/jira/browse/TIKA-1111
>             Project: Tika
>          Issue Type: Bug
>    Affects Versions: 1.3
>         Environment: Tika 1.3 (tika-core and tika-bundle OSGi bundles)
> Felix 2.0.5
>            Reporter: Niels Beekman
>
> When dom4j is on the system classpath, a class loading error occurs during 
> detection of Office Open XML files:
> java.lang.ExceptionInInitializerError
>       at org.apache.poi.openxml4j.opc.internal.unmarshallers.PackagePropertiesUnmarshaller.<clinit>(PackagePropertiesUnmarshaller.java:49)
>       at org.apache.poi.openxml4j.opc.OPCPackage.init(OPCPackage.java:154)
>       at org.apache.poi.openxml4j.opc.OPCPackage.<init>(OPCPackage.java:141)
>       at org.apache.poi.openxml4j.opc.Package.<init>(Package.java:54)
>       at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:99)
>       at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:207)
>       at org.apache.tika.parser.pkg.ZipContainerDetector.detectOfficeOpenXML(ZipContainerDetector.java:194)
>       at org.apache.tika.parser.pkg.ZipContainerDetector.detectZipFormat(ZipContainerDetector.java:134)
>       at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:77)
>       at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
>       at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:113)
>       at org.apache.tika.parser.ParsingReader$ParsingTask.run(ParsingReader.java:221)
>       at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.ClassCastException: org.dom4j.DocumentFactory cannot be cast to org.dom4j.DocumentFactory
>       at org.dom4j.DocumentFactory.getInstance(DocumentFactory.java:97)
>       at org.dom4j.tree.AbstractNode.<clinit>(AbstractNode.java:39)
>       ... 14 more
> As a workaround (and possibly a solution), I set the thread context class
> loader while running detection and parsing, by wrapping the detector and
> the parser. This appears to be the common fix for dom4j, which uses the
> context class loader during initialization. Ideally, the detectors and
> parsers would run with their original loader (from ServiceLoader) as the
> context class loader.
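
For illustration, the context class loader workaround described above could look roughly like the sketch below. This is not the reporter's actual code and not a Tika API: the class name ContextClassLoaderSwap and the method callWith are hypothetical, and which loader to pass in depends on how the bundles are wired.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical helper (not part of Tika, POI, or Felix): runs a task with a
 * given thread context class loader and restores the previous one afterwards.
 */
public final class ContextClassLoaderSwap {

    private ContextClassLoaderSwap() {
    }

    public static <T> T callWith(ClassLoader loader, Callable<T> task) throws Exception {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        // Libraries such as dom4j consult the context class loader during
        // initialization, so it is switched before the task runs.
        current.setContextClassLoader(loader);
        try {
            return task.call();
        } finally {
            // Always restore the caller's loader, even if the task throws.
            current.setContextClassLoader(previous);
        }
    }
}
```

A wrapped detector or parser would then make its detect or parse call inside the Callable, passing for example the class loader of the bundle that provides the parser classes. Whether that is the right loader in a given OSGi setup is an assumption that needs checking.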
