Hi there,

Recently I've been trying to get enhancements from a PDF file, but the
Apache Tika engine fails and logs the following error:

21.06.2016 17:02:56.824 *ERROR* [Thread-12]
org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler
Unexpected Exception while processing ContentItem
<urn:content-item-sha1-11f821c604fcbcbaeca7a1c29909065ac7fedafc> with
EnhancementJobManager: class
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
java.lang.NoClassDefFoundError: Could not initialize class
org.apache.pdfbox.pdmodel.PDPage
        at
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:212)
        at
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:218)
        at
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:184)
        at
org.apache.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:212)
        at
org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:340)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:106)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:143)
        at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at
org.apache.stanbol.enhancer.engines.tika.TikaEngine$1.run(TikaEngine.java:275)
        at java.security.AccessController.doPrivileged(Native Method)
        at
org.apache.stanbol.enhancer.engines.tika.TikaEngine.computeEnhancements(TikaEngine.java:256)
        at
org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.processEvent(EnhancementJobHandler.java:280)
        at
org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.handleEvent(EnhancementJobHandler.java:198)
        at
org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415)
        at
org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:118)
        at
org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run(AsyncDeliverTasks.java:159)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
Execution of Chain tikaChain failed after 14ms for ContentItem
<urn:content-item-sha1-11f821c604fcbcbaeca7a1c29909065ac7fedafc>
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
 finished:     true
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
 state:        failed
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
 chain:        tikaChain
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
 content-item:
<urn:content-item-sha1-11f821c604fcbcbaeca7a1c29909065ac7fedafc>
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
executions:
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl     -
tika completed
21.06.2016 17:02:56.824 *INFO* [qtp158698819-2677]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl Error
Message: Enhancement Chain failed because of required Engi        at
org.apache.stanbol.enhancer.jersey.resource.AbstractEnhancerResource.enhanceFromData(AbstractEnhancerResource.java:213)
        at sun.reflect.GeneratedMethodAccessor105.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at
org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81)
        at
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:151)
        at
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:171)
        at
org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:152)
        at
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:104)
        at
org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:406)
        at
org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:350)
        at
org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:106)
        at
org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:259)
        ... 59 common frames omitted
Caused by: java.lang.IllegalStateException: Unexpected Exception while
processing ContentItem
<urn:content-item-sha1-11f821c604fcbcbaeca7a1c29909065ac7fedafc> with
EnhancementJobManager: class
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
        at
org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.handleEvent(EnhancementJobHandler.java:204)
        at
org.apache.felix.eventadmin.impl.handler.EventHandlerProxy.sendEvent(EventHandlerProxy.java:415)
        at
org.apache.felix.eventadmin.impl.tasks.SyncDeliverTasks.execute(SyncDeliverTasks.java:118)
        at
org.apache.felix.eventadmin.impl.tasks.AsyncDeliverTasks$TaskExecuter.run(AsyncDeliverTasks.java:159)
        at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        ... 1 common frames omitted
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.pdfbox.pdmodel.PDPage
        at
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:212)
        at
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:218)
        at
org.apache.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:184)
        at
org.apache.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:212)
        at
org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:340)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:106)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:143)
        at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at
org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
        at
org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at
org.apache.stanbol.enhancer.engines.tika.TikaEngine$1.run(TikaEngine.java:275)
        at java.security.AccessController.doPrivileged(Native Method)
        at
org.apache.stanbol.enhancer.engines.tika.TikaEngine.computeEnhancements(TikaEngine.java:256)
        at
org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.processEvent(EnhancementJobHandler.java:280)
        at
org.apache.stanbol.enhancer.jobmanager.event.impl.EnhancementJobHandler.handleEvent(EnhancementJobHandler.java:198)
        ... 8 common frames omitted
21.06.2016 17:02:56.825 *WARN* [qtp158698819-2677]
org.eclipse.jetty.server.HttpChannel Could not send response error 500:
javax.servlet.ServletException:
org.glassfish.jersey.server.ContainerException:
org.apache.stanbol.enhancer.servicesapi.ChainException: Enhancement Chain
failed because of required Engine 'tika' failed with Message: Unable to
process ContentItem
'<urn:content-item-sha1-11f821c604fcbcbaeca7a1c29909065ac7fedafc>' with
Enhancement Engine 'tika' because the engine is currently not
active(Reason: Unexpected Exception while processing ContentItem
<urn:content-item-sha1-11f821c604fcbcbaeca7a1c29909065ac7fedafc> with
EnhancementJobManager: class
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl)!

I think the root problem here is "Could not initialize class
org.apache.pdfbox.pdmodel.PDPage". Because of that, the tika engine
fails, and because of that failure it is then reported as not active.
Other file formats, such as Word documents and Excel spreadsheets, work
fine.
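
For context, as far as I understand it, a NoClassDefFoundError with the
message "Could not initialize class ..." usually means the class's static
initializer already failed on an earlier access, and the JVM now refuses
to try again. If so, the original cause would typically be logged earlier
(often as an ExceptionInInitializerError) and not in the trace above. Here
is a minimal sketch of that JVM behaviour. StaticInitDemo and Fragile are
made-up names for illustration only and have nothing to do with the actual
PDFBox code:

```java
public class StaticInitDemo {

    // Hypothetical stand-in for a class whose static initializer fails.
    static class Fragile {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        static int compute() {
            throw new IllegalStateException("simulated failure during static init");
        }
    }

    // Accesses Fragile twice and records which Throwable type each access raised.
    static String[] probe() {
        String[] seen = new String[2];
        for (int i = 0; i < 2; i++) {
            try {
                int v = Fragile.VALUE;
                seen[i] = "none";
            } catch (Throwable t) {
                seen[i] = t.getClass().getSimpleName();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = probe();
        System.out.println("first access:  " + seen[0]);
        System.out.println("second access: " + seen[1]);
    }
}
```

The first access reports the real failure and every later access only
reports that the class could not be initialized, so the first occurrence
in the log after a restart is probably the one worth looking at.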

I'm wondering if anyone else has had this problem. I am using a build
checked out from trunk that is around 2 months old, so version 1.0, not
0.12. I am also using a custom launcher, and I can provide details on it
if anyone thinks it might be relevant, although I basically just stripped
out the External Engines.

If there is anything else I can provide, just ask.

Best Regards,
Antero Duarte
