All,

I recently noticed that I'm getting this message logged when there is an 
exception during parsing:

SEVERE: Problem with writing the data, class org.apache.tika.server.TikaResource$5, ContentType: text/html

We didn't see this message with Tika 1.6, but we do see it with Tika 1.7 
and trunk.
Is this expected?

The full stack trace is below. The test document that triggered it is an 
encrypted PDF.

WARNING: tika: Text extraction failed
org.apache.tika.exception.TikaException: Unable to extract PDF content
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:150)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:146)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
        at org.apache.tika.parser.ParserDecorator.parse(ParserDecorator.java:117)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:256)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
        at org.apache.tika.server.TikaResource$5.write(TikaResource.java:368)
        at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:164)
        at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1363)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:244)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:117)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:80)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:83)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
        at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:251)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261)
        at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:70)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:370)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException
        at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:109)
        at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:379)
        at org.apache.pdfbox.cos.COSStream.doDecode(COSStream.java:291)
        at org.apache.pdfbox.cos.COSStream.getUnfilteredStream(COSStream.java:225)
        at org.apache.pdfbox.pdfparser.PDFStreamParser.<init>(PDFStreamParser.java:117)
        at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251)
        at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
        at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
        at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:460)
        at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:385)
        at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:344)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:134)
        ... 35 more
Caused by: java.util.zip.DataFormatException: incorrect header check
        at java.util.zip.Inflater.inflateBytes(Native Method)
        at java.util.zip.Inflater.inflate(Unknown Source)
        at java.util.zip.Inflater.inflate(Unknown Source)
        at org.apache.pdfbox.filter.FlateFilter.decompress(FlateFilter.java:128)
        at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:101)
        ... 46 more

Feb 27, 2015 9:27:33 AM org.apache.cxf.jaxrs.utils.JAXRSUtils logMessageHandlerProblem
SEVERE: Problem with writing the data, class org.apache.tika.server.TikaResource$5, ContentType: text/html
Feb 27, 2015 9:27:33 AM org.apache.cxf.jaxrs.impl.WebApplicationExceptionMapper toResponse
WARNING: javax.ws.rs.WebApplicationException: HTTP 500 Internal Server Error
        at org.apache.tika.server.TikaResource$5.write(TikaResource.java:397)
        at org.apache.cxf.jaxrs.provider.BinaryDataProvider.writeTo(BinaryDataProvider.java:164)
        at org.apache.cxf.jaxrs.utils.JAXRSUtils.writeMessageBody(JAXRSUtils.java:1363)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.serializeMessage(JAXRSOutInterceptor.java:244)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.processResponse(JAXRSOutInterceptor.java:117)
        at org.apache.cxf.jaxrs.interceptor.JAXRSOutInterceptor.handleMessage(JAXRSOutInterceptor.java:80)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.interceptor.OutgoingChainInterceptor.handleMessage(OutgoingChainInterceptor.java:83)
        at org.apache.cxf.phase.PhaseInterceptorChain.doIntercept(PhaseInterceptorChain.java:307)
        at org.apache.cxf.transport.ChainInitiationObserver.onMessage(ChainInitiationObserver.java:121)
        at org.apache.cxf.transport.http.AbstractHTTPDestination.invoke(AbstractHTTPDestination.java:251)
        at org.apache.cxf.transport.http_jetty.JettyHTTPDestination.doService(JettyHTTPDestination.java:261)
        at org.apache.cxf.transport.http_jetty.JettyHTTPHandler.handle(JettyHTTPHandler.java:70)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1088)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1024)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
        at org.eclipse.jetty.server.Server.handle(Server.java:370)
        at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
        at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:982)
        at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1043)
        at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:865)
        at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
        at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
        at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
        at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
        at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
        at java.lang.Thread.run(Unknown Source)
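
For anyone skimming the traces, the sequence appears to be: the parse runs 
inside the streaming write() callback, the parse failure is logged as a 
WARNING and rethrown as an HTTP 500, and the framework then logs its own 
SEVERE line because the failure surfaced while it was writing the response 
body. Below is a minimal, stdlib-only sketch of that pattern. It is not Tika 
or CXF code: StreamingBody, HttpStatusException, bodyFor, parse and serve are 
hypothetical stand-ins, and the log strings are abbreviated.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

public class StreamingFailureSketch {

    /** Hypothetical stand-in for a JAX-RS streaming response body. */
    interface StreamingBody {
        void write(OutputStream out) throws IOException;
    }

    /** Hypothetical stand-in for an exception that carries an HTTP status. */
    static class HttpStatusException extends RuntimeException {
        final int status;

        HttpStatusException(int status, Throwable cause) {
            super("HTTP " + status, cause);
            this.status = status;
        }
    }

    /** Log lines are collected in memory so their order can be inspected. */
    static final List<String> LOG = new ArrayList<>();

    /** Resource side: parsing is deferred until the body is written. */
    static StreamingBody bodyFor(byte[] document) {
        return out -> {
            try {
                parse(document, out);
            } catch (IOException e) {
                // First log line: the resource reports the parse failure.
                LOG.add("WARNING: Text extraction failed");
                throw new HttpStatusException(500, e);
            }
        };
    }

    /** Hypothetical parser that always fails, standing in for the bad PDF. */
    static void parse(byte[] document, OutputStream out) throws IOException {
        throw new IOException("Unable to extract content");
    }

    /** Framework side: runs the body and reports failures seen while writing. */
    static int serve(StreamingBody body) {
        try {
            body.write(new ByteArrayOutputStream());
            return 200;
        } catch (HttpStatusException e) {
            // Second log line: the framework reports the failed write.
            LOG.add("SEVERE: Problem with writing the data");
            return e.status;
        } catch (IOException e) {
            LOG.add("SEVERE: Problem with writing the data");
            return 500;
        }
    }

    public static void main(String[] args) {
        int status = serve(bodyFor(new byte[0]));
        System.out.println(status);
        LOG.forEach(System.out::println);
    }
}
```

If that reading is right, the SEVERE line would be a side effect of where the 
exception is thrown (inside the body writer) rather than a second, separate 
failure.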

Best,

     Tim
