[ 
https://issues.apache.org/jira/browse/TIKA-3114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134638#comment-17134638
 ] 

Tilman Hausherr commented on TIKA-3114:
---------------------------------------

[~dbalasub] Your stack trace does not contain any Tika frames; the last frame 
shown is "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse". 
That method must have been called from Tika somewhere, so please post the 
complete stack trace.

Alternatively, open your file with tika-app and see what happens, with 
something like "java -jar tika-app.jar yourfile.pdf".
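
To illustrate the first point, here is a minimal sketch that reproduces this kind of Xerces error using only the JDK XML APIs. It is not taken from Tika: the class name, method name and sample markup are made up for illustration. It prints the full stack trace, so the frames below DocumentBuilder.parse show which code invoked the parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not part of Tika.
class UnclosedDivDemo {

    // Parses the given markup as XML and returns the parser's error message,
    // or null if the markup is well-formed.
    static String parseError(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            builder.parse(new InputSource(new StringReader(markup)));
            return null;
        } catch (SAXParseException e) {
            // Print every frame, including the caller of DocumentBuilder.parse.
            e.printStackTrace();
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        // The <div> is never closed, so the parser fails at </body>.
        System.out.println(parseError("<html><body><div><p>text</p></body></html>"));
    }
}
```

In the printed trace, the frames after DocumentBuilderImpl.parse belong to UnclosedDivDemo. In a trace from a real application, those are the frames that would show whether the call came from Tika or from other code.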

> Error reading transcript from document
> --------------------------------------
>
>                 Key: TIKA-3114
>                 URL: https://issues.apache.org/jira/browse/TIKA-3114
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.18
>            Reporter: Dushyanth Balasubramanian
>            Priority: Major
>
> [Fatal Error] :1547:3: The element type "div" must be terminated by the 
> matching end-tag "</div>".
> [Fatal Error] :1547:3: The element type "div" must be terminated by the 
> matching end-tag "</div>".
> org.xml.sax.SAXParseException; lineNumber: 1547; columnNumber: 3; The element 
> type "div" must be terminated by the matching end-tag "</div>".
>  at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:257)
>  at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
