[
https://issues.apache.org/jira/browse/TIKA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ahmad Ajiloo updated TIKA-713:
--
Attachment: ebrat.pdf
This is a Persian PDF file that Tika can't parse.
Tika can not parse all of
[
https://issues.apache.org/jira/browse/TIKA-713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103394#comment-13103394
]
Robert Muir commented on TIKA-713:
--
Thanks Ahmad... I took a look at this PDF and I suspect
[
https://issues.apache.org/jira/browse/TIKA-708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Maxim Valyanskiy updated TIKA-708:
--
Comment: was deleted
(was: This bug required an additional commit to Tika, r1169702. )
NPE
[
https://issues.apache.org/jira/browse/TIKA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103469#comment-13103469
]
Nick Burch commented on TIKA-431:
-
Any chance someone could work up a failing unit test for
[
https://issues.apache.org/jira/browse/TIKA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13103679#comment-13103679
]
Michael McCandless commented on TIKA-712:
-
OK I opened
[
https://issues.apache.org/jira/browse/TIKA-712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael McCandless updated TIKA-712:
Attachment: testPPT_masterFooter2.pptx
testPPT_masterFooter2.ppt
Corrected
Word art isn't extracted for various doc types
--
Key: TIKA-714
URL: https://issues.apache.org/jira/browse/TIKA-714
Project: Tika
Issue Type: Bug
Reporter: Michael McCandless