[
https://issues.apache.org/jira/browse/TIKA-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Tran updated TIKA-423:
Attachment: tika_test.docx, output.txt
Test docx file and the output produced by Tika 0.7
Parse docx and output to text file missing words
Key: TIKA-423
URL: https://issues.apache.org/jira/browse/TIKA-423
Project: Tika
Issue Type: Bug
Components: parser
Affects Versio
[
https://issues.apache.org/jira/browse/TIKA-423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Tran updated TIKA-423:
Description:
I created a Word document using Word 2007 on a Windows Server 2003 machine
(using Remote deskto
[
https://issues.apache.org/jira/browse/TIKA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867989#action_12867989
]
Ken Krugler commented on TIKA-420:
--
Also, do you have a small set of HTML documents that cou
[
https://issues.apache.org/jira/browse/TIKA-420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867988#action_12867988
]
Ken Krugler commented on TIKA-420:
--
Hi Christian,
I took a look at the patch just now. I'd