[jira] [Commented] (TIKA-1194) Missing text from MS Word (DOC) file

2015-03-20 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371960#comment-14371960 ] Tomas Safarik commented on TIKA-1194: Sorry, but no. 1) I don't have the source code.

[jira] [Updated] (TIKA-1194) Missing text from MS Word (DOC) file

2015-03-18 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik updated TIKA-1194: Attachment: apache-tika-1.5.patch Just for information: a patch of our changes that works around the pro

[jira] [Commented] (TIKA-1194) Missing text from MS Word (DOC) file

2015-03-16 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363062#comment-14363062 ] Tomas Safarik commented on TIKA-1194: I was finally able to prepare a version of the documen

[jira] [Comment Edited] (TIKA-1194) Missing text from MS Word (DOC) file

2015-03-16 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363062#comment-14363062 ] Tomas Safarik edited comment on TIKA-1194 at 3/16/15 10:57 AM: --

[jira] [Updated] (TIKA-1194) Missing text from MS Word (DOC) file

2015-03-16 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik updated TIKA-1194: Attachment: OP-06-015.doc > Missing text from MS Word (DOC) file > --

[jira] [Reopened] (TIKA-1194) Missing text from MS Word (DOC) file

2015-03-16 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik reopened TIKA-1194: Sorry for my late response. I still do not have a file I can upload. I did more testing and the bug is

[jira] [Commented] (TIKA-1194) Missing text from MS Word (DOC) file

2013-11-12 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820139#comment-13820139 ] Tomas Safarik commented on TIKA-1194: I can see the text missing in Apache POI WordToT

[jira] [Commented] (TIKA-1194) Missing text from MS Word (DOC) file

2013-11-12 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820080#comment-13820080 ] Tomas Safarik commented on TIKA-1194: Sorry, I needed to remove the document because it

[jira] [Updated] (TIKA-1194) Missing text from MS Word (DOC) file

2013-11-12 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik updated TIKA-1194: Attachment: (was: OP-06-015.doc) > Missing text from MS Word (DOC) file > --

[jira] [Updated] (TIKA-1194) Missing text from MS Word (DOC) file

2013-11-12 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik updated TIKA-1194: Description: Hello, we noticed that filtered text from some MS Word DOC files is missing one line

[jira] [Updated] (TIKA-1194) Missing text from MS Word (DOC) file

2013-11-12 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik updated TIKA-1194: Attachment: OP-06-015.doc "mluvil s Milouškem: poslat nabídku" is the problematic line/cell > Missi

[jira] [Created] (TIKA-1194) Missing text from MS Word (DOC) file

2013-11-12 Thread Tomas Safarik (JIRA)
Tomas Safarik created TIKA-1194: --- Summary: Missing text from MS Word (DOC) file Key: TIKA-1194 URL: https://issues.apache.org/jira/browse/TIKA-1194 Project: Tika Issue Type: Bug Compo

[jira] [Commented] (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly.

2012-07-24 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421711#comment-13421711 ] Tomas Safarik commented on TIKA-431: Hello, it seems that I created a duplicate issue TIK

[jira] [Closed] (TIKA-952) HTML meta tags ignored for encoding detection

2012-07-24 Thread Tomas Safarik (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tomas Safarik closed TIKA-952. -- Resolution: Duplicate > HTML meta tags ignored for encoding detection > -

[jira] [Created] (TIKA-952) HTML meta tags ignored for encoding detection

2012-07-10 Thread Tomas Safarik (JIRA)
Tomas Safarik created TIKA-952: -- Summary: HTML meta tags ignored for encoding detection Key: TIKA-952 URL: https://issues.apache.org/jira/browse/TIKA-952 Project: Tika Issue Type: Bug