[
https://issues.apache.org/jira/browse/TIKA-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346160#comment-16346160
]
Matt Sheppard edited comment on TIKA-2559 at 1/31/18 2:36 AM:
--
[
https://issues.apache.org/jira/browse/TIKA-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-2559:
Attachment: acrobat-xi-pdf-accessibility-overview.pdf
> Expose language metadata from PDF documents
>
[
https://issues.apache.org/jira/browse/TIKA-2559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346160#comment-16346160
]
Matt Sheppard commented on TIKA-2559:
-
Ah,
[http://itaccessibility.arizona.edu/sites/i
Matt Sheppard created TIKA-2559:
---
Summary: Expose language metadata from PDF documents
Key: TIKA-2559
URL: https://issues.apache.org/jira/browse/TIKA-2559
Project: Tika
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/TIKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14733239#comment-14733239
]
Matt Sheppard commented on TIKA-1730:
-
File in question can be downloaded from
https:/
[
https://issues.apache.org/jira/browse/TIKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-1730:
Description:
Noticed while upgrading from Tika 1.8 to 1.10 - An .xls file linked below,
which used t
[
https://issues.apache.org/jira/browse/TIKA-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-1730:
Description:
Noticed while upgrading from Tika 1.8 to 1.10 - An .xls file linked below,
which used t
Matt Sheppard created TIKA-1730:
---
Summary: Excel to HTML filtering seems to produce some font
setting gibberish in output
Key: TIKA-1730
URL: https://issues.apache.org/jira/browse/TIKA-1730
Project: Tik
[
https://issues.apache.org/jira/browse/TIKA-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392196#comment-14392196
]
Matt Sheppard commented on TIKA-1590:
-
Great, thanks - Looking forward to 1.8!
> A par
[
https://issues.apache.org/jira/browse/TIKA-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-1590:
Attachment: National_Audit_tool_CTH_Audit_Report_PDF,_292_KB.pdf
jstack.txt
Attached
Matt Sheppard created TIKA-1590:
---
Summary: A particular PDF seems to trigger an infinite loop when
being converted to HTML
Key: TIKA-1590
URL: https://issues.apache.org/jira/browse/TIKA-1590
Project: Ti
[
https://issues.apache.org/jira/browse/TIKA-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13772821#comment-13772821
]
Matt Sheppard commented on TIKA-1174:
-
On further investigation the characters fall in
[
https://issues.apache.org/jira/browse/TIKA-1174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-1174:
Attachment: map_sp_1c_a4.pdf
Attached a copy of the PDF in question in case the site is changed.
Matt Sheppard created TIKA-1174:
---
Summary: Invalid characters in filtered PDF output
Key: TIKA-1174
URL: https://issues.apache.org/jira/browse/TIKA-1174
Project: Tika
Issue Type: Bug
E
[
https://issues.apache.org/jira/browse/TIKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-911:
---
Attachment: Rust Biosecurity Brochure.pdf.html
> Converted PDF document contains question marks in
[
https://issues.apache.org/jira/browse/TIKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266502#comment-13266502
]
Matt Sheppard commented on TIKA-911:
Confirmed that it still occurs for me on a differen
[
https://issues.apache.org/jira/browse/TIKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266486#comment-13266486
]
Matt Sheppard commented on TIKA-911:
Interesting - I was running Mac OS 10.7.3. Will con
[
https://issues.apache.org/jira/browse/TIKA-911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-911:
---
Attachment: Rust Biosecurity Brochure.pdf
Attached PDF document in case it is removed from the source site
Matt Sheppard created TIKA-911:
--
Summary: Converted PDF document contains question marks in place
of spaces and inconsistent case
Key: TIKA-911
URL: https://issues.apache.org/jira/browse/TIKA-911
Project
[
https://issues.apache.org/jira/browse/TIKA-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011000#comment-13011000
]
Matt Sheppard edited comment on TIKA-621 at 3/24/11 11:26 PM:
--
[
https://issues.apache.org/jira/browse/TIKA-621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011000#comment-13011000
]
Matt Sheppard commented on TIKA-621:
I have reported this issue via
http://bugreport.su
[
https://issues.apache.org/jira/browse/TIKA-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-621:
---
Description:
I've run across an RTF document which Tika is failing to convert on 64bit
platforms (Win
[
https://issues.apache.org/jira/browse/TIKA-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matt Sheppard updated TIKA-621:
---
Description:
I've run across an RTF document which Tika is failing to convert on 64bit
platforms (Win
RTF parsing fails with Java 7 early access on 64bit platforms
-
Key: TIKA-621
URL: https://issues.apache.org/jira/browse/TIKA-621
Project: Tika
Issue Type: Bug
Components:
24 matches