[
https://issues.apache.org/jira/browse/TIKA-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809316#comment-16809316
]
Tim Allison commented on TIKA-2847:
---
The main {{document.xml}} decompresses to ~100MB...which is not
Ashish Tiwari created TIKA-2847:
---
Summary: OutOfMemoryError - tika1.19.1.jar
Key: TIKA-2847
URL: https://issues.apache.org/jira/browse/TIKA-2847
Project: Tika
Issue Type: Bug
Affects
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808996#comment-16808996
]
Hudson commented on TIKA-2846:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1639 (See
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808966#comment-16808966
]
Hudson commented on TIKA-2846:
--
UNSTABLE: Integrated in Jenkins build tika-2.x-windows #395 (See
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808939#comment-16808939
]
Hudson commented on TIKA-2846:
--
SUCCESS: Integrated in Jenkins build tika-branch-1x #176 (See
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808893#comment-16808893
]
Tim Allison commented on TIKA-2846:
---
Thank you, again, [~tilman]!
> Add per page unicode mapping stats
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2846.
---
Resolution: Fixed
Assignee: Tim Allison
Fix Version/s: 1.21
> Add per page unicode
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2846:
--
Description:
As part of TIKA-2749, it would be useful to gather stats on characters that did
not have
[
https://issues.apache.org/jira/browse/TIKA-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2846:
--
Description:
As part of TIKA-2749, it would be useful to gather stats on characters that did
not have
Tim Allison created TIKA-2846:
-
Summary: Add per page unicode mapping stats to the metadata in the
PDFParser
Key: TIKA-2846
URL: https://issues.apache.org/jira/browse/TIKA-2846
Project: Tika
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808821#comment-16808821
]
Hudson commented on TIKA-2845:
--
SUCCESS: Integrated in Jenkins build tika-branch-1x #175 (See
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808818#comment-16808818
]
Hudson commented on TIKA-2845:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1638 (See
[
https://issues.apache.org/jira/browse/TIKA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808791#comment-16808791
]
Tim Allison commented on TIKA-2749:
---
Thank you, [~tilman]!
> OCR on PDFs should "just work" out of the
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808787#comment-16808787
]
Hudson commented on TIKA-2845:
--
UNSTABLE: Integrated in Jenkins build tika-2.x-windows #394 (See
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-2845.
---
Resolution: Fixed
Assignee: Tim Allison
Fix Version/s: 1.21
> Override ProcessPages
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2845:
--
Description:
On the PDFBox user list, [~lehmi] confirmed (and [~tilman] clarified) that
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808708#comment-16808708
]
Tim Allison edited comment on TIKA-2845 at 4/3/19 1:17 PM:
---
The attached file
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808708#comment-16808708
]
Tim Allison commented on TIKA-2845:
---
The attached file opens in Adobe, has no "contents" element but
[
https://issues.apache.org/jira/browse/TIKA-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-2845:
--
Attachment: testPDFFileEmbInAnnotation_noContents.pdf
> Override ProcessPages in PDFTextStripper
>
Tim Allison created TIKA-2845:
-
Summary: Override ProcessPages in PDFTextStripper
Key: TIKA-2845
URL: https://issues.apache.org/jira/browse/TIKA-2845
Project: Tika
Issue Type: Task
[
https://issues.apache.org/jira/browse/TIKA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808400#comment-16808400
]
Tilman Hausherr commented on TIKA-2749:
---
See the accepted answer here: