[jira] [Commented] (TIKA-3267) Method getEnableImageProcessing() in TesseractOCRConfig should be renamed

2021-01-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263716#comment-17263716 ] Hudson commented on TIKA-3267: UNSTABLE: Integrated in Jenkins build Tika » tika-main-jdk8 #

[jira] [Resolved] (TIKA-3267) Method getEnableImageProcessing() in TesseractOCRConfig should be renamed

2021-01-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-3267. Fix Version/s: 2.0.0; Assignee: Tim Allison; Resolution: Fixed > Method getEnableImagePr

[jira] [Commented] (TIKA-3267) Method getEnableImageProcessing() in TesseractOCRConfig should be renamed

2021-01-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263661#comment-17263661 ] Tim Allison commented on TIKA-3267: There are numerous cases of this throughout the cod

[jira] [Commented] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263644#comment-17263644 ] Tim Allison commented on TIKA-3270: Hahaha...the former...do we have one within our sma

[jira] [Commented] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263638#comment-17263638 ] Tilman Hausherr commented on TIKA-3270: If your question was whether there is a met

[jira] [Commented] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263634#comment-17263634 ] Tilman Hausherr commented on TIKA-3270: Don't know, but I just created one by print

[jira] [Updated] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated TIKA-3270: Attachment: tiger.pdf > Render non-text in PDFs for OCR

[jira] [Commented] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263623#comment-17263623 ] Tim Allison commented on TIKA-3270: Thank you [~tilman]! Are examples of vector graph

[jira] [Commented] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263618#comment-17263618 ] Tilman Hausherr commented on TIKA-3270: {quote} to render only the image components

[jira] [Commented] (TIKA-3271) Change default image resize size in TesseractParser's pre-processing step

2021-01-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263576#comment-17263576 ] Tim Allison commented on TIKA-3271: 900% was the original value added as part of TIKA-2

[jira] [Created] (TIKA-3271) Change default image resize size in TesseractParser's pre-processing step

2021-01-12 Thread Tim Allison (Jira)
Tim Allison created TIKA-3271: Summary: Change default image resize size in TesseractParser's pre-processing step; Key: TIKA-3271; URL: https://issues.apache.org/jira/browse/TIKA-3271; Project: Tika

[jira] [Commented] (TIKA-3258) Run OCR on PDFs with 'auto' mode as default in Tika 2.0.0

2021-01-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17263437#comment-17263437 ] Tim Allison commented on TIKA-3258: Opened TIKA-3270 to track [~lfcnassif]'s recommenda

[jira] [Created] (TIKA-3270) Render non-text in PDFs for OCR

2021-01-12 Thread Tim Allison (Jira)
Tim Allison created TIKA-3270: Summary: Render non-text in PDFs for OCR; Key: TIKA-3270; URL: https://issues.apache.org/jira/browse/TIKA-3270; Project: Tika; Issue Type: Improvement; Repo