[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17524341#comment-17524341
]
Alexander Bias commented on TIKA-2359:
--
=== English version follows ===
Dear Sender,
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076480#comment-16076480
]
Tim Allison commented on TIKA-2359:
---
We should make the warning a static-level-once-only warn, and the
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015808#comment-16015808
]
Chris A. Mattmann commented on TIKA-2359:
-
totally agree! this is good for 1.15! thanks Tim and
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015636#comment-16015636
]
Hudson commented on TIKA-2359:
--
SUCCESS: Integrated in Jenkins build Tika-trunk #1270 (See
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015601#comment-16015601
]
Luis Filipe Nassif commented on TIKA-2359:
--
Hi [~talli...@mitre.org]! I am ok with the message for
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015571#comment-16015571
]
Tim Allison commented on TIKA-2359:
---
How about:
{noformat}
LOG.info("Tesseract OCR is
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008953#comment-16008953
]
Luis Filipe Nassif commented on TIKA-2359:
--
Also, in the long run, disabling by default non pure
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008843#comment-16008843
]
Luis Filipe Nassif commented on TIKA-2359:
--
Thank you Chris! Reviewing Tika-93, the original
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008782#comment-16008782
]
Chris A. Mattmann commented on TIKA-2359:
-
Hi [~lfcnassif] great points.
Your point here:
bq. I
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008774#comment-16008774
]
Luis Filipe Nassif commented on TIKA-2359:
--
Hi Chris, thank you!
I think this issue demonstrates a
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008628#comment-16008628
]
Chris A. Mattmann commented on TIKA-2359:
-
This is a tough one. In general I'd be fine to add a
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008559#comment-16008559
]
Luis Filipe Nassif commented on TIKA-2359:
--
Still +1 to disable OCR by default.
> Extreme slow
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008188#comment-16008188
]
Tim Allison commented on TIKA-2359:
---
bq. Beside that, you should really claim something else for Tika.
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008121#comment-16008121
]
Eugen Mayer commented on TIKA-2359:
---
[~talli...@mitre.org] well, I am very good at shouting - but that's
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008102#comment-16008102
]
Tim Allison commented on TIKA-2359:
---
And in response to this apparent (er, real) dichotomy, to paraphrase
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008094#comment-16008094
]
Tim Allison commented on TIKA-2359:
---
For broader community feedback, I put this to vote on twitter:
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008057#comment-16008057
]
Tim Allison commented on TIKA-2359:
---
The other argument in favor of turning off tesseract by default is
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008048#comment-16008048
]
Luis Filipe Nassif commented on TIKA-2359:
--
The problem is back compat. Some users now expect OCR
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007718#comment-16007718
]
Eugen Mayer commented on TIKA-2359:
---
Guys, as far as I understood, you just explained that you
1. Are not
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007416#comment-16007416
]
Tim Allison commented on TIKA-2359:
---
Sorry, took me a while to dig into this. I hadn't seen our
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007276#comment-16007276
]
Luis Filipe Nassif commented on TIKA-2359:
--
In the past I was against enabling tesseract by
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006750#comment-16006750
]
Eugen Mayer commented on TIKA-2359:
---
Any information on how the binaries are called and how to disable
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006697#comment-16006697
]
Tim Allison commented on TIKA-2359:
---
IIRC, might also want to check ExifTool and Strings...which I think
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006692#comment-16006692
]
Eugen Mayer commented on TIKA-2359:
---
Anyways, case closed, thank you for the quick response
> Extreme
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006689#comment-16006689
]
Eugen Mayer commented on TIKA-2359:
---
Oh holy... seriously? OCR by default, simply by having a lib installed
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006680#comment-16006680
]
Tim Allison commented on TIKA-2359:
---
Yes, Tika will call tesseract on every image file in your document,
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006667#comment-16006667
]
Eugen Mayer commented on TIKA-2359:
---
Interestingly, no, I get an option list:
tesseract
Usage:
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006665#comment-16006665
]
Tim Allison commented on TIKA-2359:
---
Doh, right, tika-app. Thank you.
To confirm, if you type
[
https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006661#comment-16006661
]
Eugen Mayer commented on TIKA-2359:
---
That's my call:
java -jar tika.jar Sample-doc-file-2000kb.doc
So to
29 matches