[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2022-04-19 Thread Alexander Bias (Jira)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17524341#comment-17524341 ] Alexander Bias commented on TIKA-2359: -- === English version follows === Dear sender,

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-07-06 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16076480#comment-16076480 ] Tim Allison commented on TIKA-2359: --- We should make the warning a static-level-once-only warn, and the
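
A "static-level-once-only" warning, as suggested in the comment above, could look roughly like the following minimal sketch. It is not the change made in Tika: the class name `OnceOnlyWarning`, the method `warnOnce`, and the message text are hypothetical, and `java.util.logging` is used here only to keep the example self-contained.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Hypothetical illustration of a once-only warning; not Tika's actual code.
public class OnceOnlyWarning {
    private static final Logger LOG = Logger.getLogger(OnceOnlyWarning.class.getName());

    // Static flag shared by all instances and threads in this JVM.
    private static final AtomicBoolean WARNED = new AtomicBoolean(false);

    /**
     * Logs the message at WARNING level on the first call only.
     * Returns true if this call emitted the warning, false otherwise.
     */
    public static boolean warnOnce(String message) {
        // compareAndSet lets exactly one caller flip the flag, even under contention.
        if (WARNED.compareAndSet(false, true)) {
            LOG.warning(message);
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) {
            boolean logged = warnOnce("hypothetical warning text");
            System.out.println("call " + i + " logged=" + logged);
        }
    }
}
```

The flag is static, so the warning is emitted once per JVM rather than once per parser instance or per parsed document.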

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-18 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015808#comment-16015808 ] Chris A. Mattmann commented on TIKA-2359: - totally agree! this is good for 1.15! thanks Tim and

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-18 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015636#comment-16015636 ] Hudson commented on TIKA-2359: -- SUCCESS: Integrated in Jenkins build Tika-trunk #1270 (See

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-18 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015601#comment-16015601 ] Luis Filipe Nassif commented on TIKA-2359: -- Hi [~talli...@mitre.org]! I am ok with the message for

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-18 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015571#comment-16015571 ] Tim Allison commented on TIKA-2359: --- How about: {noformat} LOG.info("Tesseract OCR is

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008953#comment-16008953 ] Luis Filipe Nassif commented on TIKA-2359: -- Also, in the long run, disabling by default non pure

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008843#comment-16008843 ] Luis Filipe Nassif commented on TIKA-2359: -- Thank you Chris! Reviewing Tika-93, the original

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008782#comment-16008782 ] Chris A. Mattmann commented on TIKA-2359: - Hi [~lfcnassif] great points. Your point here: bq. I

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008774#comment-16008774 ] Luis Filipe Nassif commented on TIKA-2359: -- Hi Chris, thank you! I think this issue demonstrates a

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008628#comment-16008628 ] Chris A. Mattmann commented on TIKA-2359: - This is a tough one. In general I'd be fine to add a

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008559#comment-16008559 ] Luis Filipe Nassif commented on TIKA-2359: -- Still +1 to disable ocr by default. > Extreme slow

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008188#comment-16008188 ] Tim Allison commented on TIKA-2359: --- bq. Beside that, you should really claim something else for tike.

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008121#comment-16008121 ] Eugen Mayer commented on TIKA-2359: --- [~talli...@mitre.org] well, I am very good at shouting - but that's

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008102#comment-16008102 ] Tim Allison commented on TIKA-2359: --- And in response to this apparent (er, real) dichotomy, to paraphrase

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008094#comment-16008094 ] Tim Allison commented on TIKA-2359: --- For broader community feedback, I put this to vote on twitter:

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008057#comment-16008057 ] Tim Allison commented on TIKA-2359: --- The other argument in favor of turning off tesseract by default is

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16008048#comment-16008048 ] Luis Filipe Nassif commented on TIKA-2359: -- The problem is back compat. Some users now expect ocr

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-12 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007718#comment-16007718 ] Eugen Mayer commented on TIKA-2359: --- Guys as far as i understood you just explained that you 1. Are not

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007416#comment-16007416 ] Tim Allison commented on TIKA-2359: --- Sorry, took me a while to dig into this. I hadn't seen our

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16007276#comment-16007276 ] Luis Filipe Nassif commented on TIKA-2359: -- In the past I was against enabling tesseract by

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006750#comment-16006750 ] Eugen Mayer commented on TIKA-2359: --- Any information on how the binaries are called and how to disable

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006697#comment-16006697 ] Tim Allison commented on TIKA-2359: --- IIRC, might also want to check ExifTool and Strings...which I think

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006692#comment-16006692 ] Eugen Mayer commented on TIKA-2359: --- Anyways, case closed, thank you for the quick response > Extreme

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006689#comment-16006689 ] Eugen Mayer commented on TIKA-2359: --- oh holy..seriously? By default OCR by simply having a lib installed

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006680#comment-16006680 ] Tim Allison commented on TIKA-2359: --- y, Tika will call tesseract on every image file in your document,

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006667#comment-16006667 ] Eugen Mayer commented on TIKA-2359: --- interestingly, no, i get an option list: tesseract Usage:

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006665#comment-16006665 ] Tim Allison commented on TIKA-2359: --- Doh, right, tika-app. Thank you. To confirm, if you type

[jira] [Commented] (TIKA-2359) Extreme slow parsing on the attachment attached

2017-05-11 Thread Eugen Mayer (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006661#comment-16006661 ] Eugen Mayer commented on TIKA-2359: --- That's my call: java -jar tika.jar Sample-doc-file-2000kb.doc So to