[ 
https://issues.apache.org/jira/browse/TIKA-1555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329474#comment-14329474
 ] 

Uwe Schindler commented on TIKA-1555:
-------------------------------------

bq. You can also disable OCR by setting the Tesseract path to "" in the
TesseractOCRConfig.

This did not work. I would be happy if this disabled the fork, but it only 
disables the parser as a side effect: it still tries to fork, using an invalid 
process path built from the empty string plus some suffix.
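
To illustrate the difference, here is a minimal, self-contained sketch. It is 
not Tika's actual code: the class OcrCheckSketch, the methods buildCommand and 
hasOcrTool, the "enabled" switch, and the injected launcher are all hypothetical 
names used only for illustration. It assumes the check command is formed by 
concatenating the configured path with the executable name, so an empty path 
still yields a command that gets launched, whereas an explicit switch returns 
before any launch is attempted.

```java
import java.util.function.Predicate;

public class OcrCheckSketch {

    // Hypothetical: joins the configured directory prefix and the executable name.
    // With an empty prefix the result is just the bare executable name.
    public static String buildCommand(String configuredPath, String executable) {
        return configuredPath + executable;
    }

    // Hypothetical availability check. 'enabled' is an assumed switch and
    // 'launcher' stands in for whatever actually starts the external process.
    public static boolean hasOcrTool(boolean enabled, String configuredPath,
                                     Predicate<String> launcher) {
        if (!enabled) {
            return false; // short-circuit: the launcher is never invoked
        }
        return launcher.test(buildCommand(configuredPath, "tesseract"));
    }

    public static void main(String[] args) {
        int[] launches = {0};
        // Counting stand-in for a real process launch; always reports failure.
        Predicate<String> countingLauncher = cmd -> {
            launches[0]++;
            return false;
        };

        // An empty path alone still reaches the launcher.
        hasOcrTool(true, "", countingLauncher);
        // An explicit switch avoids the launch entirely.
        hasOcrTool(false, "", countingLauncher);

        System.out.println("launch attempts: " + launches[0]);
    }
}
```

In this sketch only the first call reaches the launcher, which is the behaviour 
asked for above: the check reports "not available" without a process ever 
being started.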

> posix_spawn is not a supported process launch mechanism on this platform
> ------------------------------------------------------------------------
>
>                 Key: TIKA-1555
>                 URL: https://issues.apache.org/jira/browse/TIKA-1555
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.7
>         Environment: MacOS X 10.10.2
>            Reporter: David Pilato
>            Assignee: Tyler Palsulich
>              Labels: ocr, parser
>
> It could happen on some systems that posix_spawn is not a supported process 
> launch mechanism.
> We are doing random testing which simulates different kinds of Locale, so I 
> could sometimes hit that issue:
> {code}
> java.lang.Error: posix_spawn is not a supported process launch mechanism on 
> this platform.
>       at java.lang.UNIXProcess$1.run(UNIXProcess.java:104)
>       at java.lang.UNIXProcess$1.run(UNIXProcess.java:93)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at java.lang.UNIXProcess.<clinit>(UNIXProcess.java:91)
>       at java.lang.ProcessImpl.start(ProcessImpl.java:130)
>       at java.lang.ProcessBuilder.start(ProcessBuilder.java:1022)
>       at java.lang.Runtime.exec(Runtime.java:617)
>       at java.lang.Runtime.exec(Runtime.java:485)
>       at 
> org.apache.tika.parser.external.ExternalParser.check(ExternalParser.java:344)
>       at 
> org.apache.tika.parser.ocr.TesseractOCRParser.hasTesseract(TesseractOCRParser.java:117)
>       at 
> org.apache.tika.parser.ocr.TesseractOCRParser.getSupportedTypes(TesseractOCRParser.java:90)
>       at 
> org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
>       at 
> org.apache.tika.parser.DefaultParser.getParsers(DefaultParser.java:95)
>       at 
> org.apache.tika.parser.CompositeParser.getSupportedTypes(CompositeParser.java:229)
>       at 
> org.apache.tika.parser.CompositeParser.getParsers(CompositeParser.java:81)
>       at 
> org.apache.tika.parser.CompositeParser.getParser(CompositeParser.java:209)
>       at 
> org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
>       at 
> org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
>       at org.apache.tika.Tika.parseToString(Tika.java:506)
> {code}
> It sounds like it's related to this: 
> http://java.thedizzyheights.com/2014/07/java-error-posix_spawn-is-not-a-supported-process-launch-mechanism-on-this-platform-when-trying-to-spawn-a-process/
> Though I have a hard time reproducing it!
> BTW I wonder if we could add a setting which can return {{false}} for 
> {{TesseractOCRParser#hasTesseract}} even if we have tesseract available.
> For example, let's say that my machine hosts multiple applications and for one 
> of them I don't want any OCR on my documents.
> Hope this helps.
> Let me know if you need more details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
