[
https://issues.apache.org/jira/browse/TIKA-3582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-3582.
-------------------------------
Fix Version/s: 2.1.1
Assignee: Tim Allison
Resolution: Fixed
I added an override in the Tesseract parser for per-parse timeouts.
Eventually, we can get around to deprecating the current methods.
Users can now send {{X-Tika-Timeout-Millis}} as a request header to /rmeta and
/tika. This value cannot be greater than the {{taskTimeoutMillis}} configured
for tika-server. I made this choice out of concern for security: there may be
circumstances where folks hosting the server would not want clients to set
whatever they felt was a reasonable timeout.
So, for now, the server should be configured with the largest
{{taskTimeoutMillis}} desired, and clients should specify a smaller limit
per request.
I'm not held to this decision. Please reopen if this doesn't make sense.
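
For illustration, here is a rough client-side sketch of sending the new header. It is not taken from the Tika code base: the helper names {{build_tika_request}} and {{send_file}} are hypothetical, the URL is the one from the original report, and the 60000 ms value is only an example that assumes the server's {{taskTimeoutMillis}} is at least that large.

```python
import urllib.request

# Endpoint taken from the original report; adjust for your deployment.
TIKA_URL = "http://localhost:9998/tika"


def build_tika_request(data: bytes, timeout_millis: int,
                       url: str = TIKA_URL) -> urllib.request.Request:
    """Build (but do not send) a PUT request with a per-request timeout.

    Hypothetical helper: only the header name X-Tika-Timeout-Millis and the
    /tika endpoint come from the issue; everything else is an assumption.
    """
    if timeout_millis <= 0:
        raise ValueError("timeout_millis must be positive")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"X-Tika-Timeout-Millis": str(timeout_millis)},
    )


def send_file(path: str, timeout_millis: int) -> str:
    """Send a file to tika-server and return the response body as text."""
    with open(path, "rb") as handle:
        request = build_tika_request(handle.read(), timeout_millis)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling {{send_file("sampleimage.png", 60000)}} against a running server would ask for a 60 second limit on that one request. The value has to stay at or below the server's {{taskTimeoutMillis}}.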
> Tika does not respect a configuration value passed over a HTTP Header
> ---------------------------------------------------------------------
>
> Key: TIKA-3582
> URL: https://issues.apache.org/jira/browse/TIKA-3582
> Project: Tika
> Issue Type: Bug
> Components: server
> Affects Versions: 2.1.0
> Reporter: dataminer.accolade
> Assignee: Tim Allison
> Priority: Major
> Fix For: 2.1.1
>
> Attachments: sampleimage.png
>
>
>
> I think the value of TikaServerConfig.TaskTimeoutMillis should be overridden
> for the current request by the *X-Tika-OCRTimeoutSeconds* header. The following
> request takes more than 120 seconds.
> *curl -vvv -X PUT -T sampleimage.png http://localhost:9998/tika --header
> "X-Tika-OCRTimeoutSeconds: 600"*
>
> Tesseract is configured with tessdata_best models.