[ 
https://issues.apache.org/jira/browse/STANBOL-217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13043435#comment-13043435
 ] 

Florent ANDRE commented on STANBOL-217:
---------------------------------------

It seems you found the right class!

On your point about overriding the value: I had a look at the configuration 
manager / Apache Felix Jetty Based Http Service, but I don't see a configuration 
option for it. Reading the class shows that the value is hard-coded.
For now I don't see how to deal with it directly through the API.

I did not discover this timeout error through the ContentHub, but through the 
/engines endpoint.

The main problem with this timeout is that it is a global value for the whole 
processing chain, not a value per engine.

So even if each engine does its job in less than 1 minute, a chain of many 
engines working on a particularly big document (let's say a 400-page PhD 
thesis) can reach the "1 minute" limit.
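
To make the arithmetic concrete, here is a minimal sketch in plain Java. It is 
not Stanbol code: the class name, the method names and the per-engine durations 
are hypothetical, and it assumes that the engines run one after another and 
that the limit is hit as soon as the total reaches 1 minute.

```java
import java.time.Duration;
import java.util.List;

// Hypothetical illustration only; not part of the Stanbol API.
final class ChainTimeoutSketch {
    // Assumed global limit: the "1 minute" observed in this issue.
    static final Duration GLOBAL_LIMIT = Duration.ofMinutes(1);

    // Sum of the per-engine processing times (assumes sequential execution).
    static Duration totalChainTime(List<Duration> perEngine) {
        Duration total = Duration.ZERO;
        for (Duration d : perEngine) {
            total = total.plus(d);
        }
        return total;
    }

    // True when the whole chain reaches or passes the global limit.
    static boolean exceedsGlobalLimit(List<Duration> perEngine) {
        return totalChainTime(perEngine).compareTo(GLOBAL_LIMIT) >= 0;
    }

    public static void main(String[] args) {
        // Made-up durations: every engine is well under 1 minute on its own.
        List<Duration> engines = List.of(
                Duration.ofSeconds(25),
                Duration.ofSeconds(20),
                Duration.ofSeconds(30));
        System.out.println("total seconds: " + totalChainTime(engines).getSeconds());
        System.out.println("hits global limit: " + exceedsGlobalLimit(engines));
    }
}
```

With a per-engine limit, none of these three engines would be reported as too 
slow; with a single limit on the whole chain, the request as a whole is.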

I may be wrong, but this seems to me an important limitation, and one that can 
cause unpredictable errors depending on the size of the document and the number 
of entities in it (even if everything is offline).

> Plateform timeout when engine process more than 1 minute
> --------------------------------------------------------
>
>                 Key: STANBOL-217
>                 URL: https://issues.apache.org/jira/browse/STANBOL-217
>             Project: Stanbol
>          Issue Type: Bug
>          Components: Enhancer
>            Reporter: Florent ANDRE
>         Attachments: global-timeout.patch
>
>
> After a first mail [1], I did some cleanup and investigation, and it seems 
> that some kind of timeout occurs somewhere in the platform when engine 
> processing takes more than 1 minute.
> This does not seem related to an uncaught Throwable or OutOfMemoryError, or at 
> least nothing appears in the logs or console.
> The timeout only occurs on the browser (client) side; the engine continues 
> to process.
> Using this curl call:
>   time curl -m 600 -X POST -H "Accept: text/turtle" \
>        -H "Content-type: text/plain" \
>        -F "data=@document-important" http://localhost:8080/engines
> a browser timeout can be ruled out as the usual suspect, since -m means:
>   -m/--max-time <seconds>
>       Maximum time in seconds that you allow the whole operation to take.
>       This is useful for preventing your batch jobs from hanging for hours
>       due to slow networks or links going down.  See also the
>       --connect-timeout option.
> In order to test this properly, I created an engine that does nothing more 
> than wait. This engine is really simple and has just one parameter, which sets 
> the wait period in ms. By default it is set to 1 minute (60000), which 
> triggers the bug. If you set the wait time lower (57000 for example), it works.
> 1 minute of processing is significant, but not huge if we consider big text 
> files and a complete enhancement chain.
> [1] 
> http://mail-archives.apache.org/mod_mbox/incubator-stanbol-dev/201105.mbox/%[email protected]%3E
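
For readers who want to reproduce this, the waiting engine described in the 
quoted report could look roughly like the sketch below. This is plain Java and 
an assumption about its shape, not the actual test engine and not the Stanbol 
EnhancementEngine API: the class name, the constructor and the process() method 
are hypothetical. Only the single wait parameter in ms and its 60000 default 
come from the report.

```java
// Hypothetical stand-in for the "do nothing but wait" test engine.
final class WaitingEngineSketch {
    // Default from the report: 1 minute, the value that triggers the timeout.
    static final long DEFAULT_WAIT_MILLIS = 60_000L;

    private final long waitMillis;

    WaitingEngineSketch() {
        this(DEFAULT_WAIT_MILLIS);
    }

    WaitingEngineSketch(long waitMillis) {
        if (waitMillis < 0) {
            throw new IllegalArgumentException("waitMillis must be >= 0");
        }
        this.waitMillis = waitMillis;
    }

    long getWaitMillis() {
        return waitMillis;
    }

    // Sleeps for the configured period and returns the elapsed time in ms.
    long process() {
        long start = System.nanoTime();
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return early.
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Optional first argument: wait period in ms (e.g. 57000).
        long wait = args.length > 0 ? Long.parseLong(args[0]) : DEFAULT_WAIT_MILLIS;
        long elapsed = new WaitingEngineSketch(wait).process();
        System.out.println("waited " + elapsed + " ms");
    }
}
```

Running it with 57000 and then with the default mirrors the two cases compared 
in the report.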

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
