[ 
https://issues.apache.org/jira/browse/S4-87?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421494#comment-13421494
 ] 

Daniel Gómez Ferro commented on S4-87:
--------------------------------------

I managed to reproduce it with a fetch task that times out, so subsequent tasks 
are rejected.

The proposed patch fixes it. +1
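
For reference, the rejection can be reproduced outside S4 with a small sketch. This is not the SafeKeeper code: it assumes a single-threaded ThreadPoolExecutor with a handoff (SynchronousQueue) queue and the default AbortPolicy, which is what the stack trace below suggests. The class name HandoffRejectionDemo and the latch that stands in for a stuck fetch are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class HandoffRejectionDemo {

    /** Returns true if a second submit is rejected while the only worker is stuck. */
    public static boolean secondSubmitRejected() {
        // Assumed setup: one thread, handoff queue, default AbortPolicy.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<Runnable>());
        final CountDownLatch started = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        try {
            // Stand-in for a fetch that does not return: it holds the only worker thread.
            executor.submit(new Runnable() {
                public void run() {
                    started.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            started.await(); // make sure the worker is busy before submitting again
            try {
                // No idle thread can take the handoff and no new thread may be created.
                executor.submit(new Runnable() {
                    public void run() { }
                });
                return false;
            } catch (RejectedExecutionException e) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let the stuck task finish
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("second submit rejected: " + secondSubmitRejected());
    }
}
```

A handoff queue holds no tasks, so any submission that arrives while the single worker is still occupied goes straight to the rejection handler.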
                
> Checkpointing: recovery : avoid rejections upon fetching
> --------------------------------------------------------
>
>                 Key: S4-87
>                 URL: https://issues.apache.org/jira/browse/S4-87
>             Project: Apache S4
>          Issue Type: Bug
>    Affects Versions: 0.5
>            Reporter: Matthieu Morel
>            Assignee: Matthieu Morel
>
> Tests pass on Mac OS X with JDK 1.6.0_33 but fail on Ubuntu with the same 
> JDK version (Oracle).
> Here is the stack trace (I added some logging to see the error):
> {code}
> java.util.concurrent.RejectedExecutionException: null
>       at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:1768) ~[na:1.6.0_33]
>       at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:767) ~[na:1.6.0_33]
>       at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:658) ~[na:1.6.0_33]
>       at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:92) ~[na:1.6.0_33]
>       at org.apache.s4.core.ft.SafeKeeper.fetchSerializedState(SafeKeeper.java:239) ~[main/:na]
>       at org.apache.s4.core.ProcessingElement.recover(ProcessingElement.java:759) [main/:na]
>       at org.apache.s4.core.ProcessingElement.handleInputEvent(ProcessingElement.java:411) [main/:na]
>       at org.apache.s4.core.Stream.run(Stream.java:299) [main/:na]
>       at java.lang.Thread.run(Thread.java:662) [na:1.6.0_33]
>  [words seen stream] ERROR org.apache.s4.core.ProcessingElement - Cannot fetch serialized stated for [org.apache.s4.wordcount.WordCounterPE/doobie
> {code}
> This could be because we use a handoff queue, though that is not clear to 
> me.
> In any case, since there may be parallel recovery requests from different 
> prototypes, it may be more appropriate to use a bounded queue, with the 
> option of using multiple threads for the fetch operations.
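
The bounded-queue idea in the description above could look roughly like the sketch below. It is an illustration under stated assumptions, not the patch attached to this issue: the class name BoundedFetchPool, the method names, and the thread and capacity values are all hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedFetchPool {

    /** Fixed number of fetch threads backed by a bounded FIFO queue (hypothetical sizing). */
    public static ThreadPoolExecutor newFetchExecutor(int threads, int capacity) {
        return new ThreadPoolExecutor(threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<Runnable>(capacity));
    }

    /** Submits `requests` tasks that all block, and returns how many were accepted. */
    public static int acceptedWhileBusy(int threads, int capacity, int requests) {
        ThreadPoolExecutor executor = newFetchExecutor(threads, capacity);
        final CountDownLatch release = new CountDownLatch(1);
        int accepted = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    // Stand-in for a slow fetch: holds its thread until released.
                    executor.submit(new Runnable() {
                        public void run() {
                            try {
                                release.await();
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            }
                        }
                    });
                    accepted++;
                } catch (RejectedExecutionException e) {
                    // All threads are busy and the queue is full.
                }
            }
            return accepted;
        } finally {
            release.countDown(); // unblock the running and queued tasks
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("accepted: " + acceptedWhileBusy(2, 3, 10));
    }
}
```

With this setup, requests that arrive while all fetch threads are busy wait in the queue instead of being rejected immediately. Rejections still occur once the queue is full, so the capacity (or a rejection handler such as ThreadPoolExecutor.CallerRunsPolicy) would need to match the expected number of parallel recovery requests.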

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

