[ 
https://issues.apache.org/jira/browse/FLINK-2586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717109#comment-14717109
 ] 

Stephan Ewen commented on FLINK-2586:
-------------------------------------

That is not really the problem here.

The problem is that the test sleeps for 10 seconds and assumes that this is 
enough time to produce the output that it verifies after the sleep. Sometimes 
it is not, for example when the container in the CI infrastructure stalls.
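
One way to avoid a fixed sleep is to block until the output has actually 
arrived and to keep a generous timeout only as a safety net. This is a minimal 
sketch in plain Java, not the actual test code: the class name, the background 
producer thread, and the record names are all hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchWaitSketch {

    /**
     * Starts a hypothetical producer and waits until it has emitted
     * expectedCount records, or fails once the timeout expires.
     */
    static List<String> collect(int expectedCount, long producerDelayMillis, long timeoutSeconds) {
        List<String> output = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(expectedCount);

        Thread producer = new Thread(() -> {
            for (int i = 0; i < expectedCount; i++) {
                try {
                    // Stands in for a slow or stalled CI container.
                    Thread.sleep(producerDelayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                output.add("record-" + i);
                done.countDown();
            }
        });
        producer.setDaemon(true);
        producer.start();

        try {
            // Returns as soon as the last record arrives; the timeout is only
            // an upper bound, not the expected duration of the test.
            if (!done.await(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("output incomplete after " + timeoutSeconds + " s");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(collect(5, 20, 60));
    }
}
```

With this shape a fast run finishes quickly, and a stalled run fails only after 
the full timeout has elapsed.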

> Unstable Storm Compatibility Tests
> ----------------------------------
>
>                 Key: FLINK-2586
>                 URL: https://issues.apache.org/jira/browse/FLINK-2586
>             Project: Flink
>          Issue Type: Bug
>          Components: Storm Compatibility
>    Affects Versions: 0.10
>            Reporter: Stephan Ewen
>            Priority: Critical
>             Fix For: 0.10
>
>
> The Storm Compatibility tests frequently fail.
> The reason is that they kill the topologies after a fixed time interval. 
> That fails on CI infrastructure when certain steps take longer than usual. 
> Trying to guarantee progress by time is inherently problematic:
>   - Waiting too briefly makes tests unstable
>   - Waiting too long makes tests slow
> The right way to go is letting the program decide when to terminate, for 
> example by throwing a special {{SuccessException}}.
> Have a look at the Kafka connector tests: they do this a lot and hence run 
> exactly as short or as long as they need to.
> Here is an example of a failed run: 
> https://s3.amazonaws.com/archive.travis-ci.org/jobs/77499577/log.txt
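
The {{SuccessException}} idea from the description can be sketched roughly as 
follows. This is plain Java without any Flink or Storm API, so it is not how 
the Kafka connector tests are actually written: the sink class, the job loop, 
and the wrapping of the exception are hypothetical stand-ins for what an 
execution framework might do.

```java
public class SuccessExceptionSketch {

    /** Marker exception thrown by the program itself once it has seen all expected output. */
    static class SuccessException extends RuntimeException {
        SuccessException() {
            super("expected output reached");
        }
    }

    /** Hypothetical sink that counts records and ends the job when enough have arrived. */
    static class ValidatingSink {
        private final int expected;
        private int seen;

        ValidatingSink(int expected) {
            if (expected <= 0) {
                throw new IllegalArgumentException("expected must be positive");
            }
            this.expected = expected;
        }

        void invoke(int value) {
            seen++;
            if (seen == expected) {
                throw new SuccessException();
            }
        }
    }

    /** Stand-in for an unbounded job; failures are wrapped, as a framework might do. */
    static void runJob(ValidatingSink sink) {
        try {
            for (int i = 0; ; i++) {
                sink.invoke(i);
            }
        } catch (RuntimeException e) {
            throw new RuntimeException("job execution failed", e);
        }
    }

    /** Walks the cause chain, since the marker may be wrapped by other exceptions. */
    static boolean endedWithSuccess(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof SuccessException) {
                return true;
            }
        }
        return false;
    }

    /** Test-side driver: the run passes only if the job stopped itself via the marker. */
    static boolean runAndCheck(int expected) {
        try {
            runJob(new ValidatingSink(expected));
            return false; // the job returned without signalling success
        } catch (RuntimeException e) {
            return endedWithSuccess(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndCheck(100));
    }
}
```

The test then runs exactly as long as the job needs to produce its output, and 
any exception other than the marker still counts as a failure.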



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
