[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609926#comment-15609926
 ] 

ASF GitHub Bot commented on FLINK-4283:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2661


> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15608632#comment-15608632
 ] 

ASF GitHub Bot commented on FLINK-4283:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2661
  
Changes look good to me 👍  Thanks for your contribution @AlexanderShoshin 
:-)

The failing tests are unrelated. Will merge this PR with the next batch.


> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15588998#comment-15588998
 ] 

ASF GitHub Bot commented on FLINK-4283:
---

GitHub user AlexanderShoshin opened a pull request:

https://github.com/apache/flink/pull/2661

[FLINK-4283] ExecutionGraphRestartTest fails

Tests were falling by timeout on less than 3 core CPU machines. There were 
several graph *RestartStrategies* that blocked threads from the 
*ExecutionContext* thread pool by calling '*sleep(Long.MAX_VALUE)*' while 
asynchronous graph restarting. This was a cause for some tests to wait for free 
threads and to terminate by timeout.

I replaced these *RestartStrategies* by a new testing class 
(*InfiniteDelayRestartStrategy*) that has the same functionality - it promises 
to restart execution graph after a very long delay. But it doesn't use new 
threads from the thread pool. So it can't block other tests execution.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/AlexanderShoshin/flink 
FLINK-4283_infinite_restart_strategy

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2661.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2661


commit 92fcba525cec6330ddd14643edb675e27afcbdcc
Author: Alexander Shoshin 
Date:   2016-10-18T10:21:51Z

[FLINK-4283] Use new InfiniteDelayRestartStrategy instead of 
FixedDelayRestartStrategy to avoid blocking threads




> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-18 Thread Alexander Shoshin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15585134#comment-15585134
 ] 

Alexander Shoshin commented on FLINK-4283:
--

It will look like this:
https://github.com/apache/flink/compare/master...AlexanderShoshin:FLINK-4283_infinite_restart_strategy
I replaced all {{Long.MAX_VALUE}} delay strategies with new 
InfiniteDelayRestartStrategy. They will not use new threads at all.

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-17 Thread Alexander Shoshin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582125#comment-15582125
 ] 

Alexander Shoshin commented on FLINK-4283:
--

I agree that these tests shouldn't depends on available threads number.
Adding a _RestartStrategy_ to test casses that will emulate infinite restart 
delay by doing nothing when _restart(...)_ method called sounds good.
Should I do this way?

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582116#comment-15582116
 ] 

Till Rohrmann commented on FLINK-4283:
--

I think we can do this by not using the {{FixedDelayRestartStrategy}} with a 
delay of {{Long.MAX_VALUE}}.

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-17 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582023#comment-15582023
 ] 

Stephan Ewen commented on FLINK-4283:
-

There seems to be something wrong if this blocks threads such that a minimum 
number of threads is needed to complete the test.
Shouldn't we address rather that?

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-17 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581770#comment-15581770
 ] 

Till Rohrmann commented on FLINK-4283:
--

Maybe we could also replace the {{ExecutionContext.global}} by a dedicated 
{{ExecutionContext}}. Then we could control how many threads there are 
available. 

Maybe we could also add a testing {{RestartStrategy}} which will simply not 
call the {{ExecutionGraph.restart}} method. Then we wouldn't block the threads.

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-17 Thread Alexander Shoshin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15581727#comment-15581727
 ] 

Alexander Shoshin commented on FLINK-4283:
--

I found out that these tests will fail if you run them on less than 3 core CPU 
machine.
ExecutionContext.global which is used to restart execution graph in 
ExecutionGraphRestartTest.java sets the pool threads number to the amount of 
available processors by default.
There are 3 tests in the class that will lock thread from the pool by calling 
'sleep(Long.MAX_VALUE)' while asynchronous graph restart.
So no one else can use ExecutionContext to restart execution graph while all 
available threads are sleeping. Thats why some tests fails.

My approach is to set small restart delays that will be enough to finish tests 
successfully:
https://github.com/apache/flink/compare/master...AlexanderShoshin:FLINK-4283%231

But if these 'infinite' delays have some reason to be in code (though i don't 
see any) we can use VM attributes to increase the default pool threads number:
https://github.com/apache/flink/compare/master...AlexanderShoshin:FLINK-4283%232

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-12 Thread Alexander Shoshin (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568940#comment-15568940
 ] 

Alexander Shoshin commented on FLINK-4283:
--

It is still relevant. I have just strated researching it.
These tests finishs successfully when run separately but fails when run 
together with other tests from the same class.
I will write when I find a cause.

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-10-12 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15568819#comment-15568819
 ] 

Stephan Ewen commented on FLINK-4283:
-

Is this issue still relevant, or has this instability been resolved?
[~AlexanderShoshin] what is your approach to fix this?

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>Assignee: Alexander Shoshin
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-08-01 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402332#comment-15402332
 ] 

Till Rohrmann commented on FLINK-4283:
--

I tried to reproduce this issue locally but I was not successful. Could you 
maybe attach the logs for these test cases with log level DEBUG to the issue 
[~Zentol]. Maybe this helps to figure out the problem you're encountering.

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-4283) ExecutionGraphRestartTest fails

2016-07-29 Thread Stephan Ewen (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399484#comment-15399484
 ] 

Stephan Ewen commented on FLINK-4283:
-

To the attention of the distributed coordination shepherd ([~till.rohrmann])

> ExecutionGraphRestartTest fails
> ---
>
> Key: FLINK-4283
> URL: https://issues.apache.org/jira/browse/FLINK-4283
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.1.0
> Environment: Ubuntu 14.04
> W10
>Reporter: Chesnay Schepler
>  Labels: test-stability
>
> I encounter reliable failures for the following tests:
> testRestartAutomatically(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.089 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testRestartAutomatically(ExecutionGraphRestartTest.java:155)
> taskShouldNotFailWhenFailureRateLimitWasNotExceeded(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 2.055 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.restartAfterFailure(ExecutionGraphRestartTest.java:680)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.taskShouldNotFailWhenFailureRateLimitWasNotExceeded(ExecutionGraphRestartTest.java:180)
> testFailingExecutionAfterRestart(org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest)
>   Time elapsed: 120.079 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.flink.runtime.executiongraph.ExecutionGraphRestartTest.testFailingExecutionAfterRestart(ExecutionGraphRestartTest.java:397)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)