[ 
https://issues.apache.org/jira/browse/FLINK-7352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119545#comment-16119545
 ] 

ASF GitHub Bot commented on FLINK-7352:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/4501

    [FLINK-7352] [tests] Stabilize ExecutionGraphRestartTest

    ## What is the purpose of the change
    
    Introduce an explicit wait for the deployment of tasks. This replaces the loose
    ordering induced by Thread.sleep and fixes the race conditions caused by it.
    
    ## Brief change log
    
    - Introduce a `WaitForTasks` consumer which is given to the `SimpleAckingTaskManagerGateway`
    - Use a single `SimpleAckingTaskManagerGateway` to receive all task submission calls
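
The idea behind the change log above can be sketched as follows. This is a minimal, self-contained illustration and not the code from the pull request: the class `WaitForTasksSketch`, the use of `String` task IDs, and the submitter thread standing in for the gateway are all assumptions made for the example. It shows a consumer that completes a future once the expected number of task submissions has arrived, so a test can block on that future instead of calling `Thread.sleep`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class WaitForTasksSketch {

    /**
     * Hypothetical stand-in for the WaitForTasks consumer: it counts
     * submissions and completes a future when the expected number is reached.
     */
    static final class WaitForTasks implements Consumer<String> {
        private final int tasksToWaitFor;
        private final CompletableFuture<Boolean> allTasksReceived = new CompletableFuture<>();
        private int counter = 0;

        WaitForTasks(int tasksToWaitFor) {
            this.tasksToWaitFor = tasksToWaitFor;
        }

        CompletableFuture<Boolean> getAllTasksReceived() {
            return allTasksReceived;
        }

        @Override
        public synchronized void accept(String taskId) {
            // Called once per task submission; synchronized because
            // submissions may arrive from several threads.
            counter++;
            if (counter == tasksToWaitFor) {
                allTasksReceived.complete(true);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        WaitForTasks waitForTasks = new WaitForTasks(3);

        // Assumption: this thread plays the role of a gateway that invokes
        // the consumer for every task submission it receives.
        Thread submitter = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                waitForTasks.accept("task-" + i);
            }
        });
        submitter.start();

        // Explicit wait with a timeout instead of Thread.sleep.
        boolean done = waitForTasks.getAllTasksReceived().get(5, TimeUnit.SECONDS);
        System.out.println("all tasks submitted: " + done);
    }
}
```

Waiting on the future makes the ordering explicit: the test proceeds only after every submission has been observed, and a bounded timeout turns a hang into a clear failure.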
    
    ## Verifying this change
    
    This change is a trivial rework / code cleanup without any test coverage.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixExecutionGraphRestartTest

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4501.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4501
    
----
commit 40cd0c860dd600ce2baa69b0f0ba8cf7a787ff63
Author: Till Rohrmann <trohrm...@apache.org>
Date:   2017-08-09T07:57:56Z

    [FLINK-7352] [tests] Stabilize ExecutionGraphRestartTest
    
    Introduce an explicit wait for the deployment of tasks. This replaces the loose
    ordering induced by Thread.sleep and fixes the race conditions caused by it.

----


> ExecutionGraphRestartTest timeouts
> ----------------------------------
>
>                 Key: FLINK-7352
>                 URL: https://issues.apache.org/jira/browse/FLINK-7352
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, Tests
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Nico Kruber
>            Assignee: Till Rohrmann
>            Priority: Critical
>              Labels: test-stability
>
> Recently, I received timeouts from some tests in 
> {{ExecutionGraphRestartTest}}, like this:
> {code}
> Tests in error: 
>   ExecutionGraphRestartTest.testConcurrentLocalFailAndRestart:638 » Timeout
> {code}
> This particular instance is from 1.3.2 RC2 and was stuck in 
> {{ExecutionGraphTestUtils#waitUntilDeployedAndSwitchToRunning()}}, but I also 
> had instances stuck in {{ExecutionGraphTestUtils#waitUntilJobStatus}}.



