[ 
https://issues.apache.org/jira/browse/BEAM-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123188#comment-17123188
 ] 

Beam JIRA Bot commented on BEAM-5108:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it 
has been labeled "stale-P2". If this issue is still affecting you, we care! 
Please comment and remove the label. Otherwise, in 14 days the issue will be 
moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed 
explanation of what these priorities mean.


> Improve Python test framework to prevent streaming pipeline leaks
> -----------------------------------------------------------------
>
>                 Key: BEAM-5108
>                 URL: https://issues.apache.org/jira/browse/BEAM-5108
>             Project: Beam
>          Issue Type: Task
>          Components: testing
>            Reporter: Mark Liu
>            Priority: P2
>              Labels: stale-P2
>
> Recently, a few Python streaming pipelines in the Dataflow apache-beam-testing 
> project ran for more than 5 days. This looks like a leak from the Jenkins job 
> that runs e2e integration tests.
> The test framework has a pipeline resource cleanup step that applies to all 
> integration tests, which is defined in 
> [TestDataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L67].
>  However, the cancellation may fail in a special case, like the following (from 
> [this Jenkins 
> run|https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/5636/consoleFull]):
> {quote}
> Workflow modification failed. Causes: (c53cc746f7bc7f49): Operation cancel 
> not allowed for job 2018-08-01_13_10_24-5019826606522054507. Job is not yet 
> ready for canceling. Please retry in a few minutes.
> {quote}
> Two possible approaches to improve this:
> 1. Add retries to the framework's cancellation.
> 2. Instead of waiting only until the pipeline is in the RUNNING state 
> ([here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L57]),
>  wait longer to make sure the worker pool starts successfully.
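
Approach 1 above could look roughly like the following. This is a minimal
sketch under stated assumptions, not the framework's actual code:
`cancel_with_retry` and all of its parameters are hypothetical names, and
`cancel` stands for any zero-argument callable that requests cancellation and
raises an exception while the job is not yet ready to be cancelled.

```python
import time


def cancel_with_retry(cancel, max_attempts=5, initial_delay=1.0,
                      backoff=2.0, sleep=time.sleep):
    """Call cancel() until it succeeds or max_attempts is reached.

    Hypothetical helper: `cancel` is any zero-argument callable that raises
    while the job cannot be cancelled yet. `sleep` is injectable so that
    tests do not have to wait for real.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return cancel()
        except Exception:
            # Give up and surface the last error once attempts are used up.
            if attempt == max_attempts:
                raise
            # Otherwise wait, then retry with an exponentially longer delay.
            sleep(delay)
            delay *= backoff
```

A real cleanup step would probably catch only the specific "not yet ready for
canceling" error rather than every exception, and would use delays on the
order of minutes, as the error message suggests.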



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
