[ https://issues.apache.org/jira/browse/BEAM-5108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17123188#comment-17123188 ]
Beam JIRA Bot commented on BEAM-5108:
-------------------------------------

This issue is P2 but has been unassigned without any comment for 60 days so it has been labeled "stale-P2". If this issue is still affecting you, we care! Please comment and remove the label. Otherwise, in 14 days the issue will be moved to P3.

Please see https://beam.apache.org/contribute/jira-priorities/ for a detailed explanation of what these priorities mean.

> Improve Python test framework to prevent streaming pipeline leaks
> -----------------------------------------------------------------
>
>                 Key: BEAM-5108
>                 URL: https://issues.apache.org/jira/browse/BEAM-5108
>             Project: Beam
>          Issue Type: Task
>          Components: testing
>            Reporter: Mark Liu
>            Priority: P2
>              Labels: stale-P2
>
> Recently, a few Python streaming pipelines in the Dataflow apache-beam-testing
> project ran for more than 5 days. This looks like a leak from the Jenkins job
> that runs e2e integration tests.
> The test framework has a pipeline resource cleanup that applies to all
> integration tests, defined in
> [TestDataflowRunner|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L67].
> However, the cancellation may fail in special cases, such as the following (from
> [this Jenkins run|https://builds.apache.org/view/A-D/view/Beam/job/beam_PostCommit_Python_Verify/5636/consoleFull]):
> {quote}
> Workflow modification failed. Causes: (c53cc746f7bc7f49): Operation cancel
> not allowed for job 2018-08-01_13_10_24-5019826606522054507. Job is not yet
> ready for canceling. Please retry in a few minutes.
> {quote}
> Two possible approaches to improve this:
> 1. Add retries to the framework's cancellation.
> 2. Instead of waiting only until the pipeline is in the RUNNING state
> ([here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L57]),
> wait longer to make sure the worker pool starts successfully.
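
For approach 1 in the quoted description, a retry around the cancel call might look roughly like the sketch below. This is only an illustration, not the code in TestDataflowRunner: the helper name `cancel_with_retry` and all of its parameters are hypothetical, and the 30-second starting delay is an assumption loosely based on the "Please retry in a few minutes" message. The `cancel` argument stands for any zero-argument callable that raises when cancellation is rejected.

```python
import time


def cancel_with_retry(cancel, max_attempts=5, initial_delay_secs=30.0,
                      backoff=2.0, sleep=time.sleep):
    """Call cancel() until it succeeds or the attempts run out.

    Hypothetical helper, not part of Beam. `cancel` is any zero-argument
    callable that raises on failure. Returns the number of attempts used.
    """
    delay = initial_delay_secs
    for attempt in range(1, max_attempts + 1):
        try:
            cancel()
            return attempt
        except Exception:
            # Assumption: every failure is treated as retryable. Real code
            # would likely narrow this to the "not yet ready" service error.
            if attempt == max_attempts:
                raise  # Give up and surface the last error to the caller.
            sleep(delay)
            delay *= backoff  # Exponential backoff between attempts.
```

In a test runner this could wrap the pipeline result's cancel call, for example `cancel_with_retry(result.cancel)`, so that a job still in startup gets more than one chance to be cancelled. Re-raising after the last attempt keeps a persistent failure visible in the Jenkins log instead of silently leaking the job.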