[ https://issues.apache.org/jira/browse/BEAM-5812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16662925#comment-16662925 ]
Mark Liu edited comment on BEAM-5812 at 10/24/18 10:18 PM:
-----------------------------------------------------------

To Henning's comment, the full error message should be:

RuntimeError: Timeout after 120 seconds while waiting for job <...> enters expected state CANCELLED. Current state is CANCELLING.

The job should have run successfully but took a long time to be cancelled. From the [dataflow job console|https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-10-24_10_32_33-9690855475082478889?project=apache-beam-testing], we can see that it took ~5 minutes to stop the worker pool, whereas the default timeout is 120s, defined [here|https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/dataflow/test_dataflow_runner.py#L35].

This error should be different from the previous IT failures.

The exception message contains too many job details, which makes triage difficult. I'll fix it to include only the job ID.

was (Author: markflyhigh):

To Henning's comment, the full error message should be:

RuntimeError: Timeout after 120 seconds while waiting for job <...> enters expected state CANCELLED. Current state is CANCELLING.

The job should have run successfully but took a long time to be cancelled. From the [dataflow job console|https://console.cloud.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-10-24_10_32_33-9690855475082478889?project=apache-beam-testing], we can see that it took ~5 minutes to stop the worker pool.

This error should be different from the previous IT failures.

The exception message contains too many job details, which makes triage difficult. I'll fix it to include only the job ID.
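To illustrate the failure mode described in the comment, here is a minimal sketch of a poll-until-state loop with a timeout whose error message carries only the job ID. It is not the code in test_dataflow_runner.py: the function name `wait_for_job_state`, the `get_state` callable, and the `poll_secs`, `clock`, and `sleep` parameters are all hypothetical names introduced for this example.

```python
import time


def wait_for_job_state(get_state, job_id, expected_state, timeout_secs=120,
                       poll_secs=5, clock=time.monotonic, sleep=time.sleep):
    """Poll get_state() until it returns expected_state or the timeout expires.

    get_state is a hypothetical zero-argument callable that returns the
    current job state as a string. clock and sleep are injectable so the
    loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_secs
    state = get_state()
    while state != expected_state:
        if clock() >= deadline:
            # Only the job ID goes into the message, not the full job details.
            raise RuntimeError(
                'Timeout after %d seconds while waiting for job %s to enter '
                'expected state %s. Current state is %s.'
                % (timeout_secs, job_id, expected_state, state))
        sleep(poll_secs)
        state = get_state()
    return state
```

With a worker pool that takes ~5 minutes to stop, a loop like this with `timeout_secs=120` would raise while the job is still in CANCELLING, which matches the reported error. Passing a larger value (for example `timeout_secs=600`) is one possible way to cover the slower shutdown, though whether that is the right fix for these tests is a separate question.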
> Timeout in Python ITs: WordCountIT (fn api and legacy), HourlyTeamScoreIT, FastavroIT
> -------------------------------------------------------------------------------------
>
> Key: BEAM-5812
> URL: https://issues.apache.org/jira/browse/BEAM-5812
> Project: Beam
> Issue Type: Bug
> Components: test-failures
> Reporter: Kenneth Knowles
> Assignee: Valentyn Tymofieiev
> Priority: Critical
> Labels: flake
>
> [https://builds.apache.org/job/beam_PostCommit_Python_Verify/6341/]
> [https://scans.gradle.com/s/ivjuxhni54azk/console-log?task=:beam-sdks-python:postCommitITTests]
> I don't see anything about these tests that would say much. It could be
> environmental, but it might just imply that all the timeouts should be higher
> if they are sensitive, or there was (and continues to be) an outage of some
> sort.
> Assignee chosen because I see you as reviewer of avro changes and author of
> other changes to the py sdk in the last few days. If you have no idea, that's
> fine, I just didn't want to leave it unassigned.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)