[ https://issues.apache.org/jira/browse/BEAM-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312235#comment-16312235 ]
Ahmet Altay commented on BEAM-3411:
-----------------------------------

Here is one of the failing dataflow jobs:
https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-01-04_13_16_26-2332266460968770664?project=apache-beam-testing&organizationId=433637338589

I see the following error in the worker logs:

I 2018/01/04 21:20:37 Traceback (most recent call last):
I 2018/01/04 21:20:37   File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
I 2018/01/04 21:20:37     "__main__", fname, loader, pkg_name)
I 2018/01/04 21:20:37   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
I 2018/01/04 21:20:37     exec code in run_globals
I 2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker_main.py", line 195, in <module>
I 2018/01/04 21:20:37     main(sys.argv)
I 2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker_main.py", line 134, in main
I 2018/01/04 21:20:37     worker_count=_get_worker_count(sdk_pipeline_options)).run()
I 2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 97, in run
I 2018/01/04 21:20:37     work_request)
I 2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 162, in _request_process_bundle_progress
I 2018/01/04 21:20:37     worker = self._instruction_id_vs_worker[request.instruction_id]
I 2018/01/04 21:20:37 KeyError: u'-39'

The error likely started happening after https://github.com/apache/beam/commit/8188db40ee369dd54d69c7ef6020cf47463c8e85, which started using a newer fnapi worker container. Potentially a PR between 12/19 and 12/22 introduced this issue, but was not tested.

(Assigning to [~angoenka], because it might be related to his changes.)

(cc: [~alanmyrvold], with automated testing using containers built at head we should be able to notice these issues earlier.)

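For illustration only: the last frame of the worker traceback is a plain dictionary lookup keyed by request.instruction_id, so the KeyError means a progress request named an instruction id that the mapping did not contain at that moment. The sketch below reproduces that failure mode in isolation. ProgressTracker and its methods are hypothetical names, not the Beam SDK's code, and why the id is missing in the real worker is not established here.

```python
class ProgressTracker(object):
    """Hypothetical stand-in for a worker that maps instruction ids to bundle workers."""

    def __init__(self):
        # Same attribute name as in the traceback; the rest is assumed.
        self._instruction_id_vs_worker = {}

    def register(self, instruction_id, worker):
        self._instruction_id_vs_worker[instruction_id] = worker

    def progress_strict(self, instruction_id):
        # Direct indexing raises KeyError when the id is unknown,
        # which is the shape of the failure in the log.
        return self._instruction_id_vs_worker[instruction_id]

    def progress_tolerant(self, instruction_id):
        # dict.get returns None for an unknown id instead of raising.
        return self._instruction_id_vs_worker.get(instruction_id)


tracker = ProgressTracker()
tracker.register(u'-38', 'bundle-worker')

try:
    tracker.progress_strict(u'-39')
    strict_raised = False
except KeyError:
    strict_raised = True

tolerant_result = tracker.progress_tolerant(u'-39')
```

Whether a tolerant lookup is the right fix depends on why the progress request refers to an unregistered id; the sketch only shows the two lookup behaviours side by side.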
> Test apache_beam.examples.wordcount_it_test.WordCountIT times out
> -----------------------------------------------------------------
>
>                 Key: BEAM-3411
>                 URL: https://issues.apache.org/jira/browse/BEAM-3411
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Henning Rohde
>            Assignee: Ankur Goenka
>
> Failed run:
> https://builds.apache.org/job/beam_PostCommit_Python_Verify/3876/console
> Log snippet:
> test_wordcount_fnapi_it (apache_beam.examples.wordcount_it_test.WordCountIT) ... ERROR
> ======================================================================
> ERROR: test_wordcount_fnapi_it (apache_beam.examples.wordcount_it_test.WordCountIT)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/plugins/multiprocess.py", line 812, in run
>     test(orig)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/case.py", line 45, in __call__
>     return self.run(*arg, **kwarg)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/case.py", line 133, in run
>     self.runTest(result)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/case.py", line 151, in runTest
>     test(result)
>   File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
>     return self.run(*args, **kwds)
>   File "/usr/lib/python2.7/unittest/case.py", line 331, in run
>     testMethod()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/wordcount_it_test.py", line 77, in test_wordcount_fnapi_it
>     on_success_matcher=PipelineStateMatcher()))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/wordcount_fnapi.py", line 130, in run
>     result.wait_until_finish()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 956, in wait_until_finish
>     time.sleep(5.0)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/plugins/multiprocess.py", line 276, in signalhandler
>     raise TimedOutException()
> TimedOutException: 'test_wordcount_fnapi_it (apache_beam.examples.wordcount_it_test.WordCountIT)'
> ----------------------------------------------------------------------
> Ran 3 tests in 901.290s
> FAILED (errors=1)
> Build step 'Execute shell' marked build as failure

--
This message was sent by Atlassian JIRA (v6.4.14#64029)