[ 
https://issues.apache.org/jira/browse/BEAM-3411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16312235#comment-16312235
 ] 

Ahmet Altay commented on BEAM-3411:
-----------------------------------

Here is one of the failing dataflow jobs:

https://pantheon.corp.google.com/dataflow/jobsDetail/locations/us-central1/jobs/2018-01-04_13_16_26-2332266460968770664?project=apache-beam-testing&organizationId=433637338589

I see the following error in the worker logs:

I  2018/01/04 21:20:37 Traceback (most recent call last):
I  2018/01/04 21:20:37   File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
I  2018/01/04 21:20:37     "__main__", fname, loader, pkg_name)
I  2018/01/04 21:20:37   File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
I  2018/01/04 21:20:37     exec code in run_globals
I  2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker_main.py", line 195, in <module>
I  2018/01/04 21:20:37     main(sys.argv)
I  2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker_main.py", line 134, in main
I  2018/01/04 21:20:37     worker_count=_get_worker_count(sdk_pipeline_options)).run()
I  2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 97, in run
I  2018/01/04 21:20:37     work_request)
I  2018/01/04 21:20:37   File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/worker/sdk_worker.py", line 162, in _request_process_bundle_progress
I  2018/01/04 21:20:37     worker = self._instruction_id_vs_worker[request.instruction_id]
I  2018/01/04 21:20:37 KeyError: u'-39'
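
For illustration only, here is a minimal sketch of how this kind of KeyError can arise: a progress request names an instruction id that is not in the worker's lookup table. This is not the actual sdk_worker.py code. The ProgressTracker class, its methods, and the 'worker-a' value are hypothetical; only the _instruction_id_vs_worker name and the u'-39' id come from the traceback. The checked variant is just one possible defensive lookup, not a proposed fix.

```python
# Hypothetical stand-in for the SDK worker's bookkeeping; this is NOT the
# real apache_beam.runners.worker.sdk_worker implementation.
class ProgressTracker(object):

    def __init__(self):
        # Maps instruction_id -> worker currently processing that bundle.
        self._instruction_id_vs_worker = {}

    def register(self, instruction_id, worker):
        self._instruction_id_vs_worker[instruction_id] = worker

    def progress_unchecked(self, instruction_id):
        # Same shape as the failing lookup: raises KeyError for unknown ids.
        return self._instruction_id_vs_worker[instruction_id]

    def progress_checked(self, instruction_id):
        # Defensive variant (an assumption): return None instead of raising.
        return self._instruction_id_vs_worker.get(instruction_id)


tracker = ProgressTracker()
tracker.register(u'-38', 'worker-a')

# A progress request for an id that was never registered (or was already
# removed) reproduces the error seen in the worker logs.
try:
    tracker.progress_unchecked(u'-39')
    raised = False
except KeyError:
    raised = True
```

Whether the real worker should ignore, retry, or reject such a request depends on why the id is missing, which the log alone does not show.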

The error likely started happening after 
https://github.com/apache/beam/commit/8188db40ee369dd54d69c7ef6020cf47463c8e85 
which started using a newer fnapi worker container. A PR from between 12/19 
and 12/22 potentially introduced this issue but was not tested. (Assigning to 
[~angoenka], because it might be related to his changes.)

(cc: [~alanmyrvold], with automated testing using containers built at head we 
should be able to notice these issues earlier.)
 


> Test apache_beam.examples.wordcount_it_test.WordCountIT times out
> -----------------------------------------------------------------
>
>                 Key: BEAM-3411
>                 URL: https://issues.apache.org/jira/browse/BEAM-3411
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py-core
>            Reporter: Henning Rohde
>            Assignee: Ankur Goenka
>
> Failed run: 
> https://builds.apache.org/job/beam_PostCommit_Python_Verify/3876/console
> Log snippet:
> test_wordcount_fnapi_it (apache_beam.examples.wordcount_it_test.WordCountIT) ... ERROR
> ======================================================================
> ERROR: test_wordcount_fnapi_it (apache_beam.examples.wordcount_it_test.WordCountIT)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/plugins/multiprocess.py", line 812, in run
>     test(orig)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/case.py", line 45, in __call__
>     return self.run(*arg, **kwarg)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/case.py", line 133, in run
>     self.runTest(result)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/case.py", line 151, in runTest
>     test(result)
>   File "/usr/lib/python2.7/unittest/case.py", line 395, in __call__
>     return self.run(*args, **kwds)
>   File "/usr/lib/python2.7/unittest/case.py", line 331, in run
>     testMethod()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/wordcount_it_test.py", line 77, in test_wordcount_fnapi_it
>     on_success_matcher=PipelineStateMatcher()))
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/examples/wordcount_fnapi.py", line 130, in run
>     result.wait_until_finish()
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/apache_beam/runners/dataflow/dataflow_runner.py", line 956, in wait_until_finish
>     time.sleep(5.0)
>   File "/home/jenkins/jenkins-slave/workspace/beam_PostCommit_Python_Verify/src/sdks/python/.eggs/nose-1.3.7-py2.7.egg/nose/plugins/multiprocess.py", line 276, in signalhandler
>     raise TimedOutException()
> TimedOutException: 'test_wordcount_fnapi_it (apache_beam.examples.wordcount_it_test.WordCountIT)'
> ----------------------------------------------------------------------
> Ran 3 tests in 901.290s
> FAILED (errors=1)
> Build step 'Execute shell' marked build as failure



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
