rizenfrmtheashes commented on issue #23670:
URL: https://github.com/apache/beam/issues/23670#issuecomment-1642867644
I want to bump/note that I still experience this issue, even on Beam 2.48, and I continue to run a patched version. I've recently needed to launch new kinds of pipelines that also write to BQ via file loads with dynamic table destinations, and I still hit this issue there.
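For context, the writes in question go through `WriteToBigQuery` with the `FILE_LOADS` method and a callable (dynamic) table destination, roughly like the sketch below. This is a minimal, hedged approximation of my setup: the Pub/Sub topic, table routing, and schema are placeholders, not my actual code.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Hedged sketch: streaming file loads with a dynamic (callable) table
# destination. All names and the schema are placeholders.
def route_to_table(row):
    # Dynamic destination: pick a table per element.
    return f"PROJECT_REDACTED:dataset.{row['event_type']}_events"

with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
    (
        p
        | "ReadEvents" >> beam.io.ReadFromPubSub(
            topic="projects/PROJECT_REDACTED/topics/events")
        | "Parse" >> beam.Map(
            lambda b: {"event_type": "session", "payload": b.decode("utf-8")})
        | "Write BQ Sessioned Events to BigQuery" >> beam.io.WriteToBigQuery(
            table=route_to_table,
            schema="event_type:STRING,payload:STRING",
            method=beam.io.WriteToBigQuery.Method.FILE_LOADS,
            triggering_frequency=300,  # required for FILE_LOADS in streaming
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```

With that configuration, the worker eventually fails with the traceback below.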
```
Error message from worker: generic::unknown: Traceback (most recent call last):
  File "apache_beam/runners/common.py", line 1459, in apache_beam.runners.common.DoFnRunner._invoke_bundle_method
  File "apache_beam/runners/common.py", line 562, in apache_beam.runners.common.DoFnInvoker.invoke_finish_bundle
  File "apache_beam/runners/common.py", line 567, in apache_beam.runners.common.DoFnInvoker.invoke_finish_bundle
  File "apache_beam/runners/common.py", line 1731, in apache_beam.runners.common._OutputHandler.finish_bundle_outputs
  File "/usr/local/lib/python3.11/site-packages/apache_beam/io/gcp/bigquery_file_loads.py", line 594, in finish_bundle
    self.bq_wrapper.wait_for_bq_job(
  File "/usr/local/lib/python3.11/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 656, in wait_for_bq_job
    raise RuntimeError(
RuntimeError: BigQuery job beam_bq_job_COPY_devREDACTEDv0001estuarysessionstreamingdee97770f8ae41cda27e46ed9ad361dbe411c41020230719133458358779_COPY_STEP_803_be4d9436fd109617c17082aaf3b8be8d failed. Error Result: <ErrorProto
 message: 'Not found: Table PROJECT_REDACTED:867395ee_660f_4f8d_89b9_51052234406b.beam_bq_job_LOAD_devREDACTEDv0001estuarysessionstreamingdee97770f8ae41cda27e46ed9ad361dbe411c41020230719133458358779_LOAD_STEP_187_048806eef0c9a1f1281a7b31aa85d4ff_b6eba9ce9ae24e6d8d23f15692ce5f01'
 reason: 'notFound'>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker.py", line 295, in _execute
    response = task()
               ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker.py", line 370, in <lambda>
    lambda: self.create_worker().do_instruction(request), request)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker.py", line 629, in do_instruction
    return getattr(self, request_type)(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/sdk_worker.py", line 667, in process_bundle
    bundle_processor.process_bundle(instruction_id))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/apache_beam/runners/worker/bundle_processor.py", line 1067, in process_bundle
    op.finish()
  File "apache_beam/runners/worker/operations.py", line 939, in apache_beam.runners.worker.operations.DoOperation.finish
  File "apache_beam/runners/worker/operations.py", line 942, in apache_beam.runners.worker.operations.DoOperation.finish
  File "apache_beam/runners/worker/operations.py", line 943, in apache_beam.runners.worker.operations.DoOperation.finish
  File "apache_beam/runners/common.py", line 1480, in apache_beam.runners.common.DoFnRunner.finish
  File "apache_beam/runners/common.py", line 1461, in apache_beam.runners.common.DoFnRunner._invoke_bundle_method
  File "apache_beam/runners/common.py", line 1508, in apache_beam.runners.common.DoFnRunner._reraise_augmented
  File "apache_beam/runners/common.py", line 1459, in apache_beam.runners.common.DoFnRunner._invoke_bundle_method
  File "apache_beam/runners/common.py", line 562, in apache_beam.runners.common.DoFnInvoker.invoke_finish_bundle
  File "apache_beam/runners/common.py", line 567, in apache_beam.runners.common.DoFnInvoker.invoke_finish_bundle
  File "apache_beam/runners/common.py", line 1731, in apache_beam.runners.common._OutputHandler.finish_bundle_outputs
  File "/usr/local/lib/python3.11/site-packages/apache_beam/io/gcp/bigquery_file_loads.py", line 594, in finish_bundle
    self.bq_wrapper.wait_for_bq_job(
  File "/usr/local/lib/python3.11/site-packages/apache_beam/io/gcp/bigquery_tools.py", line 656, in wait_for_bq_job
    raise RuntimeError(
RuntimeError: BigQuery job beam_bq_job_COPY_devREDACTEDv0001estuarysessionstreamingdee97770f8ae41cda27e46ed9ad361dbe411c41020230719133458358779_COPY_STEP_803_be4d9436fd109617c17082aaf3b8be8d failed. Error Result: <ErrorProto
 message: 'Not found: Table PROJECT_REDACTED:867395ee_660f_4f8d_89b9_51052234406b.beam_bq_job_LOAD_devREDACTEDv0001estuarysessionstreamingdee97770f8ae41cda27e46ed9ad361dbe411c41020230719133458358779_LOAD_STEP_187_048806eef0c9a1f1281a7b31aa85d4ff_b6eba9ce9ae24e6d8d23f15692ce5f01'
 reason: 'notFound'> [while running 'Write BQ Sessioned Events to BigQuery/BigQueryBatchFileLoads/ParDo(TriggerCopyJobs)/ParDo(TriggerCopyJobs)-ptransform-24']
```
It would be helpful to know if this is being explored/actioned, because otherwise I have to keep using my patch, which simply never deletes the temp tables and which has given me no issues.
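For reference, the gist of that workaround is just a no-op on the temp-table cleanup. A minimal sketch of the idea follows, assuming the deletion goes through `BigQueryWrapper._delete_table` in `bigquery_tools.py`; that attribute name is inferred from the module paths in the traceback and is an internal detail, so it may differ between Beam versions.

```python
# Hedged sketch of the "never delete temp tables" workaround, applied at
# worker start-up (e.g. from a custom container entrypoint or a setup hook).
# Assumption: the copy-job source tables are removed via
# BigQueryWrapper._delete_table; if your Beam version names this differently,
# the patch target needs adjusting.
import apache_beam.io.gcp.bigquery_tools as bq_tools

def _keep_temp_tables(self, *args, **kwargs):
    # Leave temp tables in place so a replayed copy job still finds its
    # source table. They then need out-of-band cleanup (e.g. a dataset
    # default table expiration / TTL).
    return None

bq_tools.BigQueryWrapper._delete_table = _keep_temp_tables
```

The obvious downside is that temp tables accumulate until something else cleans them up, but it avoids the notFound failures entirely.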
The suspicion in this thread was that a retry before a GroupByKey commit barrier causes write items to be replayed, and because the temp-table deletion is a third-party side effect, the tables are already gone when the subsequent retries run. However, we couldn't figure out where such a retry loop could happen in which the temp tables wouldn't be regenerated or renamed; a toy sketch of that failure mode follows below. I also wonder if there's a mix-up between the job names and the table names.
As a note, I only ever saw this happen when I drained a job. That might be a red herring.
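To make the suspected failure mode concrete, here is a toy, Beam-free illustration (all names hypothetical): a step with a non-idempotent external side effect copies from a temp table and then deletes it, the bundle is replayed before the commit barrier, and the second attempt fails with "not found" because the side effect already ran.

```python
# Toy illustration only (no Beam involved): a retried step whose external
# side effect is not idempotent, mirroring the suspicion above where the
# copy step's source (temp) table is gone on the retry.
class FakeBigQuery:
    def __init__(self):
        self.tables = {"temp_table": ["row1", "row2"]}

    def copy_table(self, src, dst):
        if src not in self.tables:
            raise RuntimeError(f"Not found: Table {src}")  # reason: notFound
        self.tables[dst] = list(self.tables[src])

    def delete_table(self, name):
        self.tables.pop(name, None)

def copy_step(bq):
    # Non-idempotent: copy, then delete the source as a side effect.
    bq.copy_table("temp_table", "final_table")
    bq.delete_table("temp_table")

bq = FakeBigQuery()
copy_step(bq)        # first attempt succeeds and deletes temp_table
try:
    copy_step(bq)    # replay/retry before the commit barrier now fails
except RuntimeError as e:
    print(e)         # -> Not found: Table temp_table
```

If Beam regenerated or renamed the temp tables on each retry this wouldn't bite, which is exactly the part we couldn't pin down.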