The GitHub Actions job "Tests (AMD)" on 
airflow.git/fix-beam-dataflow-job-id-resolve-by-name has succeeded.
Run started by GitHub user evgeniy-b (triggered by evgeniy-b).

Head commit for run:
ebf385dcc1554470fb130786930f928fa1e602a8 / Evgeniy Belyaev 
<[email protected]>
Resolve Beam Dataflow job id by name after launcher returns

`process_line_and_extract_dataflow_job_id_callback` in
`airflow.providers.google.cloud.hooks.dataflow` extracts the Dataflow
job id from the Beam SDK's stdout via `JOB_ID_PATTERN`. When the line is
missing or formatted differently, `dataflow_job_id` stays `None` and any
downstream call that requires it (deferred polling, on_kill, xcom
consumers) fails.
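
A minimal sketch of the failure mode described above. The pattern here is a simplified stand-in, not the provider's actual `JOB_ID_PATTERN`, and `extract_job_id` is a hypothetical helper used only for illustration:

```python
import re
from typing import Iterable, Optional

# Simplified stand-in for illustration; the real JOB_ID_PATTERN may differ.
ILLUSTRATIVE_JOB_ID_PATTERN = re.compile(
    r"Created job with id: \[(?P<job_id>[^\]\s]+)\]"
)


def extract_job_id(stdout_lines: Iterable[str]) -> Optional[str]:
    """Return the first job id found in the launcher output, or None."""
    for line in stdout_lines:
        match = ILLUSTRATIVE_JOB_ID_PATTERN.search(line)
        if match:
            return match.group("job_id")
    # No line matched: the caller is left without a job id.
    return None
```

If the launcher prints the id in a different format, or not at all, the function returns `None`, which is the state the commit message describes.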

Drop the stdout scrape from
`BeamRunPythonPipelineOperator.execute_on_dataflow`,
`BeamRunJavaPipelineOperator.execute_on_dataflow`, and
`BeamRunGoPipelineOperator.execute_on_dataflow`, and look the job id up
once via the Dataflow API after the Beam launcher subprocess returns.
Add `DataflowHook.fetch_job_id_by_name` alongside the other name-based
lookups (`is_job_dataflow_running`, `cancel_job`, `get_job`): list
active jobs whose name starts with the configured `dataflow_job_name`
and return the id when exactly one match is found. Lookup failures are
logged and swallowed.
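
The lookup rule described above could be sketched roughly as follows. This is not the actual `DataflowHook.fetch_job_id_by_name` implementation: `resolve_job_id_by_name` and the injected `list_active_jobs` callable are hypothetical, and the job dicts are assumed to carry `name` and `id` keys:

```python
import logging
from typing import Callable, Optional, Sequence

log = logging.getLogger(__name__)


def resolve_job_id_by_name(
    list_active_jobs: Callable[[], Sequence[dict]],
    job_name_prefix: str,
) -> Optional[str]:
    """Return the id of the single active job whose name starts with the prefix."""
    try:
        jobs = list_active_jobs()
    except Exception:
        # Lookup failures are logged and swallowed, per the commit message.
        log.warning("Could not list jobs for %r", job_name_prefix, exc_info=True)
        return None

    matches = [
        job for job in jobs if job.get("name", "").startswith(job_name_prefix)
    ]
    if len(matches) != 1:
        # Zero or several candidates: refuse to guess.
        log.warning(
            "Expected one job for %r, found %d", job_name_prefix, len(matches)
        )
        return None
    return matches[0].get("id")
```

Returning `None` on zero or multiple matches keeps the caller from attaching to the wrong job when several runs share a name prefix.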

Report URL: https://github.com/apache/airflow/actions/runs/26637780739

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
