I am creating an Apache Beam pipeline using the Python SDK. I want to use
one of the standard Dataflow templates (this one:
<https://console.cloud.google.com/gcr/images/dataflow-templates-base/global/python310-template-launcher-base?tab=info>).
But when I specify it with the 'template_location' key while creating the
pipeline_options object, I get the following error:
`FileNotFoundError: [Errno 2] No such file or directory: 'gcr.io/dataflow-templates-base/python310-template-launcher-base'`

I also tried specifying the complete version,
`gcr.io/dataflow-templates-base/python310-template-launcher-base::flex_templates_base_image_release_20231127_RC00`,
but got the same error. Can someone suggest what I might be doing wrong?

The code snippet that creates pipeline_options is as follows:

def __create_pipeline_options_dataflow(job_name):
    # Set up the Dataflow runner options
    gcp_project_id = os.environ.get(GCP_PROJECT_ID)
    # TODO: Move to environment variables
    pipeline_options = {
        'project': gcp_project_id,
        'region': "us-east1",
        'job_name': job_name,  # Provide a unique job name
        'temp_location': f'gs://{TAS_GCS_BUCKET_NAME_PREFIX}{os.getenv("UP_PLATFORM_ENV")}/temp',
        'staging_location': f'gs://{TAS_GCS_BUCKET_NAME_PREFIX}{os.getenv("UP_PLATFORM_ENV")}/staging',
        'runner': 'DataflowRunner',
        'save_main_session': True,
        'service_account_email': service_account,
        # 'network': f'projects/{gcp_project_id}/global/networks/default',
        # 'subnetwork': f'projects/{gcp_project_id}/regions/us-east1/subnetworks/default'
        'template_location': 'gcr.io/dataflow-templates-base/python310-template-launcher-base'
    }
    logger.debug(f"pipeline_options created as {pipeline_options}")
    return pipeline_options
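
One thing that may be worth checking, offered as a sketch and not a confirmed diagnosis: as far as I understand, 'template_location' names a file path (usually a gs:// path) where the runner writes a classic template, whereas the gcr.io value above is a container image name. That would be consistent with the FileNotFoundError. The helper names, the bucket name, and the decision to accept only gs:// paths are assumptions made for illustration; they are not part of the Beam API.

```python
from urllib.parse import urlparse


def is_gcs_path(location: str) -> bool:
    """Return True if `location` looks like gs://bucket/object."""
    parsed = urlparse(location.strip())
    has_object = bool(parsed.path.strip("/"))
    return parsed.scheme == "gs" and bool(parsed.netloc) and has_object


def check_template_location(options: dict) -> dict:
    """Fail early if 'template_location' is set but is not a gs:// path.

    Hypothetical helper: restricting the value to gs:// paths is an
    assumption of this sketch, not a rule enforced by Beam itself.
    """
    location = options.get("template_location")
    if location is not None and not is_gcs_path(location):
        raise ValueError(
            f"template_location should be a file path such as "
            f"gs://<bucket>/templates/<name>, got {location!r}"
        )
    return options


# Hypothetical bucket and template name, for illustration only.
ok_options = check_template_location(
    {"template_location": "gs://my-bucket/templates/my_template"}
)
```

Passing the gcr.io image name from the snippet above through `check_template_location` raises a ValueError before any job is submitted, which makes the mismatch visible earlier than the FileNotFoundError does.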