I am creating an Apache Beam pipeline using the Python SDK. I want to use one of the standard Dataflow templates (this one:
<https://console.cloud.google.com/gcr/images/dataflow-templates-base/global/python310-template-launcher-base?tab=info>).
But when I specify it via the 'template_location' key while creating the
pipeline_options object, I get the error `FileNotFoundError: [Errno 2] No
such file or directory: 'gcr.io/dataflow-templates-base/python310-template-launcher-base'`

I also tried specifying the full version,
`gcr.io/dataflow-templates-base/python310-template-launcher-base::flex_templates_base_image_release_20231127_RC00`,
but got the same error. Can someone suggest what I might be doing wrong?
The code snippet that creates pipeline_options is as follows:

def __create_pipeline_options_dataflow(job_name):
    # Set up the Dataflow runner options
    gcp_project_id = os.environ.get(GCP_PROJECT_ID)
    # TODO: Move to environment variables
    pipeline_options = {
        'project': gcp_project_id,
        'region': 'us-east1',
        'job_name': job_name,  # Provide a unique job name
        'temp_location': f'gs://{TAS_GCS_BUCKET_NAME_PREFIX}{os.getenv("UP_PLATFORM_ENV")}/temp',
        'staging_location': f'gs://{TAS_GCS_BUCKET_NAME_PREFIX}{os.getenv("UP_PLATFORM_ENV")}/staging',
        'runner': 'DataflowRunner',
        'save_main_session': True,
        'service_account_email': service_account,
        # 'network': f'projects/{gcp_project_id}/global/networks/default',
        # 'subnetwork': f'projects/{gcp_project_id}/regions/us-east1/subnetworks/default',
        'template_location': 'gcr.io/dataflow-templates-base/python310-template-launcher-base',
    }
    logger.debug(f"pipeline_options created as {pipeline_options}")
    return pipeline_options
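
For what it's worth, the error reproduces outside Beam too. This is just my understanding (an assumption, not confirmed from the SDK source): with DataflowRunner, 'template_location' is opened as a writable file path where the generated template spec is staged, so a container image name is not a valid value for it:

```python
# Minimal repro of the FileNotFoundError, with no Beam dependency.
# Assumption: the SDK effectively tries to open template_location for
# writing, and 'gcr.io/...' is not an existing directory on disk or GCS.
template_location = "gcr.io/dataflow-templates-base/python310-template-launcher-base"

try:
    # Roughly what staging the template spec would attempt:
    with open(template_location, "w") as f:
        f.write("{}")
except FileNotFoundError as err:
    print(err)  # same Errno 2 as in my traceback
```
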
