Hi Sofia and XQ,

The application is failing because every file defines a logger at import
time, and the method that creates a logger instantiates an
UplightTelemetry object. If I use flex templates, will the environment
variables I supply be loaded before the application itself is loaded? If
not, that approach would not serve my purpose.
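
For reference, a rough sketch of deferring logger creation so nothing
touches UplightTelemetry at import time (the uplight_telemetry import and
create_logger name are hypothetical stand-ins for the actual package API):

import logging
import os

_logger = None

def get_logger():
    # Create the telemetry-backed logger lazily, on first use, so that
    # UplightTelemetry is not instantiated at import time, before
    # OTEL_SERVICE_NAME is available in the worker environment.
    global _logger
    if _logger is None:
        try:
            from uplight_telemetry import create_logger  # hypothetical helper
            _logger = create_logger(__name__)
        except Exception:
            # Fall back to plain logging if telemetry cannot start yet.
            _logger = logging.getLogger(__name__)
    return _logger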

Thanks & Regards,
Sumit Desai

On Thu, Dec 21, 2023 at 10:02 AM Sumit Desai <sumit.de...@uplight.com>
wrote:

> Thank you XQ. Will take a look at this.
>
> Regards,
> Sumit Desai
>
> On Wed, Dec 20, 2023 at 8:13 PM XQ Hu <x...@google.com> wrote:
>
>> Dataflow VMs cannot see your local env variables. I think you should use a
>> custom container:
>> https://cloud.google.com/dataflow/docs/guides/using-custom-containers.
>> Here is a sample project: https://github.com/google/dataflow-ml-starter
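>>
>> A minimal sketch of such a custom container (the base image tag and the
>> package filename are assumptions; match them to your Beam version and setup):
>>
>> # Dockerfile (sketch)
>> FROM apache/beam_python3.10_sdk:2.52.0
>>
>> # Bake the environment variables the package expects into the worker image.
>> ENV OTEL_SERVICE_NAME=my-service
>> ENV OTEL_RESOURCE_ATTRIBUTES=key=value
>>
>> # Optionally install the private package into the image as well.
>> COPY uplight-telemetry.tar.gz /tmp/
>> RUN pip install /tmp/uplight-telemetry.tar.gz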
>>
>> On Wed, Dec 20, 2023 at 4:57 AM Sofia’s World <mmistr...@gmail.com>
>> wrote:
>>
>>> Hello Sumit
>>>  Thanks. Sorry... I guess if the value of the env variable is always
>>> the same, you can pass it as a job param? ...though it doesn't sound
>>> like a viable option...
>>> Hth
>>>
>>> On Wed, 20 Dec 2023, 09:49 Sumit Desai, <sumit.de...@uplight.com> wrote:
>>>
>>>> Hi Sofia,
>>>>
>>>> Thanks for the response. For now, we have decided not to use flex
>>>> templates. Is there a way to pass environment variables without using any
>>>> template?
>>>>
>>>> Thanks & Regards,
>>>> Sumit Desai
>>>>
>>>> On Wed, Dec 20, 2023 at 3:16 PM Sofia’s World <mmistr...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi
>>>>>  My 2 cents: have you ever considered using flex templates to run your
>>>>> pipeline? Then you can pass all your parameters at runtime, along the
>>>>> lines of the example below.
>>>>> (Apologies in advance if it does not cover your use case...)
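>>>>>
>>>>> Something like this (the job name, template path, and parameter names
>>>>> here are placeholders):
>>>>>
>>>>> gcloud dataflow flex-template run "process-bills" \
>>>>>     --template-file-gcs-location gs://my-bucket/templates/template.json \
>>>>>     --region us-east1 \
>>>>>     --parameters OTEL_SERVICE_NAME=my-service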
>>>>>
>>>>> On Wed, 20 Dec 2023, 09:35 Sumit Desai via user, <user@beam.apache.org>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a Python application which uses Apache Beam with Dataflow as
>>>>>> the runner. The application uses a non-public Python package
>>>>>> 'uplight-telemetry', which is configured using 'extra_packages' while
>>>>>> creating the pipeline_options object. This package expects an environment
>>>>>> variable named 'OTEL_SERVICE_NAME', and since this variable is not present
>>>>>> on the Dataflow workers, it results in an error during application
>>>>>> startup.
>>>>>>
>>>>>> I am passing this variable using custom pipeline options. The code to
>>>>>> create the pipeline options is as follows:
>>>>>>
>>>>>> pipeline_options = ProcessBillRequests.CustomOptions(
>>>>>>     project=gcp_project_id,
>>>>>>     region="us-east1",
>>>>>>     job_name=job_name,
>>>>>>     temp_location=f'gs://{TAS_GCS_BUCKET_NAME_PREFIX}{os.getenv("UP_PLATFORM_ENV")}/temp',
>>>>>>     staging_location=f'gs://{TAS_GCS_BUCKET_NAME_PREFIX}{os.getenv("UP_PLATFORM_ENV")}/staging',
>>>>>>     runner='DataflowRunner',
>>>>>>     save_main_session=True,
>>>>>>     service_account_email=service_account,
>>>>>>     subnetwork=os.environ.get(SUBNETWORK_URL),
>>>>>>     extra_packages=[uplight_telemetry_tar_file_path],
>>>>>>     setup_file=setup_file_path,
>>>>>>     OTEL_SERVICE_NAME=otel_service_name,
>>>>>>     OTEL_RESOURCE_ATTRIBUTES=otel_resource_attributes
>>>>>>     # Set values for additional custom variables as needed
>>>>>> )
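>>>>>>
>>>>>> (For reference, the CustomOptions class used above would be defined
>>>>>> roughly like this, using Beam's standard hook for custom options; the
>>>>>> exact arguments are illustrative:)
>>>>>>
>>>>>> from apache_beam.options.pipeline_options import PipelineOptions
>>>>>>
>>>>>> class CustomOptions(PipelineOptions):
>>>>>>     @classmethod
>>>>>>     def _add_argparse_args(cls, parser):
>>>>>>         # Register the custom flags so Beam accepts them as pipeline options.
>>>>>>         parser.add_argument("--OTEL_SERVICE_NAME", default=None)
>>>>>>         parser.add_argument("--OTEL_RESOURCE_ATTRIBUTES", default=None)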
>>>>>>
>>>>>>
>>>>>> And the code that executes the pipeline is as follows:
>>>>>>
>>>>>>
>>>>>> result = (
>>>>>>     pipeline
>>>>>>     | "ReadPendingRecordsFromDB" >> read_from_db
>>>>>>     | "Parse input PCollection" >> beam.Map(ProcessBillRequests.parse_bill_data_requests)
>>>>>>     | "Fetch bills" >> beam.ParDo(ProcessBillRequests.FetchBillInformation())
>>>>>> )
>>>>>>
>>>>>> pipeline.run().wait_until_finish()
>>>>>>
>>>>>> Is there a way I can make the environment variables set in the custom
>>>>>> options available on the worker?
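>>>>>>
>>>>>> (One pattern that might work, sketched here on the assumption that the
>>>>>> values are plain strings: pass them into the DoFn and re-export them
>>>>>> into os.environ from setup(), which runs on each worker before any
>>>>>> elements are processed:)
>>>>>>
>>>>>> import os
>>>>>> import apache_beam as beam
>>>>>>
>>>>>> class FetchBillInformation(beam.DoFn):
>>>>>>     def __init__(self, otel_service_name):
>>>>>>         # Plain strings are pickled with the DoFn and shipped to workers.
>>>>>>         self.otel_service_name = otel_service_name
>>>>>>
>>>>>>     def setup(self):
>>>>>>         # setup() runs once per DoFn instance on the worker, before
>>>>>>         # process(), so the variable is in place when telemetry starts.
>>>>>>         os.environ["OTEL_SERVICE_NAME"] = self.otel_service_name
>>>>>>
>>>>>>     def process(self, element):
>>>>>>         # ... existing per-element logic ...
>>>>>>         yield element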
>>>>>>
>>>>>> Thanks & Regards,
>>>>>> Sumit Desai
>>>>>>
>>>>>