This might be relevant:
https://cloud.google.com/dataflow/docs/resources/faq#how_do_i_handle_nameerrors

If "save_main_session" is set, Dataflow tries to pickle the main session,
including any client objects defined at module level. So you might have to
define such objects locally (for example, within functions, DoFn classes,
etc.), or update the pipeline to not set "save_main_session" (and manage
dependencies according to this
<https://beam.apache.org/documentation/sdks/python-pipeline-dependencies/>
guide if needed).
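
For example, here is a minimal sketch of constructing the client inside a
DoFn's setup() method so it is created on each worker rather than pickled
with the main session (the DoFn name and the BigQuery client used here are
just illustrative):

import apache_beam as beam

class EnrichWithBigQueryDoFn(beam.DoFn):
    def setup(self):
        # Runs once per worker instance, so the client never
        # needs to be pickled with the pipeline.
        from google.cloud import bigquery
        self._client = bigquery.Client()

    def process(self, element):
        # ... use self._client to look up or enrich the element ...
        yield element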

Thanks,
Cham

On Mon, Mar 29, 2021 at 12:31 PM Rajnil Guha <rajnil94.g...@gmail.com>
wrote:

> Hi Beam Community,
>
> I am running a Dataflow pipeline using the Python SDK. I do some ETL
> processing on my data and then write the output into BigQuery. When I try
> to write into BigQuery I get the error below in the Dataflow job. However,
> when running this pipeline locally on the DirectRunner, the same code runs
> successfully and the data is written into BigQuery.
>
>  "Clients have non-trivial state that is local and unpickleable.",
> _pickle.PicklingError: Pickling client objects is explicitly not supported.
> Clients have non-trivial state that is local and unpickleable.
>
> I have added the full traceback in the attached file.
> I am trying to write data into BigQuery as below:
>
> write_delivered_orders = (
>     delivered_orders
>     | "ConvertDeliveredToJSON" >> beam.Map(to_json)
>     | "WriteDeliveredOrders" >> beam.io.WriteToBigQuery(
>         delivered_order_table_spec,
>         schema=table_schema,
>         create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
>         write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
>         additional_bq_parameters={'timePartitioning': {'type': 'DAY'}}
>     )
> )
>
> Has anyone encountered this error? If so, could you please help me
> understand and resolve it?
> Thanks in advance.
>
> Thanks & Regards
> Rajnil Guha
>
>
