I tried both approaches mentioned below:

1. Imported the modules locally inside the DoFns and functions, then ran the pipeline with "save_main_session" set to True.
2. Imported the modules globally, created a requirements.txt file to declare the dependencies, and ran the pipeline with "save_main_session" set to False, as follows:

python batch_pipeline.py --project=$PROJECT --region=us-central1 --runner=DataflowRunner --staging=gs://dataflow-code-bucket/test --temp_location gs://dataflow-code-bucket/test --input gs://dataflow-code-bucket/input-files/food_daily.csv --requirements_file requirements.txt --save_main_session False

Both times the pipeline fails with the same pickling error. The requirements.txt file is attached.
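For reference, the idea behind approach 1 can be illustrated without Apache Beam at all: when a module is imported inside the processing method rather than at module level, the serialized object carries no reference to main-session globals, so it pickles and unpickles cleanly. The sketch below uses a plain class (ParseRow is a hypothetical stand-in for a Beam DoFn) and the stdlib pickle module to demonstrate the pattern; it is an illustration of the technique, not the actual pipeline code.

```python
import pickle

class ParseRow:
    """Hypothetical stand-in for a Beam DoFn using a local import."""

    def process(self, line):
        # Local import: resolved when process() runs on the worker,
        # so nothing module-level needs to be pickled along with self.
        import json
        return [json.loads(line)]

# The instance round-trips through pickle without any main-session state.
fn = ParseRow()
restored = pickle.loads(pickle.dumps(fn))
print(restored.process('{"a": 1}'))
```

If a pickling error still occurs with this pattern, the unpicklable reference is usually something else captured by the DoFn (e.g. a client object created in __init__ rather than in setup).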
apache-beam==2.24.0
astroid==2.4.2
avro-python3==1.9.2.1
cachetools==3.1.1
certifi==2020.6.20
cffi==1.14.3
chardet==3.0.4
crcmod==1.7
cryptography==3.1.1
dill==0.3.1.1
docopt==0.6.2
fastavro==0.23.6
fasteners==0.15
future==0.18.2
google-api-core==1.22.2
google-apitools==0.5.31
google-auth==1.21.2
google-cloud-bigquery==1.28.0
google-cloud-bigtable==1.5.1
google-cloud-core==1.4.1
google-cloud-datastore==1.15.3
google-cloud-dlp==1.0.0
google-cloud-language==1.3.0
google-cloud-pubsub==1.7.0
google-cloud-spanner==1.19.1
google-cloud-storage==1.31.1
google-cloud-videointelligence==1.16.0
google-cloud-vision==1.0.0
google-crc32c==1.0.0
google-resumable-media==1.0.0
googleapis-common-protos==1.52.0
grpc-google-iam-v1==0.12.3
grpcio==1.32.0
grpcio-gcp==0.2.2
hdfs==2.5.8
httplib2==0.17.4
idna==2.10
isort==5.5.3
lazy-object-proxy==1.4.3
mccabe==0.6.1
mock==2.0.0
monotonic==1.5
numpy==1.19.2
oauth2client==3.0.0
pbr==5.5.0
protobuf==3.13.0
pyarrow==0.17.1
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
pydot==1.4.1
pylint==2.6.0
pymongo==3.11.0
pyOpenSSL==19.1.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2020.1
regex==2021.3.17
requests==2.24.0
rsa==4.6
six==1.15.0
toml==0.10.1
typed-ast==1.4.1
typing-extensions==3.7.4.3
urllib3==1.25.10
wrapt==1.12.1
Thanks & Regards,
Rajnil Guha