Seems like the image we use in KFP to orchestrate the job has cloudpickle==0.8.1 and that one doesn't seem to cause issues. I think I'm unblock for now but I'm sure I won't be the last one to try to do this using GCP managed notebooks :(
Thanks for all the help! On Tue, Feb 4, 2020 at 12:24 PM Alan Krumholz <alan.krumh...@betterup.co> wrote: > I'm using a managed notebook instance from GCP > It seems those already come with cloudpickle==1.2.2 as soon as you > provision it. apache-beam[gcp] will then install dill==0.3.1.1 I'm going > to try to uninstall cloudpickle before installing apache-beam and see if > this fixes the problem > > Thank you > > On Tue, Feb 4, 2020 at 11:54 AM Valentyn Tymofieiev <valen...@google.com> > wrote: > >> The fact that you have cloudpickle==1.2.2 further confirms that you may >> be hitting the same error as >> https://stackoverflow.com/questions/42960637/python-3-5-dill-pickling-unpickling-on-different-servers-keyerror-classtype >> . >> >> Could you try to start over with a clean virtual environment? >> >> On Tue, Feb 4, 2020 at 11:46 AM Alan Krumholz <alan.krumh...@betterup.co> >> wrote: >> >>> Hi Valentyn, >>> >>> Here is my pip freeze on my machine (note that the error is in dataflow, >>> the job runs fine in my machine) >>> >>> ansiwrap==0.8.4 >>> apache-beam==2.19.0 >>> arrow==0.15.5 >>> asn1crypto==1.3.0 >>> astroid==2.3.3 >>> astropy==3.2.3 >>> attrs==19.3.0 >>> avro-python3==1.9.1 >>> azure-common==1.1.24 >>> azure-storage-blob==2.1.0 >>> azure-storage-common==2.1.0 >>> backcall==0.1.0 >>> bcolz==1.2.1 >>> binaryornot==0.4.4 >>> bleach==3.1.0 >>> boto3==1.11.9 >>> botocore==1.14.9 >>> cachetools==3.1.1 >>> certifi==2019.11.28 >>> cffi==1.13.2 >>> chardet==3.0.4 >>> Click==7.0 >>> cloudpickle==1.2.2 >>> colorama==0.4.3 >>> configparser==4.0.2 >>> confuse==1.0.0 >>> cookiecutter==1.7.0 >>> crcmod==1.7 >>> cryptography==2.8 >>> cycler==0.10.0 >>> daal==2019.0 >>> datalab==1.1.5 >>> decorator==4.4.1 >>> defusedxml==0.6.0 >>> dill==0.3.1.1 >>> distro==1.0.1 >>> docker==4.1.0 >>> docopt==0.6.2 >>> docutils==0.15.2 >>> entrypoints==0.3 >>> enum34==1.1.6 >>> fairing==0.5.3 >>> fastavro==0.21.24 >>> fasteners==0.15 >>> fsspec==0.6.2 >>> future==0.18.2 >>> gcsfs==0.6.0 >>> gitdb2==2.0.6 >>> GitPython==3.0.5 >>> google-api-core==1.16.0 >>> google-api-python-client==1.7.11 >>> google-apitools==0.5.28 >>> google-auth==1.11.0 >>> google-auth-httplib2==0.0.3 >>> google-auth-oauthlib==0.4.1 >>> google-cloud-bigquery==1.17.1 >>> google-cloud-bigtable==1.0.0 >>> google-cloud-core==1.2.0 >>> google-cloud-dataproc==0.6.1 >>> google-cloud-datastore==1.7.4 >>> google-cloud-language==1.3.0 >>> google-cloud-logging==1.14.0 >>> google-cloud-monitoring==0.31.1 >>> google-cloud-pubsub==1.0.2 >>> google-cloud-secret-manager==0.1.1 >>> google-cloud-spanner==1.13.0 >>> google-cloud-storage==1.25.0 >>> google-cloud-translate==2.0.0 >>> google-compute-engine==20191210.0 >>> google-resumable-media==0.4.1 >>> googleapis-common-protos==1.51.0 >>> grpc-google-iam-v1==0.12.3 >>> grpcio==1.26.0 >>> h5py==2.10.0 >>> hdfs==2.5.8 >>> html5lib==1.0.1 >>> htmlmin==0.1.12 >>> httplib2==0.12.0 >>> icc-rt==2020.0.133 >>> idna==2.8 >>> ijson==2.6.1 >>> imageio==2.6.1 >>> importlib-metadata==1.4.0 >>> intel-numpy==1.15.1 >>> intel-openmp==2020.0.133 >>> intel-scikit-learn==0.19.2 >>> intel-scipy==1.1.0 >>> ipykernel==5.1.4 >>> ipython==7.9.0 >>> ipython-genutils==0.2.0 >>> ipython-sql==0.3.9 >>> ipywidgets==7.5.1 >>> isort==4.3.21 >>> jedi==0.16.0 >>> Jinja2==2.11.0 >>> jinja2-time==0.2.0 >>> jmespath==0.9.4 >>> joblib==0.14.1 >>> json5==0.8.5 >>> jsonschema==3.2.0 >>> jupyter==1.0.0 >>> jupyter-aihub-deploy-extension==0.1 >>> jupyter-client==5.3.4 >>> jupyter-console==6.1.0 >>> jupyter-contrib-core==0.3.3 >>> jupyter-contrib-nbextensions==0.5.1 >>> jupyter-core==4.6.1 >>> jupyter-highlight-selected-word==0.2.0 >>> jupyter-http-over-ws==0.0.7 >>> jupyter-latex-envs==1.4.6 >>> jupyter-nbextensions-configurator==0.4.1 >>> jupyterlab==1.2.6 >>> jupyterlab-git==0.9.0 >>> jupyterlab-server==1.0.6 >>> keyring==10.1 >>> keyrings.alt==1.3 >>> kiwisolver==1.1.0 >>> kubernetes==10.0.1 >>> lazy-object-proxy==1.4.3 >>> llvmlite==0.31.0 >>> lxml==4.4.2 >>> Markdown==3.1.1 >>> MarkupSafe==1.1.1 >>> matplotlib==3.0.3 >>> mccabe==0.6.1 >>> missingno==0.4.2 >>> mistune==0.8.4 >>> mkl==2019.0 >>> mkl-fft==1.0.6 >>> mkl-random==1.0.1.1 >>> mock==2.0.0 >>> monotonic==1.5 >>> more-itertools==8.1.0 >>> nbconvert==5.6.1 >>> nbdime==1.1.0 >>> nbformat==5.0.4 >>> networkx==2.4 >>> nltk==3.4.5 >>> notebook==6.0.3 >>> numba==0.47.0 >>> numpy==1.15.1 >>> oauth2client==3.0.0 >>> oauthlib==3.1.0 >>> opencv-python==4.1.2.30 >>> oscrypto==1.2.0 >>> packaging==20.1 >>> pandas==0.25.3 >>> pandas-profiling==1.4.0 >>> pandocfilters==1.4.2 >>> papermill==1.2.1 >>> parso==0.6.0 >>> pathlib2==2.3.5 >>> pbr==5.4.4 >>> pexpect==4.8.0 >>> phik==0.9.8 >>> pickleshare==0.7.5 >>> Pillow-SIMD==6.2.2.post1 >>> pipdeptree==0.13.2 >>> plotly==4.5.0 >>> pluggy==0.13.1 >>> poyo==0.5.0 >>> prettytable==0.7.2 >>> prometheus-client==0.7.1 >>> prompt-toolkit==2.0.10 >>> protobuf==3.11.2 >>> psutil==5.6.7 >>> ptyprocess==0.6.0 >>> py==1.8.1 >>> pyarrow==0.15.1 >>> pyasn1==0.4.8 >>> pyasn1-modules==0.2.8 >>> pycparser==2.19 >>> pycrypto==2.6.1 >>> pycryptodomex==3.9.6 >>> pycurl==7.43.0 >>> pydaal==2019.0.0.20180713 >>> pydot==1.4.1 >>> Pygments==2.5.2 >>> pygobject==3.22.0 >>> PyJWT==1.7.1 >>> pylint==2.4.4 >>> pymongo==3.10.1 >>> pyOpenSSL==19.1.0 >>> pyparsing==2.4.6 >>> pyrsistent==0.15.7 >>> pytest==5.3.4 >>> pytest-pylint==0.14.1 >>> python-apt==1.4.1 >>> python-dateutil==2.8.1 >>> pytz==2019.3 >>> PyWavelets==1.1.1 >>> pyxdg==0.25 >>> PyYAML==5.3 >>> pyzmq==18.1.1 >>> qtconsole==4.6.0 >>> requests==2.22.0 >>> requests-oauthlib==1.3.0 >>> retrying==1.3.3 >>> rsa==4.0 >>> s3transfer==0.3.2 >>> scikit-image==0.15.0 >>> scikit-learn==0.19.2 >>> scipy==1.1.0 >>> seaborn==0.9.1 >>> SecretStorage==2.3.1 >>> Send2Trash==1.5.0 >>> simplegeneric==0.8.1 >>> six==1.14.0 >>> smmap2==2.0.5 >>> snowflake-connector-python==2.2.0 >>> SQLAlchemy==1.3.13 >>> sqlparse==0.3.0 >>> tbb==2019.0 >>> tbb4py==2019.0 >>> tenacity==6.0.0 >>> terminado==0.8.3 >>> testpath==0.4.4 >>> textwrap3==0.9.2 >>> tornado==5.1.1 >>> tqdm==4.42.0 >>> traitlets==4.3.3 >>> typed-ast==1.4.1 >>> typing==3.7.4.1 >>> typing-extensions==3.7.4.1 >>> unattended-upgrades==0.1 >>> uritemplate==3.0.1 >>> urllib3==1.24.2 >>> virtualenv==16.7.9 >>> wcwidth==0.1.8 >>> webencodings==0.5.1 >>> websocket-client==0.57.0 >>> Werkzeug==0.16.1 >>> whichcraft==0.6.1 >>> widgetsnbextension==3.5.1 >>> wrapt==1.11.2 >>> zipp==1.1.0 >>> >>> >>> On Tue, Feb 4, 2020 at 11:33 AM Valentyn Tymofieiev <valen...@google.com> >>> wrote: >>> >>>> It don't think there is a mismatch between dill versions here, but >>>> https://stackoverflow.com/questions/42960637/python-3-5-dill-pickling-unpickling-on-different-servers-keyerror-classtype >>>> mentions >>>> a similar error and may be related. What is the output of pip freeze on >>>> your machine (or better: pip install pipdeptree; pipdeptree)? >>>> >>>> >>>> On Tue, Feb 4, 2020 at 11:22 AM Alan Krumholz < >>>> alan.krumh...@betterup.co> wrote: >>>> >>>>> Here is a test job that sometimes fails and sometimes doesn't (but >>>>> most times do)..... >>>>> There seems to be something stochastic that causes this as after >>>>> several tests a couple of them did succeed.... >>>>> >>>>> >>>>> def test_error( >>>>> bq_table: str) -> str: >>>>> >>>>> import apache_beam as beam >>>>> from apache_beam.options.pipeline_options import PipelineOptions >>>>> >>>>> class GenData(beam.DoFn): >>>>> def process(self, _): >>>>> for _ in range (20000): >>>>> yield {'a':1,'b':2} >>>>> >>>>> >>>>> def get_bigquery_schema(): >>>>> from apache_beam.io.gcp.internal.clients import bigquery >>>>> >>>>> table_schema = bigquery.TableSchema() >>>>> columns = [ >>>>> ["a","integer","nullable"], >>>>> ["b","integer","nullable"] >>>>> ] >>>>> >>>>> for column in columns: >>>>> column_schema = bigquery.TableFieldSchema() >>>>> column_schema.name = column[0] >>>>> column_schema.type = column[1] >>>>> column_schema.mode = column[2] >>>>> table_schema.fields.append(column_schema) >>>>> >>>>> return table_schema >>>>> >>>>> pipeline = beam.Pipeline(options=PipelineOptions( >>>>> project='my-project', >>>>> temp_location = 'gs://my-bucket/temp', >>>>> staging_location = 'gs://my-bucket/staging', >>>>> runner='DataflowRunner' >>>>> )) >>>>> #pipeline = beam.Pipeline() >>>>> >>>>> ( >>>>> pipeline >>>>> | 'Empty start' >> beam.Create(['']) >>>>> | 'Generate Data' >> beam.ParDo(GenData()) >>>>> #| 'print' >> beam.Map(print) >>>>> | 'Write to BigQuery' >> beam.io.WriteToBigQuery( >>>>> project=bq_table.split(':')[0], >>>>> dataset=bq_table.split(':')[1].split('.')[0], >>>>> table=bq_table.split(':')[1].split('.')[1], >>>>> schema=get_bigquery_schema(), >>>>> >>>>> create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED, >>>>> >>>>> write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE) >>>>> ) >>>>> >>>>> result = pipeline.run() >>>>> result.wait_until_finish() >>>>> >>>>> return True >>>>> >>>>> test_error( >>>>> bq_table = 'my-project:my_dataset.my_table' >>>>> ) >>>>> >>>>> On Tue, Feb 4, 2020 at 10:04 AM Alan Krumholz < >>>>> alan.krumh...@betterup.co> wrote: >>>>> >>>>>> I tried breaking apart my pipeline. Seems the step that breaks it is: >>>>>> beam.io.WriteToBigQuery >>>>>> >>>>>> Let me see if I can create a self contained example that breaks to >>>>>> share with you >>>>>> >>>>>> Thanks! >>>>>> >>>>>> On Tue, Feb 4, 2020 at 9:53 AM Pablo Estrada <pabl...@google.com> >>>>>> wrote: >>>>>> >>>>>>> Hm that's odd. No changes to the pipeline? Are you able to share >>>>>>> some of the code? >>>>>>> >>>>>>> +Udi Meiri <eh...@google.com> do you have any idea what could be >>>>>>> going on here? >>>>>>> >>>>>>> On Tue, Feb 4, 2020 at 9:25 AM Alan Krumholz < >>>>>>> alan.krumh...@betterup.co> wrote: >>>>>>> >>>>>>>> Hi Pablo, >>>>>>>> This is strange... it doesn't seem to be the last beam release as >>>>>>>> last night it was already using 2.19.0 I wonder if it was some release >>>>>>>> from >>>>>>>> the DataFlow team (not beam related): >>>>>>>> Job typeBatch >>>>>>>> Job status Succeeded >>>>>>>> SDK version >>>>>>>> Apache Beam Python 3.5 SDK 2.19.0 >>>>>>>> Region >>>>>>>> us-central1 >>>>>>>> Start timeFebruary 3, 2020 at 9:28:35 PM GMT-8 >>>>>>>> Elapsed time5 min 11 sec >>>>>>>> >>>>>>>> On Tue, Feb 4, 2020 at 9:15 AM Pablo Estrada <pabl...@google.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi Alan, >>>>>>>>> could it be that you're picking up the new Apache Beam 2.19.0 >>>>>>>>> release? Could you try depending on beam 2.18.0 to see if the issue >>>>>>>>> surfaces when using the new release? >>>>>>>>> >>>>>>>>> If something was working and no longer works, it sounds like a >>>>>>>>> bug. This may have to do with how we pickle (dill / cloudpickle) - >>>>>>>>> see this >>>>>>>>> question >>>>>>>>> https://stackoverflow.com/questions/42960637/python-3-5-dill-pickling-unpickling-on-different-servers-keyerror-classtype >>>>>>>>> Best >>>>>>>>> -P. >>>>>>>>> >>>>>>>>> On Tue, Feb 4, 2020 at 6:22 AM Alan Krumholz < >>>>>>>>> alan.krumh...@betterup.co> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> I was running a dataflow job in GCP last night and it was running >>>>>>>>>> fine. >>>>>>>>>> This morning this same exact job is failing with the following >>>>>>>>>> error: >>>>>>>>>> >>>>>>>>>> Error message from worker: Traceback (most recent call last): >>>>>>>>>> File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", >>>>>>>>>> line 286, in loads return dill.loads(s) File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 275, in >>>>>>>>>> loads >>>>>>>>>> return load(file, ignore, **kwds) File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 270, in >>>>>>>>>> load >>>>>>>>>> return Unpickler(file, ignore=ignore, **kwds).load() File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 472, in >>>>>>>>>> load >>>>>>>>>> obj = StockUnpickler.load(self) File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 577, in >>>>>>>>>> _load_type return _reverse_typemap[name] KeyError: 'ClassType' During >>>>>>>>>> handling of the above exception, another exception occurred: >>>>>>>>>> Traceback >>>>>>>>>> (most recent call last): File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dataflow_worker/batchworker.py", >>>>>>>>>> line 648, in do_work work_executor.execute() File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dataflow_worker/executor.py", >>>>>>>>>> line >>>>>>>>>> 176, in execute op.start() File >>>>>>>>>> "apache_beam/runners/worker/operations.py", >>>>>>>>>> line 649, in apache_beam.runners.worker.operations.DoOperation.start >>>>>>>>>> File >>>>>>>>>> "apache_beam/runners/worker/operations.py", line 651, in >>>>>>>>>> apache_beam.runners.worker.operations.DoOperation.start File >>>>>>>>>> "apache_beam/runners/worker/operations.py", line 652, in >>>>>>>>>> apache_beam.runners.worker.operations.DoOperation.start File >>>>>>>>>> "apache_beam/runners/worker/operations.py", line 261, in >>>>>>>>>> apache_beam.runners.worker.operations.Operation.start File >>>>>>>>>> "apache_beam/runners/worker/operations.py", line 266, in >>>>>>>>>> apache_beam.runners.worker.operations.Operation.start File >>>>>>>>>> "apache_beam/runners/worker/operations.py", line 597, in >>>>>>>>>> apache_beam.runners.worker.operations.DoOperation.setup File >>>>>>>>>> "apache_beam/runners/worker/operations.py", line 602, in >>>>>>>>>> apache_beam.runners.worker.operations.DoOperation.setup File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/apache_beam/internal/pickler.py", >>>>>>>>>> line 290, in loads return dill.loads(s) File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 275, in >>>>>>>>>> loads >>>>>>>>>> return load(file, ignore, **kwds) File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 270, in >>>>>>>>>> load >>>>>>>>>> return Unpickler(file, ignore=ignore, **kwds).load() File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 472, in >>>>>>>>>> load >>>>>>>>>> obj = StockUnpickler.load(self) File >>>>>>>>>> "/usr/local/lib/python3.5/site-packages/dill/_dill.py", line 577, in >>>>>>>>>> _load_type return _reverse_typemap[name] KeyError: 'ClassType' >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> If I use a local runner it still runs fine. >>>>>>>>>> Anyone else experiencing something similar today? (or know how to >>>>>>>>>> fix this?) >>>>>>>>>> >>>>>>>>>> Thanks! >>>>>>>>>> >>>>>>>>>