Hello,

When trying to run the coders.py cookbook example [1], it raises an
exception

```
   PicklingError : _pickle.PicklingError: Can't pickle <class
'__main__.JsonCoder'>: it's not the same object as __main__.JsonCoder
[while running 'read/Read/Map(<lambda at iobase.py:899>)']
```

The example pipeline works when it's run using the coders_test.py file [2].
In the existing test, the DAG creates the input collection instead of
reading from a file.

I am using Python 3.7.9, the Python Beam SDK 2.29.0 and DirectRunner, under
Darwin.

To reproduce the issue, run:
```
  python coders.py --input input.ndjson --output output.txt
```

Where the input file (input.ndjson) has the same values as coders_test.py:
```
{"host": ["Germany", 1], "guest": ["Italy", 0]}
{"host": ["Germany", 1], "guest": ["Brasil", 3]}
{"host": ["Brasil", 1], "guest": ["Italy", 0]}
```

The full stack trace is at the bottom of the email.

Questions:
1. Am I doing something wrong?
2. Is a coder recommended to decode NDJSON files or is it advised to just
use ParDos / Maps to deserialize them?

I really appreciate any help you can provide.

Kind regards,

Tatiana

[1]
https://github.com/apache/beam/blob/35bac6a62f1dc548ee908cfeff7f73ffcac38e6f/sdks/python/apache_beam/examples/cookbook/coders.py
[2]
https://github.com/apache/beam/blob/35bac6a62f1dc548ee908cfeff7f73ffcac38e6f/sdks/python/apache_beam/examples/cookbook/coders_test.py


```
Traceback (most recent call last):

 File "apache_beam/runners/common.py", line 1233, in
apache_beam.runners.common.DoFnRunner.process

 File "apache_beam/runners/common.py", line 581, in
apache_beam.runners.common.SimpleInvoker.invoke_process

 File "apache_beam/runners/common.py", line 1395, in
apache_beam.runners.common._OutputProcessor.process_outputs

 File "apache_beam/runners/worker/operations.py", line 219, in
apache_beam.runners.worker.operations.SingletonConsumerSet.receive

 File "apache_beam/runners/worker/operations.py", line 183, in
apache_beam.runners.worker.operations.ConsumerSet.update_counters_start

 File "apache_beam/runners/worker/opcounters.py", line 217, in
apache_beam.runners.worker.opcounters.OperationCounters.update_from

 File "apache_beam/runners/worker/opcounters.py", line 255, in
apache_beam.runners.worker.opcounters.OperationCounters.do_sample

 File "apache_beam/coders/coder_impl.py", line 1311, in
apache_beam.coders.coder_impl.WindowedValueCoderImpl.get_estimated_size_and_observables

 File "apache_beam/coders/coder_impl.py", line 1322, in
apache_beam.coders.coder_impl.WindowedValueCoderImpl.get_estimated_size_and_observables

 File "apache_beam/coders/coder_impl.py", line 354, in
apache_beam.coders.coder_impl.FastPrimitivesCoderImpl.get_estimated_size_and_observables

 File "apache_beam/coders/coder_impl.py", line 418, in
apache_beam.coders.coder_impl.FastPrimitivesCoderImpl.encode_to_stream

 File "apache_beam/coders/coder_impl.py", line 261, in
apache_beam.coders.coder_impl.CallbackCoderImpl.encode_to_stream

 File
"/private/tmp/dataflow_sandbox/lib/python3.7/site-packages/apache_beam/coders/coders.py",
line 749, in <lambda>

  lambda x: dumps(x, protocol), pickle.loads)

_pickle.PicklingError: Can't pickle <class '__main__.JsonCoder'>: it's not
the same object as __main__.JsonCoder
```

Reply via email to