Alvaro created BEAM-11230:
-----------------------------

             Summary: ReadFromBigQuery fails when the table has repeated records
                 Key: BEAM-11230
                 URL: https://issues.apache.org/jira/browse/BEAM-11230
             Project: Beam
          Issue Type: Bug
          Components: sdk-py-core
    Affects Versions: 2.25.0
            Reporter: Alvaro


This is very similar to the issue reported here: 
https://issues.apache.org/jira/browse/BEAM-10524

I upgraded the Python SDK from 2.24 to 2.25, and ReadFromBigQuery 
started failing with this stack trace:

 
{code:java}
....

"/usr/local/lib/python3.7/site-packages/dataflow_worker/batchworker.py", line 649, in do_work
    work_executor.execute()
  File "/usr/local/lib/python3.7/site-packages/dataflow_worker/executor.py", line 179, in execute
    op.start()
  File "dataflow_worker/native_operations.py", line 38, in dataflow_worker.native_operations.NativeReadOperation.start
  File "dataflow_worker/native_operations.py", line 39, in dataflow_worker.native_operations.NativeReadOperation.start
  File "dataflow_worker/native_operations.py", line 44, in dataflow_worker.native_operations.NativeReadOperation.start
  File "dataflow_worker/native_operations.py", line 48, in dataflow_worker.native_operations.NativeReadOperation.start
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/concat_source.py", line 89, in read
    range_tracker.sub_range_tracker(source_ix)):
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/textio.py", line 210, in read_records
    yield self._coder.decode(record)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py", line 633, in decode
    return self._decode_with_schema(value, self.fields)
  File "/usr/local/lib/python3.7/site-packages/apache_beam/io/gcp/bigquery.py", line 656, in _decode_with_schema
    value[field.name] = converter(value[field.name])
TypeError: int() argument must be a string, a bytes-like object or a number, not 'list'
{code}
According to the aforementioned issue, this should be fixed in 2.25, but in my 
case it is the opposite: the failure started only after upgrading to 2.25. 

Code: 
https://github.com/apache/beam/blob/release-2.25.0/sdks/python/apache_beam/io/gcp/bigquery.py#L656
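
The TypeError above can be reproduced outside Beam with a minimal sketch. It is not the Beam implementation: Field, CONVERTERS, decode_naive and decode_mode_aware are hypothetical stand-ins, and it assumes a REPEATED INTEGER column arrives as a list of strings. The naive version applies the converter to the whole value, as the failing line in the trace appears to do; the mode-aware version converts each element of a repeated field.

```python
from collections import namedtuple

# Hypothetical stand-in for a BigQuery schema field (not a Beam class).
Field = namedtuple("Field", ["name", "type", "mode"])

# Assumed converter table, for illustration only.
CONVERTERS = {"INTEGER": int, "FLOAT": float}


def decode_naive(row, fields):
    """Apply the converter to the whole value, ignoring the field mode."""
    for field in fields:
        converter = CONVERTERS.get(field.type)
        if converter is not None and row.get(field.name) is not None:
            row[field.name] = converter(row[field.name])
    return row


def decode_mode_aware(row, fields):
    """Apply the converter per element when the field is REPEATED."""
    for field in fields:
        converter = CONVERTERS.get(field.type)
        value = row.get(field.name)
        if converter is None or value is None:
            continue
        if field.mode == "REPEATED":
            row[field.name] = [converter(v) for v in value]
        else:
            row[field.name] = converter(value)
    return row


fields = [
    Field("id", "INTEGER", "NULLABLE"),
    Field("scores", "INTEGER", "REPEATED"),
]

# The naive decoder calls int() on a list and raises TypeError.
try:
    decode_naive({"id": "1", "scores": ["10", "20"]}, fields)
    naive_error = None
except TypeError as exc:
    naive_error = str(exc)

# The mode-aware decoder converts each element of the repeated field.
decoded = decode_mode_aware({"id": "1", "scores": ["10", "20"]}, fields)
```

Whether the actual fix belongs in _decode_with_schema or elsewhere in the read path is not something this sketch settles; it only shows why a list value trips int().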

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
