jubebo commented on issue #27166: URL: https://github.com/apache/beam/issues/27166#issuecomment-1630871409
As discussed in [the second PR](https://github.com/apache/beam/pull/27317), implementing these suggestions requires changes to Beam core, which is why I have been investigating alternative solutions to my problem. I have found that a simple call to

```python
import apache_beam as beam

beam.io.ReadFromBigQuery(
    project="your-project",
    dataset="your-dataset",
    table="your-table",
    method=beam.io.ReadFromBigQuery.Method.DIRECT_READ,
)
```

solves the problems I am currently facing, while still letting me parse nested data directly from BigQuery into Beam. From the [Google Cloud Dataflow documentation](https://cloud.google.com/dataflow/docs/guides/read-from-bigquery#serialization) I also understand that manually parsing the output format can yield performance improvements compared to reading TableRow (or beam.Row) data types.

Please let me know if there are any major concerns regarding this alternative approach, or if you see additional improvements that would come from following the suggestions of this feature request, as discussed at the beginning of this conversation.
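To illustrate what "manually parsing" nested data might look like downstream of the read, here is a minimal sketch. It assumes the Python SDK hands each row to the pipeline as a plain dictionary, with nested RECORD fields as nested dictionaries and REPEATED fields as lists. The helper `flatten_record` and the sample row are hypothetical and not part of Beam.

```python
from typing import Any, Dict


def flatten_record(record: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys; lists and scalars are kept as-is."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested RECORD-like values.
            flat.update(flatten_record(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical row, shaped the way a nested RECORD column is assumed to arrive.
row = {"id": 1, "user": {"name": "a", "address": {"city": "x"}}, "tags": ["p", "q"]}
flat_row = flatten_record(row)
```

In a pipeline, a helper like this could be applied right after the read step with `beam.Map(flatten_record)`.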
