https://github.com/apache/beam/pull/12389

Hi everyone, in the above pull request I am attempting to add support for 
writing Avro records with maps to a BigQuery table (via Beam Schema).  The 
write portion is fairly straightforward: we convert the map to an array of 
structs with key and value fields, which seems to be the closest 
approximation of a map that BigQuery offers.  The read-back portion is more 
controversial, though: we simply check whether a field is an array of structs 
with exactly two fields, key and value, and assume it should be read into a 
Schema map field.
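To make the scheme concrete, here is a minimal sketch in plain Python of the 
round trip described above (illustrative only; the PR itself is against the 
Beam Java SDK, and the helper names below are hypothetical, not Beam API):

```python
# Illustrative sketch of the map <-> array-of-structs round trip.
# BigQuery has no native map type, so a Beam Schema map is written as a
# repeated (array) field of structs with "key" and "value" subfields.

def map_to_repeated_struct(m):
    """Write side: convert a map to an array of {key, value} structs."""
    return [{"key": k, "value": v} for k, v in m.items()]

def looks_like_map(struct_field_names):
    """Read side (the heuristic under discussion): treat any array of
    structs whose fields are exactly 'key' and 'value' as a Schema map."""
    return sorted(struct_field_names) == ["key", "value"]

def repeated_struct_to_map(rows):
    """Read side: fold the array of {key, value} structs back into a map."""
    return {row["key"]: row["value"] for row in rows}

# A genuine map survives the round trip.
written = map_to_repeated_struct({"a": 1, "b": 2})
assert looks_like_map(["key", "value"])
assert repeated_struct_to_map(written) == {"a": 1, "b": 2}

# The concern: an array of structs that was never a map, but happens to
# have exactly 'key' and 'value' fields, also matches the heuristic.
assert looks_like_map(["value", "key"])
```

The second assertion on looks_like_map is the crux of the question below: the 
heuristic cannot distinguish a field that was written from a map from one 
that merely has the same shape.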

So the possibility exists that an array of structs with key and value fields, 
which wasn't originally written from a map, could be unexpectedly read into a 
map.  In the PR review I suggested a few options for tagging the BigQuery 
field, so that we could know it was written from a Beam Schema map and should 
be read back into one, but I'm not very satisfied with any of the options.

Andrew Pilloud suggested that I write to this group to get some feedback on 
the issue.  Should we be concerned that every array of structs with exactly 
'key' and 'value' fields would be read into a Schema map, or could this be 
considered a feature?  If the former, how would you suggest we limit the map 
conversion to only those fields that were originally written from a map?

Thanks for any feedback to help bump this PR along!
