teabot edited a comment on pull request #1046:
URL: https://github.com/apache/iceberg/pull/1046#issuecomment-635487412


   Thanks for the comprehensive reply @rdblue. The SQL experience argument is a 
compelling one. I agree that among formats that support union, integrations 
often (and perhaps understandably) overlook or avoid the type, and that this 
would probably be the case with Iceberg too. In our platforms we've tended to 
'prohibit' producers from using unions because they effectively end up creating 
batch datasets that are unreadable downstream. The choice that Iceberg has made 
seems a sensible one.
   
   That said, we're now seeing use cases in the stream domain that in my 
opinion are better modelled with union (multi-type streams). In practice these 
are currently modelled using a loose set of independent schemas that have no 
complete explicit contract. I don't like this because it pushes elements of the 
producer/consumer schema contract outside of Avro and into an implicit 
convention that cannot benefit from compatibility checking, for example.
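   
   A minimal sketch of the union-based alternative, using plain Python and 
Avro's JSON schema syntax. The event names (`OrderPlaced`, `OrderCancelled`), 
the `example.events` namespace, the fields, and the `permitted_types` helper 
are all hypothetical and purely illustrative. In Avro's JSON syntax a union is 
written as an array of schemas, so a multi-type stream could declare every 
permitted record type in one explicit schema rather than by convention:

```python
import json

# Hypothetical event types; names and fields are illustrative only.
order_placed = {
    "type": "record",
    "name": "OrderPlaced",
    "namespace": "example.events",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount_cents", "type": "long"},
    ],
}

order_cancelled = {
    "type": "record",
    "name": "OrderCancelled",
    "namespace": "example.events",
    "fields": [
        {"name": "order_id", "type": "string"},
        # Optional field expressed as a union with null.
        {"name": "reason", "type": ["null", "string"], "default": None},
    ],
}

# A top-level union: the stream's whole contract lives in one Avro schema.
stream_schema = [order_placed, order_cancelled]


def permitted_types(union_schema):
    """Return the full names of the record types the union admits."""
    return [f"{s['namespace']}.{s['name']}" for s in union_schema]


schema_json = json.dumps(stream_schema, indent=2)
print(permitted_types(stream_schema))
```

   With the loose-set approach, the equivalent of `stream_schema` exists only 
as an unwritten agreement between producers and consumers.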
   
   I digress - what I hope to distill from this kind of discourse are good 
patterns for bridging data in streams and at rest for these use cases. But I 
admit that this is not a problem that Iceberg should solve.
   
   Apologies for the distraction and thanks for your time.

