aokolnychyi commented on pull request #1318:
URL: https://github.com/apache/iceberg/pull/1318#issuecomment-672237164

> How can we reduce storage in these scenarios? Can these additional fields be nulls?

I had the same question. I think they should be null in most cases to reduce the size of delete files. It is good to have a flexible format, though.

> Why not just primary keys definition for table? Will equality field IDs be different between files? It can be used as schema evolution?

This seems like an unnecessary restriction. Users would need to know all the columns they will update by at the table creation time and include them into the composite natural key. While it is essential to sort the data by the upsert key for performance, we can still append an extra column later on.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
