rdblue commented on pull request #1981:
URL: https://github.com/apache/iceberg/pull/1981#issuecomment-751026580


   > To make sure I understand correctly, changes in projection methods ensure 
that both behaviors before and after this fix will be accounted for by the 
projection, so that we might not need to have separate implementations for 
format v1 versus v2, with a slight penalty that in v2 we may scan more data 
than we have to?
   
   Yes, this fixes the transforms and ensures that predicate projection 
includes the partitions that were written with bad values. That means we 
won't need different implementations for v2. It also means that we can avoid 
scanning the extra partitions in the future: because this is fixed before v2, 
we can ensure that all v2 tables have been fixed. If a table is created as v2, 
we should be able to know that no older writer with the bug has written to 
it. The only case where a v2 table would have bad metadata is when a v1 table 
has been converted. We should add a flag, either one that signals the table 
was converted from v1 or one that signals it was created as v2, so that we can 
skip the extra partitions for tables that were never v1.
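
To make the idea concrete, here is a minimal sketch of a projection that covers both the fixed and the old behavior. It is not the code in this PR. It assumes, purely for illustration, that the bad values came from a transform that truncated toward zero instead of flooring for values before the epoch. The names `day_floor`, `day_truncating`, `project_equals`, and the `may_have_legacy_values` flag are all hypothetical.

```python
HOURS_PER_DAY = 24


def day_floor(hours: int) -> int:
    """Fixed transform (assumed): floor, so hour -1 falls in day -1."""
    return hours // HOURS_PER_DAY


def day_truncating(hours: int) -> int:
    """Hypothetical old behavior: truncate toward zero, so hour -1 lands in day 0."""
    quotient = abs(hours) // HOURS_PER_DAY
    return quotient if hours >= 0 else -quotient


def project_equals(hours: int, may_have_legacy_values: bool = True) -> set:
    """Partitions that could hold rows matching `ts_hours == hours`.

    When the table may contain data from a writer with the bug, the
    projection also includes the partition that writer would have used.
    When a (hypothetical) flag says the table was created as v2, the
    extra partition can be skipped.
    """
    partitions = {day_floor(hours)}
    if may_have_legacy_values:
        partitions.add(day_truncating(hours))
    return partitions
```

With this sketch, `project_equals(-1)` selects two partitions, while `project_equals(-1, may_have_legacy_values=False)` selects only the correct one. Values at or after the epoch project to a single partition either way, so the extra scan cost is limited to predicates on values before the epoch.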

