gaborkaszab commented on issue #6042: URL: https://github.com/apache/iceberg/issues/6042#issuecomment-1326214188
Well, in my opinion, if the question is whether we should apply delete files when calculating the existing 'file_count' and 'record_count' metrics, then it might not be worth the extra performance impact: we can always say that these two metrics describe the contents of the data files, without the delete files applied. Showing a partition in the output that has already been renamed is a different story, I think. I'm not saying that we should definitely implement applying the delete files to omit these partitions; I just think we might want to reconsider whether this is worth the work and the extra performance impact. That is why I opened #6257: the topic feels a bit different, since the current ticket is about introducing new fields in the metadata table, while the one I opened is for discussing whether we want to apply delete files to drop partitions from the output that have been renamed in the meantime.
