ahshahid commented on issue #6424: URL: https://github.com/apache/iceberg/issues/6424#issuecomment-1351924900
Right. I was also thinking that this is where I have a misunderstanding or a bug. The question is whether recordCount represents the row count of the scanned fraction, or the total row count of the split/file. I will look into the code, but in my view recordCount has to be the partially scanned count, for two reasons: 1) The estimated row count function needs to return the total row count in the split/file. If that were already available, no calculation would be needed. 2) For a split that covers part of a single file, or that spans multiple files, the estimated row count of the whole split has to be calculated, and for that recordCount should refer to the partially scanned row count.
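To make the two readings concrete, here is a minimal sketch of the other interpretation, in which recordCount is the row count of the whole file and the estimate scales it by the fraction of bytes the task covers. The class `ScanTask`, its fields, and both functions are hypothetical names for illustration only, not Iceberg's API, and whether Iceberg actually computes it this way still needs to be checked in the code.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ScanTask:
    """Hypothetical stand-in for one task: a byte range of one data file."""
    file_record_count: int   # ASSUMPTION: rows in the whole file, not the scanned part
    file_size_bytes: int     # total size of the data file
    length_bytes: int        # bytes of the file covered by this task


def estimated_rows(task: ScanTask) -> int:
    """Scale the whole-file row count by the fraction of bytes scanned."""
    scanned_fraction = task.length_bytes / task.file_size_bytes
    return int(scanned_fraction * task.file_record_count)


def estimated_rows_for_split(tasks: Iterable[ScanTask]) -> int:
    """A split may span several files, so sum the per-task estimates."""
    return sum(estimated_rows(t) for t in tasks)


# A split made of half of a 1000-row file plus all of a 200-row file.
split = [
    ScanTask(file_record_count=1000, file_size_bytes=128, length_bytes=64),
    ScanTask(file_record_count=200, file_size_bytes=32, length_bytes=32),
]
total = estimated_rows_for_split(split)
```

If recordCount were instead already the partially scanned count, the scaling step in `estimated_rows` would undercount, and the per-task values could simply be summed. That difference is what looking at the code should settle.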
