gaborkaszab commented on PR #15979:
URL: https://github.com/apache/iceberg/pull/15979#issuecomment-4371995087

   Thank you for sharing your view, @ajantha-bhat !
   
   Let me sum up how I understand the situation so far:
   - Calculating total_record_count as a derived value on the write path 
probably doesn't add much value, because a) it can't work for all use cases 
(eq-deletes, v2 pos-deletes) and b) the engines could easily calculate it 
themselves if the partition has no deletes or has only v3 DVs.
   - Calculating total_record_count for all use cases requires a full data scan 
that applies the deletes. Engines might be able to do this when performing a full 
scan for other purposes, but this is very speculative; I don't think any engine 
would do this.
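
   To illustrate point b) above, here is a rough sketch of how an engine could 
derive the value itself. The `PartitionStats` class and its field names are 
hypothetical simplifications for this comment, not the actual partition stats 
schema or any Iceberg API. It also assumes that each deleted row is counted 
once by the DVs, so a plain subtraction is exact.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartitionStats:
    # Hypothetical, simplified view of one partition's stats (not the real schema).
    data_record_count: int
    dv_deleted_record_count: int = 0   # rows marked deleted by v3 DVs (assumed exact)
    v2_pos_delete_file_count: int = 0  # v2 position delete files in the partition
    eq_delete_file_count: int = 0      # equality delete files in the partition


def derive_total_record_count(stats: PartitionStats) -> Optional[int]:
    """Return the live row count, or None when it can't be derived from counters."""
    # Equality deletes and v2 position deletes may match zero rows or overlap,
    # so their effect is only known after scanning the data with deletes applied.
    if stats.eq_delete_file_count > 0 or stats.v2_pos_delete_file_count > 0:
        return None
    # No deletes, or only DVs: subtract the deleted rows from the data rows.
    return stats.data_record_count - stats.dv_deleted_record_count
```

   With only DVs, `derive_total_record_count(PartitionStats(100, dv_deleted_record_count=30))` 
gives 70, while any eq-delete or v2 pos-delete file makes it return `None`, 
which is case a).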
   
   I'd lean towards dropping it from the spec then, because I don't see any 
motivation to write it TBH. Keeping it there so that some engine in the future 
might write it doesn't make sense to me. WDYT @ajantha-bhat, @pvary?
   
   That said, we could still consider whether it's valuable to calculate it as a 
derived value on the read path.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

