Re: [PR] Flink: Add RemoveDanglingDeleteFiles to maintenance API [iceberg]

via GitHub Sat, 02 May 2026 06:30:59 -0700


Guosmilesmile commented on PR #16171:
URL: https://github.com/apache/iceberg/pull/16171#issuecomment-4363885397


   The current implementation of this feature reads DataFiles/DeleteFiles one 
by one in several places. It also builds a table-level Set and a 
partition-level minDataSeqByPartition. On large tables, this will cause 
performance problems and can lead to OOM. So I hope the Flink implementation 
can leverage distributed computing capabilities, just like Spark does.
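
   To make the concern concrete, here is a minimal single-process sketch of the
pattern described above. It is not the code from this PR: `FileEntry`,
`minDataSeqByPartition()` and `findDangling()` are hypothetical names, and the
comparison rule (position deletes dangle when their data sequence number is
below the partition minimum, equality deletes when it is at or below it) is my
understanding of the Spark action, stated here as an assumption. Everything is
held in one JVM heap, which is what would not scale on a large table.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DanglingDeleteSketch {

  // Hypothetical, simplified stand-in for a manifest entry; not an Iceberg type.
  static final class FileEntry {
    final String path;
    final String partition;
    final long dataSeq;
    final boolean isDelete;
    final boolean isEqualityDelete;

    FileEntry(String path, String partition, long dataSeq,
              boolean isDelete, boolean isEqualityDelete) {
      this.path = path;
      this.partition = partition;
      this.dataSeq = dataSeq;
      this.isDelete = isDelete;
      this.isEqualityDelete = isEqualityDelete;
    }
  }

  // Pass 1: smallest data sequence number among the data files of each partition.
  // This map grows with the number of partitions and lives entirely on the heap.
  static Map<String, Long> minDataSeqByPartition(List<FileEntry> entries) {
    Map<String, Long> minSeq = new HashMap<>();
    for (FileEntry e : entries) {
      if (!e.isDelete) {
        minSeq.merge(e.partition, e.dataSeq, Math::min);
      }
    }
    return minSeq;
  }

  // Pass 2: a delete file is treated as dangling when no data file in its
  // partition can still be affected by it (assumed rule, see the text above).
  static List<String> findDangling(List<FileEntry> entries) {
    Map<String, Long> minSeq = minDataSeqByPartition(entries);
    List<String> dangling = new ArrayList<>();
    for (FileEntry e : entries) {
      if (!e.isDelete) {
        continue;
      }
      Long min = minSeq.get(e.partition);
      boolean isDangling =
          min == null                      // partition has no data files at all
              || (e.isEqualityDelete ? e.dataSeq <= min : e.dataSeq < min);
      if (isDangling) {
        dangling.add(e.path);
      }
    }
    return dangling;
  }
}
```

   In a distributed version, the per-partition minimum would be a keyed
aggregation and the second pass a join on the partition key, so no single
operator has to hold the whole table's file list.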


