aokolnychyi commented on code in PR #11481:
URL: https://github.com/apache/iceberg/pull/11481#discussion_r1831735172
##########
data/src/main/java/org/apache/iceberg/data/BaseDeleteLoader.java:
##########
@@ -146,6 +151,26 @@ private <T> Iterable<T> materialize(CloseableIterable<T> iterable) {
   @Override
   public PositionDeleteIndex loadPositionDeletes(
       Iterable<DeleteFile> deleteFiles, CharSequence filePath) {
+    if (containsDVs(deleteFiles)) {
+      DeleteFile dv = Iterables.getOnlyElement(deleteFiles);
+      validateDV(dv, filePath);
+      return readDV(dv); // TODO: support caching entire DV files
Review Comment:
Some context on how caching works for V2 deletes: if we estimate that the
in-memory representation of an entire delete file fits into the cache, we
read the whole file and cache the result. For position delete files, that
means caching one bitmap per referenced data file. We could do something
similar for Puffin files, but I still need to explore the performance impact
of not knowing the footer size upfront.
Any early feedback is welcome!
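
To make the V2 behaviour described above concrete, here is a minimal sketch of the "cache the whole file if it fits" decision. It is an illustration under assumptions, not the code in `BaseDeleteLoader`: the class `DeleteCacheSketch`, the `PositionDelete` row type, the `load` and `isCached` methods, and the `estimatedBytes` parameter are all hypothetical, and `java.util.BitSet` stands in for the real position bitmap.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are Iceberg APIs.
final class DeleteCacheSketch {

  // One row of a position delete file: a data file path plus a deleted row position.
  static final class PositionDelete {
    final String dataFile;
    final int position; // real positions are 64-bit; int keeps BitSet usable here

    PositionDelete(String dataFile, int position) {
      this.dataFile = dataFile;
      this.position = position;
    }
  }

  private final long maxCachedBytes;
  // delete file path -> (data file path -> bitmap of deleted positions)
  private final Map<String, Map<String, BitSet>> cache = new HashMap<>();

  DeleteCacheSketch(long maxCachedBytes) {
    this.maxCachedBytes = maxCachedBytes;
  }

  // Returns the deleted positions for dataFile found in the given delete file.
  BitSet load(String deleteFile, List<PositionDelete> rows, long estimatedBytes, String dataFile) {
    if (estimatedBytes <= maxCachedBytes) {
      // Fits: index the entire delete file once and cache a bitmap per data file.
      Map<String, BitSet> bitmaps = cache.computeIfAbsent(deleteFile, key -> indexAll(rows));
      BitSet cached = bitmaps.get(dataFile);
      // Hand out a copy so callers cannot mutate the cached bitmap.
      return cached == null ? new BitSet() : (BitSet) cached.clone();
    }

    // Too large: build only the bitmap for the requested data file, cache nothing.
    BitSet bitmap = new BitSet();
    for (PositionDelete row : rows) {
      if (row.dataFile.equals(dataFile)) {
        bitmap.set(row.position);
      }
    }
    return bitmap;
  }

  private static Map<String, BitSet> indexAll(List<PositionDelete> rows) {
    Map<String, BitSet> bitmaps = new HashMap<>();
    for (PositionDelete row : rows) {
      bitmaps.computeIfAbsent(row.dataFile, key -> new BitSet()).set(row.position);
    }
    return bitmaps;
  }

  boolean isCached(String deleteFile) {
    return cache.containsKey(deleteFile);
  }
}
```

In this sketch the size estimate is passed in by the caller. For a DV stored in a Puffin file, producing that estimate before reading is the part that would depend on the footer question raised above.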
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]