[ https://issues.apache.org/jira/browse/IMPALA-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zoltán Borók-Nagy resolved IMPALA-13109.
----------------------------------------
    Fix Version/s: Impala 4.5.0
       Resolution: Fixed

> Use RoaringBitmap in IcebergDeleteNode
> --------------------------------------
>
>                 Key: IMPALA-13109
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13109
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 4.5.0
>
>
> IcebergDeleteNode currently uses an ordered int64_t array for each data file
> to hold the deleted positions. This can consume a significant amount of memory
> when there are lots of deleted records.
> E.g. 100 million delete records consume 800 MiB of memory.
> RoaringBitmap is a highly compressed and highly efficient data structure to
> store bitmaps:
> [https://arxiv.org/pdf/1603.06549]
> [https://github.com/RoaringBitmap/CRoaring]
> We could use it to store the deleted file positions instead of the sorted
> arrays, as
> * it consumes significantly less memory
> * it makes the code simpler
> * it *_might_* have perf benefits
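
To illustrate the memory argument in the description, here is a minimal sketch using only the C++ standard library. It is not the CRoaring API and not the code in IcebergDeleteNode: `ChunkedBitmap`, `SortedArrayBytes`, and the constants are hypothetical names made up for this example. It copies only the core Roaring idea: positions are grouped by their upper bits into chunks of 2^16 values, and each chunk is stored either as a sorted array of 16-bit offsets (sparse) or as a fixed 8 KiB bitset (dense). The sorted-array figure is just 8 bytes per position, which is where the roughly 800 MB for 100 million positions comes from.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <vector>

// All names below are hypothetical and exist only for this illustration.
constexpr std::size_t kChunkBits = 65536;   // 2^16 positions per chunk
constexpr std::size_t kArrayLimit = 4096;   // beyond this, a bitset is smaller

// Payload of the baseline: one int64_t per deleted position.
std::size_t SortedArrayBytes(std::size_t num_positions) {
  return num_positions * sizeof(int64_t);
}

// Simplified Roaring-style set of 64-bit row positions.
class ChunkedBitmap {
 public:
  void Add(uint64_t pos) {
    Chunk& c = chunks_[pos >> 16];
    const uint16_t low = static_cast<uint16_t>(pos & 0xFFFF);
    if (c.bits) {
      c.bits->set(low);
      return;
    }
    auto it = std::lower_bound(c.sorted.begin(), c.sorted.end(), low);
    if (it != c.sorted.end() && *it == low) return;  // already present
    c.sorted.insert(it, low);
    if (c.sorted.size() > kArrayLimit) {
      // Chunk became dense: switch from sorted array to bitset.
      c.bits = std::make_unique<std::bitset<kChunkBits>>();
      for (uint16_t v : c.sorted) c.bits->set(v);
      c.sorted.clear();
      c.sorted.shrink_to_fit();
    }
  }

  bool Contains(uint64_t pos) const {
    auto it = chunks_.find(pos >> 16);
    if (it == chunks_.end()) return false;
    const uint16_t low = static_cast<uint16_t>(pos & 0xFFFF);
    const Chunk& c = it->second;
    if (c.bits) return c.bits->test(low);
    return std::binary_search(c.sorted.begin(), c.sorted.end(), low);
  }

  // Bytes used by the stored values only (ignores map and pointer overhead).
  std::size_t PayloadBytes() const {
    std::size_t total = 0;
    for (const auto& kv : chunks_) {
      const Chunk& c = kv.second;
      total += c.bits ? kChunkBits / 8 : c.sorted.size() * sizeof(uint16_t);
    }
    return total;
  }

 private:
  struct Chunk {
    std::vector<uint16_t> sorted;                    // sparse representation
    std::unique_ptr<std::bitset<kChunkBits>> bits;   // dense representation
  };
  std::map<uint64_t, Chunk> chunks_;
};
```

For example, adding the one million consecutive positions 0 through 999,999 fills 16 dense chunks, so `PayloadBytes()` reports 16 x 8 KiB, while `SortedArrayBytes(1000000)` reports 8,000,000 bytes. Real delete files will not always be this dense, and CRoaring additionally has run-length containers, so actual savings depend on the data.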