[ 
https://issues.apache.org/jira/browse/IMPALA-13109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltán Borók-Nagy resolved IMPALA-13109.
----------------------------------------
    Fix Version/s: Impala 4.5.0
       Resolution: Fixed

> Use RoaringBitmap in IcebergDeleteNode
> --------------------------------------
>
>                 Key: IMPALA-13109
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13109
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>             Fix For: Impala 4.5.0
>
>
> IcebergDeleteNode currently uses an ordered int64_t array for each data file 
> to hold the deleted positions. This can consume a significant amount of memory 
> when there are lots of deleted records.
> E.g. 100 million delete records at 8 bytes each consume about 800 MB 
> (roughly 763 MiB) of memory.
> RoaringBitmap is a highly compressed and highly efficient data structure to 
> store bitmaps:
> [https://arxiv.org/pdf/1603.06549]
> [https://github.com/RoaringBitmap/CRoaring]
> We could use it to store the deleted file positions instead of the sorted 
> arrays, as it
>  * consumes significantly less memory
>  * makes the code simpler
>  * *_might_* have perf benefits
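
To make the memory argument above concrete, here is a minimal C++ sketch of a
chunked bitmap for deleted row positions. It is not Impala's implementation
and it does not use the CRoaring API. The names DeletedPositions,
SortedArrayBytes, and kChunkBits are hypothetical. It only borrows the Roaring
idea of splitting each position into a high part (the chunk key) and a low
16-bit part (the bit inside the chunk). It is assumed here that positions are
non-negative.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Bytes needed to keep n deleted positions in a plain int64_t array.
constexpr uint64_t SortedArrayBytes(uint64_t n) { return n * sizeof(int64_t); }

// Hypothetical stand-in for a Roaring-style set of deleted row positions.
// Only the "bitmap container" idea is modelled: every chunk that holds at
// least one position owns a 65536-bit bitset. Real Roaring bitmaps also have
// array and run containers for sparse or clustered chunks; this sketch does not.
class DeletedPositions {
  static constexpr size_t kChunkBits = 65536;
  using Chunk = std::bitset<kChunkBits>;

 public:
  // Marks the row at file position `pos` as deleted.
  void Add(int64_t pos) {
    assert(pos >= 0);
    auto& chunk = chunks_[static_cast<uint64_t>(pos) >> 16];
    if (!chunk) chunk = std::make_unique<Chunk>();
    chunk->set(static_cast<size_t>(pos & 0xFFFF));
  }

  // Returns true if `pos` was previously added.
  bool Contains(int64_t pos) const {
    if (pos < 0) return false;
    auto it = chunks_.find(static_cast<uint64_t>(pos) >> 16);
    return it != chunks_.end() &&
           it->second->test(static_cast<size_t>(pos & 0xFFFF));
  }

  // Approximate payload: 8 KiB per allocated chunk, hash map overhead ignored.
  size_t PayloadBytes() const { return chunks_.size() * (kChunkBits / 8); }

 private:
  std::unordered_map<uint64_t, std::unique_ptr<Chunk>> chunks_;
};
```

In this layout a fully populated chunk costs 8 KiB for 65,536 positions,
while the same positions stored as int64_t values take 512 KiB. The opposite
extreme, a chunk holding a single deleted position, still costs 8 KiB in this
sketch. That sparse case is what the additional container types of a real
Roaring bitmap are meant to handle, so the sketch should not be read as a
measurement of what CRoaring would use.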



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
