Hello everyone,

This is the final call for feedback on this FLIP. If there is no further feedback, I will start the vote tomorrow.

Best regards,
dylanhz

> On 11 December 2025 at 10:43, dylanhz <[email protected]> wrote:
>
> Hi David,
>
> Thanks for your review! Here are the clarifications:
>
> 1. You are right about the value range: the parenthesis excludes the upper
> boundary, so the largest supported value is 2^32-1.
>
> 2. Yes, we support sparsely populated bitmaps. We use RoaringBitmap[1] for the
> internal implementation, which efficiently handles both sparse and dense data
> via adaptive compression (Array/Bitmap/Run containers).
>
> 3.1 Operations like BITMAP_AND are set algebra operations
> (intersection/union) on the integers, not bit-level operations. Here is an
> example:
> ```
> SELECT BITMAP_OR(BITMAP_BUILD(ARRAY[1,2,3]), BITMAP_BUILD(ARRAY[3,6,9]))
>> {1,2,3,6,9}
> ```
>
> 3.2 Bitmap-level null semantics are described in the section “BITMAP
> Semantics”. Any integer value is either in the bitmap (bit 1) or not in it
> (bit 0); there is no “null” state.
>
> 3.3 A bitmap stores integers only, so you have to map non-integer values to
> integers before storing them.
> As for the timestamps example, I’m not sure why or how you would store them
> in a bitmap. Could you explain a little more?
>
> [1] https://github.com/RoaringBitmap/RoaringFormatSpec
>
> Best regards,
> dylanhz
>
>> On 10 December 2025 at 20:21, David Radley <[email protected]> wrote:
>>
>> Hi,
>> Looks good. A couple of thoughts:
>>
>> * It says “Only integers in the logical range [0, 2^32) are supported”. I
>>   assume the top value should be 2^32-1.
>> * I assume we can have sparsely populated bitmaps.
>> * When we do an AND or OR, I assume this is at the bit level, not at the
>>   integer level. Would a null value at position 6 be the same as 32 0’s for
>>   the purposes of these operations? It would be useful to show some examples
>>   around this, for example around timestamps.
>>
>> Kind regards, David.
>>
>> From: dylanhz <[email protected]>
>> Date: Wednesday, 26 November 2025 at 12:09
>> To: [email protected] <[email protected]>
>> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-556 Introduce BITMAP Data Type
>>
>> You are right that there is a binding, but I want to clarify: Bitmap is
>> bound to Roaring as the external serialization format, not the internal
>> implementation. The internal implementation can be changed independently
>> without affecting users.
>>
>> If users need other serialization formats in the future, we can add a format
>> parameter to the Bitmap#toBytes and Bitmap#fromBytes methods, as well as to
>> their corresponding built-in functions. Users can then work with the BYTES
>> type directly to serialize/deserialize in their preferred format when
>> exchanging data with external systems.
>>
>> Best regards,
>> dylanhz
>>
>>> On 26 November 2025 at 19:16, Xuyang <[email protected]> wrote:
>>>
>>> +1 for this feature.
>>> It is good to see bitmap support that enables Flink to handle
>>> computations in extremely high-dimensional scenarios. After reviewing the
>>> entire FLIP, I have one question:
>>> Regarding the Bitmap#toBytes interface, I noticed that it outputs bytes
>>> in RoaringBitmap format by default. Does this imply a strong binding to
>>> the internal implementation of RoaringBitmap? Writers on the sink table
>>> need to be aware that the bytes are in RoaringBitmap format, right?
>>>
>>> --
>>>
>>> Best!
>>> Xuyang
>>>
>>> At 2025-11-24 18:09:25, "Lincoln Lee" <[email protected]> wrote:
>>>> +1 for this feature! Expanding the bitmap type will help users unlock more
>>>> computation scenarios and integrate more easily with external systems.
>>>>
>>>> Best,
>>>> Lincoln Lee
>>>>
>>>> dylanhz <[email protected]> wrote on Friday, 21 November 2025 at 11:07:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I would like to start a discussion about FLIP-556 Introduce BITMAP Data
>>>>> Type[1].
>>>>>
>>>>> Flink currently has no native, compressed data type for large integer
>>>>> sets, forcing users to rely on external libraries like RoaringBitmap via
>>>>> UDFs.
>>>>> This limits performance, maintainability, and integration with Flink’s
>>>>> type system and SQL engine.
>>>>> We propose adding a built‑in BITMAP type based on RoaringBitmap to provide
>>>>> compact storage, exact deduplication, and efficient set operations (AND,
>>>>> OR, XOR) directly within Flink.
>>>>>
>>>>> I have had some initial discussions with @Lincoln Lee and @Jark Wu
>>>>> regarding this FLIP.
>>>>> Looking forward to your feedback and suggestions.
>>>>>
>>>>> [1]
>>>>> https://docs.google.com/document/d/1YNgIt93iFboogHMoKbDD4LjP5UrfqtF65hitGKRtKMs/edit?usp=sharing
>>>>>
>>>>> --
>>>>>
>>>>> Best regards,
>>>>> dylanhz
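
To make the set-algebra semantics discussed in the thread (point 3.1 and the AND/OR/XOR operations in the proposal) concrete, here is a minimal Python sketch that models a bitmap as a plain set of integers. The helper names `bitmap_build`, `bitmap_or`, `bitmap_and` and `bitmap_xor` are hypothetical stand-ins that only mirror the naming of the proposed SQL functions; they are not Flink or RoaringBitmap APIs. Raising an error for out-of-range values is also an assumption made for illustration, and the FLIP may define different behavior.

```python
# Illustrative model only: a "bitmap" is a frozenset of integers in [0, 2^32).
# All helpers below are hypothetical; a real implementation would use
# compressed containers (e.g. RoaringBitmap), not a Python set.

UPPER_BOUND = 2 ** 32  # exclusive bound, so the largest member is 2^32 - 1


def bitmap_build(values):
    """Collect integers into a bitmap; duplicates collapse into one member."""
    members = set()
    for v in values:
        # Assumption for this sketch: out-of-range input is rejected.
        if not (0 <= v < UPPER_BOUND):
            raise ValueError(f"{v} is outside [0, 2^32)")
        members.add(v)  # a value is either in the set or not; no null state
    return frozenset(members)


def bitmap_or(a, b):
    """Set union of the stored integers."""
    return a | b


def bitmap_and(a, b):
    """Set intersection of the stored integers."""
    return a & b


def bitmap_xor(a, b):
    """Symmetric difference of the stored integers."""
    return a ^ b


left = bitmap_build([1, 2, 3])
right = bitmap_build([3, 6, 9])

print(sorted(bitmap_or(left, right)))   # [1, 2, 3, 6, 9]
print(sorted(bitmap_and(left, right)))  # [3]
print(sorted(bitmap_xor(left, right)))  # [1, 2, 6, 9]
```

For contrast, a bit-level reading would combine the integers themselves (for example, `3 | 6` evaluates to `7`), which is not what the union above computes: the operations act on set membership, not on the binary representation of each stored value.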
