[ 
https://issues.apache.org/jira/browse/ARROW-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16891335#comment-16891335
 ] 

Wes McKinney commented on ARROW-4722:
-------------------------------------

I ran into this issue in ARROW-3772, where Parquet decoders currently pass 
around bitmaps in pieces (see e.g. 
https://github.com/apache/arrow/blob/master/cpp/src/parquet/encoding.h#L56). 
The builder classes currently accept null/not-null markers as a bytemap, but 
not as a bitmap. I propose that we have a simple bitmap object with a 
structure like:

{code}
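class Bitmap {
  const uint8_t* data_;  // start of the underlying bytes
  int64_t length_;       // number of bits in the bitmap
  int64_t offset_;       // position of the first bit within data_
};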
{code}

I'm not sure how best to handle non-const bitmaps, though. We might need something like:

{code}
class MutableBitmap : public Bitmap {
  ...
};
{code}
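
To make the const/non-const split concrete, here is a minimal sketch of how the two classes could fit together. Only {{data_}}, {{length_}} and {{offset_}} come from the proposal above; the constructors and the {{GetBit}} / {{SetBitTo}} accessors are hypothetical names chosen for illustration, and the sketch assumes {{offset_}} and {{length_}} are counted in bits, with bits stored least-significant-bit first within each byte as in the Arrow format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical read-only view over a bitmap. Does not own the memory.
class Bitmap {
 public:
  Bitmap(const uint8_t* data, int64_t offset, int64_t length)
      : data_(data), length_(length), offset_(offset) {}

  int64_t length() const { return length_; }

  // Read bit i of the view (LSB-first within each byte).
  bool GetBit(int64_t i) const {
    const int64_t pos = offset_ + i;
    return (data_[pos / 8] >> (pos % 8)) & 1;
  }

 protected:
  const uint8_t* data_;
  int64_t length_;
  int64_t offset_;
};

// Hypothetical writable variant: keeps a non-const pointer to the same bytes.
class MutableBitmap : public Bitmap {
 public:
  MutableBitmap(uint8_t* data, int64_t offset, int64_t length)
      : Bitmap(data, offset, length), mutable_data_(data) {}

  // Set or clear bit i of the view.
  void SetBitTo(int64_t i, bool value) {
    const int64_t pos = offset_ + i;
    const uint8_t mask = static_cast<uint8_t>(1u << (pos % 8));
    if (value) {
      mutable_data_[pos / 8] |= mask;
    } else {
      mutable_data_[pos / 8] &= static_cast<uint8_t>(~mask);
    }
  }

 private:
  uint8_t* mutable_data_;
};
```

With this shape, a function that only reads can take a {{const Bitmap&}} and still accept a {{MutableBitmap}}, at the cost of storing the pointer twice in the mutable case.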

A first place where this would be useful is as an alternative to any 
ArrayBuilder method that currently takes {{valid_bytes}}, such as

https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/builder_primitive.h#L119

{{Bitmap}} can have helper methods {{GetReader}} and {{GetWriter}} for 
constructing {{BitmapReader}} and {{BitmapWriter}} respectively.
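
As a rough illustration of the {{GetReader}} idea, the sketch below uses a simplified stand-in reader rather than Arrow's actual {{BitmapReader}}. The names {{SimpleBitmapReader}}, {{BitmapView}}, {{AtEnd}} and {{CountSetBits}} are all hypothetical, and the sketch again assumes bit-based offsets with LSB-first bit order. The point is only that callers hand over one object instead of a pointer, an offset and a length.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a sequential bit reader (not Arrow's real class).
class SimpleBitmapReader {
 public:
  SimpleBitmapReader(const uint8_t* data, int64_t offset, int64_t length)
      : data_(data), pos_(offset), end_(offset + length) {}

  // True if the bit at the current position is set.
  bool IsSet() const { return (data_[pos_ / 8] >> (pos_ % 8)) & 1; }
  // Advance to the next bit.
  void Next() { ++pos_; }
  // True once every bit in the range has been visited.
  bool AtEnd() const { return pos_ >= end_; }

 private:
  const uint8_t* data_;
  int64_t pos_;
  int64_t end_;
};

// Hypothetical view type bundling pointer, offset and length.
class BitmapView {
 public:
  BitmapView(const uint8_t* data, int64_t offset, int64_t length)
      : data_(data), length_(length), offset_(offset) {}

  // Helper that builds a reader positioned at the start of the view.
  SimpleBitmapReader GetReader() const {
    return SimpleBitmapReader(data_, offset_, length_);
  }

 private:
  const uint8_t* data_;
  int64_t length_;
  int64_t offset_;
};

// Example consumer: counts the set bits using only the view object.
int64_t CountSetBits(const BitmapView& bitmap) {
  int64_t count = 0;
  for (SimpleBitmapReader reader = bitmap.GetReader(); !reader.AtEnd();
       reader.Next()) {
    if (reader.IsSet()) ++count;
  }
  return count;
}
```

A {{GetWriter}} helper on a mutable bitmap could follow the same pattern, returning a writer over the same range.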

> [C++] Implement Bitmap class to modularize handling of bitmaps
> --------------------------------------------------------------
>
>                 Key: ARROW-4722
>                 URL: https://issues.apache.org/jira/browse/ARROW-4722
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Benjamin Kietzman
>            Priority: Minor
>             Fix For: 1.0.0
>
>
> This could be a simple view or it could own a {{shared_ptr<Buffer>}}. In 
> either case, it would greatly simplify situations where a {{pointer, offset, 
> length}} triple is currently passed around.



