[ https://issues.apache.org/jira/browse/ARROW-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567865#comment-17567865 ]
Yibo Cai commented on ARROW-17096:
----------------------------------

cc [~jorisvandenbossche], [~apitrou] for comments.

In the test below, the printed result *[false, false]* is wrong, but the underlying buffer is correct: *[false, true]* (00, 01).
Inspecting the buffer directly, it looks like pyarrow is treating the buffer as a bitmap (one bit per value), not as one byte per value like the C++ compute kernel does.
{code:python}
In [1]: import pyarrow.compute as pc

In [2]: import pyarrow as pa

In [3]: m = pc.mode(pa.array([True, False]), 2)

In [4]: m.field(0)
Out[4]:
<pyarrow.lib.BooleanArray object at 0x7fc50da06460>
[
  false,
  false
]

In [5]: m.field(0).buffers()[1].to_pybytes()
Out[5]: b'\x00\x01'
{code}

> pyarrow.compute.mode for boolean arrays does not return true when mixed with
> false
> ----------------------------------------------------------------------------------
>
>                 Key: ARROW-17096
>                 URL: https://issues.apache.org/jira/browse/ARROW-17096
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>    Affects Versions: 8.0.0
>            Reporter: Matthew Roeschke
>            Assignee: Yibo Cai
>            Priority: Major
>
> {code:java}
> In [1]: import pyarrow.compute as pc
> In [2]: import pyarrow as pa
> In [3]: pa.__version__
> Out[3]: '8.0.0'
> In [4]: pc.mode(pa.array([True, True]))
> # Correct
> Out[4]:
> <pyarrow.lib.StructArray object at 0x1266d5c60>
> -- is_valid: all not null
> -- child 0 type: bool
>   [
>     true
>   ]
> -- child 1 type: int64
>   [
>     2
>   ]
> # Incorrect
> In [5]: pc.mode(pa.array([True, False]), 2)
> Out[5]:
> <pyarrow.lib.StructArray object at 0x1262110c0>
> -- is_valid: all not null
> -- child 0 type: bool
>   [
>     false, # should be true
>     false
>   ]
> -- child 1 type: int64
>   [
>     1,
>     1
>   ]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
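
The bitmap-versus-byte reading described in the comment can be checked without pyarrow. The following is a minimal sketch in plain Python that decodes the two bytes from the session (b'\x00\x01') both ways. The helper names read_as_bitmap and read_as_bytes are hypothetical and exist only for this illustration; the bit order assumed is least-significant bit first, which is how the Arrow columnar format packs boolean values.

```python
from typing import List


def read_as_bitmap(buf: bytes, length: int) -> List[bool]:
    """Decode `length` booleans from a bit-packed buffer (LSB first)."""
    return [bool((buf[i // 8] >> (i % 8)) & 1) for i in range(length)]


def read_as_bytes(buf: bytes, length: int) -> List[bool]:
    """Decode `length` booleans from a buffer holding one byte per value."""
    return [bool(buf[i]) for i in range(length)]


# Buffer contents reported by m.field(0).buffers()[1].to_pybytes()
raw = b"\x00\x01"

as_bitmap = read_as_bitmap(raw, 2)  # bits 0 and 1 of the first byte
as_bytes = read_as_bytes(raw, 2)    # first and second byte

print(as_bitmap)  # [False, False]
print(as_bytes)   # [False, True]
```

Read as a bitmap, both values come from the first byte (0x00), which gives the wrong [false, false] seen in the printed array. Read as one byte per value, the same buffer gives the expected [false, true].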