[ 
https://issues.apache.org/jira/browse/ARROW-17096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17567865#comment-17567865
 ] 

Yibo Cai commented on ARROW-17096:
----------------------------------

cc [~jorisvandenbossche] , [~apitrou] for comments.

In the test below, the printed result *[false, false]* is wrong, but the 
underlying buffer holds the correct values *[false, true]* (bytes 00, 01).
Inspecting the buffer directly, it looks like pyarrow is treating it as a bitmap 
(one bit per value), rather than one byte per value as the C++ compute kernel does.

{code:python}
In [1]: import pyarrow.compute as pc

In [2]: import pyarrow as pa

In [3]: m = pc.mode(pa.array([True, False]), 2)

In [4]: m.field(0)
Out[4]: 
<pyarrow.lib.BooleanArray object at 0x7fc50da06460>
[
  false,
  false
]

In [5]: m.field(0).buffers()[1].to_pybytes()
Out[5]: b'\x00\x01'
{code}
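
To illustrate the suspected mismatch without pyarrow, here is a minimal pure-Python sketch that decodes the same two bytes both ways. The helper names are hypothetical and only for illustration; the sketch assumes the Arrow boolean layout (bit-packed, least-significant bit first) on the reading side and one byte per value on the writing side, as suggested above.

```python
def decode_as_bitmap(buf, length):
    """Read `length` booleans, one bit per value, least-significant bit first."""
    return [bool((buf[i // 8] >> (i % 8)) & 1) for i in range(length)]


def decode_as_bytes(buf, length):
    """Read `length` booleans, one byte per value (non-zero means true)."""
    return [bool(buf[i]) for i in range(length)]


def pack_bitmap(values):
    """Pack booleans into a bitmap, least-significant bit first."""
    out = bytearray((len(values) + 7) // 8)
    for i, v in enumerate(values):
        if v:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


raw = b"\x00\x01"  # buffer contents observed in Out[5] above

print(decode_as_bitmap(raw, 2))    # [False, False], matches the wrong printed result
print(decode_as_bytes(raw, 2))     # [False, True], the intended values
print(pack_bitmap([False, True]))  # b'\x02', what a bit-packed buffer would hold
```

If this reading is right, only bits 0 and 1 of the first byte (0x00) are consulted when printing, so the second byte (0x01) is never looked at.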

> pyarrow.compute.mode for boolean arrays does not return true when mixed with 
> false
> ----------------------------------------------------------------------------------
>
>                 Key: ARROW-17096
>                 URL: https://issues.apache.org/jira/browse/ARROW-17096
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>    Affects Versions: 8.0.0
>            Reporter: Matthew Roeschke
>            Assignee: Yibo Cai
>            Priority: Major
>
> {code:java}
> In [1]: import pyarrow.compute as pc
> In [2]: import pyarrow as pa
> In [3]: pa.__version__
> Out[3]: '8.0.0'
> In [4]: pc.mode(pa.array([True, True]))
> # Correct
> Out[4]:
> <pyarrow.lib.StructArray object at 0x1266d5c60>
> -- is_valid: all not null
> -- child 0 type: bool
>   [
>     true
>   ]
> -- child 1 type: int64
>   [
>     2
>   ]
> # Incorrect
> In [5]: pc.mode(pa.array([True, False]), 2)
> Out[5]:
> <pyarrow.lib.StructArray object at 0x1262110c0>
> -- is_valid: all not null
> -- child 0 type: bool
>   [
>     false, # should be true
>     false
>   ]
> -- child 1 type: int64
>   [
>     1,
>     1
>   ] {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
