[ 
https://issues.apache.org/jira/browse/ARROW-12695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346046#comment-17346046
 ] 

Joris Van den Bossche commented on ARROW-12695:
-----------------------------------------------

Currently pyarrow doesn't implement any {{\_\_bool\_\_}}. In that case Python 
treats an object as true by default, but if the object is "sequence-like" 
(it defines {{\_\_len\_\_}}), Python checks the length instead. This is 
described at https://docs.python.org/3/library/stdtypes.html#truth-value-testing
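
As a small illustration of that rule in plain Python (no pyarrow involved; the class names {{NoLen}} and {{WithLen}} are made up for this example):

```python
class NoLen:
    """Defines neither __bool__ nor __len__, so instances are always truthy."""


class WithLen:
    """Defines only __len__, so bool() falls back to comparing the length to 0."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


print(bool(NoLen()))     # True
print(bool(WithLen(0)))  # False
print(bool(WithLen(2)))  # True
```

If {{\_\_len\_\_}} itself raises, that exception propagates out of {{bool()}}.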

So here the underlying reason is that this fails:

{code}
>>> len(pa.scalar([1, 2], type=pa.list_(pa.int32())))
2

>>> len(pa.scalar(None, type=pa.list_(pa.int32())))
...
TypeError: object of type 'NoneType' has no len()
{code}

But the question is also, what should this return instead? Returning 0 in this 
case also doesn't feel correct, as you can also have an empty list scalar with 
a length of zero.

In general, I think it will be hard to give a nice and consistent interface for 
pyarrow scalars involving null scalars (we could provide better error messages 
though?)
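
A rough sketch of what a clearer error could look like, using a toy stand-in class rather than pyarrow itself ({{ToyListScalar}} and its message text are hypothetical, and this is only one possible option, not a proposal for the actual behaviour):

```python
class ToyListScalar:
    """Hypothetical stand-in for a list scalar; not pyarrow's implementation."""

    def __init__(self, value):
        # A Python list for a valid scalar, or None for a null scalar.
        self.value = value

    @property
    def is_valid(self):
        return self.value is not None

    def __len__(self):
        if not self.is_valid:
            # Raise an explicit message instead of leaking the NoneType error.
            raise TypeError(
                "length of a null list scalar is undefined; check is_valid first"
            )
        return len(self.value)


empty = ToyListScalar([])
null = ToyListScalar(None)

print(len(empty), bool(empty))  # an empty list scalar still has length 0
print(null.is_valid)
try:
    bool(null)  # bool() falls back to __len__, which raises here
except TypeError as exc:
    print(exc)
```

This keeps the empty list scalar (length 0) distinguishable from the null one, at the cost of {{bool()}} still raising for nulls.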

[~mosalx] what's your use case for wanting to do {{bool(null_scalar)}}, and 
what do you think it should return? (Also True, like the other scalars?)

> [Python] bool value of scalars depends on data type
> ---------------------------------------------------
>
>                 Key: ARROW-12695
>                 URL: https://issues.apache.org/jira/browse/ARROW-12695
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 4.0.0
>         Environment: Windows 10
> python 3.9.4
>            Reporter: Sergey Mozharov
>            Priority: Major
>
> `pyarrow.Scalar` and its subclasses do not implement `__bool__` method. The 
> default implementation does not seem to do the right thing. For example:
> {code:java}
> >>> import pyarrow as pa
> >>> na_value = pa.scalar(None, type=pa.int32())
> >>> bool(na_value)
> True
> >>> na_value = pa.scalar(None, type=pa.struct([('a', pa.int32())]))
> >>> bool(na_value)
> False
> >>> bool(pa.scalar(None, type=pa.list_(pa.int32())))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow\scalar.pxi", line 572, in pyarrow.lib.ListScalar.__len__
> TypeError: object of type 'NoneType' has no len()
> >>>
> {code}
> Please consider implementing the `__bool__` method. It seems reasonable to 
> delegate to the `__bool__` method of the wrapped object.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
