[
https://issues.apache.org/jira/browse/ARROW-16718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17557995#comment-17557995
]
Alessandro Molina commented on ARROW-16718:
-------------------------------------------
In general, I have noticed that the way null values behave is not easy to
wrap your head around.
For example, it might not be immediately obvious that comparing a value to
{{pyarrow.NA}} to check whether it is {{NULL}} is not going to work:
{code}
>>> import pyarrow as pa
>>> null_string = pa.scalar(None, pa.string())
>>> null_string.is_valid
False
>>> null_string == pa.NA
False
{code}
but comparing it to a null value of the same type will work:
{code}
>>> null_string == pa.scalar(None, pa.string())
True
{code}
So in this case the behavior is only partially null-safe: a string null is
different from {{pyarrow.NA}} but is equal to another string null.
By contrast, {{pyarrow.compute.equal}} behaves differently:
{code}
>>> import pyarrow.compute as pc
>>> pc.equal(null_string, pa.scalar(None, pa.string()))
<pyarrow.BooleanScalar: None>
{code}
This can be considered not null-safe, because the result is null rather than
true.
> [C++] Implement is_distinct_from and is_not_distinct_from kernels
> -----------------------------------------------------------------
>
> Key: ARROW-16718
> URL: https://issues.apache.org/jira/browse/ARROW-16718
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Ian Cook
> Priority: Minor
>
> Some SQL engines have the comparison operators {{IS DISTINCT FROM}} and
> {{IS NOT DISTINCT FROM}}. These are so-called _null-safe comparison
> operators_.
> As explained in the Impala docs:
> {quote}The IS DISTINCT FROM operator, and its converse the IS NOT DISTINCT
> FROM operator, test whether or not values are identical. IS NOT DISTINCT FROM
> is similar to the = operator, and IS DISTINCT FROM is similar to the !=
> operator, except that NULL values are treated as identical. Therefore, IS NOT
> DISTINCT FROM returns true rather than NULL, and IS DISTINCT FROM returns
> false rather than NULL, when comparing two NULL values. If one of the values
> being compared is NULL and the other is not, IS DISTINCT FROM returns true
> and IS NOT DISTINCT FROM returns false, again instead of returning NULL in
> both cases.
> {quote}
> It would be a nice convenience to have these implemented as kernels in Arrow.