[ https://issues.apache.org/jira/browse/ARROW-10663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou reassigned ARROW-10663:
--------------------------------------

    Assignee: Antoine Pitrou

> [C++/Doc] The IsIn kernel ignores the skip_nulls option of SetLookupOptions
> ---------------------------------------------------------------------------
>
>                 Key: ARROW-10663
>                 URL: https://issues.apache.org/jira/browse/ARROW-10663
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Assignee: Antoine Pitrou
>            Priority: Major
>             Fix For: 3.0.0
>
>
> The C++ docs of {{SetLookupOptions}} have this explanation of the
> {{skip_nulls}} option:
> {code}
>   /// Whether nulls in `value_set` count for lookup.
>   ///
>   /// If true, any null in `value_set` is ignored and nulls in the input
>   /// produce null (IndexIn) or false (IsIn) values in the output.
>   /// If false, any null in `value_set` is successfully matched in
>   /// the input.
>   bool skip_nulls;
> {code}
> (from
> https://github.com/apache/arrow/blob/8b9f6b9d28b4524724e60fac589fb1a3552a32b4/cpp/src/arrow/compute/api_scalar.h#L78-L84)
> However, for {{IsIn}} this explanation doesn't seem to hold in practice:
> {code}
> In [16]: arr = pa.array([1, 2, None])
> In [17]: pc.is_in(arr, value_set=pa.array([1, None]), skip_null=True)
> Out[17]:
> <pyarrow.lib.BooleanArray object at 0x7fcf666f9408>
> [
>   true,
>   false,
>   true
> ]
> In [18]: pc.is_in(arr, value_set=pa.array([1, None]), skip_null=False)
> Out[18]:
> <pyarrow.lib.BooleanArray object at 0x7fcf666b13a8>
> [
>   true,
>   false,
>   true
> ]
> {code}
> This documentation was added in https://github.com/apache/arrow/pull/7695
> (ARROW-8989).
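For comparison with the transcripts, here is a minimal pure-Python sketch of what
the quoted doc comment appears to describe. It is not Arrow's implementation and
does not call pyarrow; the function names is_in_documented and
index_in_documented are hypothetical, Python's None stands in for an Arrow null,
and the handling of a null input when value_set contains no null is an
assumption, since the doc comment does not spell it out.

```python
from typing import Hashable, List, Optional, Sequence

Value = Optional[Hashable]  # None models an Arrow null


def is_in_documented(values: Sequence[Value],
                     value_set: Sequence[Value],
                     skip_nulls: bool = False) -> List[bool]:
    """Model of IsIn as the doc comment reads (hypothetical helper)."""
    non_null = {v for v in value_set if v is not None}
    # A null input matches only if value_set holds a null AND nulls are not skipped.
    null_matches = (not skip_nulls) and any(v is None for v in value_set)
    return [null_matches if v is None else v in non_null for v in values]


def index_in_documented(values: Sequence[Value],
                        value_set: Sequence[Value],
                        skip_nulls: bool = False) -> List[Optional[int]]:
    """Model of IndexIn as the doc comment reads (hypothetical helper)."""
    first_index = {}
    for i, v in enumerate(value_set):
        if v is None and skip_nulls:
            continue  # nulls in value_set are ignored
        first_index.setdefault(v, i)  # keep the first position of each value
    # Values that are absent (or skipped nulls) map to None, i.e. a null output.
    return [first_index.get(v) for v in values]


arr = [1, 2, None]
value_set = [1, None]
print(is_in_documented(arr, value_set, skip_nulls=True))
print(is_in_documented(arr, value_set, skip_nulls=False))
print(index_in_documented(arr, value_set, skip_nulls=True))
print(index_in_documented(arr, value_set, skip_nulls=False))
```

Under this reading, the skip_nulls=True call to the IsIn model gives
[True, False, False], whereas Out[17] shows true, false, true. The
skip_nulls=False result, [True, False, True], agrees with Out[18].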
> BTW, for "index_in", it works as documented:
> {code}
> In [19]: pc.index_in(arr, value_set=pa.array([1, None]), skip_null=True)
> Out[19]:
> <pyarrow.lib.Int32Array object at 0x7fcf666f04c8>
> [
>   0,
>   null,
>   null
> ]
> In [20]: pc.index_in(arr, value_set=pa.array([1, None]), skip_null=False)
> Out[20]:
> <pyarrow.lib.Int32Array object at 0x7fcf666f0ee8>
> [
>   0,
>   null,
>   1
> ]
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)