Hi,

The "is_in" docstring does not make this clear, but the second argument has to be passed as a keyword argument named "value_set". A small example:

In [19]: pc.is_in(pa.array(["a", "b", "c", "d"]), value_set=pa.array(["a", "c"]))
Out[19]:
<pyarrow.lib.BooleanArray object at 0x7f508af95ac8>
[
  true,
  false,
  true,
  false
]

This keyword is listed among the keyword arguments of pc.SetLookupOptions.

Best,
Joris

On Wed, 18 Nov 2020 at 16:43, Vibhatha Abeykoon <[email protected]> wrote:

> Hello,
>
> I am working on a dataset API on top of Arrow kernels. I am looking into
> the usage of *is_in* function in the compute API.
>
> I couldn't figure out how arguments are passed for a is_in check. A simple
> scenario would be;
>
> *cylon_tb.from_list([[2,1], [1,0]]*
> *cylon_tb.isin([2])*
>
> Is this very similar to Pandas isin:
> https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isin.html
> ? If not how could we use *is_in* op?
>
> With Regards,
> Vibhatha Abeykoon
