Re: compute::is_in rejects duplicates in value_set

2021-04-26 Thread Niranda Perera
Sure. PFA the JIRA https://issues.apache.org/jira/browse/ARROW-12554
On Mon, Apr 26, 2021 at 4:31 PM Wes McKinney wrote:
> In principle I don't see an issue with having duplicates in the value set,
> could you open a Jira issue?
>
> On Mon, Apr 26, 2021 at 3:27 PM Niranda Perera wrote:
> >
> >

Re: compute::is_in rejects duplicates in value_set

2021-04-26 Thread Wes McKinney
In principle I don't see an issue with having duplicates in the value set, could you open a Jira issue?
On Mon, Apr 26, 2021 at 3:27 PM Niranda Perera wrote:
> Hi all,
>
> In the arrow release-4.0.0 branch, the compute::is_in operation rejects
> duplicate values in the value_set [1]. This was no

compute::is_in rejects duplicates in value_set

2021-04-26 Thread Niranda Perera
Hi all,
In the arrow release-4.0.0 branch, the compute::is_in operation rejects duplicate values in the value_set [1]. This was not the case in arrow 2.0 >=. I was wondering if this strict restriction is required? Because ultimately, a hash set would be created from the value_set values, and ther
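To illustrate the reasoning in the message above, here is a minimal sketch in plain Python, not Arrow's C++ kernel. It assumes the lookup structure is a hash set, as the message describes, so duplicates in value_set collapse on insertion and cannot change the membership result. The name is_in_sketch is hypothetical and is not part of any Arrow API.

```python
def is_in_sketch(values, value_set):
    """Return a boolean mask: True where an element of values is in value_set.

    Hypothetical illustration only; this is not Arrow's implementation.
    """
    # Building a hash set drops duplicate entries from value_set.
    lookup = set(value_set)
    return [v in lookup for v in values]


values = [1, 2, 3, 4]

# The same lookup with and without a duplicated entry in value_set.
mask_with_duplicates = is_in_sketch(values, [2, 2, 4])
mask_without_duplicates = is_in_sketch(values, [2, 4])

# Both masks are identical, since the duplicate 2 collapses in the set.
print(mask_with_duplicates == mask_without_duplicates)
```

Under that assumption, rejecting duplicates up front is a validation choice and is not needed for the lookup to be correct.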