[ https://issues.apache.org/jira/browse/ARROW-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452435#comment-17452435 ]
Joris Van den Bossche commented on ARROW-14946:
-----------------------------------------------

This is also related to numpy's {{nonzero}} in combination with an equality comparison:

{code}
In [66]: values = np.array([1, 2, 2, 3, 4, 1])

In [67]: np.nonzero(values == 1)
Out[67]: (array([0, 5]),)
{code}

which is also being discussed in ARROW-13035. For this case, though, going through an intermediate boolean array just to find the indices might add overhead (this might be worth experimenting with).

---

> This would be a binary vector kernel IMO.

For a scalar right-hand value (as in your example above), the expected behaviour is clear. But would it be limited to scalars? (The expected behaviour for non-scalars is not really obvious to me.)

> [C++][Python] An operator for finding indices of a value
> ---------------------------------------------------------
>
>                 Key: ARROW-14946
>                 URL: https://issues.apache.org/jira/browse/ARROW-14946
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++, Python
>            Reporter: Niranda Perera
>            Priority: Major
>
> As discussed in this mail thread [1], it would be nice to have a search
> operator returning the indices of a Value.
> ex:
> {code:java}
> values = pa.array([1, 2, 2, 3, 4, 1])
> indices = find_indices(values, 1) # expected = [0, 5]{code}
> currently there is an option to get the "first index" of a value using
> aggregates.index method. This would be a binary vector kernel IMO.
> This is somewhat similar to `numpy.where` [2] but without a `y` input.
>
> [1] [https://lists.apache.org/thread/o8d4m905fxswcg0qjjx7gj3ql2d582k4]
> [2] https://numpy.org/doc/stable/reference/generated/numpy.where.html

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
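
To make the overhead question from the comment concrete, here is a minimal pure-Python sketch of the two strategies for the scalar case. The function names {{find_indices_via_mask}} and {{find_indices_direct}} are hypothetical and only for illustration; neither is an Arrow or numpy API, and the sketch says nothing about how a real kernel would be implemented or how it would perform.

```python
from typing import List, Sequence


def find_indices_via_mask(values: Sequence[int], target: int) -> List[int]:
    """Two passes: build a boolean mask, then collect the True positions.

    Mirrors the shape of `np.nonzero(values == target)`; the mask is an
    intermediate allocation as long as the input.
    """
    mask = [v == target for v in values]  # analogue of `values == target`
    return [i for i, hit in enumerate(mask) if hit]  # analogue of `nonzero`


def find_indices_direct(values: Sequence[int], target: int) -> List[int]:
    """Single pass: compare and collect indices without an intermediate mask."""
    return [i for i, v in enumerate(values) if v == target]


values = [1, 2, 2, 3, 4, 1]
print(find_indices_via_mask(values, 1))  # [0, 5]
print(find_indices_direct(values, 1))    # [0, 5]
```

Both variants return the same indices for a scalar target; they differ only in whether the intermediate boolean array is materialised. Whether that difference matters in practice would need measuring.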