Andrew Lamb created ARROW-11182:
-----------------------------------

Summary: [Rust] [DataFusion] Improve performance of IN list function
Key: ARROW-11182
URL: https://issues.apache.org/jira/browse/ARROW-11182
Project: Apache Arrow
Issue Type: Improvement
Reporter: Andrew Lamb

The initial implementation of IN and NOT IN followed the "functional first, and then fast" approach.

There are several potential performance improvements for the IN and NOT IN implementation in DataFusion, such as optimizing for large lists (using a hash table rather than repeated comparisons) and short-circuiting results.

There are a number of good ideas in the comments on this PR: https://github.com/apache/arrow/pull/9038/files
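
To make the large-list idea concrete, here is a minimal sketch in plain Rust of the two strategies mentioned above: a linear scan that short-circuits on the first match, and a hash-set lookup built once per list. This is not the DataFusion implementation. The function names, the `HASH_THRESHOLD` constant, and the use of `Option<i64>` slices in place of Arrow arrays are all assumptions made for illustration, and the list is assumed to contain no NULL literals.

```rust
use std::collections::HashSet;

/// Hypothetical cut-over point; a real value would come from benchmarking.
const HASH_THRESHOLD: usize = 8;

/// Baseline: compare each value against the list elements in turn.
/// `any` stops at the first match, so hits short-circuit, but a miss
/// still costs one comparison per list element.
fn in_list_linear(values: &[Option<i64>], list: &[i64], negated: bool) -> Vec<Option<bool>> {
    values
        .iter()
        .map(|v| v.map(|x| list.iter().any(|c| *c == x) != negated))
        .collect()
}

/// Large-list variant: build a hash set once, then do one lookup per value.
fn in_list_hashed(values: &[Option<i64>], list: &[i64], negated: bool) -> Vec<Option<bool>> {
    let set: HashSet<i64> = list.iter().copied().collect();
    values
        .iter()
        .map(|v| v.map(|x| set.contains(&x) != negated))
        .collect()
}

/// Pick a strategy by list length. A NULL input (None) yields NULL (None)
/// in both strategies; `negated = true` evaluates NOT IN.
fn in_list(values: &[Option<i64>], list: &[i64], negated: bool) -> Vec<Option<bool>> {
    if list.len() > HASH_THRESHOLD {
        in_list_hashed(values, list, negated)
    } else {
        in_list_linear(values, list, negated)
    }
}
```

For short lists the linear scan avoids the cost of hashing and building the set, which is why the sketch keeps both paths; where the cut-over belongs would need to be measured.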