Andy Grove created ARROW-10686:
----------------------------------

             Summary: [Rust] [DataFusion] Combine conjunctive filters
                 Key: ARROW-10686
                 URL: https://issues.apache.org/jira/browse/ARROW-10686
             Project: Apache Arrow
          Issue Type: Improvement
          Components: Rust - DataFusion
            Reporter: Andy Grove


When using the DataFrame API, it is natural to chain together filter operations 
like this:
{code:java}
.filter(col("l_commitdate").lt(col("l_receiptdate")))?
.filter(col("l_shipdate").lt(col("l_commitdate")))?
.filter(col("l_receiptdate").gt_eq(lit("1994-01-01")))?
.filter(col("l_receiptdate").lt(lit("1995-01-01")))?{code}
This results in the following plan:
{code:java}
    Filter: #l_receiptdate Lt Utf8("1995-01-01")
      Filter: #l_receiptdate GtEq Utf8("1994-01-01")
        Filter: #l_shipdate Lt #l_commitdate
          Filter: #l_commitdate Lt #l_receiptdate{code}
We could implement an optimizer rule that combines these into a single filter:
{code:java}
Filter: #l_receiptdate Lt Utf8("1995-01-01") AND #l_receiptdate GtEq Utf8("1994-01-01") AND #l_shipdate Lt #l_commitdate AND #l_commitdate Lt #l_receiptdate{code}
This would lead to a more concise plan and might also reduce some overhead, since each batch would pass through one filter operator instead of four.
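
A minimal sketch of how such a rule might work is shown here. It does not use DataFusion's actual types or optimizer interface: the Expr and Plan enums and the combine_filters function are hypothetical, simplified stand-ins chosen only to illustrate the rewrite. The rule walks the plan bottom-up and, wherever a filter sits directly on top of another filter, replaces the pair with one filter whose predicate is the AND of both, keeping the outermost predicate first as in the combined plan above.

```rust
// Hypothetical, simplified stand-ins for expression and logical plan
// types. These are NOT DataFusion's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Lt(Box<Expr>, Box<Expr>),
    GtEq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: Expr, input: Box<Plan> },
}

/// Collapse every run of directly nested filters into a single filter
/// whose predicate is the conjunction of the originals, outermost first.
fn combine_filters(plan: Plan) -> Plan {
    match plan {
        // Rewrite the input first, so any filters below are already merged.
        Plan::Filter { predicate, input } => match combine_filters(*input) {
            // Filter on top of filter: merge the two predicates with AND.
            Plan::Filter { predicate: inner, input: child } => Plan::Filter {
                predicate: Expr::And(Box::new(predicate), Box::new(inner)),
                input: child,
            },
            // Any other input: keep the filter, attach the rewritten input.
            other => Plan::Filter { predicate, input: Box::new(other) },
        },
        // Non-filter nodes are returned unchanged in this toy plan.
        other => other,
    }
}
```

Applied to the four chained filters above, this sketch would yield one Filter node over the scan with a right-nested AND predicate. A real rule would also have to recurse into the other plan node types, and merging in this way assumes the predicates are deterministic and free of side effects.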



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
