xudong963 opened a new pull request, #2858:
URL: https://github.com/apache/arrow-datafusion/pull/2858

   Closes #217 
   
   Query plan for tpch 19, focus on the **filter plan**:
   ```shell
   === Logical plan ===
   Projection: #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) 
AS revenue
     Aggregate: groupBy=[[]], aggr=[[SUM(#lineitem.l_extendedprice * Int64(1) - 
#lineitem.l_discount)]]
       Filter: #part.p_partkey = #lineitem.l_partkey AND #part.p_brand = 
Utf8("Brand#12") AND #part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"), 
Utf8("SM PACK"), Utf8("SM PKG")]) AND #lineitem.l_quantity >= Int64(1) AND 
#lineitem.l_quantity <= Int64(1) + Int64(10) AND #part.p_size BETWEEN Int64(1) 
AND Int64(5) AND #lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND 
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") OR #part.p_partkey = 
#lineitem.l_partkey AND #part.p_brand = Utf8("Brand#23") AND #part.p_container 
IN ([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND 
#lineitem.l_quantity >= Int64(10) AND #lineitem.l_quantity <= Int64(10) + 
Int64(10) AND #part.p_size BETWEEN Int64(1) AND Int64(10) AND 
#lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND 
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") OR #part.p_partkey = 
#lineitem.l_partkey AND #part.p_brand = Utf8("Brand#34") AND #part.p_container 
IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), Utf8("LG PKG")]) AND 
#lineitem.l_quantity >= Int64(20) 
AND #lineitem.l_quantity <= Int64(20) + Int64(10) AND #part.p_size BETWEEN 
Int64(1) AND Int64(15) AND #lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR 
REG")]) AND #lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")
         CrossJoin:
           TableScan: lineitem
           TableScan: part
   
   === Optimized logical plan ===
   Projection: #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount) 
AS revenue
     Aggregate: groupBy=[[]], aggr=[[SUM(#lineitem.l_extendedprice * Int64(1) - 
#lineitem.l_discount)]]
       Projection: #part.p_partkey = #lineitem.l_partkey AS 
BinaryExpr-=Column-lineitem.l_partkeyColumn-part.p_partkey, 
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") AS 
BinaryExpr-=LiteralDELIVER IN PERSONColumn-lineitem.l_shipinstruct, 
#lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AS 
InList-falseLiteralAIR REGLiteralAIRColumn-lineitem.l_shipmode, 
#lineitem.l_quantity, #lineitem.l_extendedprice, #lineitem.l_discount, 
#part.p_brand, #part.p_size, #part.p_container
         Filter: #part.p_partkey = #lineitem.l_partkey AND #part.p_brand = 
Utf8("Brand#12") AND #part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"), 
Utf8("SM PACK"), Utf8("SM PKG")]) AND #lineitem.l_quantity >= Int64(1) AND 
#lineitem.l_quantity <= Int64(11) AND #part.p_size BETWEEN Int64(1) AND 
Int64(5) OR #part.p_brand = Utf8("Brand#23") AND #part.p_container IN 
([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND 
#lineitem.l_quantity >= Int64(10) AND #lineitem.l_quantity <= Int64(20) AND 
#part.p_size BETWEEN Int64(1) AND Int64(10) OR #part.p_brand = Utf8("Brand#34") 
AND #part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), 
Utf8("LG PKG")]) AND #lineitem.l_quantity >= Int64(20) AND #lineitem.l_quantity 
<= Int64(30) AND #part.p_size BETWEEN Int64(1) AND Int64(15)
           CrossJoin:
             Filter: #lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) 
AND #lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")
               TableScan: lineitem projection=[l_partkey, l_quantity, 
l_extendedprice, l_discount, l_shipinstruct, l_shipmode], 
partial_filters=[#lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]), 
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")]
             TableScan: part projection=[p_partkey, p_brand, p_size, 
p_container]
   ```
   
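   `rewrite_disjunctive_predicate` now extracts `#part.p_partkey = 
#lineitem.l_partkey` from the three OR branches, but the plan still contains a 
`CrossJoin`. We need to migrate the `cross join -> inner join optimization` 
from the planner to the optimizer so that tpch 19 can be further optimized to 
an inner join using the extracted predicate.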
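
For readers unfamiliar with the rewrite, here is a minimal, self-contained sketch of the idea: `(A AND B) OR (A AND C)` becomes `A AND (B OR C)`. It does not use DataFusion's `Expr` type or the actual rule from this PR; `Conjunction`, `factor_common`, and `conj` are hypothetical names, and predicates are modeled as plain strings purely for illustration.

```rust
use std::collections::BTreeSet;

/// Toy model (an assumption, not DataFusion's `Expr`): one AND-chain is a
/// set of atomic predicates written as strings.
type Conjunction = BTreeSet<String>;

/// Given the OR-ed branches of a filter, return `(common, residuals)` so that
/// `OR(branches)` is equivalent to `AND(common) AND OR(AND(residual_i))`.
/// An empty residual stands for `true`, which makes the whole OR trivially true.
fn factor_common(branches: &[Conjunction]) -> (Conjunction, Vec<Conjunction>) {
    let mut iter = branches.iter();
    let mut common = match iter.next() {
        Some(first) => first.clone(),
        None => return (Conjunction::new(), Vec::new()),
    };
    // Keep only the atoms that appear in every branch.
    for branch in iter {
        common = common.intersection(branch).cloned().collect();
    }
    // What is left of each branch once the shared atoms are pulled out.
    let residuals = branches
        .iter()
        .map(|branch| branch.difference(&common).cloned().collect())
        .collect();
    (common, residuals)
}

/// Hypothetical helper that builds a conjunction from string atoms.
fn conj(atoms: &[&str]) -> Conjunction {
    atoms.iter().map(|s| s.to_string()).collect()
}

fn main() {
    // Simplified versions of the three OR branches in the tpch 19 filter.
    let branches = vec![
        conj(&[
            "p_partkey = l_partkey",
            "p_brand = 'Brand#12'",
            "l_shipinstruct = 'DELIVER IN PERSON'",
        ]),
        conj(&[
            "p_partkey = l_partkey",
            "p_brand = 'Brand#23'",
            "l_shipinstruct = 'DELIVER IN PERSON'",
        ]),
        conj(&[
            "p_partkey = l_partkey",
            "p_brand = 'Brand#34'",
            "l_shipinstruct = 'DELIVER IN PERSON'",
        ]),
    ];

    let (common, residuals) = factor_common(&branches);
    println!("common:    {:?}", common);
    println!("residuals: {:?}", residuals);
}
```

In this toy model, the join condition `p_partkey = l_partkey` ends up in the common part, so it holds for every row that passes the filter. That is the property a cross join to inner join rule in the optimizer would rely on.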

