xudong963 opened a new pull request, #2858:
URL: https://github.com/apache/arrow-datafusion/pull/2858
Closes #217
Query plan for TPC-H query 19, focusing on the **filter plan**:
```shell
=== Logical plan ===
Projection: #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)
AS revenue
Aggregate: groupBy=[[]], aggr=[[SUM(#lineitem.l_extendedprice * Int64(1) -
#lineitem.l_discount)]]
Filter: #part.p_partkey = #lineitem.l_partkey AND #part.p_brand =
Utf8("Brand#12") AND #part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"),
Utf8("SM PACK"), Utf8("SM PKG")]) AND #lineitem.l_quantity >= Int64(1) AND
#lineitem.l_quantity <= Int64(1) + Int64(10) AND #part.p_size BETWEEN Int64(1)
AND Int64(5) AND #lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") OR #part.p_partkey =
#lineitem.l_partkey AND #part.p_brand = Utf8("Brand#23") AND #part.p_container
IN ([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND
#lineitem.l_quantity >= Int64(10) AND #lineitem.l_quantity <= Int64(10) +
Int64(10) AND #part.p_size BETWEEN Int64(1) AND Int64(10) AND
#lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AND
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") OR #part.p_partkey =
#lineitem.l_partkey AND #part.p_brand = Utf8("Brand#34") AND #part.p_container
IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"), Utf8("LG PKG")]) AND
#lineitem.l_quantity >= Int64(20)
AND #lineitem.l_quantity <= Int64(20) + Int64(10) AND #part.p_size BETWEEN
Int64(1) AND Int64(15) AND #lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR
REG")]) AND #lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")
CrossJoin:
TableScan: lineitem
TableScan: part
=== Optimized logical plan ===
Projection: #SUM(lineitem.l_extendedprice * Int64(1) - lineitem.l_discount)
AS revenue
Aggregate: groupBy=[[]], aggr=[[SUM(#lineitem.l_extendedprice * Int64(1) -
#lineitem.l_discount)]]
Projection: #part.p_partkey = #lineitem.l_partkey AS
BinaryExpr-=Column-lineitem.l_partkeyColumn-part.p_partkey,
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON") AS
BinaryExpr-=LiteralDELIVER IN PERSONColumn-lineitem.l_shipinstruct,
#lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]) AS
InList-falseLiteralAIR REGLiteralAIRColumn-lineitem.l_shipmode,
#lineitem.l_quantity, #lineitem.l_extendedprice, #lineitem.l_discount,
#part.p_brand, #part.p_size, #part.p_container
Filter: #part.p_partkey = #lineitem.l_partkey AND #part.p_brand =
Utf8("Brand#12") AND #part.p_container IN ([Utf8("SM CASE"), Utf8("SM BOX"),
Utf8("SM PACK"), Utf8("SM PKG")]) AND #lineitem.l_quantity >= Int64(1) AND
#lineitem.l_quantity <= Int64(11) AND #part.p_size BETWEEN Int64(1) AND
Int64(5) OR #part.p_brand = Utf8("Brand#23") AND #part.p_container IN
([Utf8("MED BAG"), Utf8("MED BOX"), Utf8("MED PKG"), Utf8("MED PACK")]) AND
#lineitem.l_quantity >= Int64(10) AND #lineitem.l_quantity <= Int64(20) AND
#part.p_size BETWEEN Int64(1) AND Int64(10) OR #part.p_brand = Utf8("Brand#34")
AND #part.p_container IN ([Utf8("LG CASE"), Utf8("LG BOX"), Utf8("LG PACK"),
Utf8("LG PKG")]) AND #lineitem.l_quantity >= Int64(20) AND #lineitem.l_quantity
<= Int64(30) AND #part.p_size BETWEEN Int64(1) AND Int64(15)
CrossJoin:
Filter: #lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")])
AND #lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")
TableScan: lineitem projection=[l_partkey, l_quantity,
l_extendedprice, l_discount, l_shipinstruct, l_shipmode],
partial_filters=[#lineitem.l_shipmode IN ([Utf8("AIR"), Utf8("AIR REG")]),
#lineitem.l_shipinstruct = Utf8("DELIVER IN PERSON")]
TableScan: part projection=[p_partkey, p_brand, p_size,
p_container]
```
We need to migrate the `cross join -> inner join` optimization from the
planner to the optimizer so that TPC-H query 19 can be further optimized into
an inner join, using the `#part.p_partkey = #lineitem.l_partkey` predicate
extracted by `rewrite_disjunctive_predicate`.
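
For readers unfamiliar with the rewrite, here is a minimal sketch of the idea: conjuncts shared by every `OR` branch (here the join key and the two `lineitem` filters) are pulled out in front, i.e. `(A AND B) OR (A AND C)` becomes `A AND (B OR C)`. The `Pred` type and the `conjuncts`, `and_of`, and `factor_common` helpers are hypothetical names chosen for illustration; this is not DataFusion's `Expr` type and not the actual code of `rewrite_disjunctive_predicate`.

```rust
/// Toy predicate tree. A stand-in for a real expression type (assumption),
/// with leaf predicates represented as plain strings.
#[derive(Clone, Debug, PartialEq)]
enum Pred {
    Atom(String),
    And(Vec<Pred>),
    Or(Vec<Pred>),
}

/// Flatten a predicate into its list of top-level conjuncts.
fn conjuncts(p: &Pred) -> Vec<Pred> {
    match p {
        Pred::And(ps) => ps.iter().flat_map(conjuncts).collect(),
        other => vec![other.clone()],
    }
}

/// Rebuild a conjunction without wrapping a single predicate in `And`.
fn and_of(mut ps: Vec<Pred>) -> Pred {
    if ps.len() == 1 {
        ps.remove(0)
    } else {
        Pred::And(ps)
    }
}

/// Pull conjuncts that occur in every `OR` branch out in front of the `OR`.
fn factor_common(p: &Pred) -> Pred {
    let branches = match p {
        Pred::Or(bs) if !bs.is_empty() => bs,
        other => return other.clone(),
    };
    let lists: Vec<Vec<Pred>> = branches.iter().map(conjuncts).collect();

    // Conjuncts of the first branch that also appear in all other branches.
    let common: Vec<Pred> = lists[0]
        .iter()
        .filter(|c| lists[1..].iter().all(|l| l.contains(c)))
        .cloned()
        .collect();
    if common.is_empty() {
        return p.clone();
    }

    // Remove the common conjuncts from each branch.
    let mut rest = Vec::new();
    for l in &lists {
        let remaining: Vec<Pred> = l
            .iter()
            .filter(|c| !common.contains(c))
            .cloned()
            .collect();
        if remaining.is_empty() {
            // A branch equal to the common part absorbs the others:
            // A OR (A AND B) is just A.
            return and_of(common);
        }
        rest.push(and_of(remaining));
    }

    let mut out = common;
    out.push(Pred::Or(rest));
    Pred::And(out)
}
```

Once the equality on the join key is a top-level conjunct rather than buried inside each `OR` branch, an optimizer rule can recognize it as an equi-join condition, which is what the cross join to inner join conversion depends on.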