ozankabak commented on issue #10881:
URL: https://github.com/apache/datafusion/issues/10881#issuecomment-2169129789
If we look at the following plan:
<pre>
> EXPLAIN SELECT e.emp_id, e.name, d.department
FROM employees e
LEFT JOIN department d
ON (e.name = 'Alice' OR e.name = 'Bob');
+---------------+----------------------------------------------------------------------------+
| plan_type | plan
|
+---------------+----------------------------------------------------------------------------+
| logical_plan | Left Join: Filter: e.name = Utf8("Alice") OR e.name =
Utf8("Bob") |
| | SubqueryAlias: e
|
<i>| | Filter: employees.name = Utf8("Alice") OR
employees.name = Utf8("Bob") |</i>
| | TableScan: employees projection=[emp_id, name]
|
| | SubqueryAlias: d
|
| | TableScan: department projection=[department]
|
| physical_plan | NestedLoopJoinExec: join_type=Left, filter=name@0 = Alice
OR name@0 = Bob |
| | CoalesceBatchesExec: target_batch_size=8192
|
<i>| | FilterExec: name@1 = Alice OR name@1 = Bob
|</i>
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| |
|
+---------------+----------------------------------------------------------------------------+
</pre>
The filters in italics seem wrong. They indeed disappear in the example I
gave above:
<pre>
> EXPLAIN SELECT e.emp_id, e.name, d.department
FROM employees e
LEFT JOIN department d
ON (e.name = 'Alice' OR e.name = 'Bob' OR d.department = 'FOOBAR');
+---------------+-------------------------------------------------------------------------------------------------------+
| plan_type | plan
|
+---------------+-------------------------------------------------------------------------------------------------------+
| logical_plan | Left Join: Filter: e.name = Utf8("Alice") OR e.name =
Utf8("Bob") OR d.department = Utf8("FOOBAR") |
| | SubqueryAlias: e
|
| | TableScan: employees projection=[emp_id, name]
|
| | SubqueryAlias: d
|
| | TableScan: department projection=[department]
|
| physical_plan | ProjectionExec: expr=[emp_id@1 as emp_id, name@2 as name,
department@0 as department] |
| | NestedLoopJoinExec: join_type=Right, filter=name@0 =
Alice OR name@0 = Bob OR department@1 = FOOBAR |
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| |
|
+---------------+-------------------------------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.016 seconds.
</pre>
These filters are also not present if have an AND clause instead:
<pre>
> EXPLAIN SELECT e.emp_id, e.name, d.department
FROM employees e
LEFT JOIN department d
ON (e.name = 'Alice' AND e.name = 'Bob');
+---------------+---------------------------------------------------------------------------------------+
| plan_type | plan
|
+---------------+---------------------------------------------------------------------------------------+
| logical_plan | Left Join: Filter: e.name = Utf8("Alice") AND e.name =
Utf8("Bob") |
| | SubqueryAlias: e
|
| | TableScan: employees projection=[emp_id, name]
|
| | SubqueryAlias: d
|
| | TableScan: department projection=[department]
|
| physical_plan | ProjectionExec: expr=[emp_id@1 as emp_id, name@2 as name,
department@0 as department] |
| | NestedLoopJoinExec: join_type=Right, filter=name@0 =
Alice AND name@0 = Bob |
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| | MemoryExec: partitions=1, partition_sizes=[1]
|
| |
|
+---------------+---------------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.013 seconds.
</pre>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]