UBarney commented on code in PR #16210:
URL: https://github.com/apache/datafusion/pull/16210#discussion_r2144133973
##########
datafusion/physical-plan/src/joins/nested_loop_join.rs:
##########
@@ -178,6 +187,18 @@ pub struct NestedLoopJoinExec {
     metrics: ExecutionPlanMetricsSet,
     /// Cache holding plan properties like equivalences, output partitioning etc.
     cache: PlanProperties,
+    /// Null matching behavior: If `null_equals_null` is true, rows that have
+    /// `null`s in both left and right equijoin columns will be matched.
+    /// Otherwise, rows that have `null`s in the join columns will not be
+    /// matched and thus will not appear in the output.
+    null_equals_null: bool,
+    /// Set of equijoin columns from the relations: `(left_col, right_col)`
+    ///
+    /// This is optional as a nested loop join can be passed a 'on' clause
+    /// in the case that a Hash Join cost is more expensive than a
+    /// nested loop join or when a user would like to pick nested loop
+    /// join by hint
+    on: Option<Vec<(PhysicalExprRef, PhysicalExprRef)>>,

Review Comment:
Looks like the current NLJ already supports `null_equals_null` 😂

```
> explain SELECT *
  FROM (VALUES (1, 'Apple'), (2, 'Banana'), (NULL, 'Cherry')) AS i(id, name)
  JOIN (VALUES (1, 'Fruit'), (3, 'Vegetable'), (NULL, 'Unknown')) AS c(id, category)
  ON i.id <=> c.id;
+---------------+------------------------------------------------------------+
| plan_type     | plan                                                       |
+---------------+------------------------------------------------------------+
| physical_plan | ┌───────────────────────────┐                              |
|               | │     NestedLoopJoinExec    ├──────────────┐               |
|               | └─────────────┬─────────────┘              │               |
|               | ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ |
|               | │       ProjectionExec      ││       ProjectionExec      │ |
|               | │    --------------------   ││    --------------------   │ |
|               | │        id: column1        ││     category: column2     │ |
|               | │       name: column2       ││        id: column1        │ |
|               | └─────────────┬─────────────┘└─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐┌─────────────┴─────────────┐ |
|               | │       DataSourceExec      ││       DataSourceExec      │ |
|               | │    --------------------   ││    --------------------   │ |
|               | │        bytes: 1400        ││        bytes: 1400        │ |
|               | │       format: memory      ││       format: memory      │ |
|               | │          rows: 1          ││          rows: 1          │ |
|               | └───────────────────────────┘└───────────────────────────┘ |
|               |                                                            |
+---------------+------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.001 seconds.

> SELECT *
  FROM (VALUES (1, 'Apple'), (2, 'Banana'), (NULL, 'Cherry')) AS i(id, name)
  JOIN (VALUES (1, 'Fruit'), (3, 'Vegetable'), (NULL, 'Unknown')) AS c(id, category)
  ON i.id <=> c.id;
+------+--------+------+----------+
| id   | name   | id   | category |
+------+--------+------+----------+
| 1    | Apple  | 1    | Fruit    |
| NULL | Cherry | NULL | Unknown  |
+------+--------+------+----------+
2 row(s) fetched.
Elapsed 0.001 seconds.
```
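
For anyone skimming the thread, here is a rough standalone sketch of why the `<=>` predicate already gives null-equals-null matching when it is evaluated as a plain join filter. It does not use DataFusion at all: `Key`, `eq`, `null_safe_eq`, and `nested_loop_join` are hypothetical names made up for illustration, and `Option<i32>` stands in for a nullable `id` column.

```rust
// Hypothetical stand-in for a nullable join key; not a DataFusion type.
type Key = Option<i32>;

/// Models SQL `=` used as a join predicate: a NULL on either side never matches.
fn eq(l: Key, r: Key) -> bool {
    matches!((l, r), (Some(a), Some(b)) if a == b)
}

/// Models SQL `<=>` (IS NOT DISTINCT FROM): NULL matches NULL.
fn null_safe_eq(l: Key, r: Key) -> bool {
    l == r
}

/// Naive nested loop join: evaluate `pred` on every (left, right) pair.
/// This is only an illustration, not how NestedLoopJoinExec is implemented.
fn nested_loop_join<'a>(
    left: &[(Key, &'a str)],
    right: &[(Key, &'a str)],
    pred: fn(Key, Key) -> bool,
) -> Vec<(Key, &'a str, Key, &'a str)> {
    let mut out = Vec::new();
    for &(lk, lv) in left {
        for &(rk, rv) in right {
            if pred(lk, rk) {
                out.push((lk, lv, rk, rv));
            }
        }
    }
    out
}

fn main() {
    // Same rows as the VALUES lists in the query above.
    let i = [(Some(1), "Apple"), (Some(2), "Banana"), (None, "Cherry")];
    let c = [(Some(1), "Fruit"), (Some(3), "Vegetable"), (None, "Unknown")];

    // `=`: only the (1, Apple, 1, Fruit) pair survives.
    println!("{:?}", nested_loop_join(&i, &c, eq));
    // `<=>`: the (NULL, Cherry, NULL, Unknown) pair is kept as well.
    println!("{:?}", nested_loop_join(&i, &c, null_safe_eq));
}
```

Since the predicate itself decides whether two NULLs match, a join that just evaluates the filter per pair needs no separate flag to get the two-row result shown in the query output.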