kosiew commented on code in PR #17518:
URL: https://github.com/apache/datafusion/pull/17518#discussion_r2359423705
##########
datafusion/common/src/join_type.rs:
##########
@@ -74,6 +74,12 @@ pub enum JoinType {
RightMark,
}
+const LEFT_PRESERVING: &[JoinType] =
+ &[JoinType::Left, JoinType::Full, JoinType::LeftMark];
Review Comment:
Thanks for double-checking! In this file, the LEFT_PRESERVING array is meant
to capture only the join variants that are guaranteed to emit every left row at
least once, i.e. Left, Full, and LeftMark. Semi/anti joins intentionally aren’t
listed because they drop rows from their respective inputs, so they don’t
satisfy that preservation property.
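As a rough illustration of that property (a standalone sketch, not the code in this PR), the membership check could look like the following. The `JoinType` enum here is a local stand-in so the snippet compiles on its own, and `preserves_left` is a hypothetical helper name:

```rust
// Local stand-in for DataFusion's `JoinType`, so this sketch compiles on its own.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JoinType {
    Inner,
    Left,
    Right,
    Full,
    LeftSemi,
    RightSemi,
    LeftAnti,
    RightAnti,
    LeftMark,
    RightMark,
}

// Same contents as the constant added in the diff above.
const LEFT_PRESERVING: &[JoinType] =
    &[JoinType::Left, JoinType::Full, JoinType::LeftMark];

// Hypothetical helper (the name is an assumption, not taken from the PR):
// true only for join types that emit every left row at least once.
pub fn preserves_left(join_type: JoinType) -> bool {
    LEFT_PRESERVING.contains(&join_type)
}
```

With this sketch, `preserves_left(JoinType::LeftMark)` is true, while `preserves_left(JoinType::LeftSemi)` and `preserves_left(JoinType::LeftAnti)` are false.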
That distinction matters a few lines later when we decide which child can
safely receive a dynamic filter. If we marked LeftSemi/LeftAnti as
left-preserving, dynamic_filter_pushdown_side would classify them the same way
as a left outer join and start attaching the dynamic filter to the right
input, which is exactly the misbehaviour you’re warning about for predicates
that reference b.y. Because they remain non-preserving, those join types fall
through to the (false, false) arm and we keep the dynamic filter on the left
side only, which means predicates like a.x < 5 are still pushed while
b.y < 10 is not.
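To make the arms concrete, here is a minimal sketch of the decision as described in this comment. It is not the actual body of dynamic_filter_pushdown_side: the `PushdownSide` enum and the `pushdown_side` function are hypothetical names, and the `(true, true)` and `(false, true)` results are assumptions filled in for completeness.

```rust
// Hypothetical result type: which join input receives the dynamic filter.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PushdownSide {
    Left,
    Right,
    Neither,
}

// Hypothetical stand-in for the decision described above. The two flags say
// whether the join type is left-preserving and/or right-preserving.
pub fn pushdown_side(preserves_left: bool, preserves_right: bool) -> PushdownSide {
    match (preserves_left, preserves_right) {
        // Assumption: both sides preserved (e.g. a full join), so neither
        // input can be filtered safely.
        (true, true) => PushdownSide::Neither,
        // From the comment: a left outer join attaches the filter to the
        // right input.
        (true, false) => PushdownSide::Right,
        // Assumption: mirror image of the left outer case.
        (false, true) => PushdownSide::Left,
        // From the comment: non-preserving types such as LeftSemi/LeftAnti
        // land here and keep the filter on the left side only.
        (false, false) => PushdownSide::Left,
    }
}
```

Under these assumptions, a LeftSemi join evaluates `pushdown_side(false, false)` and gets `PushdownSide::Left`. Adding it to LEFT_PRESERVING would flip the first flag and move it to the `(true, false)` arm, i.e. the right input.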
I’ll add a short comment in the code/tests to spell this out so it’s harder
to miss in the future.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]