Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-19 Thread via GitHub


nuno-faria commented on PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#issuecomment-2736367170

   Thank you @chenkovsky and @goldmedal for fixing the issue.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-19 Thread via GitHub


goldmedal merged PR #15212:
URL: https://github.com/apache/datafusion/pull/15212


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-17 Thread via GitHub


goldmedal commented on code in PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#discussion_r1998987034


##
.github/workflows/rust.yml:
##
@@ -246,6 +246,7 @@ jobs:
 mc cp -r /source/* localminio/data"
   - name: Run tests (excluding doctests)
 env:
+  RUST_MIN_STACK: 4194304

Review Comment:
   This test has been modified in the `main` branch. It seems that it never 
happens stack overflow anymore.



##
datafusion/sql/src/unparser/expr.rs:
##
@@ -94,6 +94,7 @@ impl Unparser<'_> {
 Ok(root_expr)
 }
 
+#[cfg_attr(feature = "recursive_protection", recursive::recursive)]

Review Comment:
   This annotation should be enabled by using the  `recursive_protection` 
feature. I found the CI doesn't enable it. I experimented with it 
https://github.com/apache/datafusion/pull/15272 (It seems to work well)
   You can see how I modified the workflow.



##
.github/workflows/extended.yml:
##
@@ -81,6 +81,7 @@ jobs:
   - name: Run tests (excluding doctests)
 env:
   RUST_BACKTRACE: 1
+  RUST_MIN_STACK: 4194304

Review Comment:
   After enabling `recursive_protection`, it can be removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-17 Thread via GitHub


chenkovsky commented on code in PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#discussion_r1999031362


##
datafusion/sql/src/unparser/expr.rs:
##
@@ -94,6 +94,7 @@ impl Unparser<'_> {
 Ok(root_expr)
 }
 
+#[cfg_attr(feature = "recursive_protection", recursive::recursive)]

Review Comment:
   interesting. actually, I tested in my env, without this annotation, stack 
will be overflowed. but with this annotation, it works. so I haven't checked 
whether it's enabled in CI. you saved my day.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-14 Thread via GitHub


goldmedal commented on code in PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#discussion_r1995622142


##
datafusion/sql/tests/cases/plan_to_sql.rs:
##
@@ -1746,3 +1749,153 @@ fn test_unparse_subquery_alias_with_table_pushdown() -> 
Result<()> {
 assert_eq!(sql.to_string(), expected);
 Ok(())
 }
+
+#[test]
+fn test_unparse_left_anti_join() -> Result<()> {
+let schema = Schema::new(vec![
+Field::new("c", DataType::Int32, false),
+Field::new("d", DataType::Int32, false),
+]);
+
+// LeftAnti Join: t1.c = __correlated_sq_1.c
+//   TableScan: t1 projection=[c]
+//   SubqueryAlias: __correlated_sq_1
+// TableScan: t2 projection=[c]
+
+let table_scan1 = table_scan(Some("t1"), &schema, Some(vec![0, 
1]))?.build()?;
+let table_scan2 = table_scan(Some("t2"), &schema, Some(vec![0]))?.build()?;
+let subquery = subquery_alias(table_scan2, "__correlated_sq_1")?;
+let plan = LogicalPlanBuilder::from(table_scan1)
+.project(vec![col("t1.d")])?
+.join_on(
+subquery,
+datafusion_expr::JoinType::LeftAnti,
+vec![col("t1.c").eq(col("__correlated_sq_1.c"))],
+)?
+.build()?;
+
+let unparser = Unparser::new(&UnparserPostgreSqlDialect {});
+let sql = unparser.plan_to_sql(&plan)?;
+assert_eq!("SELECT \"t1\".\"d\" FROM \"t1\" WHERE NOT EXISTS (SELECT 1 
FROM \"t2\" AS \"__correlated_sq_1\" WHERE (\"t1\".\"c\" = 
\"__correlated_sq_1\".\"c\"))", sql.to_string());
+Ok(())
+}
+
+#[test]
+fn test_unparse_left_semi_join() -> Result<()> {
+let schema = Schema::new(vec![
+Field::new("c", DataType::Int32, false),
+Field::new("d", DataType::Int32, false),
+]);
+
+// LeftAnti Join: t1.c = __correlated_sq_1.c
+//   TableScan: t1 projection=[c]
+//   SubqueryAlias: __correlated_sq_1
+// TableScan: t2 projection=[c]

Review Comment:
   I think the comment sould be `semi_join`, not `left anti join`.



##
datafusion/sql/src/unparser/ast.rs:
##
@@ -176,6 +177,41 @@ impl SelectBuilder {
 self.lateral_views = value;
 self
 }
+
+/// Replaces the selection with a new value.
+///
+/// This function is used to replace a specific expression within the 
selection.
+/// Unlike the `selection` method which combines existing and new 
selections with AND,
+/// this method searches for and replaces occurrences of a specific 
expression.
+///
+/// This method is primarily used to modify LEFT MARK JOIN expressions.
+/// When processing a LEFT MARK JOIN, we need to replace the placeholder 
expression
+/// with the actual join condition in the selection clause.
+///
+/// # Arguments
+///
+/// * `existing_expr` - The expression to replace
+/// * `value` - The new expression to set as the selection
+///
+/// # Returns
+///
+/// A mutable reference to self for method chaining

Review Comment:
   ```suggestion
   ```
   It's redundant.



##
datafusion/sql/tests/cases/plan_to_sql.rs:
##
@@ -1746,3 +1749,153 @@ fn test_unparse_subquery_alias_with_table_pushdown() -> 
Result<()> {
 assert_eq!(sql.to_string(), expected);
 Ok(())
 }
+
+#[test]
+fn test_unparse_left_anti_join() -> Result<()> {
+let schema = Schema::new(vec![
+Field::new("c", DataType::Int32, false),
+Field::new("d", DataType::Int32, false),
+]);
+
+// LeftAnti Join: t1.c = __correlated_sq_1.c
+//   TableScan: t1 projection=[c]
+//   SubqueryAlias: __correlated_sq_1
+// TableScan: t2 projection=[c]
+
+let table_scan1 = table_scan(Some("t1"), &schema, Some(vec![0, 
1]))?.build()?;
+let table_scan2 = table_scan(Some("t2"), &schema, Some(vec![0]))?.build()?;
+let subquery = subquery_alias(table_scan2, "__correlated_sq_1")?;
+let plan = LogicalPlanBuilder::from(table_scan1)
+.project(vec![col("t1.d")])?
+.join_on(
+subquery,
+datafusion_expr::JoinType::LeftAnti,
+vec![col("t1.c").eq(col("__correlated_sq_1.c"))],
+)?
+.build()?;
+
+let unparser = Unparser::new(&UnparserPostgreSqlDialect {});
+let sql = unparser.plan_to_sql(&plan)?;
+assert_eq!("SELECT \"t1\".\"d\" FROM \"t1\" WHERE NOT EXISTS (SELECT 1 
FROM \"t2\" AS \"__correlated_sq_1\" WHERE (\"t1\".\"c\" = 
\"__correlated_sq_1\".\"c\"))", sql.to_string());
+Ok(())
+}
+
+#[test]
+fn test_unparse_left_semi_join() -> Result<()> {
+let schema = Schema::new(vec![
+Field::new("c", DataType::Int32, false),
+Field::new("d", DataType::Int32, false),
+]);
+
+// LeftAnti Join: t1.c = __correlated_sq_1.c
+//   TableScan: t1 projection=[c]
+//   SubqueryAlias: __correlated_sq_1
+// TableScan: t2 projection=[c]
+
+let table_scan1 = table_scan(Some("t1"), &schema, Some(vec![0, 
1]))?.build()?;
+let table_scan2 = table_scan(So

Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-14 Thread via GitHub


chenkovsky commented on PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#issuecomment-2725105874

   > > for stackoverflow problem. do you have any idea?
   > 
   > No, I don't have any idea currently 🤔. I'm not sure, but I guess 
`recursive_protection` is the right direction.
   
   the wierd thing is that after recursive_protection was added, on amd64 
linux, stack is still overflowed. but on other platform, it's ok. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-14 Thread via GitHub


goldmedal commented on PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#issuecomment-2725013843

   > for stackoverflow problem. do you have any idea?
   
   No, I don't have any idea currently 🤔. I'm not sure, but I guess 
`recursive_protection` is the right direction.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-14 Thread via GitHub


chenkovsky commented on PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#issuecomment-2724945557

   > Thanks @chenkovsky. I think the results of `left mark join`, `right semi 
join` and `right anit join` aren't correct (they aren't executable). I left 
some comments for them. Could you double-check them?
   
   thank you for your reply. I'm struggling with stackoverflow problem, so I 
missed something carelessly. it's my fault. I will fix it asap.
   
   for stackoverflow problem. do you have any idea?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-13 Thread via GitHub


chenkovsky commented on PR #15212:
URL: https://github.com/apache/datafusion/pull/15212#issuecomment-2723305518

   stackoverflow again. ಥ_ಥ


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



[PR] fix: unparsing left/ right semi/mark join [datafusion]

2025-03-13 Thread via GitHub


chenkovsky opened a new pull request, #15212:
URL: https://github.com/apache/datafusion/pull/15212

   ## Which issue does this PR close?
   
   - Closes #15127.
   
   ## Rationale for this change
   
   Most SQL dialects don't support specialized join types like LEFT SEMI, LEFT 
ANTI, LEFT MARK, RIGHT SEMI, or RIGHT ANTI joins. 
   
   ## What changes are included in this PR?
   
   For these cases, this PR rewrites these joins into equivalent EXISTS 
subqueries. 
   
   However, an alternative approach would be to generate dialect-specific SQL 
for different database engines.
   
   
   ## Are these changes tested?
   
   unit test
   
   ## Are there any user-facing changes?
   
   No.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org