[GitHub] [spark] cloud-fan commented on pull request #37074: [SPARK-39672][SQL][3.1] Fix de-duplicating conflicting attributes when rewriting subquery

2022-07-12 Thread GitBox
cloud-fan commented on PR #37074: URL: https://github.com/apache/spark/pull/37074#issuecomment-1181935798 After more thought, I think we should treat a correlated subquery as a join in the optimizer rules. So in this case, once we remove the `Project`, the plan becomes invalid, because the

[GitHub] [spark] cloud-fan commented on pull request #37074: [SPARK-39672][SQL][3.1] Fix de-duplicating conflicting attributes when rewriting subquery

2022-07-11 Thread GitBox
cloud-fan commented on PR #37074: URL: https://github.com/apache/spark/pull/37074#issuecomment-1180582242 OK, I think `DeduplicateRelations` needs a fix. Ideally, the outer and inner plans should not have conflicting output attributes after analysis, but this local relation + project case

[GitHub] [spark] cloud-fan commented on pull request #37074: [SPARK-39672][SQL][3.1] Fix de-duplicating conflicting attributes when rewriting subquery

2022-07-11 Thread GitBox
cloud-fan commented on PR #37074: URL: https://github.com/apache/spark/pull/37074#issuecomment-1180579396 > This check is not accurate when there's an `And` expression in the join condition, as in this case. Hence, this PR proposes to add a check whether the intersected attributes exist in all