[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-07 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1459329942 @hvanhovell Do we still need this change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-07 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1458193681 > E... SQL/scala/Python all use the analyzer; they are all just frontends to the same thing. I found the reason. Although the Scala API uses the analyzer too. `object ResolveLambda

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-07 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1458109080 > @beliefer here is the thing. When this was designed it was mainly aimed at sql, and there we definitely do not generate unique names in lambda functions either. This is all done in the

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-06 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1457418420 @hvanhovell Scala also uses `UnresolvedNamedLambdaVariable.freshVarName("x")` to get unique names. See: https://github.com/apache/spark/blob/201e08c03a31c763e3120540ac1b1ca8ef252
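
For readers unfamiliar with the idea behind the comment above, here is a minimal Python sketch of counter-based fresh naming. It is not Spark's code: the helper `fresh_var_name`, the module-level counter, and the `name_N` suffix format are all assumptions made for illustration. It only shows why two lambda variables that both start out as `x` end up distinguishable.

```python
import itertools

# Hypothetical module-level counter; assumed for illustration only.
_next_var_id = itertools.count()


def fresh_var_name(name: str) -> str:
    """Return `name` with an increasing numeric suffix (hypothetical helper)."""
    return f"{name}_{next(_next_var_id)}"


# Two nested lambdas that both call their argument "x" get distinct names,
# so a reference to the outer variable inside the inner lambda is unambiguous.
outer_var = fresh_var_name("x")
inner_var = fresh_var_name("x")
```

Without the suffix, both variables would be named `x`, and resolving a reference to `x` inside the inner lambda would depend entirely on scoping rules applied later.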

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-06 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1456026258 It seems PySpark supports nested lambda variables, and two PRs fixed the issue.

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-05 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1455392063 > I guess we will need to rewrite the lambda function in the Spark Connect planner. Yeah.

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-05 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1455390728 ![image](https://user-images.githubusercontent.com/8486025/223014232-bf9b26ee-d0e8-4de4-a8fe-2d252813ac4d.png)

[GitHub] [spark] beliefer commented on pull request #40287: [SPARK-42562][CONNECT] UnresolvedNamedLambdaVariable in python do not need unique names

2023-03-05 Thread via GitHub
beliefer commented on PR #40287: URL: https://github.com/apache/spark/pull/40287#issuecomment-1455384317 @hvanhovell In my testing, `python/run-tests --testnames 'pyspark.sql.connect.dataframe'` does not pass.