[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229797016

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

@cloud-fan Makes sense. Thanks for clarification...

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229585323

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

> Is it okay to drop the backtick from the first identifier

AFAIK both `name` and `sql` are for display/messages. I think dropping the backticks is fine if there is no ambiguity.
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229583440

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

@dilipbiswal Currently `sql` is not the same as `name`.
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229578738

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

@cloud-fan I had a question about the 3rd example from huaxin; I wanted to confirm again whether the output looks okay to you.

**from huaxin**
```
$"`a`.b".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix:
```
`a`.`b`
```
make sql same as name:
```
a.b
```

Is it okay to drop the backtick from the first identifier?
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229577806

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

@huaxingao Were you planning to make a change so that `name` is the same as `sql`? Currently they are different.
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229538040

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

LGTM
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229479893

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

Yes, I agree. For the four examples, we will have the following results:

```
$"a.b".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix:
```
`a`.`b`
```
make sql same as name:
```
a.b
```

```
$"`a.b`".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix:
```
`a.b`
```
make sql same as name:
```
`a.b`
```

```
$"`a`.b".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix:
```
`a`.`b`
```
make sql same as name:
```
a.b
```

```
$"`a.b`.c".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix:
```
`a.b`.`c`
```
make sql same as name:
```
`a.b`.c
```

Does this look good to everybody?
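The "make sql same as name" column above follows a simple per-part rule: a name part is backtick-quoted only when it contains a dot. This is a minimal Python model of that rule for illustration (not Spark's actual code, which is Scala); `name_parts` stands for the already-parsed `nameParts` of an `UnresolvedAttribute`.

```python
def quote_part(part: str) -> str:
    # Backtick-quote a single name part only when it contains a dot,
    # so the quoting is unambiguous when the parts are joined with ".".
    return f"`{part}`" if "." in part else part

def name(name_parts: list[str]) -> str:
    # Join the selectively quoted parts with dots.
    return ".".join(quote_part(p) for p in name_parts)

# The four examples from the thread, after parsing into name parts:
print(name(["a", "b"]))    # $"a.b"     -> a.b
print(name(["a.b"]))       # $"`a.b`"   -> `a.b`
print(name(["a", "b"]))    # $"`a`.b"   -> a.b
print(name(["a.b", "c"]))  # $"`a.b`.c" -> `a.b`.c
```

Note how the second and fourth examples keep their backticks: without them, splitting the rendered string on `.` would reconstruct the wrong parts.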
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229354247

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
```
@@ -2856,6 +2856,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
       checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), Row(BigDecimal("26.393499451")))
     }
   }
+
+  test("SPARK-25769 escape nested columns") {
+    import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+    val sql1 = $"a.b".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql1 === "`a`.`b`")
+
+    val sql2 = $"`a.b`".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql2 === "`a.b`")
+
+    val sql3 = $"`a`.b".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql3 === "`a`.`b`")
+
+    val sql4 = $"`a.b`.c".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql4 === "`a.b`.`c`")
+  }
```
--- End diff --

Merged. Thank you very much! @dongjoon-hyun
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r228781145

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, mydb2.t1.i1]; line 1 pos 7
```
--- End diff --

Do you think we should just make `sql` same as `name`? It looks to me that `'db1.t1.i1'` is better than `` '`db1`.`t1`.`i1`' ``, as it's more compact and is not ambiguous.
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r228770171

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
```
@@ -2856,6 +2856,21 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
       checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), Row(BigDecimal("26.393499451")))
     }
   }
+
+  test("SPARK-25769 escape nested columns") {
+    import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+    val sql1 = $"a.b".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql1 === "`a`.`b`")
+
+    val sql2 = $"`a.b`".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql2 === "`a.b`")
+
+    val sql3 = $"`a`.b".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql3 === "`a`.`b`")
+
+    val sql4 = $"`a.b`.c".expr.asInstanceOf[UnresolvedAttribute].sql
+    assert(sql4 === "`a.b`.`c`")
+  }
```
--- End diff --

Hi, @huaxingao. Can we move this test to `ColumnExpressionSuite`? I made a [PR](https://github.com/huaxingao/spark/pull/1) to you. Please review and merge.
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227203199

--- Diff: sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out ---
```
@@ -81,7 +81,7 @@ SELECT t1.i1 FROM t1, mydb1.t1
 struct<>
 -- !query 9 output
 org.apache.spark.sql.AnalysisException
-Reference 't1.i1' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 1 pos 7
+Reference '`t1`.`i1`' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 1 pos 7
```
--- End diff --

These examples only make sense when we have the outer backticks, e.g. `'t1.i1'` is good.
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227202767

--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
```
@@ -99,7 +99,7 @@ case class UnresolvedTableValuedFunction(
 case class UnresolvedAttribute(nameParts: Seq[String]) extends Attribute with Unevaluable {
   def name: String =
-    nameParts.map(n => if (n.contains(".")) s"`$n`" else n).mkString(".")
+    nameParts.map(n => if (nameParts.length > 1 || n.contains(".")) s"`$n`" else n).mkString(".")
```
--- End diff --

I don't think this is better for `name`; we should update `sql` though.
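The effect of the rule proposed in this diff can be seen in isolation. Here is a minimal Python model for illustration (not Spark's actual Scala code): when a name has more than one part, every part gets backticks, which is exactly the `` `db1`.`t1`.`i1` `` rendering that cloud-fan pushes back on.

```python
def name(name_parts: list[str]) -> str:
    # Model of the changed rule: backtick every part when the name has
    # more than one part, or when a single part itself contains a dot.
    multi = len(name_parts) > 1
    return ".".join(f"`{p}`" if multi or "." in p else p for p in name_parts)

print(name(["db1", "t1", "i1"]))  # -> `db1`.`t1`.`i1`
print(name(["a.b"]))              # -> `a.b`
print(name(["i1"]))               # -> i1
```

Under the previous rule (quote only parts containing a dot), the first call would instead render as the more compact `db1.t1.i1`.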
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227152273

--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
```
@@ -2702,7 +2702,7 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
     val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i FROM v)"))
     assert(e.message ==
-      "cannot resolve '`v.i`' given input columns: [__auto_generated_subquery_name.i]")
```
--- End diff --

The outermost backticks are added in the `case _` branch of `UnresolvedAttribute.sql`:

```
override def sql: String = name match {
  case ParserUtils.escapedIdentifier(_) | ParserUtils.qualifiedEscapedIdentifier(_, _) => name
  case _ => quoteIdentifier(name)
}
```

Here `nameParts` is a `Seq` of `"v"`, `"i"`, so `name` is `v.i`.
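The `sql` behavior described above can be modeled outside Spark. This is a rough Python sketch for illustration only: the two regexes are stand-ins for `ParserUtils.escapedIdentifier` and `qualifiedEscapedIdentifier` (their exact patterns are an assumption here), and the fallback wraps the whole `name` string in one pair of backticks.

```python
import re

# Hypothetical stand-ins for ParserUtils.escapedIdentifier and
# ParserUtils.qualifiedEscapedIdentifier (assumed to match `x` and y.`x`).
ESCAPED = re.compile(r"^`(.+)`$")
QUALIFIED_ESCAPED = re.compile(r"^(.+)\.`(.+?)`$")

def sql(name: str) -> str:
    # Model of the current behavior: leave an already-escaped name alone,
    # otherwise backtick-quote the entire joined name as one identifier.
    if ESCAPED.match(name) or QUALIFIED_ESCAPED.match(name):
        return name
    return f"`{name}`"

print(sql("v.i"))    # -> `v.i`  (one outer pair, as in the error message)
print(sql("`a.b`"))  # -> `a.b`  (already escaped, returned unchanged)
```

This is why the error message shows `` '`v.i`' `` rather than `` '`v`.`i`' ``: the quoting is applied to the whole `name` string, not per name part.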