[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-31 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229797016
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

@cloud-fan Makes sense. Thanks for clarification...


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229585323
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

> Is it okay to drop the backtick from the first identifier

AFAIK both `name` and `sql` are for display/message. I think dropping 
backtick is fine if no ambiguous


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-31 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229583440
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

@dilipbiswal 
Currently ```sql``` is not the same as ```name```




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229578738
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

@cloud-fan I had a question, the 3rd example from huaxin, wanted to confirm 
again if the output looks okay to you.

**from huaxin**
```

$"`a`.b".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix:
```
`a`.`b`
``` 
make sql same as name:
```
a.b
```
Is it okay to drop the backtick from the first identifier ?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229577806
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

@huaxingao Were you planning to make a change to make `name` same as `sql` 
? Currently they are different ? 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229538040
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

LGTM


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229479893
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

Yes. I agree. For the four examples, we will have the following results:
```

$"a.b".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix: 
```
`a`.`b` 
``` 
make sql same as name: 
```
a.b
```
```

$"`a.b`".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix: 
```
`a`.`b`
```  
make sql same as name: 
```
`a.b`
```
```

$"`a`.b".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix: 
```
`a`.`b` 
```
make sql same as name: 
```
a.b
```
```

$"`a.b`.c".expr.asInstanceOf[org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute].sql
```
with my previous fix: 
```
`a.b`.`c`
``` 
make sql same as name: 
```
`a.b`.c 
```
Does this look good to everybody?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-30 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r229354247
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2856,6 +2856,21 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), 
Row(BigDecimal("26.393499451")))
 }
   }
+
+  test("SPARK-25769 escape nested columns") {
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+val sql1 = $"a.b".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql1 === "`a`.`b`")
+
+val sql2 = $"`a.b`".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql2 === "`a.b`")
+
+val sql3 = $"`a`.b".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql3 === "`a`.`b`")
+
+val sql4 = $"`a.b`.c".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql4 === "`a.b`.`c`")
+  }
--- End diff --

Merged. Thank you very much! @dongjoon-hyun 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-28 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r228781145
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -161,7 +161,7 @@ SELECT db1.t1.i1 FROM t1, mydb2.t1
 struct<>
 -- !query 18 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`db1.t1.i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
+cannot resolve '`db1`.`t1`.`i1`' given input columns: [mydb2.t1.i1, 
mydb2.t1.i1]; line 1 pos 7
--- End diff --

do you think we should just make `sql` same as `name`? It looks to me that 
`'db1.t1.i1'` is better than `` '`db1`.`t1`.`i1`' ``, as it's more compact and 
is not ambiguous.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-28 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r228770171
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2856,6 +2856,21 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
   checkAnswer(sql("select 26393499451 / (1e6 * 1000)"), 
Row(BigDecimal("26.393499451")))
 }
   }
+
+  test("SPARK-25769 escape nested columns") {
+import org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute
+val sql1 = $"a.b".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql1 === "`a`.`b`")
+
+val sql2 = $"`a.b`".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql2 === "`a.b`")
+
+val sql3 = $"`a`.b".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql3 === "`a`.`b`")
+
+val sql4 = $"`a.b`.c".expr.asInstanceOf[UnresolvedAttribute].sql
+assert(sql4 === "`a.b`.`c`")
+  }
--- End diff --

Hi, @huaxingao . Can we move this test to `ColumnExpressionSuite`? I made a 
[PR](https://github.com/huaxingao/spark/pull/1) to you. Please review and merge.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227203199
  
--- Diff: 
sql/core/src/test/resources/sql-tests/results/columnresolution-negative.sql.out 
---
@@ -81,7 +81,7 @@ SELECT t1.i1 FROM t1, mydb1.t1
 struct<>
 -- !query 9 output
 org.apache.spark.sql.AnalysisException
-Reference 't1.i1' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; line 
1 pos 7
+Reference '`t1`.`i1`' is ambiguous, could be: mydb1.t1.i1, mydb1.t1.i1.; 
line 1 pos 7
--- End diff --

These examples only make sense when we have the outer backticks. e.g. 
`'t1.i1'` is good.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-22 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227202767
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala
 ---
@@ -99,7 +99,7 @@ case class UnresolvedTableValuedFunction(
 case class UnresolvedAttribute(nameParts: Seq[String]) extends Attribute 
with Unevaluable {
 
   def name: String =
-nameParts.map(n => if (n.contains(".")) s"`$n`" else n).mkString(".")
+nameParts.map(n => if (nameParts.length > 1 || n.contains(".")) 
s"`$n`" else n).mkString(".")
--- End diff --

I don't think this is better for `name`, we should update `sql` though.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22788: [SPARK-25769][SQL]escape nested columns by backti...

2018-10-22 Thread huaxingao
Github user huaxingao commented on a diff in the pull request:

https://github.com/apache/spark/pull/22788#discussion_r227152273
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala 
---
@@ -2702,7 +2702,7 @@ class SQLQuerySuite extends QueryTest with 
SharedSQLContext {
 
   val e = intercept[AnalysisException](sql("SELECT v.i from (SELECT i 
FROM v)"))
   assert(e.message ==
-"cannot resolve '`v.i`' given input columns: 
[__auto_generated_subquery_name.i]")
--- End diff --

The out-most backticks is added in 
```case _```, in ```case class UnresolvedAttribute(nameParts: 
Seq[String])```
```
  override def sql: String = name match {
case ParserUtils.escapedIdentifier(_) | 
ParserUtils.qualifiedEscapedIdentifier(_, _) => name
case _ => quoteIdentifier(name)
  }
```
the ```nameParts``` is a Seq of ```"v" "i"```, name is ```v.i``` 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org