[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/18930


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133817184
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a');
+SELECT json_tuple(jsonField, 'b', CAST(NULL AS STRING), a) FROM jsonTable;
+-- Clean up
+DROP VIEW IF EXISTS jsonTable;
--- End diff --

This is not a big deal. Thus, I will merge it when the test can pass. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133817086
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a');
+SELECT json_tuple(jsonField, 'b', CAST(NULL AS STRING), a) FROM jsonTable;
+-- Clean up
+DROP VIEW IF EXISTS jsonTable;
--- End diff --

Nit: just FYI, we do not need to drop the temp view in SQLQueryTestSuite. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133723664
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a, b) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a', 'b');
--- End diff --

ok, we have altered the `'a'` to table field `a`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133723113
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a, b) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a', 'b');
+SELECT json_tuple(jsonField, b, CAST(NULL AS STRING), 'a') FROM jsonTable;
+-- Clean up
+DROP VIEW IF EXISTS jsonTable;
--- End diff --

@HyukjinKwon @viirya Fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133709165
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a, b) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a', 'b');
+SELECT json_tuple(jsonField, b, CAST(NULL AS STRING), 'a') FROM jsonTable;
+-- Clean up
+DROP VIEW IF EXISTS jsonTable;
--- End diff --

+1. I remember @gatorsmile has also suggested to add it.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133708907
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a, b) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a', 'b');
--- End diff --

I suggest we can rewrite to:

CREATE TEMPORARY VIEW jsonTable(jsonField, a) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a'); 
SELECT json_tuple(jsonField, 'b', CAST(NULL AS STRING), a) FROM 
jsonTable; 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133705132
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a, b) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a', 'b');
+SELECT json_tuple(jsonField, b, CAST(NULL AS STRING), 'a') FROM jsonTable;
+-- Clean up
+DROP VIEW IF EXISTS jsonTable;
--- End diff --

It looks the project style does not require a newline at the end but I 
would personally add this ..


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-17 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133705015
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+SELECT json_tuple('{"a" : 1, "b" : 2}', CAST(NULL AS STRING), 'b', 
CAST(NULL AS STRING), 'a');
+CREATE TEMPORARY VIEW jsonTable(jsonField, a, b) AS SELECT * FROM VALUES 
('{"a": 1, "b": 2}', 'a', 'b');
--- End diff --

Looks 'a' field is not used here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133614456
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', 
cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values 
'{"a": 1, "b": 2}', 'a', 'b';
+SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') FROM jsonTable
--- End diff --

@gatorsmile @viirya Thank you for your time to review the code. SQL 
statements are consistent in style and the golden file of `json-functions.sql` 
also committed. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133610560
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', 
cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values 
'{"a": 1, "b": 2}', 'a', 'b';
+SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') FROM jsonTable
--- End diff --

Remember to drop the created view `DROP VIEW IF EXISTS jsonTable;`



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133507781
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
--- End diff --

no need to add these two desc commands.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133507734
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', 
cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values 
'{"a": 1, "b": 2}', 'a', 'b';
--- End diff --

To be consistent with the other SQL commands, use upper cases for SQL 
keywords


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133507510
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', 
cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values 
'{"a": 1, "b": 2}', 'a', 'b';
+SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') FROM jsonTable
--- End diff --

Also add an extra space at the end of this file


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133507416
  
--- Diff: sql/core/src/test/resources/sql-tests/inputs/json-functions.sql 
---
@@ -20,3 +20,9 @@ select from_json('{"a":1}', 'a InvalidType');
 select from_json('{"a":1}', 'a INT', named_struct('mode', 'PERMISSIVE'));
 select from_json('{"a":1}', 'a INT', map('mode', 1));
 select from_json();
+-- json_tuple
+describe function json_tuple;
+describe function extended json_tuple;
+select json_tuple('{"a" : 1, "b" : 2}', cast(NULL AS STRING), 'b', 
cast(NULL AS STRING), 'a')
+create temporary view jsonTable(jsonField, a, b) as select * from values 
'{"a": 1, "b": 2}', 'a', 'b';
+SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') FROM jsonTable
--- End diff --

To generate the result file, you need to run the command 
```
SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/test-only *SQLQueryTestSuite"
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133479566
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
+checkAnswer(sql(
+  """
+|SELECT json_tuple('{"a" : 1, "b" : 2}'
+|, cast(NULL AS STRING), 'b'
+|, cast(NULL AS STRING), 'a')
+  """.stripMargin), Row(null, "2", null, "1"))
--- End diff --

ok, thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133480372
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
--- End diff --

@gatorsmile has added unit test case in `JsonExpressionsSuite`
@viirya also add end-to-end test in `json-functions.sql`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-16 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133479509
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
+checkAnswer(sql(
+  """
+|SELECT json_tuple('{"a" : 1, "b" : 2}'
+|, cast(NULL AS STRING), 'b'
+|, cast(NULL AS STRING), 'a')
+  """.stripMargin), Row(null, "2", null, "1"))
+
+// mixes constant field name and non constant one
+withTempView("jsonTable") {
+  Seq(("""{"a": 1, "b": 2}""", "a", "b"))
+.toDF("jsonField", "a", "b")
+.createOrReplaceTempView("jsonTable")
+
+  checkAnswer(
+sql("""SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') 
FROM jsonTable"""),
--- End diff --

will move L2053 to json-functions.sql


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133347400
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
--- End diff --

The end-to-end test at L2047 may not be able to move to 
`JsonExpressionsSuite`. We can have some unit test cases similar to L2039 in 
`JsonExpressionsSuite` as @gatorsmile suggested.

It is also good to have similar end-to-end tests in `json-functions.sql`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133270299
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
--- End diff --

This is just an end-to-end test case. We also need to add unit test cases 
in `JsonExpressionsSuite` 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133269222
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
+checkAnswer(sql(
+  """
+|SELECT json_tuple('{"a" : 1, "b" : 2}'
+|, cast(NULL AS STRING), 'b'
+|, cast(NULL AS STRING), 'a')
+  """.stripMargin), Row(null, "2", null, "1"))
--- End diff --

Nit: move `Row(null, "2", null, "1"))` to the next line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133249207
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
+checkAnswer(sql(
+  """
+|SELECT json_tuple('{"a" : 1, "b" : 2}'
+|, cast(NULL AS STRING), 'b'
+|, cast(NULL AS STRING), 'a')
+  """.stripMargin), Row(null, "2", null, "1"))
+
+// mixes constant field name and non constant one
+withTempView("jsonTable") {
+  Seq(("""{"a": 1, "b": 2}""", "a", "b"))
+.toDF("jsonField", "a", "b")
+.createOrReplaceTempView("jsonTable")
+
+  checkAnswer(
+sql("""SELECT json_tuple(jsonField, b, cast(NULL AS STRING), 'a') 
FROM jsonTable"""),
--- End diff --

Nit: """ -> "


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133202748
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,25 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
--- End diff --

Could we move this to 
`spark/sql/core/src/test/resources/sql-tests/inputs/json-functions.sql` and/or 
`spark/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/JsonExpressionsSuite.scala`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133200977
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -362,12 +362,12 @@ case class JsonTuple(children: Seq[Expression])
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
   // eagerly evaluate any foldable the field names
-  @transient private lazy val foldableFieldNames: IndexedSeq[String] = {
+  @transient private lazy val foldableFieldNames: 
IndexedSeq[Option[String]] = {
 fieldExpressions.map {
-  case expr if expr.foldable => 
expr.eval().asInstanceOf[UTF8String].toString
+  case expr if expr.foldable => 
Option(expr.eval()).map(_.asInstanceOf[UTF8String].toString)
   case _ => null
-}.toIndexedSeq
-  }
+}
+  }.toIndexedSeq
--- End diff --

@viirya Done: (1) remove redundant comment  (2) move `toIndexedSeq` after 
the map


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133190654
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -362,12 +362,12 @@ case class JsonTuple(children: Seq[Expression])
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
   // eagerly evaluate any foldable the field names
-  @transient private lazy val foldableFieldNames: IndexedSeq[String] = {
+  @transient private lazy val foldableFieldNames: 
IndexedSeq[Option[String]] = {
 fieldExpressions.map {
-  case expr if expr.foldable => 
expr.eval().asInstanceOf[UTF8String].toString
+  case expr if expr.foldable => 
Option(expr.eval()).map(_.asInstanceOf[UTF8String].toString)
   case _ => null
-}.toIndexedSeq
-  }
+}
+  }.toIndexedSeq
--- End diff --

Can we move `toIndexedSeq` to inner block, i.e. after the map?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133190497
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -426,10 +426,11 @@ case class JsonTuple(children: Seq[Expression])
   // prefer the cached copy when available
   foldableFieldNames.zip(fieldExpressions).map {
 case (null, expr) => 
expr.eval(input).asInstanceOf[UTF8String].toString
-case (fieldName, _) => fieldName
+case (fieldName, _) => fieldName.orNull
   }
 }
 
+// Array[String]
--- End diff --

Can we remove this line?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-15 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133135302
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -359,14 +359,14 @@ case class JsonTuple(children: Seq[Expression])
   @transient private lazy val jsonExpr: Expression = children.head
 
   // the fields to query are the remaining children
-  @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
+  @transient private lazy val fieldExpressions: Array[Expression] = 
children.tail.toArray
 
   // eagerly evaluate any foldable the field names
-  @transient private lazy val foldableFieldNames: IndexedSeq[String] = {
+  @transient private lazy val foldableFieldNames: Array[Option[String]] = {
--- End diff --

@viirya ok, thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-14 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133117021
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -359,14 +359,14 @@ case class JsonTuple(children: Seq[Expression])
   @transient private lazy val jsonExpr: Expression = children.head
 
   // the fields to query are the remaining children
-  @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
+  @transient private lazy val fieldExpressions: Array[Expression] = 
children.tail.toArray
 
   // eagerly evaluate any foldable the field names
-  @transient private lazy val foldableFieldNames: IndexedSeq[String] = {
+  @transient private lazy val foldableFieldNames: Array[Option[String]] = {
--- End diff --

I think we should continue to use `IndexedSeq` which is more efficient as 
`foldableFieldNames` will be used for many times.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-14 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133116488
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,13 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
+checkAnswer(sql(
+  """
+|SELECT json_tuple('{"a" : 1, "b" : 2}'
+|, cast(NULL AS STRING), 'b'
+|, cast(NULL AS STRING), 'a')
+  """.stripMargin), Row(null, "2", null, "1"))
--- End diff --

@viirya Done, the added test case contains column name, constant field 
name, and null field name.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-14 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r133112441
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala
 ---
@@ -2034,4 +2034,13 @@ class JsonSuite extends QueryTest with 
SharedSQLContext with TestJsonData {
   }
 }
   }
+
+  test("SPARK-21677: json_tuple throws NullPointException when column is 
null as string type") {
+checkAnswer(sql(
+  """
+|SELECT json_tuple('{"a" : 1, "b" : 2}'
+|, cast(NULL AS STRING), 'b'
+|, cast(NULL AS STRING), 'a')
+  """.stripMargin), Row(null, "2", null, "1"))
--- End diff --

@jmchung Can we also add the test we discussed in slack which mixes 
constant field name and non constant one?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-14 Thread jmchung
Github user jmchung commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r132984129
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -361,10 +361,18 @@ case class JsonTuple(children: Seq[Expression])
   // the fields to query are the remaining children
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
+  // a field name given with constant null will be replaced with this 
pseudo field name
+  private val nullFieldName = "__NullFieldName"
--- End diff --

@HyukjinKwon @viirya  Yep, we've discarded the fake field name and use 
Option here. We made a slight revision to deal with the None in 
`foldableFieldNames` instead of creating a new function.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-13 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r132860287
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -361,10 +361,18 @@ case class JsonTuple(children: Seq[Expression])
   // the fields to query are the remaining children
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
+  // a field name given with constant null will be replaced with this 
pseudo field name
+  private val nullFieldName = "__NullFieldName"
--- End diff --

Yeah, I've also considered using Option here. But don't want to come out 
Option version first, so we can experience review process. It looks good to me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-13 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r132846456
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -361,10 +361,18 @@ case class JsonTuple(children: Seq[Expression])
   // the fields to query are the remaining children
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
+  // a field name given with constant null will be replaced with this 
pseudo field name
+  private val nullFieldName = "__NullFieldName"
--- End diff --

@jmchung, could we maybe compute this foldable related optimization ahead -
 
https://github.com/jmchung/spark/blob/ffa575a6731fef3e0731b73e0f7311cb024e831b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L425-L439
 and remove this fake field name?

I think we can make a function for the above codes first and then use it 
for computation for each row. Did I understand correctly?

I tried a rough version I thought - 
https://github.com/jmchung/spark/compare/SPARK-21677...HyukjinKwon:tmp-18930?expand=1,
 @viirya what do you think about this?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #18930: [SPARK-21677][SQL] json_tuple throws NullPointExc...

2017-08-12 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/18930#discussion_r132830750
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -361,10 +361,18 @@ case class JsonTuple(children: Seq[Expression])
   // the fields to query are the remaining children
   @transient private lazy val fieldExpressions: Seq[Expression] = 
children.tail
 
+  // toString on null will throw NullPointerException so that return a 
very unlikely column name
+  private val nullFieldName = "__NullFieldName"
--- End diff --

`A field name given with constant null will be replaced with this pseudo 
field name. `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org