[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-20 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101971506
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Returns a struct value 
with the given `jsonStr` and `schema`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_('{"a":1}', '{"type":"struct", "fields":[{"name":"a", 
"type":"integer", "nullable":true}]}');
--- End diff --

I checked the related code; if we use `named_struct` here, we need to add 
substantial code to convert `named_struct` into a `StructType`...
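
For illustration, a hedged toy sketch (these are not Spark's real expression classes; all names here are hypothetical) of the kind of pairing-and-conversion logic that turning a `named_struct(name1, val1, name2, val2, ...)` call into a schema would require:

```scala
// Toy stand-ins for Spark's expression tree.
sealed trait Expr
case class StrLit(v: String) extends Expr
case class IntLit(v: Int) extends Expr
case class NamedStruct(args: Seq[Expr]) extends Expr

// named_struct arguments alternate (field name, field value); each pair must
// be translated into a (name, type) schema entry.
def structFieldTypes(ns: NamedStruct): Seq[(String, String)] =
  ns.args.grouped(2).map {
    case Seq(StrLit(name), IntLit(_)) => name -> "integer"
    case Seq(StrLit(name), StrLit(_)) => name -> "string"
    case other =>
      throw new IllegalArgumentException(s"Unexpected named_struct args: $other")
  }.toSeq
```

Even this toy version needs a case per supported value type, which is the "substantial code" being referred to.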


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-20 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101968164
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Returns a struct value 
with the given `jsonStr` and `schema`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_('{"a":1}', '{"type":"struct", "fields":[{"name":"a", 
"type":"integer", "nullable":true}]}');
--- End diff --

@gatorsmile, do you mind elaborating on what you have in mind with 
`named_struct`? I am just curious.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101950039
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Returns a struct value 
with the given `jsonStr` and `schema`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_('{"a":1}', '{"type":"struct", "fields":[{"name":"a", 
"type":"integer", "nullable":true}]}');
--- End diff --

I'll check



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101950073
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
@@ -174,4 +174,44 @@ class JsonFunctionsSuite extends QueryTest with 
SharedSQLContext {
   .select(to_json($"struct").as("json"))
 checkAnswer(dfTwo, readBackTwo)
   }
+
+  test("SPARK-19637 Support to_json/from_json in SQL") {
+// to_json
+val df1 = Seq(Tuple1(Tuple1(1))).toDF("a")
+checkAnswer(
+  df1.selectExpr("to_json(a)"),
+  Row("""{"_1":1}""") :: Nil)
+
+val df2 = Seq(Tuple1(Tuple1(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0")))).toDF("a")
+checkAnswer(
+  df2.selectExpr("""to_json(a, '{"timestampFormat": "dd/MM/yyyy HH:mm"}')"""),
+  Row("""{"_1":"26/08/2015 18:00"}""") :: Nil)
+
+val errMsg1 = intercept[AnalysisException] {
+  df2.selectExpr("""to_json(a, '{"k": [{"k": "v"}]}')""").collect
+}
+assert(errMsg1.getMessage.startsWith(
+  """The format must be '{"key": "value", ...}', but {"k": [{"k": 
"v"}]}"""))
+
+// from_json
+val df3 = Seq("""{"a": 1}""").toDS()
+val schema1 = new StructType().add("a", IntegerType)
+checkAnswer(
+  df3.selectExpr(s"from_json(value, '${schema1.json}')"),
+  Row(Row(1)) :: Nil)
+
+val df4 = Seq("""{"time": "26/08/2015 18:00"}""").toDS()
+val schema2 = new StructType().add("time", TimestampType)
+checkAnswer(
+  df4.selectExpr(
+s"""from_json(value, '${schema2.json}', """ +
+   """'{"timestampFormat": "dd/MM/yyyy HH:mm"}')"""),
--- End diff --

okay, I'll fix in that way.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101948866
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Returns a struct value 
with the given `jsonStr` and `schema`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_('{"a":1}', '{"type":"struct", "fields":[{"name":"a", 
"type":"integer", "nullable":true}]}');
--- End diff --

Can we let users call `named_struct` function to specify the schema?



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101948552
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
@@ -174,4 +174,44 @@ class JsonFunctionsSuite extends QueryTest with 
SharedSQLContext {
   .select(to_json($"struct").as("json"))
 checkAnswer(dfTwo, readBackTwo)
   }
+
+  test("SPARK-19637 Support to_json/from_json in SQL") {
+// to_json
+val df1 = Seq(Tuple1(Tuple1(1))).toDF("a")
+checkAnswer(
+  df1.selectExpr("to_json(a)"),
+  Row("""{"_1":1}""") :: Nil)
+
+val df2 = Seq(Tuple1(Tuple1(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0")))).toDF("a")
+checkAnswer(
+  df2.selectExpr("""to_json(a, '{"timestampFormat": "dd/MM/yyyy HH:mm"}')"""),
+  Row("""{"_1":"26/08/2015 18:00"}""") :: Nil)
+
+val errMsg1 = intercept[AnalysisException] {
+  df2.selectExpr("""to_json(a, '{"k": [{"k": "v"}]}')""").collect
--- End diff --

`collect ` is not needed



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101948378
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
@@ -174,4 +174,44 @@ class JsonFunctionsSuite extends QueryTest with 
SharedSQLContext {
   .select(to_json($"struct").as("json"))
 checkAnswer(dfTwo, readBackTwo)
   }
+
+  test("SPARK-19637 Support to_json/from_json in SQL") {
+// to_json
+val df1 = Seq(Tuple1(Tuple1(1))).toDF("a")
+checkAnswer(
+  df1.selectExpr("to_json(a)"),
+  Row("""{"_1":1}""") :: Nil)
+
+val df2 = Seq(Tuple1(Tuple1(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0")))).toDF("a")
+checkAnswer(
+  df2.selectExpr("""to_json(a, '{"timestampFormat": "dd/MM/yyyy HH:mm"}')"""),
+  Row("""{"_1":"26/08/2015 18:00"}""") :: Nil)
+
+val errMsg1 = intercept[AnalysisException] {
+  df2.selectExpr("""to_json(a, '{"k": [{"k": "v"}]}')""").collect
+}
+assert(errMsg1.getMessage.startsWith(
+  """The format must be '{"key": "value", ...}', but {"k": [{"k": 
"v"}]}"""))
+
+// from_json
+val df3 = Seq("""{"a": 1}""").toDS()
+val schema1 = new StructType().add("a", IntegerType)
+checkAnswer(
+  df3.selectExpr(s"from_json(value, '${schema1.json}')"),
+  Row(Row(1)) :: Nil)
+
+val df4 = Seq("""{"time": "26/08/2015 18:00"}""").toDS()
+val schema2 = new StructType().add("time", TimestampType)
+checkAnswer(
+  df4.selectExpr(
+s"""from_json(value, '${schema2.json}', """ +
+   """'{"timestampFormat": "dd/MM/yyyy HH:mm"}')"""),
--- End diff --

Regarding the format of options, another way is to use the MapType. For example:

```Scala
from_json(value, '${schema2.json}', map("timestampFormat", "dd/MM/yyyy HH:mm"))
```

I am not sure whether using JSON to represent options is a good way.
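
A hedged sketch of this map-literal alternative: the analyzer would see a SQL `map(...)` call as a flat key/value argument list, which can be paired up with no JSON parsing at all (`optionsFromMapArgs` is a hypothetical helper, not Spark's API):

```scala
// Pair up the flattened key/value arguments a SQL map("k1","v1","k2","v2")
// call would produce; no JSON parsing involved.
def optionsFromMapArgs(args: Seq[String]): Map[String, String] = {
  require(args.length % 2 == 0, "map() requires an even number of arguments")
  args.grouped(2).map { case Seq(k, v) => k -> v }.toMap
}
```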



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101948229
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,24 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException(s"Must be a string literal, but: 
$e")
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
+
+  /**
+   * Convert a literal including a json option string (e.g., '{"mode": 
"PERMISSIVE", ...}')
--- End diff --

Ah, you mean we should use a map literal directly? Sorry, I missed that 
idea. Does that make this JSON option format unnecessary? If so, I'll fix it 
to use a map literal here.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101947987
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Returns a struct value 
with the given `jsonStr` and `schema`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_('{"a":1}', '{"type":"struct", "fields":[{"name":"a", 
"type":"integer", "nullable":true}]}');
+   {"a":1}
--- End diff --

More examples are needed to show users how to use the options.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101947601
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,24 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException(s"Must be a string literal, but: 
$e")
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
+
+  /**
+   * Convert a literal including a json option string (e.g., '{"mode": 
"PERMISSIVE", ...}')
--- End diff --

What is the reason we use a JSON option string?



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101947505
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Return a struct value with 
the given `jsonStr` and `schema`.",
--- End diff --

`Return` -> `Returns`



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101947175
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,26 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
--- End diff --

okay, I'll do that ;)



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101947094
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,26 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
--- End diff --

Ah, thanks. Yes, if it throws a class cast exception, I think we should 
produce a better exception and message rather than just one saying `A cannot be 
cast to B`. Maybe add a util for both places?



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-19 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101946265
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,26 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
--- End diff --

I just wrote it this way following the existing code here: 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L3010.
 Either way is okay with me, though; if we modify the code the way you 
suggested, do we need to modify the `from_json` code, too?



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101909627
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,26 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
--- End diff --

trivial..

```
s"Must be a string literal, but: $e"
```



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101909598
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,26 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
+
+  /**
+   * Convert a literal including a json option string (e.g., '{"mode": 
"PERMISSIVE", ...}')
+   * to Map-type data.
+   */
+  def validateOptionsLiteral(exp: Expression): Map[String, String] = {
+implicit val formats = org.json4s.DefaultFormats
+val json = validateStringLiteral(exp)
+Try(parse(json).extract[Map[String, String]]) match {
+  case Success(m) => m
+  case Failure(_) =>
+throw new AnalysisException(s"""The format must be '{"key": "value", ...}', but ${json}""")
--- End diff --

Could we do this as below

```scala
Try(parse(json).extract[Map[String, String]]).getOrElse {
  throw new AnalysisException(...)
}
```

or maybe just a try-catch block?
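
The suggested `getOrElse` shape can be sketched as below. The parser is a toy regex stand-in for the PR's json4s `extract` call, and `IllegalArgumentException` stands in for Spark's `AnalysisException`:

```scala
import scala.util.Try

// Toy parser for flat '{"key": "value", ...}' strings (a stand-in for
// parse(json).extract[Map[String, String]]).
def parseOptions(json: String): Map[String, String] = {
  val t = json.trim
  require(t.startsWith("{") && t.endsWith("}"), "not a JSON object")
  """"([^"]+)"\s*:\s*"([^"]+)"""".r
    .findAllMatchIn(t).map(m => m.group(1) -> m.group(2)).toMap
}

// The reviewer's suggestion: collapse the Success/Failure match into getOrElse.
def validateOptions(json: String): Map[String, String] =
  Try(parseOptions(json)).getOrElse {
    throw new IllegalArgumentException(
      s"""The format must be '{"key": "value", ...}', but $json""")
  }
```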



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101909847
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +60,26 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
--- End diff --

I guess we should check if it is `StructType` and throw a proper exception 
because it seems `JsonToStruct` does not check if `exp` is `StructType` and it 
probably throws a cast exception (if I haven't missed something here).
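
A hedged sketch of the check being suggested, using toy stand-ins for Spark's `DataType` hierarchy and `AnalysisException`: pattern-match the parsed type instead of calling `asInstanceOf`, so a non-struct schema fails with a readable message rather than a `ClassCastException`:

```scala
// Toy stand-ins for Spark's DataType classes.
sealed trait SimpleDataType
case class SimpleStructType(fieldNames: Seq[String]) extends SimpleDataType
case object SimpleIntegerType extends SimpleDataType

// Instead of dt.asInstanceOf[SimpleStructType], match and report clearly.
def asStructType(dt: SimpleDataType): SimpleStructType = dt match {
  case s: SimpleStructType => s
  case other =>
    throw new IllegalArgumentException(
      s"The schema must be a struct type, but got: $other")
}
```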



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101909916
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala
 ---
@@ -482,6 +482,15 @@ case class JsonTuple(children: Seq[Expression])
 /**
  * Converts an json input string to a [[StructType]] with the specified 
schema.
  */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(jsonStr, schema[, options]) - Return a `StructType` 
value with the given `jsonStr` and `schema`.",
--- End diff --

Probably just `struct` instead of `` `StructType` `` (I found an example 
in 
https://github.com/apache/spark/blob/e7f982b20d8a1c0db711e0dcfe26b2f39f98dd64/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/generators.scala#L340
 as a reference).



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101891758
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -18,7 +18,14 @@
 package org.apache.spark.sql.catalyst.json
 
 import com.fasterxml.jackson.core.{JsonParser, JsonToken}
+import org.json4s._
+import org.json4s.JsonAST.JValue
+import org.json4s.JsonDSL._
+import org.json4s.jackson.JsonMethods._
--- End diff --

This is just my personal opinion, but should we maybe consider minimising 
this import? For example, `import org.json4s.jackson.JsonMethods.parse`.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101891914
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/JsonFunctionsSuite.scala ---
@@ -174,4 +174,44 @@ class JsonFunctionsSuite extends QueryTest with 
SharedSQLContext {
   .select(to_json($"struct").as("json"))
 checkAnswer(dfTwo, readBackTwo)
   }
+
+  test("SPARK-19637 Support to_json/from_json in SQL") {
+// to_json
+val df1 = Seq(Tuple1(Tuple1(1))).toDF("a")
+checkAnswer(
+  df1.selectExpr("to_json(a)"),
+  Row("""{"_1":1}""") :: Nil)
+
+val df2 = Seq(Tuple1(Tuple1(java.sql.Timestamp.valueOf("2015-08-26 18:00:00.0")))).toDF("a")
+checkAnswer(
+  df2.selectExpr("""to_json(a, '{"timestampFormat": "dd/MM/yyyy HH:mm"}')"""),
+  Row("""{"_1":"26/08/2015 18:00"}""") :: Nil)
+
+val errMsg1 = intercept[AnalysisException] {
+  df2.selectExpr("""to_json(a, '{"k": [{"k": "v"}]}')""").collect
+}
+assert(errMsg1.getMessage.startsWith(
+  "The format must be '{\"key\": \"value\", ...}', but {\"k\": 
[{\"k\": \"v\"}]}"))
--- End diff --

Maybe use `""" ... """` if more commits are to be pushed.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-18 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/16981#discussion_r101891903
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/json/JacksonUtils.scala
 ---
@@ -55,4 +62,32 @@ object JacksonUtils {
 
 schema.foreach(field => verifyType(field.name, field.dataType))
   }
+
+  private def validateStringLiteral(exp: Expression): String = exp match {
+case Literal(s, StringType) => s.toString
+case e => throw new AnalysisException("Must be a string literal, but: 
" + e)
+  }
+
+  def validateSchemaLiteral(exp: Expression): StructType =
+DataType.fromJson(validateStringLiteral(exp)).asInstanceOf[StructType]
+
+  /**
+   * Convert a literal including a json option string (e.g., '{"mode": 
"PERMISSIVE", ...}')
+   * to Map-type data.
+   */
+  def validateOptionsLiteral(exp: Expression): Map[String, String] = {
+val json = validateStringLiteral(exp)
+parse(json) match {
+  case JObject(options) =>
--- End diff --

We could try to utilize `parse(json).extract[Map[String, String]]`. 
From my observation, it produces an empty `Map` if there are no such values or 
the JSON is empty, and it throws an exception if the JSON is invalid.



[GitHub] spark pull request #16981: [SPARK-19637][SQL] Add from_json/to_json in Funct...

2017-02-17 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/16981

[SPARK-19637][SQL] Add from_json/to_json in FunctionRegistry

## What changes were proposed in this pull request?
This PR adds entries to `FunctionRegistry` to support 
`from_json`/`to_json` in SQL.

## How was this patch tested?
Added tests in `JsonFunctionsSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-19637

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/16981.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #16981


commit 8df67ec51cccab0d312f1236dff0c5f7506d9668
Author: Takeshi Yamamuro 
Date:   2017-02-17T09:05:02Z

Add from_json/to_json in FunctionRegistry



