Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22775#discussion_r227251515

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala ---
    @@ -770,8 +776,17 @@ case class SchemaOfJson(
           factory
         }

    -  override def convert(v: UTF8String): UTF8String = {
    -    val dt = Utils.tryWithResource(CreateJacksonParser.utf8String(jsonFactory, v)) { parser =>
    +  @transient
    +  private lazy val json = child.eval().asInstanceOf[UTF8String]
    --- End diff --

Yea, that was my thought. I don't strongly feel we should enforce it in the future either, but I was thinking we'd better enforce it for 2.4.x (if the RC fails). I was thinking about other possibilities for using this expression with `from_json` (for instance, as an aggregation expression, or by automatically collecting a few examples from a column somehow and returning a schema). Actually, Max proposed an aggregation expression first; that was rejected by Reynold, and the current way was suggested by him.

> It's also weird that users are willing to write verbose json literal in from_json(..., schema = schema_of_json(...)) instead of DDL string.

Yea, it is, but that's what the current code does (https://github.com/apache/spark/blob/dbf9cabfb80c157cf0e52014b07c4abeb1aa2ff6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L803). Nothing serious, but I was just thinking about allowing only what we need for now, given there has been some discussion about it before.
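For context, a minimal sketch of the two usages being compared in the quote above — passing `schema_of_json` on a sample JSON literal versus passing a DDL-formatted string directly. This is not code from the PR; `df` and its string column `json` are hypothetical, and the snippet assumes the Spark 2.4 `from_json` overload that accepts a schema as a `Column`:

```scala
import org.apache.spark.sql.functions.{from_json, lit, schema_of_json}

// Hypothetical DataFrame `df` with a string column "json" holding values like {"a": 1}

// Variant 1: infer the schema from a sample JSON literal at analysis time.
// schema_of_json requires a foldable (literal) argument, which is the
// restriction being discussed in this review thread.
val inferred = df.select(
  from_json($"json", schema_of_json(lit("""{"a": 1}"""))))

// Variant 2: the equivalent, less verbose DDL string the reviewer mentions.
val ddl = df.select(from_json($"json", "a INT"))
```

Both produce a struct column with field `a`; the point of the quoted remark is that writing out a representative JSON literal is more verbose than the DDL string when the user already knows the schema.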
---