richardc-db commented on code in PR #46312:
URL: https://github.com/apache/spark/pull/46312#discussion_r1586697875


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -84,9 +84,16 @@ object ResolveDefaultColumns extends QueryErrorsBase
     if (SQLConf.get.enableDefaultColumns) {
       val newFields: Seq[StructField] = tableSchema.fields.map { field =>
         if (field.metadata.contains(CURRENT_DEFAULT_COLUMN_METADATA_KEY)) {
-          val analyzed: Expression = analyze(field, statementType)
+          val defaultSql: String = if 
(field.dataType.isInstanceOf[VariantType]) {
+            // A variant's SQL/string representation is its JSON string which 
cannot be directly

Review Comment:
   Hmm, i believe all default values are already stored as string sql 
expressions (usually literal expressions). I.e. the existing values (as a 
string sql expression) is retrieved 
[here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala#L389)
 and is analyzed using `analyze` and eventually coerced into the target type 
[here](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala#L302).
 We could "coerce" the variant string into a variant by wrapping it with 
`parse_json` as mentioned in the PR description. However, this feels 
inefficient. We would evaluate the default expression to produce a variant, 
transform it to its string representation (by `expr.sql`) and then transform it 
back to its variant value later. Instead, we could just leave the expression as 
is and lazily evaluate it to avoid the extra variant-> strin
 g and string -> variant conversions.
   
   Regarding your other comment, I believe this doesn't work even if 
`ParseJson` can be constant folded (I quickly tried it) because `analyze` 
attempts to analyze the expression as a variant `{"k": "v"}`, which isn't valid 
sql syntax (there aren't even quotations around it). The root of the problem 
here is that there isn't a way to represent variant types as a literal



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to