ksbeyer commented on code in PR #51466:
URL: https://github.com/apache/spark/pull/51466#discussion_r2237947869


##########
sql/core/src/test/resources/sql-tests/analyzer-results/charvarchar.sql.out:
##########
@@ -376,7 +376,7 @@ CreateDataSourceTableCommand 
`spark_catalog`.`default`.`char_tbl4`, false
 insert into char_tbl4 select c, c, v, c from str_view
 -- !query analysis
 InsertIntoHadoopFsRelationCommand file:[not included in 
comparison]/{warehouse_dir}/char_tbl4, false, Parquet, [path=file:[not included 
in comparison]/{warehouse_dir}/char_tbl4], Append, 
`spark_catalog`.`default`.`char_tbl4`, 
org.apache.spark.sql.execution.datasources.InMemoryFileIndex(file:[not included 
in comparison]/{warehouse_dir}/char_tbl4), [c7, c8, v, s]
-+- Project 
[static_invoke(CharVarcharCodegenUtils.charTypeWriteSideCheck(cast(c#x as 
string), 7)) AS c7#x, 
static_invoke(CharVarcharCodegenUtils.charTypeWriteSideCheck(cast(c#x as 
string), 8)) AS c8#x, 
static_invoke(CharVarcharCodegenUtils.varcharTypeWriteSideCheck(cast(v#x as 
string), 6)) AS v#x, cast(c#x as string) AS s#x]
++- Project 
[static_invoke(CharVarcharCodegenUtils.charTypeWriteSideCheck(cast(c#x as 
string), 7)) AS c7#x, 
static_invoke(CharVarcharCodegenUtils.charTypeWriteSideCheck(cast(c#x as 
string), 8)) AS c8#x, 
static_invoke(CharVarcharCodegenUtils.varcharTypeWriteSideCheck(cast(v#x as 
string), 6)) AS v#x, c#x AS s#x]

Review Comment:
   The cast is avoided by the change in 
      org.apache.spark.sql.catalyst.analysis.TableOutputResolver.checkField
   from
      if (isCompatible(tableAttr, queryExpr))
   where isCompatible:
       DataTypeUtils.sameType(tableAttr.dataType, queryExpr.dataType) &&
         tableAttr.name == queryExpr.name &&
         tableAttr.metadata == queryExpr.metadata
   to
     if (DataTypeUtils.sameType(tableAttr.dataType, queryExpr.dataType))  
   
   The name check is not required because we will make sure the name is correct 
in applyColumnMetadata.  The metadata check seems wrong. Why do we care if the 
table side has a default value that matches the query side?  
   
   I'm guessing the concern is solely for the char/varchar metadata.  For this, 
the cast did help in a complicated way: it helped to keep the char/varchar 
metadata from propagating to the write command, but it is a no-op cast from 
String (with char/varchar metadata) to String as far as i can tell.  With the 
new code, applyColumnMetadata will keep the char/varchar metadata from 
propagating (along with the rewrite changes to not undo its work).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to