Github user wzhfy commented on a diff in the pull request:

https://github.com/apache/spark/pull/19479#discussion_r150134850

--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
@@ -1034,11 +1034,18 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
           schema.fields.map(f => (f.name, f.dataType)).toMap
         stats.colStats.foreach { case (colName, colStat) =>
           colStat.toMap(colName, colNameTypeMap(colName)).foreach { case (k, v) =>
-            statsProperties += (columnStatKeyPropName(colName, k) -> v)
+            val statKey = columnStatKeyPropName(colName, k)
+            val threshold = conf.get(SCHEMA_STRING_LENGTH_THRESHOLD)
+            if (v.length > threshold) {
+              throw new AnalysisException(s"Cannot persist '$statKey' into hive metastore as " +
--- End diff --

Hive's exception is not friendly to Spark users. Without this check, a Spark user may not be able to tell what went wrong with their operation:
```
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. Put request failed : INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES (?,?,?)
org.datanucleus.exceptions.NucleusDataStoreException: Put request failed : INSERT INTO TABLE_PARAMS (PARAM_VALUE,TBL_ID,PARAM_KEY) VALUES (?,?,?)
...
Caused by: java.sql.SQLDataException: A truncation error was encountered trying to shrink VARCHAR 'TFo0QmxvY2smeREAANBdAAALz3IBM0AUAAEAQgPoP/ALAAQUACNAJBAAEy4I&' to length 4000.
...
```
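
For illustration, the idea behind the check can be sketched outside of Spark as a small guard that validates a statistic's serialized value before it is written as a table property. This is only a sketch under stated assumptions, not the code in the PR: `ColumnStatGuard`, `checkedProperty`, and `StatTooLongException` are hypothetical names, the exception stands in for Spark's `AnalysisException`, the default of 4000 is taken from the "length 4000" in the trace above, and the message wording is invented for the example.

```scala
// Hypothetical stand-in for Spark's AnalysisException, so the sketch has no Spark dependency.
final class StatTooLongException(message: String) extends Exception(message)

object ColumnStatGuard {
  // Assumed limit, matching the "length 4000" reported in the truncation error.
  val DefaultThreshold: Int = 4000

  /**
   * Returns the (key, value) pair unchanged when the value fits within the threshold.
   * Otherwise fails early with a message that names the offending key, instead of
   * letting the metastore's database reject the INSERT with a truncation error.
   */
  def checkedProperty(
      statKey: String,
      value: String,
      threshold: Int = DefaultThreshold): (String, String) = {
    if (value.length > threshold) {
      // Illustrative wording only; the PR's actual message is not reproduced here.
      throw new StatTooLongException(
        s"Column statistic '$statKey' is ${value.length} characters long, " +
          s"which exceeds the assumed metastore limit of $threshold characters.")
    }
    statKey -> value
  }
}
```

With a guard like this, a caller that builds the properties map would see the key name and the two lengths in the error, rather than a truncated base64 string from the JDBC layer.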