[ https://issues.apache.org/jira/browse/SPARK-49044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-49044:
-----------------------------------
    Labels: pull-request-available  (was: )

> Improve error message in ValidateExternalType
> ---------------------------------------------
>
>                 Key: SPARK-49044
>                 URL: https://issues.apache.org/jira/browse/SPARK-49044
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Mark Andreev
>            Priority: Trivial
>              Labels: pull-request-available
>
> When we have rows with a mixed schema, the error message "\{actual} is not a valid
> external type for schema of \{expected}" does not help identify which column has
> the problem. I suggest adding information about the source column.
> h2. How to reproduce
> {code:java}
> class ErrorMsgSuite extends AnyFunSuite with SharedSparkContext {
>   test("shouldThrowSchemaError") {
>     val seq: Seq[Row] = Seq(
>       Row(
>         toBytes("0"),
>         toBytes(""),
>         1L,
>       ),
>       Row(
>         toBytes("0"),
>         toBytes(""),
>         1L,
>       ),
>     )
>
>     val schema: StructType = new StructType()
>       .add("f1", BinaryType)
>       .add("f3", StringType)
>       .add("f2", LongType)
>
>     val df = sqlContext.createDataFrame(sqlContext.sparkContext.parallelize(seq), schema)
>
>     val exception = intercept[RuntimeException] {
>       df.show()
>     }
>
>     assert(
>       exception.getCause.getMessage
>         .contains("[B is not a valid external type for schema of string")
>     )
>     assertResult(
>       "[B is not a valid external type for schema of string"
>     )(exception.getCause.getMessage)
>   }
>
>   def toBytes(x: String): Array[Byte] = x.toCharArray.map(_.toByte)
> }
> {code}
> After the fix, the error message may contain extra info:
> {code:java}
> [B is not a valid external type for schema of string at
> getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, f3)
> {code}
> Code:
> [https://github.com/mrk-andreev/example-spark-schema/blob/main/spark_4.0.0/src/test/scala/ErrorMsgSuite.scala]
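For illustration only, below is a minimal, self-contained Scala sketch of the message format proposed above. The object and method names are hypothetical and this is not Spark's actual ValidateExternalType code; it only shows how appending the source-field expression to the existing message could look.

{code:java}
// Hypothetical sketch for this ticket, not Spark internals.
// Only the desired output shape is taken from the description above.
object ImprovedErrorMsgSketch {
  // Builds the proposed message: the current text plus the source-field expression.
  def improvedMessage(actual: Class[_], expectedType: String, sourceField: String): String =
    s"${actual.getName} is not a valid external type for schema of $expectedType at $sourceField"

  def main(args: Array[String]): Unit = {
    // Prints:
    // [B is not a valid external type for schema of string at getexternalrowfield(...), 1, f3)
    println(improvedMessage(
      classOf[Array[Byte]],
      "string",
      "getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, f3)"))
  }
}
{code}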