[ 
https://issues.apache.org/jira/browse/SPARK-49044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-49044:
-----------------------------------
    Labels: pull-request-available  (was: )

> Improve error message in ValidateExternalType
> ---------------------------------------------
>
>                 Key: SPARK-49044
>                 URL: https://issues.apache.org/jira/browse/SPARK-49044
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 4.0.0
>            Reporter: Mark Andreev
>            Priority: Trivial
>              Labels: pull-request-available
>
> When rows do not match the schema, the error message "\{actual} is not a valid 
> external type for schema of \{expected}" does not help identify the column 
> that caused the problem. I suggest adding information about the source column.
> h2. How to reproduce
> {code:java}
> class ErrorMsgSuite extends AnyFunSuite with SharedSparkContext {
>   test("shouldThrowSchemaError") {
>     val seq: Seq[Row] = Seq(
>       Row(
>         toBytes("0"),
>         toBytes(""),
>         1L,
>       ),
>       Row(
>         toBytes("0"),
>         toBytes(""),
>         1L,
>       ),
>     )
>
>     val schema: StructType = new StructType()
>       .add("f1", BinaryType)
>       .add("f3", StringType)
>       .add("f2", LongType)
>
>     val df = sqlContext.createDataFrame(sqlContext.sparkContext.parallelize(seq), schema)
>
>     val exception = intercept[RuntimeException] {
>       df.show()
>     }
>
>     assert(
>       exception.getCause.getMessage
>         .contains("[B is not a valid external type for schema of string")
>     )
>     assertResult(
>       "[B is not a valid external type for schema of string"
>     )(exception.getCause.getMessage)
>   }
>
>   def toBytes(x: String): Array[Byte] = x.toCharArray.map(_.toByte)
> }
> {code}
> After the fix, the error message may contain extra information:
> {code:java}
> [B is not a valid external type for schema of string at 
> getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 
> 1, f3) {code}
> Code: 
> [https://github.com/mrk-andreev/example-spark-schema/blob/main/spark_4.0.0/src/test/scala/ErrorMsgSuite.scala]
>  
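The idea in the issue (name the offending column in the type-mismatch message) can be illustrated with a small Spark-free sketch. Everything here is an assumption made for illustration: Field, ValidateSketch, externalTypeMatches, validateRow, and the type labels "binary", "string", and "long" are hypothetical names, not Spark APIs, and the message suffix is not the format the actual fix produces. Null values are not handled.

```scala
// Hypothetical, Spark-free sketch; none of these names are Spark APIs.
final case class Field(name: String, typeName: String)

object ValidateSketch {
  // True when the runtime value is an acceptable external type for the
  // given (illustrative) type label.
  def externalTypeMatches(value: Any, typeName: String): Boolean = typeName match {
    case "binary" => value.isInstanceOf[Array[Byte]]
    case "string" => value.isInstanceOf[String]
    case "long"   => value.isInstanceOf[Long]
    case _        => false
  }

  // Checks each value against its field; on the first mismatch, returns a
  // message that includes the column name and index. Assumes non-null values.
  def validateRow(values: Seq[Any], schema: Seq[Field]): Either[String, Unit] =
    values.zip(schema).zipWithIndex.collectFirst {
      case ((value, field), index) if !externalTypeMatches(value, field.typeName) =>
        s"${value.getClass.getName} is not a valid external type for schema of " +
          s"${field.typeName} at column ${field.name} (index $index)"
    }.toLeft(())
}
```

With a schema of f1 (binary), f3 (string), f2 (long) and a row whose second value is a byte array, as in the reproduction above, validateRow returns a Left whose message ends with "at column f3 (index 1)", which is the kind of pointer the issue asks for.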



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
