[ 
https://issues.apache.org/jira/browse/SPARK-30687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon resolved SPARK-30687.
----------------------------------
    Resolution: Cannot Reproduce

> When reading from a file with pre-defined schema and encountering a single 
> value that is not the same type as that of its column, Spark nullifies the 
> entire row
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-30687
>                 URL: https://issues.apache.org/jira/browse/SPARK-30687
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Bao Nguyen
>            Priority: Major
>
> When reading a file with a pre-defined schema, if a single value does not 
> match the type of its column, Spark nullifies the entire row instead of 
> setting only that cell to null.
>  
> {code:java}
> import org.apache.spark.sql.catalyst.ScalaReflection
> import org.apache.spark.sql.types.StructType
>
> case class TestModel(
>   num: Double, test: String, mac: String, value: Double
> )
> // Derive the read schema from the case class.
> val schema = ScalaReflection.schemaFor[TestModel].dataType.asInstanceOf[StructType]
> // Contents of the file test.data:
> //    1~test~mac1~2
> //    1.0~testdatarow2~mac2~non-numeric
> //    2~test1~mac1~3
> val ds = spark
>   .read
>   .schema(schema)
>   .option("delimiter", "~")
>   .csv("/test-data/test.data")
> ds.show()
> // Actual content of the data frame: the second row is all null.
> // +----+-----+----+-----+
> // | num| test| mac|value|
> // +----+-----+----+-----+
> // | 1.0| test|mac1|  2.0|
> // |null| null|null| null|
> // | 2.0|test1|mac1|  3.0|
> // +----+-----+----+-----+
> // Expected content: only the mismatched cell is null.
> // +----+------------+----+-----+
> // | num|        test| mac|value|
> // +----+------------+----+-----+
> // | 1.0|        test|mac1|  2.0|
> // | 1.0|testdatarow2|mac2| null|
> // | 2.0|       test1|mac1|  3.0|
> // +----+------------+----+-----+
> {code}
>  
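The per-cell behaviour the report expects can be illustrated without Spark. The following is a minimal plain-Scala sketch, not Spark's parser: `ParsedRow`, `PerCellParse`, and `parseRow` are hypothetical names introduced here, and nullable numeric cells are modelled as `Option[Double]` rather than SQL nulls.

```scala
import scala.util.Try

// Hypothetical model mirroring TestModel from the report;
// numeric cells that fail to parse are represented as None.
final case class ParsedRow(
  num: Option[Double], test: String, mac: String, value: Option[Double]
)

object PerCellParse {
  // Parse one "~"-delimited line. A cell that cannot be converted to
  // Double becomes None, while the other cells of the row are kept.
  def parseRow(line: String): ParsedRow = {
    val cells = line.split("~", -1).map(_.trim)
    // Missing trailing cells are treated as empty strings.
    def cell(i: Int): String = if (i < cells.length) cells(i) else ""
    def toDoubleOpt(i: Int): Option[Double] = Try(cell(i).toDouble).toOption
    ParsedRow(toDoubleOpt(0), cell(1), cell(2), toDoubleOpt(3))
  }

  def main(args: Array[String]): Unit = {
    // The three sample lines of test.data from the report.
    val lines = Seq(
      "1~test~mac1~2",
      "1.0~testdatarow2~mac2~non-numeric",
      "2~test1~mac1~3"
    )
    lines.map(parseRow).foreach(println)
  }
}
```

In this sketch the second sample line keeps `num`, `test`, and `mac`, and only `value` is empty, which matches the "should be" table in the report.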


