[ 
https://issues.apache.org/jira/browse/SPARK-18558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15703424#comment-15703424
 ] 

Miao Wang commented on SPARK-18558:
-----------------------------------

scala> val df = spark.read.option("header", "true").option("inferSchema", "true").format("csv").load("example.csv")
df: org.apache.spark.sql.DataFrame = [column1: int]

scala> df.printSchema
root
 |-- column1: integer (nullable = true)


scala> df.show(5)
+-------+
|column1|
+-------+
|      1|
|      2|
|   null|
+-------+

Same here.

> spark-csv: infer data type for mixed integer/null columns causes exception
> --------------------------------------------------------------------------
>
>                 Key: SPARK-18558
>                 URL: https://issues.apache.org/jira/browse/SPARK-18558
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.0.2
>            Reporter: Peter Rose
>
> Null pointer exception when using the following csv file:
> example.csv:
> column1
> "1"
> "2"
> ""
>  Dataset<Row> df = spark
>                       .read()
>                       .option("header", "true")
>                       .option("inferSchema", "true")
>                       .format("csv")
>                       .load("example.csv");
>  df.printSchema();
> The type is correctly inferred:
> root
>  |-- column1: integer (nullable = true)
> df.show(5);
> The show method leads to this exception:
> java.lang.NumberFormatException: null
>       at java.lang.Integer.parseInt(Integer.java:542) ~[?:1.8.0_25]
>       at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_25]
>       at scala.collection.immutable.StringLike$class.toInt(StringLike.scala:272) ~[scala-library-2.11.8.jar:?]
>       at scala.collection.immutable.StringOps.toInt(StringOps.scala:29) ~[scala-library-2.11.8.jar:?]
>       at org.apache.spark.sql.execution.datasources.csv.CSVTypeCast$.castTo(CSVInferSchema.scala:241) ~[spark-sql_2.11-2.0.2.jar:2.0.2]
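
The quoted trace ends in Integer.parseInt receiving a null string from the CSV cast. The following is a minimal sketch of that failure mode and of a null-safe cast, assuming the empty quoted field reaches the cast as null. The class NullSafeIntCast and the helper castToInt are hypothetical names for illustration only; they are not Spark's CSVTypeCast.castTo and do not show how Spark handles this internally.

```java
public class NullSafeIntCast {

    // Hypothetical helper (not Spark's CSVTypeCast.castTo): returns null for a
    // missing or empty field in a nullable column instead of trying to parse it.
    static Integer castToInt(String datum, boolean nullable) {
        if (datum == null || datum.isEmpty()) {
            if (nullable) {
                return null;
            }
            throw new IllegalArgumentException("missing value in non-nullable column");
        }
        return Integer.parseInt(datum);
    }

    public static void main(String[] args) {
        // Parsing a null string directly throws NumberFormatException,
        // which is the exception type at the top of the reported trace.
        try {
            Integer.parseInt(null);
        } catch (NumberFormatException e) {
            System.out.println("unguarded parse failed: " + e.getMessage());
        }

        // With the guard, the three fields of example.csv (assumed to arrive
        // as "1", "2" and null) map to 1, 2 and null.
        String[] fields = {"1", "2", null};
        for (String field : fields) {
            System.out.println(castToInt(field, true));
        }
    }
}
```

Under the same assumption, the guarded cast produces the 1, 2, null column seen in the df.show(5) output earlier in this comment.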



