[ https://issues.apache.org/jira/browse/SPARK-19729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15885758#comment-15885758 ]
Hyukjin Kwon commented on SPARK-19729:
--------------------------------------

Sorry, I am a bit confused. I tried the following:

{code}
scala> Seq("var1,var2,,").toDF().write.text("/tmp/testcsv")

scala> val df = spark.read.csv("/tmp/testcsv")
df: org.apache.spark.sql.DataFrame = [_c0: string, _c1: string ... 2 more fields]

scala> df.show()
+----+----+----+----+
| _c0| _c1| _c2| _c3|
+----+----+----+----+
|var1|var2|null|null|
+----+----+----+----+

scala> val row = df.first()
row: org.apache.spark.sql.Row = [var1,var2,null,null]

scala> row.size
res19: Int = 4

scala> row.fieldIndex("_c2")
res20: Int = 2

scala> row.getAs[String]("_c2")
res21: String = null

scala> row.get(2)
res22: Any = null

scala> print(row)
[var1,var2,null,null]
{code}

Could you tell me which of these results looks like an issue to you?

> Strange behaviour with reading csv with schema into dataframe
> -------------------------------------------------------------
>
>                 Key: SPARK-19729
>                 URL: https://issues.apache.org/jira/browse/SPARK-19729
>             Project: Spark
>          Issue Type: Bug
>          Components: Java API, SQL
>    Affects Versions: 2.0.1
>            Reporter: Mazen Melouk
>
> I have the following schema
> [{first,string_type,false}
> ,{second,string_type,false}
> ,{third,string_type,false}
> ,{fourth,string_type,false}]
> Example lines:
> var1,var2,,
> when accessing the row I get the following
> row.size =4
> row.fieldIndex(third_string)=2
> row.getAs(third_string)=var2
> row.get(2)=var2
> print(row)= var1,var2
> Any idea why the null values are missing?