[ https://issues.apache.org/jira/browse/SPARK-11553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000616#comment-15000616 ]

Bartlomiej Alberski edited comment on SPARK-11553 at 11/11/15 5:10 PM:
-----------------------------------------------------------------------

OK, I think I know what the problem is. It can be reproduced with Scala 2.11.6 
and the DataFrame API.

If you are using the DataFrame API from Scala and you try to get an 
Int|Long|Boolean etc. value (a type that extends AnyVal), you will receive the 
"zero value" specific to that type (0 for Long and Int, false for Boolean, 
etc.), while the API documentation suggests that an NPE will be raised.

Example modified to illustrate the problem (from 
http://spark.apache.org/docs/latest/sql-programming-guide.html#dataframes):
{code}
val sc: SparkContext // An existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val df = sqlContext.read.json("examples/src/main/resources/people.json")

// Displays the content of the DataFrame to stdout
df.show()
// "age" is null for one of the sample records, yet getLong silently
// returns 0 for that row instead of throwing an NPE.
val res = df.map(x => x.getLong(x.fieldIndex("age"))).collect()
println(res.mkString(","))
{code}
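
Until this is fixed, a caller can work around it by checking Row.isNullAt 
before calling the primitive getter; a minimal sketch against the example 
above:

{code}
// Workaround sketch: check isNullAt before calling the primitive getter,
// since getLong currently returns 0 for a null value instead of failing.
val ages = df.map { row =>
  val idx = row.fieldIndex("age")
  if (row.isNullAt(idx)) None else Some(row.getLong(idx))
}.collect()
println(ages.mkString(","))
{code}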

The problem comes from the implementation of the getInt|getFloat|getBoolean|... 
methods:
{code}
def getInt(i: Int): Int = getAs[Int](i)
def getAs[T](i: Int): T = get(i).asInstanceOf[T]
{code}

null.asInstanceOf[Long] returns 0, because Long extends AnyVal and therefore 
cannot be null.

Example invocations from the Scala REPL:
{code}
scala> null.asInstanceOf[Int]
res0: Int = 0

scala> null.asInstanceOf[Long]
res1: Long = 0

scala> null.asInstanceOf[Short]
res2: Short = 0

scala> null.asInstanceOf[Boolean]
res3: Boolean = false

scala> null.asInstanceOf[Double]
res4: Double = 0.0

scala> null.asInstanceOf[Float]
res5: Float = 0.0
{code}
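
A possible fix (just a sketch; the helper name getAnyValAs is mine, not 
necessarily what the PR will use) is to null-check before the cast in every 
getter that returns an AnyVal type:

{code}
// Hypothetical fix sketch inside Row: fail fast with an NPE when the
// underlying value is null and a primitive (AnyVal) type was requested.
private def getAnyValAs[T <: AnyVal](i: Int): T =
  if (isNullAt(i)) throw new NullPointerException(s"Value at index $i is null")
  else getAs[T](i)

// The primitive getters would then delegate to the null-checking helper:
def getInt(i: Int): Int = getAnyValAs[Int](i)
def getLong(i: Int): Long = getAnyValAs[Long](i)
{code}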

I will be more than happy to prepare a PR solving this issue.


> row.getInt(i) if row[i]=null returns 0
> --------------------------------------
>
>                 Key: SPARK-11553
>                 URL: https://issues.apache.org/jira/browse/SPARK-11553
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Tofigh
>            Priority: Minor
>
> row.getInt|getFloat|getDouble on a Spark SQL Row returns 0 if row[index] is 
> null. (Even though, according to the documentation, they should throw a 
> NullPointerException.)


