getAs is defined as:

    def getAs[T](i: Int): T = get(i).asInstanceOf[T]
and when you call toString, you are calling Object.toString, which doesn't depend on the type, so the asInstanceOf[T] gets dropped by the compiler, i.e.

    row.getAs[Int](0).toString  ->  row.get(0).toString

We can confirm that by writing a simple Scala program:

    import org.apache.spark.sql._

    object Test {
      val row = Row(null)
      row.getAs[Int](0).toString
    }

and then compiling it:

    $ scalac -classpath "$SPARK_HOME/jars/*" -print test.scala
    [[syntax trees at end of cleanup]] // test.scala
    package <empty> {
      object Test extends Object {
        private[this] val row: org.apache.spark.sql.Row = _;
        <stable> <accessor> def row(): org.apache.spark.sql.Row = Test.this.row;
        def <init>(): Test.type = {
          Test.super.<init>();
          Test.this.row = org.apache.spark.sql.Row.apply(
            scala.this.Predef.genericWrapArray(Array[Object]{null}));
          Test.this.row().getAs(0).toString();
          ()
        }
      }
    }

Note the line `Test.this.row().getAs(0).toString();` — the cast is gone, so toString is invoked directly on the null value, which is what throws the NullPointerException.

So the proper way would be:

    String.valueOf(row.getAs[Int](0))

On Tue, Dec 19, 2017 at 4:23 AM, Anurag Sharma <anu...@logistimo.com> wrote:

> The following Scala (Spark 1.6) code for reading a value from a Row fails
> with a NullPointerException when the value is null.
>
>     val test = row.getAs[Int]("ColumnName").toString
>
> while this works fine:
>
>     val test1 = row.getAs[Int]("ColumnName") // returns 0 for null
>     val test2 = test1.toString               // converts to String fine
>
> What is causing the NullPointerException, and what is the recommended way
> to handle such cases?
>
> PS: I am getting the row from a DataFrame as follows:
>
>     val myRDD = myDF
>       .repartition(partitions)
>       .mapPartitions { rows =>
>         rows.flatMap { row =>
>           functionWithRows(row) // has the above logic to read the null
>                                 // column, which fails
>         }
>       }
>
> functionWithRows then throws the above-mentioned NullPointerException.
>
> MyDF schema:
>
>     root
>      |-- LDID: string (nullable = true)
>      |-- KTAG: string (nullable = true)
>      |-- ColumnName: integer (nullable = true)
>
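The erasure behavior above can be reproduced without a Spark dependency. Below is a minimal sketch in which a Seq[Any] stands in for Row's backing array; Demo, values, and the local getAs are illustrative names for this example only, not Spark API:

```scala
// Stand-in for Row: a Seq[Any] holding a null.
object Demo {
  val values: Seq[Any] = Seq(null)

  // Same shape as Row.getAs: an unchecked cast that is erased at runtime.
  def getAs[T](i: Int): T = values(i).asInstanceOf[T]

  def main(args: Array[String]): Unit = {
    // Assigning to an Int first forces unboxing; unboxing null yields 0.
    val n: Int = getAs[Int](0)
    println(n.toString) // prints "0"

    // String.valueOf also goes through the unboxed value.
    println(String.valueOf(getAs[Int](0))) // prints "0"

    // Calling .toString directly erases the cast: it compiles to
    // values(0).toString, i.e. toString on null.
    try {
      getAs[Int](0).toString
    } catch {
      case _: NullPointerException => println("NPE")
    }
  }
}
```

Running Demo prints "0", "0", "NPE", mirroring why test1/test2 in the question work while the chained call fails.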