Pierre Borckmans created SPARK-12477:
----------------------------------------
Summary: [SQL] Tungsten projection fails for null values in array fields
Key: SPARK-12477
URL: https://issues.apache.org/jira/browse/SPARK-12477
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 1.5.2, 1.6.0
Reporter: Pierre Borckmans

Accessing null elements in an array field fails when Tungsten is enabled.
The following code works in Spark 1.3.1, and in Spark > 1.5 with Tungsten disabled:

```
// Array of String
case class AS(as: Seq[String])
val dfAS = sc.parallelize(Seq(AS(Seq("a", null, "b")))).toDF
dfAS.registerTempTable("T_AS")
for (i <- 0 to 10) { println(i + " = " + sqlContext.sql(s"select as[$i] from T_AS").collect.mkString(",")) }

// Array of Int
case class AI(ai: Seq[Option[Int]])
val dfAI = sc.parallelize(Seq(AI(Seq(Some(1), None, Some(2))))).toDF
dfAI.registerTempTable("T_AI")
for (i <- 0 to 10) { println(i + " = " + sqlContext.sql(s"select ai[$i] from T_AI").collect.mkString(",")) }

// Array of struct[Int, String]
case class B(x: Option[Int], y: String)
case class A(b: Seq[B])
val df1 = sc.parallelize(Seq(
  A(Seq(B(Some(1), "a"), B(Some(2), "b"), B(None, "c"), B(Some(4), null), B(None, null), null))
)).toDF
df1.registerTempTable("T1")
val df2 = sc.parallelize(Seq(
  A(Seq(B(Some(1), "a"), B(Some(2), "b"), B(None, "c"), B(Some(4), null), B(None, null), null)),
  A(null)
)).toDF
df2.registerTempTable("T2")
for (i <- 0 to 10) { println(i + " = " + sqlContext.sql(s"select b[$i].x, b[$i].y from T1").collect.mkString(",")) }
for (i <- 0 to 10) { println(i + " = " + sqlContext.sql(s"select b[$i].x, b[$i].y from T2").collect.mkString(",")) }

// Struct[Int, String]
case class C(b: B)
val df3 = sc.parallelize(Seq(C(B(Some(1), "test")), C(null))).toDF
df3.registerTempTable("T3")
sqlContext.sql("select b.x, b.y from T3").collect
```

With Tungsten enabled, this throws a NullPointerException.
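Until this is fixed, a possible workaround is to disable Tungsten for the session. This is a sketch assuming the `spark.sql.tungsten.enabled` configuration flag of the 1.5.x line; it may no longer have any effect on 1.6.x, where the Tungsten code path is the default:

```shell
# Launch a shell with Tungsten disabled (Spark 1.5.x; flag assumed, check your version's docs)
spark-shell --conf spark.sql.tungsten.enabled=false
```

The same flag can also be set at runtime via `sqlContext.setConf("spark.sql.tungsten.enabled", "false")` before running the queries above.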
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)