[ https://issues.apache.org/jira/browse/SPARK-10422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Davies Liu resolved SPARK-10422.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

Issue resolved by pull request 8578
[https://github.com/apache/spark/pull/8578]

> String column in InMemoryColumnarCache needs to override clone method
> ---------------------------------------------------------------------
>
>                 Key: SPARK-10422
>                 URL: https://issues.apache.org/jira/browse/SPARK-10422
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.5.0
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>             Fix For: 1.5.0
>
>
> We have a clone method in {{ColumnType}}
> (https://github.com/apache/spark/blob/v1.5.0-rc3/sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala#L103).
> Seems we need to override it for String
> (https://github.com/apache/spark/blob/v1.5.0-rc3/sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala#L314)
> because we are dealing with UTF8String.
> {code}
> val df =
>   ctx.range(1, 30000).selectExpr("id % 500 as id").rdd.map(id => Tuple1(s"str_$id")).toDF("i")
> val cached = df.cache()
> cached.count()
> [info] - SPARK-10422: String column in InMemoryColumnarCache needs to override clone method *** FAILED *** (9 seconds, 152 milliseconds)
> [info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 0.0 failed 1 times, most recent failure: Lost task 1.0 in stage 0.0 (TID 1, localhost): java.util.NoSuchElementException: key not found: str_[0]
> [info]   at scala.collection.MapLike$class.default(MapLike.scala:228)
> [info]   at scala.collection.AbstractMap.default(Map.scala:58)
> [info]   at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
> [info]   at org.apache.spark.sql.columnar.compression.DictionaryEncoding$Encoder.compress(compressionSchemes.scala:258)
> [info]   at org.apache.spark.sql.columnar.compression.CompressibleColumnBuilder$class.build(CompressibleColumnBuilder.scala:110)
> [info]   at org.apache.spark.sql.columnar.NativeColumnBuilder.build(ColumnBuilder.scala:87)
> [info]   at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
> [info]   at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1$$anonfun$next$2.apply(InMemoryColumnarTableScan.scala:152)
> [info]   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> [info]   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
> [info]   at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> [info]   at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
> [info]   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
> [info]   at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
> [info]   at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:152)
> [info]   at org.apache.spark.sql.columnar.InMemoryRelation$$anonfun$3$$anon$1.next(InMemoryColumnarTableScan.scala:120)
> {code}
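
For readers wondering why a missing clone could surface as "key not found" inside a HashMap lookup, here is a minimal, self-contained sketch of that general failure mode. It does not use Spark's classes: BufferBackedString, CloneSketch and buildDictionary are hypothetical names made up for illustration, and the premise that a string value may be a view over a buffer that is overwritten for the next row is an assumption about the cause, not something the report above states. The sketch stores such a view as a dictionary key, once as-is and once as a copy.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import scala.collection.mutable

// Hypothetical stand-in for a string value that is only a view over a byte buffer.
// Equality and hashing depend on the current buffer contents.
final class BufferBackedString(val bytes: Array[Byte]) {
  override def hashCode(): Int = java.util.Arrays.hashCode(bytes)
  override def equals(other: Any): Boolean = other match {
    case that: BufferBackedString => java.util.Arrays.equals(bytes, that.bytes)
    case _ => false
  }
  // Deep copy: the result owns its bytes and no longer tracks the shared buffer.
  def copy(): BufferBackedString = new BufferBackedString(bytes.clone())
  override def toString: String = new String(bytes, UTF_8)
}

object CloneSketch {
  // Builds a value -> id dictionary while reading every input through ONE reused buffer.
  // All inputs are assumed to be exactly 5 bytes long to keep the sketch short.
  def buildDictionary(
      inputs: Seq[String],
      cloneKey: BufferBackedString => BufferBackedString
  ): mutable.HashMap[BufferBackedString, Short] = {
    val buffer = new Array[Byte](5)            // overwritten for each "row"
    val view = new BufferBackedString(buffer)  // always points at the shared buffer
    val dictionary = mutable.HashMap.empty[BufferBackedString, Short]
    inputs.foreach { s =>
      System.arraycopy(s.getBytes(UTF_8), 0, buffer, 0, buffer.length)
      if (!dictionary.contains(view)) {
        dictionary(cloneKey(view)) = dictionary.size.toShort
      }
    }
    dictionary
  }

  def main(args: Array[String]): Unit = {
    val inputs = Seq("str_0", "str_1", "str_2")
    val probe = new BufferBackedString("str_0".getBytes(UTF_8))

    // No copy: every stored key is the same view, whose contents are now "str_2".
    val shared = buildDictionary(inputs, v => v)
    // With copy: each key keeps the bytes it had when it was inserted.
    val cloned = buildDictionary(inputs, _.copy())

    println(shared.get(probe)) // None
    println(cloned.get(probe)) // Some(0)
  }
}
```

In the uncopied case the map's keys change underneath it after insertion, so a later lookup by the original value finds nothing, which matches the shape of the NoSuchElementException in the trace above. Copying the key at insertion time is the moral equivalent of overriding clone for the string column type.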