Maximum Partitioner size
Hi, I have implemented a custom Partitioner (org.apache.spark.Partitioner) that contains a medium-sized object (a few megabytes). Unfortunately Spark (2.1.0) fails with a StackOverflowError, and I suspect this is because of the size of the partitioner, which needs to be serialized. My question is: what is the maximum size of a Partitioner that Spark accepts? Thanks!
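For reference, a minimal sketch of the kind of partitioner I mean (the LookupPartitioner name and the embedded lookup Map are placeholders for illustration, not my actual code):

  import org.apache.spark.Partitioner

  // Sketch: a partitioner that embeds a lookup table of several megabytes.
  // The table is serialized and shipped along with the partitioner whenever
  // it is used in a shuffle dependency.
  class LookupPartitioner(lookup: Map[String, Int], parts: Int) extends Partitioner {
    require(parts > 0)

    override def numPartitions: Int = parts

    override def getPartition(key: Any): Int = key match {
      // Keys present in the lookup table go to their assigned partition,
      // everything else falls back to a hash-based partition.
      case k: String => lookup.getOrElse(k, nonNegativeMod(k.hashCode))
      case other     => nonNegativeMod(other.hashCode)
    }

    // Map an arbitrary hash code to the range [0, parts).
    private def nonNegativeMod(hash: Int): Int = {
      val mod = hash % parts
      if (mod < 0) mod + parts else mod
    }

    override def equals(other: Any): Boolean = other match {
      case p: LookupPartitioner => p.numPartitions == numPartitions
      case _                    => false
    }

    override def hashCode: Int = numPartitions
  }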
Support for Joda
Hi, I have an RDD of objects containing Joda's LocalDate. When I try to save the RDD as Parquet, I get an exception. Here is the code:

  val sqlC = new org.apache.spark.sql.SQLContext(sc)
  import sqlC._
  myRDD.saveAsParquetFile("parquet")

The exception:

  Exception in thread "main" scala.MatchError: org.joda.time.LocalDate (of class scala.reflect.internal.Types$TypeRef$$anon$6)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:105)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:33)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:125)
    at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:123)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at scala.collection.AbstractTraversable.map(Traversable.scala:105)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:123)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:33)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:100)
    at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:33)
    at org.apache.spark.sql.catalyst.ScalaReflection$class.attributesFor(ScalaReflection.scala:94)
    at org.apache.spark.sql.catalyst.ScalaReflection$.attributesFor(ScalaReflection.scala:33)
    at org.apache.spark.sql.SQLContext.createSchemaRDD(SQLContext.scala:111)

Is it possible to extend Spark with adapters in order to support new types? How can I add support for Joda types? I am using Spark 1.2.1 with Cloudera 5.3.2.
Patrick.
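For reference, the kind of manual conversion I would like to avoid writing by hand looks roughly like this (MyRecord and MyRecordOut are hypothetical case classes standing in for my actual types):

  import org.joda.time.LocalDate

  // Hypothetical record types for illustration: MyRecord carries a Joda
  // LocalDate, MyRecordOut carries the same value as an ISO-8601 string,
  // which Spark SQL's reflection already knows how to map to a column.
  case class MyRecord(id: Long, date: LocalDate)
  case class MyRecordOut(id: Long, date: String)

  val sqlC = new org.apache.spark.sql.SQLContext(sc)
  import sqlC._

  // Convert the unsupported Joda type to a String before the implicit
  // SchemaRDD conversion runs, then save as Parquet as before.
  // LocalDate.toString yields the ISO date form, e.g. "2015-03-12".
  val converted = myRDD.map(r => MyRecordOut(r.id, r.date.toString))
  converted.saveAsParquetFile("parquet")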