Maximum Partitioner size

2017-04-20 Thread Patrick GRANDJEAN
Hi,
I have implemented a custom Partitioner (org.apache.spark.Partitioner) that 
contains a medium-sized object (a few megabytes). Unfortunately, Spark (2.1.0) 
fails with a StackOverflowError, and I suspect it is because of the size of the 
Partitioner, which has to be serialized. My question is: what is the maximum 
size of a Partitioner that Spark accepts?
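For illustration, here is a minimal sketch of the shape of my Partitioner (the
class name and the lookup map are placeholders, not the real code):

-
import org.apache.spark.Partitioner

// Illustrative only: a partitioner that embeds a multi-megabyte lookup table.
// The whole map is serialized together with the partitioner for the shuffle.
class LookupPartitioner(lookup: Map[String, Int], partitions: Int) extends Partitioner {

  override def numPartitions: Int = partitions

  override def getPartition(key: Any): Int = {
    val bucket = lookup.getOrElse(key.toString, key.hashCode)
    val mod = bucket % numPartitions
    if (mod < 0) mod + numPartitions else mod
  }
}
-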
Thanks!





Support for Joda

2015-04-08 Thread Patrick Grandjean
Hi,

I have an RDD with objects containing Joda's LocalDate. When trying to save
the RDD as Parquet, I get an exception. Here is the code:

-
val sqlC = new org.apache.spark.sql.SQLContext(sc)
import sqlC._

myRDD.saveAsParquetFile("parquet")
-

The exception:

Exception in thread "main" scala.MatchError: org.joda.time.LocalDate (of class scala.reflect.internal.Types$TypeRef$$anon$6)
at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:105)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:33)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:125)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$schemaFor$1.apply(ScalaReflection.scala:123)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:123)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:33)
at org.apache.spark.sql.catalyst.ScalaReflection$class.schemaFor(ScalaReflection.scala:100)
at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:33)
at org.apache.spark.sql.catalyst.ScalaReflection$class.attributesFor(ScalaReflection.scala:94)
at org.apache.spark.sql.catalyst.ScalaReflection$.attributesFor(ScalaReflection.scala:33)
at org.apache.spark.sql.SQLContext.createSchemaRDD(SQLContext.scala:111)

Is it possible to extend Spark with adapters in order to support new types?
How can I add support for Joda types?
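
For illustration, one workaround I could imagine (untested sketch; the case
classes are placeholders for my actual schema) is to convert the Joda field to
a type Spark SQL's schema inference already handles before saving:

-
import java.sql.Timestamp
import org.joda.time.LocalDate

// Placeholder case classes: In holds the Joda LocalDate, Out holds a
// java.sql.Timestamp, which Catalyst's reflection does know how to map.
case class In(id: String, date: LocalDate)
case class Out(id: String, date: Timestamp)

// Convert before saving; relies on the createSchemaRDD implicit from `import sqlC._`.
val converted = myRDD.map { r: In =>
  Out(r.id, new Timestamp(r.date.toDate.getTime))
}
converted.saveAsParquetFile("parquet_converted")
-

This would sidestep the MatchError in ScalaReflection, but registering the Joda
types directly would obviously be nicer.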

I am using Spark 1.2.1 with Cloudera 5.3.2.

Patrick.