Re: Spark streaming: java.lang.ClassCastException: org.apache.spark.util.SerializableConfiguration ... on restart from checkpoint

2015-12-17 Thread Bartłomiej Alberski
I prepared a simple example that reproduces the problem: https://github.com/alberskib/spark-streaming-broadcast-issue I think this will make it easier for you to understand the problem and find a solution (if one exists). Thanks, Bartek 2015-12-16 23:34 GMT+01:00 Bartłomiej Alberski <alb

Re: Spark streaming: java.lang.ClassCastException: org.apache.spark.util.SerializableConfiguration ... on restart from checkpoint

2015-12-16 Thread Bartłomiej Alberski
+01:00 Tathagata Das <t...@databricks.com>:
> Could you test serializing and deserializing the MyClassReporter class separately?
>
> On Mon, Dec 14, 2015 at 8:57 AM, Bartłomiej Alberski <albers...@gmail.com> wrote:
>
>> Below is the full stacktrace(real nam

Re: Spark streaming: java.lang.ClassCastException: org.apache.spark.util.SerializableConfiguration ... on restart from checkpoint

2015-12-14 Thread Bartłomiej Alberski
Below is the full stack trace (the real names of my classes were changed) with a short description of the entries from my code: rdd.mapPartitions { partition => // this is the line to which the second stack trace entry is pointing val sender = broadcastedValue.value // this is the main place to which the first

Re: Ensuring eager evaluation inside mapPartitions

2015-10-16 Thread Bartłomiej Alberski
I mean that getResults is called only after foo has been called on all records. This can be useful if foo is an asynchronous call to an external service that returns a Future providing some additional data, e.g. a REST API (I/O operations). If such an API has a latency of 100 ms, sending all requests (for 1000
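
To illustrate the ordering described above, here is a minimal sketch in plain Scala (no Spark dependency). It assumes the function passed to mapPartitions receives an ordinary Iterator, and it uses hypothetical stand-ins: foo only wraps a trivial computation in a Future, and getResult blocks on that Future. Neither is the code from the original thread. The log only records the order in which calls are made on the calling thread.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object EagerPartitionSketch {
  // Records the order of calls; written only from the calling thread.
  val log = ArrayBuffer.empty[String]

  // Hypothetical stand-in for an asynchronous call to an external service.
  def foo(x: Int): Future[Int] = {
    log += s"foo($x)"
    Future { x * 2 }
  }

  // Hypothetical stand-in for getResults: blocks until the response arrives.
  def getResult(f: Future[Int]): Int = {
    val r = Await.result(f, 5.seconds)
    log += s"get($r)"
    r
  }

  // Lazy: Iterator.map is evaluated element by element, so each request
  // is awaited before the next one is sent.
  def lazyVariant(partition: Iterator[Int]): Iterator[Int] =
    partition.map(foo).map(getResult)

  // Eager: toList forces every foo call first, so all requests are in
  // flight before the first result is awaited.
  def eagerVariant(partition: Iterator[Int]): Iterator[Int] =
    partition.map(foo).toList.iterator.map(getResult)
}
```

In a Spark job, the body of eagerVariant would be what goes inside rdd.mapPartitions { partition => ... }. Note that toList materializes the whole partition of Futures in memory, which is the price of sending all requests up front.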

Re: Issue with the class generated from avro schema

2015-10-09 Thread Bartłomiej Alberski
I knew that one possible solution would be to map the loaded object into another class right after reading it from HDFS. I was looking for a solution that enables reuse of the Avro-generated classes. That could be useful when your record has more than 22 fields, because you do not need to write boilerplate