It does look as though the state of the DateTime object isn't being
recreated properly on deserialization, given where the NPE occurs
(look at the Joda source code). However, the object is
java.io.Serializable. Are you sure the Kryo serialization is correct?
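
One thing worth checking is whether your registrator leaves DateTime to
Kryo's default field-by-field serializer. Below is a minimal sketch of a
registrator that uses a dedicated Joda serializer instead. It assumes the
third-party de.javakaffee kryo-serializers library is on the classpath, and
the class name JodaKryoRegistrator is made up for illustration (I haven't
seen your registrator):

```scala
import com.esotericsoftware.kryo.Kryo
import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer
import org.apache.spark.serializer.KryoRegistrator
import org.joda.time.DateTime

// Hypothetical registrator: the name and the choice of serializer are
// assumptions, not taken from the original code.
class JodaKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Use a serializer that knows how to write and rebuild a DateTime,
    // rather than relying on Kryo's default handling of its fields.
    kryo.register(classOf[DateTime], new JodaDateTimeSerializer())
  }
}
```

It would be enabled by setting spark.serializer to
org.apache.spark.serializer.KryoSerializer and spark.kryo.registrator to the
fully qualified name of the class.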

That doesn't quite explain why the map operation works by itself,
though. It could be the difference between executing locally (take(1)
looks at 1 partition in 1 task, which prefers to be local) and
executing remotely (groupBy needs a shuffle, which serializes the
records).
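
To take the local-versus-remote question out of the picture, you can push a
single value through serialization by hand. Here is a minimal sketch using
plain Java serialization. SerializationCheck and roundTrip are names made up
for illustration, and it only checks that a class survives Java
serialization, not your Kryo setup:

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

// Hypothetical helper, not part of Spark or Joda.
object SerializationCheck {
  // Writes the value to an in-memory buffer with Java serialization and
  // reads it straight back, returning the deserialized copy.
  def roundTrip[T <: java.io.Serializable](value: T): T = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    try out.writeObject(value) finally out.close()

    val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

In the shell, calling
SerializationCheck.roundTrip(new org.joda.time.DateTime()).minusYears(10)
should then either work or fail in the same way. If it works, the Kryo
registration becomes the more likely culprit.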

On Thu, Jan 14, 2016 at 1:01 PM, Spencer, Alex (Santander)
<alex.spen...@santander.co.uk.invalid> wrote:
> Hello,
>
>
>
> I was wondering if somebody is able to help me get to the bottom of a null
> pointer exception I’m seeing in my code. I’ve managed to narrow down a
> problem in a larger class to my use of Joda’s DateTime functions. I’ve
> successfully run my code in Scala, but I’ve hit a few problems when adapting
> it to run in Spark.
>
>
>
> Spark version: 1.3.0
>
> Scala version: 2.10.4
>
> Java HotSpot 1.7
>
>
>
> I have a small case class called Transaction, which looks something like
> this:
>
>
>
> case class Transaction(date: org.joda.time.DateTime = new org.joda.time.DateTime())
>
>
>
> I have an RDD[Transaction] trans:
>
> org.apache.spark.rdd.RDD[Transaction] = MapPartitionsRDD[4] at map at
> <console>:44
>
>
>
> I am able to run this successfully:
>
>
>
> val test = trans.map(_.date.minusYears(10))
> test.take(1)
>
>
>
> However if I do:
>
>
>
> val groupedTrans = trans.groupBy(_.account)
>
> // For each group, process transactions in turn:
> val test = groupedTrans.flatMap { case (_, transList) =>
>   transList.map { transaction =>
>     transaction.date.minusYears(10)
>   }
> }
> test.take(1)
>
>
>
> I get:
>
>
>
> java.lang.NullPointerException
>
>         at org.joda.time.DateTime.minusYears(DateTime.java:1268)
>
>
>
> Should the second operation not be equivalent to the first .map one? (It’s a
> long way round of producing my error – but it’s extremely similar to what’s
> happening in my class).
>
>
>
> I’ve got a custom registration class for Kryo which I think is working -
> before I added this the original .map did not work – but shouldn’t it be
> able to serialize all instances of Joda DateTime?
>
>
>
> Thank you for any help / pointers you can give me.
>
>
>
> Kind Regards,
>
> Alex.
>
>
>
> Alex Spencer
>
