Hi,

I tried take(1500) and test.collect and these both work on the "single" map 
statement. 

I'm very new to Kryo serialisation. I found some code and copied and pasted 
it, and that is what originally made the single map statement work:

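import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Register Joda's DateTime so Kryo knows the class up front.
    kryo.register(classOf[org.joda.time.DateTime])
  }
}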

Is it because the groupBy sees a different class type? Maybe Array[DateTime]? I 
don’t want to find the answer by trial and error though.
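
One alternative to guessing at class types, sketched here only as a possibility: register DateTime with a serializer written for it rather than relying on Kryo's default field-based one. This assumes the third-party kryo-serializers library (package de.javakaffee.kryoserializers) is on the driver and executor classpath in a version compatible with the Kryo that Spark ships; JodaRegistrator is a hypothetical name.

```scala
import com.esotericsoftware.kryo.Kryo
import de.javakaffee.kryoserializers.jodatime.JodaDateTimeSerializer
import org.apache.spark.serializer.KryoRegistrator
import org.joda.time.DateTime

// Assumption: the de.javakaffee kryo-serializers artifact is available;
// it is not part of Spark or Joda-Time.
class JodaRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    // Pair DateTime with a dedicated serializer instead of the
    // default that Kryo would pick for an unregistered serializer.
    kryo.register(classOf[DateTime], new JodaDateTimeSerializer())
  }
}
```

The registrator would then be named in spark.kryo.registrator in place of the current one.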

Alex

-----Original Message-----
From: Sean Owen [mailto:so...@cloudera.com] 
Sent: 14 January 2016 14:07
To: Spencer, Alex (Santander)
Cc: user@spark.apache.org
Subject: Re: NPE when using Joda DateTime

It does look like the state of the DateTime object isn't being 
recreated properly on deserialization, given where the NPE occurs (look 
at the Joda source code). However, the object is java.io.Serializable. Are you 
sure the Kryo serialization is correct?
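
One way to check this in isolation, as a rough sketch: round-trip a single DateTime through Spark's Kryo serializer on the driver and call the method that fails. This assumes Spark and Joda-Time are on the classpath (for example in spark-shell) and that the registrator is compiled into a jar and reachable under the name MyRegistrator; use the fully qualified name if it lives in a package.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoSerializer
import org.joda.time.DateTime

// Build a configuration that mirrors the job's serializer settings.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrator", "MyRegistrator") // assumed class name

// A standalone serializer instance; no cluster or RDD is involved.
val ser = new KryoSerializer(conf).newInstance()

val original = new DateTime()
// Serialize to a ByteBuffer and read the object straight back.
val copy = ser.deserialize[DateTime](ser.serialize(original))

// The same call that throws inside the groupBy job.
copy.minusYears(10)
```

If the last line throws the same NullPointerException, the problem is likely in how Kryo rebuilds the object rather than in groupBy itself.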

It doesn't quite explain why the map operation works by itself. It could be the 
difference between executing locally (take(1) will look at 1 partition in 1 
task which prefers to be local) and executing remotely (groupBy is going to 
need a shuffle).

On Thu, Jan 14, 2016 at 1:01 PM, Spencer, Alex (Santander) 
<alex.spen...@santander.co.uk.invalid> wrote:
> Hello,
>
>
>
> I was wondering if somebody is able to help me get to the bottom of a 
> null pointer exception I’m seeing in my code. I’ve managed to narrow 
> down a problem in a larger class to my use of Joda’s DateTime 
> functions. I’ve successfully run my code in Scala, but I’ve hit a few 
> problems when adapting it to run in Spark.
>
>
>
> Spark version: 1.3.0
>
> Scala version: 2.10.4
>
> Java HotSpot 1.7
>
>
>
> I have a small case class called Transaction, which looks something 
> like this:
>
> case class Transaction(date: org.joda.time.DateTime = new org.joda.time.DateTime())
>
>
>
> I have an RDD[Transaction] called trans:
>
> org.apache.spark.rdd.RDD[Transaction] = MapPartitionsRDD[4] at map at <console>:44
>
>
>
> I am able to run this successfully:
>
>
>
> val test = trans.map(_.date.minusYears(10))
>
> test.take(1)
>
>
>
> However if I do:
>
>
>
> val groupedTrans = trans.groupBy(_.account)
>
>
>
> // For each group, process transactions in turn:
> val test = groupedTrans.flatMap { case (_, transList) =>
>   transList.map { transaction =>
>     transaction.date.minusYears(10)
>   }
> }
>
> test.take(1)
>
>
>
> I get:
>
>
>
> java.lang.NullPointerException
>
>         at org.joda.time.DateTime.minusYears(DateTime.java:1268)
>
>
>
> Should the second operation not be equivalent to the first .map one? 
> (It’s a long way round of producing my error – but it’s extremely 
> similar to what’s happening in my class).
>
>
>
> I’ve got a custom registration class for Kryo which I think is working 
> - before I added this the original .map did not work – but shouldn’t 
> it be able to serialize all instances of Joda DateTime?
>
>
>
> Thank you for any help / pointers you can give me.
>
>
>
> Kind Regards,
>
> Alex.
>
>
>
> Alex Spencer
>


