That's right, though it's possible that Kryo's default way of
serializing the object doesn't work for this class. I'd debug a little
more and print out as much as you can about the DateTime object at the
point where it fails. I think there's a real problem, and it only
happens not to show up for map + take(1) for the reasons below.
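
To make the "print out as much as you can" suggestion concrete, here is a minimal sketch of a helper that reads the DateTime's state piece by piece. `DateTimeDebug` and `describe` are hypothetical names, not anything from Spark or Joda, and the guess that the chronology or its years field is what comes back null is only an assumption based on where the NPE occurs.

```scala
import org.joda.time.DateTime

object DateTimeDebug {
  // Hypothetical helper (not from this thread): reads each piece of state
  // separately and null-safely, so a partially restored object produces a
  // description rather than another NullPointerException.
  def describe(dt: DateTime): String = {
    if (dt == null) {
      "DateTime reference is null"
    } else {
      val chrono = Option(dt.getChronology)
      val chronoName = chrono.map(_.getClass.getName).getOrElse("null")
      val zoneId = chrono.flatMap(c => Option(c.getZone)).map(_.getID).getOrElse("null")
      // minusYears goes through the chronology's years field, so check it too.
      val yearsField = chrono.flatMap(c => Option(c.years())).map(_.toString).getOrElse("null")
      s"millis=${dt.getMillis} chronology=$chronoName zone=$zoneId years=$yearsField"
    }
  }
}
```

Inside the failing job this could be called in place of `minusYears`, for example `transList.map(t => DateTimeDebug.describe(t.date))`, so the output shows what the object looks like after the shuffle.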

Sandy, I know you work with DateTime for spark-timeseries; does this ring a bell?
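
One way to test the suspicion that Kryo's default serialization is the culprit is to register DateTime with an explicit serializer and round-trip a value locally. The sketch below is an assumption on my part, not a confirmed fix: `JodaDateTimeMillisSerializer` and `KryoRoundTrip` are hypothetical names, only the instant and the zone ID are stored (so a non-ISO chronology would be lost), and the `read` signature is the one from Kryo 2.x to 4.x.

```scala
import com.esotericsoftware.kryo.{Kryo, Serializer}
import com.esotericsoftware.kryo.io.{Input, Output}
import org.joda.time.{DateTime, DateTimeZone}

// Hypothetical serializer: stores only the instant and the zone ID, then
// rebuilds the DateTime through its public constructor.
// Assumes the default ISO chronology.
class JodaDateTimeMillisSerializer extends Serializer[DateTime] {
  override def write(kryo: Kryo, output: Output, dt: DateTime): Unit = {
    output.writeLong(dt.getMillis)
    output.writeString(dt.getZone.getID)
  }

  // Signature as in Kryo 2.x to 4.x; Kryo 5 declares the last parameter
  // as Class[_ <: DateTime] instead.
  override def read(kryo: Kryo, input: Input, tpe: Class[DateTime]): DateTime = {
    val millis = input.readLong()
    val zoneId = input.readString()
    new DateTime(millis, DateTimeZone.forID(zoneId))
  }
}

object KryoRoundTrip {
  // Serializes and deserializes one value in-process, without a cluster.
  def roundTrip(dt: DateTime): DateTime = {
    val kryo = new Kryo()
    kryo.register(classOf[DateTime], new JodaDateTimeMillisSerializer)
    val output = new Output(64, -1) // growable in-memory buffer
    kryo.writeObject(output, dt)
    val bytes = output.toBytes
    kryo.readObject(new Input(bytes), classOf[DateTime])
  }
}
```

If the round-tripped value behaves, the same registration could go into the registrator as `kryo.register(classOf[org.joda.time.DateTime], new JodaDateTimeMillisSerializer)`.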

On Thu, Jan 14, 2016 at 2:28 PM, Spencer, Alex (Santander)
<alex.spen...@santander.co.uk> wrote:
> Hi,
>
> I tried take(1500) and test.collect and these both work on the "single" map 
> statement.
>
> I'm very new to Kryo serialisation. I found some code, copied and pasted 
> it, and that's what originally made the single map statement work:
>
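> import com.esotericsoftware.kryo.Kryo
> import org.apache.spark.serializer.KryoRegistrator
>
> class MyRegistrator extends KryoRegistrator {
>   // Registers the class with Kryo, which then picks its default serializer for it.
>   override def registerClasses(kryo: Kryo): Unit = {
>     kryo.register(classOf[org.joda.time.DateTime])
>   }
> }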
>
> Is it because the groupBy sees a different class type? Maybe Array[DateTime]? 
> I don’t want to find the answer by trial and error though.
>
> Alex
>
> -----Original Message-----
> From: Sean Owen [mailto:so...@cloudera.com]
> Sent: 14 January 2016 14:07
> To: Spencer, Alex (Santander)
> Cc: user@spark.apache.org
> Subject: Re: NPE when using Joda DateTime
>
> It does look like the state of the DateTime object isn't being 
> recreated properly on deserialization, given where the NPE occurs 
> (look at the Joda source code). However, the object is java.io.Serializable. 
> Are you sure the Kryo serialization is correct?
>
> It doesn't quite explain why the map operation works by itself. It could be 
> the difference between executing locally (take(1) will look at 1 partition in 
> 1 task which prefers to be local) and executing remotely (groupBy is going to 
> need a shuffle).
>
> On Thu, Jan 14, 2016 at 1:01 PM, Spencer, Alex (Santander) 
> <alex.spen...@santander.co.uk.invalid> wrote:
>> Hello,
>>
>>
>>
>> I was wondering if somebody is able to help me get to the bottom of a
>> null pointer exception I’m seeing in my code. I’ve managed to narrow
>> down a problem in a larger class to my use of Joda’s DateTime
>> functions. I’ve successfully run my code in scala, but I’ve hit a few
>> problems when adapting it to run in spark.
>>
>>
>>
>> Spark version: 1.3.0
>>
>> Scala version: 2.10.4
>>
>> Java HotSpot 1.7
>>
>>
>>
>> I have a small case class called Transaction, which looks something
>> like this:
>>
>> case class Transaction(date: org.joda.time.DateTime = new org.joda.time.DateTime())
>>
>>
>>
>> I have an RDD[Transaction] called trans:
>>
>> org.apache.spark.rdd.RDD[Transaction] = MapPartitionsRDD[4] at map at <console>:44
>>
>>
>>
>> I am able to run this successfully:
>>
>>
>>
>> val test = trans.map(_.date.minusYears(10))
>> test.take(1)
>>
>>
>>
>> However if I do:
>>
>>
>>
>> val groupedTrans = trans.groupBy(_.account)
>>
>> // For each group, process transactions in turn:
>> val test = groupedTrans.flatMap { case (_, transList) =>
>>   transList.map { transaction =>
>>     transaction.date.minusYears(10)
>>   }
>> }
>> test.take(1)
>>
>>
>>
>> I get:
>>
>>
>>
>> java.lang.NullPointerException
>>         at org.joda.time.DateTime.minusYears(DateTime.java:1268)
>>
>>
>>
>> Should the second operation not be equivalent to the first .map one?
>> (It’s a long way round of producing my error – but it’s extremely
>> similar to what’s happening in my class).
>>
>>
>>
>> I’ve got a custom registration class for Kryo which I think is working
>> - before I added this the original .map did not work – but shouldn’t
>> it be able to serialize all instances of Joda DateTime?
>>
>>
>>
>> Thank you for any help / pointers you can give me.
>>
>>
>>
>> Kind Regards,
>>
>> Alex.
>>
>>
>>
>> Alex Spencer
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional 
> commands, e-mail: user-h...@spark.apache.org
>
