Re: java.lang.NegativeArraySizeException when iterating a big RDD

2015-10-23 Thread Yifan LI
Thanks for your advice, Jem. :) I will increase the partitioning and see if it helps.

Best,
Yifan LI

java.lang.NegativeArraySizeException when iterating a big RDD

2015-10-23 Thread Yifan LI
Hi,

I have a big sorted RDD, sRdd (~962 million elements), and need to scan its elements in order (using sRdd.toLocalIterator). But the process failed after around 893 million elements had been scanned, with the following exception. Does anyone have an idea?

Thanks!

Exception in thread
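For context, a minimal sketch of the access pattern described above (the input source, path, and element type are assumptions, not from the thread):

    import org.apache.spark.{SparkConf, SparkContext}

    object ScanSortedRdd {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("ScanSortedRdd")
        val sc = new SparkContext(conf)

        // Sort a large RDD, then stream it back to the driver in order.
        // toLocalIterator fetches one partition at a time, so the driver
        // only ever holds a single partition in memory.
        val sRdd = sc.textFile("hdfs:///path/to/input") // hypothetical path
          .sortBy(identity)

        sRdd.toLocalIterator.foreach { element =>
          // process each element in sorted order
        }

        sc.stop()
      }
    }

Note that toLocalIterator runs one Spark job per partition, and each partition is shipped to the driver through the configured serializer; that serialization step is where the exception in this thread surfaces.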

Re: java.lang.NegativeArraySizeException when iterating a big RDD

2015-10-23 Thread Jem Tucker
Hi Yifan,

I think this is a result of Kryo trying to serialize something too large. Have you tried increasing your partitioning?

Cheers,
Jem
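A plausible reading of why this helps, for anyone hitting the same error later: Kryo serializes each chunk into a buffer backed by a Java byte array, and Java arrays are indexed by Int, so when a serialized chunk needs to grow past Integer.MAX_VALUE bytes the computed size can overflow to a negative number, which is exactly what NegativeArraySizeException reports. More partitions means fewer elements per partition, keeping each serialized chunk well under that limit. A sketch of the suggestion (the partition count is illustrative, and sRdd/rawRdd are the hypothetical RDDs from the earlier sketch):

    // Option 1: repartition an existing RDD (incurs a full shuffle).
    val finer = sRdd.repartition(4000)

    // Option 2: ask sortBy for more partitions up front, which folds the
    // higher parallelism into the sort's own shuffle.
    val sorted = rawRdd.sortBy(identity, ascending = true, numPartitions = 4000)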

Re: java.lang.NegativeArraySizeException when iterating a big RDD

2015-10-23 Thread Todd Nist
Hi Yifan,

You could also try increasing spark.kryoserializer.buffer.max.mb (64 MB by default), which helps when a serialized object outgrows the default buffer. Per the docs: "Maximum allowable size of Kryo serialization buffer. This must be larger than any object you attempt to serialize."
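A sketch of this setting (the 512 MB value is illustrative; it just needs to exceed the largest object you serialize):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      // raise the max Kryo buffer from the 64 MB default
      .set("spark.kryoserializer.buffer.max.mb", "512")

    // Equivalent at submit time:
    //   spark-submit --conf spark.kryoserializer.buffer.max.mb=512 ...

Since Spark 1.4 the preferred key is spark.kryoserializer.buffer.max with a size suffix (e.g. "512m"); the .mb form still works but is deprecated. Because the buffer is backed by a byte array, the maximum must also stay below 2048m.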