Re: What does KryoException: java.lang.NegativeArraySizeException mean?

2014-10-21 Thread Fengyun RAO
Thanks, Guillaume. Below is when the exception happens; nothing has spilled to disk yet. There isn't a join, only a partitionBy and a groupBy action. In fact, it succeeds when numPartitions is small and fails when it is large. Partitioning was simply done by override def getPartition(key:

What does KryoException: java.lang.NegativeArraySizeException mean?

2014-10-20 Thread Fengyun RAO
This exception drives me crazy because it occurs randomly. I don't know which line of my code causes it. I don't even understand what KryoException: java.lang.NegativeArraySizeException means or implies. 14/10/20 15:59:01 WARN scheduler.TaskSetManager: Lost task 32.2 in stage

Re: What does KryoException: java.lang.NegativeArraySizeException mean?

2014-10-20 Thread Fengyun RAO
Thank you, Guillaume. My dataset is not that large; it's ~2GB in total. 2014-10-20 16:58 GMT+08:00 Guillaume Pitel guillaume.pi...@exensa.com: Hi, It happened to me with blocks that take more than 1 or 2 GB once serialized. I think the problem was that during serialization, a Byte Array is
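
For readers wondering how a serialization buffer can end up with a negative size: JVM arrays are indexed by Int, so a buffer that grows by doubling its capacity wraps around to a negative number once it passes 1 GiB. The sketch below only illustrates that arithmetic. The object name and the doubled helper are hypothetical, and whether Kryo's buffer follows exactly this path in the case discussed here is an assumption, not something the thread confirms.

```scala
object BufferGrowthOverflow {
  // Hypothetical growth step: double an Int capacity, as a naive
  // growable byte buffer might do. Int arithmetic wraps silently.
  def doubled(capacity: Int): Int = capacity * 2

  def main(args: Array[String]): Unit = {
    val oneGiB = 1 << 30         // 1073741824 bytes
    val next = doubled(oneGiB)   // 2^31 does not fit in an Int and wraps negative
    println(s"requested capacity: $next")
    try {
      // Allocating an array with a negative length fails at runtime.
      new Array[Byte](next)
    } catch {
      case e: NegativeArraySizeException =>
        println(s"caught: ${e.getClass.getName}")
    }
  }
}
```

This would be consistent with the failure showing up only for blocks of roughly 1 to 2 GB once serialized, and not for smaller ones.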

Re: What does KryoException: java.lang.NegativeArraySizeException mean?

2014-10-20 Thread Guillaume Pitel
Well, reading your logs, here is what happens: You do a combineByKey (so you probably have a join somewhere), which spills to disk because it's too big. To spill to disk it serializes, and the blocks are 2GB. From a 2GB dataset, it's easy to expand to several TB. Increase parallelism, make
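
The advice to increase parallelism can be turned into a rough back-of-the-envelope estimate: pick a target block size well below the 2 GB array limit and divide the expanded data size by it. The sketch below assumes keys are spread evenly across partitions, which skewed data will violate. The object name, the minPartitions helper, and the 1 TiB and 256 MiB figures are hypothetical values chosen for illustration, not numbers from this thread.

```scala
object PartitionSizing {
  // Largest byte array the JVM can address with an Int index.
  val MaxBlockBytes: Long = Int.MaxValue.toLong

  // Hypothetical helper: smallest partition count that keeps the
  // average serialized block at or below targetBytes (ceiling division).
  def minPartitions(totalSerializedBytes: Long, targetBytes: Long): Long = {
    require(targetBytes > 0 && targetBytes <= MaxBlockBytes,
      "target must be positive and within the array size limit")
    (totalSerializedBytes + targetBytes - 1) / targetBytes
  }

  def main(args: Array[String]): Unit = {
    val expandedBytes = 1L << 40   // assumed 1 TiB after expansion
    val targetBytes = 256L << 20   // assumed 256 MiB per block
    println(minPartitions(expandedBytes, targetBytes)) // prints 4096
  }
}
```

The result is an average, so a single heavily loaded key can still produce an oversized block even when the partition count looks sufficient.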