Hi, sorry, I am not very familiar with Java. I found that if I set the RDD partition number higher, I get the error "java.lang.OutOfMemoryError: Requested array size exceeds VM limit"; however, if I set the partition number lower, the error goes away.
My AWS EC2 cluster has 72 cores, so I first set the partition number to 150 and hit the problem above. Then I set the partition number to 100, and the error was gone. Could anybody explain? Thanks a lot!
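To make the setup concrete, here is a minimal sketch of how the partition number is being set. The input path, app name, and transformations are placeholders, not my actual job; only the partition counts (150 vs. 100) match what I described above:

import org.apache.spark.{SparkConf, SparkContext}

object PartitionCountSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("partition-count-sketch")
    val sc = new SparkContext(conf)

    // Hypothetical input; second argument is the minimum partition count.
    val lines = sc.textFile("hdfs:///path/to/input", 150)  // 150 partitions -> OOM
    // val lines = sc.textFile("hdfs:///path/to/input", 100)  // 100 partitions -> no error

    // An existing RDD can also be reshaped explicitly:
    val reshaped = lines.repartition(150)

    // Check how many partitions the RDD actually has.
    println(reshaped.partitions.length)

    sc.stop()
  }
}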