Can you give your entire spark-submit command? Are you missing
--executor-cores num_cpu? Also, if you intend to use all 6 nodes, you
need --num-executors 6.
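Something along these lines (the class name, jar, and the core/memory values are placeholders; adjust for your cluster):

spark-submit \
  --master yarn-cluster \
  --num-executors 6 \
  --executor-cores 4 \
  --executor-memory 4g \
  --class com.example.YourJob \
  your-job.jar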
On Mon, May 4, 2015 at 2:07 AM, Xi Shen davidshe...@gmail.com wrote:
Hi,
I have two small RDDs, each with about 600 records. In my code, I did
val rdd1 = sc...cache()
val rdd2 = sc...cache()
val result = rdd1.cartesian(rdd2).repartition(num_cpu).map { case (a, b) =>
  some_expensive_job(a, b)
}
I ran my job on a YARN cluster with --master yarn-cluster; I have 6 nodes.
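For reference, a self-contained sketch of the snippet above (the parallelize inputs, the numCpu value, and the someExpensiveJob body are placeholders for the parts elided in the original):

import org.apache.spark.{SparkConf, SparkContext}

object CartesianDemo {
  // Placeholder for the real per-pair computation
  def someExpensiveJob(a: Int, b: Int): Int = a * b

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("cartesian-demo"))
    val numCpu = 24  // placeholder: total executor cores across the cluster

    // Two small RDDs of ~600 records each, cached because both feed cartesian()
    val rdd1 = sc.parallelize(1 to 600).cache()
    val rdd2 = sc.parallelize(1 to 600).cache()

    // cartesian() yields ~360,000 pairs in very few partitions;
    // repartition() spreads them across all cores before the expensive map
    val result = rdd1.cartesian(rdd2).repartition(numCpu).map { case (a, b) =>
      someExpensiveJob(a, b)
    }

    println(result.count())  // an action, to force evaluation
    sc.stop()
  }
}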