Hi Ted, thanks. I know spark.sql.shuffle.partitions defaults to 200. It would be great if you could help me solve this OOM issue.
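For reference, a minimal sketch (Spark 1.x, values purely illustrative, not a tuning recommendation) of overriding that default of 200 from a HiveContext:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val conf = new SparkConf().setAppName("ShufflePartitionsExample")
    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

    // Raise the SQL shuffle partition count above the default of 200.
    // More partitions mean smaller per-task shuffle blocks, which can
    // ease per-executor memory pressure during large GROUP BYs.
    hiveContext.setConf("spark.sql.shuffle.partitions", "1000")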
On Mon, Aug 31, 2015 at 11:43 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Please see this thread w.r.t. spark.sql.shuffle.partitions:
> http://search-hadoop.com/m/q3RTtE7JOv1bDJtY
>
> FYI
>
> On Mon, Aug 31, 2015 at 11:03 AM, unk1102 <umesh.ka...@gmail.com> wrote:
>
>> Hi, I have a Spark job whose executors hit OOM after some time; the
>> job then hangs because of it, followed by a couple of IOException,
>> "RPC client disassociated", and "shuffle not found" errors.
>>
>> I have tried almost everything and don't know how to solve this OOM
>> issue; please guide me, I am fed up now. Here is what I tried, but
>> nothing worked:
>>
>> - 60 executors with 12 GB / 2 cores each
>> - 30 executors with 20 GB / 2 cores each
>> - 40 executors with 30 GB / 6 cores each (I also tried 7 and 8 cores)
>> - setting spark.storage.memoryFraction to 0.2 to solve the OOM issue;
>>   I also tried setting it to 0.0
>> - setting spark.shuffle.memoryFraction to 0.4, since I need more
>>   shuffle memory
>> - setting spark.default.parallelism to 500, 1000, and 1500, but it
>>   did not help avoid OOM; what is the ideal value for it?
>> - setting spark.sql.shuffle.partitions to 500, but it did not help;
>>   it just creates 500 output part files. Please help me understand
>>   the difference between spark.default.parallelism and
>>   spark.sql.shuffle.partitions.
>>
>> My data is skewed but not that large; I don't understand why it is
>> hitting OOM. I don't cache anything; I just have four GROUP BY
>> queries that I call using hivecontext.sql(). I have around 1000
>> threads that I spawn from the driver, and each thread executes these
>> four queries.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-executor-OOM-issue-on-YARN-tp24522.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
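On the difference between the two settings: spark.default.parallelism is the default partition count for RDD shuffle operations (reduceByKey, join, and the like) when no count is passed explicitly, while spark.sql.shuffle.partitions is the partition count Spark SQL uses for the shuffles behind GROUP BY and JOIN in queries run through hivecontext.sql(). A minimal Spark 1.x sketch showing where each applies (the input path, some_key, and some_table are hypothetical):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    val conf = new SparkConf()
      .setAppName("ParallelismVsSqlShufflePartitions")
      // Default partition count for RDD shuffles (reduceByKey, join, ...)
      // when no count is passed explicitly.
      .set("spark.default.parallelism", "1000")
      // Spark 1.x (pre-1.6) static memory fractions, as tried above.
      .set("spark.storage.memoryFraction", "0.2")
      .set("spark.shuffle.memoryFraction", "0.4")

    val sc = new SparkContext(conf)
    val hiveContext = new HiveContext(sc)

    // Partition count for Spark SQL shuffles (GROUP BY, JOIN).
    hiveContext.setConf("spark.sql.shuffle.partitions", "500")

    // RDD path: this reduceByKey falls back to spark.default.parallelism.
    val counts = sc.textFile("hdfs:///path/to/input")  // hypothetical path
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1L))
      .reduceByKey(_ + _)

    // SQL path: the GROUP BY shuffle uses spark.sql.shuffle.partitions,
    // which is also why 500 output part files appear.
    val grouped = hiveContext.sql(
      "SELECT some_key, COUNT(*) FROM some_table GROUP BY some_key")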
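Since the data is described as skewed, one general workaround (a standard technique, not something proposed in this thread) is to salt the hot keys and aggregate in two stages, so no single task receives an entire skewed group. A hypothetical sketch, reusing the sc from the sketch above:

    import scala.util.Random

    // Stage 1: append a random salt to each key so a hot key is spread
    // across several reduce tasks, then pre-aggregate per salted key.
    val salted = sc.textFile("hdfs:///path/to/input")  // hypothetical path
      .map { line =>
        val key = line.split(",")(0)      // assumes CSV with the key first
        ((key, Random.nextInt(10)), 1L)   // salt drawn from [0, 10)
      }
      .reduceByKey(_ + _)

    // Stage 2: drop the salt and combine the partial counts per key.
    val totals = salted
      .map { case ((key, _), count) => (key, count) }
      .reduceByKey(_ + _)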