I just ran into this issue! Small world. As far as I can tell, Spark on EMR is completely untuned by default, but it comes with a flag you can set to tell EMR to autotune Spark for you. In your configuration.json file, you can add something like:
{ "Classification": "spark", "Properties": { "maximizeResourceAllocation": "true" } }, but keep in mind that, again as far as I can tell, the default parallelism with this config is merely twice the number of executor cores--so for a 10 machine cluster w/ 3 active cores each, 60 partitions. This is pretty low, so you'll likely want to adjust this--I'm currently using the following because spark chokes on datasets that are bigger than about 2g per partition: { "Classification": "spark-defaults", "Properties": { "spark.default.parallelism": "1000" } } Good luck, and I hope this is helpful! --Josh On Mon, Jul 17, 2017 at 4:59 PM, Takashi Sasaki <tsasaki...@gmail.com> wrote: > Hi Pascal, > > The error also occurred frequently in our project. > > As a solution, it was effective to specify the memory size directly > with spark-submit command. > > eg. spark-submit executor-memory 2g > > > Regards, > > Takashi > > > 2017-07-18 5:18 GMT+09:00 Pascal Stammer <stam...@deichbrise.de>: > >> Hi, > >> > >> I am running a Spark 2.1.x Application on AWS EMR with YARN and get > >> following error that kill my application: > >> > >> AM Container for appattempt_1500320286695_0001_000001 exited with > exitCode: > >> -104 > >> For more detailed output, check application tracking > >> page:http://ip-172-31-35-192.eu-central-1.compute.internal: > 8088/cluster/app/application_1500320286695_0001Then, > >> click on links to logs of each attempt. > >> Diagnostics: Container > >> [pid=9216,containerID=container_1500320286695_0001_01_000001] is > running > >> beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical > >> memory used; 3.3 GB of 6.9 GB virtual memory used. Killing container. > >> > >> > >> I already change spark.yarn.executor.memoryOverhead but the error still > >> occurs. Does anybody have a hint for me which parameter or > configuration I > >> have to adapt. > >> > >> Thank you very much. > >> > >> Regards, > >> > >> Pascal Stammer > >> > >> > > --------------------------------------------------------------------- > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > >