I just ran into this issue! Small world. As far as I can tell, Spark on EMR is completely untuned by default, but it comes with a flag you can set to tell EMR to autotune Spark for you. In your configuration.json file, you can add something like:
{ "Classification": "spark", "Properties": { "maximizeResourceAllocation": "true" } }, but keep in mind that, again as far as I can tell, the default parallelism with this config is merely twice the number of executor cores--so for a 10 machine cluster w/ 3 active cores each, 60 partitions. This is pretty low, so you'll likely want to adjust this--I'm currently using the following because spark chokes on datasets that are bigger than about 2g per partition: { "Classification": "spark-defaults", "Properties": { "spark.default.parallelism": "1000" } } Good luck, and I hope this is helpful! --Josh On Mon, Jul 17, 2017 at 4:59 PM, Takashi Sasaki <tsasaki...@gmail.com> wrote: > Hi Pascal, > > The error also occurred frequently in our project. > > As a solution, it was effective to specify the memory size directly > with spark-submit command. > > eg. spark-submit executor-memory 2g > > > Regards, > > Takashi > > > 2017-07-18 5:18 GMT+09:00 Pascal Stammer <stam...@deichbrise.de>: > >> Hi, > >> > >> I am running a Spark 2.1.x Application on AWS EMR with YARN and get > >> following error that kill my application: > >> > >> AM Container for appattempt_1500320286695_0001_000001 exited with > exitCode: > >> -104 > >> For more detailed output, check application tracking > >> page:http://ip-172-31-35-192.eu-central-1.compute.internal: > 8088/cluster/app/application_1500320286695_0001Then, > >> click on links to logs of each attempt. > >> Diagnostics: Container > >> [pid=9216,containerID=container_1500320286695_0001_01_000001] is > running > >> beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical > >> memory used; 3.3 GB of 6.9 GB virtual memory used. Killing container. > >> > >> > >> I already change spark.yarn.executor.memoryOverhead but the error still > >> occurs. Does anybody have a hint for me which parameter or > configuration I > >> have to adapt. > >> > >> Thank you very much. > >> > >> Regards, > >> > >> Pascal Stammer > >> > >> > > --------------------------------------------------------------------- > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > >