Many retries for Python job

2014-11-21 Thread Brett Meyer
I'm running a Python script with spark-submit on top of YARN on an EMR cluster with 30 nodes. The script reads in approximately 3.9 TB of data from S3, and then does some transformations and filtering, followed by some aggregate counts. During Stage 2 of the job, everything looks to complete

Re: Many retries for Python job

2014-11-21 Thread Sandy Ryza
Hi Brett, Are you noticing executors dying? Are you able to check the YARN NodeManager logs and see whether YARN is killing them for exceeding memory limits? -Sandy On Fri, Nov 21, 2014 at 9:47 AM, Brett Meyer brett.me...@crowdstrike.com wrote: I’m running a Python script with spark-submit
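
One way to act on the suggestion of checking the NodeManager logs is to scan them for memory-limit kills. The sketch below is only illustrative: it assumes the logs contain the phrase "running beyond physical memory limits" (or the "virtual" variant), which is the wording YARN's container monitor commonly uses, though the exact text can differ between Hadoop versions. The function name `find_memory_kills` and the sample lines are hypothetical and are not taken from the cluster discussed in this thread.

```python
import re

# Assumed wording of the YARN container-monitor kill message; it may
# differ between Hadoop versions, so adjust the pattern as needed.
KILL_PATTERN = re.compile(r"running beyond (physical|virtual) memory limits")


def find_memory_kills(log_lines):
    """Return the lines that look like YARN memory-limit container kills."""
    return [line.rstrip("\n") for line in log_lines if KILL_PATTERN.search(line)]


# Made-up, simplified sample lines for illustration only (not real log output).
sample_log = [
    "INFO ContainersMonitorImpl: Memory usage of container_0001 is within limits",
    "WARN ContainersMonitorImpl: Container container_0001 is running beyond "
    "physical memory limits. Killing container.",
]

for hit in find_memory_kills(sample_log):
    print(hit)
```

In practice the same function could be fed an open log file (for example `find_memory_kills(open(path))`, where `path` points at a NodeManager log on one of the nodes), since it only iterates over lines.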

Re: Many retries for Python job

2014-11-21 Thread Brett Meyer
> Hi Brett, Are you noticing executors dying? Are you able to check the YARN NodeManager logs and see whether YARN is killing them for exceeding memory limits? -Sandy On Fri, Nov 21, 2014 at 9:47 AM, Brett Meyer