Re: Looking at EMR Logs

2017-04-02 Thread Paul Tremblay
Thanks. That seems to work great, except that EMR doesn't always copy the logs to S3. The behavior seems inconsistent and I am debugging it now.
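For reference, EMR only copies cluster logs to S3 when a log URI was set at cluster creation, and the copy happens periodically rather than immediately, so a recently finished step may not show up right away. A rough way to check both, with the cluster id and bucket name as placeholders:

    # Was the cluster created with an S3 log URI at all?
    aws emr describe-cluster --cluster-id j-XXXXXXXXXXXX \
      --query 'Cluster.LogUri' --output text

    # If so, step and container logs show up under that prefix, keyed by
    # cluster id, typically with a delay of a few minutes.
    aws s3 ls --recursive s3://my-log-bucket/logs/j-XXXXXXXXXXXX/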

Re: Looking at EMR Logs

2017-03-31 Thread Neil Jonkers
If you modify spark.eventLog.dir to point to an S3 path, you will encounter the following exception in the Spark history server log at /var/log/spark/spark-history-server.out: Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.amazon.ws.emr.hadoop.fs.EmrFileSystem not found
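For context, the s3:// scheme on EMR is mapped to that EmrFileSystem class in the Hadoop configuration, and the history server daemon does not necessarily have the EMRFS jars on its classpath. A rough, untested sketch of how one might confirm the mapping and work around it (the jar and install paths are assumptions about the EMR node layout, and SPARK_DAEMON_CLASSPATH needs a Spark version that honors it):

    # Confirm which class EMR maps the s3:// scheme to.
    grep -A 1 'fs.s3.impl' /etc/hadoop/conf/core-site.xml

    # The full stack trace lands in the history server log mentioned above.
    tail -n 100 /var/log/spark/spark-history-server.out

    # Possible workaround: expose the EMRFS jars to the history server and
    # restart it (paths are assumptions about the EMR master node layout).
    export SPARK_DAEMON_CLASSPATH="/usr/share/aws/emr/emrfs/lib/*"
    sudo /usr/lib/spark/sbin/stop-history-server.sh
    sudo /usr/lib/spark/sbin/start-history-server.sh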

Re: Looking at EMR Logs

2017-03-31 Thread Vadim Semenov
You can provide your own log directory, where the Spark event log will be saved and which you can replay afterwards. Set this in your job: `spark.eventLog.dir=s3://bucket/some/directory` and run it. Note: the path `s3://bucket/some/directory` must exist before you run your job; it will not be created
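A minimal sketch of what that might look like end to end, with the bucket and paths as placeholders (spark.eventLog.enabled also has to be on for the event log to be written):

    # Create the S3 "directory" first -- a zero-byte folder-marker object --
    # since Spark will not create it for you.
    aws s3api put-object --bucket my-bucket --key spark-event-logs/

    # Run the job with event logging pointed at that location.
    spark-submit \
      --conf spark.eventLog.enabled=true \
      --conf spark.eventLog.dir=s3://my-bucket/spark-event-logs \
      my_job.py

    # Replay the finished job later by pointing a history server at the same
    # path (spark.history.fs.logDirectory is the history server's setting).
    export SPARK_HISTORY_OPTS="-Dspark.history.fs.logDirectory=s3://my-bucket/spark-event-logs"
    $SPARK_HOME/sbin/start-history-server.sh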

Looking at EMR Logs

2017-03-30 Thread Paul Tremblay
I am looking for tips on evaluating my Spark job after it has run. I know that right now I can look at the history of jobs through the web UI. I also know how to look at the resources currently in use through a similar web UI. However, I would like to look at the logs after the job is finished to
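If the cluster keeps YARN log aggregation on, the driver and executor logs of a finished application can also be pulled back with the standard YARN tooling; a small example, with the application id as a placeholder:

    # Aggregated container (driver/executor) logs for a finished application.
    # Requires yarn.log-aggregation-enable=true on the cluster.
    yarn logs -applicationId application_1490000000000_0001 > app.log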