Re: Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?

2017-08-29 Thread Mikhailau, Alex
Would I use something like this to get to those VM arguments? val runtimeMxBean = ManagementFactory.getRuntimeMXBean; val args = runtimeMxBean.getInputArguments; val conf = Conf(args); etc. From: Vadim Semenov Date: Tuesday, August 29, 2017 at 11:49 AM To: "Mikhailau,
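A minimal sketch of that idea, with one caveat: `RuntimeMXBean.getInputArguments` returns only the options passed to the JVM itself (such as `-Xmx` or `-D` settings), not the arguments passed to the main class. On HotSpot JVMs the main class and its arguments are usually exposed through the `sun.java.command` system property, which may be absent on other JVMs. The `LaunchArgs` object, its `flagValue` helper, and the `--executor-id` flag name used in the comment are assumptions made for illustration and do not appear in the thread.

```scala
import java.lang.management.ManagementFactory

// Hypothetical helper object; the names are illustrative only.
object LaunchArgs {
  // JVM options only (-Xmx..., -Dkey=value); excludes arguments to main().
  def jvmOptions: List[String] =
    ManagementFactory.getRuntimeMXBean.getInputArguments
      .toArray(new Array[String](0))
      .toList

  // Main class plus its arguments as one string; HotSpot-specific, so optional.
  def mainCommand: Option[String] = sys.props.get("sun.java.command")

  // Returns the token that follows `flag`, if there is one.
  // Example (assumed flag name): flagValue(tokens, "--executor-id")
  def flagValue(tokens: Seq[String], flag: String): Option[String] = {
    val i = tokens.indexOf(flag)
    if (i >= 0 && i + 1 < tokens.length) Some(tokens(i + 1)) else None
  }
}
```

A possible use would be `LaunchArgs.mainCommand.map(_.split("\\s+").toSeq).flatMap(LaunchArgs.flagValue(_, "--executor-id"))`, which yields `None` whenever the property or the flag is missing.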

Re: Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?

2017-08-29 Thread Vadim Semenov
Each Java process for each of the executors has some environment variables that you can use, for example: > CONTAINER_ID=container_1503994094228_0054_01_13 The executor id gets passed as an argument to the process: > /usr/lib/jvm/java-1.8.0/bin/java … --driver-url
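As a rough sketch of how the `CONTAINER_ID` variable could be used, the snippet below reads it and derives the application ID and attempt number from it. It assumes the usual YARN layout `container_[e<epoch>_]<clusterTimestamp>_<appSequence>_<attempt>_<containerSequence>`; the `YarnIds` object and its member names are hypothetical.

```scala
// Hypothetical helper; derives YARN identifiers from a container ID string.
object YarnIds {
  // Assumed layout: container_[e<epoch>_]<clusterTs>_<appSeq>_<attempt>_<containerSeq>
  private val ContainerId = """container_(?:e\d+_)?(\d+)_(\d+)_(\d+)_(\d+)""".r

  final case class Ids(applicationId: String, attempt: Int, containerId: String)

  // Returns None when the string does not match the assumed layout.
  def parse(containerId: String): Option[Ids] = containerId match {
    case ContainerId(clusterTs, appSeq, attempt, _) =>
      Some(Ids(s"application_${clusterTs}_$appSeq", attempt.toInt, containerId))
    case _ => None
  }

  // Reads CONTAINER_ID from the process environment (injectable for testing).
  def fromEnv(env: Map[String, String] = sys.env): Option[Ids] =
    env.get("CONTAINER_ID").flatMap(parse)

  // Hostname of the machine this JVM is running on.
  def hostname: String = java.net.InetAddress.getLocalHost.getHostName
}
```

With the example value above, `YarnIds.parse` would give application ID `application_1503994094228_0054` and attempt `1`. The resulting values could then be placed in the logging framework's diagnostic context so the layout can emit them with every statement; how that is wired up depends on the logging setup and is not covered here.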

Re: Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?

2017-08-28 Thread Mikhailau, Alex
Thanks, Vadim. The issue is not access to logs. I am able to view them. I have the CloudWatch Logs agent push logs to Elasticsearch and then into Kibana, using json-event-layout for log4j output. I would like to also log applicationId, executorId, etc. in those log statements for clarity. Is there an

Re: Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?

2017-08-28 Thread Vadim Semenov
When you create an EMR cluster you can specify an S3 path where logs will be saved after cluster, something like this: s3://bucket/j-18ASDKLJLAKSD/containers/application_1494074597524_0001/container_1494074597524_0001_01_01/stderr.gz

Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?

2017-08-28 Thread Mikhailau, Alex
Does anyone have a working solution for logging YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements? Are there specific environment variables available, or another workflow for doing that? Thank you, Alex