Each executor's Java process has some environment variables that you can use, for example:
> CONTAINER_ID=container_1503994094228_0054_01_000013

The executor ID gets passed as an argument to the process:

> /usr/lib/jvm/java-1.8.0/bin/java … --driver-url spark://CoarseGrainedScheduler@:38151 --executor-id 3 --hostname ip-1…

And it gets printed in the container log:

> 17/08/29 13:02:00 INFO Executor: Starting executor ID 3 on host …

On Mon, Aug 28, 2017 at 5:41 PM, Mikhailau, Alex <alex.mikhai...@mlb.com> wrote:

> Thanks, Vadim. The issue is not access to the logs; I am able to view them.
>
> I have the CloudWatch Logs agent push logs to Elasticsearch and then into Kibana, using json-event-layout for the log4j output. I would also like to log applicationId, executorId, etc. in those log statements for clarity. Is there an MDC way with Spark, or some other means to achieve this?
>
> Alex
>
> From: Vadim Semenov <vadim.seme...@datadoghq.com>
> Date: Monday, August 28, 2017 at 5:18 PM
> To: "Mikhailau, Alex" <alex.mikhai...@mlb.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?
>
> When you create an EMR cluster you can specify an S3 path where the logs will be saved after the cluster terminates, something like this:
>
> s3://bucket/j-18ASDKLJLAKSD/containers/application_1494074597524_0001/container_1494074597524_0001_01_000001/stderr.gz
>
> http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-manage-view-web-log-files.html
>
> On Mon, Aug 28, 2017 at 4:43 PM, Mikhailau, Alex <alex.mikhai...@mlb.com> wrote:
>
> > Does anyone have a working solution for logging YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements? Are there specific ENV variables available or another workflow for doing that?
> >
> > Thank you
> >
> > Alex
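To tie this back to the MDC question: a minimal sketch of deriving the YARN application ID from the CONTAINER_ID environment variable so it can be pushed into a log4j MDC. The class and method names here are illustrative, not a Spark or YARN API, and the parsing assumes the classic container-ID format shown above (container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>), not the newer epoch-prefixed form (container_eNN_…):

```java
public class YarnIds {
    // Derive "application_<clusterTimestamp>_<appSeq>" from a YARN container ID.
    // Assumes the classic format: container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>.
    public static String applicationId(String containerId) {
        String[] parts = containerId.split("_");
        // parts: ["container", clusterTimestamp, appSequence, attempt, containerSequence]
        return "application_" + parts[1] + "_" + parts[2];
    }

    public static void main(String[] args) {
        // On a real executor you would read System.getenv("CONTAINER_ID")
        // and put the values into log4j's MDC, e.g.:
        //   MDC.put("applicationId", applicationId(containerId));
        // so that a JSON layout (like json-event-layout) can emit them per log line.
        String containerId = "container_1503994094228_0054_01_000013";
        System.out.println(applicationId(containerId)); // prints application_1503994094228_0054
    }
}
```

Since MDC is thread-local, this would need to run once per executor JVM before logging starts (for example, early in a static initializer on the executor classpath); this is a sketch of the idea, not a drop-in solution.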