Each executor's Java process has some environment variables that you can use, for example:
> CONTAINER_ID=container_1503994094228_0054_01_000013

The executor ID gets passed as an argument to the process:

> /usr/lib/jvm/java-1.8.0/bin/java … --driver-url spark://CoarseGrainedScheduler@:38151 --executor-id 3 --hostname ip-1…

And it gets printed in the container log:

> 17/08/29 13:02:00 INFO Executor: Starting executor ID 3 on host …

On Mon, Aug 28, 2017 at 5:41 PM, Mikhailau, Alex <alex.mikhai...@mlb.com> wrote:

> Thanks, Vadim. The issue is not access to the logs; I am able to view them.
>
> I have the CloudWatch Logs agent push logs to Elasticsearch and then into Kibana, using json-event-layout for the log4j output. I would also like to log applicationId, executorId, etc. in those log statements for clarity. Is there an MDC way with Spark, or some other means to achieve this?
>
> Alex
>
> From: Vadim Semenov <vadim.seme...@datadoghq.com>
> Date: Monday, August 28, 2017 at 5:18 PM
> To: "Mikhailau, Alex" <alex.mikhai...@mlb.com>
> Cc: "user@spark.apache.org" <user@spark.apache.org>
> Subject: Re: Referencing YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements?
>
> When you create an EMR cluster you can specify an S3 path where the logs will be saved after the cluster terminates, something like this:
>
> s3://bucket/j-18ASDKLJLAKSD/containers/application_1494074597524_0001/container_1494074597524_0001_01_000001/stderr.gz
>
> http://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-manage-view-web-log-files.html
>
> On Mon, Aug 28, 2017 at 4:43 PM, Mikhailau, Alex <alex.mikhai...@mlb.com> wrote:
>
> > Does anyone have a working solution for logging YARN application id, YARN container hostname, Executor ID and YARN attempt for jobs running on Spark EMR 5.7.0 in log statements? Are there specific ENV variables available or another workflow for doing that?
> >
> > Thank you
> >
> > Alex
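To tie this back to the MDC question: a minimal sketch of deriving the YARN application ID from the CONTAINER_ID environment variable so it can be pushed into a log4j MDC. The class and method names here are illustrative, not a Spark or YARN API, and the parsing assumes the classic container-ID format shown above (container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>), not the newer epoch-prefixed form (container_eNN_…):

```java
public class YarnIds {
    // Derive "application_<clusterTimestamp>_<appSeq>" from a YARN container ID.
    // Assumes the classic format: container_<clusterTimestamp>_<appSeq>_<attempt>_<containerSeq>.
    public static String applicationId(String containerId) {
        String[] parts = containerId.split("_");
        // parts: ["container", clusterTimestamp, appSequence, attempt, containerSequence]
        return "application_" + parts[1] + "_" + parts[2];
    }

    public static void main(String[] args) {
        // On a real executor you would read System.getenv("CONTAINER_ID")
        // and put the values into log4j's MDC, e.g.:
        //   MDC.put("applicationId", applicationId(containerId));
        // so that a JSON layout (like json-event-layout) can emit them per log line.
        String containerId = "container_1503994094228_0054_01_000013";
        System.out.println(applicationId(containerId)); // prints application_1503994094228_0054
    }
}
```

Since MDC is thread-local, this would need to run once per executor JVM before logging starts (for example, early in a static initializer on the executor classpath); this is a sketch of the idea, not a drop-in solution.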