Hi Spark developers,

For Spark running on YARN, I would like to be able to find out, from the logs, 
the container where an executor is running. I haven't found a way to do this, 
not even with the Spark UI: neither the Executors tab nor the stage information 
page shows the container id. I was thinking of modifying the log messages in 
YarnAllocator so the executor id is logged on container start, as follows:

@@ -494,7 +494,8 @@ private[yarn] class YarnAllocator(
       val containerId = container.getId
       val executorId = executorIdCounter.toString
       assert(container.getResource.getMemory >= resource.getMemory)
-      logInfo(s"Launching container $containerId on host $executorHostname")
+      logInfo(s"Launching container $containerId on host $executorHostname " +
+        s"for executor with ID $executorId")

       def updateInternalState(): Unit = synchronized {
         numExecutorsRunning += 1
@@ -528,7 +529,8 @@ private[yarn] class YarnAllocator(
                 updateInternalState()
               } catch {
                 case NonFatal(e) =>
-                  logError(s"Failed to launch executor $executorId on container $containerId", e)
+                  logError(s"Failed to launch executor $executorId on container $containerId " +
+                    s"for executor with ID $executorId", e)
                   // Assigned container should be released immediately to avoid unnecessary resource
                   // occupation.
                   amClient.releaseAssignedContainer(containerId)

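With a change like this in place, the executor-to-container mapping could be 
recovered from the application master log with a one-liner like the sketch 
below. The sample log line (timestamp, logger prefix, container and host names) 
is made up for illustration, and the sed pattern assumes the message format 
proposed in the diff:

```shell
# Made-up example of a log line in the proposed format
line='18/05/30 10:12:03 INFO YarnAllocator: Launching container container_e06_000002 on host worker-1.example.com for executor with ID 1'

# Extract "executorId -> containerId"; the pattern assumes the proposed message wording
printf '%s\n' "$line" |
  sed -n 's/.*Launching container \([^ ]*\) on host [^ ]* for executor with ID \([^ ]*\).*/\2 -> \1/p'
# prints "1 -> container_e06_000002"
```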
Do you think this is a good idea, or is there a better way to achieve this?

Thanks in advance,

Juan
