Hadoop 2.4.0. Here are the relevant logs from executor 1136:

16/03/18 21:26:58 INFO mapred.SparkHadoopMapRedUtil: attempt_201603182126_0276_m_000484_0: Committed
16/03/18 21:26:58 INFO executor.Executor: Finished task 484.0 in stage 276.0 (TID 59663). 1080 bytes result sent to driver
16/03/18 21:38:18 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
16/03/18 21:38:18 INFO storage.DiskBlockManager: Shutdown hook called
16/03/18 21:38:18 INFO util.ShutdownHookManager: Shutdown hook called
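
For anyone who wants to quantify the pause in the executor log above, here is a minimal Python sketch that measures the time between the last finished task and the SIGTERM. It is only an illustration: the helper names `entry_times` and `largest_gap` are made up, and the pattern assumes every entry starts with the `yy/MM/dd HH:mm:ss` prefix seen in these logs.

```python
import re
from datetime import datetime

# Two entries copied from the executor 1136 log, one entry per line.
LOG = """\
16/03/18 21:26:58 INFO executor.Executor: Finished task 484.0 in stage 276.0 (TID 59663). 1080 bytes result sent to driver
16/03/18 21:38:18 ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL 15: SIGTERM
"""

# Assumption: each entry begins with a "yy/MM/dd HH:mm:ss" timestamp.
TIMESTAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) ")


def entry_times(log_text):
    """Return the parsed timestamp of every line that starts with one."""
    times = []
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S"))
    return times


def largest_gap(times):
    """Return the longest pause between consecutive entries as a timedelta."""
    return max(later - earlier for earlier, later in zip(times, times[1:]))


gap = largest_gap(entry_times(LOG))
print(gap)  # 0:11:20
```

The executor sat for 11 minutes 20 seconds after its last task before it received SIGTERM, which lines up with the 10-minute timeout described in the original message.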

On Fri, Mar 18, 2016 at 4:21 PM Ted Yu <yuzhih...@gmail.com> wrote:

> Which version of Hadoop do you use?
>
> bq. Requesting to kill executor(s) 1136
>
> Can you find more information on executor 1136?
>
> Thanks
>
> On Fri, Mar 18, 2016 at 4:16 PM, Nezih Yigitbasi <
> nyigitb...@netflix.com.invalid> wrote:
>
>> Hi Spark experts,
>> I am using Spark 1.5.2 on YARN with dynamic allocation enabled. I see in
>> the driver/application master logs that the app is marked as SUCCEEDED and
>> then SparkContext stop is called. However, this stop sequence takes more
>> than 10 minutes to complete, and the YARN resource manager kills the
>> application master because it didn't receive a heartbeat within the last
>> 10 minutes. Any ideas about what may be going on?
>>
>> Here are the relevant logs:
>>
>> *16/03/18 21:26:58 INFO yarn.ApplicationMaster: Final app status: SUCCEEDED, exitCode: 0*
>> *16/03/18 21:26:58 INFO spark.SparkContext: Invoking stop() from shutdown hook*
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static/sql,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/execution,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/SQL,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/api,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
>> 16/03/18 21:26:58 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
>> 16/03/18 21:26:58 INFO ui.SparkUI: Stopped Spark web UI at http://10.143.240.240:52706
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1135
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 208 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1135.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1135 because it has been idle for 60 seconds (new desired total will be 208)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1123
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 207 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1123.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1123 because it has been idle for 60 seconds (new desired total will be 207)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1117
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 206 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1117.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1117 because it has been idle for 60 seconds (new desired total will be 206)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1185
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 205 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1185.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1185 because it has been idle for 60 seconds (new desired total will be 205)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1152
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 204 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1152.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1152 because it has been idle for 60 seconds (new desired total will be 204)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1140
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 203 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1140.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1140 because it has been idle for 60 seconds (new desired total will be 203)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1149
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 202 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1149.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1149 because it has been idle for 60 seconds (new desired total will be 202)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1154
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 201 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1154.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1154 because it has been idle for 60 seconds (new desired total will be 201)
>> 16/03/18 21:27:58 INFO cluster.YarnClusterSchedulerBackend: Requesting to kill executor(s) 1136
>> 16/03/18 21:27:58 INFO yarn.YarnAllocator: Driver requested a total number of 200 executor(s).
>> 16/03/18 21:27:58 INFO yarn.ApplicationMaster$AMEndpoint: Driver requested to kill executor(s) 1136.
>> 16/03/18 21:27:58 INFO spark.ExecutorAllocationManager: Removing executor 1136 because it has been idle for 60 seconds (new desired total will be 200)
>> *16/03/18 21:38:17 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM*