Hi,
While loading the Ignite cache, our Spark job hung at this step.
One of the executor tasks has been running for hours. Below are the
logs from the executor that had the failure.
Stdout log:
Launch class org.apache.spark.executor.CoarseGrainedExecutorBackend by
calling
co.cask.cdap.app.runtime.spark.distributed.SparkContainerLauncher.launch
13:12:58.115 [main] INFO c.c.c.l.a.LogAppenderInitializer - Initializing
log appender KafkaLogAppender
13:12:58.679 [authorization-enforcement-service] INFO
c.c.c.s.a.AbstractAuthorizationService - Started authorization enforcement
service...
13:12:59.391 [main] INFO c.c.c.c.g.LocationRuntimeModule - HDFS namespace
is /project/ecpprodcdap
13:12:59.438 [main] INFO c.c.c.a.r.s.d.SparkContainerLauncher - Launch main
class
org.apache.spark.executor.CoarseGrainedExecutorBackend.main([--driver-url,
spark://CoarseGrainedScheduler@10.214.4.161:33947, --executor-id, 29,
--hostname, c893ach.ecom.bigdata.int.thomsonreuters.com, --cores, 5,
--app-id, application_1506331241975_7951, --user-class-path,
file:/data/7/yarn/nm/usercache/bigdata-app-ecplegalanalytics-svc/appcache/application_1506331241975_7951/container_e28_1506331241975_7951_01_30/__app__.jar])
13:12:59.501 [main] WARN c.c.c.i.a.Classes - Cannot patch method
obtainTokenForHiveMetastore in
org.apache.spark.deploy.yarn.YarnSparkHadoopUtil due to non-void return
type: (Lorg/apache/hadoop/conf/Configuration;)Lscala/Option;
13:12:59.501 [main] WARN c.c.c.i.a.Classes - Cannot patch method
obtainTokenForHBase in org.apache.spark.deploy.yarn.YarnSparkHadoopUtil due
to non-void return type:
(Lorg/apache/hadoop/conf/Configuration;)Lscala/Option;
13:13:26.130 [Executor task launch worker-0] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:26 INFO
dataloader.IgniteDataLoader: Starting the Ignite node on - 10.214.4.161
13:13:26.134 [Executor task launch worker-3] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:26 INFO
dataloader.IgniteDataLoader: Starting the Ignite node on - 10.214.4.161
13:13:26.134 [Executor task launch worker-2] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:26 INFO
dataloader.IgniteDataLoader: Starting the Ignite node on - 10.214.4.161
13:13:26.135 [Executor task launch worker-1] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:26 INFO
dataloader.IgniteDataLoader: Starting the Ignite node on - 10.214.4.161
13:13:26.135 [Executor task launch worker-4] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:26 INFO
dataloader.IgniteDataLoader: Starting the Ignite node on - 10.214.4.161
13:13:26.281 [Executor task launch worker-0] ERROR - Failed to resolve
default logging config file: config/java.util.logging.properties
13:13:26.283 [Executor task launch worker-0] WARN
o.a.s.e.CoarseGrainedExecutorBackend - Console logging handler is not
configured.
[13:13:26]    __________  ________________
[13:13:26]   /  _/ ___/ |/ /  _/_  __/ __/
[13:13:26]  _/ // (7 7    // /  / / / _/
[13:13:26] /___/\___/_/|_/___/ /_/ /___/
[13:13:26]
[13:13:26] ver. 1.8.0#20161205-sha1:9ca40dbe
[13:13:26] 2016 Copyright(C) Apache Software Foundation
[13:13:26]
[13:13:26] Ignite documentation: http://ignite.apache.org
[13:13:26]
[13:13:26] Quiet mode.
[13:13:26] ^-- To see **FULL** console log here add -DIGNITE_QUIET=false
or "-v" to ignite.{sh|bat}
[13:13:26]
[13:13:26] OS: Linux 3.10.0-514.16.1.el7.x86_64 amd64
[13:13:26] VM information: Java(TM) SE Runtime Environment 1.8.0_121-b13
Oracle Corporation Java HotSpot(TM) 64-Bit Server VM 25.121-b13
[13:13:26] Configured plugins:
[13:13:26] ^-- None
[13:13:26]
[13:13:26] Security status [authentication=off, tls/ssl=off]
[13:13:27] Topology snapshot [ver=3, servers=3, clients=0, CPUs=48,
heap=96.0GB]
[13:13:27] To start Console Management & Monitoring run
ignitevisorcmd.{sh|bat}
[13:13:27]
[13:13:27] Ignite node started OK (id=e98b003d,
grid=WCAGridapplication_1506331241975_7951)
[13:13:27] Topology snapshot [ver=2, servers=2, clients=0, CPUs=48,
heap=66.0GB]
13:13:27.660 [Executor task launch worker-0] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:27 INFO
dataloader.IgniteDataLoader: Started the Ignite node on - 10.214.4.161
13:13:27.660 [Executor task launch worker-2] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:27 INFO
dataloader.IgniteDataLoader: Started the Ignite node on - 10.214.4.161
13:13:27.661 [Executor task launch worker-3] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:27 INFO
dataloader.IgniteDataLoader: Started the Ignite node on - 10.214.4.161
13:13:27.661 [Executor task launch worker-1] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:27 INFO
dataloader.IgniteDataLoader: Started the Ignite node on - 10.214.4.161
13:13:27.661 [Executor task launch worker-4] WARN
o.a.s.e.CoarseGrainedExecutorBackend - 17/11/16 13:13:27 INFO
dataloader.IgniteDataLoader: Started the Ignite node on - 10.214.4.161
13:13:27.674 [Executor