Thank you so much, Marcelo! It WORKS!
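For anyone who finds this thread later: with the workaround Marcelo describes below, the spark-submit invocation takes roughly this shape. The htrace jar location here is illustrative (use wherever the jar lives on your nodes); the class and application jar names are the ones from this job:

    spark-submit --master yarn-cluster \
      --class com.gridsum.spark.wd.WdEtl \
      --driver-class-path /opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar \
      --conf spark.executor.extraClassPath=/opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar \
      spark-wd-etl-1.0-jar-with-dependencies.jar

The point is that both the driver and the executor JVMs need the htrace jar on the system class path, not just in "--jars".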
2015-05-21 2:05 GMT+08:00 Marcelo Vanzin <van...@cloudera.com>:

> Hello,
>
> Sorry for the delay. The issue you're running into is that most HBase
> classes are on the system class path, while jars added with "--jars" are
> only visible to the application class loader created by Spark, so classes
> on the system class path cannot see them.
>
> You can work around this by adding "--driver-class-path
> /opt/.../htrace-core-3.1.0-incubating.jar" and "--conf
> spark.executor.extraClassPath=/opt/.../htrace-core-3.1.0-incubating.jar"
> to your spark-submit command line. (You can also put those configs in your
> spark-defaults.conf to avoid having to type them all the time; and don't
> forget to include any other jars that might be needed.)
>
> On Mon, May 18, 2015 at 11:14 PM, Fengyun RAO <raofeng...@gmail.com> wrote:
>
>> Thanks, Marcelo!
>>
>> Below is the full log:
>>
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/avro-tools-1.7.6-cdh5.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 15/05/19 14:08:58 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
>> 15/05/19 14:08:59 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1432015548391_0003_000001
>> 15/05/19 14:09:00 INFO spark.SecurityManager: Changing view acls to: nobody,raofengyun
>> 15/05/19 14:09:00 INFO spark.SecurityManager: Changing modify acls to: nobody,raofengyun
>> 15/05/19 14:09:00 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nobody, raofengyun); users with modify permissions: Set(nobody, raofengyun)
>> 15/05/19 14:09:00 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
>> 15/05/19 14:09:00 INFO yarn.ApplicationMaster: Waiting for spark context initialization
>> 15/05/19 14:09:00 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
>> 15/05/19 14:09:00 INFO spark.SparkContext: Running Spark version 1.3.0
>> 15/05/19 14:09:00 INFO spark.SecurityManager: Changing view acls to: nobody,raofengyun
>> 15/05/19 14:09:00 INFO spark.SecurityManager: Changing modify acls to: nobody,raofengyun
>> 15/05/19 14:09:00 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nobody, raofengyun); users with modify permissions: Set(nobody, raofengyun)
>> 15/05/19 14:09:01 INFO slf4j.Slf4jLogger: Slf4jLogger started
>> 15/05/19 14:09:01 INFO Remoting: Starting remoting
>> 15/05/19 14:09:01 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@gs-server-v-127:7191]
>> 15/05/19 14:09:01 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@gs-server-v-127:7191]
>> 15/05/19 14:09:01 INFO util.Utils: Successfully started service 'sparkDriver' on port 7191.
>> 15/05/19 14:09:01 INFO spark.SparkEnv: Registering MapOutputTracker
>> 15/05/19 14:09:01 INFO spark.SparkEnv: Registering BlockManagerMaster
>> 15/05/19 14:09:01 INFO storage.DiskBlockManager: Created local directory at /data1/cdh/yarn/nm/usercache/raofengyun/appcache/application_1432015548391_0003/blockmgr-3250910b-693e-46ff-b057-26d552fd8abd
>> 15/05/19 14:09:01 INFO storage.MemoryStore: MemoryStore started with capacity 259.7 MB
>> 15/05/19 14:09:01 INFO spark.HttpFileServer: HTTP File server directory is /data1/cdh/yarn/nm/usercache/raofengyun/appcache/application_1432015548391_0003/httpd-5bc614bc-d8b1-473d-a807-4d9252eb679d
>> 15/05/19 14:09:01 INFO spark.HttpServer: Starting HTTP Server
>> 15/05/19 14:09:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
>> 15/05/19 14:09:01 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:9349
>> 15/05/19 14:09:01 INFO util.Utils: Successfully started service 'HTTP file server' on port 9349.
>> 15/05/19 14:09:01 INFO spark.SparkEnv: Registering OutputCommitCoordinator
>> 15/05/19 14:09:01 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
>> 15/05/19 14:09:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
>> 15/05/19 14:09:01 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:63023
>> 15/05/19 14:09:01 INFO util.Utils: Successfully started service 'SparkUI' on port 63023.
>> 15/05/19 14:09:01 INFO ui.SparkUI: Started SparkUI at http://gs-server-v-127:63023
>> 15/05/19 14:09:02 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
>> 15/05/19 14:09:02 INFO netty.NettyBlockTransferService: Server created on 33526
>> 15/05/19 14:09:02 INFO storage.BlockManagerMaster: Trying to register BlockManager
>> 15/05/19 14:09:02 INFO storage.BlockManagerMasterActor: Registering block manager gs-server-v-127:33526 with 259.7 MB RAM, BlockManagerId(<driver>, gs-server-v-127, 33526)
>> 15/05/19 14:09:02 INFO storage.BlockManagerMaster: Registered BlockManager
>> 15/05/19 14:09:02 INFO scheduler.EventLoggingListener: Logging events to hdfs://gs-server-v-127:8020/user/spark/applicationHistory/application_1432015548391_0003
>> 15/05/19 14:09:02 INFO yarn.ApplicationMaster: Listen to driver: akka.tcp://sparkDriver@gs-server-v-127:7191/user/YarnScheduler
>> 15/05/19 14:09:02 INFO cluster.YarnClusterSchedulerBackend: ApplicationMaster registered as Actor[akka://sparkDriver/user/YarnAM#1902752386]
>> 15/05/19 14:09:02 INFO client.RMProxy: Connecting to ResourceManager at gs-server-v-127/10.200.200.56:8030
>> 15/05/19 14:09:02 INFO yarn.YarnRMClient: Registering the ApplicationMaster
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Will request 2 executor containers, each with 1 cores and 4480 MB memory including 384 MB overhead
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:4480, vCores:1>)
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Container request (host: Any, capability: <memory:4480, vCores:1>)
>> 15/05/19 14:09:03 INFO yarn.ApplicationMaster: Started progress reporter thread - sleep time : 5000
>> 15/05/19 14:09:03 INFO impl.AMRMClientImpl: Received new token for : gs-server-v-127:8041
>> 15/05/19 14:09:03 INFO impl.AMRMClientImpl: Received new token for : gs-server-v-129:8041
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Launching container container_1432015548391_0003_01_000002 for on host gs-server-v-127
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@gs-server-v-127:7191/user/CoarseGrainedScheduler, executorHostname: gs-server-v-127
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Launching container container_1432015548391_0003_01_000003 for on host gs-server-v-129
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Starting Executor Container
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Launching ExecutorRunnable. driverUrl: akka.tcp://sparkDriver@gs-server-v-127:7191/user/CoarseGrainedScheduler, executorHostname: gs-server-v-129
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Starting Executor Container
>> 15/05/19 14:09:03 INFO yarn.YarnAllocator: Received 2 containers from YARN, launching executors on 2 of them.
>> 15/05/19 14:09:03 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
>> 15/05/19 14:09:03 INFO impl.ContainerManagementProtocolProxy: yarn.client.max-cached-nodemanagers-proxies : 0
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Setting up ContainerLaunchContext
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Setting up ContainerLaunchContext
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Preparing Local resources
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Preparing Local resources
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "gs-server-v-127" port: 8020 file: "/user/raofengyun/.sparkStaging/application_1432015548391_0003/spark-wd-etl-1.0-jar-with-dependencies.jar" } size: 10759465 timestamp: 1432015733920 type: FILE visibility: PRIVATE, htrace-core-3.1.0-incubating.jar -> resource { scheme: "hdfs" host: "gs-server-v-127" port: 8020 file: "/user/raofengyun/.sparkStaging/application_1432015548391_0003/htrace-core-3.1.0-incubating.jar" } size: 1475955 timestamp: 1432015734434 type: FILE visibility: PRIVATE)
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Prepared Local resources Map(__app__.jar -> resource { scheme: "hdfs" host: "gs-server-v-127" port: 8020 file: "/user/raofengyun/.sparkStaging/application_1432015548391_0003/spark-wd-etl-1.0-jar-with-dependencies.jar" } size: 10759465 timestamp: 1432015733920 type: FILE visibility: PRIVATE, htrace-core-3.1.0-incubating.jar -> resource { scheme: "hdfs" host: "gs-server-v-127" port: 8020 file: "/user/raofengyun/.sparkStaging/application_1432015548391_0003/htrace-core-3.1.0-incubating.jar" } size: 1475955 timestamp: 1432015734434 type: FILE visibility: PRIVATE)
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> {{PWD}}<CPS>/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/bin/../lib/hadoop/client/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark/conf/yarn-conf:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce//.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/flume-ng/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../parquet/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../avro/*:/opt/cloudera/parcels/GPLEXTRAS-5.2.0-1.cdh5.2.0.p0.20/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://gs-server-v-127:8042/node/containerlogs/container_1432015548391_0003_01_000002/raofengyun/stderr?start=0, SPARK_DIST_CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/bin/../lib/hadoop/client/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark/conf/yarn-conf:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce//.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/flume-ng/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../parquet/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../avro/*:/opt/cloudera/parcels/GPLEXTRAS-5.2.0-1.cdh5.2.0.p0.20/lib/hadoop/lib/*, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1432015548391_0003, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 10759465,1475955, SPARK_USER -> raofengyun, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1432015733920,1432015734434, SPARK_LOG_URL_STDOUT -> http://gs-server-v-127:8042/node/containerlogs/container_1432015548391_0003_01_000002/raofengyun/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://gs-server-v-127:8020/user/raofengyun/.sparkStaging/application_1432015548391_0003/spark-wd-etl-1.0-jar-with-dependencies.jar#__app__.jar,hdfs://gs-server-v-127:8020/user/raofengyun/.sparkStaging/application_1432015548391_0003/htrace-core-3.1.0-incubating.jar#htrace-core-3.1.0-incubating.jar)
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Setting up executor with environment: Map(CLASSPATH -> {{PWD}}<CPS>/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar<CPS>$HADOOP_CLIENT_CONF_DIR<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/*<CPS>$HADOOP_COMMON_HOME/lib/*<CPS>$HADOOP_HDFS_HOME/*<CPS>$HADOOP_HDFS_HOME/lib/*<CPS>$HADOOP_YARN_HOME/*<CPS>$HADOOP_YARN_HOME/lib/*<CPS>$HADOOP_MAPRED_HOME/*<CPS>$HADOOP_MAPRED_HOME/lib/*<CPS>$MR2_CLASSPATH<CPS>/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/bin/../lib/hadoop/client/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark/conf/yarn-conf:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce//.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/flume-ng/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../parquet/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../avro/*:/opt/cloudera/parcels/GPLEXTRAS-5.2.0-1.cdh5.2.0.p0.20/lib/hadoop/lib/*, SPARK_LOG_URL_STDERR -> http://gs-server-v-129:8042/node/containerlogs/container_1432015548391_0003_01_000003/raofengyun/stderr?start=0, SPARK_DIST_CLASSPATH -> /opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/bin/../lib/hadoop/client/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/spark/conf/yarn-conf:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/libexec/../../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce//.//*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hive/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/flume-ng/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../parquet/lib/*:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/../avro/*:/opt/cloudera/parcels/GPLEXTRAS-5.2.0-1.cdh5.2.0.p0.20/lib/hadoop/lib/*, SPARK_YARN_STAGING_DIR -> .sparkStaging/application_1432015548391_0003, SPARK_YARN_CACHE_FILES_FILE_SIZES -> 10759465,1475955, SPARK_USER -> raofengyun, SPARK_YARN_CACHE_FILES_VISIBILITIES -> PRIVATE,PRIVATE, SPARK_YARN_MODE -> true, SPARK_YARN_CACHE_FILES_TIME_STAMPS -> 1432015733920,1432015734434, SPARK_LOG_URL_STDOUT -> http://gs-server-v-129:8042/node/containerlogs/container_1432015548391_0003_01_000003/raofengyun/stdout?start=0, SPARK_YARN_CACHE_FILES -> hdfs://gs-server-v-127:8020/user/raofengyun/.sparkStaging/application_1432015548391_0003/spark-wd-etl-1.0-jar-with-dependencies.jar#__app__.jar,hdfs://gs-server-v-127:8020/user/raofengyun/.sparkStaging/application_1432015548391_0003/htrace-core-3.1.0-incubating.jar#htrace-core-3.1.0-incubating.jar)
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Setting up executor with commands: List(LD_LIBRARY_PATH="/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native:$LD_LIBRARY_PATH", {{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms4096m, -Xmx4096m, -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.shuffle.service.port=7337', '-Dspark.driver.port=7191', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@gs-server-v-127:7191/user/CoarseGrainedScheduler, --executor-id, 2, --hostname, gs-server-v-129, --cores, 1, --app-id, application_1432015548391_0003, --user-class-path, file:$PWD/__app__.jar, --user-class-path, file:$PWD/htrace-core-3.1.0-incubating.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
>> 15/05/19 14:09:03 INFO yarn.ExecutorRunnable: Setting up executor with commands: List(LD_LIBRARY_PATH="/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native:$LD_LIBRARY_PATH", {{JAVA_HOME}}/bin/java, -server, -XX:OnOutOfMemoryError='kill %p', -Xms4096m, -Xmx4096m, -Djava.io.tmpdir={{PWD}}/tmp, '-Dspark.shuffle.service.port=7337', '-Dspark.driver.port=7191', '-Dspark.ui.port=0', -Dspark.yarn.app.container.log.dir=<LOG_DIR>, org.apache.spark.executor.CoarseGrainedExecutorBackend, --driver-url, akka.tcp://sparkDriver@gs-server-v-127:7191/user/CoarseGrainedScheduler, --executor-id, 1, --hostname, gs-server-v-127, --cores, 1, --app-id, application_1432015548391_0003, --user-class-path, file:$PWD/__app__.jar, --user-class-path, file:$PWD/htrace-core-3.1.0-incubating.jar, 1>, <LOG_DIR>/stdout, 2>, <LOG_DIR>/stderr)
>> 15/05/19 14:09:03 INFO impl.ContainerManagementProtocolProxy: Opening proxy : gs-server-v-127:8041
>> 15/05/19 14:09:03 INFO impl.ContainerManagementProtocolProxy: Opening proxy : gs-server-v-129:8041
>> 15/05/19 14:09:07 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@gs-server-v-127:22773/user/Executor#-351658265] with ID 1
>> 15/05/19 14:09:07 INFO storage.BlockManagerMasterActor: Registering block manager gs-server-v-127:40594 with 2.1 GB RAM, BlockManagerId(1, gs-server-v-127, 40594)
>> 15/05/19 14:09:09 INFO cluster.YarnClusterSchedulerBackend: Registered executor: Actor[akka.tcp://sparkExecutor@gs-server-v-129:44560/user/Executor#-89679559] with ID 2
>> 15/05/19 14:09:09 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
>> 15/05/19 14:09:09 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
>> 15/05/19 14:09:09 INFO storage.BlockManagerMasterActor: Registering block manager gs-server-v-129:2745 with 2.1 GB RAM, BlockManagerId(2, gs-server-v-129, 2745)
>> 15/05/19 14:09:09 INFO storage.MemoryStore: ensureFreeSpace(285833) called with curMem=0, maxMem=272357130
>> 15/05/19 14:09:09 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 279.1 KB, free 259.5 MB)
>> 15/05/19 14:09:10 INFO storage.MemoryStore: ensureFreeSpace(22334) called with curMem=285833, maxMem=272357130
>> 15/05/19 14:09:10 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 21.8 KB, free 259.4 MB)
>> 15/05/19 14:09:10 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on gs-server-v-127:33526 (size: 21.8 KB, free: 259.7 MB)
>> 15/05/19 14:09:10 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
>> 15/05/19 14:09:10 INFO spark.SparkContext: Created broadcast 0 from newAPIHadoopRDD at WdEtl.scala:56
>> 15/05/19 14:09:10 INFO spark.SparkContext: Starting job: foreach at WdEtl.scala:74
>> 15/05/19 14:09:10 INFO input.FileInputFormat: Total input paths to process : 1
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Registering RDD 1 (flatMap at WdEtl.scala:62)
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Got job 0 (foreach at WdEtl.scala:74) with 4 output partitions (allowLocal=false)
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Final stage: Stage 1(foreach at WdEtl.scala:74)
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Parents of final stage: List(Stage 0)
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Missing parents: List(Stage 0)
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[1] at flatMap at WdEtl.scala:62), which has no missing parents
>> 15/05/19 14:09:10 INFO storage.MemoryStore: ensureFreeSpace(3928) called with curMem=308167, maxMem=272357130
>> 15/05/19 14:09:10 INFO storage.MemoryStore: Block broadcast_1 stored as values in memory (estimated size 3.8 KB, free 259.4 MB)
>> 15/05/19 14:09:10 INFO storage.MemoryStore: ensureFreeSpace(2212) called with curMem=312095, maxMem=272357130
>> 15/05/19 14:09:10 INFO storage.MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 2.2 KB, free 259.4 MB)
>> 15/05/19 14:09:10 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on gs-server-v-127:33526 (size: 2.2 KB, free: 259.7 MB)
>> 15/05/19 14:09:10 INFO storage.BlockManagerMaster: Updated info of block broadcast_1_piece0
>> 15/05/19 14:09:10 INFO spark.SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839
>> 15/05/19 14:09:10 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from Stage 0 (MapPartitionsRDD[1] at flatMap at WdEtl.scala:62)
>> 15/05/19 14:09:10 INFO cluster.YarnClusterScheduler: Adding task set 0.0 with 1 tasks
>> 15/05/19 14:09:10 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, gs-server-v-127, NODE_LOCAL, 1356 bytes)
>> 15/05/19 14:09:11 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in memory on gs-server-v-127:40594 (size: 2.2 KB, free: 2.1 GB)
>> 15/05/19 14:09:12 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on gs-server-v-127:40594 (size: 21.8 KB, free: 2.1 GB)
>> 15/05/19 14:10:38 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 87219 ms on gs-server-v-127 (1/1)
>> 15/05/19 14:10:38 INFO cluster.YarnClusterScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: Stage 0 (flatMap at WdEtl.scala:62) finished in 87.274 s
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: looking for newly runnable stages
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: running: Set()
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: waiting: Set(Stage 1)
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: failed: Set()
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: Missing parents for Stage 1: List()
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[3] at mapPartitionsWithIndex at WdEtl.scala:64), which is now runnable
>> 15/05/19 14:10:38 INFO storage.MemoryStore: ensureFreeSpace(4728) called with curMem=314307, maxMem=272357130
>> 15/05/19 14:10:38 INFO storage.MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.6 KB, free 259.4 MB)
>> 15/05/19 14:10:38 INFO storage.MemoryStore: ensureFreeSpace(2594) called with curMem=319035, maxMem=272357130
>> 15/05/19 14:10:38 INFO storage.MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 2.5 KB, free 259.4 MB)
>> 15/05/19 14:10:38 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on gs-server-v-127:33526 (size: 2.5 KB, free: 259.7 MB)
>> 15/05/19 14:10:38 INFO storage.BlockManagerMaster: Updated info of block broadcast_2_piece0
>> 15/05/19 14:10:38 INFO spark.SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:839
>> 15/05/19 14:10:38 INFO scheduler.DAGScheduler: Submitting 4 missing tasks from Stage 1 (MapPartitionsRDD[3] at mapPartitionsWithIndex at WdEtl.scala:64)
>> 15/05/19 14:10:38 INFO cluster.YarnClusterScheduler: Adding task set 1.0 with 4 tasks
>> 15/05/19 14:10:38 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, gs-server-v-129, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:38 INFO scheduler.TaskSetManager: Starting task 1.0 in stage 1.0 (TID 2, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:38 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on gs-server-v-127:40594 (size: 2.5 KB, free: 2.1 GB)
>> 15/05/19 14:10:38 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to sparkExecutor@gs-server-v-127:22773
>> 15/05/19 14:10:38 INFO spark.MapOutputTrackerMaster: Size of output statuses for shuffle 0 is 148 bytes
>> 15/05/19 14:10:38 INFO storage.BlockManagerInfo: Added broadcast_2_piece0 in memory on gs-server-v-129:2745 (size: 2.5 KB, free: 2.1 GB)
>> 15/05/19 14:10:38 INFO spark.MapOutputTrackerMasterActor: Asked to send map output locations for shuffle 0 to sparkExecutor@gs-server-v-129:44560
>> 15/05/19 14:10:40 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 1.0 (TID 3, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:40 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2, gs-server-v-127): java.io.IOException: java.lang.reflect.InvocationTargetException
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
>> at com.gridsum.spark.wd.SessionHandler.<init>(SessionHandler.scala:59)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:65)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:64)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>> at org.apache.spark.scheduler.Task.run(Task.scala:64)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>> ... 16 more
>> Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
>> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
>> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
>> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>> ... 21 more
>> Caused by: java.lang.ClassNotFoundException: org.apache.htrace.Trace
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> ... 27 more
>>
>> 15/05/19 14:10:41 INFO scheduler.TaskSetManager: Starting task 1.1 in stage 1.0 (TID 4, gs-server-v-129, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:41 INFO scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1) on executor gs-server-v-129: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 1]
>> 15/05/19 14:10:42 INFO scheduler.TaskSetManager: Starting task 0.1 in stage 1.0 (TID 5, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:42 INFO scheduler.TaskSetManager: Lost task 2.0 in stage 1.0 (TID 3) on executor gs-server-v-127: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 2]
>> 15/05/19 14:10:43 INFO scheduler.TaskSetManager: Starting task 2.1 in stage 1.0 (TID 6, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:43 INFO scheduler.TaskSetManager: Lost task 0.1 in stage 1.0 (TID 5) on executor gs-server-v-127: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 3]
>> 15/05/19 14:10:43 INFO scheduler.TaskSetManager: Starting task 0.2 in stage 1.0 (TID 7, gs-server-v-129, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:43 INFO scheduler.TaskSetManager: Lost task 1.1 in stage 1.0 (TID 4) on executor gs-server-v-129: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 4]
>> 15/05/19 14:10:44 INFO scheduler.TaskSetManager: Starting task 1.2 in stage 1.0 (TID 8, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:44 INFO scheduler.TaskSetManager: Lost task 2.1 in stage 1.0 (TID 6) on executor gs-server-v-127: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 5]
>> 15/05/19 14:10:45 INFO scheduler.TaskSetManager: Starting task 2.2 in stage 1.0 (TID 9, gs-server-v-129, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:45 INFO scheduler.TaskSetManager: Lost task 0.2 in stage 1.0 (TID 7) on executor gs-server-v-129: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 6]
>> 15/05/19 14:10:46 INFO scheduler.TaskSetManager: Starting task 0.3 in stage 1.0 (TID 10, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:46 INFO scheduler.TaskSetManager: Lost task 1.2 in stage 1.0 (TID 8) on executor gs-server-v-127: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 7]
>> 15/05/19 14:10:46 INFO scheduler.TaskSetManager: Starting task 1.3 in stage 1.0 (TID 11, gs-server-v-129, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:46 INFO scheduler.TaskSetManager: Lost task 2.2 in stage 1.0 (TID 9) on executor gs-server-v-129: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 8]
>> 15/05/19 14:10:47 INFO scheduler.TaskSetManager: Starting task 2.3 in stage 1.0 (TID 12, gs-server-v-127, PROCESS_LOCAL, 1056 bytes)
>> 15/05/19 14:10:47 INFO scheduler.TaskSetManager: Lost task 0.3 in stage 1.0 (TID 10) on executor gs-server-v-127: java.io.IOException (java.lang.reflect.InvocationTargetException) [duplicate 9]
>> 15/05/19 14:10:47 ERROR scheduler.TaskSetManager: Task 0 in stage 1.0 failed 4 times; aborting job
>> 15/05/19 14:10:47 INFO cluster.YarnClusterScheduler: Cancelling stage 1
>> 15/05/19 14:10:47 INFO cluster.YarnClusterScheduler: Stage 1 was cancelled
>> 15/05/19 14:10:47 INFO scheduler.DAGScheduler: Job 0 failed: foreach at WdEtl.scala:74, took 96.765394 s
>> 15/05/19 14:10:47 ERROR yarn.ApplicationMaster: User class threw exception: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 10, gs-server-v-127): java.io.IOException: java.lang.reflect.InvocationTargetException
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
>> at com.gridsum.spark.wd.SessionHandler.<init>(SessionHandler.scala:59)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:65)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:64)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>> at org.apache.spark.scheduler.Task.run(Task.scala:64)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>> ... 16 more
>> Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
>> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
>> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
>> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>> ... 21 more
>>
>> Driver stacktrace:
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 10, gs-server-v-127): java.io.IOException: java.lang.reflect.InvocationTargetException
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
>> at com.gridsum.spark.wd.SessionHandler.<init>(SessionHandler.scala:59)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:65)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:64)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>> at org.apache.spark.scheduler.Task.run(Task.scala:64)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>> ... 16 more
>> Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
>> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
>> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
>> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>> ... 21 more
>>
>> Driver stacktrace:
>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1203)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1191)
>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1191)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
>> at scala.Option.foreach(Option.scala:236)
>> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
>> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
>> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
>> 15/05/19 14:10:47 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: Job aborted due to stage failure: Task 0 in stage 1.0 failed 4 times, most recent failure: Lost task 0.3 in stage 1.0 (TID 10, gs-server-v-127): java.io.IOException: java.lang.reflect.InvocationTargetException
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:119)
>> at com.gridsum.spark.wd.SessionHandler.<init>(SessionHandler.scala:59)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:65)
>> at com.gridsum.spark.wd.WdEtl$$anonfun$main$3.apply(WdEtl.scala:64)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:647)
>> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
>> at org.apache.spark.scheduler.Task.run(Task.scala:64)
>> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.reflect.InvocationTargetException
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>> at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>> at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
>> ... 16 more
>> Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
>> at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
>> at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
>> at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
>> at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
>> at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
>> ... 21 more
>>
>> Driver stacktrace:)
>> 15/05/19 14:10:47 INFO yarn.ApplicationMaster: Invoking sc stop from shutdown hook
>> 15/05/19 14:10:47 WARN scheduler.TaskSetManager: Lost task 2.3 in stage 1.0 (TID 12, gs-server-v-127): TaskKilled (killed intentionally)
>> 15/05/19 14:10:47 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
>> 15/05/19 14:10:47 WARN scheduler.TaskSetManager: Lost task 1.3 in stage 1.0 (TID 11, gs-server-v-129): TaskKilled (killed intentionally)
>> 15/05/19 14:10:47 INFO cluster.YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
>> 15/05/19 14:10:47 INFO handler.ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
>> 15/05/19 14:10:47 INFO ui.SparkUI: Stopped Spark web UI at http://gs-server-v-127:63023
>> 15/05/19 14:10:47 INFO scheduler.DAGScheduler: Stopping DAGScheduler
>> 15/05/19 14:10:47 INFO cluster.YarnClusterSchedulerBackend: Shutting down all executors
>> 15/05/19 14:10:47 INFO cluster.YarnClusterSchedulerBackend: Asking each executor to shut down
>> 15/05/19 14:10:47 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped!
>> 15/05/19 14:10:47 INFO spark.MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
>> 15/05/19 14:10:47 INFO storage.MemoryStore: MemoryStore cleared
>> 15/05/19 14:10:47 INFO storage.BlockManager: BlockManager stopped
>> 15/05/19 14:10:47 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
>> 15/05/19 14:10:47 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
>> 15/05/19 14:10:47 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
>> 15/05/19 14:10:47 INFO spark.SparkContext: Successfully stopped SparkContext
>>
>> 2015-05-19 1:12 GMT+08:00 Marcelo Vanzin <van...@cloudera.com>:
>>
>>> On Sun, May 17, 2015 at 3:53 PM, Wilfred Spiegelenburg
>>> <wspiegelenb...@cloudera.com> wrote:
>>>
>>>> When you run the driver in the cluster, the application really runs
>>>> from the cluster and the client goes away. If the driver does not have
>>>> access to the jars, i.e. if they are not available somewhere on the
>>>> cluster, this will happen.
>>>
>>> That's not true. Files specified in "--jars" and "--files" are uploaded
>>> to the cluster before the app starts (unless they have the "local:"
>>> prefix). The visible effect on the configuration is that these files will
>>> show up in "spark.yarn.secondary.jars", as Fengyun mentioned in one of
>>> his messages.
>>>
>>> Fengyun, would you mind sharing more than just a partial stack trace?
>>> e.g., the full driver logs would help in figuring out what's going on
>>> with that file.
>>>
>>> --
>>> Marcelo
>
> --
> Marcelo
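For completeness, the spark-defaults.conf equivalent of the flags discussed above would be along these lines. Note that "--driver-class-path" maps to the spark.driver.extraClassPath property; the jar location is again illustrative, and multiple jars can be joined with ':' on Linux:

    spark.driver.extraClassPath    /opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar
    spark.executor.extraClassPath  /opt/cloudera/parcels/CDH/jars/htrace-core-3.1.0-incubating.jar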