Re: --jars works in yarn-client but not yarn-cluster mode, why?
Hello,

Sorry for the delay. The issue you're running into is that most HBase classes are in the system class path, while jars added with --jars are only visible to the application class loader created by Spark, so classes in the system class path cannot see them.

You can work around this by adding --driver-class-path /opt/.../htrace-core-3.1.0-incubating.jar and --conf spark.executor.extraClassPath=/opt/.../htrace-core-3.1.0-incubating.jar to your spark-submit command line. (You can also put those configs in your spark-defaults.conf to avoid having to type them all the time; and don't forget to include any other jars that might be needed.)

On Mon, May 18, 2015 at 11:14 PM, Fengyun RAO raofeng...@gmail.com wrote:

Thanks, Marcelo! Below is the full log.
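For reference, the workaround above can also be expressed as a spark-defaults.conf fragment rather than command-line flags (this is a sketch: the path is the CDH parcel location quoted later in this thread; adjust it to your install, and append any other jars your job needs, separated by ':'):

```properties
# Put the htrace jar on the system class path of both the driver and the
# executors, where the HBase classes can see it. --driver-class-path maps
# to spark.driver.extraClassPath; --conf sets spark.executor.extraClassPath.
spark.driver.extraClassPath    /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
spark.executor.extraClassPath  /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar
```

Unlike --jars, these settings prepend entries to the JVM class path itself, so classes loaded by the system class loader can resolve them.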
Re: --jars works in yarn-client but not yarn-cluster mode, why?
Thank you so much, Marcelo! It WORKS!

2015-05-21 2:05 GMT+08:00 Marcelo Vanzin van...@cloudera.com:

The issue you're running into is because most HBase classes are in the system class path, while jars added with --jars are only visible to the application class loader created by Spark. So classes in the system class path cannot see them.
Re: --jars works in yarn-client but not yarn-cluster mode, why?
Thanks, Marcelo! Below is the full log:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/jars/avro-tools-1.7.6-cdh5.4.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/05/19 14:08:58 INFO yarn.ApplicationMaster: Registered signal handlers for [TERM, HUP, INT]
15/05/19 14:08:59 INFO yarn.ApplicationMaster: ApplicationAttemptId: appattempt_1432015548391_0003_01
15/05/19 14:09:00 INFO spark.SecurityManager: Changing view acls to: nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: Changing modify acls to: nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nobody, raofengyun); users with modify permissions: Set(nobody, raofengyun)
15/05/19 14:09:00 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
15/05/19 14:09:00 INFO yarn.ApplicationMaster: Waiting for spark context initialization
15/05/19 14:09:00 INFO yarn.ApplicationMaster: Waiting for spark context initialization ...
15/05/19 14:09:00 INFO spark.SparkContext: Running Spark version 1.3.0
15/05/19 14:09:00 INFO spark.SecurityManager: Changing view acls to: nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: Changing modify acls to: nobody,raofengyun
15/05/19 14:09:00 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nobody, raofengyun); users with modify permissions: Set(nobody, raofengyun)
15/05/19 14:09:01 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/05/19 14:09:01 INFO Remoting: Starting remoting
15/05/19 14:09:01 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@gs-server-v-127:7191]
15/05/19 14:09:01 INFO Remoting: Remoting now listens on addresses: [akka.tcp://sparkDriver@gs-server-v-127:7191]
15/05/19 14:09:01 INFO util.Utils: Successfully started service 'sparkDriver' on port 7191.
15/05/19 14:09:01 INFO spark.SparkEnv: Registering MapOutputTracker
15/05/19 14:09:01 INFO spark.SparkEnv: Registering BlockManagerMaster
15/05/19 14:09:01 INFO storage.DiskBlockManager: Created local directory at /data1/cdh/yarn/nm/usercache/raofengyun/appcache/application_1432015548391_0003/blockmgr-3250910b-693e-46ff-b057-26d552fd8abd
15/05/19 14:09:01 INFO storage.MemoryStore: MemoryStore started with capacity 259.7 MB
15/05/19 14:09:01 INFO spark.HttpFileServer: HTTP File server directory is /data1/cdh/yarn/nm/usercache/raofengyun/appcache/application_1432015548391_0003/httpd-5bc614bc-d8b1-473d-a807-4d9252eb679d
15/05/19 14:09:01 INFO spark.HttpServer: Starting HTTP Server
15/05/19 14:09:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/05/19 14:09:01 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:9349
15/05/19 14:09:01 INFO util.Utils: Successfully started service 'HTTP file server' on port 9349.
15/05/19 14:09:01 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/05/19 14:09:01 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
15/05/19 14:09:01 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/05/19 14:09:01 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:63023
15/05/19 14:09:01 INFO util.Utils: Successfully started service 'SparkUI' on port 63023.
15/05/19 14:09:01 INFO ui.SparkUI: Started SparkUI at http://gs-server-v-127:63023
15/05/19 14:09:02 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
15/05/19 14:09:02 INFO netty.NettyBlockTransferService: Server created on 33526
15/05/19 14:09:02 INFO storage.BlockManagerMaster: Trying to register BlockManager
15/05/19 14:09:02 INFO storage.BlockManagerMasterActor: Registering block manager gs-server-v-127:33526 with 259.7 MB RAM, BlockManagerId(driver, gs-server-v-127, 33526)
15/05/19 14:09:02 INFO storage.BlockManagerMaster: Registered BlockManager
15/05/19 14:09:02 INFO scheduler.EventLoggingListener: Logging events to hdfs://gs-server-v-127:8020/user/spark/applicationHistory/application_1432015548391_0003
15/05/19 14:09:02 INFO yarn.ApplicationMaster: Listen to driver: akka.tcp://sparkDriver@gs-server-v-127:7191/user/YarnScheduler
15/05/19 14:09:02 INFO cluster.YarnClusterSchedulerBackend: ApplicationMaster registered as Actor[akka://sparkDriver/user/YarnAM#1902752386]
15/05/19 14:09:02 INFO client.RMProxy: Connecting to ResourceManager at gs-server-v-127/10.200.200.56:8030
15/05/19 14:09:02 INFO yarn.YarnRMClient: Registering the ApplicationMaster
15/05/19 14:09:03 INFO yarn.YarnAllocator: Will request 2 executor containers, each with 1 cores and 4480 MB memory
Re: --jars works in yarn-client but not yarn-cluster mode, why?
Thanks, Wilfred. In our program, the htrace-core-3.1.0-incubating.jar dependency is only required in the executors, not in the driver, and in both yarn-client and yarn-cluster mode the executors run in the cluster. What's more, in yarn-cluster mode the jar IS listed in spark.yarn.secondary.jars, yet it still throws a ClassNotFoundException.

2015-05-14 18:52 GMT+08:00 Wilfred Spiegelenburg wspiegelenb...@cloudera.com:

In yarn-cluster mode the driver runs in the cluster and not locally in the spark-submit JVM. This changes what is available on your classpath. It looks like you are running into a similar situation as described in SPARK-5377.

Wilfred

--
Wilfred Spiegelenburg
Backline Customer Operations Engineer YARN/MapReduce/Spark
http://www.cloudera.com
http://five.sentenc.es

---
You received this message because you are subscribed to the Google Groups CDH Users group. To unsubscribe from this group and stop receiving emails from it, send an email to cdh-user+unsubscr...@cloudera.org. For more options, visit https://groups.google.com/a/cloudera.org/d/optout.
Re: --jars works in yarn-client but not yarn-cluster mode, why?
I looked into the Environment in both modes.

yarn-client:
spark.jars  local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,file:/home/xxx/my-app.jar

yarn-cluster:
spark.yarn.secondary.jars  local:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar

I wonder why htrace is listed in spark.yarn.secondary.jars but still not found by the URLClassLoader. I tried both the local: and file: schemes for the jar, with the same error.

2015-05-14 11:37 GMT+08:00 Fengyun RAO raofeng...@gmail.com:

Hadoop version: CDH 5.4. We need to connect to HBase, and thus need the extra /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar dependency.
--jars works in yarn-client but not yarn-cluster mode, why?
Hadoop version: CDH 5.4. We need to connect to HBase, and thus need the extra /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar dependency.

It works in yarn-client mode:

spark-submit --class xxx.xxx.MyApp --master yarn-client --num-executors 10 --executor-memory 10g --jars /opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core-3.1.0-incubating.jar my-app.jar /input /output

However, if we change yarn-client to yarn-cluster, it throws a ClassNotFoundException (even though the class exists in htrace-core-3.1.0-incubating.jar):

Caused by: java.lang.NoClassDefFoundError: org/apache/htrace/Trace
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:218)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.checkExists(ZKUtil.java:481)
    at org.apache.hadoop.hbase.zookeeper.ZKClusterId.readClusterIdZNode(ZKClusterId.java:65)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:86)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:850)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:635)
    ... 21 more
Caused by: java.lang.ClassNotFoundException: org.apache.htrace.Trace
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)

Why doesn't --jars work in yarn-cluster mode? How do we add an extra dependency in yarn-cluster mode?
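Note the second frame of the trace: it is sun.misc.Launcher$AppClassLoader, the system class loader, doing the lookup. The minimal sketch below (not Spark code; class names are made up for illustration, and it requires a JDK for the runtime compilation step) reproduces the effect: class-loader delegation only goes from child to parent, so a class visible to a child URLClassLoader, like the one Spark builds for --jars, is invisible to the system class loader that loaded HBase.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class LoaderDemo {
    public static void main(String[] args) throws Exception {
        // Compile a throwaway class into a temp directory that is NOT on
        // the JVM's class path (stand-in for a jar passed via --jars).
        Path dir = Files.createTempDirectory("loaderdemo");
        Path src = dir.resolve("Hidden.java");
        Files.write(src, "public class Hidden {}".getBytes("UTF-8"));
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
        javac.run(null, null, null, src.toString());

        // A child loader over that directory resolves the class fine...
        try (URLClassLoader child = new URLClassLoader(
                new URL[] { dir.toUri().toURL() },
                ClassLoader.getSystemClassLoader())) {
            System.out.println("child loader: " + child.loadClass("Hidden").getName());
        }

        // ...but the system class loader cannot see it, because parents
        // never delegate downward to children. HBase classes, loaded by
        // the system loader, hit exactly this wall with --jars.
        try {
            Class.forName("Hidden", true, ClassLoader.getSystemClassLoader());
            System.out.println("system loader: loaded (unexpected)");
        } catch (ClassNotFoundException e) {
            System.out.println("system loader: ClassNotFoundException");
        }
    }
}
```

This is why moving the jar onto the system class path (extraClassPath) works where --jars does not: it puts the class where the parent loader, and therefore everyone below it, can find it.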