[ https://issues.apache.org/jira/browse/LIVY-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alex Savitsky updated LIVY-447:
-------------------------------
Description:

Job was launched using the following request:

POST /batches
{code:java}
{
  "file": "/user/alsavits/oats/oats-spark-controls-exec.jar",
  "className": "com.rbc.rbccm.regops.controls.CompletenessControl",
  "args": ["ET", "20171229"]
}{code}

Relevant Livy config:
{code:java}
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
{code}

Livy log:
{code:java}
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Requesting a new application from cluster with 8 NodeManagers
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (174080 MB per container)
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Setting up container launch context for our AM
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Setting up the launch environment for our AM container
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Preparing resources for our AM container
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO YarnSparkHadoopUtil: getting token for namenode: hdfs://guedlpahdp001.devfg.rbc.com:8020/user/DRLB0SRVCTRLRW/.sparkStaging/application_1520006903702_0110
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 73717 for DRLB0SRVCTRLRW on 10.61.34.124:8020
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO metastore: Trying to connect to metastore with URI thrift://guedlpahdp002.devfg.rbc.com:9083
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO metastore: Connected to metastore.
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO YarnSparkHadoopUtil: HBase class not found java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs:///hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Source and destination file systems are the same. Not copying hdfs:/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Source and destination file systems are the same. Not copying hdfs://guedlpahdp001.devfg.rbc.com:8020/user/alsavits/oats/oats-spark-controls-exec.jar
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Uploading resource file:/tmp/spark-91b652aa-704d-4986-9435-cbd369c62f71/__spark_conf__2946776456430567485.zip -> hdfs://guedlpahdp001.devfg.rbc.com:8020/user/DRLB0SRVCTRLRW/.sparkStaging/application_1520006903702_0110/__spark_conf__.zip
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing view acls to: drlb0ots,DRLB0SRVCTRLRW
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing modify acls to: drlb0ots,DRLB0SRVCTRLRW
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing view acls groups to:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing modify acls groups to:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(drlb0ots, DRLB0SRVCTRLRW); groups with view permissions: Set(); users with modify permissions: Set(drlb0ots, DRLB0SRVCTRLRW); groups with modify permissions: Set()
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client: Submitting application application_1520006903702_0110 to ResourceManager
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO YarnClientImpl: Submitted application application_1520006903702_0110
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client: Application report for application_1520006903702_0110 (state: ACCEPTED)
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: client token: Token { kind: YARN_CLIENT_TOKEN, service: }
18/03/05 12:03:42 INFO LineBufferedStream: stdout: diagnostics: AM container is launched, waiting for AM container to Register with RM
18/03/05 12:03:42 INFO LineBufferedStream: stdout: ApplicationMaster host: N/A
18/03/05 12:03:42 INFO LineBufferedStream: stdout: ApplicationMaster RPC port: -1
18/03/05 12:03:42 INFO LineBufferedStream: stdout: queue: default
18/03/05 12:03:42 INFO LineBufferedStream: stdout: start time: 1520269422040
18/03/05 12:03:42 INFO LineBufferedStream: stdout: final status: UNDEFINED
18/03/05 12:03:42 INFO LineBufferedStream: stdout: tracking URL: http://guedlpahdp001.devfg.rbc.com:8088/proxy/application_1520006903702_0110/
18/03/05 12:03:42 INFO LineBufferedStream: stdout: user: DRLB0SRVCTRLRW
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO ShutdownHookManager: Shutdown hook called
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO ShutdownHookManager: Deleting directory /tmp/spark-91b652aa-704d-4986-9435-cbd369c62f71
1
{code}

Later, when the created batch is queried:

GET /batches/2
{code:json}
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Requesting a new application from cluster with 8 NodeManagers
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (174080 MB per container)
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Setting up container launch context for our AM
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Setting up the launch environment for our AM container
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Preparing resources for our AM container
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO YarnSparkHadoopUtil: getting token for namenode: hdfs://guedlpahdp001.devfg.rbc.com:8020/user/DRLB0SRVCTRLRW/.sparkStaging/application_1520006903702_0110
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 73717 for DRLB0SRVCTRLRW on 10.61.34.124:8020
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO metastore: Trying to connect to metastore with URI thrift://guedlpahdp002.devfg.rbc.com:9083
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO metastore: Connected to metastore.
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO YarnSparkHadoopUtil: HBase class not found java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs:///hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Source and destination file systems are the same. Not copying hdfs:/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Source and destination file systems are the same. Not copying hdfs://guedlpahdp001.devfg.rbc.com:8020/user/alsavits/oats/oats-spark-controls-exec.jar
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Uploading resource file:/tmp/spark-91b652aa-704d-4986-9435-cbd369c62f71/__spark_conf__2946776456430567485.zip -> hdfs://guedlpahdp001.devfg.rbc.com:8020/user/DRLB0SRVCTRLRW/.sparkStaging/application_1520006903702_0110/__spark_conf__.zip
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing view acls to: drlb0ots,DRLB0SRVCTRLRW
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing modify acls to: drlb0ots,DRLB0SRVCTRLRW
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing view acls groups to:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing modify acls groups to:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(drlb0ots, DRLB0SRVCTRLRW); groups with view permissions: Set(); users with modify permissions: Set(drlb0ots, DRLB0SRVCTRLRW); groups with modify permissions: Set()
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client: Submitting application application_1520006903702_0110 to ResourceManager
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO YarnClientImpl: Submitted application application_1520006903702_0110
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client: Application report for application_1520006903702_0110 (state: ACCEPTED)
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: client token: Token { kind: YARN_CLIENT_TOKEN, service: }
18/03/05 12:03:42 INFO LineBufferedStream: stdout: diagnostics: AM container is launched, waiting for AM container to Register with RM
18/03/05 12:03:42 INFO LineBufferedStream: stdout: ApplicationMaster host: N/A
18/03/05 12:03:42 INFO LineBufferedStream: stdout: ApplicationMaster RPC port: -1
18/03/05 12:03:42 INFO LineBufferedStream: stdout: queue: default
18/03/05 12:03:42 INFO LineBufferedStream: stdout: start time: 1520269422040
18/03/05 12:03:42 INFO LineBufferedStream: stdout: final status: UNDEFINED
18/03/05 12:03:42 INFO LineBufferedStream: stdout: tracking URL: http://guedlpahdp001.devfg.rbc.com:8088/proxy/application_1520006903702_0110/
18/03/05 12:03:42 INFO LineBufferedStream: stdout: user: DRLB0SRVCTRLRW
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO ShutdownHookManager: Shutdown hook called
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO ShutdownHookManager: Deleting directory /tmp/spark-91b652aa-704d-4986-9435-cbd369c62f71
1
{code}

However, when hitting the tracking URL (shown in the Livy log above), the application status is RUNNING, and it later completes successfully in 2-3 minutes' time.
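The submit-then-query sequence described above can be reproduced against any Livy server with two plain REST calls. Below is a minimal polling sketch; the endpoint http://localhost:8998 is a placeholder (not from the report), the payload is the one from the POST /batches request, and the state names are Livy's documented batch session states:

```python
import json
import time
import urllib.request

LIVY_URL = "http://localhost:8998"  # placeholder host, adjust for your server


def build_batch_request():
    # Payload taken verbatim from the POST /batches request in the report.
    return {
        "file": "/user/alsavits/oats/oats-spark-controls-exec.jar",
        "className": "com.rbc.rbccm.regops.controls.CompletenessControl",
        "args": ["ET", "20171229"],
    }


def is_terminal(state):
    # Livy session states that will never change again; polling can stop here.
    return state in {"success", "dead", "killed", "error"}


def submit_and_poll(url=LIVY_URL, interval=10):
    """Submit the batch, then poll GET /batches/{id} until a terminal state."""
    req = urllib.request.Request(
        url + "/batches",
        data=json.dumps(build_batch_request()).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        batch = json.load(resp)
    while not is_terminal(batch.get("state", "")):
        time.sleep(interval)
        with urllib.request.urlopen(f"{url}/batches/{batch['id']}") as resp:
            batch = json.load(resp)
    return batch


if __name__ == "__main__":
    print(submit_and_poll())
```

Note that, per this bug, the state returned by GET /batches/{id} may read "dead" even while the YARN tracking URL shows the application as RUNNING, so cross-checking against the ResourceManager is advisable before treating the batch as failed.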
was:

Job was launched using the following request:

POST /batches
{code:java}
{
  "file": "/user/alsavits/oats/oats-spark-controls-exec.jar",
  "className": "com.rbc.rbccm.regops.controls.CompletenessControl",
  "args": ["ET", "20171229"]
}{code}

Relevant Livy config:
{code:java}
livy.spark.master = yarn
livy.spark.deploy-mode = cluster
{code}

Livy log:
{code:java}
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Requesting a new application from cluster with 8 NodeManagers
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (174080 MB per container)
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Setting up container launch context for our AM
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Setting up the launch environment for our AM container
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO Client: Preparing resources for our AM container
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO YarnSparkHadoopUtil: getting token for namenode: hdfs://guedlpahdp001.devfg.rbc.com:8020/user/DRLB0SRVCTRLRW/.sparkStaging/application_1520006903702_0110
18/03/05 12:03:39 INFO LineBufferedStream: stdout: 18/03/05 12:03:39 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 73717 for DRLB0SRVCTRLRW on 10.61.34.124:8020
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO metastore: Trying to connect to metastore with URI thrift://guedlpahdp002.devfg.rbc.com:9083
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO metastore: Connected to metastore.
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO YarnSparkHadoopUtil: HBase class not found java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs:///hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Source and destination file systems are the same. Not copying hdfs:/hdp/apps/2.6.3.0-235/spark2/spark2-hdp-yarn-archive.tar.gz
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Source and destination file systems are the same. Not copying hdfs://guedlpahdp001.devfg.rbc.com:8020/user/alsavits/oats/oats-spark-controls-exec.jar
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO Client: Uploading resource file:/tmp/spark-91b652aa-704d-4986-9435-cbd369c62f71/__spark_conf__2946776456430567485.zip -> hdfs://guedlpahdp001.devfg.rbc.com:8020/user/DRLB0SRVCTRLRW/.sparkStaging/application_1520006903702_0110/__spark_conf__.zip
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing view acls to: drlb0ots,DRLB0SRVCTRLRW
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing modify acls to: drlb0ots,DRLB0SRVCTRLRW
18/03/05 12:03:41 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing view acls groups to:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:41 INFO SecurityManager: Changing modify acls groups to:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(drlb0ots, DRLB0SRVCTRLRW); groups with view permissions: Set(); users with modify permissions: Set(drlb0ots, DRLB0SRVCTRLRW); groups with modify permissions: Set()
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client: Submitting application application_1520006903702_0110 to ResourceManager
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO YarnClientImpl: Submitted application application_1520006903702_0110
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client: Application report for application_1520006903702_0110 (state: ACCEPTED)
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO Client:
18/03/05 12:03:42 INFO LineBufferedStream: stdout: client token: Token { kind: YARN_CLIENT_TOKEN, service: }
18/03/05 12:03:42 INFO LineBufferedStream: stdout: diagnostics: AM container is launched, waiting for AM container to Register with RM
18/03/05 12:03:42 INFO LineBufferedStream: stdout: ApplicationMaster host: N/A
18/03/05 12:03:42 INFO LineBufferedStream: stdout: ApplicationMaster RPC port: -1
18/03/05 12:03:42 INFO LineBufferedStream: stdout: queue: default
18/03/05 12:03:42 INFO LineBufferedStream: stdout: start time: 1520269422040
18/03/05 12:03:42 INFO LineBufferedStream: stdout: final status: UNDEFINED
18/03/05 12:03:42 INFO LineBufferedStream: stdout: tracking URL: http://guedlpahdp001.devfg.rbc.com:8088/proxy/application_1520006903702_0110/
18/03/05 12:03:42 INFO LineBufferedStream: stdout: user: DRLB0SRVCTRLRW
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO ShutdownHookManager: Shutdown hook called
18/03/05 12:03:42 INFO LineBufferedStream: stdout: 18/03/05 12:03:42 INFO ShutdownHookManager: Deleting directory /tmp/spark-91b652aa-704d-4986-9435-cbd369c62f71
1
{code}

Later, when the created batch is queried:

GET /batches/2
{code:java}
Unknown macro: { "id"}
, "log": [
  "stdout: ",
  "ls: cannot access /usr/hdp/2.6.3.0-235/hadoop/lib: No such file or directory",
  "\nstderr: ",
  "\nYARN Diagnostics: ",
  "java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found",
  "org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
   org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
   org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
   org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
   org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
   org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
   org.apache.livy.utils.SparkYarnApp$.yarnClient$lzycompute(SparkYarnApp.scala:51)
   org.apache.livy.utils.SparkYarnApp$.yarnClient(SparkYarnApp.scala:48)
   org.apache.livy.utils.SparkYarnApp$.$lessinit$greater$default$6(SparkYarnApp.scala:119)
   org.apache.livy.utils.SparkApp$$anonfun$create$1.apply(SparkApp.scala:91)
   org.apache.livy.utils.SparkApp$$anonfun$create$1.apply(SparkApp.scala:91)
   org.apache.livy.utils.SparkYarnApp.org$apache$livy$utils$SparkYarnApp$$getAppIdFromTag(SparkYarnApp.scala:175)
   org.apache.livy.utils.SparkYarnApp$$anonfun$1$$anonfun$4.apply(SparkYarnApp.scala:239)
   org.apache.livy.utils.SparkYarnApp$$anonfun$1$$anonfun$4.apply(SparkYarnApp.scala:236)
   scala.Option.getOrElse(Option.scala:121)
   org.apache.livy.utils.SparkYarnApp$$anonfun$1.apply$mcV$sp(SparkYarnApp.scala:236)
   org.apache.livy.Utils$$anon$1.run(Utils.scala:94)"
]
}{code}

However, when hitting the tracking URL (shown in the Livy log above), the application status is RUNNING, and it later completes successfully in 2-3 minutes' time.


> Batch session appears "dead" when launched against YARN cluster, but the job
> completes fine.
> --------------------------------------------------------------------------------------------
>
>                 Key: LIVY-447
>                 URL: https://issues.apache.org/jira/browse/LIVY-447
>             Project: Livy
>          Issue Type: Bug
>          Components: Batch
>    Affects Versions: 0.5.0
>            Reporter: Alex Savitsky
>            Priority: Major
>


--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
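The stack trace quoted in the old description is raised inside Livy's own SparkYarnApp while it constructs a YarnClient to poll the ResourceManager, and it appears alongside "ls: cannot access /usr/hdp/2.6.3.0-235/hadoop/lib: No such file or directory". Together these suggest the Livy server process, not the Spark job, is missing the Hadoop/YARN client classes, so Livy cannot look up the application's status and reports the batch as dead while YARN runs it to completion. A heavily hedged sketch of one plausible remediation follows; the paths are illustrative, and the CLASSPATH handling is an assumption to verify against your bin/livy-server launch script:

```shell
# conf/livy-env.sh -- illustrative sketch, not a verified fix.
# Point Livy at the cluster's Hadoop configuration (HADOOP_CONF_DIR is
# honored by Livy's launch scripts):
export HADOOP_CONF_DIR=/etc/hadoop/conf
# Make the Hadoop/YARN client jars (which ship the
# RequestHedgingRMFailoverProxyProvider class in recent Hadoop releases)
# visible to the Livy server JVM. Whether a plain CLASSPATH export is
# picked up depends on how bin/livy-server builds its classpath -- confirm
# before relying on it.
export CLASSPATH="$CLASSPATH:$(hadoop classpath)"
```

If the HDP install truly lacks /usr/hdp/2.6.3.0-235/hadoop/lib, reinstalling or re-linking the Hadoop client packages on the Livy host addresses the same root cause more directly.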