Hi there,

I want to achieve the following use case: start Zeppelin 0.9.0 (in Docker) on my local dev machine, but have the Spark jobs in the notebook run on a remote cluster via YARN.
For a few hours now I have been trying to set up that environment against my company's Cloudera CDH 6.3.1 development cluster. The cluster is unsecured (although it can only be reached when connected to the VPN). With a lot of trial and error I finally got a successful connection from my dockerized Zeppelin to the cluster: when I start running a Spark cell in Zeppelin, I can see a new application (named spark-shared_process) appear in YARN on the cluster side. However, the execution of the cell eventually fails, with the stack trace from the YARN application shown in [1]. I have no idea where this timeout could come from, and I'd be happy if you could help me out here. Within the VPN to the dev cluster there are no connection restrictions such as firewalls. The cell I run is the first one, titled "Create Dataset/DataFrame via SparkSession", in the "3. Spark SQL (Scala)" Zeppelin quick-start notebook.

For reference, I also attach my docker-compose file [2] and my Dockerfile for building Zeppelin with Spark and Hadoop [3]. (Note that I bake the Hadoop conf files into the image because I'd like to distribute it ready-to-run to the other people in my project, without them having to copy over the conf files themselves.) After starting the container, I additionally change the %spark interpreter settings: I set the master to yarn-cluster and set zeppelin.interpreter.connect.timeout to 600000 ms.

Best regards
Theo

PS: HDFS in general seems to work fine.
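To make the interpreter changes above concrete, this is roughly what I end up with in the %spark interpreter settings page (property names as they appear in my Zeppelin 0.9.0 UI; a sketch, everything else is left at its default):

```properties
# %spark interpreter properties changed via the Zeppelin UI
spark.master=yarn-cluster
zeppelin.interpreter.connect.timeout=600000
```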
(See [4].)

PPS: I also attach the docker container logs from the attempt [5].

[1]

INFO [2021-04-01 23:48:20,984] ({main} Logging.scala[logInfo]:54) - Registered signal handler for TERM
INFO [2021-04-01 23:48:21,005] ({main} Logging.scala[logInfo]:54) - Registered signal handler for HUP
INFO [2021-04-01 23:48:21,014] ({main} Logging.scala[logInfo]:54) - Registered signal handler for INT
INFO [2021-04-01 23:48:22,158] ({main} Logging.scala[logInfo]:54) - Changing view acls to: yarn,sandbox
INFO [2021-04-01 23:48:22,160] ({main} Logging.scala[logInfo]:54) - Changing modify acls to: yarn,sandbox
INFO [2021-04-01 23:48:22,161] ({main} Logging.scala[logInfo]:54) - Changing view acls groups to:
INFO [2021-04-01 23:48:22,162] ({main} Logging.scala[logInfo]:54) - Changing modify acls groups to:
INFO [2021-04-01 23:48:22,168] ({main} Logging.scala[logInfo]:54) - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn, sandbox); groups with view permissions: Set(); users with modify permissions: Set(yarn, sandbox); groups with modify permissions: Set()
INFO [2021-04-01 23:48:25,388] ({main} Logging.scala[logInfo]:54) - Preparing Local resources
WARN [2021-04-01 23:48:28,111] ({main} NativeCodeLoader.java[<clinit>]:62) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
INFO [2021-04-01 23:48:29,004] ({main} Logging.scala[logInfo]:54) - ApplicationAttemptId: appattempt_1617228950227_5781_000001
INFO [2021-04-01 23:48:29,041] ({main} Logging.scala[logInfo]:54) - Starting the user application in a separate Thread
INFO [2021-04-01 23:48:29,289] ({main} Logging.scala[logInfo]:54) - Waiting for spark context initialization...
INFO [2021-04-01 23:48:30,007] ({RegisterThread} RemoteInterpreterServer.java[run]:595) - Start registration
INFO [2021-04-01 23:48:30,009] ({RemoteInterpreterServer-Thread} RemoteInterpreterServer.java[run]:193) - Launching ThriftServer at 99.99.99.99:44802
INFO [2021-04-01 23:48:31,276] ({RegisterThread} RemoteInterpreterServer.java[run]:609) - Registering interpreter process
ERROR [2021-04-01 23:50:09,531] ({main} Logging.scala[logError]:91) - Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]
	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:223)
	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:227)
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:220)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:469)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:305)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:780)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:779)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:804)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
INFO [2021-04-01 23:50:09,547] ({main} Logging.scala[logInfo]:54) - Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: java.util.concurrent.TimeoutException: Futures timed out after [100000 milliseconds]

[2]

version: '3.7'
services:
  zeppelin:
    build: zeppelin-customized
    ports:
      - "9999:8080"
    environment:
      ZEPPELIN_PORT: 8080
      ZEPPELIN_JAVA_OPTS: >-
        -Dspark.driver.memory=1g
        -Dspark.executor.memory=2g
      HADOOP_USER_NAME: sandbox
    volumes:
      - zeppelindata:/zeppelin/data
      - zeppelinnotebooks:/zeppelin/notebook
volumes:
  zeppelindata:
  zeppelinnotebooks:

[3]

FROM apache/zeppelin:0.9.0

# default user is 1000 in the zeppelin base image
USER root
RUN mkdir /spark && chown 1000:1000 /spark && mkdir /hadoop && chown 1000:1000 /hadoop
USER 1000

# Add Spark
RUN cd /spark \
    && wget https://artfiles.org/apache.org/spark/spark-2.4.7/spark-2.4.7-bin-hadoop2.7.tgz \
    && tar xf spark-2.4.7-bin-hadoop2.7.tgz \
    && rm spark-2.4.7-bin-hadoop2.7.tgz \
    && cd ~
ENV SPARK_HOME=/spark/spark-2.4.7-bin-hadoop2.7
ENV HADOOP_CONF_DIR=/zeppelin/conf

# Add Hadoop
RUN cd /hadoop \
    && wget https://archive.apache.org/dist/hadoop/common/hadoop-3.0.0/hadoop-3.0.0.tar.gz \
    && tar xf hadoop-3.0.0.tar.gz \
    && rm hadoop-3.0.0.tar.gz \
    && cd ~
ENV HADOOP_HOME=/hadoop/hadoop-3.0.0
ENV HADOOP_INSTALL=$HADOOP_HOME
ENV HADOOP_MAPRED_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_HOME=$HADOOP_HOME
ENV HADOOP_HDFS_HOME=$HADOOP_HOME
ENV YARN_HOME=$HADOOP_HOME
ENV HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
ENV HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
ENV PATH="${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${PATH}"
ENV USE_HADOOP=true

# Copy over /etc/hadoop/conf from one of the cluster nodes...
COPY cloudernode/conf/ /zeppelin/conf/

[4]

%sh
hdfs dfs -ls /user/sandbox

=> prints out properly.
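One more thing I noticed while staring at the numbers: the 100000 ms in the TimeoutException from [1] is much smaller than the 600000 ms I configured for zeppelin.interpreter.connect.timeout, so whatever gives up on the YARN side is presumably governed by some other (Spark-side?) setting, not by the Zeppelin timeout I raised. A quick sanity check on the two values:

```shell
# Compare the two timeouts involved (values from my settings and from the log in [1])
zeppelin_timeout_ms=600000   # zeppelin.interpreter.connect.timeout, set by me
am_timeout_ms=100000         # "Futures timed out after [100000 milliseconds]" in the AM log
echo "$(( zeppelin_timeout_ms / 60000 )) min (Zeppelin) vs $(( am_timeout_ms / 1000 )) s (YARN AM)"
```

This prints "10 min (Zeppelin) vs 100 s (YARN AM)", i.e. the application master gives up long before my Zeppelin-side timeout would even come into play.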
[5]

zeppelin_1 | WARN [2021-04-01 23:18:36,440] ({SchedulerFactory4} SparkInterpreterLauncher.java[buildEnvFromProperties]:221) - spark-defaults.conf doesn't exist: /spark/spark-2.4.7-bin-hadoop2.7/conf/spark-defaults.conf
zeppelin_1 | INFO [2021-04-01 23:18:36,440] ({SchedulerFactory4} SparkInterpreterLauncher.java[buildEnvFromProperties]:224) - buildEnvFromProperties: {PATH=/hadoop/hadoop-3.0.0/bin:/hadoop/hadoop-3.0.0/sbin:/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin, ZEPPELIN_PORT=8080, HADOOP_CONF_DIR=/zeppelin/conf, ZEPPELIN_JAVA_OPTS=-Dspark.driver.memory=1g -Dspark.executor.memory=2g, ZEPPELIN_LOG_DIR=/opt/zeppelin/logs, MASTER=yarn, ZEPPELIN_WAR=/opt/zeppelin/zeppelin-web-0.9.0.war, ZEPPELIN_ENCODING=UTF-8, ZEPPELIN_SPARK_CONF= --conf spark.yarn.dist.archives=/spark/spark-2.4.7-bin-hadoop2.7/R/lib/sparkr.zip#sparkr --conf spark.yarn.isPython=true --conf spark.executor.instances=2 --conf spark.app.name=spark-shared_process --conf spark.webui.yarn.useProxy=false --conf spark.driver.cores=1 --conf spark.yarn.maxAppAttempts=1 --conf spark.executor.memory=2g --conf spark.master=yarn-cluster --conf spark.files=/opt/zeppelin/conf/log4j_yarn_cluster.properties --conf spark.driver.memory=1g --conf spark.jars=/opt/zeppelin/interpreter/spark/scala-2.11/spark-scala-2.11-0.9.0.jar,/opt/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0.jar --conf spark.executor.cores=1 --conf spark.yarn.submit.waitAppCompletion=false, JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64, JAVA_OPTS= -Dspark.driver.memory=1g -Dspark.executor.memory=2g -Dfile.encoding=UTF-8 -Xms1024m -Xmx1024m -Dlog4j.configuration=file:///opt/zeppelin/conf/log4j.properties -Dzeppelin.log.file=/opt/zeppelin/logs/zeppelin--d5ea32f1f431.log, INTERPRETER_GROUP_ID=spark-shared_process, Z_VERSION=0.9.0, LANG=en_US.UTF-8, JAVA_INTP_OPTS= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///opt/zeppelin/conf/log4j.properties -Dlog4j.configurationFile=file:///opt/zeppelin/conf/log4j2.properties, PYSPARK_PYTHON=python, HADOOP_USER_NAME=sandbox, ZEPPELIN_SPARK_YARN_CLUSTER=true, Z_HOME=/opt/zeppelin, SPARK_HOME=/spark/spark-2.4.7-bin-hadoop2.7, ZEPPELIN_CONF_DIR=/opt/zeppelin/conf, YARN_HOME=/hadoop/hadoop-3.0.0, HADOOP_HDFS_HOME=/hadoop/hadoop-3.0.0, ZEPPELIN_RUNNER=/usr/lib/jvm/java-8-openjdk-amd64/bin/java, HADOOP_MAPRED_HOME=/hadoop/hadoop-3.0.0, PWD=/opt/zeppelin, HADOOP_COMMON_HOME=/hadoop/hadoop-3.0.0, HADOOP_INSTALL=/hadoop/hadoop-3.0.0, ZEPPELIN_HOME=/opt/zeppelin, LOG_TAG=[ZEPPELIN_0.9.0]:, ZEPPELIN_INTP_MEM=-Xms1024m -Xmx2048m, HADOOP_OPTS=-Djava.library.path=/hadoop/hadoop-3.0.0/lib/nativ, PYSPARK_DRIVER_PYTHON=python, ZEPPELIN_PID_DIR=/opt/zeppelin/run, ZEPPELIN_ANGULAR_WAR=/opt/zeppelin/zeppelin-web-angular-0.9.0.war, ZEPPELIN_MEM=-Xms1024m -Xmx1024m, HOSTNAME=d5ea32f1f431, LC_ALL=en_US.UTF-8, ZEPPELIN_IDENT_STRING=, PYSPARK_PIN_THREAD=true, HADOOP_HOME=/hadoop/hadoop-3.0.0, USE_HADOOP=true, HADOOP_COMMON_LIB_NATIVE_DIR=/hadoop/hadoop-3.0.0/lib/native, ZEPPELIN_ADDR=0.0.0.0, ZEPPELIN_INTERPRETER_REMOTE_RUNNER=bin/interpreter.sh, SHLVL=0, HOME=/opt/zeppelin}
zeppelin_1 | INFO [2021-04-01 23:18:36,445] ({SchedulerFactory4} ProcessLauncher.java[transition]:109) - Process state is transitioned to LAUNCHED
zeppelin_1 | INFO [2021-04-01 23:18:36,446] ({SchedulerFactory4} ProcessLauncher.java[launch]:96) - Process is launched: [/opt/zeppelin/bin/interpreter.sh, -d, /opt/zeppelin/interpreter/spark, -c, 172.2.0.2, -p, 46781, -r, :, -i, spark-shared_process, -l, /opt/zeppelin/local-repo/spark, -g, spark]
zeppelin_1 | WARN [2021-04-01 23:20:51,930] ({Exec Default Executor} RemoteInterpreterManagedProcess.java[onProcessComplete]:255) - Process is exited with exit value 0
zeppelin_1 | INFO [2021-04-01 23:20:51,933] ({Exec Default Executor} ProcessLauncher.java[transition]:109) - Process state is transitioned to COMPLETED
zeppelin_1 | INFO [2021-04-01 23:24:06,162] ({qtp418304857-11} VFSNotebookRepo.java[save]:144) - Saving note 2EYUV26VR to Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln
zeppelin_1 | INFO [2021-04-01 23:24:15,933] ({qtp418304857-27} VFSNotebookRepo.java[save]:144) - Saving note 2EYUV26VR to Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln
zeppelin_1 | WARN [2021-04-01 23:28:36,539] ({SchedulerFactory4} NotebookServer.java[onStatusChange]:1928) - Job 20180530-101750_1491737301 is finished, status: ERROR, exception: null, result: %text org.apache.zeppelin.interpreter.InterpreterException: java.io.IOException: Fail to launch interpreter process:
zeppelin_1 | Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
zeppelin_1 | 21/04/01 23:18:44 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
zeppelin_1 | 21/04/01 23:18:44 INFO client.RMProxy: Connecting to ResourceManager at machine1.REMOVEDDOMAIN.de/99.99.99.99:8032
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (16400 MB per container)
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Setting up container launch context for our AM
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Setting up the launch environment for our AM container
zeppelin_1 | 21/04/01 23:18:45 INFO yarn.Client: Preparing resources for our AM container
zeppelin_1 | 21/04/01 23:18:45 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
zeppelin_1 | 21/04/01 23:18:53 INFO yarn.Client: Uploading resource file:/tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440/__spark_libs__5266504625643101044.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/__spark_libs__5266504625643101044.zip
zeppelin_1 | 21/04/01 23:20:09 INFO yarn.Client: Uploading resource file:/opt/zeppelin/interpreter/spark/spark-interpreter-0.9.0.jar -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/spark-interpreter-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:35 INFO yarn.Client: Uploading resource file:/opt/zeppelin/interpreter/spark/scala-2.11/spark-scala-2.11-0.9.0.jar -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/spark-scala-2.11-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:35 INFO yarn.Client: Uploading resource file:/opt/zeppelin/interpreter/zeppelin-interpreter-shaded-0.9.0.jar -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/zeppelin-interpreter-shaded-0.9.0.jar
zeppelin_1 | 21/04/01 23:20:41 INFO yarn.Client: Uploading resource file:/opt/zeppelin/conf/log4j_yarn_cluster.properties -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/log4j_yarn_cluster.properties
zeppelin_1 | 21/04/01 23:20:42 INFO yarn.Client: Uploading resource file:/spark/spark-2.4.7-bin-hadoop2.7/R/lib/sparkr.zip#sparkr -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/sparkr.zip
zeppelin_1 | 21/04/01 23:20:43 INFO yarn.Client: Uploading resource file:/spark/spark-2.4.7-bin-hadoop2.7/python/lib/pyspark.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/pyspark.zip
zeppelin_1 | 21/04/01 23:20:44 INFO yarn.Client: Uploading resource file:/spark/spark-2.4.7-bin-hadoop2.7/python/lib/py4j-0.10.7-src.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/py4j-0.10.7-src.zip
zeppelin_1 | 21/04/01 23:20:45 INFO yarn.Client: Uploading resource file:/tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440/__spark_conf__8289533000141907930.zip -> hdfs://machine1.REMOVEDDOMAIN.de:8020/user/sandbox/.sparkStaging/application_1617315347811_0170/__spark_conf__.zip
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing view acls to: zeppelin,sandbox
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing modify acls to: zeppelin,sandbox
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing view acls groups to:
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: Changing modify acls groups to:
zeppelin_1 | 21/04/01 23:20:46 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(zeppelin, sandbox); groups with view permissions: Set(); users with modify permissions: Set(zeppelin, sandbox); groups with modify permissions: Set()
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client: Submitting application application_1617315347811_0170 to ResourceManager
zeppelin_1 | 21/04/01 23:20:51 INFO impl.YarnClientImpl: Submitted application application_1617315347811_0170
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client: Application report for application_1617315347811_0170 (state: ACCEPTED)
zeppelin_1 | 21/04/01 23:20:51 INFO yarn.Client:
zeppelin_1 | 	 client token: N/A
zeppelin_1 | 	 diagnostics: N/A
zeppelin_1 | 	 ApplicationMaster host: N/A
zeppelin_1 | 	 ApplicationMaster RPC port: -1
zeppelin_1 | 	 queue: root.users.sandbox
zeppelin_1 | 	 start time: 1617319251597
zeppelin_1 | 	 final status: UNDEFINED
zeppelin_1 | 	 tracking URL: http://machine1.REMOVEDDOMAIN.de:8088/proxy/application_1617315347811_0170/
zeppelin_1 | 	 user: sandbox
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Shutdown hook called
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-1d86bc2c-eade-48f5-9650-423eef0fbda2
zeppelin_1 | 21/04/01 23:20:51 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-4c2bf1a1-2e67-42a9-8524-7810e1448440
zeppelin_1 |
zeppelin_1 | 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:129)
zeppelin_1 | 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:271)
zeppelin_1 | 	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:444)
zeppelin_1 | 	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:72)
zeppelin_1 | 	at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
zeppelin_1 | 	at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
zeppelin_1 | 	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:182)
zeppelin_1 | 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
zeppelin_1 | 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
zeppelin_1 | 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
zeppelin_1 | 	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
zeppelin_1 | 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
zeppelin_1 | 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
zeppelin_1 | 	at java.lang.Thread.run(Thread.java:748)
zeppelin_1 | Caused by: java.io.IOException: Fail to launch interpreter process:
zeppelin_1 | [... the same spark-submit output as above is repeated here inside the IOException message; snipped for brevity ...]
zeppelin_1 | 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:126)
zeppelin_1 | 	at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:68)
zeppelin_1 | 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:104)
zeppelin_1 | 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:154)
zeppelin_1 | 	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:126)
zeppelin_1 | 	... 13 more
zeppelin_1 |
zeppelin_1 | INFO [2021-04-01 23:28:36,542] ({SchedulerFactory4} VFSNotebookRepo.java[save]:144) - Saving note 2EYUV26VR to Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln