Hyung,

Yes, here they are.
zeppelin-env.sh:

export ZEPPELIN_PORT=8890
export ZEPPELIN_CONF_DIR=/etc/zeppelin/conf
export ZEPPELIN_LOG_DIR=/var/log/zeppelin
export ZEPPELIN_PID_DIR=/var/run/zeppelin
export ZEPPELIN_PID=$ZEPPELIN_PID_DIR/zeppelin.pid
export ZEPPELIN_NOTEBOOK_DIR=/var/lib/zeppelin/notebook
export ZEPPELIN_WAR_TEMPDIR=/var/run/zeppelin/webapps
export MASTER=yarn-client
export SPARK_HOME=/usr/lib/spark
export HADOOP_CONF_DIR=/etc/hadoop/conf
export CLASSPATH=":/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*"
export JAVA_HOME=/usr/lib/jvm/java-1.8.0
export ZEPPELIN_NOTEBOOK_S3_BUCKET=mybucket
export ZEPPELIN_NOTEBOOK_S3_USER=zeppelin
export ZEPPELIN_NOTEBOOK_STORAGE=org.apache.zeppelin.notebook.repo.S3NotebookRepo

spark-defaults.conf:

spark.master yarn
spark.driver.extraClassPath /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*
spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
spark.executor.extraClassPath /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*
spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
spark.eventLog.enabled true
spark.eventLog.dir hdfs:///var/log/spark/apps
spark.history.fs.logDirectory hdfs:///var/log/spark/apps
spark.yarn.historyServer.address ip-172-30-54-30.ec2.internal:18080
spark.history.ui.port 18080
spark.shuffle.service.enabled true
spark.driver.extraJavaOptions -Dlog4j.configuration=file:///etc/spark/conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=512M -XX:OnOutOfMemoryError='kill -9 %p'
spark.dynamicAllocation.enabled true
spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
spark.executor.memory 8640m
spark.executor.cores 7
spark.authenticate.enableSaslEncryption true
spark.driver.memory 1g
spark.network.sasl.serverAlwaysEncrypt true
spark.driver.cores 1
spark.ssl.protocol TLSv1.2
spark.ssl.keyStorePassword password
spark.yarn.maxAppAttempts 1
spark.ssl.keyStore /etc/emr/security/keystore.jks
spark.authenticate true
spark.ssl.keyPassword password
spark.ssl.enabled true
spark.ssl.enabledAlgorithms TLS_RSA_WITH_AES_256_CBC_SHA
spark.ssl.trustStore /etc/emr/security/truststore.jks
spark.authenticate.secret secret
spark.ssl.trustStorePassword password

On Mon, Jun 27, 2016 at 7:33 PM, Hyung Sung Shim <hss...@nflabs.com> wrote:

> Hi.
> Could you share your conf/zeppelin-env.sh and spark-defaults.conf ?
>
> 2016-06-28 8:52 GMT+09:00 Jonathan Esterhazy <jonathan.esterh...@gmail.com>:
>
>> I am having trouble using zeppelin in a spark cluster that has spark node
>> authentication turned on (e.g. with spark.authenticate=true,
>> spark.authenticate.secret=...)
>>
>> Notebook code that calls built-in spark functions (or other things on
>> executor classpath) work fine, but functions defined in the notebook
>> (anonymous or named) throw ClassNotFoundExceptions when called from an
>> executor.
>>
>> For example, this code works:
>>
>> val rdd = sc.textFile("hdfs://my-text-file")
>> rdd.take(1).foreach(println)
>>
>> rdd.saveAsTextFile("hdfs:///my-other-text-file")
>>
>> but code like this...
>>
>> rdd.filter(_.contains("my data"))
>>
>> fails with
>>
>> Caused by: java.lang.ClassNotFoundException: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
>>         at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:84)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:348)
>>         ...
>>
>> I get the same kind of error if the filter function is defined as a named
>> function in the notebook, or as a member of singleton object defined in the
>> notebook.
>>
>> When I look at the executor's log output, I see this error:
>>
>> 16/06/27 21:36:23 ERROR repl.ExecutorClassLoader: Failed to check existence of class $line31.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1 on REPL class server at https://172.30.54.30:34980
>> java.lang.NullPointerException
>>         at org.apache.spark.repl.ExecutorClassLoader.getClassFileInputStreamFromHttpServer(ExecutorClassLoader.scala:113)
>>         at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:146)
>>         at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:76)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>         at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>         at java.lang.Class.forName0(Native Method)
>>         at java.lang.Class.forName(Class.java:348)
>>         ...
>>
>> If I disable spark authentication, everything works as expected. I am
>> running zeppelin 0.5.6 on spark 1.6.1 with yarn.
>>
>> Has anyone been able to get zeppelin working with spark authentication?