Thank you. Let me try.

2016-06-28 22:18 GMT+09:00 Jonathan Esterhazy <jonathan.esterh...@gmail.com>:
> Hyung,
>
> Yes, here they are.
>
> zeppelin-env.sh:
>
> export ZEPPELIN_PORT=8890
> export ZEPPELIN_CONF_DIR=/etc/zeppelin/conf
> export ZEPPELIN_LOG_DIR=/var/log/zeppelin
> export ZEPPELIN_PID_DIR=/var/run/zeppelin
> export ZEPPELIN_PID=$ZEPPELIN_PID_DIR/zeppelin.pid
> export ZEPPELIN_NOTEBOOK_DIR=/var/lib/zeppelin/notebook
> export ZEPPELIN_WAR_TEMPDIR=/var/run/zeppelin/webapps
> export MASTER=yarn-client
> export SPARK_HOME=/usr/lib/spark
> export HADOOP_CONF_DIR=/etc/hadoop/conf
> export CLASSPATH=":/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*"
> export JAVA_HOME=/usr/lib/jvm/java-1.8.0
> export ZEPPELIN_NOTEBOOK_S3_BUCKET=mybucket
> export ZEPPELIN_NOTEBOOK_S3_USER=zeppelin
> export ZEPPELIN_NOTEBOOK_STORAGE=org.apache.zeppelin.notebook.repo.S3NotebookRepo
>
> spark-defaults.conf:
>
> spark.master yarn
> spark.driver.extraClassPath /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*
> spark.driver.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
> spark.executor.extraClassPath /etc/hadoop/conf:/etc/hive/conf:/usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*
> spark.executor.extraLibraryPath /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native
> spark.eventLog.enabled true
> spark.eventLog.dir hdfs:///var/log/spark/apps
> spark.history.fs.logDirectory hdfs:///var/log/spark/apps
> spark.yarn.historyServer.address ip-172-30-54-30.ec2.internal:18080
> spark.history.ui.port 18080
> spark.shuffle.service.enabled true
> spark.driver.extraJavaOptions -Dlog4j.configuration=file:///etc/spark/conf/log4j.properties -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=512M -XX:OnOutOfMemoryError='kill -9 %p'
> spark.dynamicAllocation.enabled true
> spark.executor.extraJavaOptions -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError='kill -9 %p'
> spark.executor.memory 8640m
> spark.executor.cores 7
> spark.authenticate.enableSaslEncryption true
> spark.driver.memory 1g
> spark.network.sasl.serverAlwaysEncrypt true
> spark.driver.cores 1
> spark.ssl.protocol TLSv1.2
> spark.ssl.keyStorePassword password
> spark.yarn.maxAppAttempts 1
> spark.ssl.keyStore /etc/emr/security/keystore.jks
> spark.authenticate true
> spark.ssl.keyPassword password
> spark.ssl.enabled true
> spark.ssl.enabledAlgorithms TLS_RSA_WITH_AES_256_CBC_SHA
> spark.ssl.trustStore /etc/emr/security/truststore.jks
> spark.authenticate.secret secret
> spark.ssl.trustStorePassword password
>
> On Mon, Jun 27, 2016 at 7:33 PM, Hyung Sung Shim <hss...@nflabs.com> wrote:
>
>> Hi.
>> Could you share your conf/zeppelin-env.sh and spark-defaults.conf?
>>
>> 2016-06-28 8:52 GMT+09:00 Jonathan Esterhazy <jonathan.esterh...@gmail.com>:
>>
>>> I am having trouble using zeppelin in a spark cluster that has spark
>>> node authentication turned on (e.g. with spark.authenticate=true,
>>> spark.authenticate.secret=...).
>>>
>>> Notebook code that calls built-in spark functions (or other things on
>>> the executor classpath) works fine, but functions defined in the notebook
>>> (anonymous or named) throw ClassNotFoundExceptions when called from an
>>> executor.
>>>
>>> For example, this code works:
>>>
>>> val rdd = sc.textFile("hdfs://my-text-file")
>>> rdd.take(1).foreach(println)
>>> rdd.saveAsTextFile("hdfs:///my-other-text-file")
>>>
>>> but code like this...
>>>
>>> rdd.filter(_.contains("my data"))
>>>
>>> fails with
>>>
>>> Caused by: java.lang.ClassNotFoundException: $iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
>>>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:84)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>     at java.lang.Class.forName0(Native Method)
>>>     at java.lang.Class.forName(Class.java:348)
>>>     ...
>>>
>>> I get the same kind of error if the filter function is defined as a
>>> named function in the notebook, or as a member of a singleton object
>>> defined in the notebook.
>>>
>>> When I look at the executor's log output, I see this error:
>>>
>>> 16/06/27 21:36:23 ERROR repl.ExecutorClassLoader: Failed to check existence of class $line31.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1 on REPL class server at https://172.30.54.30:34980
>>> java.lang.NullPointerException
>>>     at org.apache.spark.repl.ExecutorClassLoader.getClassFileInputStreamFromHttpServer(ExecutorClassLoader.scala:113)
>>>     at org.apache.spark.repl.ExecutorClassLoader.findClassLocally(ExecutorClassLoader.scala:146)
>>>     at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:76)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>     at java.lang.Class.forName0(Native Method)
>>>     at java.lang.Class.forName(Class.java:348)
>>>     ...
>>>
>>> If I disable spark authentication, everything works as expected. I am
>>> running zeppelin 0.5.6 on spark 1.6.1 with yarn.
>>>
>>> Has anyone been able to get zeppelin working with spark authentication?
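[Editor's note] When comparing notes on a setup like the one above, a first step is often to confirm which authentication/SSL settings each node actually has. The sketch below is a hypothetical diagnostic, not something from the thread: to stay self-contained it greps a sample conf written to a temp file, but on a real EMR node you would point the same grep at /etc/spark/conf/spark-defaults.conf on both the driver and an executor host and diff the results.

```shell
#!/bin/sh
# Hypothetical diagnostic: list the spark.authenticate* and spark.ssl.*
# settings a node will apply, so driver and executor configs can be compared.
# Sample conf lines (taken from the thread) stand in for the real file here.
conf=$(mktemp)
cat > "$conf" <<'EOF'
spark.authenticate true
spark.authenticate.secret secret
spark.ssl.enabled true
spark.executor.memory 8640m
EOF
# Keep only the authentication/SSL keys; unrelated settings are filtered out.
grep -E '^spark\.(authenticate|ssl)' "$conf"
rm -f "$conf"
```

On a cluster, any mismatch in these keys between nodes (or between spark-defaults.conf and the Zeppelin interpreter's view of the config) would be worth ruling out before digging into the ExecutorClassLoader error itself.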