Hi Moon,

So in my case, with a standalone or YARN cluster, the workaround would be to install Zeppelin alongside every worker, proxy them, and run each Zeppelin in client mode?
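For illustration, and assuming I understand the client-mode setup right, each worker's zeppelin-env.sh would then drop the cluster deploy mode, roughly like this (<master-host> is a placeholder):

# client-mode sketch for a per-worker Zeppelin; <master-host> is a placeholder
export SPARK_HOME=/usr/local/spark
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop
# no --deploy-mode cluster: the driver runs inside this Zeppelin process
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.11:1.5.0"
export MASTER="spark://<master-host>:7077"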
Thanks,
Sofiane

On Wed, May 3, 2017 at 19:12, moon soo Lee <m...@apache.org> wrote:

Hi,

Zeppelin does not support cluster-mode deploy at the moment. Fortunately, there will be support for cluster mode soon! Please keep an eye on https://issues.apache.org/jira/browse/ZEPPELIN-2040.

Thanks,
moon

On Wed, May 3, 2017 at 11:00 AM, Sofiane Cherchalli <sofian...@gmail.com> wrote:

Shall I configure a remote interpreter for my notebook to run on the worker?

Mayday!

On Wed, May 3, 2017 at 4:18 PM, Sofiane Cherchalli <sofian...@gmail.com> wrote:

What port does the remote interpreter use?

On Wed, May 3, 2017 at 2:14 PM, Sofiane Cherchalli <sofian...@gmail.com> wrote:

Hi Moon and all,

I have a standalone cluster with one master and one worker. I submit jobs through Zeppelin. The master, the worker, and Zeppelin each run in a separate container.

My zeppelin-env.sh:

# spark home
export SPARK_HOME=/usr/local/spark

# set hadoop conf dir
export HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop

# set options to pass to the spark-submit command
export SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.11:1.5.0 --deploy-mode cluster"

# driver memory and deploy mode
export ZEPPELIN_JAVA_OPTS="-Dspark.driver.memory=7g -Dspark.submit.deployMode=cluster"

# master
export MASTER="spark://<master>:7077"

My notebook code is very simple. It reads a CSV and writes it back under the directory /data, created beforehand:

%spark.pyspark
def read_input(fin):
    '''
    Read an input file from the filesystem and return a dataframe
    '''
    df = sqlContext.read.load(fin, format='com.databricks.spark.csv',
                              mode='PERMISSIVE', header='false',
                              inferSchema='true')
    return df

def del_columns(df):
    '''
    Drop unwanted columns (stub: the real implementation lives in
    another paragraph and is not shown in this mail)
    '''
    return df

def write_output(df, fout):
    '''
    Write a dataframe to the filesystem
    '''
    df.write.mode('overwrite').format('com.databricks.spark.csv') \
        .options(delimiter=',', header='true').save(fout)

data_in = '/data/01.csv'
data_out = '/data/02.csv'
df = read_input(data_in)
newdf = del_columns(df)
write_output(newdf, data_out)

I set --deploy-mode to *cluster* so that the driver runs on the worker, in order to read the CSV from the /data directory there rather than on the Zeppelin host.
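For reference, with those settings Zeppelin effectively launches its interpreter through spark-submit, roughly like this (a sketch; the exact classpath and callback port vary per run):

# approximate command Zeppelin's interpreter launcher builds from the settings above
${SPARK_HOME}/bin/spark-submit \
  --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer \
  --master spark://<master>:7077 \
  --deploy-mode cluster \
  --packages com.databricks:spark-csv_2.11:1.5.0 \
  --conf spark.driver.memory=7g \
  /opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar <port>

Because the deploy mode is cluster, the worker ends up running RemoteInterpreterServer as the driver, so every local path in that command also has to exist on the worker.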
When running the notebook it complains that /opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar is missing:

org.apache.zeppelin.interpreter.InterpreterException: Ivy Default Cache set to: /root/.ivy2/cache
The jars for the packages stored in: /root/.ivy2/jars
:: loading settings :: url = jar:file:/opt/spark-2.1.0/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
com.databricks#spark-csv_2.11 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
        confs: [default]
        found com.databricks#spark-csv_2.11;1.5.0 in central
        found org.apache.commons#commons-csv;1.1 in central
        found com.univocity#univocity-parsers;1.5.1 in central
:: resolution report :: resolve 310ms :: artifacts dl 6ms
        :: modules in use:
        com.databricks#spark-csv_2.11;1.5.0 from central in [default]
        com.univocity#univocity-parsers;1.5.1 from central in [default]
        org.apache.commons#commons-csv;1.1 from central in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   3   |   0   |   0   |   0   ||   3   |   0   |
        ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent
        confs: [default]
        0 artifacts copied, 3 already retrieved (0kB/8ms)
Running Spark using the REST application submission protocol.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.7.1/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Warning: Master endpoint spark://spark-drone-master-sofiane.autoetl.svc.cluster.local:7077 was not a REST server. Falling back to legacy submission gateway instead.
[second Ivy resolution pass, identical to the first: resolve 69ms, 0 artifacts copied, 3 already retrieved]
java.nio.file.NoSuchFileException: /opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
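If I read that right, in cluster mode the worker becomes the driver and tries to open the Zeppelin interpreter jar at a local path that only exists in the Zeppelin container. For illustration, staging it by hand would be something like this ("spark-worker" is a placeholder for the actual worker container name; adjust for your orchestrator):

# copy the interpreter jar into the worker container at the same absolute path
docker cp /opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar \
    spark-worker:/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar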
So that is what I did next: I copied /opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar into the worker container, restarted the interpreter, and ran the notebook again. It no longer complains about zeppelin-spark_2.11-0.7.1.jar, but I got another exception in the notebook, this time related to the RemoteInterpreterManagedProcess:

org.apache.zeppelin.interpreter.InterpreterException: Ivy Default Cache set to: /root/.ivy2/cache
[Ivy resolution identical to the first run: resolve 277ms, 0 artifacts copied, 3 already retrieved]
Running Spark using the REST application submission protocol.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.7.1/lib/interpreter/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.3/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Warning: Master endpoint spark://spark-drone-master-sofiane.autoetl.svc.cluster.local:7077 was not a REST server. Falling back to legacy submission gateway instead.
[second Ivy resolution pass, identical: resolve 66ms, 0 artifacts copied, 3 already retrieved]
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:143)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.reference(RemoteInterpreterProcess.java:73)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:258)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:423)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:106)
        at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:387)
        at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
        at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

In the Spark jobs I see an org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer running, and its stderr log complains about a missing log4j.properties:

Launch Command: "/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java"
  "-cp" "/opt/zeppelin-0.7.1/interpreter/spark/*:/opt/zeppelin-0.7.1/lib/interpreter/*:/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar:/usr/local/spark/conf/:/usr/local/spark/jars/*:/opt/hadoop-2.7.3/etc/hadoop:/opt/hadoop-2.7.3/etc/hadoop/*:/opt/hadoop-2.7.3/share/hadoop/common/lib/*:/opt/hadoop-2.7.3/share/hadoop/common/*:/opt/hadoop-2.7.3/share/hadoop/hdfs/*:/opt/hadoop-2.7.3/share/hadoop/hdfs/lib/*:/opt/hadoop-2.7.3/share/hadoop/hdfs/*:/opt/hadoop-2.7.3/share/hadoop/yarn/lib/*:/opt/hadoop-2.7.3/share/hadoop/yarn/*:/opt/hadoop-2.7.3/share/hadoop/mapreduce/lib/*:/opt/hadoop-2.7.3/share/hadoop/mapreduce/*:/opt/hadoop-2.7.3/share/hadoop/tools/lib/*"
  "-Xmx1024M"
  "-Dspark.jars=file:/root/.ivy2/jars/com.databricks_spark-csv_2.11-1.5.0.jar,file:/root/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/root/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar,file:/root/.ivy2/jars/com.databricks_spark-csv_2.11-1.5.0.jar,file:/root/.ivy2/jars/org.apache.commons_commons-csv-1.1.jar,file:/root/.ivy2/jars/com.univocity_univocity-parsers-1.5.1.jar,file:/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar"
  "-Dspark.driver.supervise=false"
  "-Dspark.driver.extraJavaOptions= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///opt/zeppelin-0.7.1/conf/log4j.properties -Dzeppelin.log.file=/opt/zeppelin-0.7.1/logs/zeppelin-interpreter-spark--zeppelin-sofiane-1-zyfya.log"
  "-Dspark.app.name=org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer"
  "-Dspark.submit.deployMode=cluster"
  "-Dspark.master=spark://spark-drone-master-sofiane.autoetl.svc.cluster.local:7077"
  "-Dspark.driver.extraClassPath=::/opt/zeppelin-0.7.1/interpreter/spark/*:/opt/zeppelin-0.7.1/lib/interpreter/*::/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar"
  "-Dspark.rpc.askTimeout=10s"
  "-Dfile.encoding=UTF-8"
  "-Dlog4j.configuration=file:///opt/zeppelin-0.7.1/conf/log4j.properties"
  "-Dzeppelin.log.file=/opt/zeppelin-0.7.1/logs/zeppelin-interpreter-spark--zeppelin-sofiane-1-zyfya.log"
  "org.apache.spark.deploy.worker.DriverWrapper"
  "spark://Worker@172.30.102.7:41417"
  "/usr/local/spark/work/driver-20170503115405-0036/zeppelin-spark_2.11-0.7.1.jar"
  "org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer"
  "46151"
========================================

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/zeppelin-0.7.1/interpreter/spark/zeppelin-spark_2.11-0.7.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/spark/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:ERROR Could not read configuration file from URL [file:/opt/zeppelin-0.7.1/conf/log4j.properties].
java.io.FileNotFoundException: /opt/zeppelin-0.7.1/conf/log4j.properties (No such file or directory)
        at java.io.FileInputStream.open0(Native Method)
        at java.io.FileInputStream.open(FileInputStream.java:195)
        at java.io.FileInputStream.<init>(FileInputStream.java:138)
        at java.io.FileInputStream.<init>(FileInputStream.java:93)
        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
        at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:557)
        at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:526)
        at org.apache.log4j.LogManager.<clinit>(LogManager.java:127)
        at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:64)
        at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:285)
        at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:155)
        at org.apache.commons.logging.impl.SLF4JLogFactory.getInstance(SLF4JLogFactory.java:132)
        at org.apache.commons.logging.LogFactory.getLog(LogFactory.java:685)
        at org.apache.hadoop.security.UserGroupInformation.<clinit>(UserGroupInformation.java:85)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2373)
        at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2373)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2373)
        at org.apache.spark.SecurityManager.<init>(SecurityManager.scala:221)
        at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:42)
        at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
log4j:ERROR Ignoring configuration file [file:/opt/zeppelin-0.7.1/conf/log4j.properties].
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/05/03 11:54:06 INFO SecurityManager: Changing view acls to: root
17/05/03 11:54:06 INFO SecurityManager: Changing modify acls to: root
17/05/03 11:54:06 INFO SecurityManager: Changing view acls groups to:
17/05/03 11:54:06 INFO SecurityManager: Changing modify acls groups to:
17/05/03 11:54:06 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
17/05/03 11:54:07 INFO Utils: Successfully started service 'Driver' on port 39770.
17/05/03 11:54:07 INFO WorkerWatcher: Connecting to worker spark://Worker@172.30.102.7:41417
17/05/03 11:54:07 INFO TransportClientFactory: Successfully created connection to /172.30.102.7:41417 after 27 ms (0 ms spent in bootstraps)
17/05/03 11:54:07 INFO WorkerWatcher: Successfully connected to spark://Worker@172.30.102.7:41417
17/05/03 11:54:07 INFO RemoteInterpreterServer: Starting remote interpreter server on port 46151
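The log4j errors at least look explainable: spark-submit forwards -Dlog4j.configuration=file:///opt/zeppelin-0.7.1/conf/log4j.properties to the driver, and that path only exists in the Zeppelin container, not on the worker. For illustration, staging the config the same way as the jar should silence them (again, "spark-worker" is a placeholder container name):

# copy Zeppelin's log4j config to the worker at the path the driver expects
docker cp /opt/zeppelin-0.7.1/conf/log4j.properties \
    spark-worker:/opt/zeppelin-0.7.1/conf/log4j.properties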
The process never finishes, so I have to kill it...

What's going on? Is anything wrong with my configuration?

Any help appreciated. I have been struggling with this for a week.
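P.S. In case it narrows things down: my guess is that Zeppelin never manages to talk back to the interpreter it launched on the worker, which would match RemoteInterpreterManagedProcess.start timing out in the stack trace above. A quick connectivity check from the Zeppelin container (IP and port taken from the log above; both will differ per run):

# can the Zeppelin container reach the remote interpreter the worker started?
nc -vz 172.30.102.7 46151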