zeppelin architecture with multiple users
Hi,

I want to set up an environment for a group of users so that they can access Zeppelin. Each of them should have their own space and should not interfere with the others. I installed Zeppelin on the MapR sandbox. If I access it from different computers, even when I open different notebooks, the data is still shared. What I want is for the data to be totally separate between users and notebooks. How do I set it up like this?

Thanks,
York Huang
Issue with Zeppelin setup on Datastax-Spark
Hi Team,

I am trying to integrate Zeppelin 0.6.0 with DataStax 4.8.8 (which has Spark 1.4.2). After I configured the following properties in zeppelin-env.sh, the Zeppelin daemon started and I can see in the browser that Zeppelin is running, but when I try to execute a Spark query in the notebook it throws the error below. Could you please help me solve this issue?

export JAVA_HOME=
export SPARK_HOME=/etc/dse/spark
export HADOOP_CONF_DIR=/etc/dse/hadoop
export MASTER=spark://:7077

I also added/updated the following properties in the Spark Interpreter screen of the Zeppelin UI:

Master
spark.app.name
spark.cassandra.auth.password
spark.cassandra.auth.username
spark.cassandra.connection.host
spark.cores.max
spark.executor.memory
zeppelin.interpreter.host
zeppelin.interpreter.port

I am trying to execute the following statement in the notebook:

%spark
sc.version

This is the error I get:

java.net.SocketException: Connection reset
    at java.net.SocketInputStream.read(SocketInputStream.java:209)
    at java.net.SocketInputStream.read(SocketInputStream.java:141)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_createInterpreter(RemoteInterpreterService.java:184)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.createInterpreter(RemoteInterpreterService.java:168)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.init(RemoteInterpreter.java:172)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:328)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:105)

Thanks,
Arpan.
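The trace in this message fails while the Zeppelin server is reading from the remote interpreter process over Thrift, so one cheap thing to rule out is whether anything accepts TCP connections at the configured interpreter or Spark master address. The following is a minimal sketch and not a diagnosis: `can_connect` is a hypothetical helper, and the host and port in the usage comment are placeholders, because the real values are left blank in the message.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Hypothetical usage; replace the placeholders with the configured values:
# print(can_connect("spark-master.example", 7077))
```

A `True` result only shows that something is listening; it says nothing about whether that process speaks the protocol Zeppelin expects.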
Re: Matplotlib uses tkinter instead of Agg
I think I found the cause. I think it is a font problem. The Docker environment only has a small set of fonts installed, but I have not yet found out which font I should install. I will update you guys later.

On Thu, Sep 15, 2016, 00:33 moon soo Lee wrote:

> Tried x = np.arange(100), x = np.linspace(-2,2,1000) with both python2 and
> python3 in %python interpreter. I don't have any problem.
>
> On Wed, Sep 14, 2016 at 3:12 AM Xi Shen wrote:
>
>> OK, for this problem, it is discussed at
>> https://stackoverflow.com/questions/15538099/conversion-of-unicode-minus-sign-from-matplotlib-ticklabels
>>
>> However, I just tried with Jupyter notebook, and its matplotlib can plot
>> with negative values on the axes correctly, and
>> matplotlib.rcParams['axes.unicode_minus'] = True.
>>
>> Can you guys please check if this only happens to a Python3 environment?
>> I don't think I am the first one hit this problem.
>>
>> On Wed, Sep 14, 2016 at 5:49 PM Xi Shen wrote:
>>
>>> Hi,
>>>
>>> I worked it out...So I have start a new instance of Zeppelin...creating
>>> a new notebook wont take effect...So all the Python code are executed in
>>> one python vm? Shouldn't separating ones are better?
>>>
>>> After I get matplotlib work, I have a new problem.
>>>
>>> This code snippet works
>>>
>>> %python
>>>
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>>
>>> x = np.arange(100)
>>>
>>> plt.figure()
>>> plt.plot(x, x**2)
>>> z.show(plt, width='300px')
>>> plt.close()
>>>
>>> But if I change x value to x= np.linspace(-2, 2, 1000), as it it used in
>>> the example, I got
>>>
>>> []
>>>
>>> Traceback (most recent call last):
>>>   File "", line 1, in
>>>   File "", line 23, in show
>>>   File "", line 69, in show_matplotlib
>>> UnicodeEncodeError: 'ascii' codec can't encode character '\u2212' in
>>> position 17262: ordinal not in range(128)
>>>
>>> I did some testing, and I found if any of the value passed to plot()
>>> contains negative numbers, I will get this error...very odd.
>>>
>>> On Wed, Sep 14, 2016 at 8:50 AM Felix Cheung wrote:
>>>
>>>> And
>>>>
>>>> matplotlib.use('Agg')
>>>>
>>>> Would only work before matplotlib is first used so you would need to
>>>> restart the interpreter.
>>>>
>>>> From error stack below it looks like something might be setting the
>>>> default backend in matplotlib to TkAgg though.
>>>>
>>>> Are you using the Python interpreter or PySpark interpreter? Also how
>>>> you are calling matplotlib like Moon asks?
>>>>
>>>> _____________________________
>>>> From: moon soo Lee
>>>> Sent: Tuesday, September 13, 2016 2:34 PM
>>>> Subject: Re: Matplotlib uses tkinter instead of Agg
>>>> To:
>>>>
>>>> Hi,
>>>>
>>>> Thanks for sharing the problem.
>>>> Could you share which version of Zeppelin are you using and how did you
>>>> try matplotlib inside of Zeppelin? Are you trying matplotlib with
>>>> z.show() ?
>>>>
>>>> Thanks,
>>>> moon
>>>>
>>>> On Tue, Sep 13, 2016 at 1:56 AM Xi Shen wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I want to build a Zeppelin docker image for my self. The docker image
>>>>> is based on ubuntu:wily, and has openjdk-8-jre and python3 installed. I
>>>>> also installed other packages that I need.
>>>>>
>>>>> After started Zeppelin in the docker, I am able to access the webapp
>>>>> from my local browser. I tried to execute some simple Python script, and
>>>>> it works fine. But when I try to run the matplotlib example, I got error
>>>>> saying that tkinter cannot find the $DISPLAY.
>>>>>
>>>>> Traceback (most recent call last):
>>>>>   File "", line 1, in
>>>>>   File "/usr/local/lib/python3.4/dist-packages/matplotlib/pyplot.py", line 535, in figure
>>>>>     **kwargs)
>>>>>   File "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", line 84, in new_figure_manager
>>>>>     return new_figure_manager_given_figure(num, figure)
>>>>>   File "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", line 92, in new_figure_manager_given_figure
>>>>>     window = Tk.Tk()
>>>>>   File "/usr/lib/python3.4/tkinter/__init__.py", line 1859, in __init__
>>>>>     self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
>>>>> _tkinter.TclError: no display name and no $DISPLAY environment variable
>>>>>
>>>>> Some people on the Internet suggested adding matplotlib.use('Agg') at
>>>>> the beginning of the notebook, but it still does not work for me.
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> David S.
>>>
>>> --
>>> Thanks,
>>> David S.
>>
>> --
>> Thanks,
>> David S.

--
Thanks,
David S.
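The `UnicodeEncodeError` quoted in this thread can be reproduced without matplotlib or Zeppelin. This is a small sketch, assuming the failure comes from text containing U+2212 (the Unicode minus sign that matplotlib uses for negative tick labels when `axes.unicode_minus` is enabled) being encoded with the ASCII codec. The helper name `encode_ascii` is made up for illustration, and the thread itself does not confirm the root cause.

```python
# U+2212 MINUS SIGN, the character named in the traceback.
UNICODE_MINUS = "\u2212"
label = UNICODE_MINUS + "2.0"  # what a tick label for -2.0 could look like


def encode_ascii(text):
    """Return text encoded as ASCII bytes, or None if it has non-ASCII characters."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return None


# The ASCII codec cannot represent U+2212, as in the reported error.
ascii_result = encode_ascii(label)

# UTF-8 can represent it, so a UTF-8 output encoding would not fail here.
utf8_result = label.encode("utf-8")

# Replacing U+2212 with the ASCII hyphen-minus also avoids the error.
hyphen_result = encode_ascii(label.replace(UNICODE_MINUS, "-"))

print(ascii_result, utf8_result, hyphen_result)
```

If this is indeed the mechanism, the two things to try would be setting `matplotlib.rcParams['axes.unicode_minus'] = False` before plotting, or running the interpreter process under a UTF-8 locale. That would also explain why only data with negative values triggers the error.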
Re: Pyspark interpreter configuration for Zeppelin
I feel there is a scala compatibility issue and I will try compiling with the right switches.
Re: Pyspark interpreter configuration for Zeppelin
Yes that fixed some of the problems.

I am using Zeppelin 0.6.1 binaries against CDH 5.8 (Spark 1.6.0). Would there be a compatibility issue?

Thanks

Abhi
Re: Pyspark interpreter configuration for Zeppelin
Could you try setting the zeppelin.python property to the full path of the python command, not the bin directory?
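The suggestion in this message distinguishes a directory (such as `/home/cloudera/anaconda2/bin`) from the executable inside it. Below is a minimal sketch for checking a candidate value before entering it in the interpreter setting. `looks_like_python_command` is a hypothetical helper, and `sys.executable` merely stands in for the Anaconda path mentioned in the thread.

```python
import os
import sys


def looks_like_python_command(path):
    """Return True if path is an executable regular file, not a directory."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


# Stand-ins for the real values: the running interpreter and its directory.
python_command = sys.executable
bin_directory = os.path.dirname(sys.executable)

print(python_command, looks_like_python_command(python_command))
print(bin_directory, looks_like_python_command(bin_directory))
```

The directory fails the check even though it exists, which matches the point of the reply: the property needs the command itself.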
Re: Pyspark interpreter configuration for Zeppelin
I tried the pyspark command on the same machine, which uses Anaconda Python, and sc.version returned a value.

Zeppelin:
zeppelin.python /home/cloudera/anaconda2/bin

In Zeppelin, nothing is returned.
Re: Pyspark interpreter configuration for Zeppelin
Did you export SPARK_HOME in conf/zeppelin-env.sh?

Could you verify that the same code works with ${SPARK_HOME}/bin/pyspark on the same machine that Zeppelin runs on?

Thanks,
moon

On Wed, Sep 14, 2016 at 8:07 AM Abhi Basu <9000r...@gmail.com> wrote:

> Oops sorry. the above code generated this error:
>
> ERROR [2016-09-14 10:04:27,121] ({qtp2003293121-11} NotebookServer.java[onMessage]:221) - Can't handle message
> org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:319)
>     at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:100)
>     at org.apache.zeppelin.notebook.Paragraph.jobAbort(Paragraph.java:330)
>     at org.apache.zeppelin.scheduler.Job.abort(Job.java:239)
>     at org.apache.zeppelin.socket.NotebookServer.cancelParagraph(NotebookServer.java:995)
>     at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:180)
>     at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
>     at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
>     at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
>     at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
>     at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
>     at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
>     at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
>     at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
>     at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TTransportException
>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
>     at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_cancel(RemoteInterpreterService.java:274)
>     at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.cancel(RemoteInterpreterService.java:259)
>     at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:316)
>     ... 21 more
>
> This is my spark interpreter settings:
>
> spark %spark , %spark.pyspark , %spark.r , %spark.sql , %spark.dep
> Option
> Interpreter for note
> Connect to existing process
> Properties
> name value
> args
> master yarn-client
> spark.app.name Zeppelin
> spark.cores.max
> spark.executor.memory
> zeppelin.R.cmd R
> zeppelin.R.image.width 100%
> zeppelin.R.knitr true
> zeppelin.R.render.options out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
> zeppelin.dep.additionalRemoteRepository spark-packages, http://dl.bintray.com/spark-packages/maven,false;
> zeppelin.dep.localrepo local-repo
> zeppelin.interpreter.localRepo /usr/local/bin/zeppelin-0.6.1-bin-all/local-repo/2BXF675WU
> zeppelin.pyspark.python python
> zeppelin.spark.concurrentSQL false
> zeppelin.spark.importImplicit true
> zeppelin.spark.maxResult 1000
> zeppelin.spark.printREPLOutput true
> zeppelin.spark.sql.stacktrace false
> zeppelin.spark.useHiveContext true
>
> On Wed, Sep 14, 2016 at 10:05 AM, Abhi Basu <9000r...@gmail.com> wrote:
>
>> %pyspark
>>
>> input_file = "hdfs:tmp/filenname.gz"
>>
>> raw_rdd = sc.textFile(input_file)
>
> --
> Abhi Basu
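The first check suggested in this reply, whether SPARK_HOME is exported and points at a usable Spark installation, could be scripted roughly as follows. This is only a sketch: the helper name check_spark_home is made up for illustration, and it has to run with the same environment the Zeppelin daemon sees (for example after sourcing conf/zeppelin-env.sh), otherwise it says nothing about Zeppelin itself.

```python
import os
from pathlib import Path


def check_spark_home(env=None):
    """Return a list of problems found with SPARK_HOME (empty if none).

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try without touching real settings.
    """
    env = os.environ if env is None else env
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        return ["SPARK_HOME is not set"]

    problems = []
    home = Path(spark_home)
    launcher = home / "bin" / "pyspark"  # the script moon asks to try by hand
    if not home.is_dir():
        problems.append(f"{home} is not a directory")
    elif not launcher.is_file():
        problems.append(f"{launcher} not found")
    return problems


if __name__ == "__main__":
    for problem in check_spark_home():
        print("problem:", problem)
```

An empty result only means the launcher exists; running the same %pyspark code in ${SPARK_HOME}/bin/pyspark, as suggested above, is still the real test.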
Re: Plot empty points in Line chart
I think this needs a code change. Maybe the line chart could have a checkbox option that lets the user choose whether to ignore empty values or treat them as zero.

Would you mind filing an issue for it?

Thanks,
moon

On Mon, Sep 12, 2016 at 8:11 AM Ayestaran Nerea wrote:

> Hi everyone!
>
> I have multiple variables in a Spark Dataset that I need to plot in Zeppelin. Those variables consist of a timestamp and a value but not all the variables have the same timestamps. For example:
>
> 1. Variable a has values at: |10:00|10:05| |10:25|
> 2. Variable b has values at: |10:00|10:05|10:10|
> 3. Variable c has values at: |10:05| |10:15|10:20|
>
> Zeppelin treats those empty points as zeros, so the resulting line chart makes no sense. If I make a scatter plot, all the points are shown correctly but I need a line chart. Is there any way to plot the average value in those timestamps? Or even better, can I ignore those points and not plot them ?
>
> I send you some pictures in case my explanation is not good enough.
>
> [image: image001.jpg][image: image002.jpg]
>
> As it can be seen in the scatter plot image, the zeros are not real, but are shown in the line chart. Is there any solution to my problem?
>
> Thank you so much in advance
>
> Nerea
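Until the chart itself offers such an option, one possible workaround for the "plot the average value" part of the question is to fill the gaps before the data reaches the chart. The sketch below is not a Zeppelin feature; the function name fill_gaps, the minute-based timestamps and the values are all invented for illustration. It fills each missing point by linear interpolation between its nearest known neighbours and leaves points outside the known range as None.

```python
def fill_gaps(grid, points):
    """Return one value per timestamp in `grid`.

    `points` maps timestamp -> value. A timestamp missing from `points`
    gets a value linearly interpolated between its nearest known
    neighbours, or None when it lies before the first or after the last
    known point.
    """
    known = sorted(t for t in grid if t in points)
    filled = []
    for t in grid:
        if t in points:
            filled.append(points[t])
            continue
        before = [k for k in known if k < t]
        after = [k for k in known if k > t]
        if not before or not after:
            filled.append(None)  # nothing to interpolate from on one side
            continue
        t0, t1 = before[-1], after[0]
        weight = (t - t0) / (t1 - t0)
        filled.append(points[t0] + weight * (points[t1] - points[t0]))
    return filled


# Timestamps as minutes after 10:00; the values are made up.
grid = [0, 5, 10, 15, 20, 25]
variable_c = {5: 2.0, 15: 4.0, 20: 5.0}
print(fill_gaps(grid, variable_c))
```

Rows that still come back as None would have to be dropped or handled separately, since the thread reports that the line chart draws empty values as zero.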
Re: Matplotlib uses tkinter instead of Agg
Tried x = np.arange(100), x = np.linspace(-2,2,1000) with both python2 and python3 in %python interpreter. I don't have any problem. On Wed, Sep 14, 2016 at 3:12 AM Xi Shen wrote: > OK, for this problem, it is discussed at > https://stackoverflow.com/questions/15538099/conversion-of-unicode-minus-sign-from-matplotlib-ticklabels > > However, I just tried with Jupyter notebook, and its matplotlib can plot > with negative values on the axes correctly, and > matplotlib.rcParams['axes.unicode_minus'] = True. > > Can you guys please check if this only happens to a Python3 environment? I > don't think I am the first one hit this problem. > > > > On Wed, Sep 14, 2016 at 5:49 PM Xi Shen wrote: > >> Hi, >> >> I worked it out...So I have start a new instance of Zeppelin...creating a >> new notebook wont take effect...So all the Python code are executed in one >> python vm? Shouldn't separating ones are better? >> >> After I get matplotlib work, I have a new problem. >> >> This code snippet works >> %python >> >> import numpy as np >> import matplotlib.pyplot as plt >> >> x = np.arange(100) >> >> plt.figure() >> plt.plot(x, x**2) >> z.show(plt, width='300px') >> plt.close() >> >> But if I change x value to x= np.linspace(-2, 2, 1000), as it it used in >> the example, I got >> >> >> [] >> >> Traceback (most recent call last): >> File "", line 1, in >> File "", line 23, in show >> File "", line 69, in show_matplotlib >> UnicodeEncodeError: 'ascii' codec can't encode character '\u2212' in >> position 17262: ordinal not in range(128) >> >> I did some testing, and I found if any of the value passed to plot() >> contains negative numbers, I will get this error...very odd. >> >> >> >> On Wed, Sep 14, 2016 at 8:50 AM Felix Cheung >> wrote: >> >>> And >>> matplotlib.use('Agg') >>> >>> Would only work before matplotlib is first used so you would need to >>> restart the interpreter. 
From error stack below it looks like something >>> might be setting the default backend in matplotlib to TkAgg though. >>> >>> Are you using the Python interpreter or PySpark interpreter? Also how >>> you are calling matplotlib like Moon asks? >>> >>> _ >>> From: moon soo Lee >>> Sent: Tuesday, September 13, 2016 2:34 PM >>> Subject: Re: Matplotlib uses tkinter instead of Agg >>> To: >>> >>> >>> >>> Hi, >>> >>> Thanks for sharing the problem. >>> Could you share which version of Zeppelin are you using and how did you >>> try matplotlib inside of Zeppelin? Are you trying matplotlib with >>> z.show() ? >>> >>> Thanks, >>> moon >>> >>> On Tue, Sep 13, 2016 at 1:56 AM Xi Shen wrote: >>> Hi, I want to build a Zeppelin docker image for my self. The docker image is based on ubuntu:wily, and has openjdk-8-jre and python3 installed. I also installed other packages that I need. After started Zeppelin in the docker, I am able to access the webapp from my local browser. I tried to execute some simple Python script, and it works fine. But when I try to run the matplotlib example, I got error saying that tkinter cannot find the $DISPLAY. Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python3.4/dist-packages/matplotlib/pyplot.py", line 535, in figure **kwargs) File "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", line 84, in new_figure_manager return new_figure_manager_given_figure(num, figure) File "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", line 92, in new_figure_manager_given_figure window = Tk.Tk() File "/usr/lib/python3.4/tkinter/__init__.py", line 1859, in __init__ self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use) _tkinter.TclError: no display name and no $DISPLAY environment variable Some people on the Internet suggested adding matplotlib.use('Agg') at the beginning of the notebook, but it still does not work for me. 
-- Thanks, David S. >>> >>> >>> -- >> >> >> Thanks, >> David S. >> > -- > > > Thanks, > David S. >
Re: Hbase configuration storage without data
Regarding data in note.json: for now, a user who doesn't want to include data in the exported note.json can clear the outputs before exporting.

If exporting a notebook without data is important enough that users need to be reminded of it every time they export, we might consider showing two export options, with and without data, when the export button is clicked.

But please consider the many different possible use cases. Some people might have important information inside the code (like credentials) while the query result can be made public; others might want to restrict access to the raw data but share the query result with other people.

Best,
moon

On Tue, Sep 13, 2016 at 10:20 PM Vikash Kumar wrote:

> Hi,
>
> But storing the data in a separate file approach will need to maintain the link between both files. And also this approach is not preferable when the data is obtained on access basis. like in my case data which comes from hbase through phoenix is tenant base. So storing that data into note.json or in different file is breaking the point of multi tenancy.
>
> So as an approach can we store only configuration and retrieve the data when we are loading the note by running the all paragraph for first time load.
>
> > But at the same time, i think having data in the note.json helps make import/export simple and make notebook render able without run it.
>
> So for import/export providing the data is it good? Data is always confidential and cannot be shared with anyone in form of json. So in this approach any one can open the note.json and can access the data.
> > Thanks & Regards, > > *Vikash Kumar* > > *From:* Felix Cheung [mailto:felixcheun...@hotmail.com] > *Sent:* Wednesday, September 14, 2016 6:24 AM > *To:* users@zeppelin.apache.org; users@zeppelin.apache.org > > > *Subject:* Re: Hbase configuration storage without data > > > > I like that approach - though you should be able to clear result output > before exporting the note, if all you want is the config? The should remove > all output data, keeping it smaller? > > > > > > _ > From: Mohit Jaggi > Sent: Monday, September 12, 2016 10:38 AM > Subject: Re: Hbase configuration storage without data > To: > > > > one option is to keep the data in separate files. notes.json can contain > the code and the data can be a pointer to /path/to/file. import/export can > choose to include or exclude the data. when it is included the data files > are added to a tgz file containing notes.json otherwise you just export > notes.json > > > > > > > > On Mon, Sep 12, 2016 at 10:33 AM, moon soo Lee wrote: > > Right big note.json file is a problem. > > But at the same time, i think having data in the note.json helps make > import/export simple and make notebook renderable without run it. > > > > So far, i didn't see much discussion about this subject on mailing list or > on the issue tracker. > > > > If there's an good idea that can handle large data while keeping > import/export simple and ability to render without run, that would be a > great starting point of the discussions. > > > > Thanks, > > moon > > > > On Wed, Sep 7, 2016 at 9:40 PM Vikash Kumar > wrote: > > Hi moon, > > Yes that was the way that I was using. But is there any plan for future > releases to removing the data from note and storing only configuration? > > Because storing the configuration with data when there is no max result > limit will create a big note.json file. 
> > > > Thanks & Regards, > > *Vikash Kumar* > > *From:* moon soo Lee [mailto:m...@apache.org] > *Sent:* Wednesday, September 7, 2016 8:39 PM > *To:* users@zeppelin.apache.org > *Subject:* Re: Hbase configuration storage without data > > > > Hi, > > > > For now, code and result data are mixed in note.json, which is represented > by 'class Note' [1]. And every Notebook storage layer need to implement > 'NotebookRepo.get()' [2] to read note.json from underlying storage and > convert it into 'class Note'. > > > > As you see the related API and class definition, NotebookRepo actually > doesn't have any restriction how 'class Note' is serialized and saved in > the storage. > > > > So you can event new format, you can exclude result data from saving, and > so on. > > > > Hop this helps. > > > > Thanks, > > moon > > > > [1] > https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/Note.java > > [2] > https://github.com/apache/zeppelin/blob/master/zeppelin-zengine/src/main/java/org/apache/zeppelin/notebook/repo/NotebookRepo.java#L47 > > > > On Wed, Sep 7, 2016 at 3:47 AM Vikash Kumar > wrote: > > Hi all, > > We are storing the note.json configuration into hbase as > it is stored into File system. As default behavior in note.json the query > data is stored along with configuration. But we want to store the > configurations only and when user loading its note then query should get > executed and data generated. This feature we are usin
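The idea running through this thread, keeping a note's code and configuration while leaving out the result data, can be illustrated on a plain JSON document. This is a rough Python sketch, not Zeppelin's NotebookRepo API (which is Java), and the field names "paragraphs", "text", "result" and "results" are assumptions about the note.json layout that should be checked against a real exported note.

```python
import copy
import json


def strip_results(note, result_keys=("result", "results")):
    """Return a copy of a note dict with paragraph outputs removed.

    The key names are assumptions about note.json, not a documented
    contract; the input dict is left untouched.
    """
    cleaned = copy.deepcopy(note)
    for paragraph in cleaned.get("paragraphs", []):
        for key in result_keys:
            paragraph.pop(key, None)
    return cleaned


# Hypothetical note content, for illustration only.
note = {
    "name": "demo",
    "paragraphs": [
        {"text": "%sql select * from t", "result": {"msg": "a\tb\n1\t2"}},
    ],
}
print(json.dumps(strip_results(note), indent=2))
```

A custom storage layer could apply the same kind of filtering before saving, at the cost moon describes in the thread: the note can no longer be rendered without running its paragraphs.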
Re: Pyspark interpreter configuration for Zeppelin
Oops sorry. the above code generated this error:

ERROR [2016-09-14 10:04:27,121] ({qtp2003293121-11} NotebookServer.java[onMessage]:221) - Can't handle message
org.apache.zeppelin.interpreter.InterpreterException: org.apache.thrift.transport.TTransportException
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:319)
    at org.apache.zeppelin.interpreter.LazyOpenInterpreter.cancel(LazyOpenInterpreter.java:100)
    at org.apache.zeppelin.notebook.Paragraph.jobAbort(Paragraph.java:330)
    at org.apache.zeppelin.scheduler.Job.abort(Job.java:239)
    at org.apache.zeppelin.socket.NotebookServer.cancelParagraph(NotebookServer.java:995)
    at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:180)
    at org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:56)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
    at org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
    at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
    at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
    at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
    at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
    at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
    at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
    at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.recv_cancel(RemoteInterpreterService.java:274)
    at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Client.cancel(RemoteInterpreterService.java:259)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.cancel(RemoteInterpreter.java:316)
    ... 21 more

This is my spark interpreter settings:

spark %spark , %spark.pyspark , %spark.r , %spark.sql , %spark.dep
Option
Interpreter for note
Connect to existing process
Properties
name value
args
master yarn-client
spark.app.name Zeppelin
spark.cores.max
spark.executor.memory
zeppelin.R.cmd R
zeppelin.R.image.width 100%
zeppelin.R.knitr true
zeppelin.R.render.options out.format = 'html', comment = NA, echo = FALSE, results = 'asis', message = F, warning = F
zeppelin.dep.additionalRemoteRepository spark-packages, http://dl.bintray.com/spark-packages/maven,false;
zeppelin.dep.localrepo local-repo
zeppelin.interpreter.localRepo /usr/local/bin/zeppelin-0.6.1-bin-all/local-repo/2BXF675WU
zeppelin.pyspark.python python
zeppelin.spark.concurrentSQL false
zeppelin.spark.importImplicit true
zeppelin.spark.maxResult 1000
zeppelin.spark.printREPLOutput true
zeppelin.spark.sql.stacktrace false
zeppelin.spark.useHiveContext true

On Wed, Sep 14, 2016 at 10:05 AM, Abhi Basu <9000r...@gmail.com> wrote:

> %pyspark
>
> input_file = "hdfs:tmp/filenname.gz"
>
> raw_rdd = sc.textFile(input_file)

--
Abhi Basu
Pyspark interpreter configuration for Zeppelin
%pyspark

input_file = "hdfs:tmp/filenname.gz"

raw_rdd = sc.textFile(input_file)
Re: Failed to build Zeppelin pulled from Master Branch
Hi Afancy,

If you want to build with Scala 2.11 using the -Pscala-2.11 flag, you need to run `./dev/change_scala_version.sh 2.11` before running the mvn command. Scala-dependent modules in Zeppelin have a _2.10 suffix in their artifact id by default, and running ./dev/change_scala_version.sh changes this suffix to _2.11.

On Wed, Sep 14, 2016 at 10:01 AM afancy wrote:

> Hello Folk,
>
> I am using this command "mvn -X clean package -Pbuild-distr -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pyarn -Pscala-2.11 -Ppyspark -Psparkr" to build the source code pulled from master branch, but got the following error. Any suggestion is appreciated if you encounter the same problem. Thanks a lot!
>
> /Afancy
>
> [INFO]
> [INFO] Reactor Summary:
> [INFO]
> [INFO] Zeppelin ... SUCCESS [ 3.144 s]
> [INFO] Zeppelin: Interpreter .. SUCCESS [ 10.096 s]
> [INFO] Zeppelin: Zengine .. SUCCESS [ 5.014 s]
> [INFO] Zeppelin: Display system apis .. SUCCESS [ 13.960 s]
> [INFO] Zeppelin: Spark dependencies ... SUCCESS [ 46.806 s]
> [INFO] Zeppelin: Spark FAILURE [ 0.087 s]
> [INFO] Zeppelin: Markdown interpreter . SKIPPED
> [INFO] Zeppelin: Angular interpreter .. SKIPPED
> [INFO] Zeppelin: Shell interpreter SKIPPED
> [INFO] Zeppelin: Livy interpreter . SKIPPED
> [INFO] Zeppelin: HBase interpreter SKIPPED
> [INFO] Zeppelin: PostgreSQL interpreter ... SKIPPED
> [INFO] Zeppelin: JDBC interpreter . SKIPPED
> [INFO] Zeppelin: File System Interpreters . SKIPPED
> [INFO] Zeppelin: Flink SKIPPED
> [INFO] Zeppelin: Apache Ignite interpreter SKIPPED
> [INFO] Zeppelin: Kylin interpreter SKIPPED
> [INFO] Zeppelin: Python interpreter ... SKIPPED
> [INFO] Zeppelin: Lens interpreter . SKIPPED
> [INFO] Zeppelin: Apache Cassandra interpreter . SKIPPED
> [INFO] Zeppelin: Elasticsearch interpreter SKIPPED
> [INFO] Zeppelin: BigQuery interpreter . SKIPPED
> [INFO] Zeppelin: Alluxio interpreter .. SKIPPED
> [INFO] Zeppelin: web Application .. SKIPPED
> [INFO] Zeppelin: Server ... SKIPPED
> [INFO] Zeppelin: Packaging distribution ... SKIPPED
> [INFO]
> [INFO] BUILD FAILURE
> [INFO]
> [INFO] Total time: 01:19 min
> [INFO] Finished at: 2016-09-14T09:15:38+02:00
> [INFO] Final Memory: 116M/918M
> [INFO]
> [ERROR] Failed to execute goal on project zeppelin-spark_2.10: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-spark_2.10:jar:0.7.0-SNAPSHOT: Failure to find org.apache.zeppelin:zeppelin-display_2.11:jar:0.7.0-SNAPSHOT in http://repository.apache.org/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots has elapsed or updates are forced -> [Help 1]
> org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project zeppelin-spark_2.10: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-spark_2.10:jar:0.7.0-SNAPSHOT: Failure to find org.apache.zeppelin:zeppelin-display_2.11:jar:0.7.0-SNAPSHOT in http://repository.apache.org/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots has elapsed or updates are forced
>     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies(LifecycleDependencyResolver.java:221)
>     at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies(LifecycleDependencyResolver.java:127)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved(MojoExecutor.java:257)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:200)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
>     at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
>     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
>     at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
>     at org.apache.maven.lifecycle.internal.builder.singlethreaded.
Re: Matplotlib uses tkinter instead of Agg
OK, for this problem, it is discussed at https://stackoverflow.com/questions/15538099/conversion-of-unicode-minus-sign-from-matplotlib-ticklabels However, I just tried with Jupyter notebook, and its matplotlib can plot with negative values on the axes correctly, and matplotlib.rcParams['axes.unicode_minus'] = True. Can you guys please check if this only happens to a Python3 environment? I don't think I am the first one hit this problem. On Wed, Sep 14, 2016 at 5:49 PM Xi Shen wrote: > Hi, > > I worked it out...So I have start a new instance of Zeppelin...creating a > new notebook wont take effect...So all the Python code are executed in one > python vm? Shouldn't separating ones are better? > > After I get matplotlib work, I have a new problem. > > This code snippet works > %python > > import numpy as np > import matplotlib.pyplot as plt > > x = np.arange(100) > > plt.figure() > plt.plot(x, x**2) > z.show(plt, width='300px') > plt.close() > > But if I change x value to x= np.linspace(-2, 2, 1000), as it it used in > the example, I got > > > [] > > Traceback (most recent call last): > File "", line 1, in > File "", line 23, in show > File "", line 69, in show_matplotlib > UnicodeEncodeError: 'ascii' codec can't encode character '\u2212' in > position 17262: ordinal not in range(128) > > I did some testing, and I found if any of the value passed to plot() > contains negative numbers, I will get this error...very odd. > > > > On Wed, Sep 14, 2016 at 8:50 AM Felix Cheung > wrote: > >> And >> matplotlib.use('Agg') >> >> Would only work before matplotlib is first used so you would need to >> restart the interpreter. From error stack below it looks like something >> might be setting the default backend in matplotlib to TkAgg though. >> >> Are you using the Python interpreter or PySpark interpreter? Also how you >> are calling matplotlib like Moon asks? 
>> >> _ >> From: moon soo Lee >> Sent: Tuesday, September 13, 2016 2:34 PM >> Subject: Re: Matplotlib uses tkinter instead of Agg >> To: >> >> >> >> Hi, >> >> Thanks for sharing the problem. >> Could you share which version of Zeppelin are you using and how did you >> try matplotlib inside of Zeppelin? Are you trying matplotlib with >> z.show() ? >> >> Thanks, >> moon >> >> On Tue, Sep 13, 2016 at 1:56 AM Xi Shen wrote: >> >>> Hi, >>> >>> I want to build a Zeppelin docker image for my self. The docker image is >>> based on ubuntu:wily, and has openjdk-8-jre and python3 installed. I also >>> installed other packages that I need. >>> >>> After started Zeppelin in the docker, I am able to access the webapp >>> from my local browser. I tried to execute some simple Python script, and it >>> works fine. But when I try to run the matplotlib example, I got error >>> saying that tkinter cannot find the $DISPLAY. >>> >>> Traceback (most recent call last): >>> File "", line 1, in >>> File "/usr/local/lib/python3.4/dist-packages/matplotlib/pyplot.py", line >>> 535, in figure >>> **kwargs) >>> File >>> "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", >>> line 84, in new_figure_manager >>> return new_figure_manager_given_figure(num, figure) >>> File >>> "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", >>> line 92, in new_figure_manager_given_figure >>> window = Tk.Tk() >>> File "/usr/lib/python3.4/tkinter/__init__.py", line 1859, in __init__ >>> self.tk = _tkinter.create(screenName, baseName, className, interactive, >>> wantobjects, useTk, sync, use) >>> _tkinter.TclError: no display name and no $DISPLAY environment variable >>> >>> Some people on the Internet suggested adding matplotlib.use('Agg') at >>> the beginning of the notebook, but it still does not work for me. >>> >>> -- >>> >>> >>> Thanks, >>> David S. >>> >> >> >> -- > > > Thanks, > David S. > -- Thanks, David S.
Re: Is there any more angular display system examples ?
Hi Jeff,

I think there might be some examples here: https://www.zeppelinhub.com/viewer/showcases/Visualization

But I'm sure others who have some of their own will post them here too.

On Wed, Sep 14, 2016 at 5:43 PM, Jeff Zhang wrote:

> I looked at the following link about angular display system, it is very interesting. I just wonder is there any more examples and small widget built upon angular. Thanks
>
> https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/displaysystem/back-end-angular.html
>
> --
> Best Regards
>
> Jeff Zhang
Re: Matplotlib uses tkinter instead of Agg
Hi,

I worked it out. I had to start a new instance of Zeppelin; creating a new notebook does not take effect. So is all the Python code executed in one Python VM? Wouldn't separate ones be better?

After I got matplotlib working, I hit a new problem.

This code snippet works:

%python

import numpy as np
import matplotlib.pyplot as plt

x = np.arange(100)

plt.figure()
plt.plot(x, x**2)
z.show(plt, width='300px')
plt.close()

But if I change x to x = np.linspace(-2, 2, 1000), as it is used in the example, I get

[]

Traceback (most recent call last):
  File "", line 1, in
  File "", line 23, in show
  File "", line 69, in show_matplotlib
UnicodeEncodeError: 'ascii' codec can't encode character '\u2212' in position 17262: ordinal not in range(128)

I did some testing and found that if any of the values passed to plot() is negative, I get this error. Very odd.

On Wed, Sep 14, 2016 at 8:50 AM Felix Cheung wrote:

> And
> matplotlib.use('Agg')
>
> Would only work before matplotlib is first used so you would need to restart the interpreter. From error stack below it looks like something might be setting the default backend in matplotlib to TkAgg though.
>
> Are you using the Python interpreter or PySpark interpreter? Also how you are calling matplotlib like Moon asks?
>
> _
> From: moon soo Lee
> Sent: Tuesday, September 13, 2016 2:34 PM
> Subject: Re: Matplotlib uses tkinter instead of Agg
> To:
>
> Hi,
>
> Thanks for sharing the problem.
> Could you share which version of Zeppelin are you using and how did you try matplotlib inside of Zeppelin? Are you trying matplotlib with z.show() ?
>
> Thanks,
> moon
>
> On Tue, Sep 13, 2016 at 1:56 AM Xi Shen wrote:
>
>> Hi,
>>
>> I want to build a Zeppelin docker image for my self. The docker image is based on ubuntu:wily, and has openjdk-8-jre and python3 installed. I also installed other packages that I need.
>>
>> After started Zeppelin in the docker, I am able to access the webapp from my local browser. I tried to execute some simple Python script, and it works fine. But when I try to run the matplotlib example, I got error saying that tkinter cannot find the $DISPLAY.
>>
>> Traceback (most recent call last):
>>   File "", line 1, in
>>   File "/usr/local/lib/python3.4/dist-packages/matplotlib/pyplot.py", line 535, in figure
>>     **kwargs)
>>   File "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", line 84, in new_figure_manager
>>     return new_figure_manager_given_figure(num, figure)
>>   File "/usr/local/lib/python3.4/dist-packages/matplotlib/backends/backend_tkagg.py", line 92, in new_figure_manager_given_figure
>>     window = Tk.Tk()
>>   File "/usr/lib/python3.4/tkinter/__init__.py", line 1859, in __init__
>>     self.tk = _tkinter.create(screenName, baseName, className, interactive, wantobjects, useTk, sync, use)
>> _tkinter.TclError: no display name and no $DISPLAY environment variable
>>
>> Some people on the Internet suggested adding matplotlib.use('Agg') at the beginning of the notebook, but it still does not work for me.
>>
>> --
>>
>> Thanks,
>> David S.

--
Thanks,
David S.
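The UnicodeEncodeError reported in this message can be reproduced without Zeppelin or matplotlib. By default matplotlib writes negative tick labels with the Unicode minus sign U+2212 (the setting is matplotlib.rcParams['axes.unicode_minus']), and that character cannot be represented by the 'ascii' codec. The sketch below uses a made-up string as a stand-in for the rendered figure markup. Why the encoding ends up being ASCII in this particular Docker image is not established in the thread; a container without a UTF-8 locale is one plausible guess.

```python
MINUS_SIGN = "\u2212"  # U+2212 MINUS SIGN, not the ASCII hyphen-minus "-"

# Stand-in for rendered figure markup; this is not real matplotlib output.
markup = "<text>" + MINUS_SIGN + "2.0</text>"

try:
    markup.encode("ascii")
except UnicodeEncodeError as err:
    # Same failure mode as in the traceback above.
    print("ascii failed:", err.reason)

# UTF-8 can represent the character, so this encoding succeeds.
encoded = markup.encode("utf-8")
print(encoded.decode("utf-8") == markup)
```

Two workarounds that might be worth trying, neither confirmed in this thread: setting matplotlib.rcParams['axes.unicode_minus'] = False so that tick labels use the ASCII hyphen, or making sure the interpreter process runs with a UTF-8 encoding.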
Is there any more angular display system examples ?
I looked at the following link about the Angular display system; it is very interesting. I just wonder whether there are any more examples or small widgets built upon Angular. Thanks.

https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/displaysystem/back-end-angular.html

--
Best Regards

Jeff Zhang
Failed to build Zeppelin pulled from Master Branch
Hello Folk,

I am using this command "mvn -X clean package -Pbuild-distr -DskipTests -Pspark-2.0 -Phadoop-2.4 -Pyarn -Pscala-2.11 -Ppyspark -Psparkr" to build the source code pulled from master branch, but got the following error. Any suggestion is appreciated if you encounter the same problem. Thanks a lot!

/Afancy

[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Zeppelin ... SUCCESS [ 3.144 s]
[INFO] Zeppelin: Interpreter .. SUCCESS [ 10.096 s]
[INFO] Zeppelin: Zengine .. SUCCESS [ 5.014 s]
[INFO] Zeppelin: Display system apis .. SUCCESS [ 13.960 s]
[INFO] Zeppelin: Spark dependencies ... SUCCESS [ 46.806 s]
[INFO] Zeppelin: Spark FAILURE [ 0.087 s]
[INFO] Zeppelin: Markdown interpreter . SKIPPED
[INFO] Zeppelin: Angular interpreter .. SKIPPED
[INFO] Zeppelin: Shell interpreter SKIPPED
[INFO] Zeppelin: Livy interpreter . SKIPPED
[INFO] Zeppelin: HBase interpreter SKIPPED
[INFO] Zeppelin: PostgreSQL interpreter ... SKIPPED
[INFO] Zeppelin: JDBC interpreter . SKIPPED
[INFO] Zeppelin: File System Interpreters . SKIPPED
[INFO] Zeppelin: Flink SKIPPED
[INFO] Zeppelin: Apache Ignite interpreter SKIPPED
[INFO] Zeppelin: Kylin interpreter SKIPPED
[INFO] Zeppelin: Python interpreter ... SKIPPED
[INFO] Zeppelin: Lens interpreter . SKIPPED
[INFO] Zeppelin: Apache Cassandra interpreter . SKIPPED
[INFO] Zeppelin: Elasticsearch interpreter SKIPPED
[INFO] Zeppelin: BigQuery interpreter . SKIPPED
[INFO] Zeppelin: Alluxio interpreter .. SKIPPED
[INFO] Zeppelin: web Application .. SKIPPED
[INFO] Zeppelin: Server ... SKIPPED
[INFO] Zeppelin: Packaging distribution ... SKIPPED
[INFO]
[INFO] BUILD FAILURE
[INFO]
[INFO] Total time: 01:19 min
[INFO] Finished at: 2016-09-14T09:15:38+02:00
[INFO] Final Memory: 116M/918M
[INFO]
[ERROR] Failed to execute goal on project zeppelin-spark_2.10: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-spark_2.10:jar:0.7.0-SNAPSHOT: Failure to find org.apache.zeppelin:zeppelin-display_2.11:jar:0.7.0-SNAPSHOT in http://repository.apache.org/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots has elapsed or updates are forced -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal on project zeppelin-spark_2.10: Could not resolve dependencies for project org.apache.zeppelin:zeppelin-spark_2.10:jar:0.7.0-SNAPSHOT: Failure to find org.apache.zeppelin:zeppelin-display_2.11:jar:0.7.0-SNAPSHOT in http://repository.apache.org/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots has elapsed or updates are forced
    at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.getDependencies(LifecycleDependencyResolver.java:221)
    at org.apache.maven.lifecycle.internal.LifecycleDependencyResolver.resolveProjectDependencies(LifecycleDependencyResolver.java:127)
    at org.apache.maven.lifecycle.internal.MojoExecutor.ensureDependenciesAreResolved(MojoExecutor.java:257)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:200)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
    at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
    at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
    at org.apache.maven.cli.MavenCli.execute(MavenCli.java:862)
    at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:286)
    at org.apache.maven.cli.MavenCli.main(MavenCli.java:197)
    at sun.reflect.Native