RE: Unable to connect with Spark Interpreter

2016-11-29 Thread Jan Botorek
Hello,
Thanks for the advice, but it doesn't seem that anything is wrong when I start
the interpreter manually. I attach logs from the interpreter and from Zeppelin.
This is the cmd output from the interpreter launched manually:

D:\zeppelin-0.6.2\bin> interpreter.cmd -d D:\zeppelin-0.6.2\interpreter\spark -p 55492
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512m; 
support was removed in 8.0
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/zeppelin-0.6.2/interpreter/spark/zeppelin-spark_2.11-0.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/zeppelin-0.6.2/lib/zeppelin-interpreter-0.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

Could you please think of any possible next steps?

Best regards,
Jan

From: moon soo Lee [mailto:m...@apache.org]
Sent: Monday, November 28, 2016 5:36 PM
To: users@zeppelin.apache.org
Subject: Re: Unable to connect with Spark Interpreter

According to your log, your interpreter process seems to have failed to start.
Check the following lines in your log.
You can try running the interpreter process manually and see why it is failing,
i.e. run

D:\zeppelin-0.6.2\bin\interpreter.cmd -d D:\zeppelin-0.6.2\interpreter\spark -p 55492

---

 INFO [2016-11-28 10:34:02,837] ({pool-1-thread-2} 
RemoteInterpreterProcess.java[reference]:148) - Run interpreter process 
[D:\zeppelin-0.6.2\bin\interpreter.cmd, -d, 
D:\zeppelin-0.6.2\interpreter\spark, -p, 55492, -l, 
D:\zeppelin-0.6.2/local-repo/2C36NT8YK]^M

 INFO [2016-11-28 10:34:03,491] ({Exec Default Executor} 
RemoteInterpreterProcess.java[onProcessFailed]:288) - Interpreter process 
failed {}^M

org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit 
value: 1)
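
A practical way to capture the failure reason when running interpreter.cmd by hand is to redirect both output streams into a file; a sketch (the path and port mirror the command above, the output file name is illustrative):

D:\zeppelin-0.6.2\bin\interpreter.cmd -d D:\zeppelin-0.6.2\interpreter\spark -p 55492 > interpreter-out.txt 2>&1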


On Mon, Nov 28, 2016 at 1:42 AM Jan Botorek <jan.boto...@infor.com> wrote:
Hello again,
I am sorry, but don’t you, guys, really nobody tackle with the same issue, 
please?

I have currently tried the new version (0.6.2) – both binary and „to compile“ 
versions. But the issue remains the same. I have tried it on several laptops 
and servers, always the same result.

Please, don’t you have any idea what to check or repair, please?

Best regards,
Jan
From: Jan Botorek [mailto:jan.boto...@infor.com]
Sent: Wednesday, November 16, 2016 12:54 PM
To: users@zeppelin.apache.org
Subject: RE: Unable to connect with Spark Interpreter

Hello Alexander,
Thank you for the quick response. Please see the server log attached.
Unfortunately, I don't have any zeppelin-interpreter-spark*.log in the logs
folder.

Questions:

-  It happens every time – even if I try to run several paragraphs

-  Yes, it keeps happening even if the interpreter is re-started
--
Jan

From: Alexander Bezzubov [mailto:b...@apache.org]
Sent: Wednesday, November 16, 2016 12:47 PM
To: users@zeppelin.apache.org
Subject: Re: Unable to connect with Spark Interpreter

Hi Jan,

this is a rather generic error saying that ZeppelinServer somehow could not
connect to the interpreter process on your machine.

Could you please share more from logs/*, in particular the .out and .log of the
Zeppelin server AND zeppelin-interpreter-spark*.log; usually this is enough to
identify the reason.

Two more questions:
- does this happen on every paragraph run, i.e. if you try to click Run multiple
times in a row?
- does it still happen if you re-start the Spark interpreter manually from the GUI?
("Anonymous"->Interpreters->Spark->restart)

--
Alex

On Wed, Nov 16, 2016, 12:37 Jan Botorek <jan.boto...@infor.com> wrote:
Hello,
I am not able to run any Spark code in Zeppelin. I tried pre-built versions
of Zeppelin as well as compiling the source code on my own following the
https://github.com/apache/zeppelin steps.
My configuration is Scala 2.11 and Spark 2.0.1. Also, I tried
different versions of Zeppelin available on GitHub (master, 0.6, 0.5.6).

The result is always the same. Zeppelin starts, but when any code is run
(e.g. "2 + 1", "sc.version"), the following exception is thrown.

java.net.ConnectException: Connection refused: connect
        at java.net.DualStackPlainSocketImpl.connect0(Native Method)
        at java.net.DualStackPlainSocketImpl.socketConnect(DualStackPlainSocketImpl.java:79)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at org.apache.thrift.transport.TSocket.open(TSocket.java

Re: Unable to connect with Spark Interpreter

2016-11-29 Thread Jeff Zhang
According to your log, the Spark interpreter failed to start. Do you see any
Spark interpreter log?




RE: Unable to connect with Spark Interpreter

2016-11-29 Thread Jan Botorek
If I start Zeppelin via zeppelin.cmd, only the Zeppelin log appears.
An interpreter log is created only when I start the interpreter manually, but
that log contains only the information that the interpreter was started (see my
preceding mail with the attachment).

-  INFO [2016-11-29 08:43:59,757] ({Thread-0} 
RemoteInterpreterServer.java[run]:81) - Starting remote interpreter server on 
port 55492



Re: Unable to connect with Spark Interpreter

2016-11-29 Thread Jeff Zhang
Then I guess the Spark process failed to start, so there are no logs for the
Spark interpreter.

Can you use the following log4j.properties? This log4j properties file
prints more error info for further diagnosis.

log4j.rootLogger = INFO, dailyfile

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.appender.dailyfile.DatePattern=.yyyy-MM-dd
log4j.appender.dailyfile.Threshold = DEBUG
log4j.appender.dailyfile = org.apache.log4j.DailyRollingFileAppender
log4j.appender.dailyfile.File = ${zeppelin.log.file}
log4j.appender.dailyfile.layout = org.apache.log4j.PatternLayout
log4j.appender.dailyfile.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n


log4j.logger.org.apache.zeppelin.notebook.Paragraph=DEBUG
log4j.logger.org.apache.zeppelin.scheduler=DEBUG
log4j.logger.org.apache.zeppelin.livy=DEBUG
log4j.logger.org.apache.zeppelin.flink=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer=DEBUG
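
To apply it (an assumption about the standard layout, not stated in the mail): overwrite the log4j.properties under Zeppelin's conf directory and restart Zeppelin, e.g. on Windows:

copy /Y log4j.properties D:\zeppelin-0.6.2\conf\log4j.properties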




how to add zookeeper discovery of ignite in ignite interpreter?

2016-11-29 Thread pavan agrawal
Hi,

I have an Ignite cluster which is based on ZooKeeper discovery.

I can see only the following parameters in the Ignite interpreter of Zeppelin:
1. ignite.addresses
2. ignite.clientMode
3. ignite.config.url
4. ignite.jdbc.url
5. ignite.peerClassLoadingEnabled
6. zeppelin.interpreter.localRepo

Where should I add the details about ZooKeeper?

I am really stuck here. Please help.

Thanks in advance.
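
One possibility worth trying (a sketch, not confirmed in this thread: it assumes the ignite-zookeeper module is available to the interpreter and that the class and property names below match your Ignite version) is to describe the ZooKeeper discovery in a Spring XML file and point the interpreter's ignite.config.url property at that file:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
  <bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <!-- run the interpreter's node as a client so it joins the existing cluster -->
    <property name="clientMode" value="true"/>
    <property name="discoverySpi">
      <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder">
          <!-- requires the ignite-zookeeper module on the classpath -->
          <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.zk.TcpDiscoveryZookeeperIpFinder">
            <!-- hypothetical ZooKeeper ensemble address -->
            <property name="zkConnectionString" value="zk1:2181,zk2:2181,zk3:2181"/>
          </bean>
        </property>
      </bean>
    </property>
  </bean>
</beans>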


Re: Unable to connect with Spark Interpreter

2016-11-29 Thread Jeff Zhang
I still don't see much useful info. Could you try running the following
interpreter command directly?

c:\_libs\zeppelin-0.6.2-bin-all\\bin\interpreter.cmd  -d
c:\_libs\zeppelin-0.6.2-bin-all\interpreter\spark -p 53099 -l
c:\_libs\zeppelin-0.6.2-bin-all\/local-repo/2C2ZNEH5W


Jan Botorek wrote on Tue, Nov 29, 2016 at 5:26 PM:

> I attach the log file after debugging turned on.

RE: Unable to connect with Spark Interpreter

2016-11-29 Thread Jan Botorek
I am sorry, but the directory local-repo is not present in the Zeppelin
folder. I use the newest binary version from
https://zeppelin.apache.org/download.html.

Unfortunately, the local-repo folder doesn't exist in the 0.6 version
downloaded and built from GitHub either.



RE: Unable to connect with Spark Interpreter

2016-11-29 Thread Jan Botorek
Your last advice helped me to progress a little bit:

-  I started the spark interpreter manually

o   c:\zepp\\bin\interpreter.cmd, -d, c:\zepp\interpreter\spark\, -p, 61176, -l, c:\zepp\/local-repo/2C2ZNEH5W

o   I needed to add a '\' to the -d attribute and make the path shorter --> moved to c:\zepp

-  Then, in the Zeppelin web environment I set the spark interpreter to
"connect to existing process" (localhost/61176)

-  After that, when I execute any command, this exception appears in the
interpreter cmd window:

Exception in thread "pool-1-thread-2" java.lang.NoClassDefFoundError: scala/Option
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.createInterpreter(RemoteInterpreterServer.java:148)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$createInterpreter.getResult(RemoteInterpreterService.java:1409)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$createInterpreter.getResult(RemoteInterpreterService.java:1394)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: scala.Option
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 11 more

Is this of any help?

Regards,
Jan
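
For what it's worth, NoClassDefFoundError: scala/Option means the interpreter JVM started without the Scala runtime on its classpath. A quick, hypothetical check from cmd.exe (paths follow the c:\zepp layout above) whether a scala-library jar is visible to the interpreter:

dir /b c:\zepp\interpreter\spark\*.jar | findstr /i scala
dir /b c:\zepp\lib\*.jar | findstr /i scala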




Re: Zeppelin or Jupyter

2016-11-29 Thread Kevin Niemann
I can comment on the reasons I use Zeppelin, though I haven't used Jupyter
extensively. This is for a Fortune 500 company, shared by many users.
-Easy to write new Interpreter for organization specific requirements (e.g.
authentication, query limits etc).
-Already using Java and AngularJS extensively so it was a great fit.
-LDAP and Notebook level permissions worked great.
-Default D3.js visualization system works pretty well (could use some
improvement)
-Easy to create and share business user friendly reports.
-Wide variety of Interpreters (JDBC, Spark, R, Mongo, custom etc).
-So far has been stable.

On Mon, Nov 28, 2016 at 12:59 PM, Mich Talebzadeh  wrote:

> Thank you guys for the valuable input.
>
> I have never used Jupyter myself but have used Zeppelin. Obviously it
> sounds like if the Big Data deployment has a Spark-centric view of things (with
> Spark being the penicillin of the Big Data world :) together with Scala and
> SQL, then Zeppelin is a good fit. I have also noticed recently that
> Hortonworks are actively promoting Zeppelin. However, I do appreciate that
> there are fans of Python around.
>
> Maybe a strategy would be to offer both. Having said that, there are hard
> core users that would never give up on Tableau!
>
> Regards
>
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
> Disclaimer: Use it at your own risk. Any and all responsibility for any
> loss, damage or destruction of data or any other property which may arise
> from relying on this email's technical content is explicitly disclaimed.
> The author will in no case be liable for any monetary damages arising from
> such loss, damage or destruction.
>
>
>
> On 28 November 2016 at 20:32, DuyHai Doan  wrote:
>
>> "Granted, these two features are currently only fully supported by the
>> spark interpreter group but work is currently underway to make the API
>> extensible to other interpreters"
>> --> Incorrect, the display system has also an API for front-end:
>> https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/displaysystem/front-end-angular.html
>>
>> On Mon, Nov 28, 2016 at 9:23 PM, Goodman, Alexander (398K)
>> <alexander.good...@jpl.nasa.gov> wrote:
>>
>>> Hi Mich,
>>>
>>> You might want to take a look at this:
>>> https://www.linkedin.com/pulse/comprehensive-comparison-jupyter-vs-zeppelin-hoc-q-phan-mba-
>>>
>>> I use both Zeppelin and Jupyter myself, and I would say by and large the
>>> conclusions of that article are still mostly correct. Jupyter is definitely
>>> superior in terms of stability, language (kernel) support, ease of
>>> installation and maintenance (thanks to conda) and performance. If you just
>>> want something that works well straight out of the box, then Jupyter should
>>> be your goto notebook solution. I would say this is especially true if your
>>> workflow is largely in python since many of the Jupyter developers also
>>> have close ties with the general python data analytics / scientific
>>> computing community, which results in better integration with some
>>> important packages (like matplotlib and bokeh, for example). This makes
>>> sense given that the project was originally a part of ipython after all.
>>>
>>> However I definitely think Zeppelin still has an important place. The
>>> vast majority of Zeppelin users also use spark (also an apache project),
>>> and for that use case it should always be better than Jupyter given that
>>> its backend code is written in Java (a JVM language). There are also
>>> several advanced features that Zeppelin has that are somewhat unique,
>>> including a simple API for sharing variables across interpreters
>>> (https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/interpreter/spark.html#object-exchange).
>>> There's also the angular display system API
>>> (https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/displaysystem/back-end-angular.html).
>>> Granted, these two features are currently
>>> only fully supported by the spark interpreter group but work is currently
>>> underway to make the API extensible to other interpreters. Lastly, I think
>>> the most powerful feature of Zeppelin is the overall concept of the
>>> interpreter (in contrast to Jupyter's kernels) and the ability to use them
>>> together in a single notebook. This is my main reason for using Zeppelin
>>> since I regularly work with both spark/scala and python together.
>>>
>>> So tl;dr, if you are using spark and/or have workflows which use
>>> multiple languages (namely scala/R/python/SQL), you should stick with
>>> Zeppelin. Otherwise, I would suggest Jupyter.
>>>
>>> On Mon, Nov 28, 2016 at 5:06 AM, Mich Talebzadeh
>>> <mich.talebza...@gmail.com> wrote:
>>>
Hi,

I use Zeppelin in different forms and shapes and it is very promising.
Some colleagues are mentioning that Jupyter can do a

Re: Zeppelin or Jupyter

2016-11-29 Thread Mohit Jaggi
> -LDAP and Notebook level permissions worked great.

Would you mind sharing details on this?

Mohit Jaggi
Founder,
Data Orchard LLC
www.dataorchardllc.com





multi-tenant Zeppelin notebook

2016-11-29 Thread Ruslan Dautkhanov
What's the best way to have a multi-tenant Zeppelin notebook?

It seems we currently have to ask users to run their own Zeppelin
instances, since authentication & authorization are based on the user who
runs the Zeppelin server.

The best solution I can see would be to make --keytab and --principal
notebook-level parameters rather than server-level ones.

So, for example, Zeppelin multi-tenancy could be implemented as:
1) users are authenticated through LDAP,
2) each user gets mapped to a --keytab and --principal pair specific to
that user, so in-Hadoop HDFS, Hive etc. access will be specific to that
user (through HDFS ACLs, and Sentry/Ranger roles).

Another way: it might be easier to implement through spark-submit's
--proxy-user parameter, but I am not sure about the details in this case.
I know that, for example, Cloudera's Hue uses proxy authentication quite
successfully in our organization. I.e. Hue does LDAP authentication and
then impersonates that specific user, and all requests are made on behalf
of that user (although `hue` is the actual OS user that runs the Hue
service). Other Hadoop services are just configured to trust user `hue` to
impersonate other users.

Is there a better way?

Is there anything on the Zeppelin roadmap to bring user multi-tenancy?


Thank you,
Ruslan Dautkhanov
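
For reference, spark-submit itself does expose a --proxy-user flag (mutually exclusive with --principal/--keytab). A sketch of what that idea looks like at the configuration level, with the caveat that, set this way, it impersonates one fixed user per Zeppelin instance rather than per notebook, so it only illustrates the mechanism:

# conf/zeppelin-env.sh (illustrative; 'some_ldap_user' is a hypothetical user)
export SPARK_SUBMIT_OPTIONS="--proxy-user some_ldap_user"

The Hadoop side would additionally need hadoop.proxyuser.zeppelin.hosts and hadoop.proxyuser.zeppelin.groups entries in core-site.xml so the cluster trusts the zeppelin service user, analogous to the Hue setup described above.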


Re: multi-tenant Zeppelin notebook

2016-11-29 Thread vincent gromakowski
It has been asked many times. For now only Livy can impersonate the Spark
user. For other interpreters it's not possible, as far as I know...

On Nov 29, 2016 7:44 PM, "Ruslan Dautkhanov" wrote:



Binding variable inside javascript code

2016-11-29 Thread iqueralt
Hello everyone,

I'm trying to bind a variable from JavaScript to the Spark context when I
click on an object, so that I can use the variable in another Zeppelin
paragraph. I have the following code in one particular paragraph:


In another file I build the "data" variable used by plot. The
node.on("click", function(d){...}) function is where I want the binding to
happen. I wanted to use z.angularBind, but that is for a Scala paragraph,
right?

My question would be: how can I bind a variable (let's call it
selected_word) inside the node.on("click", function(d){...}) function, so
that I can access that variable in another Scala Zeppelin paragraph?
The variable should contain the value d.className from inside the
node.on("click", function(d){...}) function.
After this, I would like to run z.angular("selected_word") in another
paragraph and get the corresponding d.className value.

Thank you in advance for your help!
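
For what it's worth, a sketch of one direction to explore: the front-end Angular API (documented for 0.7.0-SNAPSHOT; whether it is available in earlier versions, and the exact signature, are assumptions to verify) exposes z.angularBind to front-end JavaScript, which would let the click handler push the value into the Angular variable registry:

// inside the %angular paragraph's click handler (hypothetical sketch)
node.on("click", function(d) {
    // publish the clicked class name to Zeppelin's Angular registry
    z.angularBind("selected_word", d.className);
});

A later Scala paragraph could then read it back with z.angular("selected_word").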






Re: "You must build Spark with Hive. Export 'SPARK_HIVE=true'"

2016-11-29 Thread Felix Cheung
Can you reuse the HiveContext instead of making new ones with HiveContext(sc)?
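
A minimal sketch of that reuse pattern in a %pyspark paragraph (PySpark 1.x; it relies on variables persisting across paragraphs within one interpreter session):

from pyspark.sql import HiveContext

# create the HiveContext once; later paragraph runs fall back to the existing one
try:
    sqlCtx
except NameError:
    sqlCtx = HiveContext(sc)

sqlCtx.sql('show databases').show()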



From: Ruslan Dautkhanov 
Sent: Sunday, November 27, 2016 8:07:41 AM
To: users
Subject: Re: "You must build Spark with Hive. Export 'SPARK_HIVE=true'"

Also, to get rid of this problem (once HiveContext(sc) has been assigned to a
variable at least twice), the only fix is to restart Zeppelin :-(


--
Ruslan Dautkhanov

On Sun, Nov 27, 2016 at 9:00 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
I found a pattern when this happens.

When I run
sqlCtx = HiveContext(sc)

it works as expected.

Any time after that, it gives the exception stack I reported in this email chain.

> sqlCtx = HiveContext(sc)
> sqlCtx.sql('select * from marketview.spend_dim')

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt 
assembly
Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-6752406810533348793.py", line 267, in <module>
raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-6752406810533348793.py", line 265, in <module>
exec(code)
File "<string>", line 2, in <module>
File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 
580, in sql
return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 
683, in _ssql_ctx
self._scala_HiveContext = self._get_hive_ctx()
File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 
692, in _get_hive_ctx
return self._jvm.HiveContext(self._jsc.sc())
File 
"/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py",
 line 1064, in __call__
answer, self._gateway_client, None, self._fqn)
File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/utils.py", line 
45, in deco
return f(*a, **kw)


Key piece to reproduce this issue - assign HiveContext(sc) to a variable more 
than once,
and use that variable between assignments.


--
Ruslan Dautkhanov

On Mon, Nov 21, 2016 at 2:52 PM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
Getting
You must build Spark with Hive. Export 'SPARK_HIVE=true'
See full stack [2] below.

I'm using Spark 1.6 that comes with CDH 5.8.3.
So it's definitely compiled with Hive.
We use Jupyter notebooks without problems in the same environment.

Using Zeppelin 0.6.2, downloaded as zeppelin-0.6.2-bin-all.tgz from
apache.org

Is Zeppelin compiled with Hive too? I guess so.
Not sure what else is missing.

Tried to play with ZEPPELIN_SPARK_USEHIVECONTEXT but it does not make a
difference.


[1]
$ cat zeppelin-env.sh
export JAVA_HOME=/usr/java/java7
export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
export SPARK_SUBMIT_OPTIONS="--principal  --keytab yyy --conf 
spark.driver.memory=7g --conf spark.executor.cores=2 --conf 
spark.executor.memory=8g"
export SPARK_APP_NAME="Zeppelin notebook"
export HADOOP_CONF_DIR=/etc/hadoop/conf
export HIVE_CONF_DIR=/etc/hive/conf
export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
export PYSPARK_PYTHON="/opt/cloudera/parcels/Anaconda/bin/python2"
export 
PYTHONPATH="/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip"
export MASTER="yarn-client"
export ZEPPELIN_SPARK_USEHIVECONTEXT=true




[2]

You must build Spark with Hive. Export 'SPARK_HIVE=true' and run build/sbt 
assembly
Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 267, in <module>
raise Exception(traceback.format_exc())
Exception: Traceback (most recent call last):
File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 265, in <module>
exec(code)
File "<string>", line 9, in <module>
File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 
580, in sql

[3]
Also have correct symlinks in zeppelin_home/conf for
- hive-site.xml
- hdfs-site.xml
- core-site.xml
- yarn-site.xml



Thank you,
Ruslan Dautkhanov




Re: Unable to connect with Spark Interpreter

2016-11-29 Thread Felix Cheung
Hmm, possibly something with the classpath. These might be Windows-specific
issues. We probably need to debug to fix these.



From: Jan Botorek 
Sent: Tuesday, November 29, 2016 4:01:43 AM
To: users@zeppelin.apache.org
Subject: RE: Unable to connect with Spark Interpreter

Your last advice helped me to progress a little bit:

-  I started spark interpreter manually

o   c:\zepp\\bin\interpreter.cmd, -d, c:\zepp\interpreter\spark\, -p, 61176, 
-l, c:\zepp\/local-repo/2C2ZNEH5W

o   I needed to add a ‚\‘ into the –d attributte and make the path shorter --> 
moved to c:\zepp

-  Then, in Zeppelin web environment I setup the spark interpret to 
„connect to existing process“ (localhost/61176)

-  After that, when I execute any command, in interpreter cmd window 
appears this exception:

o   Exception in thread "pool-1-thread-2" java.lang.NoClassDefFoundError: 
scala/Option

o   at java.lang.Class.forName0(Native Method)

o   at java.lang.Class.forName(Class.java:264)

o   at 
org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.createInterpreter(RemoteInterpreterServer.java:148)

o   at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$createInterpreter.getResult(RemoteInterpreterService.java:1409)

o   at 
org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$createInterpreter.getResult(RemoteInterpreterService.java:1394)

o   at 
org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)

o   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)

o   at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)

o   at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

o   at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

o   at java.lang.Thread.run(Thread.java:745)

o   Caused by: java.lang.ClassNotFoundException: scala.Option

o   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)

o   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

o   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)

o   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

o   ... 11 more

Is this of any help, please?

Regards,
Jan



From: Jan Botorek [mailto:jan.boto...@infor.com]
Sent: Tuesday, November 29, 2016 12:13 PM
To: users@zeppelin.apache.org
Subject: RE: Unable to connect with Spark Interpreter

I am sorry, but the directory local-repo is not presented in the zeppelin 
folder. I use this (https://zeppelin.apache.org/download.html) newest binary 
version.

Unfortunately, in the 0.6 version downloaded and built from github, also the 
folder local-repo doesn’t exist


From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, November 29, 2016 10:45 AM
To: users@zeppelin.apache.org
Subject: Re: Unable to connect with Spark Interpreter

I still don't see much useful info. Could you try run the following interpreter 
command directly ?

c:\_libs\zeppelin-0.6.2-bin-all\\bin\interpreter.cmd  -d 
c:\_libs\zeppelin-0.6.2-bin-all\interpreter\spark -p 53099 -l 
c:\_libs\zeppelin-0.6.2-bin-all\/local-repo/2C2ZNEH5W


Jan Botorek mailto:jan.boto...@infor.com>>于2016年11月29日周二 
下午5:26写道:
I attach the log file after debugging turned on.

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, November 29, 2016 10:04 AM

To: users@zeppelin.apache.org
Subject: Re: Unable to connect with Spark Interpreter

Then I guess the spark process is failed to start so no logs for spark 
interpreter.

Can you use the following log4.properties ? This log4j properties file print 
more error info for further diagnose.

log4j.rootLogger = INFO, dailyfile

log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout = org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - %m%n

log4j.appender.dailyfile.DatePattern=.-MM-dd
log4j.appender.dailyfile.Threshold = DEBUG
log4j.appender.dailyfile = org.apache.log4j.DailyRollingFileAppender
log4j.appender.dailyfile.File = ${zeppelin.log.file}
log4j.appender.dailyfile.layout = org.apache.log4j.PatternLayout
log4j.appender.dailyfile.layout.ConversionPattern=%5p [%d] ({%t} %F[%M]:%L) - 
%m%n


log4j.logger.org.apache.zeppelin.notebook.Paragraph=DEBUG
log4j.logger.org.apache.zeppelin.scheduler=DEBUG
log4j.logger.org.apache.zeppelin.livy=DEBUG
log4j.logger.org.apache.zeppelin.flink=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote=DEBUG
log4j.logger.org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer=DEBUG
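
(Editor's note: this would typically replace conf\log4j.properties under the
Zeppelin installation directory; a sketch using the install path quoted
earlier in this thread, with a Zeppelin restart afterwards so the new
settings take effect:)

copy log4j.properties c:\_libs\zeppelin-0.6.2-bin-all\conf\log4j.properties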



Jan Botorek <jan.boto...@infor.com> wrote on Tue, Nov 29, 2016 at 4:57 PM:
If I start Zeppelin by zeppelin.cmd, only the Zeppelin log appears; no
interpreter log is created.

Re: "You must build Spark with Hive. Export 'SPARK_HIVE=true'"

2016-11-29 Thread Ruslan Dautkhanov
That's what we will have to do. It's hard to explain to users, though, that
in Zeppelin you can assign HiveContext
to a variable only once. We didn't have this problem in Jupyter. Is this hard
to fix? I created https://issues.apache.org/jira/browse/ZEPPELIN-1728

If somebody forgets this rule, it's only fixable by restarting the
Zeppelin server, which is super inconvenient.
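
For anyone trying to reproduce this, a minimal pyspark sketch of the pattern
described in this thread (the database/table name is hypothetical; it assumes
Spark 1.x, where Zeppelin injects sc as the SparkContext):

%pyspark
from pyspark.sql import HiveContext

# First paragraph run: creating and using the HiveContext works as expected.
sqlCtx = HiveContext(sc)
sqlCtx.sql('select * from some_db.some_table').show()

# Re-running the paragraph rebinds the name to a second HiveContext; from
# this point on sqlCtx.sql(...) fails with the "You must build Spark with
# Hive" error shown below, until the Zeppelin server is restarted.
sqlCtx = HiveContext(sc)
sqlCtx.sql('select * from some_db.some_table').show()

# A defensive workaround in the spirit of Felix's suggestion below:
# create the context once and reuse it across paragraph runs.
if 'sqlCtx' not in globals():
    sqlCtx = HiveContext(sc)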

Thanks.



-- 
Ruslan Dautkhanov

On Tue, Nov 29, 2016 at 12:54 PM, Felix Cheung 
wrote:

> Can you reuse the HiveContext instead of making new ones with
> HiveContext(sc)?
>
>
> --
> *From:* Ruslan Dautkhanov 
> *Sent:* Sunday, November 27, 2016 8:07:41 AM
> *To:* users
> *Subject:* Re: "You must build Spark with Hive. Export 'SPARK_HIVE=true'"
>
> Also, to get rid of this problem (once HiveContext(sc) was assigned at
> least twice to a variable), the only fix is to restart Zeppelin :-(
>
>
> --
> Ruslan Dautkhanov
>
> On Sun, Nov 27, 2016 at 9:00 AM, Ruslan Dautkhanov 
> wrote:
>
>> I found a pattern for when this happens.
>>
>> When I run
>> sqlCtx = HiveContext(sc)
>>
>> it works as expected.
>>
>> The second time, and any time after that, it gives the exception stack I
>> reported in this email chain.
>>
>> > sqlCtx = HiveContext(sc)
>> > sqlCtx.sql('select * from marketview.spend_dim')
>>
>> You must build Spark with Hive. Export 'SPARK_HIVE=true' and run
>> build/sbt assembly
>> Traceback (most recent call last):
>>   File "/tmp/zeppelin_pyspark-6752406810533348793.py", line 267, in <module>
>>     raise Exception(traceback.format_exc())
>> Exception: Traceback (most recent call last):
>>   File "/tmp/zeppelin_pyspark-6752406810533348793.py", line 265, in <module>
>>     exec(code)
>>   File "<stdin>", line 2, in <module>
>>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
>>     return DataFrame(self._ssql_ctx.sql(sqlQuery), self)
>>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 683, in _ssql_ctx
>>     self._scala_HiveContext = self._get_hive_ctx()
>>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 692, in _get_hive_ctx
>>     return self._jvm.HiveContext(self._jsc.sc())
>>   File "/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 1064, in __call__
>>     answer, self._gateway_client, None, self._fqn)
>>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/utils.py", line 45, in deco
>>     return f(*a, **kw)
>>
>>
>> The key piece to reproduce this issue: assign HiveContext(sc) to a variable
>> more than once, and use that variable between assignments.
>>
>>
>> --
>> Ruslan Dautkhanov
>>
>> On Mon, Nov 21, 2016 at 2:52 PM, Ruslan Dautkhanov 
>> wrote:
>>
>>> Getting
>>> You must *build Spark with Hive*. Export 'SPARK_HIVE=true'
>>> See full stack [2] below.
>>>
>>> I'm using Spark 1.6 that comes with CDH 5.8.3.
>>> So it's definitely compiled with Hive.
>>> We use Jupyter notebooks without problems in the same environment.
>>>
>>> Using Zeppelin 0.6.2, downloaded as zeppelin-0.6.2-bin-all.tgz from
>>> apache.org
>>>
>>> Is Zeppelin compiled with Hive too? I guess so.
>>> Not sure what else is missing.
>>>
>>> Tried to play with ZEPPELIN_SPARK_USEHIVECONTEXT but it does not make a
>>> difference.
>>>
>>>
>>> [1]
>>> $ cat zeppelin-env.sh
>>> export JAVA_HOME=/usr/java/java7
>>> export SPARK_HOME=/opt/cloudera/parcels/CDH/lib/spark
>>> export SPARK_SUBMIT_OPTIONS="--principal  --keytab yyy --conf
>>> spark.driver.memory=7g --conf spark.executor.cores=2 --conf
>>> spark.executor.memory=8g"
>>> export SPARK_APP_NAME="Zeppelin notebook"
>>> export HADOOP_CONF_DIR=/etc/hadoop/conf
>>> export HIVE_CONF_DIR=/etc/hive/conf
>>> export HIVE_HOME=/opt/cloudera/parcels/CDH/lib/hive
>>> export PYSPARK_PYTHON="/opt/cloudera/parcels/Anaconda/bin/python2"
>>> export PYTHONPATH="/opt/cloudera/parcels/CDH/lib/spark/python:/opt/cloudera/parcels/CDH/lib/spark/python/lib/py4j-0.9-src.zip"
>>> export MASTER="yarn-client"
>>> export ZEPPELIN_SPARK_USEHIVECONTEXT=true
>>>
>>>
>>>
>>>
>>> [2]
>>>
>>> You must build Spark with Hive. Export 'SPARK_HIVE=true' and run
>>> build/sbt assembly
>>> Traceback (most recent call last):
>>>   File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 267, in <module>
>>>     raise Exception(traceback.format_exc())
>>> Exception: Traceback (most recent call last):
>>>   File "/tmp/zeppelin_pyspark-9143637669637506477.py", line 265, in <module>
>>>     exec(code)
>>>   File "<stdin>", line 9, in <module>
>>>   File "/opt/cloudera/parcels/CDH/lib/spark/python/pyspark/sql/context.py", line 580, in sql
>>>
>>> [3]
>>> Also have correct symlinks in zeppelin_home/conf for
>>> - hive-site.xml
>>> - hdfs-site.xml
>>> - core-site.xml
>>> - yarn-site.xml
>>>
>>>
>>>
>>> Thank you,
>>> Ruslan Dautkhanov
>>>
>>
>>
>


Re: multi-tenant Zeppelin notebook

2016-11-29 Thread moon soo Lee
Interpreter Impersonation [1] was recently introduced and there is further
improvement in progress [2].

I didn't see any issue about impersonating the Spark interpreter using
--proxy-user. Do you mind creating one?

Thanks,
moon

[1]
http://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/manual/userimpersonation.html
[2] https://github.com/apache/zeppelin/pull/1672
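
(For context, this is the spark-submit flag being discussed. A sketch only,
with a hypothetical proxied user and jar path, and assuming the submitting
principal is allowed to impersonate end users via the cluster's
hadoop.proxyuser.* settings:)

spark-submit --proxy-user alice \
  --master yarn --deploy-mode client \
  --class org.apache.spark.examples.SparkPi \
  /path/to/spark-examples.jar 10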


On Tue, Nov 29, 2016 at 11:05 AM vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> It has been asked many times. For now only Livy can impersonate the Spark
> user. For other interpreters it's not possible, as far as I know...
>
> On Nov 29, 2016, 7:44 PM, "Ruslan Dautkhanov" wrote:
>
> What's the best way to have a multi-tenant Zeppelin notebook?
>
> It seems we currently will have to ask users to run their own Zeppelin
> instances, since authentication & authorization are based on the user who
> runs the Zeppelin server.
>
> I see the best solution could be to have --keytab and --principal be
> notebook-level parameters rather than server-level.
>
> So, for example, I can see Zeppelin multitenancy being implemented as:
> 1) users are authenticated through LDAP,
> 2) each user gets mapped to a --keytab and --principal pair specific to
> that user,
> so in-Hadoop access (HDFS, Hive, etc.) will be specific to that user
> (through HDFS ACLs and Sentry/Ranger roles).
>
> Another way: it might be easier to implement through spark-submit's
> --proxy-user parameter, but I am not sure of the details in this case.
> I know that, for example, Cloudera's Hue is using proxy authentication
> quite successfully in our organization. I.e., Hue does LDAP authentication
> and then impersonates that specific user, and all requests are made on
> behalf of that user (although `hue` is the actual OS user that runs the
> Hue service). Other Hadoop services are just configured to trust user
> `hue` to impersonate other users.
>
> Is there a better way?
>
> Anything on the Zeppelin roadmap to bring user multitenancy?
>
>
> Thank you,
> Ruslan Dautkhanov
>
>


Re: multi-tenant Zeppelin notebook

2016-11-29 Thread vincent gromakowski
Good to know, great job

2016-11-29 23:30 GMT+01:00 moon soo Lee :

> Interpreter Impersonation [1] was recently introduced and there is further
> improvement in progress [2].
>
> I didn't see any issue about impersonating the Spark interpreter using
> --proxy-user. Do you mind creating one?
>
> Thanks,
> moon
>
> [1] http://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/manual/userimpersonation.html
> [2] https://github.com/apache/zeppelin/pull/1672
>
>
> On Tue, Nov 29, 2016 at 11:05 AM vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> It has been asked many times. For now only Livy can impersonate the Spark
>> user. For other interpreters it's not possible, as far as I know...
>>
>> On Nov 29, 2016, 7:44 PM, "Ruslan Dautkhanov" wrote:
>>
>> What's the best way to have a multi-tenant Zeppelin notebook?
>>
>> It seems we currently will have to ask users to run their own Zeppelin
>> instances, since authentication & authorization are based on the user who
>> runs the Zeppelin server.
>>
>> I see the best solution could be to have --keytab and --principal be
>> notebook-level parameters rather than server-level.
>>
>> So, for example, I can see Zeppelin multitenancy being implemented as:
>> 1) users are authenticated through LDAP,
>> 2) each user gets mapped to a --keytab and --principal pair specific to
>> that user,
>> so in-Hadoop access (HDFS, Hive, etc.) will be specific to that user
>> (through HDFS ACLs and Sentry/Ranger roles).
>>
>> Another way: it might be easier to implement through spark-submit's
>> --proxy-user parameter, but I am not sure of the details in this case.
>> I know that, for example, Cloudera's Hue is using proxy authentication
>> quite successfully in our organization. I.e., Hue does LDAP authentication
>> and then impersonates that specific user, and all requests are made on
>> behalf of that user (although `hue` is the actual OS user that runs the
>> Hue service). Other Hadoop services are just configured to trust user
>> `hue` to impersonate other users.
>>
>> Is there a better way?
>>
>> Anything on the Zeppelin roadmap to bring user multitenancy?
>>
>>
>> Thank you,
>> Ruslan Dautkhanov
>>
>>


Re: multi-tenant Zeppelin notebook

2016-11-29 Thread Ruslan Dautkhanov
Thank you a lot moon!

> Interpreter Impersonation [1] was recently introduced and there is further
> improvement in progress [2].

Very cool. Please consider checking
https://issues.apache.org/jira/browse/ZEPPELIN-1660 too, as we always run
into this when trying to make Zeppelin not have any user-specific paths.

> I didn't see any issue about impersonating the Spark interpreter using
> --proxy-user. Do you mind creating one?

Complete: https://issues.apache.org/jira/browse/ZEPPELIN-1730

Thank you.



-- 
Ruslan Dautkhanov

On Tue, Nov 29, 2016 at 3:30 PM, moon soo Lee  wrote:

> Interpreter Impersonation [1] was recently introduced and there is further
> improvement in progress [2].
>
> I didn't see any issue about impersonating the Spark interpreter using
> --proxy-user. Do you mind creating one?
>
> Thanks,
> moon
>
> [1] http://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/manual/userimpersonation.html
> [2] https://github.com/apache/zeppelin/pull/1672
>
>
> On Tue, Nov 29, 2016 at 11:05 AM vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> It has been asked many times. For now only Livy can impersonate the Spark
>> user. For other interpreters it's not possible, as far as I know...
>>
>> On Nov 29, 2016, 7:44 PM, "Ruslan Dautkhanov" wrote:
>>
>> What's the best way to have a multi-tenant Zeppelin notebook?
>>
>> It seems we currently will have to ask users to run their own Zeppelin
>> instances, since authentication & authorization are based on the user who
>> runs the Zeppelin server.
>>
>> I see the best solution could be to have --keytab and --principal be
>> notebook-level parameters rather than server-level.
>>
>> So, for example, I can see Zeppelin multitenancy being implemented as:
>> 1) users are authenticated through LDAP,
>> 2) each user gets mapped to a --keytab and --principal pair specific to
>> that user,
>> so in-Hadoop access (HDFS, Hive, etc.) will be specific to that user
>> (through HDFS ACLs and Sentry/Ranger roles).
>>
>> Another way: it might be easier to implement through spark-submit's
>> --proxy-user parameter, but I am not sure of the details in this case.
>> I know that, for example, Cloudera's Hue is using proxy authentication
>> quite successfully in our organization. I.e., Hue does LDAP authentication
>> and then impersonates that specific user, and all requests are made on
>> behalf of that user (although `hue` is the actual OS user that runs the
>> Hue service). Other Hadoop services are just configured to trust user
>> `hue` to impersonate other users.
>>
>> Is there a better way?
>>
>> Anything on the Zeppelin roadmap to bring user multitenancy?
>>
>>
>> Thank you,
>> Ruslan Dautkhanov
>>
>>


Re: 0.7 Shiro LDAP authentication changes? Unable to instantiate class [org.apache.zeppelin.server.LdapGroupRealm]

2016-11-29 Thread Ruslan Dautkhanov
Thank you, Khalid.

That was it. I was able to start 0.7.0 with the LDAP shiro config now.



-- 
Ruslan Dautkhanov

On Tue, Nov 29, 2016 at 12:15 AM, Khalid Huseynov 
wrote:

> I think during refactoring LdapGroupRealm was moved into a different
> package, so could you try this in your shiro config:
>
> ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
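
(For completeness, a sketch of how the shiro.ini block quoted below would
look with the relocated class; the searchBase and url values are
placeholders:)

ldapRealm = org.apache.zeppelin.realm.LdapGroupRealm
ldapRealm.contextFactory.environment[ldap.searchBase] = dc=example,dc=com
ldapRealm.contextFactory.url = ldap://ldap.example.com:389
ldapRealm.contextFactory.authenticationMechanism = SIMPLE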
>
> On Tue, Nov 29, 2016 at 2:33 AM, Ruslan Dautkhanov 
> wrote:
>
> > + dev list
> >
> > Could somebody please let me know if shiro-LDAP is known to be broken in
> > master, so that I can stop my attempts to work with 0.7?
> >
> > [org.apache.zeppelin.server.LdapGroupRealm] for object named
> > 'ldapRealm'.  Please ensure you've specified the fully qualified class
> > name correctly.
> > at org.apache.shiro.config.ReflectionBuilder.createNewInstance(ReflectionBuilder.java:151)
> > at org.apache.shiro.config.ReflectionBuilder.buildObjects(ReflectionBuilder.java:119)
> > at org.apache.shiro.config.IniSecurityManagerFactory.buildInstances(IniSecurityManagerFactory.java:161)
> >
> >
> > Thanks,
> > Ruslan
> >
> >
> > On Mon, Nov 28, 2016 at 9:13 AM, Ruslan Dautkhanov wrote:
> >
> >> Looking at 0.7 docs, Shiro LDAP authentication shiro.ini configuration
> >> looks the same.
> >> http://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/security/shiroauthentication.html
> >>
> >> Any ideas why this might be broken in the current snapshot?
> >>
> >> Exception in thread "main" org.apache.shiro.config.ConfigurationException:
> >> Unable to instantiate class [org.apache.zeppelin.server.LdapGroupRealm]
> >> for object named 'ldapRealm'.  Please ensure you've specified the fully
> >> qualified class name correctly.
> >> at org.apache.shiro.config.ReflectionBuilder.createNewInstance(ReflectionBuilder.java:151)
> >> at org.apache.shiro.config.ReflectionBuilder.buildObjects(ReflectionBuilder.java:119)
> >> at org.apache.shiro.config.IniSecurityManagerFactory.buildInstances(IniSecurityManagerFactory.java:161)
> >>
> >>
> >>
> >> --
> >> Ruslan Dautkhanov
> >>
> >> On Mon, Nov 28, 2016 at 8:23 AM, Ruslan Dautkhanov <
> dautkha...@gmail.com>
> >> wrote:
> >>
> >>> Zeppelin 0.7.0 built from yesterday's snapshot.
> >>> Getting below error stack when trying to start Zeppelin 0.7.0.
> >>> The same shiro config works fine in 0.6.2.
> >>>
> >>> We're using LDAP authentication configured in shiro.ini as
> >>> ldapRealm = org.apache.zeppelin.server.LdapGroupRealm
> >>> ldapRealm.contextFactory.environment[ldap.searchBase] = ...
> >>> ldapRealm.contextFactory.url = ...
> >>> ldapRealm.contextFactory.authenticationMechanism = SIMPLE
> >>> ..
> >>>
> >>> This config works fine in 0.6.2.
> >>> Does org.apache.zeppelin.server.LdapGroupRealm have to be changed in 0.7
> >>> to something else?
> >>> Or there are other significant changes in Shiro / LDAP authentication?
> >>>
> >>>
> >>>
> >>> [1]
> >>>
> >>> $ ./zeppelin.sh
> >>> ...
> >>> Exception in thread "main" org.apache.shiro.config.ConfigurationException:
> >>> Unable to instantiate class [org.apache.zeppelin.server.LdapGroupRealm]
> >>> for object named 'ldapRealm'.  Please ensure you've specified the fully
> >>> qualified class name correctly.
> >>> at org.apache.shiro.config.ReflectionBuilder.createNewInstance(ReflectionBuilder.java:151)
> >>> at org.apache.shiro.config.ReflectionBuilder.buildObjects(ReflectionBuilder.java:119)
> >>> at org.apache.shiro.config.IniSecurityManagerFactory.buildInstances(IniSecurityManagerFactory.java:161)
> >>> at org.apache.shiro.config.IniSecurityManagerFactory.createSecurityManager(IniSecurityManagerFactory.java:124)
> >>> at org.apache.shiro.config.IniSecurityManagerFactory.createSecurityManager(IniSecurityManagerFactory.java:102)
> >>> at org.apache.shiro.config.IniSecurityManagerFactory.createInstance(IniSecurityManagerFactory.java:88)
> >>> at org.apache.shiro.config.IniSecurityManagerFactory.createInstance(IniSecurityManagerFactory.java:46)
> >>> at org.apache.shiro.config.IniFactorySupport.createInstance(IniFactorySupport.java:123)
> >>> at org.apache.shiro.util.AbstractFactory.getInstance(AbstractFactory.java:47)
> >>> at org.apache.zeppelin.utils.SecurityUtils.initSecurityManager(SecurityUtils.java:56)
> >>> at org.apache.zeppelin.server.ZeppelinServer.setupRestApiContextHandler(ZeppelinServer.java:268)
> >>> at org.apache.zeppelin.server.ZeppelinServer.main(ZeppelinServer.java:137)
> >>> Caused by: org.apache.shiro.util.UnknownClassException: Unable to load
> >>> class named [org.apache.zeppelin.server.LdapGroupRealm] from the thread
> >>> context, current, or system/application ClassLoaders.  All heuristics have
> >>> been exhausted.  Class could not be found.
> >>> at org.apa

0.7.0 zeppelin.interpreters change: can't make pyspark the default Spark interpreter

2016-11-29 Thread Ruslan Dautkhanov
After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
despite org.apache.zeppelin.spark.*PySparkInterpreter* being listed first in
zeppelin.interpreters.

zeppelin.interpreters in zeppelin-site.xml:


> <property>
>   <name>zeppelin.interpreters</name>
>   <value>org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,...</value>
> </property>



Any ideas how to fix this?


Thanks,
Ruslan


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark the default Spark interpreter

2016-11-29 Thread Jeff Zhang
The default interpreter is now defined in interpreter-setting.json.

You can update the following file to make pyspark the default interpreter,
and then copy it to the folder interpreter/spark:

https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/interpreter-setting.json
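
(For illustration, a sketch of the kind of edit being described, assuming —
as this thread implies — that the first entry in the file's interpreter list
becomes the group default; the real file carries more fields per entry,
elided here as empty properties:)

[
  {
    "group": "spark",
    "name": "pyspark",
    "className": "org.apache.zeppelin.spark.PySparkInterpreter",
    "properties": {}
  },
  {
    "group": "spark",
    "name": "spark",
    "className": "org.apache.zeppelin.spark.SparkInterpreter",
    "properties": {}
  }
]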



Ruslan Dautkhanov wrote on Wed, Nov 30, 2016 at 8:49 AM:

> After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
> despite org.apache.zeppelin.spark.*PySparkInterpreter* being listed first in
> zeppelin.interpreters.
>
> zeppelin.interpreters in zeppelin-site.xml:
>
> <property>
>   <name>zeppelin.interpreters</name>
>   <value>org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,...</value>
> </property>
>
>
>
> Any ideas how to fix this?
>
>
> Thanks,
> Ruslan
>


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark the default Spark interpreter

2016-11-29 Thread Ruslan Dautkhanov
Thank you Jeff.

Do I have to create the interpreter/spark directory in $ZEPPELIN_HOME/conf
or in the $ZEPPELIN_HOME directory?
So is zeppelin.interpreters in zeppelin-site.xml deprecated in 0.7?

Thanks!



-- 
Ruslan Dautkhanov

On Tue, Nov 29, 2016 at 6:54 PM, Jeff Zhang  wrote:

> The default interpreter is now defined in interpreter-setting.json.
>
> You can update the following file to make pyspark the default interpreter,
> and then copy it to the folder interpreter/spark:
>
> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/interpreter-setting.json
>
>
>
Ruslan Dautkhanov wrote on Wed, Nov 30, 2016 at 8:49 AM:
>
>> After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
>> despite org.apache.zeppelin.spark.*PySparkInterpreter* being listed first in
>> zeppelin.interpreters.
>>
>> zeppelin.interpreters in zeppelin-site.xml:
>>
>> <property>
>>   <name>zeppelin.interpreters</name>
>>   <value>org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,...</value>
>> </property>
>>
>>
>>
>> Any ideas how to fix this?
>>
>>
>> Thanks,
>> Ruslan
>>
>


Re: 0.7.0 zeppelin.interpreters change: can't make pyspark the default Spark interpreter

2016-11-29 Thread Jeff Zhang
No, you don't need to create that directory; the edited file belongs in
$ZEPPELIN_HOME/interpreter/spark.
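
(For example, as a sketch — assuming ZEPPELIN_HOME points at the install
directory of the binary distribution:)

cp interpreter-setting.json $ZEPPELIN_HOME/interpreter/spark/interpreter-setting.json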




Ruslan Dautkhanov wrote on Wed, Nov 30, 2016 at 12:12 PM:

> Thank you Jeff.
>
> Do I have to create the interpreter/spark directory in $ZEPPELIN_HOME/conf
> or in the $ZEPPELIN_HOME directory?
> So is zeppelin.interpreters in zeppelin-site.xml deprecated in 0.7?
>
> Thanks!
>
>
>
> --
> Ruslan Dautkhanov
>
> On Tue, Nov 29, 2016 at 6:54 PM, Jeff Zhang  wrote:
>
> The default interpreter is now defined in interpreter-setting.json.
>
> You can update the following file to make pyspark the default interpreter,
> and then copy it to the folder interpreter/spark:
>
>
> https://github.com/apache/zeppelin/blob/master/spark/src/main/resources/interpreter-setting.json
>
>
>
Ruslan Dautkhanov wrote on Wed, Nov 30, 2016 at 8:49 AM:
>
> After the 0.6.2 -> 0.7 upgrade, pySpark isn't the default Spark interpreter,
> despite org.apache.zeppelin.spark.*PySparkInterpreter* being listed first in
> zeppelin.interpreters.
>
> zeppelin.interpreters in zeppelin-site.xml:
>
> <property>
>   <name>zeppelin.interpreters</name>
>   <value>org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkInterpreter,...</value>
> </property>
>
>
>
> Any ideas how to fix this?
>
>
> Thanks,
> Ruslan
>
>
>