Re: Aw: Re: Two different errors while executing Spark SQL queries against cached temp tables
I recently ran into this issue running 0.7.1. Specifically, I had cached a dataframe before registering it as a temp table. When I removed the caching this error stopped. -- View this message in context: http://apache-zeppelin-users-incubating-mailing-list.75479.x6.nabble.com/Two-different-errors-while-executing-Spark-SQL-queries-against-cached-temp-tables-tp4517p6038.html Sent from the Apache Zeppelin Users (incubating) mailing list mailing list archive at Nabble.com.
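For context, the trigger the poster describes is caching the DataFrame before registering it as a temp table. A minimal sketch of that setup in a note (names and paths are placeholders; registerTempTable is the pre-Spark-2.0 spelling, createOrReplaceTempView in 2.x):

```
%spark
val df = sqlContext.read.json("path/to/data.json")  // placeholder source
df.cache()                       // caching here is what coincided with the SQL errors
df.registerTempTable("events")

%sql
select count(*) from events
```

Per the report, removing the df.cache() call made the errors stop.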
RE: z.load() Must be used before SparkInterpreter (%spark) initialized?
We’ve also found it undesirable to be unable to load extra jars without restarting the interpreter. Is the best way to mitigate this to run in isolated mode (per note or per user), so that other users are less affected? Is there any development in progress to allow loading without a restart?

Thanks!

From: Jeff Zhang [mailto:zjf...@gmail.com]
Sent: Tuesday, July 25, 2017 8:31 PM
To: Users
Subject: Re: z.load() Must be used before SparkInterpreter (%spark) initialized?

It is not about restarting Zeppelin; you just need to restart the Spark interpreter.

Richard Xin <richardxin...@yahoo.com> wrote on Wednesday, July 26, 2017 at 12:53 AM:

I used:

%dep
z.load("path/to/jar")

I got the following error:

Must be used before SparkInterpreter (%spark) initialized
Hint: put this paragraph before any Spark code and restart Zeppelin/Interpreter

Restarting Zeppelin did make it work, and it seems to be expected behavior, but I don't understand the reason behind it. If I have to restart Zeppelin every time before I can dynamically add an external jar, then this feature is useless to most people.

Richard Xin

This e-mail, including attachments, may include confidential and/or proprietary information, and may be used only by the person or entity to which it is addressed. If the reader of this e-mail is not the intended recipient or his or her authorized agent, the reader is hereby notified that any dissemination, distribution or copying of this e-mail is prohibited. If you have received this e-mail in error, please notify the sender by replying to this message and delete this e-mail immediately.
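To illustrate the ordering the error hint asks for, a note would look roughly like this (z.load and z.reset are Zeppelin's documented %dep API; the jar path is a placeholder):

```
%dep
z.reset()               // clear any previously loaded artifacts
z.load("path/to/jar")   // must run before any %spark paragraph executes

%spark
// paragraphs after this point can use classes from the loaded jar
```

If a %spark paragraph has already run, the Spark interpreter's JVM and classloader exist, so the %dep paragraph can no longer take effect without an interpreter restart.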
Re: z.load() Must be used before SparkInterpreter (%spark) initialized?
Please allow me to opinionate on that subject. To me, there are two options: you either run the Spark interpreter in isolated mode, or you have dedicated Spark interpreter groups per organizational unit, so you can manage dependencies independently. Obviously, there's no way around restarting the interpreter when you need to tell the classloader about additional jars, never mind distributing those jars across the cluster without calling spark-submit. Since an interpreter represents an actual running JVM, you need to treat it as such. I assume that is also the reason why z.load has been superseded by dependency configuration in the interpreter settings.

A good way to manage dependencies is to collate all the dependencies per unit in a fat jar, and manage that via an external build. That way you can have testable dependencies and a curated experience where everything just works, as long as someone puts that effort in. With a collaborative tool, that's better than everyone putting in their favorite lib and causing each interpreter start to pull in half the Internet in transitive dependencies, with potential conflicts to boot. Zeppelin will be slow if every interpreter start begins with uploading a gigabyte of dependencies into the cluster.

In an ad hoc, almost-single-user environment you can work well with Zeppelin's built-in dependency management, but I don't really see it scale to the enterprise level, and I don't think it should either. There's no point in investing resources into something that external tools can already easily provide. I wouldn't deploy Zeppelin as enterprise infrastructure either: deploy one Zeppelin per project, and manage segregation there by separate interpreters. This also helps with finer resource management.

I hope this helps your understanding, as well as giving you some pointers on how to manage Zeppelin in such a way that there are fewer conflicts between users.
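The fat-jar approach described above can be driven by any external build tool. As one hedged example, with sbt and the sbt-assembly plugin (project name, dependency, and version numbers here are illustrative, not from the thread):

```
// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

// build.sbt
name := "team-deps"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  // the unit's curated dependencies go here, e.g.:
  "joda-time" % "joda-time" % "2.9.9"
)
```

Running `sbt assembly` then produces a single jar that can be registered once in the interpreter settings, instead of each user loading libraries ad hoc.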
On Wed, Jul 26, 2017 at 2:30 PM, Davidson, Jonathan <jonathan.david...@optum.com> wrote:

> We’ve also found it undesirable being unable to load extra jars without restarting the interpreter. Is the best way to mitigate this by running in isolated mode (by note or user), so that other users are less affected? Is there any development in progress to load without restart? [...]
activeDirectoryRealm.groupRolesMap
I am facing some hurdle with activeDirectoryRealm.groupRolesMap. The following is the content of my shiro.ini:

...
activeDirectoryRealm.groupRolesMap = "CN=Zeppelin-Admin,OU=Zeppelin,OU=Applications,OU=Groups,DC=directory,DC=[domain_here],DC=com":"admin", "CN=Zeppelin-Devs,OU=Zeppelin,OU=Applications,OU=Groups,DC=directory,DC=[domain_here],DC=com":"developer", "CN=Zeppelin-Analyst,OU=Zeppelin,OU=Applications,OU=Groups,DC=directory,DC=[domain_here],DC=com":"datascientist"
activeDirectoryRealm.authorizationCachingEnabled = false
activeDirectoryRealm.principalSuffix = @directory.mydomain.com
...

[roles]
admin = *
datascientist = *
developer = *

[urls]
# uncomment the below urls that you want to hide:
/api/version = anon
/api/interpreter/** = authc, roles[admin]
/** = authc

My AD account is a member of "CN=Zeppelin-Admin,OU=Zeppelin,OU=Applications,OU=Groups,DC=directory,DC=[domain_here],DC=com", but when I log in, I see the following in the log:

WARN [2017-07-26 00:14:10,981] ({qtp1287712235-15} LoginRestApi.java[postLogin]:119) - {"status":"OK","message":"","body":{"principal":"richard.xin","ticket":"b681cbbb-8a10-40c8-9ba8-c46ee59efd42","roles":"[]"}}

Please note the roles node is empty; I was expecting "admin" in the role list. Does anyone have a similar issue? Is my activeDirectoryRealm.groupRolesMap config correct?

Thanks,
Richard Xin
Re: activeDirectoryRealm.groupRolesMap
Just saw this one; it appears to be a known bug: [ZEPPELIN-2640] Roles are not getting honored from shiro_ini for setting permissions in Zeppelin notebook - ASF JIRA

On Wednesday, July 26, 2017, 11:13:39 AM PDT, Richard Xin wrote:

> I am facing some hurdle with activeDirectoryRealm.groupRolesMap. The following is the content of my shiro.ini [...] Please note the roles node is empty; I was expecting "admin" in the role list. Does anyone have a similar issue? Is my activeDirectoryRealm.groupRolesMap config correct? [...]
Re: Shiro AD auth - unable to use jceks
We have a Zeppelin instance on AWS EMR; we didn't experience any issues with jceks.

On Monday, July 24, 2017, 11:57:16 PM PDT, cs user wrote:

Bump. Has anyone managed to get this working?

On Thu, Jul 20, 2017 at 11:37 AM, cs user wrote:

Hello,

Can someone explain how the shiro.ini config should look when trying to encrypt the AD password? We have the following config:

activeDirectoryRealm = org.apache.zeppelin.realm.ActiveDirectoryGroupRealm
activeDirectoryRealm.url = ldaps://some.address.com:636
activeDirectoryRealm.searchBase = DC=top,DC=domain,DC=sub,DC=com
activeDirectoryRealm.groupRolesMap = "CN=GROUP,OU=some,OU=location,OU=folder,DC=top,DC=domain,DC=sub,DC=com":"someuser"
activeDirectoryRealm.systemUsername = some.account
# Password commented out
#activeDirectoryRealm.systemPassword = passwordnotused
activeDirectoryRealm.hadoopSecurityCredentialPath = "jceks://file/tmp/zeppelin/conf/zeppelin.jceks"
activeDirectoryRealm.principalSuffix = @some.sub.com
activeDirectoryRealm.authorizationCachingEnabled = false

However, it doesn't appear to be using the credential stored in the jceks file. The file was created using the following command:

hadoop credential create activeDirectoryRealm.systemPassword -provider jceks://file/tmp/zeppelin/conf/zeppelin.jceks

The file is owned by zeppelin. I've tried creating the credential with both "systemPassword" and "systempassword" as the name. Everything works fine if I just use the plain-text password. I'm using Zeppelin version 0.7.0. What am I missing here? Does anyone have an example config which is working for them? I've checked the logs and there are no errors relating to loading the above jceks file. Thanks!
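One way to double-check which alias actually landed in the store is Hadoop's `credential list` subcommand (assuming the hadoop CLI is available on the Zeppelin host; the provider path is the one from the config above):

```
# list the aliases stored in the provider; the alias shown must match the
# property name Shiro resolves, i.e. activeDirectoryRealm.systemPassword
hadoop credential list -provider jceks://file/tmp/zeppelin/conf/zeppelin.jceks
```

If the printed alias differs from the property name, even by case, the lookup will fall through and the realm will behave as if no password were configured.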
Re: z.load() Must be used before SparkInterpreter (%spark) initialized?
Thanks Rick for the detailed explanation. It should be very helpful for users. Personally, I would suggest users set additional jars in the interpreter setting instead of using %dep.

For the long-term solution, I am considering putting configuration into the note itself. E.g. for each interpreter there would be one special interpreter to initialize the interpreter setting (overriding the default interpreter setting of Zeppelin). This special interpreter could be called something like %spark.init, and the user would need to put it in the first paragraph of the note, before running any other paragraph. The purpose is to include not only the code but also the configuration in the note, so that users can rerun the note in other Zeppelin instances: it makes the note a self-contained concept without any external dependencies.

Rick Moritz wrote on Thursday, July 27, 2017 at 12:52 AM:

> Please allow me to opinionate on that subject. To me, there are two options: you either run the Spark interpreter in isolated mode, or you have dedicated Spark interpreter groups per organizational unit, so you can manage dependencies independently. [...]
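Jeff's %spark.init idea is a proposal, not an implemented feature; purely as a sketch of the intent (the interpreter name and the exact property keys are hypothetical):

```
%spark.init
// hypothetical first paragraph of the note, overriding the interpreter setting
spark.jars = /path/to/team-deps-assembly.jar
spark.executor.memory = 4g

%spark
// subsequent paragraphs would run against an interpreter configured as above,
// so the note carries its configuration along with its code
```

This would make a note reproducible on another Zeppelin instance without manually mirroring interpreter settings.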