We've been hitting similar issues - Spark interpreter configured to
run per user in an isolated process, then per note in a scoped process
- the intention being that each user gets their own Spark interpreter
(e.g. for resource allocation and queuing) while the notes within that
interpreter stay isolated from one another (sharing a variable
namespace across notes being very bad). This mostly works, but we
haven't got interpreter restarts working - whether or not the user is
an admin, the restart seems to have no effect. Our workaround is
giving users YARN admin ACL permissions so they can kill the
underlying job, but that's rather suboptimal - especially since it can
be hard to identify which YARN job belongs to which interpreter.
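
In case it helps anyone, below is a rough sketch of the sort of helper
we've been toying with to find the right application to kill. It's
only a sketch, not something we run in anger: it assumes the yarn CLI
is on the PATH, that the caller can see the application, and that
either the YARN user or the application name contains the Zeppelin
username (which depends entirely on how spark.app.name is set for the
interpreter - it isn't a Zeppelin default).

    import scala.sys.process._

    // List RUNNING YARN apps and kill the ones that appear to belong
    // to the given Zeppelin user. The column order (id, name, type,
    // user, ...) matches the `yarn application -list` output we see,
    // but verify it against your Hadoop version before trusting it.
    def killUsersSparkApps(user: String): Unit = {
      val listing =
        Seq("yarn", "application", "-list", "-appStates", "RUNNING").!!
      listing.linesIterator
        .filter(_.startsWith("application_"))  // skip header lines
        .map(_.trim.split("\\s+"))
        .filter(c => c.length > 3 && (c(3) == user || c(1).contains(user)))
        .foreach { c =>
          println(s"Killing ${c(0)} (name=${c(1)}, user=${c(3)})")
          Seq("yarn", "application", "-kill", c(0)).!
        }
    }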

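On the variable-namespace point (it comes up again further down the
thread), the failure mode we're guarding against looks roughly like
the following - the names are made up, but these are the mechanics
that scoping per note avoids:

    // Note A, paragraph 1 (a shared REPL, e.g. user-isolated only)
    var sampleFraction = 0.01
    val sample = spark.range(1000000L).sample(false, sampleFraction)

    // Note B, same interpreter process, some time later
    sampleFraction = 0.5

    // Note A, paragraph 2, re-run after Note B: nothing fails loudly,
    // it just silently samples half the data instead of 1%
    val recount =
      spark.range(1000000L).sample(false, sampleFraction).count()
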
We've also tried to get Zeppelin running Spark jobs as the end user
(impersonation), but that isn't working either, for reasons I forget.
I'll try to report back with more detail on both issues. Unfortunately
multi-tenant Zeppelin doesn't seem quite there yet, though we find it
no better with Jupyter etc.
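
For anyone debugging the same thing, the quick check we run in a
%spark paragraph to see which identity a job actually executes as
(sc is the SparkContext Zeppelin injects; the UGI call just assumes
the Hadoop classes are on the classpath):

    import org.apache.hadoop.security.UserGroupInformation

    // Which user does each layer think it is running as?
    val ugiUser = UserGroupInformation.getCurrentUser.getShortUserName
    println(s"JVM user:    ${System.getProperty("user.name")}")
    println(s"Spark user:  ${sc.sparkUser}")
    println(s"Hadoop user: $ugiUser")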

James

On Mon, 29 Jul 2019 at 08:14, Jeff Zhang <zjf...@gmail.com> wrote:
>
> Then I am afraid there's no workaround for now. I think one approach
> would be to restart only the current user's interpreter process when
> they click the restart button on the interpreter setting page, and to
> let only an admin restart all users' interpreter processes.
>
>
>
> Dima Kamalov <dimakama...@asana.com> wrote on Mon, Jul 29, 2019 at 2:51 PM:
>>
>> Using var in multiple notebooks is dangerous -- users will run into 
>> inadvertent bugs because the same variable value got changed in a different 
>> notebook.  So that will not work for us.  Thank you for the suggestion 
>> though -- let me know if you or others have any other ones.
>>
>> On Sun, Jul 28, 2019 at 10:19 PM Jeff Zhang <zjf...@gmail.com> wrote:
>>>
>>> You can use var instead of val, so that you can reuse the same
>>> variable in a different paragraph. And as long as you don't run
>>> paragraphs from different notebooks concurrently, it should be fine.
>>>
>>>
>>> Dima Kamalov <dimakama...@asana.com> wrote on Mon, Jul 29, 2019 at 1:16 PM:
>>>>
>>>> Yes, user isolated only does work for Spark.  The problem with
>>>> running only user isolated is that we then get conflicts in the
>>>> Scala REPL -- e.g. the same variable name cannot be used
>>>> independently in multiple notebooks.
>>>>
>>>> On Sun, Jul 28, 2019 at 9:10 PM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>
>>>>> Hmm, that's right. Does only user isolated work for you?
>>>>>
>>>>> Dima Kamalov <dimakama...@asana.com> wrote on Mon, Jul 29, 2019 at 12:03 PM:
>>>>>>
>>>>>> This does not fix the problem when a Spark session crashes.  If a
>>>>>> user has multiple notes in scoped mode, restarting from one note
>>>>>> will not restart the interpreter group -- it will only restart
>>>>>> that note's session.  That restarts the note's Scala REPL but not
>>>>>> e.g. the shared Spark session.
>>>>>>
>>>>>> On Sun, Jul 28, 2019 at 6:16 PM Jeff Zhang <zjf...@gmail.com> wrote:
>>>>>>>
>>>>>>> Restarting the interpreter from the note page will only restart
>>>>>>> that note's owner's interpreter; it won't affect other users'
>>>>>>> interpreters.
>>>>>>>
>>>>>>>
>>>>>>> Dima Kamalov <dimakama...@asana.com> wrote on Sat, Jul 27, 2019 at 9:51 AM:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm wondering whether there are any best practices around
>>>>>>>> restarting a user's interpreter group (i.e. the whole set of
>>>>>>>> sessions for that user, but not the sessions of other users).
>>>>>>>>
>>>>>>>> Here's the problem I want to solve:
>>>>>>>>
>>>>>>>> - We are primarily using Zeppelin for the Spark interpreter.
>>>>>>>> Because each user has a number of notebooks, it seemed like a
>>>>>>>> good idea to pool Spark sessions per user, so we did that by
>>>>>>>> setting the Spark interpreter to user isolated, note scoped.
>>>>>>>> - Periodically, a user's Spark session will crash for whatever
>>>>>>>> reason.
>>>>>>>>
>>>>>>>> Here are the possible solutions that I can think of.  We're
>>>>>>>> currently using 1a.
>>>>>>>> (1) Within existing interpreter mode
>>>>>>>> a. Restart the Spark interpreter from the interpreter menu.
>>>>>>>> This restarts it for ~30 users, so it's inconvenient to do
>>>>>>>> often.
>>>>>>>> b. Track down all of a user's notebooks and restart the Spark
>>>>>>>> interpreter in each notebook.
>>>>>>>>
>>>>>>>> (2) Considering switching interpreter modes
>>>>>>>> a. User isolated, note isolated -- our biggest concern with
>>>>>>>> this is just the number of Spark sessions that would get
>>>>>>>> generated.  Maybe this would play well with lifecycle
>>>>>>>> management?
>>>>>>>> b. User isolated -- seems a little bad for users because a
>>>>>>>> variable updated in one note would overwrite the same variable
>>>>>>>> in another note.
>>>>>>>>
>>>>>>>> (3) Work on a change to Zeppelin, assuming this feature doesn't
>>>>>>>> exist yet
>>>>>>>> a. In the interpreter menu, the restart option could ask whether
>>>>>>>> to restart the interpreter only for the user or globally.  Or
>>>>>>>> maybe it only makes sense to allow restarting it for the user?
>>>>>>>> It seems like there's a more major undertaking in
>>>>>>>> https://issues.apache.org/jira/browse/ZEPPELIN-1338 so I don't
>>>>>>>> want to conflict with that direction.
>>>>>>>>
>>>>>>>> Have other people run into this problem?  Are there solution options 
>>>>>>>> I'm missing?  What option have you chosen?
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>> Dima
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards
>>>>>>>
>>>>>>> Jeff Zhang
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>
>
>
> --
> Best Regards
>
> Jeff Zhang
