Using it for 3 months without any incident.

On Apr 8, 2016 9:09 AM, "ashish rawat" <dceash...@gmail.com> wrote:
> Sounds great. How long have you been using glusterfs in prod, and have you
> encountered any challenges? The only difficulty for me in using it would be
> a lack of expertise to fix broken things, so I hope its stability isn't
> something to be concerned about.
>
> Regards,
> Ashish
>
> On Fri, Apr 8, 2016 at 12:20 PM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Use the fuse interface. The Gluster volume is directly accessible as
>> local storage on all nodes, but performance is only 200 Mb/s. More than
>> enough for notebooks. For data, prefer tachyon/alluxio on top of
>> gluster...
>>
>> On Apr 8, 2016 6:35 AM, "ashish rawat" <dceash...@gmail.com> wrote:
>>
>>> Thanks Eran and Vincent.
>>> Eran, I would definitely like to try it out, since it won't add to the
>>> complexity of my deployment. I would look at the S3 implementation to
>>> figure out how complex it would be.
>>>
>>> Vincent,
>>> I haven't explored glusterfs at all. Would it also require writing an
>>> implementation of the storage interface, or can zeppelin work with it
>>> out of the box?
>>>
>>> Regards,
>>> Ashish
>>>
>>> On Wed, Apr 6, 2016 at 12:53 PM, vincent gromakowski <
>>> vincent.gromakow...@gmail.com> wrote:
>>>
>>>> For 1, marathon on mesos restarts the zeppelin daemon in case of
>>>> failure.
>>>> For 2, a glusterfs fuse mount allows sharing notebooks across all
>>>> mesos nodes.
>>>> For 3, not available right now in our design, but a manual restart in
>>>> the zeppelin config page is acceptable for us.
>>>>
>>>> On Apr 6, 2016 8:18 AM, "Eran Witkon" <eranwit...@gmail.com> wrote:
>>>>
>>>>> Yes, this is correct.
>>>>> For HA disk: if you don't have HA storage and no access to S3 then,
>>>>> AFAIK, you don't have another option at the moment.
>>>>> If you would like to save notebooks to elastic, then I suggest you
>>>>> look at the storage interface and the implementations for git and s3,
>>>>> and implement that yourself. It does sound like an interesting
>>>>> feature.
>>>>> Best,
>>>>> Eran
>>>>>
>>>>> On Wed, 6 Apr 2016 at 08:57 ashish rawat <dceash...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Eran. So 3 seems to be something external to Zeppelin, and
>>>>>> hopefully 1 only means running "zeppelin-daemon.sh start" on a slave
>>>>>> machine when the master becomes inaccessible. Is that correct?
>>>>>>
>>>>>> My main concern still remains on the storage front, and I don't
>>>>>> really have high-availability disks or even hdfs in my setup. I have
>>>>>> been using an elastic search cluster for data high availability, but
>>>>>> was hoping that zeppelin could save notebooks to Elastic Search
>>>>>> (like kibana) or maybe a document store.
>>>>>>
>>>>>> Any idea if anything is planned in that direction? I don't want to
>>>>>> fall back to 'rsync'-like options.
>>>>>>
>>>>>> Regards,
>>>>>> Ashish
>>>>>>
>>>>>> On Tue, Apr 5, 2016 at 11:17 PM, Eran Witkon <eranwit...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> For 1, you need to have both zeppelin web HA and zeppelin daemon
>>>>>>> HA.
>>>>>>> For 2, I guess you can use HDFS if you implement the storage
>>>>>>> interface for HDFS, but I am not sure.
>>>>>>> For 3, I mean that if you connect to an external cluster, for
>>>>>>> example a spark cluster, you need to make sure your spark cluster
>>>>>>> is HA. Otherwise you will have zeppelin running, but your notebook
>>>>>>> will fail as no spark cluster is available.
>>>>>>> HTH,
>>>>>>> Eran
>>>>>>>
>>>>>>> On Tue, 5 Apr 2016 at 20:20 ashish rawat <dceash...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks Eran for your reply.
>>>>>>>> For 1) I am assuming that it would be similar to HA of any other
>>>>>>>> web application, i.e. running multiple instances and switching to
>>>>>>>> the backup server when the master is down. Is that not the case?
>>>>>>>> For 2) is it also possible to save it on hdfs?
>>>>>>>> Can you please explain 3? Are you referring to interpreter config?
>>>>>>>> If I am using the Spark interpreter and submitting jobs to it, and
>>>>>>>> the zeppelin master node goes down, then what could be the problem
>>>>>>>> with the slave node pointing to the same cluster and submitting
>>>>>>>> jobs?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Ashish
>>>>>>>>
>>>>>>>> On Tue, Apr 5, 2016 at 10:08 PM, Eran Witkon
>>>>>>>> <eranwit...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I would say you need to account for these things:
>>>>>>>>> 1) availability of the zeppelin daemon
>>>>>>>>> 2) availability of the notebook files
>>>>>>>>> 3) availability of the interpreters used.
>>>>>>>>>
>>>>>>>>> For 1, I don't know of an out-of-the-box solution.
>>>>>>>>> For 2, any HA storage will do: s3 or any HA external mounted
>>>>>>>>> disk.
>>>>>>>>> For 3, it is up to the interpreter and your big data HA solution.
>>>>>>>>>
>>>>>>>>> On Tue, 5 Apr 2016 at 19:29 ashish rawat <dceash...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Is there a suggested architecture to run Zeppelin in high
>>>>>>>>>> availability mode? The only option I could find was saving
>>>>>>>>>> notebooks to S3. Are there any options if one is not using AWS?
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Ashish
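
[Editor's note: the S3 notebook storage discussed in this thread is configured through environment variables in conf/zeppelin-env.sh. A minimal sketch follows; the class name matches the Zeppelin documentation of this era, while the bucket and user names are illustrative.]

```shell
# conf/zeppelin-env.sh — store notebooks in S3 instead of the local
# notebook/ directory. Bucket and user names below are illustrative;
# notebooks end up under s3://<bucket>/<user>/notebook/.
export ZEPPELIN_NOTEBOOK_S3_BUCKET="my-zeppelin-notebooks"
export ZEPPELIN_NOTEBOOK_S3_USER="zeppelin"
export ZEPPELIN_NOTEBOOK_STORAGE="org.apache.zeppelin.notebook.repo.S3NotebookRepo"
```

AWS credentials are picked up the usual way (credentials file, environment, or instance role), so none appear in the config itself.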
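[Editor's note: Vincent's GlusterFS approach amounts to FUSE-mounting the same volume on every node and pointing Zeppelin's notebook directory at it, so no custom storage-interface implementation is needed. A sketch, assuming the glusterfs-fuse client is installed; host, volume, and mount-point names are illustrative.]

```shell
# On each Mesos/Zeppelin node: mount the shared Gluster volume via FUSE.
# "gluster-node1" and the "notebooks" volume are illustrative names.
sudo mkdir -p /mnt/zeppelin-notebooks
sudo mount -t glusterfs gluster-node1:/notebooks /mnt/zeppelin-notebooks

# Then point Zeppelin at the shared mount in conf/zeppelin-env.sh:
export ZEPPELIN_NOTEBOOK_DIR="/mnt/zeppelin-notebooks"
```

Because every node sees the same directory, whichever node the daemon is (re)started on finds the current notebooks, which is what makes the failover scheme in this thread work.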
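[Editor's note: the "marathon on mesos restarts the zeppelin daemon" part can be sketched as a Marathon app definition submitted over its REST API. The Marathon host, install path, and resource sizes below are assumptions; bin/zeppelin.sh runs Zeppelin in the foreground, which is what Marathon needs in order to supervise and restart the process.]

```shell
# Register a Marathon app that keeps exactly one Zeppelin instance
# running and restarts it on failure. Host and resources are illustrative.
curl -X POST http://marathon.example.com:8080/v2/apps \
  -H "Content-Type: application/json" \
  -d '{
        "id": "/zeppelin",
        "cmd": "cd /opt/zeppelin && bin/zeppelin.sh",
        "instances": 1,
        "cpus": 1,
        "mem": 2048
      }'
```

Combined with a shared notebook directory (GlusterFS or S3 above), this covers points 1 and 2 of Eran's checklist; point 3 (interpreter/cluster HA) remains external to Zeppelin.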