Re: Zeppelin distributed architecture design

2018-07-23 Thread Jongyoul Lee
First of all, thank you for your effort and contribution.

I read it carefully today, and personally, it's a very nice feature and
idea.

Let's discuss it and make it more concrete. I also left comments on the
doc.

And I have a simple question.

`Copycat`, which you used to implement it, has been deprecated by its owner [1]
and moved under https://github.com/atomix/atomix/. That worries me. Do you
have a particular reason to use this library? The version used is even a
SNAPSHOT.

Regards,
JL

[1]: https://github.com/atomix/copycat

On Sat, Jul 21, 2018 at 2:07 AM, liuxun wrote:

> Hi,
>
> To show more intuitively how the distributed Zeppelin cluster is actually
> used, I updated the design document. Starting on page 16, I added 2 GIF
> animations: screen recordings of the Zeppelin cluster we are using now.
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>
> The distributed, clustered Zeppelin is already in use at our company, and
> the screen recordings are all real.
> The first GIF shows how to create a cluster of three Zeppelin servers:
> 1. Add 234, 235, 236 to the zeppelin.cluster.addr property in
>    zeppelin-site.xml to create a cluster.
> 2. Start these 3 servers at the same time.
> 3. Open the web pages of these 3 servers and prepare for the notebook
>    operation.
>
>
> The second GIF shows how an interpreter process is created in the cluster:
> 1. Create a notebook on host234 and execute it. This creates an
>    interpreter process on the server in the cluster with free resources.
> 2. Continue editing this notebook on host235 and execute it. The results
>    return immediately, without waiting for an interpreter process to be
>    created.
> 3. Continue editing this notebook on host236 and execute it. Again, the
>    results return immediately, without waiting for an interpreter process
>    to be created.
> The same notebook reuses the interpreter process that was created first, so
> you get the execution result immediately on any server.
> Looking at the background server processes, you will find that host234,
> host235, and host236 use the same interpreter process for the same notebook.
>
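
The reuse behaviour described for the second GIF can be sketched roughly as follows. This is only an illustration under assumptions, not the actual Zeppelin implementation: the `Server` and `ClusterRegistry` classes, the `free_slots` counter standing in for "free resources", and the rule of picking the server with the most free slots are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One Zeppelin server in the cluster (hypothetical model)."""
    name: str
    free_slots: int  # assumed stand-in for "free resources"


@dataclass
class ClusterRegistry:
    """Cluster-wide map of note id -> host running its interpreter process.

    In a real cluster this metadata would be shared between the servers;
    here it is a plain in-memory dict, for illustration only.
    """
    servers: list
    interpreters: dict = field(default_factory=dict)

    def run_note(self, note_id: str, via: str) -> tuple:
        """Run a note through server `via`; return (interpreter_host, created)."""
        if via not in {s.name for s in self.servers}:
            raise ValueError(f"unknown server: {via}")
        if note_id in self.interpreters:
            # An interpreter already exists somewhere in the cluster: reuse it.
            return self.interpreters[note_id], False
        # First execution: start the interpreter on the server that
        # currently has the most free slots (an assumed placement rule).
        target = max(self.servers, key=lambda s: s.free_slots)
        target.free_slots -= 1
        self.interpreters[note_id] = target.name
        return target.name, True


cluster = ClusterRegistry(
    [Server("host234", 2), Server("host235", 5), Server("host236", 3)]
)
first = cluster.run_note("note-1", via="host234")   # creates the interpreter
second = cluster.run_note("note-1", via="host235")  # reuses it
third = cluster.run_note("note-1", via="host236")   # reuses it
print(first, second, third)
```

Only the first call pays the cost of creating an interpreter process; the later calls from the other two servers look up the existing one, which matches the "results return immediately" behaviour in the recording.
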
> Originally, I also wanted to record an interpreter process failing and the
> cluster re-creating the interpreter process on an idle server, but I am
> too tired now.
> I will record it later when there is time.
>
>
> > On Jul 19, 2018, at 7:36 AM, Ruslan Dautkhanov wrote:
> >
> > Thank you liuxun,
> >
> > I left a couple of comments in that google document.
> >
> > --
> > Ruslan Dautkhanov
> >
> >
> > On Tue, Jul 17, 2018 at 11:30 PM liuxun <neliu...@163.com> wrote:
> > Hi Ruslan Dautkhanov,
> >
> > Thank you very much for your question. Following your advice, I added
> > 3 schematics to illustrate:
> > 1. Distributed Zeppelin Deployment architecture diagram.
> > 2. Distributed zeppelin Server fault tolerance diagram.
> > 3. Distributed zeppelin Server & intp process fault tolerance diagram.
> >
> >
> > The email attachment exceeded the size limit, so I reorganized the
> > document and uploaded it to Google Docs.
> > https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
> >
> >
> >> On Jul 18, 2018, at 1:03 PM, liuxun <neliu...@163.com> wrote:
> >>
> >> Hi Ruslan Dautkhanov,
> >>
> >> Thank you very much for your question. Following your advice, I
> >> added 3 schematics to illustrate:
> >> 1. Zeppelin Cluster architecture diagram.
> >> 2. Distributed zeppelin Server fault tolerance diagram.
> >> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
> >>
> >> Later, I will merge the schematic into the system design document.
> >>
> >>> On Jul 18, 2018, at 1:16 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
> >>>
> >>> Nice.
> >>>
> >>> Thanks for sharing.
> >>>
> >>> Can you explain how users are routed to a particular Zeppelin server
> >>> instance? I've seen nginx on top of them, but I don't think the
> >>> document covers the details. If one Zeppelin server goes down or
> >>> becomes unhealthy, is nginx supposed to detect that (if so, how?) and
> >>> reroute users to a surviving instance?
> >>>
> >>> Thanks,
> >>> Ruslan Dautkhanov
> >>>
> >>>
> >>> On Tue, Jul 17, 2018 at 2:46 AM liuxun <neliu...@163.com> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Our company has installed and deployed many Zeppelin instances for
> >>>> data analysis. The single-server version of Zeppelin could not meet
> >>>> our application scenarios, so we transformed Zeppelin into a
> >>>> clustered service that supports distributed deployment, with a
> >>>> unified entrance, high availability, and high server resource
> >>>> usage.  the email attach

Re: Zeppelin distributed architecture design

2018-07-23 Thread liuxun
@Jongyoul Lee:
Thank you for your attention.

Indeed, as you said, the `Copycat` project has been discontinued and
migrated to `https://github.com/atomix/atomix`.

I also considered this issue during development.
The main reason was that `Copycat` was sufficient to implement Raft at the
time, and I did not consider the long term.

Today I looked at the atomix documentation,
https://atomix.io/docs/latest/user-manual/ ,
which describes many features, such as broadcasting messages in the cluster
and detecting cluster events.
From the perspective of Zeppelin's long-term development, it is better to use
atomix.
So I will switch the Raft library to atomix; the change is not difficult to
make.
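
As background on what the Raft library has to provide regardless of which one is used: Raft accepts a change only once a majority of the servers has acknowledged it. A minimal sketch of that arithmetic for a cluster like the three-server one in the design document follows. The function names are illustrative only and are not part of the Copycat or atomix API.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of servers that forms a majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return cluster_size - quorum(cluster_size)


def can_commit(cluster_size: int, alive: int) -> bool:
    """True if the servers still alive are enough to commit a change."""
    return alive >= quorum(cluster_size)


three_node_quorum = quorum(3)
three_node_failures = tolerated_failures(3)
print(three_node_quorum, three_node_failures)
```

With three servers the majority is two, so the cluster metadata stays writable when one server is down but not when two are.
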

Struggle for zeppelin!!! :-)

