Thank you. I fully agree with you that we need a framework to support a distributed version. IMHO, we cannot afford to develop our own. I'll dig into atomix as well.

On Tue, Jul 24, 2018 at 1:57 PM, liuxun <neliu...@163.com> wrote:

> @Jongyoul Lee:
>
> Thank you for your attention.
>
> Indeed, as you said, the `Copycat` project has been closed and has been
> migrated to `https://github.com/atomix/atomix`.
>
> I also considered this issue during development. The main reason was that
> `Copycat` was enough to implement Raft at the time, and I did not think
> about the long term.
>
> Today, I took a look at the atomix documentation,
> https://atomix.io/docs/latest/user-manual/ , which has a lot of features,
> such as broadcasting messages in the cluster and detecting cluster events.
> From the perspective of zeppelin's long-term development, it is better to
> use atomix. So, I will switch the Raft protocol library to atomix, which
> is not difficult to do.
>
> Struggle for zeppelin!!! :-)
>
> On Jul 24, 2018, at 9:35 AM, Jongyoul Lee <jongy...@gmail.com> wrote:
>
> First of all, thank you for your effort and contribution.
>
> I read it carefully today, and personally, I think it's a very nice
> feature and idea.
>
> Let's discuss it and improve it more concretely. I also left comments on
> the doc.
>
> And I have a simple question.
>
> `Copycat`, which you used to implement it, is deprecated by its owner[1]
> and moved under https://github.com/atomix/atomix/. That worries me. Do you
> have any reason to use this library? It's even a SNAPSHOT version.
>
> Regards,
> JL
>
> [1]: https://github.com/atomix/copycat
>
> On Sat, Jul 21, 2018 at 2:07 AM, liuxun <neliu...@163.com> wrote:
>
> Hi:
>
> To show more intuitively how distributed zeppelin clusters are actually
> used, I updated this design document. Starting on page 16, I added 2 GIF
> animations that show screen recordings of the zeppelin cluster we are
> using now.
>
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit#
>
> Distributed clustered zeppelin is already in use at our company, and the
> recorded screens are all real.
>
> The first GIF shows how to create a cluster of three zeppelin servers:
>
> 1. Add 234, 235, and 236 to the zeppelin.cluster.addr property in
>    zeppelin-site.xml to create a cluster.
> 2. Start these 3 servers at the same time.
> 3. Open the web pages of these 3 servers and prepare for the notebook
>    operation.
>
> The second GIF shows how an interpreter process is created in the cluster:
>
> 1. Create a notebook on host234 and execute it. This action creates an
>    interpreter process on the server with free resources in the cluster.
> 2. Continue editing this notebook on host235 and execute it. The results
>    return immediately, without waiting for an interpreter process to be
>    created.
> 3. Continue editing this notebook on host236 and execute it. Again, the
>    results return immediately, without waiting for an interpreter process
>    to be created.
>
> The same notebook reuses the interpreter process that was created first,
> so you get the execution result immediately on any server. If you look at
> the background server processes, you will find that host234, host235, and
> host236 use the same interpreter process for the same notebook.
>
> Originally, I also wanted to record what happens on an interpreter process
> exception, when the cluster re-creates the interpreter process on an idle
> server, but I am too tired now. I will record it later when there is time.
>
> On Jul 19, 2018, at 7:36 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>
> Thank you liuxun,
>
> I left a couple of comments in that Google document.
>
> --
> Ruslan Dautkhanov
>
> On Tue, Jul 17, 2018 at 11:30 PM liuxun <neliu...@163.com> wrote:
>
> Hi, Ruslan Dautkhanov
>
> Thank you very much for your question. Following your advice, I added
> 3 schematics to illustrate:
>
> 1. Distributed Zeppelin Deployment architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
> The email attachment exceeded the size limit, so I reorganized the
> document and updated it with Google Docs.
>
> https://docs.google.com/document/d/1a8QLSyR3M5AhlG1GIYuDTj6bwazeuVDKCRRBm-Qa3Bw/edit?usp=sharing
>
> On Jul 18, 2018, at 1:03 PM, liuxun <neliu...@163.com> wrote:
>
> Hi, Ruslan Dautkhanov
>
> Thank you very much for your question. Following your advice, I added
> 3 schematics to illustrate:
>
> 1. Zeppelin Cluster architecture diagram.
> 2. Distributed zeppelin Server fault tolerance diagram.
> 3. Distributed zeppelin Server & intp process fault tolerance diagram.
>
> Later, I will merge the schematics into the system design document.
>
> <Zeppelin system architecture diagram00.png>
> <Distributed zeppelin Server fault tolerance diagram 1.png>
> <Distributed zeppelin Server fault tolerance diagram 2.png>
>
> On Jul 18, 2018, at 1:16 AM, Ruslan Dautkhanov <dautkha...@gmail.com> wrote:
>
> Nice.
>
> Thanks for sharing.
>
> Can you explain how users are routed to a particular zeppelin server
> instance? I've seen nginx on top of them, but I don't think the document
> covers the details. If one zeppelin server goes down or becomes unhealthy,
> is nginx supposed to detect that (if so, how?) and reroute users to a
> surviving instance?
>
> Thanks,
> Ruslan Dautkhanov
>
> On Tue, Jul 17, 2018 at 2:46 AM liuxun <neliu...@163.com> wrote:
>
> Hi:
>
> Our company has installed and deployed many zeppelin instances for data
> analysis. The single-server version of zeppelin could not meet our
> application scenarios, so we transformed zeppelin into a clustered service
> that supports distributed deployment, with a unified entrance, high
> availability, and high server resource utilization. The email attachment
> is the entire design document. I would be very happy to contribute our
> modified code back to the community.
>
> This is the JIRA I submitted in the community:
> https://issues.apache.org/jira/browse/ZEPPELIN-3471
>
> Since the design document size exceeds the mail attachment size limit,
> I have to send the document links instead:
>
> https://issues.apache.org/jira/secure/attachment/12931896/Zeppelin%20distributed%20architecture%20design.pdf
>
> https://issues.apache.org/jira/secure/attachment/12931895/zepplin%20Cluster%20Sequence%20Diagram.png
>
> liuxun
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net

--
이종열, Jongyoul Lee, 李宗烈
http://madeng.net