Hi Mei Long,

I would be very happy to attend the Zeppelin community meeting. When is the next meeting? Should I wait for an email notification from the community?

The Zeppelin workflow ticket is here: https://issues.apache.org/jira/browse/ZEPPELIN-4018
Everyone's attention is welcome.

> On March 19, 2019, at 1:04 AM, Mei Long <ml...@zepl.com> wrote:
>
> Very cool! @Xun Liu Would you like to talk about it at our next Apache
> Zeppelin community meeting?
>
> On Sat, Mar 16, 2019 at 1:00 PM Felix Cheung <felixcheun...@hotmail.com>
> wrote:
>
>> I like it!
>>
>> ________________________________
>> From: Jongyoul Lee <jongy...@gmail.com>
>> Sent: Monday, March 11, 2019 9:05:03 PM
>> To: dev
>> Subject: Re: [discuss] Zeppelin support workflow
>>
>> Thanks for sharing this kind of discussion.
>>
>> I'm interested in it. Will see it.
>>
>> On Mon, Mar 11, 2019 at 10:43 AM Xun Liu <neliu...@163.com> wrote:
>>
>>> Hello, everyone
>>>
>>> Zeppelin has more than 20 interpreters, so data analysts can use it for
>>> a wide variety of data development work, and much of that work is
>>> interdependent. For example, developing machine learning algorithms
>>> relies on Spark to preprocess the data.
>>>
>>> Zeppelin should have built-in workflow capabilities instead of relying on
>>> external software to schedule notes in Zeppelin, for the following reasons:
>>>
>>> 1. Now that we have moved from the data processing era to the algorithm
>>> era, a Zeppelin with its own workflow will offer a complete ecosystem for
>>> data processing and algorithmic operations.
>>> 2. Zeppelin's powerful interactive processing capabilities help algorithm
>>> engineers improve their productivity. Zeppelin should give algorithm
>>> engineers more direct control, instead of handing the algorithm to
>>> other teams (or software) to build the workflow.
>>> 3. Zeppelin knows more about the processing status of data than Azkaban
>>> and Airflow, so a built-in workflow will have better performance, user
>>> experience and control.
>>>
>>> Typical use case
>>> This matters especially in machine learning, because machine learning
>>> tasks generally take a long time to execute.
>>> A typical example is as follows:
>>> 1) First, obtain data from HDFS through Spark;
>>> 2) Clean and convert the data through Spark SQL;
>>> 3) Extract features from the data through Spark;
>>> 4) Write the TensorFlow algorithm through Hadoop Submarine;
>>> 5) Distribute the TensorFlow algorithm as a job to YARN or k8s for batch
>>> processing;
>>> 6) Publish the trained model and provide online prediction services;
>>> 7) Run model prediction with Flink;
>>> 8) Receive incremental data through Flink for incremental updates of the
>>> model.
>>> Therefore, Zeppelin especially needs the ability to arrange workflows.
>>>
>>> I have completed a draft of the Zeppelin workflow system design. Please
>>> review it; you can modify the document directly or add comments.
>>>
>>> JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-4018
>>> gdoc:
>>> https://docs.google.com/document/d/1pQjVifOC1knPBuw3LVvby7GyNDXaeBq1ltRg6x4vDxM/edit
>>>
>>> :-)
>>>
>>> Xun Liu
>>> 2019-03-11
>>
>>
>>
>> --
>> 이종열, Jongyoul Lee, 李宗烈
>> http://madeng.net
>>
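
As a rough illustration of the eight-step use case quoted above, the steps can be written down as a dependency graph and ordered so that each one runs only after the step it depends on. This is only a sketch and not the design proposed in ZEPPELIN-4018: the step names, the `PIPELINE` mapping and the `run_order` helper are hypothetical, the chain is assumed to be strictly linear as listed in the mail, and it uses Python's standard `graphlib` module (Python 3.9+) rather than any Zeppelin API.

```python
from graphlib import TopologicalSorter

# Hypothetical step names for the eight steps in the use case.
# Each key maps to the set of steps it depends on (assumed linear here).
PIPELINE = {
    "load_from_hdfs": set(),
    "clean_with_spark_sql": {"load_from_hdfs"},
    "extract_features": {"clean_with_spark_sql"},
    "write_tensorflow_algorithm": {"extract_features"},
    "train_on_yarn_or_k8s": {"write_tensorflow_algorithm"},
    "publish_model": {"train_on_yarn_or_k8s"},
    "predict_with_flink": {"publish_model"},
    "incremental_update_with_flink": {"predict_with_flink"},
}


def run_order(steps):
    """Return an execution order in which every step follows its dependencies."""
    return list(TopologicalSorter(steps).static_order())


order = run_order(PIPELINE)
for position, step in enumerate(order, start=1):
    print(position, step)
```

A scheduler built along these lines would run one note (or paragraph) per step; `TopologicalSorter` also raises `graphlib.CycleError` if the declared dependencies form a loop.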