I like it!

________________________________
From: Jongyoul Lee <jongy...@gmail.com>
Sent: Monday, March 11, 2019 9:05:03 PM
To: dev
Subject: Re: [discuss] Zeppelin support workflow

Thanks for sharing this kind of discussion.

I'm interested in it and will take a look.

On Mon, Mar 11, 2019 at 10:43 AM Xun Liu <neliu...@163.com> wrote:

> Hello, everyone
>
> Zeppelin has more than 20 interpreters, so data analysts can use it for
> a wide variety of data development work, and much of that work is
> interdependent. For example, developing machine learning algorithms
> relies on Spark to preprocess the data.
>
> Zeppelin should have built-in workflow capabilities, instead of relying on
> external software to schedule notes in Zeppelin, for the following reasons:
>
> 1. Now that we have moved from the data processing era to the algorithm
> era, a built-in workflow would give Zeppelin a complete ecosystem
> covering both data processing and algorithm operations.
> 2. Zeppelin's powerful interactive processing capabilities help algorithm
> engineers improve their productivity. Zeppelin should give the algorithm
> engineer more direct control, instead of handing the algorithm to other
> teams (or software) to run the workflow.
> 3. Zeppelin knows more about the processing status of data than Azkaban
> and Airflow, so a built-in workflow will have better performance, user
> experience and control.
>
> Typical use case
> Workflows matter especially in machine learning, because machine learning
> tasks generally take a long time to execute.
> A typical example is as follows:
> 1) First, obtain data from HDFS through Spark;
> 2) Clean and convert the data through Spark SQL;
> 3) Extract features from the data through Spark;
> 4) Write the TensorFlow algorithm through Hadoop Submarine;
> 5) Distribute the TensorFlow algorithm as a job to YARN or k8s for batch
> processing;
> 6) Publish the model obtained from training and provide online prediction
> services;
> 7) Run model prediction with Flink;
> 8) Receive incremental data through Flink to incrementally update the
> model.
> Therefore, Zeppelin especially needs the ability to orchestrate
> workflows.
>
> I have completed a draft of the Zeppelin workflow system design. Please
> review it; you can modify the document directly or add comments.
>
> JIRA: https://issues.apache.org/jira/browse/ZEPPELIN-4018
> gdoc:
> https://docs.google.com/document/d/1pQjVifOC1knPBuw3LVvby7GyNDXaeBq1ltRg6x4vDxM/edit
>
>
> :-)
>
> Xun Liu
> 2019-03-11



--
이종열, Jongyoul Lee, 李宗烈
http://madeng.net
