Hi Flavio,

If you know a bit of Python, it's also trivial to add a new Airflow operator
for Flink that uses the REST API.
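
A minimal sketch of the REST calls such an operator could make, using only the Python standard library. It assumes the jar was already uploaded (for example via `POST /jars/upload`) and that the Flink REST endpoint is reachable at a base URL such as `http://localhost:8081`. The names `http_json` and `run_flink_jar` are made up for illustration and are not part of Flink or Airflow.

```python
import json
import time
import urllib.request

# Flink job states after which the job will not change state again.
TERMINAL_STATES = {"FINISHED", "FAILED", "CANCELED"}


def http_json(url, method="GET", body=None):
    """Send a JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def run_flink_jar(base_url, jar_id, fetch=http_json, poll_seconds=5.0):
    """Start an already uploaded jar and block until the job terminates.

    Raises RuntimeError unless the job ends in FINISHED, so a calling
    Airflow task would fail and its downstream tasks would not start.
    """
    # POST /jars/:jarid/run starts the job and returns its id.
    job_id = fetch(f"{base_url}/jars/{jar_id}/run", method="POST", body={})["jobid"]
    while True:
        # GET /jobs/:jobid reports the current state of the job.
        state = fetch(f"{base_url}/jobs/{job_id}")["state"]
        if state in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)
    if state != "FINISHED":
        raise RuntimeError(f"Flink job {job_id} ended in state {state}")
    return job_id
```

A custom operator's `execute` method could call `run_flink_jar` once per job. Because the function raises on any outcome other than `FINISHED`, a second task that depends on the first would run only after the first job succeeds.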

In general, I'd consider Airflow to be the best choice for your problem,
especially if it gets more complicated in the future (do something else if
the first job fails).

If you have specific questions, feel free to ask.

Best,

Arvid

On Tue, Feb 2, 2021 at 10:08 AM 姜鑫 <[email protected]> wrote:

> Hi Flavio,
>
> I think I understand what you need. Apache AirFlow is a scheduling
> framework in which you can define your own dependent operators, so you
> can define a BashOperator to submit a Flink job to your local Flink cluster.
> For example:
> ```
> t1 = BashOperator(
>     task_id='flink-wordcount',
>     bash_command='./bin/flink run flink/build-target/examples/batch/WordCount.jar',
>     ...
> )
> ```
> Also, Airflow supports submitting jobs to Kubernetes, and you can even
> implement your own operator if a bash command doesn't meet your needs.
>
> Indeed, Flink AI (flink-ai-extended
> <https://github.com/alibaba/flink-ai-extended> ?) needs an enhanced
> version of AirFlow, but it is mainly for streaming scenarios, in which the
> job never stops. In your case, where all jobs are batch jobs, it doesn't
> help much. Hope this helps.
>
> Regards,
> Xin
>
>
> On Feb 2, 2021, at 4:30 PM, Flavio Pompermaier <[email protected]> wrote:
>
> Hi Xin,
> let me state first that I have never used AirFlow, so I am probably missing
> some background here.
> I just want to externalize the job scheduling to some consolidated
> framework and from what I see Apache AirFlow is probably what I need.
> However, I can't find any good blog post or documentation about how to
> integrate these 2 technologies using the REST APIs of both services.
> I saw that Flink AI decided to use a customized/enhanced version of
> AirFlow [1] but I didn't look into the code to understand how they use it.
> In my use case I just want to schedule 2 Flink batch jobs using the REST
> API of AirFlow, where the second one is fired after the first.
>
> [1] https://github.com/alibaba/flink-ai-extended/tree/master/flink-ai-flow
>
> Best,
> Flavio
>
> On Tue, Feb 2, 2021 at 2:43 AM 姜鑫 <[email protected]> wrote:
>
>> Hi Flavio,
>>
>> Could you explain what your specific question is? In my opinion, it is
>> possible to define two Airflow operators to submit dependent Flink jobs, as
>> long as the first one runs to completion.
>>
>> Regards,
>> Xin
>>
>> On Feb 1, 2021, at 6:43 PM, Flavio Pompermaier <[email protected]> wrote:
>>
>> Any advice here?
>>
>> On Wed, Jan 27, 2021 at 9:49 PM Flavio Pompermaier <[email protected]>
>> wrote:
>>
>>> Hello everybody,
>>> is there any suggested way/pointer to schedule Flink jobs using Apache
>>> AirFlow?
>>> What I'd like to achieve is the submission (using the REST API of
>>> AirFlow) of 2 jobs, where the second one can be executed only if the first
>>> one succeeds.
>>>
>>> Thanks in advance
>>> Flavio
>>>
>>
>
>
