I know it is technically possible, but my case may be a little special. Say
I have 3 steps in my control flow (ETL workflow):
Step 1. Watch for an upstream file
Step 2. Call an external service to run one job, e.g. run a notebook or
a Python script
Step 3. Notify the downstream workflow
Can I use Apache Beam to build a DAG with these 3 nodes and run it as either
a Flink or a Spark job? It might be a little weird, but I just want to
learn from the community whether this is a reasonable way to use Apache Beam,
and whether anyone has done this before. Thanks
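
To make the question concrete, here is a minimal sketch of the three steps as
a linear Beam pipeline over a single element. It is only an illustration
under assumptions, not a recommended design: the function names
(wait_for_file, run_job, notify_downstream, build_pipeline) are hypothetical,
the "external job" is a stand-in subprocess, and the notification is just a
print. Only beam.Create and beam.Map come from Beam itself. Note that runners
may retry a bundle, so side effects in steps like these would need to be
idempotent.

```python
import pathlib
import subprocess
import sys
import time


def wait_for_file(path, poll_seconds=1.0, timeout_seconds=60.0):
    """Step 1: block until the upstream file exists, then return its path."""
    deadline = time.monotonic() + timeout_seconds
    while not pathlib.Path(path).exists():
        if time.monotonic() > deadline:
            raise TimeoutError(f"{path} did not appear in time")
        time.sleep(poll_seconds)
    return str(path)


def run_job(path):
    """Step 2: stand-in for the external job (here a tiny Python subprocess)."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('processed ' + sys.argv[1])", path],
        capture_output=True,
        text=True,
        check=True,  # raise if the job exits with a non-zero status
    )
    return result.stdout.strip()


def notify_downstream(message):
    """Step 3: stand-in for notifying the downstream workflow."""
    print(f"notify: {message}")
    return message


def build_pipeline(pipeline, path):
    """Chain the three steps as Beam transforms over one seed element."""
    # Imported lazily so the step functions can be exercised without Beam;
    # requires the apache-beam package to be installed.
    import apache_beam as beam

    return (
        pipeline
        | "Seed" >> beam.Create([path])
        | "WatchFile" >> beam.Map(wait_for_file)
        | "RunJob" >> beam.Map(run_job)
        | "Notify" >> beam.Map(notify_downstream)
    )
```

Assuming apache-beam is installed, this could be run locally with
`with beam.Pipeline() as p: build_pipeline(p, "/tmp/input.ready")`, where the
path is a placeholder. Each step sees exactly one element, so the pipeline
behaves like a 3-node chain rather than a data-parallel job.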



On Fri, Dec 15, 2023 at 10:28 AM Byron Ellis via user <[email protected]>
wrote:

> It’s technically possible, but the closest thing I can think of would be
> triggering work based on events such as file watching.
>
> On Thu, Dec 14, 2023 at 2:46 PM data_nerd_666 <[email protected]>
> wrote:
>
>> Not as a time-based scheduler. I just want to use it to control the
>> execution order of an ETL workflow DAG, because Beam's abstraction is also
>> a DAG.
>> I know it is a little weird. I just want to confirm with the community: has
>> anyone used Beam like this before?
>>
>>
>>
>> On Thu, Dec 14, 2023 at 10:59 PM Jan Lukavský <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> can you give an example of what you mean for better understanding? Do
>>> you mean using Beam as a scheduler of other ETL workflows?
>>>
>>>   Jan
>>>
>>> On 12/14/23 13:17, data_nerd_666 wrote:
>>> > Hi all,
>>> >
>>> > I am new to Apache Beam, and am very excited to find Beam in the Apache
>>> > community. I see lots of use cases for Apache Beam in data flow
>>> > (processing large amounts of batch/streaming data). I am just wondering
>>> > whether I can use Apache Beam for control flow (ETL workflow). I don't
>>> > mean the Spark/Flink job in the ETL workflow, I mean the ETL workflow
>>> > itself. An ETL workflow is also a DAG, which is very similar to
>>> > the abstraction of Apache Beam, but unfortunately I didn't find such
>>> > use cases on the internet. So I'd like to ask the Beam
>>> > community to confirm whether I can use Apache Beam for control flow
>>> > (ETL workflow). If yes, please point me to some success stories.
>>> > Thanks
>>> >
>>> >
>>> >
>>>
>>
