Hello,

I think this page might answer your question: https://beam.apache.org/documentation/ml/orchestration/

Frankly speaking, I would use GCP Workflows or Apache Airflow for this. Beam itself is a streaming and batch data processor, not a workflow engine (IMHO).
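To make the distinction concrete, here is a minimal sketch of what "control flow" means for the three steps listed in the quoted message: a runner that only decides the order in which tasks execute and moves no data between them. It uses only Python's standard-library graphlib. The task names, their bodies, and run_workflow are hypothetical placeholders for illustration; they are not Beam, Airflow, or GCP Workflows APIs, and a real orchestrator adds scheduling, retries, and monitoring on top of this.

```python
from graphlib import TopologicalSorter


# Hypothetical placeholder tasks: each one only records that it ran.
# A real step would poll storage, call an external service, and so on.
def watch_upstream_file(log):
    log.append("step 1: upstream file detected")


def run_external_job(log):
    log.append("step 2: external job finished")


def notify_downstream(log):
    log.append("step 3: downstream workflow notified")


TASKS = {
    "watch_upstream_file": watch_upstream_file,
    "run_external_job": run_external_job,
    "notify_downstream": notify_downstream,
}

# Control-flow DAG: each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "watch_upstream_file": set(),
    "run_external_job": {"watch_upstream_file"},
    "notify_downstream": {"run_external_job"},
}


def run_workflow():
    """Run every task once, in an order that respects the dependencies."""
    log = []
    for name in TopologicalSorter(DEPENDENCIES).static_order():
        TASKS[name](log)
    return log


if __name__ == "__main__":
    for entry in run_workflow():
        print(entry)
```

Note that the edges here carry ordering only. In a Beam pipeline the edges carry PCollections of data, which is why the two kinds of DAG look alike but solve different problems.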
On Fri, Dec 15, 2023 at 3:13 PM data_nerd_666 <dataner...@gmail.com> wrote:

> I know it is technically possible, but my case may be a little special.
> Say I have 3 steps for my control flow (ETL workflow):
> Step 1. upstream file watching
> Step 2. call some external service to run one job, e.g. run a notebook,
> run a python script
> Step 3. notify downstream workflow
> Can I use apache beam to build a DAG with 3 nodes and run this as either
> flink or spark job. It might be a little weird, but I just want to
> learn from the community whether this is the right way to use apache beam,
> and has anyone done this before? Thanks
>
> On Fri, Dec 15, 2023 at 10:28 AM Byron Ellis via user <
> user@beam.apache.org> wrote:
>
>> It’s technically possible but the closest thing I can think of would be
>> triggering things based on things like file watching.
>>
>> On Thu, Dec 14, 2023 at 2:46 PM data_nerd_666 <dataner...@gmail.com>
>> wrote:
>>
>>> Not using beam as time-based scheduler, but just use it to control
>>> execution orders of ETL workflow DAG, because beam's abstraction is also a
>>> DAG.
>>> I know it is a little weird, just want to confirm with the community,
>>> has anyone used beam like this before?
>>>
>>> On Thu, Dec 14, 2023 at 10:59 PM Jan Lukavský <je...@seznam.cz> wrote:
>>>
>>>> Hi,
>>>>
>>>> can you give an example of what you mean for better understanding? Do
>>>> you mean using Beam as a scheduler of other ETL workflows?
>>>>
>>>> Jan
>>>>
>>>> On 12/14/23 13:17, data_nerd_666 wrote:
>>>> > Hi all,
>>>> >
>>>> > I am new to apache beam, and am very excited to find beam in apache
>>>> > community. I see lots of use cases of using apache beam for data flow
>>>> > (process large amount of batch/streaming data). I am just wondering
>>>> > whether I can use apache beam for control flow (ETL workflow). I don't
>>>> > mean the spark/flink job in the ETL workflow, I mean the ETL workflow
>>>> > itself. Because ETL workflow is also a DAG which is very similar as
>>>> > the abstraction of apache beam, but unfortunately I didn't find such
>>>> > use cases on internet. So I'd like to ask this question in beam
>>>> > community to confirm whether I can use apache beam for control flow
>>>> > (ETL workflow). If yes, please let me know some success stories of
>>>> > this. Thanks

--
Sincerely yours
Mikhail Khludnev