Hi.
I'm new to Airflow, and I'm trying to understand whether this tool can help us
in our work.
In my case I need to translate this kind of behavior into DAGs:
* monitor a local directory
* if 1, 2, 10... "new" files are detected, then for each file it's necessary to:
  * open the file and extract an (unbounded) list of items
  * for each item:
    * call a pair of REST endpoints to retrieve additional metadata
    * write the retrieved data and the original item to an SQL db
  * if everything went well, send a POST to notify that the single file has
    been processed
  * move the file to a "DONE" directory
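To make the two nested loops concrete, here is a rough plain-Python sketch of the steps above. It is not Airflow code: the callables `extract_items`, `fetch_metadata`, `save_row` and `notify_done` are hypothetical placeholders for the file parsing, the two REST calls, the SQL insert and the final POST.

```python
import shutil
from pathlib import Path
from typing import Callable, Iterable


def process_file(
    path: Path,
    done_dir: Path,
    extract_items: Callable[[Path], Iterable[str]],
    fetch_metadata: Callable[[str], dict],
    save_row: Callable[[str, dict], None],
    notify_done: Callable[[Path], None],
) -> int:
    """Run the per-file steps; every callable is a hypothetical placeholder."""
    count = 0
    for item in extract_items(path):      # inner "for each": items in the file
        metadata = fetch_metadata(item)   # stands in for the two REST calls
        save_row(item, metadata)          # stands in for the SQL insert
        count += 1
    notify_done(path)                     # stands in for the POST; skipped if anything raised
    done_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(path), str(done_dir / path.name))
    return count


def process_new_files(new_files: Iterable[Path], done_dir: Path, **steps) -> dict:
    """Outer "for each": one process_file call per detected file."""
    return {p.name: process_file(p, done_dir, **steps) for p in new_files}
```

Written like this everything runs sequentially in one process. What I don't know is how to express the same shape as tasks in a DAG.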
I'm struggling to understand how to map those two "for each" behaviors.
From what I understood, the "monitoring" should be done via some "sensor" that
basically polls for a condition at a fixed rate, so a single check can detect
more than one "new file".
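As an illustration of why one check can return several files, this is roughly the polling logic I have in mind, again in plain Python and not using Airflow's sensor API. The function name `detect_new_files` and the `seen` set are my own assumptions.

```python
from pathlib import Path


def detect_new_files(directory: Path, seen: set) -> list:
    """Return files not reported by earlier polls and remember them in `seen`."""
    current = {p.name for p in directory.iterdir() if p.is_file()}
    new_names = sorted(current - seen)  # may hold 0, 1 or many names
    seen.update(new_names)
    return [directory / name for name in new_names]
```

Each call is one "poll". The number of paths it returns varies from run to run, and that is exactly the part I don't know how to model.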
I have a similar scenario inside each file (it's basically a list of "jobs"),
but there I also need to perform some actions once all of the jobs have
completed successfully.
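Outside Airflow, I would write this "fan out, then continue only if everything succeeded" step more or less like the sketch below, using the standard library's thread pool. The names `run_job` and `on_all_succeeded` are hypothetical placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def run_jobs_then_finish(
    jobs: Iterable[str],
    run_job: Callable[[str], None],
    on_all_succeeded: Callable[[], None],
    max_workers: int = 4,
) -> bool:
    """Fan out one call per job, then fan in: finish only if none failed."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_job, job) for job in jobs]
    # Leaving the "with" block waits for every submitted job to complete.
    all_ok = all(f.exception() is None for f in futures)
    if all_ok:
        on_all_succeeded()
    return all_ok
```

The number of submitted jobs is only known at run time, which is the behavior I would like to reproduce in a DAG.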
The main problem I see is that I cannot find an operator that can "spread" the
work in the DAG (or a "magical dynamic DAG", where I can have step 1, step 2
and then step 3.a, 3.b, ..., 3.x with a different x per run).
I saw TriggerDagRunOperator, but it can basically trigger (optionally) only one
external DAG.
I also found some "TriggerMultiDagRunOperator" implementations (for example the
one at https://github.com/mastak/airflow_multi_dagrun ), but I would like to
know whether this is a good use of Airflow, whether there is a better approach,
or whether it's better to look at other tools.
Thanks
Vito De Tullio
Senior developer
FINCONS SPA
Via Torri Bianche 10 - Pal. Betulla
20871 Vimercate (MB)
Tel. +39 039657081
Fax +39 0396570877