My initial goal when working on this cache was mostly to shorten DAG parsing 
times, simply because that's what I was looking at, so I'd be happy with 
restricting this cache to DAG parsing.
I'm still relatively new to the Airflow codebase and I don't know all the 
implications this change has, so I'm grateful for the comments here.

The benefits can be quite noticeable, although they depend a lot on the context. 
If the DAG file is simple, then a network call is going to be slow in comparison.
And if the parsing interval is short compared to the DAG's execution schedule, 
then the number of calls to fetch secrets is going to be dominated by DAG 
parsing rather than by task execution.

The scenario where this brings the most benefit is many simple DAG files, all 
querying the same key from Variables, parsed regularly, and run less often.
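
To illustrate what I mean, here is a minimal, made-up DAG file (the names are 
hypothetical, and `schedule` may be `schedule_interval` depending on the Airflow 
version): the `Variable.get()` call sits at module level, so it hits the secrets 
backend on every parse of the file, even though the DAG itself only runs once a day.

```python
from datetime import datetime

from airflow import DAG
from airflow.models import Variable
from airflow.operators.bash import BashOperator

# Runs at parse time: every time the DAG processor re-parses this file,
# the configured secrets backend is queried for "shared_config".
shared_config = Variable.get("shared_config")

with DAG(
    dag_id="simple_dag_using_variable",
    start_date=datetime(2023, 1, 1),
    schedule="@daily",  # the DAG executes once a day...
    catchup=False,
):
    # ...but the file is re-parsed every few minutes by default, so parsing
    # ends up dominating the number of secrets-backend calls.
    BashOperator(task_id="use_config", bash_command=f"echo {shared_config}")
```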

@Hussein says that "the user can implement [their] own secret backend", but 
it's not an easy task. They'd have to implement it as a wrapper around the 
custom backend they actually want to use, since only one custom secrets backend 
can be configured at a time. And implementing an in-memory cache that works 
across processes purely as a custom backend is simply impossible.
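
To make that concrete, such a wrapper would look roughly like the sketch below 
(class name and the wrapped backend are just examples, and the exact base-class 
methods vary a bit between Airflow versions). The cache is a plain dict, so it 
only lives in the process that created it; each DAG-processor subprocess would 
start with an empty cache, which is exactly the limitation I'm referring to.

```python
from __future__ import annotations

from airflow.providers.amazon.aws.secrets.secrets_manager import SecretsManagerBackend
from airflow.secrets import BaseSecretsBackend


class CachingSecretsBackend(BaseSecretsBackend):
    """Hypothetical wrapper that caches Variable lookups in front of a real backend."""

    def __init__(self, **kwargs):
        super().__init__()
        # Only one custom secrets backend can be configured, so the backend
        # we actually want to use has to be instantiated (hard-coded) here.
        self._inner = SecretsManagerBackend(**kwargs)
        self._variable_cache: dict[str, str | None] = {}

    def get_variable(self, key: str) -> str | None:
        # Per-process cache only: a forked/spawned parsing process gets its
        # own empty dict, so this doesn't help across processes.
        if key not in self._variable_cache:
            self._variable_cache[key] = self._inner.get_variable(key)
        return self._variable_cache[key]

    def get_conn_value(self, conn_id: str) -> str | None:
        # Connections are passed through uncached in this sketch.
        return self._inner.get_conn_value(conn_id)
```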

About secure caching: since I'm only caching in memory, I didn't do anything 
in that regard, but we already have something in place to encrypt secrets when 
they are saved in the metastore, using cryptography.fernet.
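
For reference, that mechanism is plain Fernet symmetric encryption from the 
cryptography package; if an encrypted cache were ever wanted, it would amount 
to storing the encrypted token and decrypting on read, roughly:

```python
from cryptography.fernet import Fernet

# Airflow takes the key from the fernet_key option in the [core] config;
# here one is generated just for the example.
key = Fernet.generate_key()
fernet = Fernet(key)

token = fernet.encrypt(b"my-variable-value")   # what would be stored/cached
plaintext = fernet.decrypt(token)              # what callers would read back
assert plaintext == b"my-variable-value"
```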
