I tinkered with this a bit.  I think the case for adding indexes for the
"REST" API may be weaker than even *I* thought (and I thought it was weak /
was mostly against it).

The reason is that these API calls are slow for reasons beyond indexing, so
adding indexes may not even help that much.  For example, the TI list
endpoint lists TIs (task instances).  TIs have a *lot* of information
distributed across many tables, and we always load all of it, sometimes with
multiple queries.  That's just not going to be an efficient way to implement
change data capture, a.k.a. replication.

I think the main responsibility of Airflow is to be performant with regard
to task execution, broadly defined.  Anything beyond that is sort of not
relevant, and getting data out of the metastore for ancillary purposes is
outside that line IMO.  The range of possible queries is too varied to
reasonably optimize for, so it's best left to whoever wears the "dba" hat
at the organization.

Users can also add endpoints via plugins that query the data more
efficiently.
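
To make that concrete, here is a rough sketch of the kind of narrow,
incremental query such an endpoint (or a DBA's own job) might run instead of
loading full TIs.  It is only an illustration: the table below is a
hypothetical, heavily simplified stand-in built in SQLite, and the names
make_db, changed_since, and idx_ti_updated_at are made up for this example,
not anything Airflow ships.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory stand-in for the metastore (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance ("
        "dag_id TEXT, task_id TEXT, run_id TEXT, state TEXT, updated_at TEXT)"
    )
    # An index the organization's DBA might choose to add for their own use case.
    conn.execute("CREATE INDEX idx_ti_updated_at ON task_instance (updated_at)")
    rows = [
        ("etl", "extract", "run_1", "success", "2024-01-01T00:00:00"),
        ("etl", "load", "run_1", "success", "2024-01-01T00:05:00"),
        ("etl", "extract", "run_2", "running", "2024-01-02T00:00:00"),
    ]
    conn.executemany("INSERT INTO task_instance VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def changed_since(conn, watermark, limit=1000):
    """Fetch only the columns a replication job needs, for rows changed after
    the given watermark, in a single query."""
    cur = conn.execute(
        "SELECT dag_id, task_id, run_id, state, updated_at "
        "FROM task_instance WHERE updated_at > ? "
        "ORDER BY updated_at LIMIT ?",
        (watermark, limit),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = make_db()
    for row in changed_since(conn, "2024-01-01T00:00:00"):
        print(row)
```

The point is just that one query over a handful of columns, driven by a
watermark, is a very different shape of work from what the list endpoint
does, and which columns and watermark make sense depends on the use case.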

That's my take in a nutshell I guess.
