jscheffl commented on PR #36492: URL: https://github.com/apache/airflow/pull/36492#issuecomment-1884625637
> > Hi, just saw this. Just curious, why not just always encrypt, instead of forcing each implementer to mess with the kwargs naming?
>
> My take: encrypting all of it makes trigger data opaque and difficult to diagnose in case of problems.
>
> Also, in future optimisation cases it makes it impossible (or very slow) to do some housekeeping. A good example of this is the encrypted user sessions in the users DB. We could not delete old sessions without seeing all the metadata that was encrypted in the blob - for example, we have not been able to do very efficient "historical session" deletion because the session creation time was part of the blob.
>
> The current implementation uses ExtendedJson to store kwargs in the DB - with which, in modern DBs (even MySQL(!)), we could make efficient queries for kwargs. I can easily imagine the use of it for particular types of triggers (find all deferred KPOs that have been using this namespace, for example). I think - for example - when we go to multitenancy and even (maybe in the future) to multi-multi-tenancy, this kind of queryable metadata might become really useful even for internal Airflow stuff. But even now it can provide valuable information and diagnostics for some power users.

Mhm, I agree that encryption makes troubleshooting harder - but the pickled JSON/object data cannot be queried at the DB level without Python anyway. Considering that the triggering information is very volatile (I am not sure, but I was expecting that it is cleaned up after a task is completed and does not pile up?), it would be really reasonable to just encrypt everything. Then less code is needed.