> Are we all talking about different things ? Good point. I think that's the main source of confusion here and we think about different things.
> So what I feel that the use case that Nathan defined can just be solved a > VaultHook & VaultOperator for example. That's what I was talking (from the beginning - maybe it was not clear) about separate hooks for each service. Not a shared one. GCP Secret Manager Hook, Vault Hook, KMS Hook, AWS Secret Hook - all of them separate, in different providers, and simple hooks to be used by whoever wants to use them in their custom operators. We also talked about implementing operators, but there is very little use of generic Operators for secrets. Mainly because the only way operators can pass anything to other operators (tasks) is via xcom which would make the secrets stored plain text in the database. That is rather bad I am afraid. Having Hooks make them instantiatable in the context of running tasks, use Fernet to decrypt credentials from the Connection DB, request to retrieve secret from the backend and pass the unencrypted secret to the other parts of the operator - all in the context of a single worker/task. > > This should not be confused with "Secrets" at all. Why do we need to create > a generic Hooks for all Secrets Backend? No generic hooks :). I never meant it to be generic.Maybe that's a confusion there - I wanted to implement a separate hook for every type of backend. > Consider we use PostgreSQL for backend and the connection is defined in > airflow.cfg. Now you can still use the MySQLHook and PostgresHook > independently to connect to those Databases, correct. > > But they both should not be confused to be using anything "shared". No plans for that whatsoever. > The proposal if I interpret correctly talks about the following: > > We have an idea that we might want also (on top of the above SecretManager > > implementation) define generic Hooks for accessing secrets from those > > services (just generic secrets, not connection, variables). Simply treat > > each of the backends above as another "provider" and create a Hook to > > access the service. Such Hook could have just one method: > > def get_secret(self, path_prefix: str, secret_id: str) -> Optional[str] > > It would use a connection defined (as usual) in ENV variables or database > > of Airflow to authenticate with the secret service and retrieve the > > secrets. OK. maybe confusion is about 'generic' . My "generic" was ("no connections, no variables") - just retrieve "generic" secret. Separate implementation for Hashicorp Vault, Separate for Secret Manager, etc. > The connection can be defined in The Secrets backend. To make it clearer, > "Vault" in Nathan's case is a "Service" and has nothing to do with > SecretsBackend similar to how PostgresHook or MySQLHook has nothing to do > with using Postgres as Airflow MetadataDB Backend. > > Another example is Google KMS, there is already Hook for Google KMS ( > https://github.com/apache/airflow/blob/master/airflow/providers/google/cloud/hooks/kms.py) > and an Operator can be created. Same can be done for Google Secrets Manager > and Hashicorp Vault, in which cases all of these are "Services". That's exactly what I plan to implement. As explained above - Operator for secrets makes no sense because it would have to pass the secrets via xcom :(. I did not even check that we already have KMS hook. I was mostly about Vault and Secret Manager and AWS Secret Manager. Knowing that we have KMS makes it even easier :). > We could create SecretsHook similar to DbApiHook ( > https://github.com/apache/airflow/blob/master/airflow/hooks/dbapi_hook.py) > if we want to just define the single *get_secret* method you talked about. I don't even plan that in fact, I thought about implementing several totally independent Hooks for each of the Backend Secrets. > The concept of "Secrets Backend" is to allow Managing of "Secrets used in > Airflow" (Either to connect to an external system or Variables) in actual > Secret Management Tools. > Yeah. I do not - at all - want to mess with that :) > > *Pros:* > > And I > > well imagine this might be actually even more convenient to configure > > connections in the DB and access secrets this way rather than having to > > configure Secret Backends in Airflow configuration. > > This is exactly where both "Secrets" and the "Service" terms are mixed I > think. Again echoing what I said above : The concept of "Secrets Backend" > is to allow Managing of "Secrets used in Airflow". > The Secrets Backend is so that you don't need to store secrets in Airflow > Metadata DB whether they can encrypted or not as there are tools that are > specifically designed to handle "Secrets, rotation of secrets etc". Having > the Hook and Operator to talk to the Service should be separate. Full agreement - I do not want to intermix those. It was always thought as per-provider implementation of traditional "Hook". > > * Another benefit of it is that it would allow people still stuck on pre > > 1.10.10 to write custom operators that would like to use secret backends > > (via backport operators). And still continue doing it in the future > > (possibly migrating to 2.0/1.10.10+ in cases when there is one secret > > backed only - but continue ot use connections/hooks where some specific > > secrets shoudl be kept in different secret backend. > > > What is the objective here: (1) is it to interact with those Services > (Vault or Secrets Manager etc) or (2) Get Airflow Connections and Variables > from different Secrets Backend Just to interact with it - no plans at all to get Airflow Connections nor Variables. > > Regards, > Kaxil >