This is an automated email from the ASF dual-hosted git repository. kaxilnaik pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/airflow.git
The following commit(s) were added to refs/heads/master by this push: new 9d4b914 Docs: Separate page for each Secrets backend (#10211) 9d4b914 is described below commit 9d4b914fa29982514cdac3d6458dcd6f827dda0a Author: Kamil Breguła <mik-...@users.noreply.github.com> AuthorDate: Fri Aug 7 12:16:50 2020 +0200 Docs: Separate page for each Secrets backend (#10211) --- docs/howto/connection/index.rst | 2 +- docs/howto/index.rst | 2 +- .../aws-secrets-manaager-backend.rst | 73 +++ .../aws-ssm-parameter-store-secrets-backend.rst | 51 ++ .../google-cloud-secret-manager-backend.rst | 134 ++++++ .../hashicorp-vault-secrets-backend.rst | 117 +++++ docs/howto/secrets-backend/index.rst | 86 ++++ .../local-filesystem-secrets-backend.rst | 145 ++++++ docs/howto/use-alternative-secrets-backend.rst | 519 --------------------- docs/integration.rst | 2 +- docs/redirects.txt | 1 + 11 files changed, 610 insertions(+), 522 deletions(-) diff --git a/docs/howto/connection/index.rst b/docs/howto/connection/index.rst index d88e020..0254d59 100644 --- a/docs/howto/connection/index.rst +++ b/docs/howto/connection/index.rst @@ -127,7 +127,7 @@ Alternative secrets backend --------------------------- In addition to retrieving connections from environment variables or the metastore database, you can enable -an alternative secrets backend to retrieve connections. For more details see :doc:`../use-alternative-secrets-backend` +an alternative secrets backend to retrieve connections. For more details see :doc:`../secrets-backend/index` Connection URI format --------------------- diff --git a/docs/howto/index.rst b/docs/howto/index.rst index 837f0f3..a47dd20 100644 --- a/docs/howto/index.rst +++ b/docs/howto/index.rst @@ -46,4 +46,4 @@ configuring an Airflow environment. 
define_extra_link tracking-user-activity email-config - use-alternative-secrets-backend + secrets-backend/index diff --git a/docs/howto/secrets-backend/aws-secrets-manaager-backend.rst b/docs/howto/secrets-backend/aws-secrets-manaager-backend.rst new file mode 100644 index 0000000..d52c5e4 --- /dev/null +++ b/docs/howto/secrets-backend/aws-secrets-manaager-backend.rst @@ -0,0 +1,73 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +AWS Secrets Manager Backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To enable Secrets Manager, specify :py:class:`~airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend` +as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. + +Here is a sample configuration: + +.. code-block:: ini + + [secrets] + backend = airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend + backend_kwargs = {"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "profile_name": "default"} + +To authenticate you can either supply a profile name to reference aws profile, e.g. defined in ``~/.aws/config`` or set +environment variables like ``AWS_ACCESS_KEY_ID``, ``AWS_SECRET_ACCESS_KEY``. 
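As a rough illustration of the sample configuration above, the ``backend_kwargs`` value is a JSON object written on a single line. The Python sketch below builds such a line with the standard library so that the quoting stays valid. The helper name ``render_backend_kwargs`` is hypothetical and is not part of Airflow; the keys are the ones shown in the sample configuration.

```python
import json


def render_backend_kwargs(**kwargs):
    """Return a single-line JSON object suitable for the backend_kwargs option.

    Hypothetical helper for illustration only; it is not part of Airflow.
    """
    return json.dumps(kwargs, sort_keys=True)


# Keys taken from the sample [secrets] configuration for Secrets Manager.
line = "backend_kwargs = " + render_backend_kwargs(
    connections_prefix="airflow/connections",
    variables_prefix="airflow/variables",
    profile_name="default",
)
print(line)
```

Generating the value this way is only a convenience; a hand-written JSON object in ``airflow.cfg`` works the same as long as it is valid JSON.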
+ + +Storing and Retrieving Connections +"""""""""""""""""""""""""""""""""" + +If you have set ``connections_prefix`` as ``airflow/connections``, then for a connection id of ``smtp_default``, +you would want to store your connection at ``airflow/connections/smtp_default``. + +Example: + +.. code-block:: bash + + aws secretsmanager put-secret-value \ + --secret-id airflow/connections/smtp_default \ + --secret-string "smtps://user:h...@relay.example.com:465" + +Verify that you can get the secret: + +.. code-block:: console + + ❯ aws secretsmanager get-secret-value --secret-id airflow/connections/smtp_default + { + "ARN": "arn:aws:secretsmanager:us-east-2:314524341751:secret:airflow/connections/smtp_default-7meuul", + "Name": "airflow/connections/smtp_default", + "VersionId": "34f90eff-ea21-455a-9c8f-5ee74b21be672", + "SecretString": "smtps://user:h...@relay.example.com:465", + "VersionStages": [ + "AWSCURRENT" + ], + "CreatedDate": "2020-04-08T02:10:35.132000+01:00" + } + +The value of the secret must be the :ref:`connection URI representation <generating_connection_uri>` +of the connection object. + +Storing and Retrieving Variables +"""""""""""""""""""""""""""""""" + +If you have set ``variables_prefix`` as ``airflow/variables``, then for an Variable key of ``hello``, +you would want to store your Variable at ``airflow/variables/hello``. diff --git a/docs/howto/secrets-backend/aws-ssm-parameter-store-secrets-backend.rst b/docs/howto/secrets-backend/aws-ssm-parameter-store-secrets-backend.rst new file mode 100644 index 0000000..4d99800 --- /dev/null +++ b/docs/howto/secrets-backend/aws-ssm-parameter-store-secrets-backend.rst @@ -0,0 +1,51 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. 
The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +.. _ssm_parameter_store_secrets: + +AWS SSM Parameter Store Secrets Backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To enable SSM parameter store, specify :py:class:`~airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend` +as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. + +Here is a sample configuration: + +.. code-block:: ini + + [secrets] + backend = airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend + backend_kwargs = {"connections_prefix": "/airflow/connections", "variables_prefix": "/airflow/variables", "profile_name": "default"} + +Storing and Retrieving Connections +"""""""""""""""""""""""""""""""""" + +If you have set ``connections_prefix`` as ``/airflow/connections``, then for a connection id of ``smtp_default``, +you would want to store your connection at ``/airflow/connections/smtp_default``. + +Optionally you can supply a profile name to reference aws profile, e.g. defined in ``~/.aws/config``. + +The value of the SSM parameter must be the :ref:`connection URI representation <generating_connection_uri>` +of the connection object. + +Storing and Retrieving Variables +"""""""""""""""""""""""""""""""" + +If you have set ``variables_prefix`` as ``/airflow/variables``, then for an Variable key of ``hello``, +you would want to store your Variable at ``/airflow/variables/hello``. 
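The naming rule described above can be sketched in a few lines of Python. This is a minimal sketch that assumes the parameter name is simply the configured prefix joined to the connection ID or Variable key with ``/``, as the examples in this section suggest; the function ``build_parameter_name`` is hypothetical and is not Airflow's API.

```python
def build_parameter_name(prefix, key, sep="/"):
    """Join a configured prefix and a connection ID or Variable key.

    Hypothetical helper; it mirrors the naming shown in the text above.
    """
    return f"{prefix}{sep}{key}"


connection_name = build_parameter_name("/airflow/connections", "smtp_default")
variable_name = build_parameter_name("/airflow/variables", "hello")

print(connection_name)  # /airflow/connections/smtp_default
print(variable_name)    # /airflow/variables/hello
```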
+ +Optionally you can supply a profile name to reference aws profile, e.g. defined in ``~/.aws/config``. diff --git a/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst b/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst new file mode 100644 index 0000000..b7e5815 --- /dev/null +++ b/docs/howto/secrets-backend/google-cloud-secret-manager-backend.rst @@ -0,0 +1,134 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +.. _google_cloud_secret_manager_backend: + +Google Cloud Secret Manager Backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This topic describes how to configure Airflow to use `Secret Manager <https://cloud.google.com/secret-manager/docs>`__ as +a secret backend and how to manage secrets. + +Before you begin +"""""""""""""""" + +`Configure Secret Manager and your local environment <https://cloud.google.com/secret-manager/docs/configuring-secret-manager>`__, once per project. + +Enabling the secret backend +""""""""""""""""""""""""""" + +To enable the secret backend for Google Cloud Secrets Manager to retrieve connection/variables, +specify :py:class:`~airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend` +as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. 
+ +Here is a sample configuration if you want to use it: + +.. code-block:: ini + + [secrets] + backend = airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend + +You can also set this with environment variables. + +.. code-block:: bash + + export AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend + +You can verify the correct setting of the configuration options with the ``airflow config get-value`` command. + +.. code-block:: bash + + $ airflow config get-value secrets backend + airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend + +Backend parameters +"""""""""""""""""" + +The next step is to configure backend parameters using the ``backend_kwargs`` option. You can pass +the following parameters: + +* ``connections_prefix``: Specifies the prefix of the secret to read to get Connections. Default: ``"airflow-connections"`` +* ``variables_prefix``: Specifies the prefix of the secret to read to get Variables. Default: ``"airflow-variables"`` +* ``gcp_key_path``: Path to GCP Credential JSON file. +* ``gcp_keyfile_dict``: Dictionary of keyfile parameters. +* ``gcp_scopes``: Comma-separated string containing GCP scopes. +* ``sep``: Separator used to concatenate connections_prefix and conn_id. Default: "-" +* ``project_id``: Project ID to read the secrets from. If not passed, the project ID from credentials will be used. + +All options should be passed as a JSON dictionary. + +For example, if you want to set parameter ``connections_prefix`` to ``"airflow-tenant-primary"`` and parameter ``variables_prefix`` to ``"airflow-tenant-primary"``, your configuration file should look like this: + +.. 
code-block:: ini + + [secrets] + backend = airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend + backend_kwargs = {"connections_prefix": "airflow-tenant-primary", "variables_prefix": "airflow-tenant-primary"} + +Set-up credentials +"""""""""""""""""" + +You can configure the credentials in three ways: + +* By default, Application Default Credentials (ADC) are used to obtain credentials. +* ``gcp_key_path`` option in ``backend_kwargs`` option - allows you to configure authorizations with a service account stored in local file. +* ``gcp_keyfile_dict`` option in ``backend_kwargs`` option - allows you to configure authorizations with a service account stored in Airflow configuration. + +.. note:: + + For more information about the Application Default Credentials (ADC), see: + + * `google.auth.default <https://google-auth.readthedocs.io/en/latest/reference/google.auth.html#google.auth.default>`__ + * `Setting Up Authentication for Server to Server Production Applications <https://cloud.google.com/docs/authentication/production>`__ + +Managing secrets +"""""""""""""""" + +If you want to configure a connection, you need to save it as a :ref:`connection URI representation <generating_connection_uri>`. +Variables should be saved as plain text. + +In order to manage secrets, you can use the ``gcloud`` tool or other supported tools. For more information, take a look at: +`Managing secrets <https://cloud.google.com/secret-manager/docs/creating-and-accessing-secrets>`__ in Google Cloud Documentation. + +The name of the secret must fit the following formats: + + * for variable: ``[variables_prefix][sep][variable_name]`` + * for connection: ``[connections_prefix][sep][connection_name]`` + +where: + + * ``connections_prefix`` - fixed value defined in the ``connections_prefix`` parameter in backend configuration. Default: ``airflow-connections``. + * ``variables_prefix`` - fixed value defined in the ``variables_prefix`` parameter in backend configuration. 
Default: ``airflow-variables``. + * ``sep`` - fixed value defined in the ``sep`` parameter in backend configuration. Default: ``-``. + +The Cloud Secrets Manager secret name should follow the pattern ``[a-zA-Z0-9-_]``. + +If you have the default backend configuration and you want to create a connection with ``conn_id`` +equal to ``first-connection``, you should create a secret named ``airflow-connections-first-connection``. +You can do it with the gcloud tools as in the example below. + +.. code-block:: bash + + echo "mysql://example.org" | gcloud beta secrets create airflow-connections-first-connection --data-file=- + +If you have the default backend configuration and you want to create a variable named ``first-variable``, +you should create a secret named ``airflow-variables-first-variable``. You can do it with the gcloud +command as in the example below. + +.. code-block:: bash + + echo "content" | gcloud beta secrets create airflow-variables-first-variable --data-file=- diff --git a/docs/howto/secrets-backend/hashicorp-vault-secrets-backend.rst b/docs/howto/secrets-backend/hashicorp-vault-secrets-backend.rst new file mode 100644 index 0000000..1b25060 --- /dev/null +++ b/docs/howto/secrets-backend/hashicorp-vault-secrets-backend.rst @@ -0,0 +1,117 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. 
See the License for the + specific language governing permissions and limitations + under the License. + +.. _hashicorp_vault_secrets: + +Hashicorp Vault Secrets Backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To enable Hashicorp Vault to retrieve Airflow connections and variables, specify :py:class:`~airflow.providers.hashicorp.secrets.vault.VaultBackend` +as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. + +Here is a sample configuration: + +.. code-block:: ini + + [secrets] + backend = airflow.providers.hashicorp.secrets.vault.VaultBackend + backend_kwargs = {"connections_path": "connections", "variables_path": "variables", "mount_point": "airflow", "url": "http://127.0.0.1:8200"} + +The default KV engine version is ``2``; pass ``kv_engine_version: 1`` in ``backend_kwargs`` if you use +KV Secrets Engine Version ``1``. + +You can also set and pass values to Vault client by setting environment variables. All the +environment variables listed at https://www.vaultproject.io/docs/commands/#environment-variables are supported. + +Hence, if you set ``VAULT_ADDR`` environment variable like below, you do not need to pass ``url`` +key to ``backend_kwargs``: + +.. code-block:: bash + + export VAULT_ADDR="http://127.0.0.1:8200" + + +Storing and Retrieving Connections +"""""""""""""""""""""""""""""""""" + +If you have set ``connections_path`` as ``connections`` and ``mount_point`` as ``airflow``, then for a connection id of +``smtp_default``, you would want to store your secret as: + +.. code-block:: bash + + vault kv put airflow/connections/smtp_default conn_uri=smtps://user:h...@relay.example.com:465 + +Note that the ``Key`` is ``conn_uri``, ``Value`` is ``smtps://user:h...@relay.example.com:465`` and +``mount_point`` is ``airflow``. + +You can make a ``mount_point`` for ``airflow`` as follows: + +.. code-block:: bash + + vault secrets enable -path=airflow -version=2 kv + +Verify that you can get the secret from ``vault``: + +.. 
code-block:: console + + ❯ vault kv get airflow/connections/smtp_default + ====== Metadata ====== + Key Value + --- ----- + created_time 2020-03-19T19:17:51.281721Z + deletion_time n/a + destroyed false + version 1 + + ====== Data ====== + Key Value + --- ----- + conn_uri smtps://user:h...@relay.example.com:465 + +The value of the Vault key must be the :ref:`connection URI representation <generating_connection_uri>` +of the connection object to get connection. + +Storing and Retrieving Variables +"""""""""""""""""""""""""""""""" + +If you have set ``variables_path`` as ``variables`` and ``mount_point`` as ``airflow``, then for a variable with +``hello`` as key, you would want to store your secret as: + +.. code-block:: bash + + vault kv put airflow/variables/hello value=world + +Verify that you can get the secret from ``vault``: + +.. code-block:: console + + ❯ vault kv get airflow/variables/hello + ====== Metadata ====== + Key Value + --- ----- + created_time 2020-03-28T02:10:54.301784Z + deletion_time n/a + destroyed false + version 1 + + ==== Data ==== + Key Value + --- ----- + value world + +Note that the secret ``Key`` is ``value``, and secret ``Value`` is ``world`` and +``mount_point`` is ``airflow``. diff --git a/docs/howto/secrets-backend/index.rst b/docs/howto/secrets-backend/index.rst new file mode 100644 index 0000000..9c50218 --- /dev/null +++ b/docs/howto/secrets-backend/index.rst @@ -0,0 +1,86 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. 
Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + + +Secrets backend +--------------- + +.. versionadded:: 1.10.10 + +In addition to retrieving connections & variables from environment variables or the metastore database, you can enable +an alternative secrets backend to retrieve Airflow connections or Airflow variables, +such as :ref:`Google Cloud Secret Manager <google_cloud_secret_manager_backend>`, +:ref:`Hashicorp Vault Secrets<hashicorp_vault_secrets>` or you can :ref:`roll your own <roll_your_own_secrets_backend>`. + +.. note:: + + The Airflow UI only shows connections and variables stored in the Metadata DB and not via any other method. + If you use an alternative secrets backend, check inside your backend to view the values of your variables and connections. + +Search path +^^^^^^^^^^^ +When looking up a connection/variable, by default Airflow will search environment variables first and metastore +database second. + +If you enable an alternative secrets backend, it will be searched first, followed by environment variables, +then metastore. This search ordering is not configurable. + +.. _secrets_backend_configuration: + +Configuration +^^^^^^^^^^^^^ + +The ``[secrets]`` section has the following options: + +.. code-block:: ini + + [secrets] + backend = + backend_kwargs = + +Set ``backend`` to the fully qualified class name of the backend you want to enable. + +You can provide ``backend_kwargs`` as JSON and it will be passed as kwargs to the ``__init__`` method of +your secrets backend. + +Supported backends +^^^^^^^^^^^^^^^^^^ + +.. toctree:: + :maxdepth: 1 + :glob: + + * + +.. 
_roll_your_own_secrets_backend: + +Roll your own secrets backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A secrets backend is a subclass of :py:class:`airflow.secrets.BaseSecretsBackend` and must implement either +:py:meth:`~airflow.secrets.BaseSecretsBackend.get_connections` or :py:meth:`~airflow.secrets.BaseSecretsBackend.get_conn_uri`. + +After writing your backend class, provide the fully qualified class name in the ``backend`` key in the ``[secrets]`` +section of ``airflow.cfg``. + +Additional arguments to your SecretsBackend can be configured in ``airflow.cfg`` by supplying a JSON string to ``backend_kwargs``, which will be passed to the ``__init__`` of your SecretsBackend. +See :ref:`Configuration <secrets_backend_configuration>` for more details, and :ref:`SSM Parameter Store <ssm_parameter_store_secrets>` for an example. + +.. note:: + + If you are rolling your own secrets backend, you don't strictly need to use airflow's URI format. But + doing so makes it easier to switch between environment variables, the metastore, and your secrets backend. diff --git a/docs/howto/secrets-backend/local-filesystem-secrets-backend.rst b/docs/howto/secrets-backend/local-filesystem-secrets-backend.rst new file mode 100644 index 0000000..463a06b --- /dev/null +++ b/docs/howto/secrets-backend/local-filesystem-secrets-backend.rst @@ -0,0 +1,145 @@ + .. Licensed to the Apache Software Foundation (ASF) under one + or more contributor license agreements. See the NOTICE file + distributed with this work for additional information + regarding copyright ownership. The ASF licenses this file + to you under the Apache License, Version 2.0 (the + "License"); you may not use this file except in compliance + with the License. You may obtain a copy of the License at + + .. http://www.apache.org/licenses/LICENSE-2.0 + + .. 
Unless required by applicable law or agreed to in writing, + software distributed under the License is distributed on an + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + KIND, either express or implied. See the License for the + specific language governing permissions and limitations + under the License. + +.. _local_filesystem_secrets: + +Local Filesystem Secrets Backend +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This backend is especially useful in the following use cases: + +* **Development**: It ensures data synchronization between all terminal windows (same as databases), + and at the same time the values are retained after database restart (same as environment variable) +* **Kubernetes**: It allows you to store secrets in `Kubernetes Secrets <https://kubernetes.io/docs/concepts/configuration/secret/>`__ + or you can synchronize values using the sidecar container and + `a shared volume <https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/>`__ + +To use variable and connection from local file, specify :py:class:`~airflow.secrets.local_filesystem.LocalFilesystemBackend` +as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. + +Available parameters to ``backend_kwargs``: + +* ``variables_file_path``: File location with variables data. +* ``connections_file_path``: File location with connections data. + +Here is a sample configuration: + +.. code-block:: ini + + [secrets] + backend = airflow.secrets.local_filesystem.LocalFilesystemBackend + backend_kwargs = {"variables_file_path": "/files/var.json", "connections_file_path": "/files/conn.json"} + +``JSON``, ``YAML`` and ``.env`` files are supported. All parameters are optional. If the file path is not passed, +the backend returns an empty collection. 
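To make the configuration above more concrete, the sketch below writes a small variables file in JSON and reads one key back with the Python standard library. It does not call Airflow and is not how the backend is implemented; the function ``read_variable`` and the temporary file location are assumptions for illustration, and the file layout (a JSON object that maps variable keys to values) follows the format this backend documents for variables.

```python
import json
import os
import tempfile


def read_variable(path, key):
    """Return the value stored under key in a JSON variables file, or None.

    Hypothetical helper for illustration; it is not part of Airflow.
    """
    with open(path) as handle:
        data = json.load(handle)
    return data.get(key)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for the file referenced by variables_file_path.
    var_path = os.path.join(tmp_dir, "var.json")
    with open(var_path, "w") as handle:
        json.dump({"VAR_A": "some_value"}, handle)

    found = read_variable(var_path, "VAR_A")
    missing = read_variable(var_path, "VAR_MISSING")

print(found)    # some_value
print(missing)  # None
```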
+ +Storing and Retrieving Connections +"""""""""""""""""""""""""""""""""" + +If you have set ``connections_file_path`` as ``/files/my_conn.json``, then the backend will read the +file ``/files/my_conn.json`` when it looks for connections. + +The file can be defined in ``JSON``, ``YAML`` or ``env`` format. Depending on the format, the data should be saved as a URL or as a connection object. +Any extra json parameters can be provided using keys like ``extra_dejson`` and ``extra``. +The key ``extra_dejson`` can be used to provide parameters as JSON object where as the key ``extra`` can be used in case of a JSON string. +The keys ``extra`` and ``extra_dejson`` are mutually exclusive. + +The JSON file must contain an object where the key contains the connection ID and the value contains +the definition of one connection. The connection can be defined as a URI (string) or JSON object. +For a guide about defining a connection as a URI, see:: :ref:`generating_connection_uri`. +For a description of the connection object parameters see :class:`~airflow.models.connection.Connection`. +The following is a sample JSON file. + +.. code-block:: json + + { + "CONN_A": "mysq://host_a", + "CONN_B": { + "conn_type": "scheme", + "host": "host", + "schema": "lschema", + "login": "Login", + "password": "None", + "port": "1234" + } + } + +The YAML file structure is similar to that of a JSON. The key-value pair of connection ID and the definitions of one or more connections. +In this format, the connection can be defined as a URI (string) or JSON object. + +.. code-block:: yaml + + CONN_A: 'mysq://host_a' + + CONN_B: + - 'mysq://host_a' + - 'mysq://host_b' + + CONN_C: + conn_type: scheme + host: host + schema: lschema + login: Login + password: None + port: 1234 + extra_dejson: + a: b + nestedblock_dict: + x: y + +You can also define connections using a ``.env`` file. Then the key is the connection ID, and +the value should describe the connection using the URI. 
Connection ID should not be repeated, it will +raise an exception. The following is a sample file. + + .. code-block:: text + + mysql_conn_id=mysql://log:password@13.1.21.1:3306/mysqldbrd + google_custom_key=google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json + +Storing and Retrieving Variables +"""""""""""""""""""""""""""""""" + +If you have set ``variables_file_path`` as ``/files/my_var.json``, then the backend will read the +file ``/files/my_var.json`` when it looks for variables. + +The file can be defined in ``JSON``, ``YAML`` or ``env`` format. + +The JSON file must contain an object where the key contains the variable key and the value contains +the variable value. The following is a sample JSON file. + + .. code-block:: json + + { + "VAR_A": "some_value", + "var_b": "differnet_value" + } + +The YAML file structure is similar to that of JSON, with key containing the variable key and the value containing +the variable value. The following is a sample YAML file. + + .. code-block:: yaml + + VAR_A: some_value + VAR_B: different_value + +You can also define variable using a ``.env`` file. Then the key is the variable key, and variable should +describe the variable value. The following is a sample file. + + .. code-block:: text + + VAR_A=some_value + var_B=different_value diff --git a/docs/howto/use-alternative-secrets-backend.rst b/docs/howto/use-alternative-secrets-backend.rst deleted file mode 100644 index 7979f47..0000000 --- a/docs/howto/use-alternative-secrets-backend.rst +++ /dev/null @@ -1,519 +0,0 @@ - .. Licensed to the Apache Software Foundation (ASF) under one - or more contributor license agreements. See the NOTICE file - distributed with this work for additional information - regarding copyright ownership. The ASF licenses this file - to you under the Apache License, Version 2.0 (the - "License"); you may not use this file except in compliance - with the License. You may obtain a copy of the License at - - .. 
http://www.apache.org/licenses/LICENSE-2.0 - - .. Unless required by applicable law or agreed to in writing, - software distributed under the License is distributed on an - "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY - KIND, either express or implied. See the License for the - specific language governing permissions and limitations - under the License. - - -Alternative secrets backend ---------------------------- - -.. versionadded:: 1.10.10 - -In addition to retrieving connections & variables from environment variables or the metastore database, you can enable -an alternative secrets backend to retrieve Airflow connections or Airflow variables, -such as :ref:`AWS SSM Parameter Store <ssm_parameter_store_secrets>`, -:ref:`Hashicorp Vault Secrets<hashicorp_vault_secrets>` or you can :ref:`roll your own <roll_your_own_secrets_backend>`. - -.. note:: - - The Airflow UI only shows connections and variables stored in the Metadata DB and not via any other method. - If you use an alternative secrets backend, check inside your backend to view the values of your variables and connections. - -Search path -^^^^^^^^^^^ -When looking up a connection/variable, by default Airflow will search environment variables first and metastore -database second. - -If you enable an alternative secrets backend, it will be searched first, followed by environment variables, -then metastore. This search ordering is not configurable. - -.. _secrets_backend_configuration: - -Configuration -^^^^^^^^^^^^^ - -The ``[secrets]`` section has the following options: - -.. code-block:: ini - - [secrets] - backend = - backend_kwargs = - -Set ``backend`` to the fully qualified class name of the backend you want to enable. - -You can provide ``backend_kwargs`` with json and it will be passed as kwargs to the ``__init__`` method of -your secrets backend. - -.. 
_local_filesystem_secrets: - -Local Filesystem Secrets Backend -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This backend is especially useful in the following use cases: - -* **Development**: It ensures data synchronization between all terminal windows (same as databases), - and at the same time the values are retained after database restart (same as environment variable) -* **Kubernetes**: It allows you to store secrets in `Kubernetes Secrets <https://kubernetes.io/docs/concepts/configuration/secret/>`__ - or you can synchronize values using the sidecar container and - `a shared volume <https://kubernetes.io/docs/tasks/access-application-cluster/communicate-containers-same-pod-shared-volume/>`__ - -To use variable and connection from local file, specify :py:class:`~airflow.secrets.local_filesystem.LocalFilesystemBackend` -as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. - -Available parameters to ``backend_kwargs``: - -* ``variables_file_path``: File location with variables data. -* ``connections_file_path``: File location with connections data. - -Here is a sample configuration: - -.. code-block:: ini - - [secrets] - backend = airflow.secrets.local_filesystem.LocalFilesystemBackend - backend_kwargs = {"variables_file_path": "/files/var.json", "connections_file_path": "/files/conn.json"} - -``JSON``, ``YAML`` and ``.env`` files are supported. All parameters are optional. If the file path is not passed, -the backend returns an empty collection. - -Storing and Retrieving Connections -"""""""""""""""""""""""""""""""""" - -If you have set ``connections_file_path`` as ``/files/my_conn.json``, then the backend will read the -file ``/files/my_conn.json`` when it looks for connections. - -The file can be defined in ``JSON``, ``YAML`` or ``env`` format. Depending on the format, the data should be saved as a URL or as a connection object. -Any extra json parameters can be provided using keys like ``extra_dejson`` and ``extra``. 
-The key ``extra_dejson`` can be used to provide parameters as a JSON object, whereas the key ``extra`` can be used in case of a JSON string. -The keys ``extra`` and ``extra_dejson`` are mutually exclusive. - -The JSON file must contain an object where the key contains the connection ID and the value contains -the definition of one connection. The connection can be defined as a URI (string) or JSON object. -For a guide about defining a connection as a URI, see :ref:`generating_connection_uri`. -For a description of the connection object parameters see :class:`~airflow.models.connection.Connection`. -The following is a sample JSON file. - -.. code-block:: json - - { - "CONN_A": "mysql://host_a", - "CONN_B": { - "conn_type": "scheme", - "host": "host", - "schema": "lschema", - "login": "Login", - "password": "None", - "port": "1234" - } - } - -The YAML file structure is similar to that of JSON: each key is a connection ID and each value is the definition of one or more connections. -In this format, the connection can be defined as a URI (string) or JSON object. - -.. code-block:: yaml - - CONN_A: 'mysql://host_a' - - CONN_B: - - 'mysql://host_a' - - 'mysql://host_b' - - CONN_C: - conn_type: scheme - host: host - schema: lschema - login: Login - password: None - port: 1234 - extra_dejson: - a: b - nestedblock_dict: - x: y - -You can also define connections using a ``.env`` file. Then the key is the connection ID, and -the value should describe the connection using the URI. A connection ID must not be repeated; a duplicate -raises an exception. The following is a sample file. - - ..
code-block:: text - - mysql_conn_id=mysql://log:password@13.1.21.1:3306/mysqldbrd - google_custom_key=google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json - -Storing and Retrieving Variables -"""""""""""""""""""""""""""""""" - -If you have set ``variables_file_path`` as ``/files/my_var.json``, then the backend will read the -file ``/files/my_var.json`` when it looks for variables. - -The file can be defined in ``JSON``, ``YAML`` or ``env`` format. - -The JSON file must contain an object where the key contains the variable key and the value contains -the variable value. The following is a sample JSON file. - - .. code-block:: json - - { - "VAR_A": "some_value", - "var_b": "different_value" - } - -The YAML file structure is similar to that of JSON, with the key containing the variable key and the value containing -the variable value. The following is a sample YAML file. - - .. code-block:: yaml - - VAR_A: some_value - VAR_B: different_value - -You can also define variables using a ``.env`` file. Then the key is the variable key, and the value is -the variable value. The following is a sample file. - - .. code-block:: text - - VAR_A=some_value - var_B=different_value - -.. _ssm_parameter_store_secrets: - -AWS SSM Parameter Store Secrets Backend -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To enable SSM parameter store, specify :py:class:`~airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend` -as the ``backend`` in the ``[secrets]`` section of ``airflow.cfg``. - -Here is a sample configuration: - -..
code-block:: ini - - [secrets] - backend = airflow.providers.amazon.aws.secrets.systems_manager.SystemsManagerParameterStoreBackend - backend_kwargs = {"connections_prefix": "/airflow/connections", "variables_prefix": "/airflow/variables", "profile_name": "default"} - -Storing and Retrieving Connections -"""""""""""""""""""""""""""""""""" - -If you have set ``connections_prefix`` as ``/airflow/connections``, then for a connection id of ``smtp_default``, -you would want to store your connection at ``/airflow/connections/smtp_default``. - -Optionally you can supply a profile name to reference an AWS profile, e.g. one defined in ``~/.aws/config``. - -The value of the SSM parameter must be the :ref:`connection URI representation <generating_connection_uri>` -of the connection object. - -Storing and Retrieving Variables -"""""""""""""""""""""""""""""""" - -If you have set ``variables_prefix`` as ``/airflow/variables``, then for a Variable key of ``hello``, -you would want to store your Variable at ``/airflow/variables/hello``. - -Optionally you can supply a profile name to reference an AWS profile, e.g. one defined in ``~/.aws/config``. - -AWS Secrets Manager Backend -^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To enable Secrets Manager, specify :py:class:`~airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend` -as the ``backend`` in the ``[secrets]`` section of ``airflow.cfg``. - -Here is a sample configuration: - -.. code-block:: ini - - [secrets] - backend = airflow.providers.amazon.aws.secrets.secrets_manager.SecretsManagerBackend - backend_kwargs = {"connections_prefix": "airflow/connections", "variables_prefix": "airflow/variables", "profile_name": "default"} - -To authenticate, you can either supply a profile name to reference an AWS profile, e.g. one defined in ``~/.aws/config``, or set -environment variables like ``AWS_ACCESS_KEY_ID``, ``AWS_SECRET_ACCESS_KEY``.
- - -Storing and Retrieving Connections -"""""""""""""""""""""""""""""""""" - -If you have set ``connections_prefix`` as ``airflow/connections``, then for a connection id of ``smtp_default``, -you would want to store your connection at ``airflow/connections/smtp_default``. - -Example: - -.. code-block:: bash - - aws secretsmanager put-secret-value \ - --secret-id airflow/connections/smtp_default \ - --secret-string "smtps://user:h...@relay.example.com:465" - -Verify that you can get the secret: - -.. code-block:: console - - ❯ aws secretsmanager get-secret-value --secret-id airflow/connections/smtp_default - { - "ARN": "arn:aws:secretsmanager:us-east-2:314524341751:secret:airflow/connections/smtp_default-7meuul", - "Name": "airflow/connections/smtp_default", - "VersionId": "34f90eff-ea21-455a-9c8f-5ee74b21be672", - "SecretString": "smtps://user:h...@relay.example.com:465", - "VersionStages": [ - "AWSCURRENT" - ], - "CreatedDate": "2020-04-08T02:10:35.132000+01:00" - } - -The value of the secret must be the :ref:`connection URI representation <generating_connection_uri>` -of the connection object. - -Storing and Retrieving Variables -"""""""""""""""""""""""""""""""" - -If you have set ``variables_prefix`` as ``airflow/variables``, then for a Variable key of ``hello``, -you would want to store your Variable at ``airflow/variables/hello``. - - -.. _hashicorp_vault_secrets: - -Hashicorp Vault Secrets Backend -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -To enable Hashicorp Vault to retrieve Airflow connections and variables, specify :py:class:`~airflow.providers.hashicorp.secrets.vault.VaultBackend` -as the ``backend`` in the ``[secrets]`` section of ``airflow.cfg``. - -Here is a sample configuration: - -..
code-block:: ini - - [secrets] - backend = airflow.providers.hashicorp.secrets.vault.VaultBackend - backend_kwargs = {"connections_path": "connections", "variables_path": "variables", "mount_point": "airflow", "url": "http://127.0.0.1:8200"} - -The default KV engine version is ``2``; pass ``kv_engine_version: 1`` in ``backend_kwargs`` if you use -KV Secrets Engine Version ``1``. - -You can also pass values to the Vault client by setting environment variables. All the -environment variables listed at https://www.vaultproject.io/docs/commands/#environment-variables are supported. - -Hence, if you set the ``VAULT_ADDR`` environment variable as shown below, you do not need to pass the ``url`` -key to ``backend_kwargs``: - -.. code-block:: bash - - export VAULT_ADDR="http://127.0.0.1:8200" - - -Storing and Retrieving Connections -"""""""""""""""""""""""""""""""""" - -If you have set ``connections_path`` as ``connections`` and ``mount_point`` as ``airflow``, then for a connection id of -``smtp_default``, you would want to store your secret as: - -.. code-block:: bash - - vault kv put airflow/connections/smtp_default conn_uri=smtps://user:h...@relay.example.com:465 - -Note that the ``Key`` is ``conn_uri``, the ``Value`` is the connection URI (``smtps://user:h...@relay.example.com:465`` in this example) and -``mount_point`` is ``airflow``. - -You can make a ``mount_point`` for ``airflow`` as follows: - -.. code-block:: bash - - vault secrets enable -path=airflow -version=2 kv - -Verify that you can get the secret from ``vault``: - -.. code-block:: console - - ❯ vault kv get airflow/connections/smtp_default - ====== Metadata ====== - Key Value - --- ----- - created_time 2020-03-19T19:17:51.281721Z - deletion_time n/a - destroyed false - version 1 - - ====== Data ====== - Key Value - --- ----- - conn_uri smtps://user:h...@relay.example.com:465 - -The value of the Vault key must be the :ref:`connection URI representation <generating_connection_uri>` -of the connection object.
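Several backends above expect the stored value to be a connection URI. The following is a minimal sketch of how such a string could be assembled with the Python standard library, percent-encoding the parts that may contain reserved characters. The helper name ``build_conn_uri`` and its parameters are hypothetical and exist only for this illustration; it is not Airflow's own implementation, and the authoritative format is the one described in :ref:`generating_connection_uri`.

```python
from typing import Dict, Optional
from urllib.parse import quote, urlencode


def build_conn_uri(
    conn_type: str,
    host: str,
    login: Optional[str] = None,
    password: Optional[str] = None,
    port: Optional[int] = None,
    schema: Optional[str] = None,
    extras: Optional[Dict[str, str]] = None,
) -> str:
    """Assemble a connection URI (hypothetical helper, not an Airflow API)."""
    uri = f"{conn_type}://"
    if login is not None:
        # Percent-encode credentials so characters such as "@" or "/" stay unambiguous.
        uri += quote(login, safe="")
        if password is not None:
            uri += ":" + quote(password, safe="")
        uri += "@"
    uri += quote(host, safe="")
    if port is not None:
        uri += f":{port}"
    if schema is not None:
        uri += "/" + quote(schema, safe="")
    if extras:
        # Extra parameters are carried as a URL-encoded query string.
        uri += "?" + urlencode(extras)
    return uri


if __name__ == "__main__":
    print(build_conn_uri("postgresql", "host", login="airflow",
                         password="p@ss/word", port=5432, schema="airflow"))
    # prints: postgresql://airflow:p%40ss%2Fword@host:5432/airflow
```

The resulting string is what you would pass as the secret value, for example as ``conn_uri`` in the ``vault kv put`` command.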
- -Storing and Retrieving Variables -"""""""""""""""""""""""""""""""" - -If you have set ``variables_path`` as ``variables`` and ``mount_point`` as ``airflow``, then for a variable with -``hello`` as key, you would want to store your secret as: - -.. code-block:: bash - - vault kv put airflow/variables/hello value=world - -Verify that you can get the secret from ``vault``: - -.. code-block:: console - - ❯ vault kv get airflow/variables/hello - ====== Metadata ====== - Key Value - --- ----- - created_time 2020-03-28T02:10:54.301784Z - deletion_time n/a - destroyed false - version 1 - - ==== Data ==== - Key Value - --- ----- - value world - -Note that the secret ``Key`` is ``value``, and secret ``Value`` is ``world`` and -``mount_point`` is ``airflow``. - - -.. _secret_manager_backend: - -Google Cloud Secret Manager Backend -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -This topic describes how to configure Airflow to use `Secret Manager <https://cloud.google.com/secret-manager/docs>`__ as -a secret backend and how to manage secrets. - -Before you begin -"""""""""""""""" - -`Configure Secret Manager and your local environment <https://cloud.google.com/secret-manager/docs/configuring-secret-manager>`__, once per project. - -Enabling the secret backend -""""""""""""""""""""""""""" - -To enable the secret backend for Google Cloud Secrets Manager to retrieve connection/variables, -specify :py:class:`~airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend` -as the ``backend`` in ``[secrets]`` section of ``airflow.cfg``. - -Here is a sample configuration if you want to use it: - -.. code-block:: ini - - [secrets] - backend = airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend - -You can also set this with environment variables. - -.. 
code-block:: bash - - export AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend - -You can verify the correct setting of the configuration options with the ``airflow config get-value`` command. - -.. code-block:: bash - - $ airflow config get-value secrets backend - airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend - -Backend parameters -"""""""""""""""""" - -The next step is to configure backend parameters using the ``backend_kwargs`` options. You can pass -the following parameters: - -* ``connections_prefix``: Specifies the prefix of the secret to read to get Connections. Default: ``"airflow-connections"`` -* ``variables_prefix``: Specifies the prefix of the secret to read to get Variables. Default: ``"airflow-variables"`` -* ``gcp_key_path``: Path to GCP Credential JSON file. -* ``gcp_keyfile_dict``: Dictionary of keyfile parameters. -* ``gcp_scopes``: Comma-separated string containing GCP scopes. -* ``sep``: Separator used to concatenate connections_prefix and conn_id. Default: ``"-"`` -* ``project_id``: Project ID to read the secrets from. If not passed, the project ID from credentials will be used. - -All options should be passed as a JSON dictionary. - -For example, if you want to set both ``connections_prefix`` and ``variables_prefix`` to ``"airflow-tenant-primary"``, your configuration file should look like this: - -.. code-block:: ini - - [secrets] - backend = airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend - backend_kwargs = {"connections_prefix": "airflow-tenant-primary", "variables_prefix": "airflow-tenant-primary"} - -Set-up credentials -"""""""""""""""""" - -You can configure the credentials in three ways: - -* By default, Application Default Credentials (ADC) is used to obtain credentials.
-* ``gcp_key_path`` option in ``backend_kwargs`` option - allows you to configure authorizations with a service account stored in local file. -* ``gcp_keyfile_dict`` option in ``backend_kwargs`` option - allows you to configure authorizations with a service account stored in Airflow configuration. - -.. note:: - - For more information about the Application Default Credentials (ADC), see: - - * `google.auth.default <https://google-auth.readthedocs.io/en/latest/reference/google.auth.html#google.auth.default>`__ - * `Setting Up Authentication for Server to Server Production Applications <https://cloud.google.com/docs/authentication/production>`__ - -Managing secrets -"""""""""""""""" - -If you want to configure a connection, you need to save it as a :ref:`connection URI representation <generating_connection_uri>`. -Variables should be saved as plain text. - -In order to manage secrets, you can use the ``gcloud`` tool or other supported tools. For more information, take a look at: -`Managing secrets <https://cloud.google.com/secret-manager/docs/creating-and-accessing-secrets>`__ in Google Cloud Documentation. - -The name of the secret must fit the following formats: - - * for connection: ``[connections_prefix][sep][connection_name]`` - * for variable: ``[variables_prefix][sep][variable_name]`` - -where: - - * ``connections_prefix`` - fixed value defined in the ``connections_prefix`` parameter in backend configuration. Default: ``airflow-connections``. - * ``variables_prefix`` - fixed value defined in the ``variables_prefix`` parameter in backend configuration. Default: ``airflow-variables``. - * ``sep`` - fixed value defined in the ``sep`` parameter in backend configuration. Default: ``-``. - -The Secret Manager secret name may contain only characters matching the pattern ``[a-zA-Z0-9-_]``.
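The naming scheme can be sketched in a few lines of Python: join the prefix, the separator and the connection or variable name, then check the result against the character set quoted above. The function name ``build_secret_name`` is hypothetical and only illustrates the rule; it is not part of the backend's API.

```python
import re

# Character set quoted in the text: letters, digits, hyphen and underscore.
SECRET_NAME_PATTERN = re.compile(r"[A-Za-z0-9_-]+")


def build_secret_name(prefix: str, secret_id: str, sep: str = "-") -> str:
    """Return ``<prefix><sep><secret_id>`` after validating the characters.

    Hypothetical helper for illustration only.
    """
    name = f"{prefix}{sep}{secret_id}"
    if not SECRET_NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Secret name {name!r} contains unsupported characters")
    return name


if __name__ == "__main__":
    # With the default prefixes and separator:
    print(build_secret_name("airflow-connections", "first-connection"))
    print(build_secret_name("airflow-variables", "first-variable"))
```

A connection ID containing a character outside that set, such as a dot or a slash, would be rejected by this check before any secret is created.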
- -If you have the default backend configuration and you want to create a connection with ``conn_id`` -equal to ``first-connection``, you should create a secret named ``airflow-connections-first-connection``. -You can do it with the gcloud tool as in the example below. - -.. code-block:: bash - - echo "mysql://example.org" | gcloud beta secrets create airflow-connections-first-connection --data-file=- - -If you have the default backend configuration and you want to create a variable named ``first-variable``, -you should create a secret named ``airflow-variables-first-variable``. You can do it with the gcloud -command as in the example below. - -.. code-block:: bash - - echo "content" | gcloud beta secrets create airflow-variables-first-variable --data-file=- - -.. _roll_your_own_secrets_backend: - -Roll your own secrets backend -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -A secrets backend is a subclass of :py:class:`airflow.secrets.BaseSecretsBackend` and must implement either -:py:meth:`~airflow.secrets.BaseSecretsBackend.get_connections` or :py:meth:`~airflow.secrets.BaseSecretsBackend.get_conn_uri`. - -After writing your backend class, provide the fully qualified class name in the ``backend`` key in the ``[secrets]`` -section of ``airflow.cfg``. - -Additional arguments to your SecretsBackend can be configured in ``airflow.cfg`` by supplying a JSON string to ``backend_kwargs``, which will be passed to the ``__init__`` of your SecretsBackend. -See :ref:`Configuration <secrets_backend_configuration>` for more details, and :ref:`SSM Parameter Store <ssm_parameter_store_secrets>` for an example. - -.. note:: - - If you are rolling your own secrets backend, you don't strictly need to use Airflow's URI format. But - doing so makes it easier to switch between environment variables, the metastore, and your secrets backend.
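As a rough sketch of the shape such a backend can take, the class below keeps its secrets in dictionaries received through ``backend_kwargs``. To stay runnable without Airflow installed it does not inherit from ``BaseSecretsBackend``; a real backend would subclass it as described above. The class name, the ``connections`` and ``variables`` keyword arguments, and the ``get_variable`` method are assumptions made for this illustration; only ``get_conn_uri`` is taken from the text.

```python
import json
from typing import Dict, Optional


class InMemorySecretsBackend:
    """Illustrative backend. In Airflow, subclass BaseSecretsBackend instead."""

    def __init__(
        self,
        connections: Optional[Dict[str, str]] = None,
        variables: Optional[Dict[str, str]] = None,
    ) -> None:
        # backend_kwargs is parsed from JSON and passed to __init__ as keyword arguments.
        self._connections = dict(connections or {})
        self._variables = dict(variables or {})

    def get_conn_uri(self, conn_id: str) -> Optional[str]:
        # Returning None signals "not found here" to the caller.
        return self._connections.get(conn_id)

    def get_variable(self, key: str) -> Optional[str]:
        # Assumed counterpart of get_conn_uri for variables.
        return self._variables.get(key)


if __name__ == "__main__":
    # Simulate what happens to a backend_kwargs JSON string from airflow.cfg.
    raw_kwargs = '{"connections": {"smtp_default": "smtps://relay.example.com:465"}, "variables": {"hello": "world"}}'
    backend = InMemorySecretsBackend(**json.loads(raw_kwargs))
    print(backend.get_conn_uri("smtp_default"))
    print(backend.get_variable("hello"))
```

Returning ``None`` for an unknown key, rather than raising, fits the search path described earlier, where environment variables and the metastore are consulted after the alternative backend.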
diff --git a/docs/integration.rst b/docs/integration.rst index 4e8568e..a2002c8 100644 --- a/docs/integration.rst +++ b/docs/integration.rst @@ -27,7 +27,7 @@ Airflow has a mechanism that allows you to expand its functionality and integrat * :doc:`Authentication backends </security>` * :doc:`Logging </howto/write-logs>` * :doc:`Tracking systems </howto/tracking-user-activity>` -* :doc:`Secrets backends </howto/use-alternative-secrets-backend>` +* :doc:`Secrets backends </howto/secrets-backend/index>` * :doc:`Email backends </howto/email-config>` It also has integration with :doc:`Sentry <errors>` service for error tracking. Other applications can also integrate using diff --git a/docs/redirects.txt b/docs/redirects.txt index 40eb27b..fa69abc 100644 --- a/docs/redirects.txt +++ b/docs/redirects.txt @@ -69,3 +69,4 @@ howto/operator/google/firebase/index.rst howto/operator/google/index.rst # Other redirects howto/operator/http/http.rst howto/operator/http.rst docs/howto/operator/http/index.rst howto/operator/http.rst +docs/howto/use-alternative-secrets-backend.rst howto/altenative-secrets-backends/index.rst