Miretpl commented on code in PR #62544:
URL: https://github.com/apache/airflow/pull/62544#discussion_r2875068104
##########
chart/docs/production-guide.rst:
##########
@@ -51,243 +52,260 @@ Values file
This is the simpler options, as the chart will create a Kubernetes Secret for
you. However, keep in mind your credentials will be in your values file.
.. code-block:: yaml
+ :caption: values.yaml
- data:
- metadataConnection:
- user: <username>
- pass: <password>
- protocol: postgresql
- host: <hostname>
- port: 5432
- db: <database name>
+ data:
+ metadataConnection:
+ user: <username>
+ pass: <password>
+ protocol: postgresql
+ host: <hostname>
+ port: 5432
+ db: <database name>
+.. warning::
+
+ Due to security concerns, it is not advised to store Airflow database user
credentials directly in the ``values.yaml`` file.
Kubernetes Secret
^^^^^^^^^^^^^^^^^
-You can also store the credentials in a Kubernetes Secret you create. Note that
-special characters in the username/password must be URL encoded.
+You can store the credentials in a Kubernetes Secret (it requires manual
creation).
+
+.. note::
+
+ Any special character in the username/password must be URL encoded.
.. code-block:: bash
- kubectl create secret generic mydatabase --from-literal=connection=postgresql://user:pass@host:5432/db
+ kubectl create secret generic mydatabase --from-literal=connection=postgresql://user:pass@host:5432/db
-Finally, configure the chart to use the secret you created:
+After secret creation, configure the chart to use the secret:
.. code-block:: yaml
+ :caption: values.yaml
- data:
- metadataSecretName: mydatabase
+ data:
+ metadataSecretName: mydatabase
.. _production-guide:pgbouncer:
-Metadata DB cleanup
+Metadata DB Cleanup
^^^^^^^^^^^^^^^^^^^
-It is recommended to periodically clean up the Airflow metadata database to
remove old records and keep the database size manageable. A Kubernetes CronJob
can be enabled for this purpose:
+It is recommended to periodically clean up the Airflow metadata database to
remove old records and keep the database size manageable.
+A Kubernetes CronJob can be enabled for this purpose:
.. code-block:: yaml
+ :caption: values.yaml
- databaseCleanup:
- enabled: true
- retentionDays: 90
+ databaseCleanup:
+ enabled: true
+ retentionDays: 90
-Several additional options can be configured and passed to the ``airflow db clean`` command:
-
-+-------------------------------------------------------+------------------------------------------+
-| Helm chart value | ``airflow db clean`` option |
-+=======================================================+==========================================+
-| ``.Values.databaseCleanup.skipArchive`` | ``--skip-archive`` |
-+-------------------------------------------------------+------------------------------------------+
-| ``.Values.databaseCleanup.tables`` | ``--tables`` |
-+-------------------------------------------------------+------------------------------------------+
-| ``.Values.databaseCleanup.batchSize`` | ``--batch-size`` |
-+-------------------------------------------------------+------------------------------------------+
-| ``.Values.databaseCleanup.verbose`` | ``--verbose`` |
-+-------------------------------------------------------+------------------------------------------+
-
-See :ref:`db clean usage <cli-db-clean>` for more details.
+For details regarding the ``airflow db clean`` command, see :ref:`db clean
usage <cli-db-clean>` and for additional options which
+can be configured via helm chart values, see :doc:`parameters reference
<parameters-ref>`.
PgBouncer
---------
If you are using PostgreSQL as your database, you will likely want to enable
`PgBouncer <https://www.pgbouncer.org/>`_ as well.
-Airflow can open a lot of database connections due to its distributed nature
and using a connection pooler can significantly
+Due to distributed nature of Airflow, it can open a lot of database
connections. Using a connection pooler can significantly
reduce the number of open connections on the database.
Database credentials stored Values file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: yaml
+ :caption: values.yaml
- pgbouncer:
- enabled: true
+ pgbouncer:
+ enabled: true
Database credentials stored Kubernetes Secret
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-The default connection string in this case will not work you need to modify
accordingly
+The default connection string in this case will not work. You need to modify
accordingly the Kubernetes secret:
.. code-block:: bash
- kubectl create secret generic mydatabase --from-literal=connection=postgresql://user:pass@pgbouncer_svc_name.deployment_namespace:6543/airflow-metadata
+ kubectl create secret generic mydatabase --from-literal=connection=postgresql://user:pass@pgbouncer_svc_name.deployment_namespace:6543/airflow-metadata
-Two additional Kubernetes Secret required to PgBouncer able to properly work
in this configuration:
+Furthermore, two additional Kubernetes Secret are required for PgBouncer to be
able to properly work:
-``airflow-pgbouncer-stats``
+1. ``airflow-pgbouncer-stats`` secret:
-.. code-block:: bash
+ .. code-block:: bash
- kubectl create secret generic airflow-pgbouncer-stats --from-literal=connection=postgresql://user:pass@127.0.0.1:6543/pgbouncer?sslmode=disable
+ kubectl create secret generic airflow-pgbouncer-stats --from-literal=connection=postgresql://user:pass@127.0.0.1:6543/pgbouncer?sslmode=disable
-``airflow-pgbouncer-config``
+2. ``airflow-pgbouncer-config`` secret:
-.. code-block:: yaml
+ .. code-block:: yaml
+ :caption: airflow-pgbouncer-config
- apiVersion: v1
- kind: Secret
- metadata:
- name: airflow-pgbouncer-config
- data:
- pgbouncer.ini: dmFsdWUtMg0KDQo=
- users.txt: dmFsdWUtMg0KDQo=
+ apiVersion: v1
+ kind: Secret
+ metadata:
+ name: airflow-pgbouncer-config
+ data:
+ pgbouncer.ini: dmFsdWUtMg0KDQo=
+ users.txt: dmFsdWUtMg0KDQo=
+ where:
-``pgbouncer.ini`` equal to the base64 encoded version of this text
+ 1. ``pgbouncer.ini`` value is equal to the base64 encoded version of below
text:
-.. code-block:: text
+ .. code-block:: text
+ :caption: pgbouncer.ini
- [databases]
- airflow-metadata = host={external_database_host}
dbname={external_database_dbname} port=5432 pool_size=10
+ [databases]
+ airflow-metadata = host={external_database_host}
dbname={external_database_dbname} port=5432 pool_size=10
- [pgbouncer]
- pool_mode = transaction
- listen_port = 6543
- listen_addr = *
- auth_type = scram-sha-256
- auth_file = /etc/pgbouncer/users.txt
- stats_users = postgres
- ignore_startup_parameters = extra_float_digits
- max_client_conn = 100
- verbose = 0
- log_disconnections = 0
- log_connections = 0
+ [pgbouncer]
+ pool_mode = transaction
+ listen_port = 6543
+ listen_addr = *
+ auth_type = scram-sha-256
+ auth_file = /etc/pgbouncer/users.txt
+ stats_users = postgres
+ ignore_startup_parameters = extra_float_digits
+ max_client_conn = 100
+ verbose = 0
+ log_disconnections = 0
+ log_connections = 0
- server_tls_sslmode = prefer
- server_tls_ciphers = normal
+ server_tls_sslmode = prefer
+ server_tls_ciphers = normal
-``users.txt`` equal to the base64 encoded version of this text
+ 2. ``users.txt`` value is equal to the base64 encoded version of below text:
-.. code-block:: text
+ .. code-block:: text
+ :caption: users.txt
- "{ external_database_username }" "{ external_database_pass }"
+ "{ external_database_username }" "{ external_database_pass }"
-The ``values.yaml`` should looks like this
+In the ``values.yaml`` below secret-related parameters should be adjusted like:
.. code-block:: yaml
+ :caption: values.yaml
- pgbouncer:
- enabled: true
- configSecretName: airflow-pgbouncer-config
- metricsExporterSidecar:
- statsSecretName: airflow-pgbouncer-stats
+ pgbouncer:
+ enabled: true
+ configSecretName: airflow-pgbouncer-config
+ metricsExporterSidecar:
+ statsSecretName: airflow-pgbouncer-stats
+.. note::
-Depending on the size of your Airflow instance, you may want to adjust the
following as well (defaults are shown):
+ Depending on the size of your Airflow instance, you may want to adjust the
following as well (defaults are shown):
-.. code-block:: yaml
+ .. code-block:: yaml
+ :caption: values.yaml
- pgbouncer:
- # The maximum number of connections to PgBouncer
- maxClientConn: 100
- # The maximum number of server connections to the metadata database from PgBouncer
- metadataPoolSize: 10
- # The maximum number of server connections to the result backend database from PgBouncer
- resultBackendPoolSize: 5
+ pgbouncer:
+ # The maximum number of connections to PgBouncer
+ maxClientConn: 100
+ # The maximum number of server connections to the metadata database from PgBouncer
+ metadataPoolSize: 10
+ # The maximum number of server connections to the result backend database from PgBouncer
+ resultBackendPoolSize: 5
API Secret Key
----------------
+--------------
-You should set a static API secret key when deploying with this chart as it
will help ensure
-your Airflow components only restart when necessary.
+You should set a static API secret key when deploying with Airflow chart as it
will help ensure
+your Airflow components only restarts when necessary.
.. note::
- This section also applies to the webserver for Airflow <3 -- simply replace
"API" with "webserver."
+
+ This section also applies to the webserver for Airflow (simply replace
``api`` with ``webserver``).
Review Comment:
I'm not sure whether by "leave this" you meant not making the change, or
making the change for 2 (I added 2 :D). If I misunderstood, I will remove it,
but IMHO it is nice to have it there.
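
As a side illustration of two steps the diff above describes (URL-encoding special characters in the username/password, and base64-encoding the text stored in the ``airflow-pgbouncer-config`` secret), here is a minimal Python sketch. It is not part of the PR; the helper names `build_connection_uri` and `encode_secret_value` and all sample credentials are hypothetical.

```python
import base64
from urllib.parse import quote


def build_connection_uri(user, password, host, port, db, protocol="postgresql"):
    """Build a connection URI with URL-encoded credentials (hypothetical helper)."""
    # safe="" makes quote() encode "/" as well, so reserved characters such as
    # "@", ":" and "/" in the credentials cannot break the URI structure.
    encoded_user = quote(user, safe="")
    encoded_password = quote(password, safe="")
    return f"{protocol}://{encoded_user}:{encoded_password}@{host}:{port}/{db}"


def encode_secret_value(text):
    """Base64-encode text for the ``data`` field of a Kubernetes Secret (hypothetical helper)."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Sample values are made up for illustration only.
uri = build_connection_uri("airflow", "p@ss:w/rd", "host", 5432, "db")
users_txt = encode_secret_value('"airflow" "secret"\n')

print(uri)
print(users_txt)
```

The resulting URI is the kind of value that would go into `--from-literal=connection=...` when creating the `mydatabase` secret, and the base64 string is the kind of value expected under `data:` for `users.txt` or `pgbouncer.ini`.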