norm commented on code in PR #29058: URL: https://github.com/apache/airflow/pull/29058#discussion_r1091933569
##########
docs/apache-airflow/howto/usage-cli.rst:
##########
@@ -217,6 +217,17 @@ You can use the ``--dry-run`` option to print the row counts in the primary tabl
 By default, ``db clean`` will archive purged rows in tables of the form ``_airflow_deleted__<table>__<timestamp>``.
 If you don't want the data preserved in this way, you may supply argument ``--skip-archive``.
 
+Export the purged records from the archive tables
+-------------------------------------------------
+The ``db export-cleaned`` command exports the contents of the archived tables, created by the ``db clean`` command,
+to a specified format, by default to a CSV file. The exported file will contain the records that were purged from the
+primary tables during the ``db clean`` process.
+
+You can specify the export format using ``--export-format`` option. The default format is csv.

Review Comment:
   Default … or only? If you fake add `json` as a second option, is `csv` still chosen by default?

##########
airflow/utils/db_cleanup.py:
##########
@@ -123,6 +127,15 @@ def _check_for_rows(*, query: Query, print_rows=False):
     return num_entities
 
 
+def _dump_table_to_file(*, target_table, file_path, export_format, session):
+    if export_format == "csv":

Review Comment:
   It's a minor thing since you *shouldn't* be able to get here without `csv` being the format, but there's no code to catch what-if-no-acceptable-format. (I mean, "you shouldn't see this" is a classic comedy error message for a reason, someone will get there)

##########
docs/apache-airflow/howto/usage-cli.rst:
##########
@@ -217,6 +217,17 @@ You can use the ``--dry-run`` option to print the row counts in the primary tabl
 By default, ``db clean`` will archive purged rows in tables of the form ``_airflow_deleted__<table>__<timestamp>``.
 If you don't want the data preserved in this way, you may supply argument ``--skip-archive``.
 
+Export the purged records from the archive tables
+-------------------------------------------------
+The ``db export-cleaned`` command exports the contents of the archived tables, created by the ``db clean`` command,
+to a specified format, by default to a CSV file. The exported file will contain the records that were purged from the
+primary tables during the ``db clean`` process.
+
+You can specify the export format using ``--export-format`` option. The default format is csv.
+
+You must also specify the location of the path to which you want to export the data using ``--output-path`` option.

Review Comment:
   Worth mentioning that the dir must exist here?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@airflow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
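
On the `_dump_table_to_file` comment (no branch for an unrecognized format), here is a minimal sketch of the kind of guard being asked for. It is not the PR's code: `dump_rows_to_file`, `SUPPORTED_FORMATS`, and the `rows`/`header` parameters are hypothetical stand-ins so the example runs without a database session. The only point is the explicit `else` that fails loudly.

```python
import csv

# Assumption: csv is the only supported format, as in the diff above.
SUPPORTED_FORMATS = ("csv",)


def dump_rows_to_file(*, rows, header, file_path, export_format):
    """Hypothetical stand-in for _dump_table_to_file.

    Takes already-fetched rows instead of a table and session, so the
    sketch is self-contained.
    """
    if export_format == "csv":
        with open(file_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(rows)
    else:
        # Reached only if the CLI-level validation is bypassed or a new
        # format is registered without a matching branch here.
        raise ValueError(
            f"Unsupported export format {export_format!r}; "
            f"expected one of {SUPPORTED_FORMATS}"
        )
```

With this shape, adding a second format later without wiring it in here produces a clear error rather than silently writing nothing.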