+1 to Jarek's suggestion. The UNLOGGED table approach for log and job
tables makes sense for high-volume deployments. Pairing with external log
storage provides a safety net for recovery scenarios.
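
For anyone who wants to try the UNLOGGED route, a rough sketch of the statements involved - the table names `log` and `job` are assumed from this thread, so verify them against your metadata DB first:

```python
# Sketch only: emit the ALTER statements that flip Airflow's high-churn
# tables to UNLOGGED (skipping WAL) and back. Note that SET LOGGED
# rewrites the whole table, so the revert direction needs a maintenance
# window.

HIGH_CHURN_TABLES = ["log", "job"]  # assumed names from this thread

def alter_statements(tables, revert=False):
    mode = "LOGGED" if revert else "UNLOGGED"
    return [f"ALTER TABLE {table} SET {mode};" for table in tables]

for stmt in alter_statements(HIGH_CHURN_TABLES):
    print(stmt)
```

Run the emitted statements with psql, and remember that UNLOGGED data is truncated on crash recovery and never shipped to replicas, which is exactly the recovery scenario to rehearse.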

Regards,
Rahul Vats

On Tue, 30 Dec 2025 at 01:50, Jarek Potiuk <[email protected]> wrote:

> Wait.. Actually ... Stupid me!
>
> I just realized that setting "unlogged" for the two tables means that you
> do not have to do any partitioning at all, because you will not have
> the WAL issues AT ALL.
>
> So it's not really about making things "partitioning friendly" but simply
> "mark those two tables as unlogged" and make sure that you handle recovery
> when the plug is pulled. No partitioning needed at all. It might be
> combined with the "send audit logs somewhere" idea.
>
> So - we might not need **any** change to Airflow, just documentation
> (maybe even an unofficial blog post of some sort) describing how to mark
> those two tables as unlogged and how to handle recovery in case of replica
> failover.
>
>
>
> On Mon, Dec 29, 2025 at 9:15 PM Jarek Potiuk <[email protected]> wrote:
>
> > > But e.g. lookups to the log table might use TI key attrs.
> >
> > Yeah - that would be the price to pay. Trade-offs, trade-offs everywhere.
> >
> > Such queries over a multi-partition table would have to do a full table
> > scan - they will not be able to use indexes (unless the queries use the
> > partition key). So they would **work** but would be **slow** if the
> > table gets too big.
> > However, I would argue that anyone who wants to partition those tables
> > will frequently and aggressively prune the old partitions (that is the
> > reason they want it in the first place) - which will effectively keep
> > those tables "small" (relatively). And even if a full table scan is
> > used for a question about the task_id audit log, that would be fine.
> >
> > Actually - I strongly believe - and we had this discussion in the past -
> > that the log table is a bit of an abomination, because it's not a "true"
> > audit log if it is kept in a modifiable database. Anyone who wants to
> > make it a "true" audit log will have to effectively send those log
> > entries to "write-only" storage and query the audit logs there. And
> > that nicely fits the "unlogged" pattern -> you could add a trigger in
> > your DB to automatically send audit logs somewhere "write-only" and
> > aggressively prune the old data (i.e. partitions).
> >
> > So - from the side of Airflow that would mean making those tables
> > "partitioning friendly", but not implementing partitioning itself -
> > leaving "partitioning", "handling recovery" and "handling audit logs"
> > up to the Deployment Manager. Which IMHO should happen anyway for the
> > "log" table, and would be a nice pattern to describe as a way to
> > achieve a "true" audit log.
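
A rough sketch of the trigger idea above, purely illustrative: the NOTIFY-based hand-off, the `forward_log_entry` function name, and the assumption that rows fit the NOTIFY payload limit (about 8 kB) are all mine, and a foreign data wrapper or logical decoding would work just as well as the sink.

```python
# Sketch: DDL (held as a string) for a row-level trigger that hands each
# new audit-log row to an external consumer via NOTIFY, so rows can land
# in append-only storage before the table itself is pruned.
TRIGGER_DDL = """
CREATE OR REPLACE FUNCTION forward_log_entry() RETURNS trigger AS $$
BEGIN
    -- publish the new row as JSON; a worker LISTENing on 'audit_log'
    -- writes it to write-only storage
    PERFORM pg_notify('audit_log', row_to_json(NEW)::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_to_sink
AFTER INSERT ON log
FOR EACH ROW EXECUTE FUNCTION forward_log_entry();
"""

print(TRIGGER_DDL)
```
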
> >
> > J.
> >
> >
> >
> > On Mon, Dec 29, 2025 at 9:02 PM Daniel Standish via dev <
> > [email protected]> wrote:
> >
> >> The problem with partitioning them is that when querying you would
> >> need to know the partition.
> >> But e.g. lookups to the log table might use TI key attrs.
> >>
> >> I would try just trimming the log and job tables in small batches
> >> frequently, using an appropriate index, before thinking about the
> >> complexity of partitioning.
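
A sketch of what that batched trimming could look like. Postgres has no `DELETE ... LIMIT`, hence the ctid subquery; the `log`/`dttm` names come from the thread, and the `execute` wrapper is an assumption standing in for whatever driver you use.

```python
# Sketch: trim the log table in small, index-assisted batches so each
# transaction writes a modest amount of WAL and autovacuum can keep up.
# Assumes an index on log.dttm; verify table/column names locally.
import time

BATCH_SQL = """
DELETE FROM log
WHERE ctid IN (
    SELECT ctid FROM log
    WHERE dttm < %(cutoff)s
    LIMIT %(batch)s
);
"""

def trim_in_batches(execute, cutoff, batch_size=5000, pause_s=0.5):
    """Delete batches until fewer rows than a full batch are removed.

    `execute` is any callable that runs one statement and returns the
    affected row count (e.g. a thin wrapper around a psycopg cursor).
    """
    total = 0
    while True:
        deleted = execute(BATCH_SQL, {"cutoff": cutoff, "batch": batch_size})
        total += deleted
        if deleted < batch_size:
            return total
        time.sleep(pause_s)  # let replication and autovacuum breathe
```

The pause between batches is the knob that trades cleanup speed against load on the primary and the replicas.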
> >>
> >>
> >>
> >>
> >>
> >> On Mon, Dec 29, 2025 at 11:24 AM Jarek Potiuk <[email protected]> wrote:
> >>
> >> > I have a feeling that making those **two** tables partition-friendly
> >> > should be easy - and maybe that's all that we need. That would make
> >> > it possible for you to use the partitioning (where it would not be
> >> > necessary for everyone).
> >> > There might be a few ways of doing it without losing uniqueness. For
> >> > example, Job ID and audit log ID could follow certain conventions
> >> > (e.g. always start with the partition key), thus providing the
> >> > uniqueness we need.
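
A toy illustration of that convention - the names are illustrative, and the in-process counter stands in for a real database sequence:

```python
# Sketch: embed the partition key (here: the day) in the identifier
# itself, so IDs stay globally unique even though a partitioned table
# can only enforce uniqueness within one partition.
from datetime import date
from itertools import count

_seq = count(1)  # stand-in for a database sequence

def make_job_id(day: date) -> str:
    # e.g. "20251229-000001": the prefix is the partition key, so two
    # rows in different partitions can never collide on the sequence part
    return f"{day:%Y%m%d}-{next(_seq):06d}"

first = make_job_id(date(2025, 12, 29))
second = make_job_id(date(2025, 12, 30))
```
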
> >> >
> >> > Those two tables are pretty specific, and neither job id nor log id
> >> > impacts anything else in our data model.
> >> >
> >> > I think such a change to make those two tables "partition friendly"
> >> could
> >> > be accepted, but we could only say it after seeing a POC.
> >> >
> >> > J.
> >> >
> >> >
> >> > On Mon, Dec 29, 2025 at 8:01 PM <[email protected]> wrote:
> >> >
> >> > > I do think that the second option is best - it is also what we
> >> > > wanted to do. The only reason we did not is that, from our tests,
> >> > > SQLAlchemy sometimes breaks, as it expects certain constraints
> >> > > which are not there, mainly for update queries (selects work
> >> > > well). If I am not mistaken, there are 3 or 4 large tables, job
> >> > > being the largest (7 times larger than the second largest). The
> >> > > question is: would such a pull request be approved?
> >> > >
> >> > > We do lose the unique constraint (it becomes per-partition). It is
> >> > > a sequence that most likely won't repeat until the previous values
> >> > > have been deleted, but if it does, we might query unrelated job or
> >> > > log data, so changes in the API server are also needed. Creating
> >> > > the PR is not a problem; the question is how beneficial it will
> >> > > be. If it is done for those problematic tables, it means the
> >> > > preferred way to manage retention is from the DB, and it can be an
> >> > > optional Alembic script.
> >> > >
> >> > > I do not want to just push a fix that will add more complexity
> >> > > than the benefit it brings.
> >> > >
> >> > > If we do go the PR way, a separate discussion is probably needed
> >> > > to decide how it should be done (most likely an additional Airflow
> >> > > command to turn partitions and retention on or off).
> >> > >
> >> > > Doing it as a custom solution has caused problems with SQLAlchemy,
> >> > > and we do not want to do so because, if a later Alembic script
> >> > > relies on the primary key in some way, we will need to fix it
> >> > > manually and deal with the problems it may cause when we upgrade.
> >> > >
> >> > > > On 29 Dec 2025, at 21:36, Jarek Potiuk <[email protected]> wrote:
> >> > > >
> >> > > > Another option is that we could make ONLY those two tables
> >> > > > partition-friendly, and not partition anything else. I think
> >> > > > that **could** be possible - both have datetime fields that
> >> > > > could be used as partition keys. You would have to assess
> >> > > > whether you can do it as your "custom" solution or whether it
> >> > > > would require some changes to Airflow models. But I can't see
> >> > > > foreign key problems if ONLY those two tables are partitioned,
> >> > > > so likely you could do it yourself in your DB.
> >> > > >
> >> > > > In this case - maybe "let's solve those tables that are
> >> > > > problematic" is easier to do than "let's apply partitioning to
> >> > > > everything".
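
For reference, a sketch of what a partition-friendly log table could look like. Columns are abbreviated and names assumed; the key detail is that the partition key has to join the primary key, and retention then becomes a cheap DROP instead of a WAL-heavy DELETE.

```python
# Sketch: range-partitioned layout for the log table, held as DDL strings.
PARTITIONED_DDL = """
CREATE TABLE log (
    id BIGINT GENERATED ALWAYS AS IDENTITY,
    dttm TIMESTAMPTZ NOT NULL,
    event TEXT,
    -- ... remaining columns elided ...
    PRIMARY KEY (dttm, id)  -- partition key must be part of the PK
) PARTITION BY RANGE (dttm);

CREATE TABLE log_2025_12 PARTITION OF log
    FOR VALUES FROM ('2025-12-01') TO ('2026-01-01');
"""

# Retention is then a metadata-only operation instead of a bulk DELETE:
RETENTION_SQL = "DROP TABLE log_2025_12;"
```
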
> >> > > >
> >> > > >> On Mon, Dec 29, 2025 at 7:31 PM Jarek Potiuk <[email protected]>
> >> > wrote:
> >> > > >>
> >> > > >> Purely theoretically, you could change the log and job tables to
> be
> >> > > >> unlogged - and thus avoid WAL for them.
> >> > > >>
> >> > > >> The drawback of this:
> >> > > >>
> >> > > >> * the data in those tables will be lost if you pull the plug or
> >> > > >> kill -9 the primary postgres server
> >> > > >> * the tables are not available (at all) in replicas - so in
> >> > > >> case of a failover, you would have to do a manual data
> >> > > >> export/import for those tables, rather than relying on replicas
> >> > > >> being "ready to take over immediately". Or accept data loss.
> >> > > >> * the data from those tables will not be present in backups
> >> > > >>
> >> > > >> I am not 100% sure, but I believe loss of data in both tables
> >> > > >> is not really catastrophic for Airflow, so maybe it's an
> >> > > >> acceptable risk (but you should likely do a disaster-recovery
> >> > > >> test to see what happens and how to recover if, indeed, someone
> >> > > >> pulls the plug on your postgres server).
> >> > > >>
> >> > > >> J.
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >>
> >> > > >>> On Mon, Dec 29, 2025 at 7:16 PM Natanel <
> [email protected]>
> >> > > wrote:
> >> > > >>>
> >> > > >>> Yes, the problem is the manual deletions - we have tried that,
> >> > > >>> and it resulted in the exact same issue: the scheduled DB
> >> > > >>> procedures which clean up the rows marked as deleted still
> >> > > >>> take up storage, and they do not solve the WAL problem. The
> >> > > >>> problematic table is actually not task_instance - it is
> >> > > >>> relatively small. The log and job tables are the biggest, by a
> >> > > >>> factor of 10 (or more, cluster dependent); the problem with
> >> > > >>> them is the primary key change required.
> >> > > >>>
> >> > > >>> Smaller batches might solve the issue; however, they seem to
> >> > > >>> just delay the problem a little rather than solve it, as
> >> > > >>> deleting data with a DELETE query (especially a lot of data)
> >> > > >>> is not a very light operation, and I think that is the main
> >> > > >>> issue.
> >> > > >>>
> >> > > >>> It would be nice if we could use partitions instead, as it is
> >> > > >>> a lighter operation and does not require us to maintain a
> >> > > >>> query and manage our DB. I have thought about changing the
> >> > > >>> models; most of the changes are relatively simple. For some it
> >> > > >>> is just removing the foreign key and relying on ORM-level
> >> > > >>> constraints; for others, it requires adding a pre-query to
> >> > > >>> keep the same constraints, but I do not like that idea. Maybe
> >> > > >>> there is another way to make Airflow "partition-friendly"?
> >> > > >>>
> >> > > >>> I can't think of a nice way to do so - maybe it does not
> >> > > >>> exist, as the db clean is as simple as a DELETE query gets,
> >> > > >>> yet when there is a lot of data, it is all duplicated in the
> >> > > >>> WALs.
> >> > > >>>
> >> > > >>> On Mon, Dec 29, 2025, 19:40 Daniel Standish via dev <
> >> > > >>> [email protected]>
> >> > > >>> wrote:
> >> > > >>>
> >> > > >>>> Have you looked at doing manual deletions?  I.e. writing your
> own
> >> > sql?
> >> > > >>>>
> >> > > >>>> The db clean command is probably not "optimal" for all
> scenarios.
> >> > > >>>>
> >> > > >>>> So for example, if the main problem table for you is
> >> > > >>>> task_instance, you could periodically delete TI records in
> >> > > >>>> smaller batches using some appropriate index (whether it
> >> > > >>>> exists now or you add it). Then maybe you would not stress
> >> > > >>>> the db as hard.
> >> > > >>>>
> >> > > >>>> Airflow isn't designed to use partitions, so you may not get
> >> > > >>>> good results with that approach.
> >> > > >>>>
> >> > > >>>>
> >> > > >>>>
> >> > > >>>> On Mon, Dec 29, 2025 at 7:32 AM Natanel <
> [email protected]
> >> >
> >> > > >>> wrote:
> >> > > >>>>
> >> > > >>>>> Hello everyone. We have been having issues with the 'airflow
> >> > > >>>>> db clean' command. Due to the number of DAGs and tasks
> >> > > >>>>> running every day in our deployments, a lot of new data is
> >> > > >>>>> stored in the database daily, and when we delete it, due to
> >> > > >>>>> the way PostgreSQL works, the WALs get replicated to both
> >> > > >>>>> the archive storage and the main data storage of the DB
> >> > > >>>>> instance. This in turn causes a significant jump in CPU
> >> > > >>>>> usage, RAM usage and disk usage whenever we run the command,
> >> > > >>>>> which causes all kinds of issues. Once, after we had not run
> >> > > >>>>> the command for a few months due to human error, it even
> >> > > >>>>> filled up the DB storage and made the database unresponsive,
> >> > > >>>>> forcing us to move to our backup database.
> >> > > >>>>>
> >> > > >>>>> I know that this is the accepted and widely used way of
> >> > > >>>>> managing the Airflow database's size; however, we noticed
> >> > > >>>>> that it may cause issues in certain cases, as in ours, where
> >> > > >>>>> if the DB has not been cleaned up for a while, cleaning it
> >> > > >>>>> can be problematic.
> >> > > >>>>>
> >> > > >>>>> We decided to try partitioning the tables and using
> >> > > >>>>> PostgreSQL's built-in partition retention, which does not
> >> > > >>>>> issue a DELETE query and is lighter, faster and simpler to
> >> > > >>>>> use. However, we encountered issues due to Foreign Key
> >> > > >>>>> constraints in some tables: we had to duplicate such keys
> >> > > >>>>> and force code changes (the foreign key must include the
> >> > > >>>>> partition key, since the partition key must be part of the
> >> > > >>>>> primary key), and SQLAlchemy breaks once we change the
> >> > > >>>>> primary key, with the constraints on the primary key
> >> > > >>>>> breaking as well.
> >> > > >>>>>
> >> > > >>>>> And in MySQL, due to the foreign keys, it is not possible to
> >> > > >>>>> partition tables which include them, as it is not supported
> >> > > >>>>> yet (according to
> >> > > >>>>> https://dev.mysql.com/doc/refman/8.4/en/partitioning-limitations-storage-engines.html
> >> > > >>>>> ).
> >> > > >>>>>
> >> > > >>>>> Has anyone else tried using the database's built-in
> >> > > >>>>> partition retention system instead of the 'airflow db clean'
> >> > > >>>>> command?
> >> > > >>>>>
> >> > > >>>>> Thanks, Natanel.
> >> > > >>>>>
> >> > > >>>>
> >> > > >>>
> >> > > >>
> >> > >
> >> > >
> ---------------------------------------------------------------------
> >> > > To unsubscribe, e-mail: [email protected]
> >> > > For additional commands, e-mail: [email protected]
> >> > >
> >> > >
> >> >
> >>
> >
>
