+1 to Jarek's suggestion. The UNLOGGED table approach for log and job tables makes sense for high-volume deployments. Pairing with external log storage provides a safety net for recovery scenarios.
Regards, Rahul Vats On Tue, 30 Dec 2025 at 01:50, Jarek Potiuk <[email protected]> wrote: > Wait.. Actually ... Stupid me! > > I just realized setting the "unlogged" for the two tables means that you do > not have to do any partitioning at all ....... Because you will not have > all the WAL issues AT ALL. > > So it's not really about making things "partitioning friendly" but simply > "mark those two tables as unlogged" and make sure that you handle recovery > when plug is pulled. No partitioning needed at all. Might be connected with > the "send audit logs somewhere". > > So - we might not need **any** change to airflow, just a documentation > (maybe even unofficial blog of some sort) - describing how to mark those > two tables as unlogged and how to handle recovery in case of replica > failover. > > > > On Mon, Dec 29, 2025 at 9:15 PM Jarek Potiuk <[email protected]> wrote: > > > > But eg lookiups to log table might use ti key attrs . > > > > Yeah - that would be the price to pay. Trade-offs, trade-offs everywhere. > > > > Such queries over a multi-partition table would have to do full table > scan > > - they will not be able to use indexes (unless the queries use partition > > key). So they would **work** but would be **slow** if the table gets too > > big. > > However I would argue that if someone would like to partition those > > tables, they will frequently and aggressively prune the old partitions > > (this is the reason why they want to have it) - which will effectively > keep > > those tables "small" (relatively). And even if full table scan is used, > if > > there is a question about task_id audit log, that would be fine. > > > > Actually - I strongly believe - and we had this discussion in the past - > > that the log table is quite a bit of abomination, because it's not a > "true" > > audit log if it is kept in modifiable database, and anyone who want to > make > > it a "true" audit log will have to effectively send those log entries to > a > > "write-only" storage and query the audit logs there. And that nicely fits > > into "unlogged" pattern -> you could add a trigger in your DB to > > automatically send audit logs somewhere "write-only" and aggressively > prune > > the old data (i.e. partitions). > > > > So - from the side of airflow that would mean that those tables are > > "partitioning friendly", but not to implement partitioning itself. > Leaving > > "partitioning|" and "handling recovery" and "handling audit logs" up to > the > > Deployment Manager. Which IMHO should happen anyway for "log" table and > > would be nice pattern to describe as a way to achieve "true" audit log. > > > > J,. > > > > > > > > On Mon, Dec 29, 2025 at 9:02 PM Daniel Standish via dev < > > [email protected]> wrote: > > > >> The problem with partitioning them is that when querying you would need > to > >> know the partition. > >> But eg lookiups to log table might use ti key attrs . > >> > >> I would try just trimming log and job table in small batches frequently > >> using an appropriate index, before thinking about the complexity of > >> partitioning. > >> > >> > >> > >> > >> > >> On Mon, Dec 29, 2025 at 11:24 AM Jarek Potiuk <[email protected]> wrote: > >> > >> > I have a feeling that making those **two** tables partition-friendly > >> should > >> > be easy - and maybe that's all that we need. That would make it > possible > >> > for you to use the partitioning, (where it would not be necessary for > >> > everyone). > >> > There might be a few ways of doing it without losing uniqueness. For > >> > example Job ID and audit log ID could follow certain conventions - > and > >> > always start with a partition key) thus providing uniqueness we need. > >> > > >> > Those two tables are pretty specific and neither job id nor log id > >> impact > >> > anything else in our data model. > >> > > >> > I think such a change to make those two tables "partition friendly" > >> could > >> > be accepted, but we could only say it after seeing a POC. > >> > > >> > J. > >> > > >> > > >> > On Mon, Dec 29, 2025 at 8:01 PM <[email protected]> wrote: > >> > > >> > > I do think that the second option is best, it is also what we wanted > >> to > >> > > do, the only reason we did not do that is because from our tests, > sql > >> > > alchemy sometimes breaks as it expects certain constraints which are > >> not > >> > > there, mainly for update queries, select works well, if I am not > >> > mistaken, > >> > > there are 3 or 4 large tables, job being the largest (7 times larger > >> than > >> > > the second largest), the question is, will such a pull request be > >> > approved? > >> > > > >> > > As we do lose the unique constraint (as it becomes per partition), > >> though > >> > > it is a sequence that most likely won't repeat until the previous > has > >> > been > >> > > deleted, but if not, we might query unrelated job or log data, and > so > >> > > changes in the api server are also needed, creating the pr is not a > >> > > problem, the question is how beneficial will it be, as if it is done > >> to > >> > > those problematic tables, it means that the preferred way to manage > >> > > retention is from the db, and can be an optional alembic script. > >> > > > >> > > I do not want to just push a fix that will add more complexity than > >> the > >> > > benefit it will bring. > >> > > > >> > > If we do go the pr way, a separate discussion is probably needed to > >> > decide > >> > > how it should be done (most likely an additional airflow command to > >> turn > >> > on > >> > > or off the partitions and retention) > >> > > > >> > > Doing it as a custom solution has caused problems with sqlalchemy, > >> and we > >> > > do not want to do so as if later an alembic script relies on the > >> primary > >> > > key in some way, we will need to fix it manually and deal with the > >> > problems > >> > > it may cause when we update. > >> > > > >> > > > On 29 Dec 2025, at 21:36, Jarek Potiuk <[email protected]> wrote: > >> > > > > >> > > > Another option is that you ( could make ONLY for those two > tables - > >> > > > partition-friendly. And do not partition anything else. I think > >> that > >> > > > **could** be possible - both have datetime fields that could be > >> used as > >> > > > partition keys - you would have to assess if you can do it as your > >> > > "custom" > >> > > > solution or whether it would require some changes to airflow > models. > >> > But > >> > > I > >> > > > can't see foreign key problems if ONLY those two tables are > >> > partitioned, > >> > > so > >> > > > likely you could do it yourself in your DB. > >> > > > > >> > > > In this case - maybe "let's solve those tables that are > >> problematic" is > >> > > > easier to do than "let's apply partitioning to everything". > >> > > > > >> > > >> On Mon, Dec 29, 2025 at 7:31 PM Jarek Potiuk <[email protected]> > >> > wrote: > >> > > >> > >> > > >> Purely theoretically, you could change the log and job tables to > be > >> > > >> unlogged - and thus avoid WAL for them. > >> > > >> > >> > > >> The drawback of this: > >> > > >> > >> > > >> * the data in those tables will be lost if you pull the plug or > >> kill > >> > -9 > >> > > >> the primary postgres server > >> > > >> * the tables are not available (at all) in replicas - so in case > >> of a > >> > > >> fallback, you would have to have a manual data import/export for > >> those > >> > > >> tables on fail-over, rather than rely on replicas being "ready to > >> > > fallback > >> > > >> immediately". Or accept data loss. > >> > > >> * the data from those tables will not be present in backups > >> > > >> > >> > > >> I am not 100% sure, but I believe loss of data in both tables is > >> not > >> > > >> really catastrophic for Airflow, so maybe it's acceptable risk > (but > >> > > likely > >> > > >> you should do a disaster-recovery test to see what happens and > how > >> to > >> > > >> recover in case, indeed, someone pulls the plug on your postgres > >> > server. > >> > > >> > >> > > >> J, > >> > > >> > >> > > >> > >> > > >> > >> > > >> > >> > > >>> On Mon, Dec 29, 2025 at 7:16 PM Natanel < > [email protected]> > >> > > wrote: > >> > > >>> > >> > > >>> Yes, the problem is the manual deletions, we have tried it, it > >> > > resulted in > >> > > >>> the same exact issue, as the scheduled db procedures which clean > >> up > >> > the > >> > > >>> rows marked as deleted actually get deleted, and so it takes up > >> > > storage, > >> > > >>> yet it does not solve the WAL problem, the problematic table is > >> > > actually > >> > > >>> not task_instance, it is relatively small, the log and job > tables > >> are > >> > > the > >> > > >>> biggest tables (the problem with them is the primary key change > >> > > required), > >> > > >>> by a multiple of 10 (or more, cluster dependant). > >> > > >>> > >> > > >>> The smaller batches might solve the issue, however, it seems to > >> just > >> > > delay > >> > > >>> the problem a little rather than solve it, as deleting data > with a > >> > > delete > >> > > >>> query (especially a lot of data) is not a very light operation, > >> and > >> > so > >> > > I > >> > > >>> think that this is the main issue. > >> > > >>> > >> > > >>> It would be nice if we could use partitions instead, as it is a > >> > lighter > >> > > >>> operation, and does not require us to maintain a query and > manage > >> our > >> > > db, > >> > > >>> I > >> > > >>> have thought about changing the models, most of the changes are > >> > > relatively > >> > > >>> simple, for some it is just removing the foreign key and relying > >> on > >> > ORM > >> > > >>> level constraints, for others, it requires adding a pre query to > >> have > >> > > the > >> > > >>> same constrains but I do not like that idea, maybe there is > >> another > >> > > way to > >> > > >>> make airflow "partition-friendly"? > >> > > >>> > >> > > >>> I can't think of a nice way to do so, maybe it does not exist, > as > >> the > >> > > db > >> > > >>> clean is as simple as a delete query gets, yet when there is a > >> lot of > >> > > >>> data, > >> > > >>> it is all duplicated in WALs. > >> > > >>> > >> > > >>> On Mon, Dec 29, 2025, 19:40 Daniel Standish via dev < > >> > > >>> [email protected]> > >> > > >>> wrote: > >> > > >>> > >> > > >>>> Have you looked at doing manual deletions? I.e. writing your > own > >> > sql? > >> > > >>>> > >> > > >>>> The db clean command is probably not "optimal" for all > scenarios. > >> > > >>>> > >> > > >>>> So for example, if the main problem table for you is > >> task_instance, > >> > > you > >> > > >>>> could periodically delete TI records in smaller batches using > >> some > >> > > >>>> appropriate index (whether it exists now or you add it). Then > >> maybe > >> > > you > >> > > >>>> would not stress the db as hard. > >> > > >>>> > >> > > >>>> Airflow isn't designed to use partitions so, you may not get > good > >> > > >>> results > >> > > >>>> with that approach. > >> > > >>>> > >> > > >>>> > >> > > >>>> > >> > > >>>> On Mon, Dec 29, 2025 at 7:32 AM Natanel < > [email protected] > >> > > >> > > >>> wrote: > >> > > >>>> > >> > > >>>>> Hello everyone, after having issues with the 'airflow db > clean' > >> > > >>> command, > >> > > >>>>> where due to the amount of dags and tasks that are running > every > >> > day > >> > > >>> in > >> > > >>>> our > >> > > >>>>> deployments, we get a lot of new data every day, which is > >> stored in > >> > > >>> the > >> > > >>>>> database, and when we delete the data, due to the way PGSQL > >> works, > >> > > the > >> > > >>>>> WAL's get replicated to both the archive storage and the main > >> data > >> > > >>>> storage > >> > > >>>>> of the db instance, which in turn, causes a significant jump > in > >> cpu > >> > > >>>> usage, > >> > > >>>>> ram usage and disk usage, whenever we run the command, which > >> causes > >> > > >>> all > >> > > >>>>> kinds of issues, we even had it once fill up the db storage, > and > >> > > >>> causing > >> > > >>>>> the database to be unresponsive, forcing us to move to our > >> backup > >> > > >>>> database, > >> > > >>>>> after we haven't ran the command for a few months due to human > >> > error. > >> > > >>>>> > >> > > >>>>> As of now, I know that this is the accepted and widely used > way > >> of > >> > > >>>> managing > >> > > >>>>> the airflow database's size, however, we noticed that it may > >> cause > >> > > >>> issues > >> > > >>>>> in certain cases, just like in our case, where if the db has > not > >> > been > >> > > >>>>> cleaned up for a while, cleaning it can be problematic. > >> > > >>>>> > >> > > >>>>> We decided to try and partition the table, and use pgsql's > >> built in > >> > > >>>>> retention of partitions, which does not issue a DELETE query, > >> and > >> > is > >> > > >>>>> lighter and faster, while being simpler to use, however, we > have > >> > > >>>>> encountered issues due to having Foreign Key constraints in > some > >> > > >>> tables, > >> > > >>>>> having to duplicate such keys and other than forcing code > >> changes > >> > (as > >> > > >>> the > >> > > >>>>> foreign key must include the partitioned key, as the > partitioned > >> > key > >> > > >>> must > >> > > >>>>> be part of the primary key), while also having the issue of > >> > > sqlalchemy > >> > > >>>>> breaking once we change the primary key, with the addition of > >> the > >> > > >>>>> constraints on the primary key breaking. > >> > > >>>>> > >> > > >>>>> And in Mysql, due to the foreign keys, it is not possible to > >> > > partition > >> > > >>>>> tables which include them, as it is not supported yet > >> (according to > >> > > >>> this > >> > > >>>>> < > >> > > >>>>> > >> > > >>>> > >> > > >>> > >> > > > >> > > >> > https://dev.mysql.com/doc/refman/8.4/en/partitioning-limitations-storage-engines.html > >> > > >>>>>> > >> > > >>>>> ). > >> > > >>>>> > >> > > >>>>> Has anyone else tried to use the databases built in partition > >> > > >>> retention > >> > > >>>>> system instead of the 'airflow db clean' command? > >> > > >>>>> > >> > > >>>>> Thanks, Natanel. > >> > > >>>>> > >> > > >>>> > >> > > >>> > >> > > >> > >> > > > >> > > > --------------------------------------------------------------------- > >> > > To unsubscribe, e-mail: [email protected] > >> > > For additional commands, e-mail: [email protected] > >> > > > >> > > > >> > > >> > > >
