On Fri, 2022-05-13 at 13:54 -0500, Jason L Tibbitts III wrote:
> So I went to do a dnf system-upgrade from F35 to F36 on a test machine,
> as part of my usual testing.  In the middle of the process, it appears
> that /var filled up and that left the system in an unfortunate state.
> Surprisingly (to me) it did boot with a random mix of F35 and F36
> packages and even though it's a throwaway test box, I wanted to play
> around with fixing it a bit and trying to understand why it ran out of
> space instead of just reinstalling.
>
> Turns out that "dnf --releasever 36 --nogpgcheck remove --duplicates"
> was able to effectively [reinstall] everything in the system, and while
> running this /var filled up again.  When that happened, dnf couldn't
> even be aborted; I had to kill -9.  The culprit is the write-ahead log,
> /var/lib/rpm/rpmdb.sqlite-wal.  I resized /var and reran, and by the end
> of the process had grown to over 9GB:
>
> -rw-r--r--. 1 root root 9124576392 May 13 13:11 rpmdb.sqlite-wal
>
> Of course it immediately went to 0 once the transaction completed,
> though rpmdb.sqlite went from:
>
> -rw-r--r--. 1 root root 281739264 May 11 14:24 rpmdb.sqlite
>
> to
>
> -rw-r--r--. 1 root root 730648576 May 13 13:15 rpmdb.sqlite
>
> which seems... odd for what's effectively just reinstalling the existing
> package set.
>
ll /var/lib/rpm/rpmdb* -h
-rw-r--r-- 1 root root 666M May 15 21:12 /var/lib/rpm/rpmdb.sqlite
-rw-r--r-- 1 root root  32K May 16 01:52 /var/lib/rpm/rpmdb.sqlite-shm
-rw-r--r-- 1 root root    0 May 15 21:12 /var/lib/rpm/rpmdb.sqlite-wal

So 9 GB is not normal, but you can replace /var/lib/dnf/system-upgrade/
with a symbolic link to a directory on another partition. The symbolic
link has to be relative; at least this worked some years ago.

I decided *not* to separate the root partition from the home partition;
they are now a single partition, exactly to avoid this kind of problem.
Also, /var, which holds Docker files, databases, the mock cache and
build dirs, etc., often makes the root partition fill up.

> Anyway, obviously the solution is to make sure that /var is "big enough"
> before you do a system upgrade.  And we do have warnings about
> filesystems being too small, but nothing about needing an extra 10GB for
> this.  Certainly my case might be somewhat pathological and it was good
> that in the end I was able to get the system back into a useful state
> without wiping it.  But in the end I wonder:
>
> 1) Is it really expected that the wal file will grow to that size?
>
> 2) Is there anything to be done to reduce the size of the log?
>
> 3) Is there any better way to handle a lack of space in /var during an
> RPM transaction?
>
> 4) Can we estimate how large the file will grow, and refuse to start a
> system upgrade if there is not enough space?  Certainly we already do
> this to some degree, but it seems that the estimate of the required
> space is a bit too small.
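
On 4), I don't know how dnf or rpm compute their own estimate, so the
following is only a rough sketch of a manual pre-flight check, not what
either tool does. It reports the free space on the filesystem holding the
rpmdb and the current size of the wal file. The helper name wal_headroom
and the 10 GiB default are my assumptions; the figure is just a guess
based on the 9GB seen above, not a known upper bound.

```python
import os
import shutil


def wal_headroom(rpmdb_dir="/var/lib/rpm", needed_bytes=10 * 1024**3):
    """Return (free_bytes, wal_bytes, ok) for the filesystem holding rpmdb_dir.

    needed_bytes is an assumed safety margin, not a value taken from dnf or rpm.
    """
    # Free space on whatever filesystem rpmdb_dir lives on.
    free_bytes = shutil.disk_usage(rpmdb_dir).free

    # Current size of the write-ahead log; 0 if the file does not exist.
    wal_path = os.path.join(rpmdb_dir, "rpmdb.sqlite-wal")
    wal_bytes = os.path.getsize(wal_path) if os.path.exists(wal_path) else 0

    return free_bytes, wal_bytes, free_bytes >= needed_bytes


if __name__ == "__main__":
    free_bytes, wal_bytes, ok = wal_headroom()
    print(f"free: {free_bytes / 1024**3:.1f} GiB, wal: {wal_bytes} bytes")
    if not ok:
        print("probably not enough room for the wal file; free some space first")
```

Running it before the upgrade, and again from another terminal while the
transaction is in progress, at least shows how fast the wal file is
eating the remaining space.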
>
>  - J<

-- 
Sérgio M. B.
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure