Since the point release we've run into a number of databases that, when restored from a base backup, end up larger than the primary database was -- sometimes by a large factor. The data below is from 9.1.11 (both primary and standby), but we've seen the same thing on 9.2.6.
primary$ for i in 1261982 1364767 1366221 473158 ; do echo -n "$i " ; du -shc $i* | tail -1 ; done
1261982 29G total
1364767 23G total
1366221 12G total
473158 76G total

standby$ for i in 1261982 1364767 1366221 473158 ; do echo -n "$i " ; du -shc $i* | tail -1 ; done
1261982 55G total
1364767 28G total
1366221 17G total
473158 139G total

I've run the snaga xlogdump on the WAL records replayed before reaching a consistent point (we confirmed the extra storage had already appeared by then) and grepped for the above relfilenodes, but the resulting dumps are quite large. I believe these dumps don't contain any sensitive data; once I verify that I can upload one of them for inspection.

$ ls -lh [14]*
-rw-rw-r-- 1 heroku heroku 325M Jan 24 04:13 1261982
-rw-r--r-- 1 root   root   352M Jan 25 00:04 1364767
-rw-r--r-- 1 root   root   123M Jan 25 00:04 1366221
-rw-r--r-- 1 root   root   357M Jan 25 00:04 473158

The first three are btrees and the fourth is a heap, btw.

We're also seeing log entries about "WAL contains references to invalid pages", but these errors seem only vaguely correlated: sometimes we get the errors and the tables don't grow noticeably, and sometimes we don't get the errors and the tables are much larger.

Much of the added space is uninitialized pages, as you might expect, but what I don't understand is how the database can start up without consistently running into the "references to invalid pages" panic. We check both that there are no references after consistency is reached *and* that any references before consistency are resolved by a truncate or unlink before consistency.

The primary was never this large, btw, so it's not just a case of leftover files from drops or truncates that might have failed on the standby.
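For what it's worth, to quantify how much of a given segment file is uninitialized, a quick sketch like the following counts all-zero blocks. It assumes the default 8 kB BLCKSZ; `count_zero_pages` is just an illustrative helper, not anything in the tree, and you'd point it at a `base/<dboid>/<relfilenode>` segment:

```python
BLCKSZ = 8192  # default PostgreSQL block size; adjust if built differently

def count_zero_pages(path):
    """Return (zero_pages, total_pages) for one relation segment file."""
    zero_page = b"\x00" * BLCKSZ
    zero = total = 0
    with open(path, "rb") as f:
        while True:
            block = f.read(BLCKSZ)
            if not block:
                break
            total += 1
            if block == zero_page:
                zero += 1
    return zero, total
```

Running that over the bloated relfilenodes on the standby and comparing against the primary should show whether the extra space is entirely zero pages or something more interesting.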
I'm assuming this is somehow related to the multixact or transaction wraparound problems, but I don't really understand how they could be hitting when both the primary and standby are post-upgrade to the most recent point release, which has the fixes.

--
greg

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers