Tom Lane wrote:
Glen Parker <glene...@nwlink.com> writes:
> Mainly because the idea doesn't seem to make sense unless that's part
> of the package.  If you don't cut index changes out of the WAL load
> then the savings on the base backup alone aren't going to be all that
> exciting when you consider the total cost of PITR backup.

In our setting, I think it might be more exciting than you think. As I said, I've not noticed any real impact to the system related to WAL exporting, but the nightly backup does indeed have a significant impact because of how long it runs. WAL export takes a couple of seconds every few minutes, which nobody ever notices. The backup runs for a minimum of an hour and fifteen minutes, which people definitely notice.

> Furthermore, you would need some very ugly hacks on the recovery process
> to make it ignore (rather than try to apply) WAL records relating to
> indexes.  I believe there are a fair number of cases where the recovery
> process doesn't even know that a particular file is an index, because
> the WAL stream doesn't tell it.  The live backends generating the WAL
> log entries typically know that (and could suppress the entries) but the
> recovery process has only a very limited view of reality.  It cannot,
> for example, trust the system catalogs to be in a correct/consistent
> state, so it couldn't look up the info for itself.

Could the live backends label the log entries with "hints" to be used by the replay process? In this case, I would think a simple flag indicating whether replay is critical or not would suffice.
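
To make the "hint" idea concrete, here is a minimal sketch in Python. It is not PostgreSQL code: the WalRecord layout, the replay_critical field, and the replay() function are all hypothetical names made up for illustration, and real WAL records carry no such flag. The point is only that the writer sets the flag (it knows the record is for an index), so the replay side never needs to consult the catalogs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical log record; not the real PostgreSQL WAL format."""
    lsn: int                      # position of the record in the log
    relfilenode: int              # file the record applies to
    payload: bytes                # opaque change data
    replay_critical: bool = True  # hint set by the backend that wrote it


def replay(records: Iterable[WalRecord],
           apply: Callable[[WalRecord], None],
           skip_noncritical: bool = False) -> List[int]:
    """Apply records in order, optionally skipping non-critical ones.

    Returns the sorted relfilenodes whose records were skipped, so the
    caller knows which files are stale and must be rebuilt afterwards.
    """
    skipped = set()
    for rec in records:
        if skip_noncritical and not rec.replay_critical:
            skipped.add(rec.relfilenode)
            continue
        apply(rec)
    return sorted(skipped)
```

With skip_noncritical left at its default, every record is applied and nothing changes for people who are happy with replay as it is. The returned list is the set of files that would have to be rebuilt after recovery.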

> BTW, there's a related problem with the idea, which is that the
> tools normally used to take base backups haven't got any way to
> distinguish indexes from any other kind of relation.

Yes, there's no doubt it would increase the complexity of the base backup, IF a person chooses to ignore indexes. The upside is that people who are happy with the backup as it is would have to do nothing at all; it would just continue to work as it does now. To ignore indexes (and only certain indexes at that), you'd have to examine the system catalog as part of each backup. I already do that to some extent, in order to discover all the extra tablespaces that need to be backed up.
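
As a rough sketch of what that catalog step might look like, assuming the backup script has already fetched the relfilenode values of the indexes it wants to skip: the query below uses the real pg_class columns relfilenode and relkind ('i' marks an index), but split_backup_set() and the file-name pattern are hypothetical helpers, not part of any existing backup tool. Since pg_class is per-database, the query would have to be run in each database being backed up.

```python
import re
from typing import Iterable, List, Set, Tuple

# Run in each database; narrow the WHERE clause to pick only certain indexes.
INDEX_QUERY = "SELECT relfilenode FROM pg_class WHERE relkind = 'i'"

# Assumed naming: <relfilenode>, optional fork suffix (_fsm, _vm),
# optional segment number (.1, .2, ...).
_RELFILE_RE = re.compile(r"^(\d+)(?:_[a-z]+)?(?:\.\d+)?$")


def split_backup_set(filenames: Iterable[str],
                     skip_nodes: Set[int]) -> Tuple[List[str], List[str]]:
    """Partition file names in a database directory into (keep, skip).

    Anything that does not look like a relation file, or whose
    relfilenode is not in skip_nodes, is kept in the backup.
    """
    keep: List[str] = []
    skip: List[str] = []
    for name in filenames:
        match = _RELFILE_RE.match(name)
        if match and int(match.group(1)) in skip_nodes:
            skip.append(name)
        else:
            keep.append(name)
    return keep, skip
```

Passing an empty skip_nodes set keeps every file, which is the "do nothing at all" case for anyone who doesn't want to ignore indexes.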

I guess the biggest problem I see with this is that it would have rather a small target audience.


-Glen


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
