On Thu, Jun 30, 2011 at 6:45 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
> The only way to control this is with a time delay that can be changed
> while the server is running. A recovery.conf parameter doesn't allow
> that, so another way is preferable.
True.  We've talked about making the recovery.conf parameters into GUCs, which would address that concern (and some others).

> I think the time problems are more complex than said. The patch relies
> upon transaction completion times, but not all WAL records have a time
> attached to them. Plus you only used commits anyway, not sure why.

For the same reason we do that with the recovery_target_* code - replaying something like a heap insert or heap update doesn't change the user-visible state of the database, because the records aren't visible anyway until the commit record is replayed.

> Some actions aren't even transactional, such as DROP DATABASE, amongst
> others.

Good point.  We'd probably need to add a timestamp to the drop database record, as that's a case that people would likely want to defend against with this feature.

> Consecutive records can be hours apart, so it would be
> possible to delay on some WAL records but then replay records that
> happened minutes ago, then wait hours for the next apply. So this
> patch doesn't do what it claims in all cases.
>
> Similar discussion on max_standby_delay covered exactly that ground
> and went on for weeks in 9.0. IIRC I presented the same case you just
> did and we agreed in the end that was not acceptable. I'm not going to
> repeat it. Please check the archives.

I think this case is a bit different.  First, max_standby_delay is relevant for any installation using Hot Standby, whereas this is a feature that specifically involves time.  Saying that you have to have time synchronization for Hot Standby to work as designed is more of a burden than saying you need time synchronization *if you want to use the time-delayed recovery feature*.  Second, and maybe more importantly, no one has come up with an idea for how to make this work reliably in the presence of time skew.
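To make the commit-only delay point concrete, here is a minimal sketch of the idea (not the actual patch; the record-type names and function are invented for illustration): non-commit records are applied immediately, since nothing a transaction wrote is visible until its commit record is replayed, and only commit records - which carry a timestamp - can trigger a wait.

```c
#include <time.h>

/* Hypothetical WAL record kinds -- names invented for illustration. */
typedef enum { REC_HEAP_INSERT, REC_HEAP_UPDATE, REC_XACT_COMMIT } RecType;

/*
 * How many seconds should recovery wait before applying this record?
 * Only commit records carry a usable timestamp, and a transaction's
 * changes are invisible until its commit record is replayed, so it
 * suffices to pause only at commits: sleep until record_time +
 * delay_secs has passed by the standby's clock -- which is exactly
 * where the time-synchronization requirement comes from.
 */
long
delay_before_apply(RecType type, time_t record_time, time_t now, long delay_secs)
{
    if (type != REC_XACT_COMMIT)
        return 0;               /* invisible until commit; apply at once */

    time_t wakeup = record_time + delay_secs;
    return (wakeup > now) ? (long) (wakeup - now) : 0;
}
```

Note that this also illustrates Simon's objection: if consecutive commit records are hours apart, the standby sits idle and then replays a burst, so the "lag" is only well-defined at commit boundaries.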
Perhaps we could provide a simple time-skew correction feature that would work in the streaming case (though probably not nearly as well as running ntpd), but as I understand your argument, you're saying that most people will want to use this with archiving.  I don't see how to make that work without time synchronization.

In the max_standby_delay case, the requirement was that queries not get cancelled too aggressively while at the same time not letting the standby fall too far behind the master.  That left some flexibility in how we actually made the trade-off, and we eventually found a way that didn't require time synchronization, which was an improvement.  But for a time-delayed standby, the requirement - at least AIUI - is that the state of the standby lag the master by a certain time interval, and I don't see any way to do that without comparing slave timestamps with master timestamps.  If we can find a similar clever trick here, great!  But I'm not seeing how to do it.

Now, another option here is to give up on the idea of a time-delayed standby altogether and instead allow the standby to lag the master by a certain number of WAL segments or XIDs.  Of course, if we do that, then we won't have a feature called "time-delayed standbys"; we'll have a feature called "standbys delayed by a certain number of WAL segments (or XIDs)".  That certainly caters to some of the same use cases, but I think it's severely lacking in the usability department.  I bet the first thing most people will do is try to figure out how to translate between those metrics and time, and I bet we'll get complaints on systems where the activity load is variable and the time lag for a fixed WAL-segment or XID lag is therefore unpredictable.  So I think keeping it defined in terms of time is the right way forward, even though the need for external time synchronization is certainly not ideal.
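As a rough illustration of what a streaming-case skew correction might look like (this is purely hypothetical - no such mechanism exists, and all names here are invented): if each walsender message carried the master's send timestamp, the standby could track the minimum observed offset between its receive clock and the master's send clock.  That minimum is clock skew plus the smallest network latency seen, so it over-estimates skew by at most the best-case latency.

```c
#include <time.h>

/*
 * Hypothetical skew tracker (invented for illustration).  Each
 * streamed message is assumed to carry the master's send timestamp.
 * The offset (local receive time minus master send time) equals
 * network latency plus clock skew; taking the minimum over many
 * samples discounts latency spikes, leaving an estimate close to the
 * true skew whenever the network is occasionally fast.
 */
typedef struct
{
    long    min_offset_secs;    /* smallest offset seen so far */
    int     nsamples;
} SkewEstimator;

void
skew_init(SkewEstimator *s)
{
    s->min_offset_secs = 0;
    s->nsamples = 0;
}

void
skew_observe(SkewEstimator *s, time_t master_send, time_t local_recv)
{
    long offset = (long) (local_recv - master_send);

    if (s->nsamples == 0 || offset < s->min_offset_secs)
        s->min_offset_secs = offset;
    s->nsamples++;
}

/* Translate a master-clock commit time onto the standby's clock. */
time_t
skew_adjust(const SkewEstimator *s, time_t master_time)
{
    return master_time + s->min_offset_secs;
}
```

Even granting something like this for streaming, it doesn't help the archiving case, which is the point above: archived WAL carries only the master's timestamps, with nothing to calibrate against.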
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers