[EMAIL PROTECTED] wrote:
On Tue, March 7, 2006 21:04, Christopher Browne said:
Jim C. Nasby wrote:
I quite like it ;-) One thing that comes to my mind is that slony should do a vacuum between delete_from(n) and copy_into(n). Otherwise, the table suddenly grows to twice its size, and even more if the long-running subscribe-event fails a few times (we've had that problem due to network outages).

greetings, Florian Pflug

You're asking for a contradiction, then.
- You wanted for the old data to be visible until COPY_SET completes;
- Now, you want to vacuum that old data away.
No, I don't want the currently old data to be vacuumed away, but I want previously-old (and now completely invisible) data to be vacuumed away. Consider the following:

<node is filled with data>
<I resubscribe>
delete from table
copy into table
<copy fails, slony retries, table could have nearly doubled its size now>
delete from table
copy into table
<copy fails, slony retries, table could have nearly tripled its size now>
delete from table
copy into table
...
...

This continues until the disk is full...

Now, with a vacuum step between delete and copy, we get:

<node is filled with data>
<I resubscribe>
delete from table
vacuum (nothing is actually vacuumed)
copy into table
<copy fails, slony retries, table could have nearly doubled its size now>
delete from table
vacuum (the data from the previous, aborted copy is removed)
copy into table
<copy fails, slony retries, table has still at worst doubled its size>
delete from table
vacuum (the data from the previous, aborted copy is removed)
copy into table
<copy fails, slony retries, table has still at worst doubled its size>
...
...

Now the disk space is bounded at about twice the size of the actual data.
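The two retry sequences above can be checked with a small back-of-the-envelope model. This is only a sketch under stated assumptions, not how Slony-I behaves internally: sizes are in multiples of the live data, every failed attempt rolls back (so the old rows stay live), every aborted copy leaves dead rows behind, and a plain vacuum makes that dead space reusable without shrinking the file. The function name and the `copied_fraction` parameter are made up for illustration.

```python
def table_size_after_failures(failed_attempts, vacuum_between, copied_fraction=1.0):
    """Return the on-disk table size, in multiples of the live data size.

    Assumed model (hypothetical, for illustration only):
    - each failed subscribe rolls back, so the old rows remain live;
    - each aborted copy leaves `copied_fraction` worth of dead rows;
    - vacuum turns dead rows into reusable space but does not shrink the file.
    """
    live = 1.0       # old rows, still visible after every rollback
    dead = 0.0       # rows written by aborted copy attempts
    reusable = 0.0   # space reclaimed by vacuum, still part of the file

    for _ in range(failed_attempts):
        # "delete from table" is rolled back when the copy fails,
        # so it does not change the accounting here.
        if vacuum_between:
            reusable += dead   # rows of the previous aborted copy are reclaimed
            dead = 0.0
        written = copied_fraction
        reused = min(reusable, written)   # the copy fills reclaimed space first
        reusable -= reused
        dead += written                   # the copy aborts, its rows become dead

    return live + dead + reusable


if __name__ == "__main__":
    for attempts in range(1, 5):
        print(attempts,
              table_size_after_failures(attempts, vacuum_between=False),
              table_size_after_failures(attempts, vacuum_between=True))
```

Under these assumptions, three failed attempts give a size of 4.0 without the vacuum step and 2.0 with it, which matches the "at worst doubled" bound claimed above.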
Only one of those is possible. In order for the old data to be visible, we *can't* have vacuumed it away.

The better option, if we *do* want the wasted space to be gone, is to use TRUNCATE to clear out the tables, as that is a lot cheaper than the vacuums. (And, by the way, that needs a pretty full lock on the table, pretty early...)

- LOCK ASAP+Truncate is the approach that will lead to the most efficient behaviour;
Full ack.
- Delete+Avoid Locking is what permissively allows the subscriber to appear reusable when it is being rebuilt.
+ an in-between vacuum that guarantees a bound on the disk space used, even in case of repeated failure. And, to avoid deadlock issues, one could take locks ASAP that prevent anything but a select on the tables. I don't know which locking level does that, but I guess there is one.
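On the question of which locking level allows nothing but a select: the sketch below encodes the table-level lock conflict table as I understand it from the PostgreSQL documentation and searches it. The names `MODES`, `CONFLICTS`, `compatible_with` and `weakest_mode_allowing_only` are invented for this illustration, and the table is worth checking against the documentation for the PostgreSQL version in use.

```python
# Table-level lock modes, ordered from weakest to strongest.
MODES = [
    "ACCESS SHARE",            # taken by a plain SELECT
    "ROW SHARE",               # taken by SELECT ... FOR UPDATE
    "ROW EXCLUSIVE",           # taken by INSERT, UPDATE, DELETE
    "SHARE UPDATE EXCLUSIVE",  # taken by plain VACUUM
    "SHARE",
    "SHARE ROW EXCLUSIVE",
    "EXCLUSIVE",
    "ACCESS EXCLUSIVE",        # taken by TRUNCATE
]

# For each mode, the set of modes it conflicts with
# (transcribed from the documented conflict table; an assumption to verify).
CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},
    "ROW SHARE": {"EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "ROW EXCLUSIVE": {"SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                      "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
              "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE ROW EXCLUSIVE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
                            "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                            "ACCESS EXCLUSIVE"},
    "EXCLUSIVE": {"ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
                  "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                  "ACCESS EXCLUSIVE"},
    "ACCESS EXCLUSIVE": set(MODES),
}


def compatible_with(held):
    """List the modes another session can still acquire while `held` is held."""
    return [mode for mode in MODES if mode not in CONFLICTS[held]]


def weakest_mode_allowing_only(allowed):
    """Return the weakest mode whose compatible set equals `allowed`, or None."""
    for mode in MODES:
        if set(compatible_with(mode)) == set(allowed):
            return mode
    return None


if __name__ == "__main__":
    print(weakest_mode_allowing_only(["ACCESS SHARE"]))
    print(compatible_with("EXCLUSIVE"))
```

If the table is transcribed correctly, the answer is EXCLUSIVE mode (`LOCK TABLE ... IN EXCLUSIVE MODE`): other sessions can then only take ACCESS SHARE, i.e. run plain selects. Note that under this table a vacuum issued from another session would also be blocked by such a lock, since its SHARE UPDATE EXCLUSIVE mode conflicts with EXCLUSIVE, so the lock and the in-between vacuum would need to be coordinated.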
Originally, we used Delete+Avoid Locking; after all the troubles experienced with it, I changed over to LOCK ASAP in preparation for 1.2. I daresay I wish we had 'LOCK ASAP' in 1.1.5 or even 1.1.2; we had a node that had to be rebuilt _last night_, and the "permissive" approach in the elder versions allowed it to run into something of a deadlock that required restarting the COPY_SET after it had been nearly complete. I didn't lose any sleep, but we didn't get the node back as early as hoped for. If we're rebuilding a node, the "best practice," as far as I'm concerned, is to lock everything out (via pg_hba.conf) other than the slony user so that nothing can get in edgewise and mess up the data copy. If we demand exclusive locks up front on all the tables, that makes the pg_hba.conf changes less needful.
I agree, but only to the extent that locking everybody else out is possible. I have to replicate gigs of data over a WAN, which takes a few hours at best (because of slow connections). I just can't take things offline for that long...
Any time we have been "permissive" about letting connections in while a node was rebuilding, it has been easy for it to lead to grief. I'm willing to listen and consider making this sort of thing optional, but in my experience, it is not only very much not a "best practice;" I find it turns out badly. It did, last night, Yet Again, supporting my preference for "Lock them early, Lock them up tight." Events are doing a good job of convincing me that my position on the matter is right...
I can't see how a select could possibly disturb a subscribe-set operation. I fully agree that doing any kind of updates during a subscribe is crazy, but that could be avoided by taking the right kind of lock, I'd say.
After having watched it turn out badly, your arguments for the "Delete + Avoid Locking Permissive Approach" need to be very persuasive to overcome that. When you suggest impossibilities, that really doesn't help the case...
s/suggest impossibilities/don't define what you mean clearly/, then I agree ;-)))

Sorry for the confusion, I hope I could clear it up a bit.

greetings, Florian Pflug
_______________________________________________ Slony1-general mailing list [email protected] http://gborg.postgresql.org/mailman/listinfo/slony1-general
