[EMAIL PROTECTED] wrote:
On Tue, March 7, 2006 21:04, Christopher Browne said:
Jim C. Nasby wrote:
I quite like it ;-) One thing that comes to mind is that
slony should do a vacuum between delete_from(n) and copy_into(n).
Otherwise, the table suddenly grows to twice its size, and even
more if the long-running subscribe event fails a few times (we've
had that problem due to network outages).

greetings, Florian Pflug


You're asking for a contradiction, then.

- You wanted for the old data to be visible until COPY_SET completes;

- Now, you want to vacuum that old data away.
No, I don't want the currently old data to be vacuumed away, but I want
previously-old (and now completely invisible) data to be vacuumed away.

consider the following
<node is filled with data>
<I resubscribe>
delete from table
copy into table
<copy fails, slony retries, table could have nearly doubled its size now>
delete from table
copy into table
<copy fails, slony retries, table could have nearly tripled its size now>
delete from table
copy into table
...
...

This continues until the disk is full...

Now, with a vacuum step between delete and copy, we get:
<node is filled with data>
<I resubscribe>
delete from table
vacuum (nothing is actually vacuumed)
copy into table
<copy fails, slony retries, table could have nearly doubled its size now>
delete from table
vacuum (The data from the previous, aborted copy is removed)
copy into table
<copy fails, slony retries, table has still at worst doubled its size>
delete from table
vacuum (The data from the previous, aborted copy is removed)
copy into table
<copy fails, slony retries, table has still at worst doubled its size>
...
...


Now the disk space is bounded at about twice the size of the actual data.
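
To make the two traces above concrete, here is a rough back-of-the-envelope
model in Python. It is only a sketch under stated assumptions: sizes are
counted in multiples of the data set, every failed attempt rolls back its
delete (so the old rows stay live) and leaves the rows it copied behind as
dead space, and the vacuum is assumed to run outside the subscribe
transaction and to make all of that dead space reusable. The function name
and parameters are made up for illustration and are not part of slony.

```python
def peak_table_size(failed_attempts, vacuum_between, copied_fraction=1.0):
    """Peak on-disk size, in multiples of the data set size (toy model)."""
    live_old = 1.0   # old rows: still live, since each failed attempt rolls back its delete
    dead = 0.0       # rows written by aborted copies: invisible, but still taking up space
    peak = live_old
    for _ in range(failed_attempts):
        if vacuum_between:
            # Assumption: vacuum frees the space of earlier aborted copies for reuse.
            dead = 0.0
        # The copy writes (part of) a full data set before it aborts.
        dead += copied_fraction
        peak = max(peak, live_old + dead)
    return peak


if __name__ == "__main__":
    for n in range(4):
        print(n, peak_table_size(n, False), peak_table_size(n, True))
```

Without the vacuum the peak grows by one data set per failed attempt; with
it the peak stays at two data sets no matter how often the copy fails.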


Only one of those is possible.  In order for the old data to be visible,
we *can't* have vacuumed it away.

The better option, if we *do* want the wasted space to be gone, is to use
TRUNCATE to clear out the tables, as that is a lot cheaper than the
vacuums.  (And, by the way, that needs a pretty full lock on the table,
pretty early...)

- LOCK ASAP+Truncate is the approach that will lead to the most efficient
behaviour;
Full ack.

- Delete+Avoid Locking is what permissively allows the subscriber to
appear reusable when it is being rebuilt.
+ an in-between vacuum that guarantees a bound on the disk space used,
even in case of repeated failure.
And, to avoid deadlock issues, one could take locks ASAP that prevent
anything but a select on the tables. I don't know which locking
level does that, but I guess there is one.
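
On the question of which lock level that is: going by the conflict table in
the PostgreSQL "Explicit Locking" documentation, it appears to be EXCLUSIVE
mode, which leaves only ACCESS SHARE (what a plain SELECT takes) usable by
other sessions. The small Python sketch below just transcribes that conflict
table so the claim can be checked; the helper name allowed_alongside is made
up for illustration. Note that SELECT ... FOR UPDATE takes ROW SHARE and
would therefore be blocked as well.

```python
# Table-level lock modes and the modes each one conflicts with,
# transcribed from the PostgreSQL "Explicit Locking" documentation.
MODES = [
    "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
    "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
]

CONFLICTS = {
    "ACCESS SHARE": {"ACCESS EXCLUSIVE"},
    "ROW SHARE": {"EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "ROW EXCLUSIVE": {"SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                      "ACCESS EXCLUSIVE"},
    "SHARE UPDATE EXCLUSIVE": {"SHARE UPDATE EXCLUSIVE", "SHARE",
                               "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                               "ACCESS EXCLUSIVE"},
    "SHARE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
              "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE"},
    "SHARE ROW EXCLUSIVE": {"ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE", "SHARE",
                            "SHARE ROW EXCLUSIVE", "EXCLUSIVE",
                            "ACCESS EXCLUSIVE"},
    "EXCLUSIVE": set(MODES) - {"ACCESS SHARE"},
    "ACCESS EXCLUSIVE": set(MODES),
}


def allowed_alongside(held):
    """Lock modes another session can still acquire while `held` is held."""
    return {mode for mode in MODES if mode not in CONFLICTS[held]}


if __name__ == "__main__":
    # Which modes leave plain SELECT (ACCESS SHARE) as the only thing allowed?
    print([m for m in MODES if allowed_alongside(m) == {"ACCESS SHARE"}])
```

SHARE mode also blocks writers, but it still lets other sessions take SHARE
and ROW SHARE locks, so it is somewhat less strict than "select only".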

Originally, we used Delete+Avoid Locking; after all the troubles
experienced with it, I changed over to LOCK ASAP in preparation for 1.2.

I daresay I wish we had 'LOCK ASAP' in 1.1.5 or even 1.1.2; we had a node
that had to be rebuilt _last night_, and the "permissive" approach in the
older versions allowed it to run into something of a deadlock that
required restarting the COPY_SET after it had been nearly complete.  I
didn't lose any sleep, but we didn't get the node back as early as hoped
for.

If we're rebuilding a node, the "best practice," as far as I'm concerned,
is to lock everything out (via pg_hba.conf) other than the slony user so
that nothing can get in edgewise and mess up the data copy.  If we demand
exclusive locks up front on all the tables, that makes the pg_hba.conf
changes less needful.
I agree, but only to the extent that locking everybody else out is possible.
I have to replicate gigs of data over a WAN, which takes a few hours at best
(because of slow connections). I just can't take things offline for that long...

Any time we have been "permissive" about letting connections in while a
node was rebuilding, it has been easy for it to lead to grief.  I'm
willing to listen and consider making this sort of thing optional, but in
my experience, it is not only very much not a "best practice;" I find it
turns out badly.  It did, last night, Yet Again, supporting my preference
for "Lock them early, Lock them up tight."  Events are doing a good job of
convincing me that my position on the matter is right...
I can't see how a select could possibly disturb a subscribe-set operation.
I fully agree that doing any kind of updates during a subscribe is crazy,
but that could be avoided by taking the right kind of lock, I'd say.

After having watched it turn out badly, your arguments for the "Delete +
Avoid Locking Permissive Approach" need to be very persuasive to overcome
that.  When you suggest impossibilities, that really doesn't help the
case...
s/suggest impossibilities/don't define what you mean clearly/, then I agree ;-)))
Sorry for the confusion, I hope I could clear it up a bit.

greetings, Florian Pflug


_______________________________________________
Slony1-general mailing list
[email protected]
http://gborg.postgresql.org/mailman/listinfo/slony1-general
