On 12/23/2005 4:57 AM, Florian G. Pflug wrote:
Jan Wieck wrote:
On 12/21/2005 8:18 PM, Florian G. Pflug wrote:

Florian G. Pflug wrote:
<snipped my own mail>

Can anyone confirm that this is actually a bug? I'm pretty sure it is
(I did multiple setups of my cluster, and the problem persisted.
I used the altperl scripts for setting up the cluster, so I
see no way I could have caused this).

If it's really a bug, it would at least be worth a note in
the docs or in the 1.1.5 release notes. It took me hours to
nail down the problem, and it wasn't fun, so preventing
others from having to do the same would be a good thing.

Rebuild listen entries is indeed broken. This is a show stopper for 1.1.5 ... I am working on it.

Is there a reason for not generating all "sensible" sl_listen entries?
I didn't find any documentation on the performance overhead a
sl_listen entry causes.

The "sensible" part is exactly what matters here. Problems arise when a node receives an event from a set origin that has not yet been processed by its data provider for that set. For example:

    1 -> 2

      3

1 is the origin, 2 is a subscriber, and 3 is a new node that is not subscribed yet. 3 has paths to 1 and 2, so naturally it would listen on each of them for their events. If we now subscribe 3 as a cascaded node with 2 as its data provider, the ENABLE_SUBSCRIPTION event that follows from node 1, on which node 3 will start copy_set, must be received by 3 from 2. That is the only way to ensure that 2 actually has the data itself at the moment 3 starts to copy. Otherwise 2 could still be busy with its own copy_set, meaning that not only is the data in the tables missing, but the tables themselves aren't in sl_table yet either.
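
The rule in that example can be written down as a small Python sketch. This is not Slony-I code: the function name, the `subscriptions` mapping, and the `paths` set are hypothetical shapes chosen only to illustrate the constraint that a subscribed node must hear the origin's events through its data provider.

```python
def listen_provider(receiver, origin, subscriptions, paths):
    """Pick the node that `receiver` should listen on for events from `origin`.

    subscriptions: dict mapping (origin, receiver) -> data provider node
                   (hypothetical stand-in for subscription state)
    paths:         set of (client, server) pairs, meaning `client` is able
                   to connect to `server` (hypothetical stand-in for sl_path)
    """
    provider = subscriptions.get((origin, receiver))
    if provider is not None:
        # Subscribed: the origin's events must arrive via the data provider,
        # so the provider has already processed them when we see them.
        return provider
    if (receiver, origin) in paths:
        # Not subscribed: listening on the origin directly is harmless.
        return origin
    return None  # no usable path


# The 1 -> 2, 3 example: node 3 has paths to both 1 and 2.
paths = {(2, 1), (3, 1), (3, 2)}
subscriptions = {(1, 2): 1}

before = listen_provider(3, 1, subscriptions, paths)  # 3 not subscribed yet
subscriptions[(1, 3)] = 2                             # subscribe 3 via provider 2
after = listen_provider(3, 1, subscriptions, paths)
```

In this sketch node 3 listens on node 1 directly before the subscription and switches to node 2 afterwards, which is the change of listen source the example calls for.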

And to spice this up a little more, reading the events is done asynchronously in the remote_listen thread. They are queued, and the remote_worker thread processes them from the queue. At the moment node 3 gets the SUBSCRIBE_SET event, it will already have a lot of stuff queued, so it had better restart ASAP to throw that away and listen again, this time for all 1-events on 2.


Jan


By "sensible" I mean: telling node X via sl_listen to ask its neighbour nodes
(those for which an sl_path entry exists) for events from all other nodes,
apart from those whose events must have travelled via node X to
reach the neighbour of X in question.
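
One way to read that definition is as a reachability question on the path graph with node X removed. The Python sketch below is only an illustration under that reading. It is not taken from Slony-I, the `paths` set of (client, server) pairs is an assumed stand-in for sl_path, and it deliberately ignores the data-provider constraint Jan describes above.

```python
from collections import deque


def sensible_listens(x, paths):
    """Return (origin, provider) pairs that node `x` could listen for.

    paths: set of (client, server) pairs, meaning `client` can connect to
           `server` and fetch events from it (assumed stand-in for sl_path).
    A pair (origin, provider) is kept only if events from `origin` can
    reach `provider` without passing through `x`.
    """
    servers = {}
    for client, server in paths:
        servers.setdefault(client, set()).add(server)

    result = set()
    for neighbour in servers.get(x, ()):
        if neighbour == x:
            continue
        # Breadth-first search from the neighbour, never stepping onto x.
        seen = {neighbour}
        queue = deque([neighbour])
        while queue:
            node = queue.popleft()
            for nxt in servers.get(node, ()):
                if nxt != x and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        # The neighbour can supply its own events plus everything it can
        # obtain without going through x.
        result.update((origin, neighbour) for origin in seen)
    return result


# A chain 1 - 2 - 3 with paths in both directions.
chain = {(1, 2), (2, 1), (2, 3), (3, 2)}
for_node_3 = sensible_listens(3, chain)
for_node_2 = sensible_listens(2, chain)
```

For the chain, node 3 can ask node 2 for events from both 1 and 2, while node 2 can only ask each neighbour for that neighbour's own events, since anything else would have had to pass through node 2 itself.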

I tried writing an algorithm to do that, but it turned out that it isn't quite
as easy as I initially believed, because all the "iterative" algorithms
I could think of (which were all basically based on the idea that
if X receives events from Y, and Y from Z, then X can receive events from Z
via Y) failed, because there is not enough information in sl_listen to figure
out whether Y already needs X to receive events from Z.

greetings, Florian Pflug


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #