On Dec 13, 2012, at 7:48 PM, Greg Sabino Mullane <[email protected]> wrote:
> Just a heads up, there is a lot of conversation at times on
> the #bucardo IRC channel on freenode, so make sure you are
> visiting there for some interesting (but low volume) traffic!

Yeah, and I’m slow to respond here. :-)

> I don't think this will work out. If we did that, we'd have some problems:
>
> * Our main 'get deltas' query would no longer work, but would have to be a
>   handful of queries, per database. Additionally, it would add a lot of
>   complexity to push_rows. Because each DB might demand a different select
>   of distinct rows from bucardo_delta, we would also have to scan the
>   source database(s) multiple times, worse case scenario, once per
>   target DB. Ugh.
>
> * Since the data should already be in a logically self-contained set,
>   it should never be the case that we save anything by letting some other
>   sync do the work for us, because it's either going to violate our
>   existing assumptions of which tables should get replicated as a group,
>   or be the same as us in which case we don't win anything.
>
> * If we really want to avoid over-copying, there are probably better
>   approaches, but I am not convinced this is much of a problem, as
>   makedelta is 99% of the time used to replicate in a non-overlapping
>   target fashion.

Well, let’s look at the use-case I have, then. We’re setting up a
multi-master configuration between two servers A and B (in different data
centers) for our customer information. There are 10 or so tables for this.
The multi-master stuff is perfect for this. However, we have a bunch of
other databases used for varying purposes, and that need to access a subset
of the data from that system. For example, we have a project, let’s call it
“frobisher,” that needs access to the "customers" table.
Ideally, I would not change the existing MM config between the two
databases, but just add a new sync that just syncs "customers" with both
the main databases as sources and “frobisher” as a read-only target:
`A:source B:source frobisher:target`. But this means that every delta in
the "customers" table will be replicated between A and B *twice*.

I suppose the solution is to configure the multi-master replication between
A and B to exclude "customers", and then to add a separate sync that just
targets "customers". But I expect to have a bunch of these kinds of "copy to
this read-only target" things going on, so configuration might get to be a
bit of a PITA.

I guess the simplest solution, given the current architecture, is to have a
separate sync for every table. Kind of annoying, though.

>> As an aside: Ideally, when a table is being replicated with makedelta
>> enabled, the deltas would be created, but the autokick, if there is one,
>> is *not* fired. Any other sync autokicks would be.
>
> Not sure that this means - do you mean not firing the autokick of the current
> sync?

Yes.

> Right now the kid ignores its own messages, but ideally we would train
> the MCP to ignore such notices if they come from one of its (grandkids') PIDs.
> Certainly this is a TODO. We could probably simply have the KID send the
> MCP a quick NOTIFY on startup. Could also go the other way, and have the
> information about what PID did the kick bubble down to the CTL and then
> the KID, but that would be trickier.

Added: https://github.com/bucardo/bucardo/issues/34

>> * Replace "makedelta" with "cascade" and default to "auto"
>> * Let the triggers do the work as much as possible
>
> I'm okay with the automatic searching, although we would have to be careful
> to check that the sync is active, as well as some other edge cases.
> However, I've been thinking hard about the "always" triggers mentioned mostly
> on channel (the idea being we set delta to 'always' and thus we can simply
> rely on the trigger populating things rather than doing manual inserts
> in the code for makedeltas). I really don't think that will work, as 'always'
> triggers are just too dangerous. I'd much rather put the onus on Bucardo
> itself to emulate the trigger action, rather than violating the principle
> of least surprise by having 'SET session_replication_role = replica' NOT
> disable the Bucardo triggers. Even if we added some trickery to have them
> not do the inserts via a GUC setting or the like, just having them get
> called adds a lot of overhead for bulk loading, one of the main reasons
> people "turn off" the Bucardo delta triggers. If it was 'always', the only
> way to truly avoid them would be to do an ALTER TABLE which we really,
> really want to avoid as that is some heavy locking.

We discussed on #bucardo having triggers exit if session_replication_role is
set to "local", and to set it to that for bulk inserts.

> It's a shame that session_replication_role is as limited as it is, as having
> a little more discretion would be nice (e.g. setting a trigger to always fire
> unless session_replication_role X is enabled, but not on normal 'replica').
> The current origin/local settings are too weak and advisory to really matter
> here. Something to perhaps iron out and suggest to -hackers someday.

Yeah, my suggestion to use "local" is a bit of a hack, though, frankly, more
transparent if you're reading the code.

Best,

David

_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general
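
To make the session_replication_role exchange above concrete, here is a minimal Python sketch of when a delta trigger would be called and when it would record a row. The firing table follows PostgreSQL's documented trigger enable modes. Everything else is an assumption for illustration only: the names `FIRES_UNDER`, `trigger_invoked`, `delta_recorded`, and the `exit_on_local` flag are hypothetical and are not Bucardo code. The flag models the idea discussed on #bucardo of an 'always' trigger that exits early when the role is "local".

```python
# PostgreSQL trigger enable modes mapped to the session_replication_role
# values under which the trigger is called.
FIRES_UNDER = {
    "enable": {"origin", "local"},             # ENABLE TRIGGER (the default)
    "replica": {"replica"},                    # ENABLE REPLICA TRIGGER
    "always": {"origin", "replica", "local"},  # ENABLE ALWAYS TRIGGER
    "disable": set(),                          # DISABLE TRIGGER
}


def trigger_invoked(mode, role):
    """Return True if a trigger in this enable mode is called under this role."""
    return role in FIRES_UNDER[mode]


def delta_recorded(mode, role, exit_on_local=False):
    """Return True if the (hypothetical) delta trigger would insert a delta row.

    exit_on_local models the proposal from the thread: the trigger function
    itself returns early when session_replication_role is 'local'.
    """
    if not trigger_invoked(mode, role):
        return False
    if exit_on_local and role == "local":
        return False  # the trigger was still called, it just did no work
    return True


if __name__ == "__main__":
    for role in ("origin", "replica", "local"):
        print(
            role,
            "default trigger records delta:", delta_recorded("enable", role),
            "| always + exit-on-local records delta:",
            delta_recorded("always", role, exit_on_local=True),
        )
```

The sketch shows both sides of the argument: with an 'always' trigger, setting the role to 'replica' no longer suppresses delta rows, and under 'local' the trigger is still invoked even though it records nothing, which is the per-row overhead for bulk loads raised in the quoted reply.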
