I'm working on some major architectural changes to Bucardo, and this issue has popped up again. Quick background of the problem:
Bucardo works by knowing which rows in a table have changed since the last time it ran. Thus, we need to store the primary keys - this is done in the delta table. It also has a "txntime" next to each primary key column. This is used for two purposes: * As a unique identifier for the delta. It is too expensive to have the main process remove the deltas right away, so it is done at a later time. We do mark the row as done for this sync, however, by adding all the distinct txntimes from the delta table we just processed to the track table. Other syncs can thus ignore those rows by looking at the track table. This also allows for the case where a table is replicated by more than one sync - so we cannot remove a delta until all the syncs have processed it * The other purpose is for conflict resolution. The timestamp, in adition to being unique, is a handy way of knowing when the row changed, which we can use to determine which server "wins" in case the same primary key was changed. This use case is much less used. So the problem is that timestamp has a maximum precision of 6, which in rare cases can lead to two processes generating the same value for now(), and thus creating identical rows in bucardo_delta - which really confuses Bucardo, as it expects txntime to be unique. One possible solution is a bigint, tied to a sequence. This gets around the collision problem, but as we must call nextval() inside the trigger, we lose the many-to-one benefit of the same number for a process updating many rows (e.g. the track table will have to store a lot more distinct 'txntime' rows). Also, no more time-based conflict resolution. And calling nextval() is probably less perfomant than using now(). Another is to use txnid_current(), which basically is a unique bigint that will be the same inside a transaction ( much like now() ). The disadvantages are the loss of timestamp, and the worry that things could get wrapped around. That seems small, however, and the docs indicate there is an 'epoch' padding, although I'm not clear on how/where that gets applied. Other than that, I'm not sure. The ideal is something unique, that gives a time hint of some kind, and that doesn't take up a lot of space (as delta tables can get quite big quite quickly). Any ideas? A higher resolution timestamp would work, or combining the timestamp with something else to prevent collisions, but those will make the txntime column larger. Thus, the current best idea I have is to combine the timestamp with txid_current, which basically adds some precision to the timestamp, e.g. now() || txid_current()::text This means we have to store it as a text, and not a timestamp field, but it still sorts, and we don't do any other time-based queries on the field, other than the vac process, but that can be changed. -- Greg Sabino Mullane [email protected] End Point Corporation PGP Key: 2529 DF6A B8F7 9407 E944 45B4 BC9B 9067 1496 4AC8
signature.asc
Description: PGP signature
_______________________________________________ Bucardo-general mailing list [email protected] https://mail.endcrypt.com/mailman/listinfo/bucardo-general
