Hi,

I have been performance testing 4.99.11 over high-latency connections between several servers. The sync contains over 70 tables. On each node a simple test script continuously modifies random rows in randomly selected tables. The performance is surprisingly good until a conflict arises. The default 'bucardo_latest' conflict strategy is slow and takes a few minutes to pick a winner. Typically this results in more conflicts in subsequent sync runs, until Bucardo can't serialize due to concurrent updates on one of the nodes. All sync runs from then on fail with serialization errors until the test scripts are turned off.

Digging into this, I found that the conflict strategy works quite differently from what I expected: it does not pick a winner for each conflicting row, but picks one winning database for all conflicts, based on which database has the latest update. This can produce some quite unexpected results. For example, when you run:

psql -h node1 -c "update t set node='node1' where id=1" test
psql -h node2 -c "update t set node='node2' where id=1" test
psql -h node3 -c "update t set node='node3' where id=2" test

and these updates are processed in one sync run, the result looks like this, once the conflict has been resolved:

 id |   node
----+-----------
  1 | old value
  2 | node3

Since node3 made the last update, it is the winner for all conflicting rows, even though it never touched row 1. Row id 1 is conflicting (node1 and node2), so the old value from node3 is restored and both updates to that row are lost. In the (theoretical) scenario of the performance test above this produces really undesirable results, since syncing will fail for a long time with lots of conflicts.
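To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is a toy model of what I observed, not Bucardo's actual code: the function names, the `state` and `deltas` dictionaries, and the integer timestamps are all made up for illustration.

```python
# Toy model, NOT Bucardo code. All names here are hypothetical.
# state:  node -> {row id -> current value on that node}
# deltas: node -> {row id -> logical time of the local change}
state = {
    "node1": {1: "node1", 2: "old value"},
    "node2": {1: "node2", 2: "old value"},
    "node3": {1: "old value", 2: "node3"},
}
deltas = {
    "node1": {1: 1},  # first update, row 1
    "node2": {1: 2},  # second update, row 1 (conflicts with node1)
    "node3": {2: 3},  # last update, row 2 (no conflict)
}


def changed_by(deltas):
    """Map each changed row id to the list of nodes that changed it."""
    owners = {}
    for node, rows in deltas.items():
        for row_id in rows:
            owners.setdefault(row_id, []).append(node)
    return owners


def resolve_per_database(state, deltas):
    """Observed behaviour: one winning node for every conflicting row."""
    # The winner is the node that holds the most recent change overall.
    active = [node for node in deltas if deltas[node]]
    winner = max(active, key=lambda node: max(deltas[node].values()))
    result = {}
    for row_id, nodes in changed_by(deltas).items():
        # Rows changed on a single node are not conflicts.
        source = nodes[0] if len(nodes) == 1 else winner
        result[row_id] = state[source][row_id]
    return result


def resolve_per_row(state, deltas):
    """Expected behaviour: the latest change wins, decided row by row."""
    result = {}
    for row_id, nodes in changed_by(deltas).items():
        source = max(nodes, key=lambda node: deltas[node][row_id])
        result[row_id] = state[source][row_id]
    return result


print(resolve_per_database(state, deltas))
print(resolve_per_row(state, deltas))
```

In this model the per-database variant reproduces the table above (row 1 falls back to node3's old value), while the per-row variant keeps node2's update for row 1.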

The reason it takes so long to pick a winner is that Bucardo queries every delta table individually on every node. With over 70 tables in the sync, that is a lot of round trips per node, and when latency is high this takes a while.

Remarks / questions:

1. Maybe some side effects of bucardo_latest can be avoided, e.g. ignore databases that are not part of the conflict. Documentation / man page suggests something else.
2. Performance of the current bucardo_latest strategy can be improved dramatically when each node is only queried once, using a UNION across all delta tables to find MAX(txntime).
3. Are the bucardo_source, bucardo_target, bucardo_skip and bucardo_random strategies still working in 4.99?
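
As a rough illustration of point 2, the following Python sketch builds a single statement per node instead of one query per delta table. It rests on assumptions: that the delta tables live in the bucardo schema and are named delta_<schema>_<table>, and that each has a txntime column. The function name `latest_change_query` is made up, and the identifier quoting is deliberately naive.

```python
def latest_change_query(tables):
    """Build one SQL statement that returns the newest txntime on a node.

    tables is a list of (schema, table) pairs taking part in the sync.
    ASSUMPTION: delta tables are named bucardo.delta_<schema>_<table>.
    """
    if not tables:
        raise ValueError("need at least one table")
    parts = [
        f'SELECT MAX(txntime) AS t FROM bucardo."delta_{schema}_{name}"'
        for schema, name in tables
    ]
    # UNION ALL is enough here: duplicates do not affect the outer MAX().
    inner = "\n  UNION ALL\n  ".join(parts)
    return f"SELECT MAX(t) AS latest FROM (\n  {inner}\n) AS d"


print(latest_change_query([("public", "t"), ("public", "u")]))
```

Running the generated statement once on each node would cost one round trip per node rather than one per table per node, which is where the gain over a high-latency link would come from.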

Kind regards,
--
Hans van der Riet
_______________________________________________
Bucardo-general mailing list
[email protected]
https://mail.endcrypt.com/mailman/listinfo/bucardo-general
