Hi Markus,

I haven't had many reliability issues with Ensemble, Appia or Spread so
far, although I admit I have never run any of them in production.
Performance is certainly an issue, yes.
I would suggest another reading; even though it is a bit dated, most of the results still apply: http://jmob.objectweb.org/jgroups/JGroups-middleware-2004.pdf

The bottom line is that if you use UDP multicast, you need a dedicated switch and the tuning is a nightmare. I discussed these issues with the developers of Spread and they have no real magic. TCP seems a more reliable alternative (especially for predictable performance), but the TCP timeouts are also tricky to tune depending on the platform. We worked quite a bit with Nuno on Appia in the context of Sequoia, and performance can be outstanding when properly tuned or absolutely awful if some default values are wrong.

The chaotic behavior of a GCS under stress quickly compromises the reliability of the replication system, and admission control on UDP multicast has no good solution so far. This is just a heads-up on what awaits you in production when the system is stressed. The only workable approach so far is good admission control on top of the GCS (in the application).
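
To make the idea of application-level admission control a bit more concrete, here is a minimal sketch in Python. It is not taken from Sequoia, Appia, Spread or JGroups; the names AdmissionController, try_admit and send_with_admission are hypothetical, and the multicast argument is just a placeholder for whatever send call your GCS exposes. The assumption is that capping the number of messages in flight, and shedding load above that cap, keeps the GCS out of the regime where it behaves chaotically.

```python
import threading


class AdmissionController:
    """Caps the number of messages in flight toward the GCS (hypothetical helper)."""

    def __init__(self, max_in_flight):
        if max_in_flight < 1:
            raise ValueError("max_in_flight must be at least 1")
        # A bounded semaphore also catches accidental double releases.
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def try_admit(self, timeout=0.0):
        """Return True if a slot was obtained, False if the caller should back off."""
        if timeout <= 0:
            return self._slots.acquire(blocking=False)
        return self._slots.acquire(timeout=timeout)

    def release(self):
        """Give the slot back once the message is no longer in flight."""
        self._slots.release()


def send_with_admission(controller, multicast, payload):
    """Send payload through multicast only if the controller admits it.

    multicast is a placeholder callable standing in for the real GCS send.
    """
    if not controller.try_admit():
        return False  # shed load instead of flooding the group
    try:
        multicast(payload)
        return True
    finally:
        # Simplification: the slot is freed when the send call returns.
        controller.release()
```

In a real deployment the slot would more likely be released when the GCS reports delivery rather than when the send call returns, and the cap itself would still have to be found by stress testing on the target platform.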

I am now off for the holidays.

Cheers,
Emmanuel

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com

