Re: Distributed System Software Recommendations

Gabriel Gunderson Tue, 25 Aug 2009 23:11:59 -0700

On Tue, Aug 25, 2009 at 11:59 PM, Shane Hathaway<[email protected]> wrote:
> In the last distributed system I helped build, we didn't feel good about
> having a central point of control (and failure), but in the end we
> decided that a fully distributed system would add unjustifiable
> complexity and expense.  Fully distributed systems seem to grow
> behaviors that are as hard to fix as human communication problems.


Yeah, my favorite example of a fully distributed system that seemed
"to grow behaviors that are as hard to fix as human communication
problems" was Amazon's messaging system that carried *gossip*.  How
human-like is that?

"""
At 9:41am PDT, we determined that servers within Amazon S3 were having
problems communicating with each other. As background information,
Amazon S3 uses a gossip protocol to quickly spread server state
information throughout the system. This allows Amazon S3 to quickly
route around failed or unreachable servers, among other things. When
one server connects to another as part of processing a customer's
request, it starts by gossiping about the system state. Only after
gossip is completed will the server send along the information related
to the customer request. On Sunday, we saw a large number of servers
that were spending almost all of their time gossiping and a
disproportionate amount of servers that had failed while gossiping.
With a large number of servers gossiping and failing while gossiping,
Amazon S3 wasn't able to successfully process many customer requests.
"""

http://status.aws.amazon.com/s3-20080720.html
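
For anyone who hasn't seen a gossip protocol up close, here is a
minimal Python sketch of the general idea the report describes: each
server keeps a view of everyone's state and swaps it with a randomly
chosen peer. This is only an illustration, not Amazon's implementation.
The names (merge, gossip_round, rounds_until_converged) and the
(version, status) entries are my own assumptions.

```python
import random


def merge(local, remote):
    """Merge a remote view into a local one, keeping the higher version per server."""
    merged = dict(local)
    for server, (version, status) in remote.items():
        if server not in merged or version > merged[server][0]:
            merged[server] = (version, status)
    return merged


def gossip_round(views, rng):
    """Each server swaps state with one randomly chosen peer (push-pull)."""
    names = list(views)
    for name in names:
        peer = rng.choice([n for n in names if n != name])
        combined = merge(views[name], views[peer])
        views[name] = combined
        views[peer] = dict(combined)


def rounds_until_converged(n_nodes, seed=0, max_rounds=100):
    """Count gossip rounds until every server knows about every other server."""
    rng = random.Random(seed)
    names = [f"s{i}" for i in range(n_nodes)]
    # Hypothetical starting point: each server only knows its own state.
    views = {name: {name: (1, "up")} for name in names}
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == n_nodes for view in views.values()):
            return round_number
    return None  # did not converge within max_rounds


if __name__ == "__main__":
    print(rounds_until_converged(8))
```

The sketch also hints at the failure mode in the report: if every
request has to wait for a gossip exchange first, then servers that are
busy (or failing) while gossiping stall the real work behind them.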

Also watch out for backbiting, speaking ill of others, spite and slander.

Best,
Gabe

