Re: Cluster Operation over Slow Links

2011-01-04 Thread Gregory Farnum
On Tue, Jan 4, 2011 at 8:09 PM, Matthew Roy wrote:
> The wiki talks about using crush maps to spread data out over
> different racks and servers, but there seems to be no reason crush
> couldn't be used to place data copies in different floors or nearby
> datacenters (think Equinix DC2 and DC7 in Washington - 1-3ms
> 'cross-connect' latency, large pipes). Thinking more broadly, the
> latency between DC7 and home users in Richmond, Virginia is 10-15ms,
> where a user with FiOS can easily get 15Mbps real speed in both
> directions. What is the performance impact of bandwidth and latency,
> especially between complete replica-sets? How far can ceph be pushed
> and what starts to hurt first? Does an entire cluster have to be in
> the same data center, the same city, or the same state?
Ceph will operate over whatever latency/bandwidth you give it; things
will just become really slow.
It's possible to massage things around to disguise latency in some
configurations, but the bottom line is that no matter what
configuration you use, the slowest interconnect is going to dominate
everything else. Over a 15Mbps connection you're going to get 15Mbps
write speeds at best, and that assumes a very carefully constructed
map that only sends data one way, etc.
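To make that concrete, here's a quick back-of-envelope calculation
using the numbers from your mail plus a hypothetical 4MB object size
(illustrative, not measured):

  # time to push one 4 MB write over a 15 Mbps WAN link with
  # ~15 ms one-way latency (numbers from the discussion above)
  OBJECT_BYTES = 4 * 1024 * 1024   # hypothetical 4 MB object
  LINK_BPS = 15e6                  # 15 Mbps link
  RTT_S = 0.030                    # ~15 ms each way

  transfer_s = OBJECT_BYTES * 8 / LINK_BPS  # serialization time
  total_s = transfer_s + RTT_S              # plus one round trip for the ack

  print("transfer %.2fs, total %.2fs" % (transfer_s, total_s))
  # -> roughly 2.24s per object; the slow link sets the pace no matter
  #    how fast the local disks and LAN-side replicas are

And replication makes this worse, not better: a write isn't
acknowledged until every replica has it, so the slowest replica link
paces the whole write path.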
At those kinds of speeds, you really need to use a
filesystem/application which is designed to handle that kind of
network topology. Ceph is designed for the other end of the spectrum.
-Greg


Cluster Operation over Slow Links

2011-01-04 Thread Matthew Roy
Has anyone tried to operate (or simulate) a cluster over various link
speeds? There was a mailing list question months ago about ceph over
"WAN" and the consensus was that it would not perform well - but
there's a broad spectrum of link speeds and latencies in the real
world - LAN and WAN are pretty blurry these days, especially in the
datacenters where large clusters will live.

The wiki talks about using crush maps to spread data out over
different racks and servers, but there seems to be no reason crush
couldn't be used to place data copies in different floors or nearby
datacenters (think Equinix DC2 and DC7 in Washington - 1-3ms
'cross-connect' latency, large pipes). Thinking more broadly, the
latency between DC7 and home users in Richmond, Virginia is 10-15ms,
where a user with FiOS can easily get 15Mbps real speed in both
directions. What is the performance impact of bandwidth and latency,
especially between complete replica-sets? How far can ceph be pushed
and what starts to hurt first? Does an entire cluster have to be in
the same data center, the same city, or the same state?
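
For illustration, a decompiled CRUSH map along these lines might look
roughly like the sketch below - every name, id, and weight is made up,
and the exact rule syntax may differ between versions, but the idea is
just to declare a 'datacenter' bucket type and have the rule descend
through it:

  # hypothetical map: two datacenters, one replica in each
  # devices
  device 0 osd.0
  device 1 osd.1

  # types
  type 0 osd
  type 1 host
  type 2 datacenter
  type 3 root

  # buckets
  host host-a {
          id -2
          alg straw
          hash 0  # rjenkins1
          item osd.0 weight 1.000
  }
  host host-b {
          id -3
          alg straw
          hash 0
          item osd.1 weight 1.000
  }
  datacenter dc2 {
          id -4
          alg straw
          hash 0
          item host-a weight 1.000
  }
  datacenter dc7 {
          id -5
          alg straw
          hash 0
          item host-b weight 1.000
  }
  root root {
          id -1
          alg straw
          hash 0
          item dc2 weight 1.000
          item dc7 weight 1.000
  }

  # rules
  rule cross-dc {
          ruleset 1
          type replicated
          min_size 2
          max_size 2
          step take root
          # pick one datacenter per replica, then one osd within each
          step choose firstn 0 type datacenter
          step chooseleaf firstn 1 type host
          step emit
  }

Something like that should compile with crushtool and load via 'ceph
osd setcrushmap' - assuming the usual tooling; I haven't actually
tried it across real datacenters.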

Matthew