Waed,

I've seen you return to this line of questioning several times, and I
assume that's because you are still trying to understand why Ceph
computes object locations rather than defining them statically. Ceph
calculates object placement and distributes data pseudo-randomly. This
has numerous benefits, but they aren't obvious at first:

1. *No Bottleneck or Single Point of Failure:* Instead of contacting a
centralized broker to look up and retrieve an object, a Ceph client
calculates where the object should be (based on the cluster map and the
CRUSH algorithm) and contacts the OSD directly. This eliminates a single
point of failure *and a performance bottleneck* under heavy loads (see
the sketch after this list). It's conceivable to have some sort of
round-robin algorithm build a data allocation table, but that requires
contacting the server each time you want to retrieve an object just to
determine its location, resulting in chatty sessions. It also makes
dynamic rebalancing a bit of a nightmare.

2. *Load Distribution:* By distributing data pseudo-randomly across the
cluster, Ceph generally avoids load spikes on any one OSD. In the
scenario you described, *you may actually get better performance by
having the objects on separate OSDs*, depending on their size. The time
to establish a connection is certainly a factor, but the total
throughput of the disk and the network card are also considerations. If
you want to read or write two large objects simultaneously, you would
likely get better performance by having them on separate OSDs/hosts,
because a single disk's sequential read/write throughput is usually the
bottleneck. For example, if one disk sustains roughly 100 MB/s of
sequential reads, two clients reading large objects from the same OSD
share that 100 MB/s, whereas two OSDs on separate disks can deliver up
to 200 MB/s in aggregate.

3. *Dynamic Rebalancing:* By computing where to store an object, Ceph
can dynamically rebalance the cluster. Clients don't need to "know"
where an object is in order to retrieve it; they only need to know the
current state of the cluster, which they retrieve from the monitor.
Calculating an object's location is much faster than a look-up, so there
is no performance penalty. The client and server don't need to be in
sync with respect to object locations, either; they only need to agree
on the current state of the cluster.
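
To make points 1 and 3 concrete, here is a minimal sketch of the idea
behind computed placement. This is *not* the real CRUSH algorithm or
Ceph's API; the names (cluster_map, pg_num, place_object) are purely
illustrative. It just shows how any client, given only an object name
and the current cluster map, can independently derive the same
placement, with no lookup table and no central broker:

    import hashlib
    import random

    # NOTE: a simplified stand-in for CRUSH, for illustration only.
    cluster_map = {
        "epoch": 42,                 # clients track the current map epoch
        "pg_num": 128,               # placement groups in the pool
        "osds": [0, 1, 2, 3, 4, 5],  # OSDs currently in the cluster
    }

    def place_object(name, cmap, replicas=2):
        # Hash the object name to a placement group: stable and uniform,
        # which is what spreads objects pseudo-randomly across the cluster.
        pg = int(hashlib.md5(name.encode()).hexdigest(), 16) % cmap["pg_num"]
        # Deterministically map the PG to a set of OSDs. Seeding with the
        # PG id and the map epoch means every client computes the same
        # answer, and the answer changes only when the cluster map does;
        # that is what makes dynamic rebalancing possible.
        rng = random.Random(pg * 100003 + cmap["epoch"])
        return pg, rng.sample(cmap["osds"], replicas)

    pg, osds = place_object("myobject", cluster_map)
    print("pg %d -> osds %s" % (pg, osds))  # identical on every client

The real CRUSH function is far more sophisticated, of course (it
accounts for weights, failure domains, and placement rules), but the
contract is the same: given the same inputs, every client computes the
same answer.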

You might also want to have a look at these sections of the documentation:

http://ceph.com/docs/master/install/hardware-recommendations/#data-storage
http://ceph.com/docs/master/rados/operations/crush-map/
http://ceph.com/docs/master/architecture/#how-ceph-scales
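
Incidentally, this also bears on your second question: there is no
command to make an OSD list its contents, but you can list a pool's
objects with "rados -p <pool> ls" and see the computed placement of any
one of them with "ceph osd map <pool> <object-name>", so you can
construct the mapping yourself, as Greg described.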



On Wed, Apr 10, 2013 at 1:56 PM, Waed Bataineh <promiselad...@gmail.com> wrote:

> On Wednesday, April 10, 2013, Gregory Farnum <g...@inktank.com> wrote:
> > On Wednesday, April 10, 2013 at 2:53 AM, Waed Bataineh wrote:
> >> Hello,
> >>
> >> I have several questions; I'd appreciate answers to them:
> >>
> >> 1. Does the OSD have a fixed size, or is it compatible with the
> >> machine I'm working with?
> >
> > You can weight OSDs to account for different capacities or speeds; is
> that what you're asking?
>
> >I know about the weight attribute, let's say. What I meant is: can we
> think of the OSDs as memory space, which will take a certain size from
> the machine's RAM?
> >> If that's the case, what is the equation?
> >>
> >> 2. I can list all the objects in a certain pool, but can we
> >> determine the objects on a specific OSD, via the command line I mean?
> >
> > Not really, no. You could construct the information by listing all
> the objects in each pool and calculating whether they live on the OSD
> in question, but there is no interface to have an OSD list its
> contents.
>
> >Ok.
> >> 3. Finally, does it differ if I'm reading two objects from the same
> >> OSD versus reading two objects from two OSDs?
> >
> > Differ how? There is no special handling for multiple reads from the
> same OSD, if that's what you're asking.
>
> > I meant the time factor. Take the following scenario: if we notice
> that a certain client always, or let's say frequently, needs to open
> two objects, and these two are on different OSDs, would it be better
> to place those objects on the same OSD?
> > -Greg
> > Software Engineer #42 @ http://inktank.com | http://ceph.com
> >
> >
> >



-- 
John Wilkins
Senior Technical Writer
Inktank
john.wilk...@inktank.com
(415) 425-9599
http://inktank.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
