On Fri, 21 Jun 2013, Stefan Hajnoczi wrote:
> >> but if there's really a case for it with performance profiles then I
> >> guess it would be necessary.  But we should definitely get feedback from
> >> the Ceph folks too.
> >
> >
> > The specific problem we are trying to solve (in case that's not
> > obvious) is the non-locality of data read/written by ceph. Whilst
> > you can use placement to localise data to the rack level, even if
> > one of your OSDs is in the machine you end up waiting on network
> > traffic. That is apparently hard to solve inside Ceph.
> 
> I'm not up-to-speed on Ceph architecture, is this because you need to
> visit a metadata server before you access the storage.  Even when the
> data is colocated on the same machine you'll need to ask the metadata
> server first?

The data location is determined by a fancy hash function (CRUSH), so there 
is no metadata lookup step and the client can contact the right server 
directly.  The trade-off is that the client doesn't get to choose where to 
write: the hash deterministically decides that from the object name and 
the current cluster state.
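
To make that concrete, here is a rough sketch of hash-based placement.  It 
is not CRUSH: it uses plain rendezvous (highest-random-weight) hashing as a 
stand-in, and the names place(), _score() and the OSD ids are made up for 
illustration.  The point is only that any client holding the same OSD list 
computes the same targets without asking anyone, and that it has no say in 
which OSDs those are.

```python
import hashlib


def _score(object_name: str, osd: str) -> int:
    # Hypothetical helper: hash the (object, OSD) pair to a 64-bit score.
    digest = hashlib.sha256(f"{object_name}/{osd}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place(object_name: str, osds: list[str], replicas: int = 2) -> list[str]:
    # Rank every OSD by its score for this object and keep the top few.
    # The result depends only on the object name and the OSD list
    # (standing in for "current cluster state"), never on the caller.
    ranked = sorted(osds, key=lambda osd: _score(object_name, osd), reverse=True)
    return ranked[:replicas]


# Assumed example cluster; a real cluster map carries weights and topology.
cluster = ["osd.0", "osd.1", "osd.2", "osd.3"]
targets = place("rbd_data.1234", cluster)
print(targets)
```

Nothing here lets a client on the host running osd.0 say "put it on 
osd.0"; it gets whatever the hash yields.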

In the end this means there is some lower bound on latency, because we are 
generally reading/writing over the network even if one replica happens to 
be on the local machine.

sage


