Also remember to drive your Ceph cluster as hard as you have the means to, e.g. by tuning the VM OSes and their I/O subsystems: use multiple RBD devices per VM (to keep more outstanding I/O coming from the VM's I/O subsystem), pick the best I/O scheduler, give each VM enough CPU and memory, and make sure you have low network latency and enough bandwidth between your rsyncing VMs, etc.
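For what it's worth, a rough way to see what more outstanding I/O does for you is to drive several RBD images from one client at the same time. Below is a minimal sketch using the librados/librbd Python bindings; the pool name, image names, and write counts are made up for illustration, and it assumes the images already exist (e.g. created beforehand with "rbd create"). It is only meant to show the idea of keeping several writes in flight at once, not to be a benchmark.

#!/usr/bin/env python
# Sketch only: drive several RBD images from one client concurrently,
# so more I/O is outstanding than a single serial rsync would produce.
# Assumes images test-img-0..test-img-3 already exist in the 'rbd' pool.
import os
import random
import threading

import rados
import rbd

POOL = 'rbd'                                     # assumed pool name
IMAGES = ['test-img-%d' % i for i in range(4)]   # assumed image names
BLOCK = 4096                                     # 4 KiB writes, i.e. small random I/O
WRITES_PER_IMAGE = 1000

def writer(cluster, name):
    # One thread per image, each with its own I/O context, so the client
    # always has roughly len(IMAGES) writes in flight at once.
    ioctx = cluster.open_ioctx(POOL)
    image = rbd.Image(ioctx, name)
    try:
        size = image.size()
        data = os.urandom(BLOCK)
        for _ in range(WRITES_PER_IMAGE):
            offset = random.randrange(size // BLOCK) * BLOCK
            image.write(data, offset)
    finally:
        image.close()
        ioctx.close()

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
try:
    threads = [threading.Thread(target=writer, args=(cluster, n)) for n in IMAGES]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
finally:
    cluster.shutdown()

If four concurrent writers give you noticeably more aggregate IOPS than one, that is a good sign the same trick (multiple RBD devices per VM, or more rsync streams) will help your real workload too.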
> On 01/05/2015, at 11.13, Piotr Wachowicz <piotr.wachow...@brightcomputing.com> wrote:
>
> Thanks for your answer, Nick.
>
> Typically it's a single rsync session at a time (sometimes two, but rarely more concurrently). So it's a single ~5GB typical linux filesystem from one random VM to another random VM.
>
> Apart from using RBD Cache, is there any other way to improve the overall performance of such a use case in a Ceph cluster?
>
> In theory I guess we could always tarball it, and rsync the tarball, thus effectively using sequential IO rather than random. But that's simply not feasible for us at the moment. Any other ways?
>
> Sidequestion: does using RBDCache impact the way data is stored on the client? (e.g. a write call returning after data has been written to the journal (fast) vs written all the way to the OSD data store (slow)). I'm guessing it's always the first one, regardless of whether the client uses RBDCache or not, right? My logic here is that otherwise that would imply that clients can impact the way OSDs behave, which could be dangerous in some situations.
>
> Kind Regards,
> Piotr
>
>
> On Fri, May 1, 2015 at 10:59 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> How many rsyncs are you doing at a time? If it is only a couple, you will not be able to take advantage of the full number of OSDs, as each block of data is only located on 1 OSD (not including replicas). When you look at disk statistics you are seeing an average over time, so it will look like the OSDs are not very busy, when in fact each one is busy for a very brief period.
>
> SSD journals will help your write latency, probably going down from around 15-30ms to under 5ms.
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Piotr Wachowicz
> Sent: 01 May 2015 09:31
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] How to estimate whether putting a journal on SSD will help with performance?
>
> Is there any way to confirm (beforehand) that using SSDs for journals will help?
>
> We're seeing very disappointing Ceph performance. We have 10GigE interconnect (as a shared public/internal network).
>
> We're wondering whether it makes sense to buy SSDs and put journals on them. But we're looking for a way to verify that this will actually help BEFORE we splash cash on SSDs.
>
> The problem is that the way we have things configured now, with journals on spinning HDDs (shared with the OSDs as the backend storage), apart from the slow read/write performance to Ceph I already mentioned, we're also seeing fairly low disk utilization on the OSDs.
>
> This low disk utilization suggests that the journals are not really used to their max, which begs the question whether buying SSDs for journals will help.
>
> This kind of suggests that the bottleneck is NOT the disk. But, yeah, we cannot really confirm that.
>
> Our typical data access use case is a lot of small random reads/writes. We're doing a lot of rsyncing (entire regular linux filesystems) from one VM to another.
>
> We're using Ceph for OpenStack storage (kvm). Enabling RBD cache didn't really help all that much.
>
> So, is there any way to confirm beforehand that using SSDs for journals will help in our case?
>
> Kind Regards,
> Piotr
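On the "confirm before buying SSDs" question in the quoted thread: one cheap sanity check is to time small synchronous writes from a client and compare against the sort of numbers Nick quotes (roughly 15-30 ms with journals on spinning disks, under ~5 ms with SSD journals). Below is a minimal sketch using the librados Python bindings; the pool name, object name, and ceph.conf path are placeholders, and it only measures whole-request write latency as seen by the client, not the journal in isolation.

#!/usr/bin/env python
# Sketch: time small synchronous writes to get a feel for per-write latency.
# If most of the latency is disk time on the OSDs, SSD journals are likely
# to help; if it is mostly network/CPU, they probably will not.
# Pool and object names below are made up.
import os
import time

import rados

POOL = 'rbd'               # assumed pool name
OBJ = 'latency-probe'      # throwaway object, removed at the end
BLOCK = 4096
SAMPLES = 500

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx(POOL)
try:
    data = os.urandom(BLOCK)
    latencies = []
    for _ in range(SAMPLES):
        start = time.time()
        # write_full replaces the object contents and returns once the
        # write has been acknowledged by the OSDs.
        ioctx.write_full(OBJ, data)
        latencies.append(time.time() - start)
    latencies.sort()
    print("min %.1f ms  median %.1f ms  max %.1f ms" % (
        latencies[0] * 1000,
        latencies[len(latencies) // 2] * 1000,
        latencies[-1] * 1000))
    ioctx.remove_object(OBJ)
finally:
    ioctx.close()
    cluster.shutdown()

If the median is up in the tens of milliseconds and iostat shows the spinners seeking constantly during the run, SSD journals are a reasonable bet; if it is already in the low single digits, look elsewhere first.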
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com