On 22/06/16 17:54, Andrei Mikhailovsky wrote:
> Hi Daniel,
>
> Many thanks for your useful tests and your results.
>
> How much IO wait do you have on your client vms? Has it significantly
> increased or not?
>

Hi Andrei,

Bearing in mind that this cluster is tiny (four nodes, each with four
OSDs), our metrics may not be that meaningful. However, on a VM that is
running ElasticSearch, collecting logs from Graylog, we're seeing no
more than about 5% iowait for a 5s period, and most of the time it's
below 1%. This VM is really not writing a lot of data though.

The cluster as a whole is peaking at only about 1200 write op/s,
according to ceph -w.

Executing a "sync" in a VM does of course have a noticeable delay due to
the recovery happening in the background, but nothing is waiting for IO
long enough to trigger the kernel's 120s timer / warning.

The recovery has been running for about four hours now, and is down to
20% misplaced objects. So far we have not had any clients block
indefinitely, so I think the migration of VMs to Jewel-capable
hypervisors did the trick.

Best,
Daniel
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
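
For anyone who wants to take a comparable reading inside a guest, here is a minimal sketch of how an iowait percentage over a 5s window can be derived from two samples of the aggregate "cpu" line in /proc/stat. This is not necessarily how the figures above were collected (the tool used is not stated). The function names are made up for illustration, and the sketch assumes a Linux guest whose /proc/stat exposes at least the user, nice, system, idle, iowait, irq, softirq and steal columns.

```python
import time


def parse_cpu_line(line):
    """Return (iowait_ticks, total_ticks) from the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Columns: user nice system idle iowait irq softirq steal.
    # Guest time is already included in user/nice, so it is left out of the sum.
    ticks = [int(value) for value in fields[1:9]]
    return ticks[4], sum(ticks)


def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    io_before, total_before = parse_cpu_line(before)
    io_after, total_after = parse_cpu_line(after)
    elapsed = total_after - total_before
    if elapsed <= 0:
        return 0.0
    return 100.0 * (io_after - io_before) / elapsed


def sample_iowait(interval=5.0, path="/proc/stat"):
    """Read the aggregate cpu line twice, 'interval' seconds apart."""
    with open(path) as stat_file:
        before = stat_file.readline()
    time.sleep(interval)
    with open(path) as stat_file:
        after = stat_file.readline()
    return iowait_percent(before, after)
```

Calling sample_iowait() in a VM during recovery and again once the cluster is healthy gives two numbers that can be compared directly. The value is averaged over all CPUs in the guest, so a single blocked thread on a many-core VM will show up as a small percentage.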