Hi Adam,
I suggest setting the noout flag before removing the drive, so that the
cluster doesn't start to rebalance. Then put in the new disk and mark
it in; then unset noout.
That way the only network traffic you get is the recovery that fills
the new disk, and no rebalancing.
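Roughly, the flag is toggled with the standard ceph CLI around the swap
(a sketch only; the exact steps for stopping the OSD and swapping the
disk depend on your setup):

  # keep the cluster from marking the down OSD out and starting recovery
  ceph osd set noout

  # ... stop the OSD, swap the physical disk, bring the new OSD up ...

  # let the cluster behave normally again once the new OSD is up and in
  ceph osd unset noout

You can follow the recovery onto the new disk with "ceph -w".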
On 28/10/14 16:05, Adam Thompson wrote:
On 14-10-28 10:03 AM, Adam Thompson wrote:
I'm seeing ridiculous I/O latency after out'ing and re-in'ing a disk
in the Ceph array; the OSD monitor tab shows two OSDs (i.e. disks)
having latency above 10 ms - they're both in the 200 ms range - but
reading a single uncached sector from a virtual disk takes >10 s.
It's bad enough that all my virtualized DNS servers are timing out,
which of course directly impacts service.
During normal (non-rebuild, non-rebalance) operation, Ceph is not
terribly fast at writes, but delivers acceptable read speeds.
Where do I start looking for problems? Are there any knobs I should
be tweaking for Ceph?
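A reasonable place to start is Ceph's own health and latency reporting
(assuming the stock CLI tools; exact output varies by release):

  ceph health detail   # degraded/recovering PGs and blocked ("slow") requests
  ceph osd perf        # per-OSD filestore commit/apply latency, in ms
  ceph -w              # live event stream, including slow-request warnings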
A related question: to proactively replace a disk, I'm doing
Stop->Out->Remove / swap disk / Create OSD. Is that a viable
procedure? Other than the rebuild I/O starving regular reads, it
seems to be working...
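For reference, the CLI equivalent of that GUI sequence is roughly the
following (a sketch for an example OSD id of 3 and a new disk at
/dev/sdX, both placeholders; service names and pveceph syntax depend on
your PVE/Ceph version):

  service ceph stop osd.3        # Stop
  ceph osd out 3                 # Out
  ceph osd crush remove osd.3    # Remove: drop it from the CRUSH map,
  ceph auth del osd.3            #   delete its auth key,
  ceph osd rm 3                  #   and remove the OSD entry
  # swap the physical disk, then recreate the OSD on the new device:
  pveceph createosd /dev/sdX     # or the GUI "Create: OSD" dialog

With noout set as suggested above, the cluster won't start
re-replicating the moment the old OSD goes down.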
--
Technical Director
Binovo IT Human Project, S.L.
Tel. 943575997 / 943493611
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es