On 14-10-28 10:11 AM, Eneko Lacunza wrote:
> Hi Adam,
>
> You only have 3 OSDs in the Ceph cluster?
>
> What about the journals? Are they inline, or on a separate (SSD?) disk?
>
> What about the network? Do you have a physically independent network for Proxmox/VMs and Ceph?
>
> We currently have a 6-OSD, 3-node Ceph cluster; doing an out/in of an OSD doesn't have a very high impact. If you bring in a new OSD (replacing a disk), the impact is noticeable, but our ~30 VMs remained workable. We do have physically separate networks for Proxmox/VMs and Ceph (1 Gbit).

4 nodes.
2 OSDs per node.
Journals are on the same drive as the OSDs, unfortunately... the nodes only have 3 drive bays each.

Each node has 4 x 1Gb NICs in an LACP bond, using Open vSwitch, with VLANs on top of that and a dedicated VLAN for Ceph and Proxmox management. Total network bandwidth in use from each node during the rebuild is only ~1.5Gbps, with no single LACP member ever bursting higher than ~600Mbps. I believe it's unlikely to be a network problem; I've stress-tested OVS at much higher data rates than this.
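(For the curious: I'm just eyeballing the per-member rates with plain sysstat, nothing fancy:)

    # per-interface throughput, sampled every second (sysstat package)
    sar -n DEV 1
    # or read the raw byte counters directly
    cat /proc/net/dev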

You mention setting 'noout'; is there a way to do that in the GUI, or should I just do it at the Ceph CLI with "ceph osd set noout"? I can see that this would skip one rebalancing step, but I still have to rebalance after I replace each disk, don't I?
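For reference, the sequence I have in mind is roughly the following; the OSD number is just an example, and the exact service invocation depends on the init setup:

    ceph osd set noout           # down OSDs won't be marked "out", so no rebalance while the disk is out
    service ceph stop osd.3      # stop the OSD whose disk is being replaced
    # ...swap the drive and recreate the OSD on the new disk...
    service ceph start osd.3     # backfill onto the new disk starts once it's up and "in"
    ceph osd unset noout         # back to normal failure handling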

FWIW, I'm replacing 8x 250GB disks with 8x 500GB disks that became available from another storage cluster. I'm almost done at this point... just want to know how to avoid the massive performance hit next time.
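One thing I plan to try next time, in case it helps anyone else: throttling backfill so recovery doesn't starve client I/O. Untested on my cluster, and the values are only guesses, but something like:

    # limit each OSD to 1 concurrent backfill and 1 active recovery op (defaults are much higher)
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1'
    # the persistent equivalents in the [osd] section of ceph.conf are
    # "osd max backfills" and "osd recovery max active"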

Oh, and on the node with the new disk, I see IOWAIT times of ~15%, which makes sense IMHO, since I'm writing a ton of data to the new disk.
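(That's just iostat from the sysstat package, if anyone wants to compare numbers:)

    # extended per-device stats every 5 seconds; await and %util spike on the new disk
    iostat -x 5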

--
-Adam Thompson
 [email protected]

_______________________________________________
pve-user mailing list
[email protected]
http://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-user
