Re: [ceph-users] Better way to use osd's of different size
On Wed, Jan 14, 2015 at 3:36 PM, Межов Игорь Александрович wrote:
> What is the better way to do it:
> - replace 12x1TB drives with 12x2TB drives, so we will have 2 nodes full
>   of 2TB drives and the other nodes remain in a 12x1TB config
> - or replace 1TB with 2TB drives in a more uniform way, so every node
>   will have 6x1TB + 6x2TB drives?
>
> I feel that the second way will give a smoother distribution among the
> nodes, and an outage of one node may have less impact on the cluster.
> Am I right, and what can you advise me in such a situation?

You are correct. The CRUSH weight assigned to an OSD depends on its capacity, so in order to fill the cluster evenly Ceph has to write twice as quickly to a 2TB OSD as to a 1TB OSD. If some nodes had all the big drives, then the network interfaces to those nodes would be overloaded compared with the network interfaces to the other nodes.

However, even if the drives are spread out across nodes such that there is no network imbalance, you will still have a local imbalance within each node: if you are writing (across many PGs) at 100MB/s to the 2TB drives, then you will only be writing at 50MB/s to the 1TB drives.

You could solve this in turn with some creative arrangement of pools and CRUSH rules, making sure that each pool only uses a single drive size: that way you would have two pools that each get full bandwidth, but one pool would be smaller than the other. But if you don't care about the bandwidth under-utilization on the older drives, then that would be unnecessary complication.
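If you did want to try it, it would look roughly like the following (an untested sketch; the bucket, rule and pool names are made up), using a separate CRUSH root per drive size. Note that on firefly the pool setting is still called crush_ruleset:

    # One CRUSH root per drive size; move per-size host buckets under
    # them (the node1-* bucket names here are hypothetical):
    ceph osd crush add-bucket root-1tb root
    ceph osd crush add-bucket root-2tb root
    ceph osd crush move node1-1tb root=root-1tb
    ceph osd crush move node1-2tb root=root-2tb

    # One rule per root, replicating across hosts:
    ceph osd crush rule create-simple rule-1tb root-1tb host
    ceph osd crush rule create-simple rule-2tb root-2tb host

    # Point each pool at the matching rule (look up the rule ids with
    # "ceph osd crush rule dump"):
    ceph osd pool set pool-1tb crush_ruleset 1
    ceph osd pool set pool-2tb crush_ruleset 2

John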
Re: [ceph-users] Better way to use osd's of different size
Hi Megov,

you should weight the OSDs so that the weight represents their size (like a weight of 3.68 for a 4TB HDD); ceph-deploy does this automatically. Nevertheless, even with the correct weights the disks will not fill in an equal distribution. For that purpose you can reweight single OSDs by hand, or automatically with "ceph osd reweight-by-utilization" (a sketch of both is below the quote).

Udo

On 14.01.2015 16:36, Межов Игорь Александрович wrote:
> During migration we temporarily used 1U nodes with 2TB OSDs and already
> faced some problems with uneven distribution. I know that the best
> practice is to use OSDs of the same capacity, but it is sometimes
> impossible.
>
> Now we have 24-28 spare 2TB drives and want to increase capacity in the
> same boxes. What is the better way to do it:
> - replace 12x1TB drives with 12x2TB drives, so we will have 2 nodes full
>   of 2TB drives and the other nodes remain in a 12x1TB config
> - or replace 1TB with 2TB drives in a more uniform way, so every node
>   will have 6x1TB + 6x2TB drives?
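To make that concrete, a rough sketch (untested; osd.12 is just a placeholder id, and CRUSH weights are conventionally the capacity in TiB, so about 1.82 for a 2TB drive):

    # Permanent CRUSH weight; set once to match the drive's capacity:
    ceph osd crush reweight osd.12 1.82

    # Temporary override between 0.0 and 1.0 to push data off an
    # over-full OSD:
    ceph osd reweight 12 0.85

    # Or let Ceph pick over-full OSDs automatically; the argument is
    # the utilization threshold in percent (default 120):
    ceph osd reweight-by-utilization 120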
[ceph-users] Better way to use osd's of different size
Hi!

We have a small production Ceph cluster, based on the firefly release. It was built using hardware we already had on site, so it is not "new & shiny", but it works quite well. It started in 2014-09 as a "proof of concept" with 4 hosts of 3 x 1TB OSDs each: 1U dual-socket Intel 54XX & 55XX platforms on a 1Gbit network. Now it contains 4 nodes of 12 OSDs on a shared 10Gbit network. We use it as a backing store for VMs running under qemu+rbd.

During migration we temporarily used 1U nodes with 2TB OSDs and already faced some problems with uneven distribution. I know that the best practice is to use OSDs of the same capacity, but it is sometimes impossible.

Now we have 24-28 spare 2TB drives and want to increase capacity in the same boxes. What is the better way to do it:
- replace 12x1TB drives with 12x2TB drives, so we will have 2 nodes full of 2TB drives and the other nodes remain in a 12x1TB config
- or replace 1TB with 2TB drives in a more uniform way, so every node will have 6x1TB + 6x2TB drives?

I feel that the second way will give a smoother distribution among the nodes, and an outage of one node will have less impact on the cluster. Am I right, and what can you advise me in such a situation?
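Back-of-the-envelope for the node-outage point (assuming 4 nodes of 12 OSDs each and 24 of the spare drives; raw capacity only):

    Option 1: 2 nodes x 12 x 2TB = 24TB each, 2 nodes x 12 x 1TB = 12TB each
              -> 72TB raw total; losing a big node takes out 24/72 = 1/3
    Option 2: 4 nodes x (6 x 2TB + 6 x 1TB) = 18TB each
              -> 72TB raw total; losing any node takes out 18/72 = 1/4

Megov Igor
yuterra.ru, CIO
me...@yuterra.ru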