Re: [ceph-users] Better way to use osd's of different size

2015-01-16 Thread John Spray
On Wed, Jan 14, 2015 at 3:36 PM, Межов Игорь Александрович wrote:
> What is the better way to do it:
>
> - replace 12x1tb drives with 12x2tb drives, so we will have 2 nodes full
> of 2tb drives and the other nodes remain in the 12x1tb config,
>
> - or spread the replacements more evenly, so every node will have
> 6x1tb + 6x2tb drives?
>
> I feel that the second way will give a smoother distribution among the
> nodes, and the outage of one node will have less impact on the cluster.
> Am I right, and what would you advise in such a situation?

You are correct.  The CRUSH weight assigned to an OSD depends on its
capacity, so in order to fill a cluster evenly we have to write 2x as
quickly to a 2TB OSD as to a 1TB OSD.  If some nodes had all the big
drives, then the network interfaces to those nodes would be overloaded
compared with the network interfaces to the other nodes.
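
As a rough sketch of the numbers (assuming the usual convention of CRUSH
weight = capacity in TiB, so roughly 0.91 per 1tb OSD and 1.82 per 2tb
OSD), you can compare the aggregate per-host weights that CRUSH balances
writes against:

    # show per-OSD and aggregate per-host CRUSH weights
    ceph osd tree

    # 12x2tb host   -> host weight ~21.8
    # 12x1tb host   -> host weight ~10.9  (gets about half the PGs and writes)
    # 6x1tb + 6x2tb -> ~16.4 on every host, so the network load stays even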

However, even if the drives are spread out across nodes such that
there is no network imbalance, you will still have the local imbalance
within a node: if you are writing (across many PGs) 100MB/s to the 2TB
drives then you will only be writing 50MB/s to the 1TB drives.  You
could solve this in turn with some creative arrangement of pools with
CRUSH rules to make sure that each pool was only using a single drive
size: that way you could have two pools that each got full bandwidth,
but one pool would be smaller than the other.  But if you don't care
about the bandwidth under-utilization on the older drives, then that
would be an unnecessary complication.
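
A minimal sketch of that arrangement (all bucket, rule and pool names and
OSD ids below are made up for illustration; weights are capacity in TiB):

    # separate CRUSH root with its own per-host buckets for the 2tb drives
    ceph osd crush add-bucket root-2tb root
    ceph osd crush add-bucket node1-2tb host
    ceph osd crush move node1-2tb root=root-2tb

    # place each 2tb OSD under its logical host
    ceph osd crush set osd.12 1.82 root=root-2tb host=node1-2tb

    # a replicated rule that only chooses from that root, and a pool using it
    ceph osd crush rule create-simple rule-2tb root-2tb host
    ceph osd crush rule dump                     # note the new ruleset id
    ceph osd pool create rbd-2tb 1024 1024
    ceph osd pool set rbd-2tb crush_ruleset 3    # id reported by the dump

Doing the same with a root-1tb/rule-1tb pair gives one pool per drive
size, each getting the full bandwidth of its own set of spindles.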

John


Re: [ceph-users] Better way to use osd's of different size

2015-01-16 Thread Udo Lembke
Hi Megov,
you should weight the OSDs so that the weight represents their size
(e.g. a weight of 3.68 for a 4TB HDD). ceph-deploy does this
automatically.

Nevertheless, even with correct weights the disks will not fill up in a
perfectly equal distribution. For that purpose you can reweight single
OSDs by hand, or do it automatically with "ceph osd reweight-by-utilization".
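
For example (the OSD ids and values below are only illustrative):

    # permanent CRUSH weight = drive size in TiB, e.g. for a 2TB disk
    ceph osd crush reweight osd.7 1.82

    # temporary override (0.0-1.0) to push data off an over-full OSD
    ceph osd reweight 7 0.85

    # or let ceph lower the reweight of all OSDs that are above 120%
    # of the average utilization
    ceph osd reweight-by-utilization 120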

Udo

On 14.01.2015 16:36, Межов Игорь Александрович wrote:
> [...]
>
> What is the better way to do it:
>
> - replace 12x1tb drives with 12x2tb drives, so we will have 2 nodes full
> of 2tb drives and the other nodes remain in the 12x1tb config,
>
> - or spread the replacements more evenly, so every node will have
> 6x1tb + 6x2tb drives?



[ceph-users] Better way to use osd's of different size

2015-01-14 Thread Межов Игорь Александрович
Hi!

We have a small production Ceph cluster, based on the firefly release.

It was built from hardware we already had on site, so it is not "new &
shiny", but it works quite well. It was started in 2014.09 as a "proof of
concept" on 4 hosts with 3 x 1tb osd's each: 1U dual-socket Intel 54XX &
55XX platforms on a 1 gbit network.

Now it contains 4 x 12-osd nodes on a shared 10Gbit network. We use it as
a backstore for running VMs under qemu+rbd.

During the migration we temporarily used 1U nodes with 2tb osds and have
already faced some problems with uneven distribution. I know that the
best practice is to use osds of the same capacity, but sometimes that is
impossible.

Now we have 24-28 spare 2tb drives and want to increase capacity on the
same boxes. What is the better way to do it:

- replace 12x1tb drives with 12x2tb drives, so we will have 2 nodes full
of 2tb drives and the other nodes remain in the 12x1tb config,

- or spread the replacements more evenly, so every node will have
6x1tb + 6x2tb drives?

I feel that the second way will give a smoother distribution among the
nodes, and the outage of one node will have less impact on the cluster.
Am I right, and what would you advise in such a situation?




Megov Igor
yuterra.ru, CIO
me...@yuterra.ru