Re: [ceph-users] Expanding ceph cluster by adding more OSDs

Guang Sat, 02 Nov 2013 08:07:29 -0700

Hi Kyle,
Thanks for your response. Though I haven't tested it, my gut feeling is the 
same: changing the PG number may result in re-shuffling of the data.

In terms of the strategy you mentioned to expand a cluster, I have a few 
questions:
  1. My understanding is that adding a LITTLE more weight each time is meant to 
reduce the load on the OSD being added. Is that right? If so, can we use the 
throttle settings to achieve the same goal?
  2. If I would like to expand the cluster by 30% capacity every quarter, 
adding new capacity this way might take a long time. Is my understanding 
correct?
  3. Is there any automatic tool to do this, or will I need to monitor closely, 
dump the CRUSH rule, edit it and push it back?

I am testing a scenario that adds one OSD at a time (I have 330 OSDs in total), 
using the default weight. There are a couple of observations: 1) recovery 
starts quickly (several hundred MB/s) and then slows to around 10MB/s. 2) It 
impacts the online traffic quite a lot (from my observation, mainly for the 
recovering PGs).

I tried to search for best practices for expanding a cluster, without luck. 
Would anybody like to share their experience? Thanks very much.

Thanks,
Guang

Date: Thu, 10 Oct 2013 05:15:27 -0700
From: Kyle Bader <kyle.ba...@gmail.com>
To: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Expanding ceph cluster by adding more OSDs
Message-ID:
        <cafmfnwq+hbgsezme3vwom_gqcwikd1393rxc+xb0xgt4nxq...@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

I've contracted and expanded clusters by up to a rack of 216 OSDs - 18
nodes, 12 drives each.  New disks are configured with a CRUSH weight of 0
and I slowly add weight (0.1 to 0.01 increments), wait for the cluster to
become active+clean and then add more weight. I was expanding after
contraction so my PG count didn't need to be corrected; I tend to be
liberal and opt for more PGs.  If I hadn't contracted the cluster prior to
expanding it I would probably add PGs after all the new OSDs have finished
being weighted into the cluster.
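
The weight-in procedure described above could be scripted roughly as follows. This is only a minimal sketch, not a tested tool: the helper names (`weight_steps`, `run_ceph`, `cluster_is_clean`, `weight_in`) are hypothetical, and it assumes that `ceph health` reporting `HEALTH_OK` is an acceptable stand-in for "all PGs active+clean". It uses `ceph osd crush reweight` to raise the CRUSH weight one increment at a time and waits between steps.

```python
import subprocess
import time
from typing import Callable, List


def weight_steps(target: float, increment: float) -> List[float]:
    """Return the cumulative CRUSH weights to apply, ending exactly at target."""
    if target <= 0 or increment <= 0:
        raise ValueError("target and increment must be positive")
    steps = []
    n = 1
    # Small tolerance so float error does not produce a duplicate last step.
    while n * increment < target - 1e-9:
        steps.append(round(n * increment, 4))
        n += 1
    steps.append(round(target, 4))
    return steps


def run_ceph(args: List[str]) -> str:
    """Run the ceph CLI with the given arguments and return its stdout."""
    result = subprocess.run(
        ["ceph", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def cluster_is_clean(run: Callable[[List[str]], str] = run_ceph) -> bool:
    # Assumption: HEALTH_OK is treated as "every PG is active+clean".
    return run(["health"]).startswith("HEALTH_OK")


def weight_in(
    osd_id: int,
    target: float,
    increment: float,
    run: Callable[[List[str]], str] = run_ceph,
    poll_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Raise one OSD's CRUSH weight step by step, waiting for recovery each time."""
    for weight in weight_steps(target, increment):
        run(["osd", "crush", "reweight", f"osd.{osd_id}", str(weight)])
        while not cluster_is_clean(run):
            sleep(poll_seconds)
```

For example, `weight_in(12, target=1.0, increment=0.1)` would issue ten reweight commands for `osd.12`, pausing after each until the cluster reports healthy again. The `run` and `sleep` parameters exist only so the loop can be dry-run without touching a real cluster.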

On Wed, Oct 9, 2013 at 8:55 PM, Michael Lowe <j.michael.l...@gmail.com> wrote:

> I had those same questions, I think the answer I got was that it was
> better to have too few pg's than to have overloaded osd's.  So add osd's
> then add pg's.  I don't know the best increments to grow in, probably
> depends largely on the hardware in your osd's.
> 
>> On Oct 9, 2013, at 11:34 PM, Guang <yguan...@yahoo.com> wrote:
>> 
>> Thanks Mike. I get your point.
>> 
>> There are still a few things confusing me:
>> 1) We expand Ceph cluster by adding more OSDs, which will trigger
>> re-balance PGs across the old & new OSDs, and likely it will break the
>> optimized PG numbers for the cluster.
>> 2) We can add more PGs which will trigger re-balance objects across
>> old & new PGs.
>> 
>> So:
>> 1) What is the recommended way to expand the cluster by adding OSDs
>> (and potentially adding PGs), should we do them at the same time?
>> 2) What is the recommended way to scale a cluster from like 1PB to 2PB,
>> should we scale it to like 1.1PB to 1.2PB or move to 2PB directly?
>> 
>> Thanks,
>> Guang
>> 
>>> On Oct 10, 2013, at 11:10 AM, Michael Lowe wrote:
>>> 
>>> There used to be, can't find it right now.  Something like 'ceph osd
>>> set pg_num <num>' then 'ceph osd set pgp_num <num>' to actually move your
>>> data into the new pg's.  I successfully did it several months ago, when
>>> bobtail was current.
>>> 
>>>> On Oct 9, 2013, at 10:30 PM, Guang <yguan...@yahoo.com> wrote:
>>>> 
>>>> Thanks Mike.
>>>> 
>>>> Is there any documentation for that?
>>>> 
>>>> Thanks,
>>>> Guang
>>>> 
>>>>> On Oct 9, 2013, at 9:58 PM, Mike Lowe wrote:
>>>>> 
>>>>> You can add PGs,  the process is called splitting.  I don't think PG
>>>>> merging, the reduction in the number of PGs, is ready yet.
>>>>> 
>>>>>> On Oct 8, 2013, at 11:58 PM, Guang <yguan...@yahoo.com> wrote:
>>>>>> 
>>>>>> Hi ceph-users,
>>>>>> Ceph recommends the PGs number of a pool is (100 * OSDs) / Replicas,
>>>>>> per my understanding, the number of PGs for a pool should be fixed even we
>>>>>> scale out / in the cluster by adding / removing OSDs, does that mean if we
>>>>>> double the OSD numbers, the PG number for a pool is not optimal any more
>>>>>> and there is no chance to correct it?
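
To make the formula quoted above concrete, here is a small sketch that evaluates (100 * OSDs) / Replicas. The function names are hypothetical. Rounding the result up to a power of two is not stated in this thread; it is an added assumption based on the usual Ceph guidance. The replica count of 3 in the usage note is likewise an assumption, while 330 is the OSD count mentioned elsewhere in the thread.

```python
import math


def raw_pg_target(num_osds: int, replicas: int, pgs_per_osd: int = 100) -> float:
    """Evaluate (pgs_per_osd * OSDs) / Replicas as quoted in the thread."""
    if num_osds <= 0 or replicas <= 0 or pgs_per_osd <= 0:
        raise ValueError("all arguments must be positive")
    return (pgs_per_osd * num_osds) / replicas


def recommended_pg_count(num_osds: int, replicas: int, pgs_per_osd: int = 100) -> int:
    """Round the raw target up to the next power of two (an assumption)."""
    raw = math.ceil(raw_pg_target(num_osds, replicas, pgs_per_osd))
    # (raw - 1).bit_length() gives the exponent of the next power of two >= raw.
    return 1 << (raw - 1).bit_length()


if __name__ == "__main__":
    for osds in (330, 660):
        print(osds, raw_pg_target(osds, 3), recommended_pg_count(osds, 3))
```

With 330 OSDs and 3 replicas the raw target is 11000, which rounds up to 16384. Doubling to 660 OSDs gives 22000, which rounds up to 32768, so doubling the OSD count does move the suggested PG count for the pool.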
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Guang
>>>>>> _______________________________________________
>>>>>> ceph-users mailing list
>>>>>> ceph-users@lists.ceph.com
>>>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> 
