Actually, neither of our solutions works very well.  Frequently the same OSD
was chosen for multiple chunks:


8.72     9751         0          0        0  40895512576            0            0  1302                   active+clean     2h  224790'12801   225410:49810    [13,1,14,11,18,2,19,13]p13    [13,1,14,11,18,2,19,13]p13  2021-05-11T22:41:11.332885+0000  2021-05-11T22:41:11.332885+0000
8.7f     9695         0          0        0  40661680128            0            0  2184                   active+clean     5h  224790'12850   225409:57529      [8,17,4,1,14,0,19,8]p8        [8,17,4,1,14,0,19,8]p8  2021-05-11T22:41:11.332885+0000  2021-05-11T22:41:11.332885+0000
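
A loop along the lines of the ones quoted below should list every PG whose up
set repeats an OSD (untested sketch):

# for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
>   dups=$(ceph pg map $pg -f json | jq -r '.up[]' | sort -n | uniq -d)  # ids listed more than once
>   [ -n "$dups" ] && echo "$pg repeats osd(s):" $dups
> done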

I'm now considering using device classes and assigning the OSDs to either hdd1 
or hdd2...  Unless someone has another idea?
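
Roughly what I'm picturing (untested sketch; the rule name, rule id, and which
OSD goes into which class are placeholders):

# ceph osd crush rm-device-class osd.0
# ceph osd crush set-device-class hdd1 osd.0
# ceph osd crush rm-device-class osd.1
# ceph osd crush set-device-class hdd2 osd.1
(...and so on, so every OSD ends up in exactly one of the two classes and at
least four hosts have OSDs in each class)

rule cephfs_data_ec62_split {
        id 2
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default class hdd1
        step chooseleaf indep 4 type host
        step emit
        step take default class hdd2
        step chooseleaf indep 4 type host
        step emit
}

Since no OSD is in both classes the two take/emit passes can't pick the same
OSD twice, and each host should get at most two chunks (one per class).  I'd
also run the candidate rule through crushtool before injecting it, something
like:

# ceph osd getcrushmap -o crushmap
# crushtool -d crushmap -o crushmap.txt
(add the rule above to crushmap.txt)
# crushtool -c crushmap.txt -o crushmap.new
# crushtool -i crushmap.new --test --rule 2 --num-rep 8 --show-mappings | head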

Thanks,
Bryan

> On May 14, 2021, at 12:35 PM, Bryan Stillwell <bstillw...@godaddy.com> wrote:
> 
> This works better than my solution.  It allows the cluster to put more PGs on 
> the systems with more space on them:
> 
> # for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
>>  echo $pg
>>  for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
>>    ceph osd find $osd | jq -r '.host'
>>  done | sort | uniq -c | sort -n -k1
>> done
> 8.0
>      1 excalibur
>      1 mandalaybay
>      2 aladdin
>      2 harrahs
>      2 paris
> 8.1
>      1 aladdin
>      1 excalibur
>      1 harrahs
>      1 mirage
>      2 mandalaybay
>      2 paris
> 8.2
>      1 aladdin
>      1 mandalaybay
>      2 harrahs
>      2 mirage
>      2 paris
> ...
> 
> Thanks!
> Bryan
> 
>> On May 13, 2021, at 2:58 AM, Ján Senko <ja...@protonmail.ch> wrote:
>> 
>> Would something like this work?
>> 
>> step take default
>> step choose indep 4 type host
>> step chooseleaf indep 1 type osd
>> step emit
>> step take default
>> step choose indep 0 type host
>> step chooseleaf indep 1 type osd
>> step emit
>> 
>> J.
>> 
>> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
>> 
>> On Wednesday, May 12th, 2021 at 17:58, Bryan Stillwell 
>> <bstillw...@godaddy.com> wrote:
>> 
>>> I'm trying to figure out a CRUSH rule that will spread data out across my 
>>> cluster as much as possible, but not more than 2 chunks per host.
>>> 
>>> If I use the default rule with an osd failure domain like this:
>>> 
>>> step take default
>>> step choose indep 0 type osd
>>> step emit
>>> 
>>> I get clustering of 3-4 chunks on some of the hosts:
>>> 
>>> for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
>>>>   echo $pg
>>>>   for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
>>>>     ceph osd find $osd | jq -r '.host'
>>>>   done | sort | uniq -c | sort -n -k1
>>>> done
>>> 
>>> 8.0
>>>      1 harrahs
>>>      3 paris
>>>      4 aladdin
>>> 8.1
>>>      1 aladdin
>>>      1 excalibur
>>>      2 mandalaybay
>>>      4 paris
>>> 8.2
>>>      1 harrahs
>>>      2 aladdin
>>>      2 mirage
>>>      3 paris
>>> ...
>>> 
>>> However, if I change the rule to use:
>>> 
>>> step take default
>>> step choose indep 0 type host
>>> step chooseleaf indep 2 type osd
>>> step emit
>>> 
>>> I get the data spread across 4 hosts with 2 chunks per host:
>>> 
>>> for pg in $(ceph pg ls-by-pool cephfs_data_ec62 -f json | jq -r '.pg_stats[].pgid'); do
>>>>   echo $pg
>>>>   for osd in $(ceph pg map $pg -f json | jq -r '.up[]'); do
>>>>     ceph osd find $osd | jq -r '.host'
>>>>   done | sort | uniq -c | sort -n -k1
>>>> done
>>> 
>>> 8.0
>>>      2 aladdin
>>>      2 harrahs
>>>      2 mandalaybay
>>>      2 paris
>>> 8.1
>>>      2 aladdin
>>>      2 harrahs
>>>      2 mandalaybay
>>>      2 paris
>>> 8.2
>>>      2 harrahs
>>>      2 mandalaybay
>>>      2 mirage
>>>      2 paris
>>> ...
>>> 
>>> Is it possible to get the data to spread out over more hosts? I plan on 
>>> expanding the cluster in the near future and would like to see more hosts 
>>> get 1 chunk instead of 2.
>>> 
>>> Also, before you recommend adding two more hosts and switching to a
>>> host-based failure domain: the cluster is built from a variety of hardware
>>> with 2-6 drives per host and drive sizes ranging from 4TB to 12TB (it's
>>> part of my home lab).
>>> 
>>> Thanks,
>>> 
>>> Bryan
>>> 
> 

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
