Indeed, scaling up the number of PGs per OSD may be needed for larger HDDs.
Increasing the number of PGs 5- or 10-fold would have an adverse impact on
OSD peering. What are the practical limits on the number of PGs per OSD with
the default settings, or should we tune some Ceph default settings for
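For context, the limit in question is mon_max_pg_per_osd (the per-OSD soft
limit, 250 on recent releases), with osd_max_pg_per_osd_hard_ratio acting as a
hard-cap multiplier on top of it. A quick way to see where a cluster stands:

ceph config get mon mon_max_pg_per_osd   # the current soft limit
ceph osd df                              # the PGS column shows PGs per OSD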
Hi Matthew,
the results of the commands are:
ceph df detail
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 190 TiB 61 TiB 129 TiB 129 TiB 67.70
TOTAL 190 TiB 61 TiB 129 TiB 129 TiB 67.70
--- POOLS ---
POOL ID PGS
Hi Gents,
There is a cluster with 14 hosts in this state:
https://i.ibb.co/HPF3Pdr/6-ACB2-C5-B-6-B54-476-B-835-D-227-E9-BFB1247.jpg
There is a host-based CRUSH rule (EC 3+1), and there are 3 hosts with OSDs
down.
Unfortunately there are also pools with 3 replicas, which are host-based as well.
Sorry if anyone gets this twice. It didn't make it to the list. -- Frank
From: Frank Schilder
Sent: 12 March 2021 13:48
To: Chris Dunlop
Cc: ceph-users@ceph.io; Wissem MIMOUNA
Subject: Re: [ceph-users] OSD id 241 != my id 248: conversion from "ceph-disk"
Hi everyone,
I am trying to configure an HA service for RGW with cephadm. I have 2 RGWs, on
cnrgw1 and cnrgw2, serving the same pool.
I use a virtual IP address, 192.168.0.15 (cnrgwha), and the config from
https://docs.ceph.com/en/latest/cephadm/rgw/#high-availability-service-for-rgw
# from root@cnrgw1
[root@cnrgw1
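For anyone following along, a minimal ingress spec per that doc page might look
like this; the service name rgw.default, the /24 prefix and the frontend port
are assumptions to adapt:

cat > rgw-ingress.yaml <<'EOF'
service_type: ingress
service_id: rgw.default
placement:
  hosts:
    - cnrgw1
    - cnrgw2
spec:
  backend_service: rgw.default
  virtual_ip: 192.168.0.15/24
  frontend_port: 8080
  monitor_port: 1967
EOF
ceph orch apply -i rgw-ingress.yaml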
> So perhaps we'll need to change the OSD to allow for 500 or 1000 PGs
We had a support case last year where we were forced to set the OSD
limit to >4000 for a few days, and had more than 4k active PGs on that
single OSD. You can do that, however it is quite uncommon.
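For reference, a temporary override along these lines (assuming it is the
mon_max_pg_per_osd limit that is in the way) could look like:

ceph config set global mon_max_pg_per_osd 4500   # temporary, during recovery
# ...wait for recovery/backfill to finish, then revert:
ceph config set global mon_max_pg_per_osd 250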
--
Martin Verges
Managing
OK
Btw, you might need to fail over to a new mgr... I'm not sure if the current
active will read that new config.
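E.g., something like (check which mgr is active first):

ceph mgr stat                # shows the currently active mgr
ceph mgr fail <active-mgr>   # make a standby take over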
.. dan
On Sat, Mar 13, 2021, 4:36 PM Boris Behrens wrote:
> Hi,
>
> ok thanks. I just changed the value and reweighted everything back to 1.
> Now I let it sync over the weekend and check
Hi,
ok thanks. I just changed the value and reweighted everything back to 1. Now
I'll let it sync over the weekend and check how it looks on Monday.
We tried to keep the systems' total storage as balanced as possible. New
systems will come with 8 TB disks, but for the existing ones we added 16 TB
drives to offset the
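For what it's worth, one way to reset all override reweights back to 1.0 in
bulk is a small loop over the OSD ids:

for id in $(ceph osd ls); do ceph osd reweight "$id" 1.0; done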
Dear Michael,
good to hear that it is over.
I'm a bit surprised and also worried that you lost data again. Was the cluster
rebalancing when the restarts happened? I had OSDs restart all over the place
due to bugs, OOM or admin accidents and never lost anything (except data access
for a
Thanks.
Decreasing the max deviation to 2 or 1 should help in your case. This
option controls when the balancer stops trying to move PGs around -- by
default it stops when the deviation from the mean is 5. Yes, this is too
large IMO -- all of our clusters have this set to 1.
And given that you
Hi Dan,
upmap_max_deviation is at the default (5) in our cluster. Is 1 the recommended
deviation?
I added the whole ceph osd df tree output. (I need to remove some OSDs and
re-add them as BlueStore with SSD, so 69, 73 and 82 are a bit off now. I also
reweighted to try to mitigate the %USE.)
I will
No, increasing the number of PGs won't help substantially.
Can you share the entire output of ceph osd df tree?
Did you already set
ceph config set mgr mgr/balancer/upmap_max_deviation 1
??
And I recommend debug_mgr 4/5 so you can see some basic upmap balancer
logging.
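E.g.:

ceph config set mgr debug_mgr 4/5
# then follow the active mgr's log on its host, typically something like:
tail -f /var/log/ceph/ceph-mgr.*.log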
.. Dan
On Sat, Mar 13,
Hello people,
I am still struggling with the balancer
(https://www.mail-archive.com/ceph-users@ceph.io/msg09124.html).
Now I've read some more and think that I might not have enough PGs.
Currently I have 84 OSDs and 1024 PGs for the main pool (3008 total). I
have the autoscaler enabled, but I
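(Rough math, assuming all pools are 3x replicated: 3008 PGs × 3 replicas / 84
OSDs ≈ 107 PG replicas per OSD, i.e. around the commonly cited ~100-per-OSD
target.)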
On Fri, Mar 12, 2021 at 6:35 PM Robert Sander
wrote:
>
> On 12.03.21 at 18:30, huxia...@horebdata.cn wrote:
>
> > Any other aspects on the limits of bigger capacity hard disk drives?
>
> Recovery will take longer, increasing the risk of another failure in the
> meantime.
>
Another limitation is
If you have a small cluster without host redundancy, you are still
able to configure Ceph to handle this correctly by adding a
drive failure domain between the host and OSD levels. So yes, you need to
change more than just failure-domain=OSD, as that alone would be a problem
(see the sketch below). However it is absolutely
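A sketch of the CRUSH map round trip such a change involves (the new bucket
type name and the grouping of OSDs are up to you):

ceph osd getcrushmap -o cm.bin
crushtool -d cm.bin -o cm.txt
# edit cm.txt: add a bucket type between osd and host, group each host's
# OSDs into buckets of that type, and point the rule's failure domain at it
crushtool -c cm.txt -o cm-new.bin
ceph osd setcrushmap -i cm-new.bin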
> failure-domain=host
yes (or rack/room/datacenter/..); for regular clusters it's therefore
absolutely no problem, as you correctly assumed.
--
Martin Verges
Managing director
Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges
croit GmbH, Freseniusstr. 31h,
> Well, if you run with failure-domain=host, then whether it's "I have 8
> 14TB drives and one failed" or "I have 16 7TB drives and two failed"
> isn't going to matter much in terms of recovery, is it?
> It would mostly matter for failure-domain=OSD, otherwise it seems about
> equal.
Yes, but
On Sat, 13 March 2021 at 12:56, Marc wrote:
> > A good mix of size and performance is the Seagate 2X14 MACH.2 Dual
> > Actuator 14TB HDD.
> > This drive reports as 2x 7TB individual block devices and you install
> > an OSD on each.
>
> My first thought was: wow, quite nice that this dual-actuator drive exposes itself as
>
> A good mix of size and performance is the Seagate 2X14 MACH.2 Dual
> Actuator 14TB HDD.
> This drive reports as 2x 7TB individual block devices and you install
> an OSD on each.
My first thought was: wow, quite nice that this dual-actuator drive exposes
itself as two drives. I was always under the impression that
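For completeness, with cephadm that would be one OSD per exposed device, e.g.
(host and device names hypothetical):

ceph orch daemon add osd myhost:/dev/sda
ceph orch daemon add osd myhost:/dev/sdb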
Thanks a lot for the insightful comments.
huxia...@horebdata.cn
From: Janne Johansson
Date: 2021-03-13 11:36
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] How big an OSD disk could be?
On Fri, 12 March 2021 at 18:10, huxia...@horebdata.cn wrote:
> Dear cephers,
> Just
On Fri, 12 March 2021 at 18:10, huxia...@horebdata.cn wrote:
> Dear cephers,
> Just wondering how big an OSD disk can be. Currently the biggest HDDs have a
> capacity of 18TB or 20TB. Are they still suitable for an OSD?
> Is there a limit on the capacity of a single OSD? Can it be 30TB, 50TB
> or
As Nathan describes, this information is maintained in the database on the mon
(monitor) nodes.
One always runs multiple mons in production, at least 3 and commonly 5. Each
has a full copy of everything, so that the loss of a node does not lose data or
impact operation.
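E.g., mon membership and quorum can be checked with:

ceph mon stat
ceph quorum_status -f json-pretty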
BTW, it's Ceph, not CEPH.