[ceph-users] how to avoid pglogs dups bug in Pacific

2024-01-30 Thread ADRIAN NICOLAE
Hi, I'm running Pacific 16.2.4 and I want to start a manual PG split process on the data pool (from 2048 to 4096). I'm reluctant to upgrade to 16.2.14/15 at this point. Can I avoid the dups bug (https://tracker.ceph.com/issues/53729) if I increase the PGs slowly, by 32 or 64 PGs at
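A rough sketch of the slow-split idea asked about above: raise pg_num in small increments and let the cluster settle between steps. The pool name, step size and the health check below are assumptions, not values taken from the thread.

  POOL=default.rgw.buckets.data   # assumption: the actual pool name is not given here
  STEP=32
  TARGET=4096
  CUR=$(ceph osd pool get "$POOL" pg_num | awk '{print $2}')
  while [ "$CUR" -lt "$TARGET" ]; do
      CUR=$(( CUR + STEP > TARGET ? TARGET : CUR + STEP ))
      ceph osd pool set "$POOL" pg_num "$CUR"
      ceph osd pool set "$POOL" pgp_num "$CUR"
      # block until the backfill/peering from this step has finished
      until ceph health | grep -q HEALTH_OK; do sleep 60; done
  done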

[ceph-users] orchestrator issues on ceph 16.2.9

2023-03-04 Thread Adrian Nicolae
Hi, I have some orchestrator issues on our cluster running 16.2.9 with rgw-only services. We first noticed these issues a few weeks ago when adding new hosts to the cluster - the orchestrator was not detecting the new drives to build the OSD containers for them. Debugging the mgr logs, I noticed
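For reference, a few commands commonly used to debug this kind of orchestrator problem (a sketch only; the actual root cause in this thread is not shown):

  ceph orch device ls --refresh   # force cephadm to re-scan the disks on all hosts
  ceph log last cephadm           # recent log entries from the cephadm mgr module
  ceph mgr fail                   # fail over to a standby mgr if the module looks stuck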

[ceph-users] Re: ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
The problem was a single OSD daemon (not reported in health detail) which slowed down the entire peering process; after restarting it, the cluster got back to normal. On 11/4/2022 10:49 AM, Adrian Nicolae wrote: ceph health detail HEALTH_WARN Reduced data availability: 42 pgs inactive, 33
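A sketch of how such a slow OSD is usually identified and bounced (the id is a placeholder):

  ceph osd perf                             # one OSD with much higher commit/apply latency stands out
  ceph daemon osd.<id> dump_ops_in_flight   # on that OSD's host: what the blocked requests are waiting on
  ceph orch daemon restart osd.<id>         # cephadm deployments; systemctl restart ceph-osd@<id> otherwise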

[ceph-users] Re: ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
.rgw.buckets.data' pg_num 2480 is not a power of two [WRN] SLOW_OPS: 2371 slow ops, oldest one blocked for 6218 sec, daemons [osd.103,osd.115,osd.126,osd.129,osd.130,osd.138,osd.155,osd.174,osd.179,osd.181]... have slow ops. On 11/4/2022 10:45 AM, Adrian Nicolae wrote: Hi, We have a Pacific cluster (16.2.4

[ceph-users] ceph is stuck after increasing pg_nums

2022-11-04 Thread Adrian Nicolae
Hi, We have a Pacific cluster (16.2.4) with 30 servers and 30 osds. We started increasing the pg_num for the data bucket more than a month ago; I usually added 64 PGs in every step and didn't have any issues. The cluster was healthy before increasing the PGs. Today I've added 128 PGs and the
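The usual way to see which PGs are stuck after such a pg_num bump, and which OSDs they are waiting on (a generic sketch, not taken from the thread; <pgid> is a placeholder):

  ceph health detail            # lists the inactive/peering PGs and any slow ops
  ceph pg dump_stuck inactive   # stuck PGs with their acting OSD sets
  ceph pg <pgid> query          # the recovery_state section names the OSD a PG is waiting for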

[ceph-users] questions about rgw gc max objs and rgw gc speed in general

2022-09-22 Thread Adrian Nicolae
Hi, We have a system running Ceph Pacific with a large number of delete requests (several hundred thousand files per day) and I'm investigating how I can increase the GC speed to keep up with our deletes (right now there are 44 million objects in the GC list). I changed
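A sketch of the knobs usually involved in RGW garbage-collection speed; the values below are illustrative assumptions rather than recommendations from the thread, and they are set at the global level here purely for simplicity:

  radosgw-admin gc list --include-all | head              # what is currently queued for garbage collection
  ceph config set global rgw_gc_max_concurrent_io 20      # default 10
  ceph config set global rgw_gc_processor_period 1800     # default 3600
  ceph config set global rgw_gc_processor_max_time 1800   # default 3600
  radosgw-admin gc process --include-all                  # trigger an extra GC pass manually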

[ceph-users] Re: v16.2.6 Pacific released

2021-09-17 Thread Adrian Nicolae
Hi, Does the 16.2.6 release fix the following bug: https://github.com/ceph/ceph/pull/42690 ? It's not listed in the changelog. Message: 3 Date: Thu, 16 Sep 2021 15:48:42 -0400 From: David Galloway Subject: [ceph-users] v16.2.6 Pacific released To: ceph-annou...@ceph.io,

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-25 Thread Adrian Nicolae
Hi, On my setup I didn't enable a stretch cluster. It's just a 3 x VM setup running on the same Proxmox node; all the nodes are using a single network. I installed Ceph using the documented cephadm flow. Thanks for the confirmation, Greg! I'll try with a newer release then. >That's

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
experienced this. On 24 May 2021, at 00:35, Adrian Nicolae wrote: Hi, I waited for more than a day on the first mon failure; it didn't resolve automatically. I checked with 'ceph status' and also the ceph.conf on that host, and the failed mon was removed from the monmap. The cluster reported only 2
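If a mon really has dropped out of the monmap, redeploying it with the orchestrator is the usual fix on a cephadm cluster (host name and IP below are placeholders):

  ceph mon dump                            # confirm which mons are still in the monmap
  ceph orch daemon rm mon.<host> --force   # clear the stale daemon record, if one is left behind
  ceph orch daemon add mon <host>:<ip>     # redeploy the mon on that host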

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
dump” to see whether mon.node03 is actually removed from the monmap when it failed to start? On 23 May 2021, at 16:40, Adrian Nicolae wrote: Hi guys, I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with rgw-only access. I noticed a weird

[ceph-users] Re: Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
istvan.sz...@agoda.com --- On 23 May 2021, at 15:40, Adrian Nicolae wrote: Hi guys, I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with rgw-only access. I noticed a

[ceph-users] Ceph Pacific mon is not starting after host reboot

2021-05-23 Thread Adrian Nicolae
Hi guys, I'm testing Ceph Pacific 16.2.4 in my lab before deciding if I will put it in production on a 1PB+ storage cluster with rgw-only access. I noticed a weird issue with my mons: - if I reboot a mon host, the ceph-mon container is not starting after the reboot - I can see with 'ceph
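Typical things to check on the rebooted host for a cephadm-deployed mon (a sketch; <fsid> and <host> are placeholders):

  cephadm ls                                        # is the mon daemon still listed on this host?
  systemctl status ceph-<fsid>@mon.<host>.service   # is the unit enabled, and why did it exit?
  journalctl -u ceph-<fsid>@mon.<host>.service -e   # daemon/container logs from the failed start
  ceph mon dump                                     # from a healthy node: is the mon still in the monmap?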

[ceph-users] adding a second rgw instance on the same zone

2021-02-11 Thread Adrian Nicolae
Hi guys, I have a Mimic cluster with only one RGW machine. My setup is simple - one realm, one zonegroup, one zone. How can I safely add a second RGW server to the same zone? Is it safe to just run "ceph-deploy rgw create" for the second server without impacting the existing metadata
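For a ceph-deploy managed cluster, the second gateway is normally installed and created the same way as the first; the host name node2 below is hypothetical, and a [client.rgw.node2] section in ceph.conf can carry its frontend/port settings:

  ceph-deploy install --rgw node2
  ceph-deploy rgw create node2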

[ceph-users] safest way to remove a host from Mimic

2021-01-07 Thread Adrian Nicolae
Hi guys, I need to remove a host (OSD server) from my Ceph Mimic cluster. First I started to remove every OSD drive one by one with 'ceph osd out' and then 'ceph osd purge'. After all the drives are removed from the crush map, I will still have the host empty, without any drives, and with the
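Once every OSD on the host has been purged, the empty host bucket itself can be dropped from the CRUSH map (sketch; <hostname> is a placeholder):

  ceph osd crush rm <hostname>   # only succeeds once the host bucket is empty
  ceph osd tree                  # verify the host and its OSDs are gone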

[ceph-users] Re: Ceph on ARM ?

2020-11-25 Thread Adrian Nicolae
r.san...@heinlein-support.de] Sent: Tuesday, November 24, 2020 5:56 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: Ceph on ARM ? On 24.11.20 at 13:12, Adrian Nicolae wrote: > Has anyone tested Ceph in such sc

[ceph-users] Ceph on ARM ?

2020-11-24 Thread Adrian Nicolae
Hi guys, I was looking at some Huawei ARM-based servers and the datasheets are very interesting. The high CPU core counts and the SoC architecture should be ideal for a distributed storage system like Ceph, at least in theory. I'm planning to build a new Ceph cluster in the future and my best

[ceph-users] question about rgw index pool

2020-11-21 Thread Adrian Nicolae
Hi guys, I'll have a future Ceph deployment with the following setup: - 7 powerful nodes running Ceph 15.2.x with mon, rgw and osd daemons colocated - 100+ SATA drives with EC 4+2 - every OSD will have a large NVME partition (300GB) for rocksdb - the storage will be dedicated for rgw
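A sketch of how the rocksdb-on-NVMe part is usually realised when the OSDs are created with ceph-volume; the device paths are placeholders, not values from the thread:

  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1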

[ceph-users] Re: question about rgw delete speed

2020-11-13 Thread Adrian Nicolae
gateways US Production(SSD): Nautilus 14.2.11 with 6 osd servers, 3 mons, 4 gateways, 2 iscsi gateways UK Production(SSD): Octopus 15.2.5 with 5 osd servers, 3 mons, 4 gateways -Original Message- From: Adrian Nicolae Sent: Wednesday, November 11, 2020 3:42 PM To: ceph-users Subject

[ceph-users] question about rgw delete speed

2020-11-11 Thread Adrian Nicolae
Hey guys, I'm in charge of a local cloud-storage service. Our primary object storage is a vendor-based one and I want to replace it in the near future with Ceph with the following setup: - 6 OSD servers with 36 SATA 16TB drives each and 3 big NVME per server (1 big NVME for every 12

[ceph-users] changing acces vlan for all the OSDs - potential downtime ?

2020-06-04 Thread Adrian Nicolae
Hi all, I have a Ceph cluster with a standard setup: - the public network: MONs and OSDs connected to the same agg switch with ports in the same access vlan - private network: OSDs connected to another switch with a second eth connected to another access vlan. I need to change the
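During the recabling/VLAN change itself, the usual precaution is to stop OSDs from being marked out while their links flap (assuming the IP addressing itself does not change):

  ceph osd set noout      # before touching the switch ports
  # ... move the ports to the new access vlan ...
  ceph osd unset noout    # once all OSDs are reachable again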

[ceph-users] Re: RGW resharding

2020-05-25 Thread Adrian Nicolae
objects (max_shards=16) then you should be ok. linyunfan. Adrian Nicolae wrote on Monday, 25 May 2020, at 3:04 PM: I'm using only Swift, not S3. We have a container for every customer. Right now there are thousands of containers. On 5/25/2020 9:02 AM, lin yunfan wrote: Can you store your data in different

[ceph-users] Re: RGW resharding

2020-05-25 Thread Adrian Nicolae
I'm using only Swift, not S3. We have a container for every customer. Right now there are thousands of containers. On 5/25/2020 9:02 AM, lin yunfan wrote: Can you store your data in different buckets? linyunfan. Adrian Nicolae wrote on Tuesday, 19 May 2020, at 3:32 PM: Hi, I have the following Ceph Mimic

[ceph-users] Bluestore config recommendations

2020-05-22 Thread Adrian Nicolae
Hi, I'm planning to install a new Ceph cluster (Nautilus) using 8+3 EC, SATA-only storage. We want to store only big files here (from 40-50MB to 200-300GB each). The write load will be higher than the read load. I was thinking of the following Bluestore config to reduce the load on
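For context, the Bluestore options usually discussed for a large-object, HDD-only workload look like the sketch below. The values are illustrative assumptions only, and note that bluestore_min_alloc_size only takes effect when an OSD is (re)created:

  [osd]
  bluestore_min_alloc_size_hdd = 65536       # larger allocation unit for big objects (mkfs-time option)
  bluestore_prefer_deferred_size_hdd = 0     # write large chunks directly instead of deferring through the WAL
  bluestore_compression_mode = none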

[ceph-users] RGW resharding

2020-05-19 Thread Adrian Nicolae
lead to OSDs flapping or having IO timeouts during deep-scrub, or even to OSD failures due to leveldb compacting all the time if we have a large number of DELETEs. Any advice would be appreciated. Thank you, Adrian Nicolae
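For reference, the radosgw-admin commands normally used for manual bucket-index resharding (bucket name and shard count are placeholders):

  radosgw-admin bucket limit check                # buckets that exceed the objects-per-shard limit
  radosgw-admin reshard add --bucket=<bucket> --num-shards=<n>
  radosgw-admin reshard list
  radosgw-admin reshard process                   # execute the queued reshard jobs now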

[ceph-users] Removing OSDs in Mimic

2020-04-06 Thread ADRIAN NICOLAE
Hi all, I have a Ceph cluster with ~70 OSDs of different sizes running on Mimic. I'm using ceph-deploy for managing the cluster. I have to remove some smaller drives and replace them with bigger drives. From your experience, are the OSD-removal guidelines from the Mimic docs accurate
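The per-OSD sequence from the Mimic docs boils down to roughly this (the OSD id is a placeholder; wait for the cluster to return to healthy between drives):

  ceph osd out <id>                            # start draining; wait until all PGs are active+clean again
  systemctl stop ceph-osd@<id>                 # on the OSD host
  ceph osd purge <id> --yes-i-really-mean-it   # removes the OSD from the crush map, auth and the osd map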