[ceph-users] Re: Files listed in radosgw BI but not available in ceph

2021-07-16 Thread Boris Behrens
hard lately? > Did you test using client programs like s3cmd & rclone...? > > I didn't have time to work on that this week, but I have to find a > solution too. > Meanwhile, I run with a lower shard number and my customer can access > all his data. > Cheers! > > On 7/1

[ceph-users] Re: Files listed in radosgw BI but not available in ceph

2021-07-16 Thread Boris Behrens
Is there a way to remove a file from a bucket without removing it from the bucket index? Am Fr., 16. Juli 2021 um 17:36 Uhr schrieb Boris Behrens : > > Hi everybody, > a customer mentioned that he got problems in accessing his rgw data. > I checked the bucket index and the file should

[ceph-users] Files listed in radosgw BI but not available in ceph

2021-07-16 Thread Boris Behrens
Hi everybody, a customer mentioned that he got problems in accessing his rgw data. I checked the bucket index and the file should be available. Then I pulled a list with radosgw-admin radoslist --bucket BUCKET and it seems that the file is gone. beside the "yaiks, is there a way the file might
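A minimal way to cross-check the index against rados, assuming the data pool name that appears elsewhere in these threads and placeholder bucket/object names:

  radosgw-admin bi list --bucket=BUCKET > bi.json             # what the bucket index claims
  radosgw-admin bucket radoslist --bucket=BUCKET > rados.txt  # what rados actually holds
  rados -p eu-central-1.rgw.buckets.data stat 'MARKER_OBJECTNAME'

If the index lists the object but radoslist and rados stat both come up empty, only the index entry is left.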

[ceph-users] Re: best practice balance mode in HAproxy in front of RGW?

2021-05-27 Thread Boris Behrens
Am Do., 27. Mai 2021 um 07:47 Uhr schrieb Janne Johansson : > > Den ons 26 maj 2021 kl 16:33 skrev Boris Behrens : > > > > Hi Janne, > > do you know if there can be data duplication which leads to orphan objects? > > > > I am currently huntin stra

[ceph-users] Re: best practice balance mode in HAproxy in front of RGW?

2021-05-26 Thread Boris Behrens
Johansson : > > I guess normal round robin should work out fine too, regardless of if > there are few clients making several separate connections or many > clients making a few. > > Den ons 26 maj 2021 kl 12:32 skrev Boris Behrens : > > > > Hello together, > > > >

[ceph-users] best practice balance mode in HAproxy in front of RGW?

2021-05-26 Thread Boris Behrens
Hello together, is there any best practice on the balance mode when I have a HAproxy in front of my rgw_frontend? Currently we use "balance leastconn". Cheers Boris
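For reference, a minimal backend sketch; addresses, ports and the health check are placeholders, and leastconn vs. roundrobin is exactly the choice under discussion:

  backend rgw
      balance leastconn           # current setting; or: balance roundrobin
      option httpchk GET /        # anonymous GET / returns 200 on RGW
      server rgw1 10.0.0.11:7480 check
      server rgw2 10.0.0.12:7480 check

With few clients holding long keep-alive connections, leastconn tends to spread load more evenly; with many short-lived clients the two behave much the same.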

[ceph-users] Re: summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-26 Thread Boris Behrens
The more files I delete, the more space is used. How can this be? Am Di., 25. Mai 2021 um 14:41 Uhr schrieb Boris Behrens : > > Am Di., 25. Mai 2021 um 09:23 Uhr schrieb Boris Behrens : > > > > Hi, > > I am still searching for a reason why these two values diff

[ceph-users] Re: summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
Am Di., 25. Mai 2021 um 09:23 Uhr schrieb Boris Behrens : > > Hi, > I am still searching for a reason why these two values differ so much. > > I am currently deleting a giant amount of orphan objects (43mio, most > of them under 64kb), but the difference gets larger

[ceph-users] Re: summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
Am Di., 25. Mai 2021 um 09:39 Uhr schrieb Konstantin Shalygin : > > Hi, > > On 25 May 2021, at 10:23, Boris Behrens wrote: > > I am still searching for a reason why these two values differ so much. > > I am currently deleting a giant amount of orphan objects (43mio, m

[ceph-users] summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
Hi, I am still searching for a reason why these two values differ so much. I am currently deleting a giant amount of orphan objects (43mio, most of them under 64kb), but the difference gets larger instead of smaller. This was the state two days ago: > > [root@s3db1 ~]# radosgw-admin bucket stats
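The comparison being made, as a sketch assuming jq is available (sums size_kb_actual over all buckets and converts KiB to TiB):

  radosgw-admin bucket stats \
    | jq '[.[].usage["rgw.main"].size_kb_actual // 0] | add / 1024 / 1024 / 1024'
  # versus what the data pool itself reports:
  ceph df detail | grep buckets.data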

[ceph-users] question regarding markers in radosgw

2021-05-21 Thread Boris Behrens
Hello everybody, It seems that I have a metric ton of orphan objects in my s3 cluster. They look like this: $ rados -p eu-central-1.rgw.buckets.data stat ff7a8b0c-07e6-463a-861b-78f0adeba8ad.811806.9_1063978/features/2018-02-23.json
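For hunting these down, the rgw-orphan-list helper that ships with recent Nautilus builds compares every object in the data pool against all bucket indexes and writes the suspects to a file; a sketch, using the pool name from this thread:

  rgw-orphan-list eu-central-1.rgw.buckets.data
  # review the generated list carefully before deleting anything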

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
Reading through the bugtracker: https://tracker.ceph.com/issues/50293 Thanks for your patience. Am Do., 20. Mai 2021 um 15:10 Uhr schrieb Boris Behrens : > I try to bump it once more, because it makes finding orphan objects nearly > impossible. > > Am Di., 11. Mai 2021 um 13:03

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
I try to bump it once more, because it makes finding orphan objects nearly impossible. Am Di., 11. Mai 2021 um 13:03 Uhr schrieb Boris Behrens : > Hi together, > > I still search for orphan objects and came across a strange bug: > There is a huge multipart upload happening

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-19 Thread Boris Behrens
Thanks for your support Igor <3 Am Di., 18. Mai 2021 um 09:54 Uhr schrieb Boris Behrens : > One more question: > How do I get rid of the bluestore spillover message? > osd.68 spilled over 64 KiB metadata from 'db' device (13 GiB used of > 50 GiB) to slow device > > I tried an

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-18 Thread Boris Behrens
One more question: How do I get rid of the bluestore spillover message? osd.68 spilled over 64 KiB metadata from 'db' device (13 GiB used of 50 GiB) to slow device I tried an offline compaction, which did not help. Am Mo., 17. Mai 2021 um 15:56 Uhr schrieb Boris Behrens : > I h
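A possible sequence for pulling the spilled metadata back onto the fast device, sketched under the assumption that the OSD can be stopped and uses the standard paths:

  systemctl stop ceph-osd@68
  # move BlueFS data that landed on the slow device over to block.db
  ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-68 \
      --devs-source /var/lib/ceph/osd/ceph-68/block \
      --dev-target /var/lib/ceph/osd/ceph-68/block.db
  # offline compaction afterwards
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-68 compact
  systemctl start ceph-osd@68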

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
> > Thanks, > > Igor > > On 5/17/2021 3:47 PM, Boris Behrens wrote: > > The FSCK looks good: > > > > [root@s3db10 export-bluefs2]# ceph-bluestore-tool --path > > /var/lib/ceph/osd/ceph-68 fsck > > fsck success > > > > Am Mo., 17. Mai 2021

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
See my last mail :) Am Mo., 17. Mai 2021 um 14:52 Uhr schrieb Igor Fedotov : > Would you try fsck without standalone DB? > > On 5/17/2021 3:39 PM, Boris Behrens wrote: > > Here is the new output. I kept both for now. > > > > [root@s3db10 export-bluefs2]# ls *

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
The FSCK looks good: [root@s3db10 export-bluefs2]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 fsck fsck success Am Mo., 17. Mai 2021 um 14:39 Uhr schrieb Boris Behrens : > Here is the new output. I kept both for now. > > [root@s3db10 export-bluefs2]# ls * > db: > 018

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
> > Am Mo., 17. Mai 2021 um 13:45 Uhr schrieb Igor Fedotov >: > > > >> You might want to check file structure at new DB using ceph-bluestore-tool's > >> bluefs-export command: > >> > >> ceph-bluestore-tool --path --command bluefs-export --out >

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
BlueFS directory > structure - multiple .sst files, CURRENT and IDENTITY files etc? > > If so then please check and share the content of /db/CURRENT > file. > > > Thanks, > > Igor > > On 5/17/2021 1:32 PM, Boris Behrens wrote: > > Hi Igor, > > I posted it on paste

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Igor > > On 5/17/2021 1:09 PM, Boris Behrens wrote: > > Hi, > > sorry for replying to this old thread: > > > > I tried to add a block.db to an OSD but now the OSD can not start with > the > > error: > > Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[260

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Hi, sorry for replying to this old thread: I tried to add a block.db to an OSD but now the OSD can not start with the error: Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17 09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not end with newline Mai 17
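For the record, a sketch of the attach-db sequence that is commonly used (device paths are placeholders); the "CURRENT file does not end with newline" corruption suggests the new db device never received a consistent RocksDB:

  systemctl stop ceph-osd@68
  ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-68 \
      --dev-target /dev/ceph-db-vg/osd68-db
  # migrate the existing RocksDB from the main device onto the new db volume
  ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-68 \
      --devs-source /var/lib/ceph/osd/ceph-68/block \
      --dev-target /var/lib/ceph/osd/ceph-68/block.db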

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
It actually WAS the number of watchers... narf... This is so embarrassing... Thanks a lot for all your input. Am Di., 11. Mai 2021 um 13:54 Uhr schrieb Boris Behrens : > I tried to debug it with --debug-ms=1. > Maybe someone could help me to wrap my head around it? > https://pastebin.com
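The watcher count can be inspected on the RGW control objects; a sketch assuming the default eight notify objects and this zone's control pool:

  for i in $(seq 0 7); do
      echo -n "notify.$i: "
      rados -p eu-central-1.rgw.control listwatchers notify.$i | wc -l
  done

Each running radosgw registers watches on these objects, and "failed to distribute cache" plus slow admin commands fit an excessive watcher count.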

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
I tried to debug it with --debug-ms=1. Maybe someone could help me to wrap my head around it? https://pastebin.com/LD9qrm3x Am Di., 11. Mai 2021 um 11:17 Uhr schrieb Boris Behrens : > Good call. I just restarted the whole cluster, but the problem still > persists. > I do

[ceph-users] "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-11 Thread Boris Behrens
Hi together, I am still searching for orphan objects and came across a strange bug: There is a huge multipart upload happening (around 4TB), and listing the rados objects in the bucket loops over the multipart upload. -- The self-help group "UTF-8-Probleme" will meet at a different location this time, in

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
>> >> Kind regards, >> Thomas >> >> >> Am 11. Mai 2021 08:47:06 MESZ schrieb Boris Behrens : >> >Hi Amit, >> > >> >I just pinged the mons from every system and they are all available. >> > >> >Am Mo., 10. M

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
the problem was gone. > I have no good way to debug the problem since it never occurred again after > we restarted the OSDs. > > Kind regards, > Thomas > > > Am 11. Mai 2021 08:47:06 MESZ schrieb Boris Behrens : > >Hi Amit, > > > >I just pinged the mons from ev

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
all nodes are successfully ping. > > > -AmitG > > > On Tue, 11 May 2021 at 12:12 AM, Boris Behrens wrote: > >> Hi guys, >> >> does someone got any idea? >> >> Am Mi., 5. Mai 2021 um 16:16 Uhr schrieb Boris Behrens : >> >> > Hi, >&g

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
Hi guys, does someone got any idea? Am Mi., 5. Mai 2021 um 16:16 Uhr schrieb Boris Behrens : > Hi, > since a couple of days we experience a strange slowness on some > radosgw-admin operations. > What is the best way to debug this? > > For example creating a user takes over

[ceph-users] radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-05 Thread Boris Behrens
Hi, since a couple of days we experience a strange slowness on some radosgw-admin operations. What is the best way to debug this? For example creating a user takes over 20s. [root@s3db1 ~]# time radosgw-admin user create --uid test-bb-user --display-name=test-bb-user 2021-05-05 14:08:14.297

[ceph-users] global multipart lc policy in radosgw

2021-05-02 Thread Boris Behrens
Hi, I have a lot of multipart uploads that look like they never finished. Some of them date back to 2019. Is there a way to clean them up when they didn't finish in 28 days? I know I can implement a LC policy per bucket, but how do I implement it cluster wide? Cheers Boris
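Lifecycle policies are per bucket in S3, so the cluster-wide version is a loop over all buckets applying the same rule. A sketch (rule id and the 28-day window are illustrative):

  <LifecycleConfiguration>
    <Rule>
      <ID>abort-stale-multipart</ID>
      <Prefix></Prefix>
      <Status>Enabled</Status>
      <AbortIncompleteMultipartUpload>
        <DaysAfterInitiation>28</DaysAfterInitiation>
      </AbortIncompleteMultipartUpload>
    </Rule>
  </LifecycleConfiguration>

  # apply to each bucket, e.g.:
  s3cmd setlifecycle lifecycle.xml s3://BUCKET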

[ceph-users] Re: how to handle rgw leaked data (aka data that is not available via buckets but eats diskspace)

2021-04-27 Thread Boris Behrens
Uhr schrieb Boris Behrens : > Hi Anthony, > > yes we are using replication, the lost space is calculated before it's > replicated. > RAW STORAGE: > CLASS SIZEAVAIL USEDRAW USED %RAW USED > hdd 1.1 PiB 191 TiB 968 TiB

[ceph-users] Re: how to handle rgw leaked data (aka data that is not available via buckets but eats diskspace)

2021-04-26 Thread Boris Behrens
release were your OSDs built? BlueStore? Filestore? > What is your RGW object population like? Lots of small objects? Mostly > large objects? Average / median object size? > > > On Apr 26, 2021, at 9:32 PM, Boris Behrens wrote: > > > > HI, > > > > we still hav

[ceph-users] how to handle rgw leaked data (aka data that is not available via buckets but eats diskspace)

2021-04-26 Thread Boris Behrens
Hi, we still have the problem that our rgw eats more diskspace than it should. Summing up the "size_kb_actual" of all buckets shows only half of the used diskspace. There are 312TiB stored according to "ceph df" but we only need around 158TB. I already wrote to this ML with the problem, but

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Boris Behrens
Am Fr., 23. Apr. 2021 um 12:16 Uhr schrieb Ilya Dryomov : > On Fri, Apr 23, 2021 at 12:03 PM Boris Behrens wrote: > > > > > > > > Am Fr., 23. Apr. 2021 um 11:52 Uhr schrieb Ilya Dryomov < > idryo...@gmail.com>: > >> > >> > >> This

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Boris Behrens
Am Fr., 23. Apr. 2021 um 11:52 Uhr schrieb Ilya Dryomov : > > This snippet confirms my suspicion. Unfortunately without a verbose > log from that VM from three days ago (i.e. when it got into this state) > it's hard to tell what exactly went wrong. > > The problem is that the VM doesn't consider

[ceph-users] Re: s3 requires twice the space it should use

2021-04-23 Thread Boris Behrens
ge": { "search_stage": "comparing", "shard": 0, "marker": "" } } } }, Am Fr., 16. Apr. 2021 um 10:57 Uhr schrieb Boris Behrens : > Could this also be failed multipart

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Boris Behrens
Am Do., 22. Apr. 2021 um 17:27 Uhr schrieb Ilya Dryomov : > On Thu, Apr 22, 2021 at 5:08 PM Boris Behrens wrote: > > > > > > > > Am Do., 22. Apr. 2021 um 16:43 Uhr schrieb Ilya Dryomov < > idryo...@gmail.com>: > >> > >> On Thu, Apr 22,

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Boris Behrens
Am Do., 22. Apr. 2021 um 16:43 Uhr schrieb Ilya Dryomov : > On Thu, Apr 22, 2021 at 4:20 PM Boris Behrens wrote: > > > > Hi, > > > > I have a customer VM that is running fine, but I can not make snapshots > > anymore. > > rbd snap create rbd/IMAGE@test-bb

[ceph-users] rbd snap create not working and just hangs forever

2021-04-22 Thread Boris Behrens
Hi, I have a customer VM that is running fine, but I can not make snapshots anymore. rbd snap create rbd/IMAGE@test-bb-1 just hangs forever. When I checked the status with rbd status rbd/IMAGE it shows one watcher, the cpu node where the VM is running. What can I do to investigate further,
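A few things worth checking in this state, sketched with the image name as a placeholder; blacklisting a stale client would be the last resort:

  rbd info rbd/IMAGE | grep block_name_prefix   # rbd_data.<id>, so the header is rbd_header.<id>
  rados -p rbd listwatchers rbd_header.<id>     # who holds the watch
  ceph osd blacklist ls                         # is that client blacklisted already?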

[ceph-users] Re: [Suspicious newsletter] cleanup multipart in radosgw

2021-04-19 Thread Boris Behrens
er > --- > Agoda Services Co., Ltd. > e: istvan.sz...@agoda.com > --- > > -Original Message- > From: Boris Behrens > Sent: Monday, April 19, 2021 4:10 PM > To: ceph-users@ceph.io > S

[ceph-users] cleanup multipart in radosgw

2021-04-19 Thread Boris Behrens
Hi, is there a way to remove multipart uploads that are older than X days? It doesn't need to be built into ceph or automated to the end, just something I don't need to build on my own. I currently try to debug a problem where ceph reports a lot more used space than it actually requires (
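Absent a built-in cleanup, the uploads can be listed and aborted per bucket from the S3 side; a sketch with s3cmd, all names placeholders:

  s3cmd multipart s3://BUCKET                  # lists upload ids with initiation dates
  s3cmd abortmp s3://BUCKET/OBJECT UPLOAD_ID   # abort one stale upload

Filtering that listing by date and looping over the buckets from radosgw-admin bucket list would give the older-than-X-days behaviour asked for here.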

[ceph-users] Re: s3 requires twice the space it should use

2021-04-16 Thread Boris Behrens
Could this also be failed multipart uploads? Am Do., 15. Apr. 2021 um 18:23 Uhr schrieb Boris Behrens : > Cheers, > > [root@s3db1 ~]# ceph daemon osd.23 perf dump | grep numpg > "numpg": 187, > "numpg_primary": 64, > "nump

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
胡 玮文 (Weiwen Hu) : > Hi Boris, > > Could you check something like > > ceph daemon osd.23 perf dump | grep numpg > > to see if there are some stray or removing PG? > > Weiwen Hu > > > On 15 Apr 2021 at 22:53, Boris Behrens wrote: > >

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
values bluestore_min_alloc_size_hdd & > bluestore_min_alloc_size_ssd; if you are using hdd disks then > bluestore_min_alloc_size_hdd is applicable. > > On Thu, Apr 15, 2021 at 8:06 PM Boris Behrens wrote: > >> So, I need to live with it? A value of zero leads to use the

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
l are actually bucket object size but on OSD level the > bluestore_min_alloc_size default 64KB and SSD are 16KB > > > https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore > > -AmitG > > On Thu, Apr 15, 2021 at 7:29 PM Boris

[ceph-users] s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
Hi, maybe it is just a problem in my understanding, but it looks like our s3 requires twice the space it should use. I ran "radosgw-admin bucket stats", and added all "size_kb_actual" values up and divided to TB (/1024/1024/1024). The resulting space is 135,1636733 TB. When I triple it because
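The allocation granularity alone can plausibly explain such a gap. An illustrative calculation, assuming bluestore_min_alloc_size_hdd = 64 KiB and 3x replication: a 10 KiB object still allocates 64 KiB on each replica, i.e. 192 KiB raw for 10 KiB of payload. With 43 million such small objects that is about 43e6 x 54 KiB, roughly 2.2 TiB of pure padding per replica, or some 6.5 TiB raw across three copies.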

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
under the 90% default limit. > > -- dan > > On Tue, Mar 30, 2021 at 3:18 PM Boris Behrens wrote: > > > > The output from ceph osd pool ls detail tells me nothing, except that the > pgp_num is not where it should be. Can you help me to read the output? How > do I estimate

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
splitting should take. This will help: > > ceph status > ceph osd pool ls detail > > -- dan > > On Tue, Mar 30, 2021 at 3:00 PM Boris Behrens wrote: > > > > I would think due to splitting, because the balancer doesn't refuse > its work, because too many misplace

[ceph-users] Re: forceful remap PGs

2021-03-30 Thread Boris Behrens
> On 3/30/21 12:55 PM, Boris Behrens wrote: > > I just move one PG away from the OSD, but the diskspace will not get > freed. > > How did you move? I would suggest you use upmap: > > ceph osd pg-upmap-items > Invalid command: missing required parameter pgid() > osd pg
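For reference, pg-upmap-items wants a pgid followed by one or more from/to OSD pairs; a sketch with placeholder ids:

  # map one replica of PG 11.2f from osd.46 to osd.12
  ceph osd pg-upmap-items 11.2f 46 12
  # upmap requires luminous-or-newer clients:
  ceph osd set-require-min-compat-client luminous

Note the space on the source OSD is only released once the PG is fully active+clean there again.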

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
any other trick. > > -- dan > > On Tue, Mar 30, 2021 at 2:07 PM Boris Behrens wrote: > > > > One week later the ceph is still balancing. > > What worries me like hell is the %USE on a lot of those OSDs. Does ceph > > resolv this on it's own? We are currently down to

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
iB 6.7 TiB 6.7 TiB 322 MiB 16 GiB 548 GiB 92.64 1.18 121 up osd.66 46 hdd 7.27739 1.0 7.3 TiB 6.8 TiB 6.7 TiB 316 MiB 16 GiB 536 GiB 92.81 1.18 119 up osd.46 Am Di., 23. März 2021 um 19:59 Uhr schrieb Boris Behrens : > Good point. Thanks for the hi

[ceph-users] Re: forceful remap PGs

2021-03-30 Thread Boris Behrens
I just moved one PG away from the OSD, but the disk space does not get freed. Do I need to do something to clean obsolete objects from the osd? Am Di., 30. März 2021 um 11:47 Uhr schrieb Boris Behrens : > Hi, > I have a couple OSDs that currently get a lot of data, and are running > t

[ceph-users] forceful remap PGs

2021-03-30 Thread Boris Behrens
Hi, I have a couple OSDs that currently get a lot of data, and are running towards 95% fillrate. I would like to forcefully remap some PGs (they are around 100GB) to more empty OSDs and drop them from the full OSDs. I know this would lead to degraded objects, but I am not sure how long the

[ceph-users] Re: add and start OSD without rebalancing

2021-03-24 Thread Boris Behrens
out. Am Mi., 24. März 2021 um 16:31 Uhr schrieb Janne Johansson < icepic...@gmail.com>: > Den ons 24 mars 2021 kl 14:55 skrev Boris Behrens : > > > > Oh cool. Thanks :) > > > > How do I find the correct weight after it is added? > > For the current process I j

[ceph-users] Re: add and start OSD without rebalancing

2021-03-24 Thread Boris Behrens
Oh cool. Thanks :) How do I find the correct weight after it is added? For the current process I just check the other OSDs but this might be a question that someone will raise. I could imagine that I need to adjust the ceph-gentle-reweight's target weight to the correct one. Am Mi., 24. März

[ceph-users] add and start OSD without rebalancing

2021-03-24 Thread Boris Behrens
Hi people, I currently try to add ~30 OSDs to our cluster and wanted to use the gentle-reweight script for that. I use ceph-volume lvm prepare --data /dev/sdX to create the osd and want to start it without weighting it in. systemctl start ceph-osd@OSD starts the OSD with full weight. Is this
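One way to get that behaviour, sketched with placeholder ids; new OSDs come up with CRUSH weight 0 and are then ramped up (the conventional final weight is simply the device size in TiB, which also answers the follow-up above):

  ceph config set osd osd_crush_initial_weight 0
  ceph-volume lvm prepare --data /dev/sdX
  ceph-volume lvm activate --all            # OSD starts but takes no data at weight 0
  ceph osd crush reweight osd.68 7.27739    # e.g. an 8 TB disk ~ 7.28 TiB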

[ceph-users] Re: should I increase the amount of PGs?

2021-03-23 Thread Boris Behrens
accumulating on the mons and osds -- > this itself will start to use a lot of space, and active+clean is the > only way to trim the old maps. > > -- dan > > On Tue, Mar 23, 2021 at 7:05 PM Boris Behrens wrote: > > > > So, > > doing nothing and wait for the ceph to recove

[ceph-users] Re: should I increase the amount of PGs?

2021-03-23 Thread Boris Behrens
that's the > case and we can see about changing osd_max_backfills, some weights or > maybe using the upmap-remapped tool. > > -- Dan > > On Tue, Mar 23, 2021 at 6:07 PM Boris Behrens wrote: > > > > Ok, I should have listened to you :) > > > > In the last wee

[ceph-users] Re: should I increase the amount of PGs?

2021-03-23 Thread Boris Behrens
+backfilling 32 active+remapped+backfill_toofull io: client: 27 MiB/s rd, 69 MiB/s wr, 497 op/s rd, 153 op/s wr recovery: 1.5 GiB/s, 922 objects/s Am Di., 16. März 2021 um 09:34 Uhr schrieb Boris Behrens : > Hi Dan, > > my EC profile looks very "default"

[ceph-users] Re: should I increase the amount of PGs?

2021-03-16 Thread Boris Behrens
er. > > 2. You can also use another script from that repo to see the PGs per > OSD normalized to crush weight: > ceph-scripts/tools/ceph-pg-histogram --normalize --pool=15 > >This might explain what is going wrong. > > Cheers, Dan > > > On Mon, Mar 15, 20

[ceph-users] Re: should I increase the amount of PGs?

2021-03-15 Thread Boris Behrens
d...@vanderster.com>: > OK thanks. Indeed "prepared 0/10 changes" means it thinks things are > balanced. > Could you again share the full ceph osd df tree? > > On Mon, Mar 15, 2021 at 2:54 PM Boris Behrens wrote: > > > > Hi Dan, > > > > I've set the autoscal

[ceph-users] Re: should I increase the amount of PGs?

2021-03-15 Thread Boris Behrens
er > /var/log/ceph/ceph-mgr.*.log > > -- Dan > > On Mon, Mar 15, 2021 at 1:47 PM Boris Behrens wrote: > > > > Hi, > > this unfortunally did not solve my problem. I still have some OSDs that > fill up to 85% > > > > According to the logging, the autoscale

[ceph-users] Re: should I increase the amount of PGs?

2021-03-15 Thread Boris Behrens
You might need to fail to a new mgr... I'm not sure if the current > active will read that new config. > > .. dan > > > On Sat, Mar 13, 2021, 4:36 PM Boris Behrens wrote: > >> Hi, >> >> ok thanks. I just changed the value and reweighted everything back t

[ceph-users] Re: Safe to remove osd or not? Which statement is correct?

2021-03-14 Thread Boris Behrens
Hi, do you know why the OSDs are not starting? When I had the problem that a start does not work, I tried the 'ceph-volume lvm activate --all' on the host, which brought the OSDs back up. But I can't tell you if it is safe to remove the OSD. Cheers Boris Am So., 14. März 2021 um 02:38 Uhr

[ceph-users] Re: should I increase the amount of PGs?

2021-03-13 Thread Boris Behrens
more than 200 PGs, you definitely > shouldn't increase the num PGs. > > But anyway with your mixed device sizes it might be challenging to make a > perfectly uniform distribution. Give it a try with 1 though, and let us > know how it goes. > > .. Dan

[ceph-users] Re: should I increase the amount of PGs?

2021-03-13 Thread Boris Behrens
recommend debug_mgr 4/5 so you can see some basic upmap balancer > logging. > > .. Dan > > > On Sat, Mar 13, 2021, 3:49 PM Boris Behrens wrote: > >> Hello people, >> >> I am still struggling with the balancer >> (https://www.mail-ar

[ceph-users] should I increase the amount of PGs?

2021-03-13 Thread Boris Behrens
Hello people, I am still struggling with the balancer (https://www.mail-archive.com/ceph-users@ceph.io/msg09124.html) Now I've read some more and think that I might not have enough PGs. Currently I have 84 OSDs and 1024 PGs for the main pool (3008 total). I have the autoscaler enabled, but I
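A rough sanity check on those numbers, assuming everything were replicated size 3 (EC pools count k+m shards instead): the main pool gives 1024 x 3 / 84, about 37 PG instances per OSD, and all pools together 3008 x 3 / 84, about 107, which is right around the commonly targeted ~100 per OSD. So the raw PG count looks plausible and the imbalance is more likely a weight/balancer issue.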

[ceph-users] Re: how to tell balancer to balance

2021-03-11 Thread Boris Behrens
will only go to 50% (or 4 TB) - so in > effect wasting 4 TB of the 8 TB disk > > our cluster & our pool > All our disks no matter what are 8 TB in size. > > >>> Boris Behrens 3/11/2021 5:53 AM >>> > Hi, > I know this topic seem

[ceph-users] how to tell balancer to balance

2021-03-11 Thread Boris Behrens
Hi, I know this topic seems to be handled a lot (as far as I can see), but I reached the end of my google_foo. * We have OSDs that are near full, but there are also OSDs that are only loaded with 50%. * We have 4,8,16 TB rotating disks in the cluster. * The disks that get packed are 4TB disks and
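For reference, the usual knobs, sketched; upmap mode generally copes better with mixed 4/8/16 TB disks than plain reweighting, and max_deviation = 1 is the value discussed in the PG threads above:

  ceph balancer mode upmap
  ceph balancer on
  ceph balancer status
  ceph config set mgr mgr/balancer/upmap_max_deviation 1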

[ceph-users] buckets with negative num_objects

2021-03-10 Thread Boris Behrens
Hi, I am in the process of resharding large buckets and to find them I ran radosgw-admin bucket limit check | grep '"fill_status": "OVER' -B5 and I see that there are two buckets with negative num_objects "bucket": "ncprod", "tenant": "",

[ceph-users] Re: ceph pool with a whitespace as name

2021-03-10 Thread Boris Behrens
After doing radosgw-admin period update --commit it looks like it is gone now. Sorry for spamming the ML, but I am not denvercoder9 :) Am Mi., 10. März 2021 um 08:29 Uhr schrieb Boris Behrens : > Ok, > i changed the value to > "metadata_heap": "", > but it
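Reconstructing the fix from this thread as a sketch for the archives:

  radosgw-admin zone get --rgw-zone=eu-central-1 > zone.json
  # edit zone.json so that it reads:  "metadata_heap": ""
  radosgw-admin zone set --rgw-zone=eu-central-1 --infile zone.json
  radosgw-admin period update --commit
  # only then remove or rename the stray pool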

[ceph-users] Re: ceph pool with a whitespace as name

2021-03-09 Thread Boris Behrens
Ok, I changed the value to "metadata_heap": "", but it is still used. Any ideas how to stop this? Am Mi., 10. März 2021 um 08:14 Uhr schrieb Boris Behrens : > Found it. > [root@s3db1 ~]# radosgw-admin zone get --rgw-zone=eu-central-1 > { > "id"

[ceph-users] Re: ceph pool with a whitespace as name

2021-03-09 Thread Boris Behrens
b84a-459b-bce2-bccac338b3ef" } Am Mi., 10. März 2021 um 07:37 Uhr schrieb Boris Behrens : > Good morning ceph people, > > I have a pool that got a whitespace as name. And I want to know what > creates the pool. > I already renamed it, but something recreates the pool. > &g

[ceph-users] ceph pool with a whitespace as name

2021-03-09 Thread Boris Behrens
Good morning ceph people, I have a pool that got a whitespace as name. And I want to know what creates the pool. I already renamed it, but something recreates the pool. Is there a way to find out what created the pool and what its content is? When I checked its content I get [root@s3db1 ~]#

[ceph-users] Re: after update to 14.2.16 osd daemons begin to crash

2021-02-17 Thread Boris Behrens
back to bitmap or avl allocator. > > Thanks, > > Igor > > > On 2/17/2021 12:36 PM, Boris Behrens wrote: > > Hi, > > > > currently we experience osd daemon crashes and I can't pin the issue. I > > hope someone can help me with it. > > > > * We ope
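The switch being suggested, as a sketch; the allocator is read at startup, so the OSDs need a restart (NN is a placeholder id):

  ceph config set osd bluestore_allocator bitmap   # or: avl
  systemctl restart ceph-osd@NN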

[ceph-users] after update to 14.2.16 osd daemons begin to crash

2021-02-17 Thread Boris Behrens
Hi, currently we experience osd daemon crashes and I can't pin down the issue. I hope someone can help me with it. * We operate multiple clusters (440 SSD - 1PB, 36 SSD - 126TB, 40 SSD - 100TB, 84 HDD - 680TB) * All clusters were updated around the same time (2021-02-03) * We restarted ALL ceph daemons

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
I've outed osd.18 and osd.54 and let it sync for some time and now the problem is gone. *shrugs Thank you for the hints. Am Mo., 8. Feb. 2021 um 14:46 Uhr schrieb Boris Behrens : > Hi, > sure > > ID CLASS WEIGHTTYPE NAME STATUS REWEIGHT PRI-AFF > -1 672.684

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
appropriate? I've seen stuck PGs because of OSD weight imbalance. Is > the OSD in the correct subtree? > > > Quoting Boris Behrens : > > > Hi Eugen, > > > > I've set it to 0 but the "degraded objects" count does not go down. > > > > Am Mo.

[ceph-users] Re: one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
Hi Eugen, I've set it to 0 but the "degraded objects" count does not go down. Am Mo., 8. Feb. 2021 um 14:23 Uhr schrieb Eugen Block : > Hi, > > one option would be to decrease (or set to 0) the primary-affinity of > osd.14 and see if that brings the pg back. > > Regards, > Eugen

[ceph-users] one PG is stuck on single OSD and does not sync back

2021-02-08 Thread Boris Behrens
Good day together, I've got an issue after rebooting an osd node. It looks like there is one PG that does not sync back to the other UP osds. I've tried to restart the ceph processes for all three OSDs and when I stopped the one on OSD.14 the PG went down. Any ideas what I can do? # ceph pg ls
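The suggestion that came back in this thread (see the replies above) was primary-affinity; as a sketch, this stops osd.14 from acting as primary without moving any data:

  ceph osd primary-affinity osd.14 0
  # revert once the PG recovers:
  ceph osd primary-affinity osd.14 1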

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
67.21 96 TiB pool 44 7.9 MiB 2.01k 68 MiB 0 96 TiB pool 55 19 B 2 36 KiB 0 96 TiB Am Di., 16. Juni 2020 um 14:13 Uhr schrieb Dan van der Ster : > > On Tue, Jun 16, 2020 at 2:00 PM Boris B

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
See inline comments Am Di., 16. Juni 2020 um 13:29 Uhr schrieb Zhenshi Zhou : > > I did this on my cluster and there was a huge number of pg rebalanced. > I think setting this option to 'on' is a good idea if it's a brand new > cluster. > On our new cluster we enabled them, but not on our

[ceph-users] Re: enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
target_size_ratio or > target_size_bytes accordingly. > > BTW, do you have some feeling that your 17000 PGs are currently not > correctly proportioned for your cluster? > > -- Dan > > On Tue, Jun 16, 2020 at 11:31 AM Boris Behrens wrote: > > > > Hi, > > > >

[ceph-users] enabling pg_autoscaler on a large production storage?

2020-06-16 Thread Boris Behrens
Hi, I would like to enable the pg_autoscaler on our nautilus cluster. Someone told me that I should be really really careful to NOT have customer impact. Maybe someone can share some experience on this? The Cluster got 455 OSDs on 19 hosts with ~17000 PGs and ~1petabyte raw storage where ~600TB
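A cautious rollout, sketched: let the autoscaler report first and only flip pools to "on" once the proposals look sane:

  ceph osd pool autoscale-status
  ceph osd pool set POOL pg_autoscale_mode warn   # report only, no data movement
  ceph osd pool set POOL pg_autoscale_mode on     # per pool, once reviewed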

[ceph-users] Re: MAX AVAIL goes up when I reboot an OSD node

2020-05-29 Thread Boris Behrens
Hi Sinan, this happens with any node, and any single OSD. On Fri, May 29, 2020 at 10:09 AM si...@turka.nl wrote: > > Does this happen with any random node or specific to 1 node? > > If specific to 1 node, does this node holds more data compared to other nodes > (ceph osd df)? > > Sinan Polat >

[ceph-users] Re: MAX AVAIL goes up when I reboot an OSD node

2020-05-29 Thread Boris Behrens
Well, this happens when any OSD goes offline. (I stopped a single OSD service on one of our OSD nodes) On Fri, May 29, 2020 at 8:44 AM KervyN wrote: > > Hi Eugene, > no. The mgr services are located on our mon servers. > > This happens when I reboot any OSD node. > > >

[ceph-users] MAX AVAIL goes up when I reboot an OSD node

2020-05-28 Thread Boris Behrens
Dear people on this mailing list, I've got the "problem" that our MAX AVAIL value increases by about 5-10 TB when I reboot a whole OSD node. After the reboot the value goes back to normal. I would love to know WHY. Under normal circumstances I would ignore this behavior, but because I am very
