[ceph-users] Re: large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-10 Thread Boris Behrens
> > > > > Sent from a Galaxy device > > > ---- Original message > From: mhnx > Date: 08.11.21 13:28 (GMT+02:00) > To: Сергей Процун > Cc: "Szabo, Istvan (Agoda)" , Boris Behrens < > b...@kervyn.de>, Ceph Users

[ceph-users] Re: Question if WAL/block.db partition will benefit us

2021-11-10 Thread Boris Behrens
-4201-8840-a678-c2e23d38bfd6,... When the SSD fails, can I just remove the tags and restart the OSD with ceph-volume lvm activate --all? And after replacing the failed SSD readd the tags with the correct IDs? Do I need to do anything else to prepare a block.db partition? Cheers Boris Am Di., 9
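A sketch of what that tag juggling could look like, assuming the usual ceph.db_device/ceph.db_uuid LVM tags that ceph-volume maintains; all VG/LV names below are placeholders, not values from this thread:

```shell
# Inspect the tags ceph-volume stored on the OSD's data LV (placeholder names).
lvs -o lv_tags ceph-block-vg/osd-block-lv

# After replacing the SSD, repoint the tags at the new block.db LV.
lvchange --deltag "ceph.db_device=/dev/old-ssd-vg/db-lv" ceph-block-vg/osd-block-lv
lvchange --addtag "ceph.db_device=/dev/new-ssd-vg/db-lv" ceph-block-vg/osd-block-lv
# ceph.db_uuid would need the same treatment (see lvs -o lv_uuid on the new LV).

# Then let ceph-volume recreate the tmpfs mounts and symlinks and start the OSDs.
ceph-volume lvm activate --all
```

This is an ops sketch for a ceph host, not a tested recipe; the exact tag set on an OSD is visible with `lvs -o lv_tags` and should be checked before editing.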

[ceph-users] Re: Question if WAL/block.db partition will benefit us

2021-11-08 Thread Boris Behrens
> That does not seem like a lot. Having SSD based metadata pools might > reduce latency though. > So block.db and block.wal don't make sense? I would like to have a consistent cluster. In either case I would need to remove or add SSDs, because we currently have this mixed. It does waste a lot of

[ceph-users] Re: large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-08 Thread Boris Behrens
e a temp bucket. No versioning, No >>> multisite, >>> No index if it's possible. >>> >>> >>> >>> Szabo, Istvan (Agoda) wrote on Fri, 5 Nov 2021 at 12:30: >>> >>> > You mean prepare or reshard?

[ceph-users] Re: Question if WAL/block.db partition will benefit us

2021-11-08 Thread Boris Behrens
e the rgw.meta pools on it, but it looks like a waste of space. Having a 2TB OSD in every chassis that only handles 23GB of data. On Mon, 8 Nov 2021 at 12:30, Stefan Kooman wrote: > On 11/8/21 12:07, Boris Behrens wrote: > > Hi, > > we run a larger octopus s3 cluster with only rot

[ceph-users] Question if WAL/block.db partition will benefit us

2021-11-08 Thread Boris Behrens
restructuring the cluster and also two other clusters. And does it make a difference to have only a block.db partition or a block.db and a block.wal partition? Cheers Boris ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph

[ceph-users] Re: large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-05 Thread Boris Behrens
Cheers Istvan, how do you do this? On Thu, 4 Nov 2021 at 19:45, Szabo, Istvan (Agoda) < istvan.sz...@agoda.com> wrote: > This one you need to prepare, you need to preshard the bucket which you > know that will hold more than millions of objects. > > I have a bucket where we store 1.2 bill

[ceph-users] Re: large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-05 Thread Boris Behrens
Hi Teoman, I don't sync the bucket content. It's just the metadata that gets synced. But turning off access to our s3 is not an option, because our customers rely on it (they make backups and serve objects for their web applications through it). On Thu, 4 Nov 2021 at 18:20, Teoman

[ceph-users] large bucket index in multisite environement (how to deal with large omap objects warning)?

2021-11-04 Thread Boris Behrens
ch addresses this issue. Cheers Boris

[ceph-users] Re: s3cmd does not show multiparts in nautilus RGW on specific bucket (--debug shows loop)

2021-10-29 Thread Boris Behrens
Hi guys, we just updated the cluster to latest octopus, but we still cannot list multipart uploads if there are more than 2k multiparts. Is there any way to show the multiparts and maybe cancel them? On Mon, 25 Oct 2021 at 16:23, Boris Behrens wrote: > Hi Casey, > > thanks

[ceph-users] Re: upgrade OSDs before mon

2021-10-26 Thread Boris Behrens
On Tue, 26 Oct 2021 at 15:47, Yury Kirsanov < y.kirsa...@gmail.com> wrote: > You can downgrade any CEPH packages if you want to. Just specify the > number you'd like to go to. > > On Wed, Oct 27, 2021 at 12:36 AM Boris Behrens wrote: > >> Hi, >> I just

[ceph-users] upgrade OSDs before mon

2021-10-26 Thread Boris Behrens
Hi, I just added new storage to our s3 cluster and saw that ubuntu didn't prioritize the nautilus package over the octopus package. Now I have 10 OSDs with octopus in a pure nautilus cluster. Can I leave it this way, or should I remove the OSDs and first upgrade the mons? Cheers Boris --

[ceph-users] Re: s3cmd does not show multiparts in nautilus RGW on specific bucket (--debug shows loop)

2021-10-25 Thread Boris Behrens
. Oct 2021 at 16:19, Casey Bodley wrote: > hi Boris, this sounds a lot like > https://tracker.ceph.com/issues/49206, which says "When deleting a > bucket with an incomplete multipart upload that has about 2000 parts > uploaded, we noticed an infinite loop, which stopped s3cm

[ceph-users] s3cmd does not show multiparts in nautilus RGW on specific bucket (--debug shows loop)

2021-10-25 Thread Boris Behrens
Good day everybody, I just came across very strange behavior. I have two buckets where s3cmd hangs when I try to show current multipart uploads. When I use --debug I see that it loops over the same response. What I tried to fix it on one bucket: * radosgw-admin bucket check --bucket=BUCKETNAME *

[ceph-users] Re: recreate a period in radosgw

2021-10-14 Thread Boris Behrens
realm with realm set 7. period update; period update --commit This looks like it is correct, but I am not sure if this is the right way. Does someone have another way to do this? On Thu, 14 Oct 2021 at 15:44, Boris Behrens wrote: > Hi, > is there a way to restore a deleted peri

[ceph-users] recreate a period in radosgw

2021-10-14 Thread Boris Behrens
Hi, is there a way to restore a deleted period? The realm, zonegroup and zone are still there, but I can't apply any changes, because the period is missing. Cheers Boris

[ceph-users] shards falling behind on multisite metadata sync

2021-10-01 Thread Boris Behrens
Hi, does someone have a quick fix for shards falling behind in the metadata sync? I can do a radosgw-admin metadata sync init and restart the rgw daemons to get a full sync, but after a day the first shard falls behind, and after two days I also get the message with "oldest incremental change not

[ceph-users] Re: debugging radosgw sync errors

2021-09-20 Thread Boris Behrens
-ae86-4dc1-b432-470b0772fded 284760 [root@s3db16 ~]# radosgw-admin mdlog list | grep name | wc -l No --period given, using current period=e8fc96f1-ae86-4dc1-b432-470b0772fded 343078 Is it safe to clear the mdlog? On Mon, 20 Sept 2021 at 01:00, Boris Behrens wrote: > I just deleted the ra

[ceph-users] Re: debugging radosgw sync errors

2021-09-19 Thread Boris Behrens
On 17 Sept 2021 at 17:54, Boris Behrens wrote: > While searching for other things I came across this: > [root ~]# radosgw-admin metadata list bucket | grep www1 > "www1", > [root ~]# radosgw-admin metadata list bucket.instance | grep www1 > "www1:ff7a8b

[ceph-users] Re: debugging radosgw sync errors

2021-09-17 Thread Boris Behrens
While searching for other things I came across this: [root ~]# radosgw-admin metadata list bucket | grep www1 "www1", [root ~]# radosgw-admin metadata list bucket.instance | grep www1 "www1:ff7a8b0c-07e6-463a-861b-78f0adeba8ad.81095307.31103", "www1.company.dev", [root ~]# radosgw-admin

[ceph-users] Re: radosgw find buckets which use the s3website feature

2021-09-17 Thread Boris Behrens
Found it: for bucket in `radosgw-admin metadata list bucket.instance | jq .[] | cut -f2 -d\"`; do if radosgw-admin metadata get --metadata-key=bucket.instance:$bucket | grep --silent website_conf; then echo $bucket; fi; done On Thu, 16 Sept 2021 at 09:49, Boris Behrens wrote:

[ceph-users] debugging radosgw sync errors

2021-09-17 Thread Boris Behrens
Hello again, as my tests with some fresh clusters answerd most of my config questions, I now wanted to start with our production cluster and the basic setup looks good, but the sync does not work: [root@3cecef5afb05 ~]# radosgw-admin sync status realm 5d6f2ea4-b84a-459b-bce2-bccac338b3e

[ceph-users] radosgw find buckets which use the s3website feature

2021-09-16 Thread Boris Behrens
Hi people, is there a way to find buckets that use the s3website feature? Cheers Boris

[ceph-users] Re: Questions about multiple zonegroups (was Problem with multi zonegroup configuration)

2021-09-15 Thread Boris Behrens
Ok, I think I found the basic problem. I used to talk to the endpoint that is also the domain for the s3websites. After switching the domains around, everything worked fine. :partyemote: I have written down what I think about how things work together (wrote it down here IYAI https://pastebin.com/6Gj9Q5hJ), an

[ceph-users] Re: Problem with multi zonegroup configuration

2021-09-13 Thread Boris Behrens
Sept 2021 at 11:47, Boris Behrens wrote: > Dear ceph community, > > I am still stuck with the multi zonegroup configuration. I did these steps: > 1. Create realm (company), zonegroup(eu), zone(eu-central-1), sync user on > the site fra1 > 2. Pulled the realm and the period

[ceph-users] Re: [Suspicious newsletter] Problem with multi zonegroup configuration

2021-09-13 Thread Boris Behrens
Co., Ltd. > e: istvan.sz...@agoda.com > --- > > -Original Message- > From: Boris Behrens > Sent: Monday, September 13, 2021 4:48 PM > To: ceph-users@ceph.io > Subject: [Suspicious newsletter] [ceph-users] Problem with multi zonegroup > configuration > > Email re

[ceph-users] Problem with multi zonegroup configuration

2021-09-13 Thread Boris Behrens
Dear ceph community, I am still stuck with the multi zonegroup configuration. I did these steps: 1. Create realm (company), zonegroup(eu), zone(eu-central-1), sync user on the site fra1 2. Pulled the realm and the period in fra2 3. Created the zonegroup(eu-central-2), zone (eu-central-2), modified

[ceph-users] Re: [Suspicious newsletter] Re: create a Multi-zone-group sync setup

2021-08-17 Thread Boris Behrens
So > 1 realm, multiple dc BUT no sync? > > Istvan Szabo > Senior Infrastructure Engineer > --- > Agoda Services Co., Ltd. > e: istvan.sz...@agoda.com > ------- > > -Original M

[ceph-users] Re: create a Multi-zone-group sync setup

2021-08-17 Thread Boris Behrens
an empty response (because there are no buckets to list). I get this against both radosgw locations. I have an nginx between the internet and radosgw that will just proxy pass every address and set the host and x-forwarded-for headers. On Fri, 30 Jul 2021 at 16:46, Boris Behrens

[ceph-users] Re: Discard / Trim does not shrink rbd image size when disk is partitioned

2021-08-13 Thread Boris Behrens
create and attach an empty block device, and they will certainly not check if the partitions are aligned correctly. Cheers Boris On Fri, 13 Aug 2021 at 08:44, Janne Johansson < icepic...@gmail.com> wrote: > On Thu, 12 Aug 2021 at 17:04, Boris Behrens wrote: > > Hi ev

[ceph-users] Discard / Trim does not shrink rbd image size when disk is partitioned

2021-08-12 Thread Boris Behrens
Hi everybody, we just stumbled over a problem where the rbd image does not shrink when files are removed. This only happens when the rbd image is partitioned. * We tested it with centos8/ubuntu20.04 with ext4 and a gpt partition table (/boot and /) * the kvm device is virtio-scsi-pci with krbd
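Whether a partition start is 1 MiB-aligned (which discard/zeroing generally wants) can be checked from the start sector alone; a small sketch with the sector values hard-coded as examples instead of read from /sys/block/.../start:

```shell
#!/bin/sh
# Check whether a partition start is aligned to 1 MiB (assuming 512-byte sectors).
SECTOR_SIZE=512
ALIGN_BYTES=$((1024 * 1024))

check_alignment() {
    start_sector=$1
    offset=$((start_sector * SECTOR_SIZE % ALIGN_BYTES))
    if [ "$offset" -eq 0 ]; then
        echo "start sector $start_sector: aligned"
    else
        echo "start sector $start_sector: misaligned by $offset bytes"
    fi
}

check_alignment 2048   # typical first partition created by modern tools
check_alignment 63     # legacy DOS layout, not 1 MiB aligned
```

On a live system the start sector comes from e.g. `cat /sys/block/sda/sda1/start`, and `lsblk --discard` shows whether discard reaches the device at all.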

[ceph-users] create a Multi-zone-group sync setup

2021-07-30 Thread Boris Behrens
setup, where I sync the actual zone data, but have a global namespace where all buckets and users are unique. Cheers Boris -- The self-help group "UTF-8 problems" will meet in the big hall this time, as an exception.

[ceph-users] understanding multisite radosgw syncing

2021-07-27 Thread Boris Behrens
oup da651dc1-2663-4e1b-af2e-ac4454f24c9d (eu) zone ff7a8b0c-07e6-463a-861b-78f0adeba8ad (eu-central-1) metadata sync no sync (zone is master) 2021-07-27 11:24:24.645 7fe30fc07840 0 data sync zone:07cdb1c7 ERROR: failed to fetch datalog info data sync source: 07cdb1c7-8c8e-4a23-ab

[ceph-users] Deleting large objects via s3 API leads to orphan objects

2021-07-27 Thread Boris Behrens
. We are currently running 14.2.21 across the board. Cheers Boris

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-23 Thread Boris Behrens
Rafael Lopez wrote: > > Thanks for further clarification Dan. > > Boris, if you have a test/QA environment on the same code as production, you > can confirm if the problem is as above. Do NOT do this in production - if the > problem exists it might result in losing production data.

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-21 Thread Boris Behrens
Good morning everybody, we've dug further into it but still don't know how this could happen. What we ruled out for now: * Orphan objects cleanup process. ** There is only one bucket with missing data (I checked all other buckets yesterday) ** The "keep these files" list is generated by radosgw-adm

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-19 Thread Boris Behrens
w index shard much larger than others - ceph-users - > lists.ceph.io" > https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/MO7IHRGJ7TGPKT3GXCKMFLR674G3YGUX/ > > On Mon, 19 Jul 2021, 18:00 Boris Behrens, wrote: >> >> Hi Dan, >> how do I find out if

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-19 Thread Boris Behrens
Hi Dan, how do I find out if a bucket got versioning enabled? On Mon, 19 Jul 2021 at 17:00, Dan van der Ster wrote: > > Hi Boris, > > Does the bucket have object versioning enabled? > We saw something like this once a while ago: `s3cmd ls` showed an > entry for an o

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-19 Thread Boris Behrens
t seem to have a filename (_shadow_.Sxj4BEhZS6PZg1HhsvSeqJM4Y0wRCto_4) It doesn't seem to be a careless "rados -p POOL rm OBJECT" because then it should be still in the "radosgw-admin bucket radoslist --bucket BUCKET" output. (just tested that on a testbucket). Am Fr., 16. Jul

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-19 Thread Boris Behrens
-78f0adeba8ad.83821626.6927__shadow_.yscyiu0DpWRh_Agsnii3635ZNnrO16x_5 What are those files? o0 On Sat, 17 Jul 2021 at 22:54, Boris Behrens wrote: > > Hi k, > > all systems run 14.2.21 > > Cheers > Boris > > On Sat, 17 Jul 2021 at 22:12, Konstantin Shalyg

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-17 Thread Boris Behrens
Hi k, all systems run 14.2.21 Cheers Boris On Sat, 17 Jul 2021 at 22:12, Konstantin Shalygin wrote: > > Boris, what is your Ceph version? > > > k > > On 17 Jul 2021, at 11:04, Boris Behrens wrote: > > I really need help with this issue.

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-17 Thread Boris Behrens
Is it possible to not complete a file upload so the actual file is not there, but it is listed in the bucket index? I really need help with this issue. On Fri, 16 Jul 2021 at 19:35, Boris Behrens wrote: > > exactly. > rados rm wouldn't remove it from the "radosgw-adm

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-17 Thread Boris Behrens
On Fri, 16 Jul 2021 at 19:35, Boris Behrens wrote: > > exactly. > rados rm wouldn't remove it from the "radosgw-admin bucket radoslist" > list, correct? > > our usage statistics are not really usable because it fluctuates in a > 200tb range. > >

[ceph-users] Re: difference between rados ls and radosgw-admin bucket radoslist

2021-07-17 Thread Boris Behrens
it rebuilds the "bi" from the pool level (rados ls), so I'm not sure the > bucket index is "that" much important, knowing that you can rebuild it > from the pool. (?) > > > > > On 7/16/21 1:47 PM, Boris Behrens wrote: > > [Externe UL*] >

[ceph-users] difference between rados ls and radosgw-admin bucket radoslist

2021-07-16 Thread Boris Behrens
Hi, is there a difference between those two? I always thought that radosgw-admin bucket radoslist only shows the objects that are somehow associated with a bucket. But if the bucket index is broken, would this reflect in the output?
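For orphan hunting, the usual trick is to diff the two listings: objects that `rados ls` reports but `radosgw-admin bucket radoslist` does not are orphan candidates. A sketch with canned listings standing in for the two commands (object names are made up):

```shell
#!/bin/sh
# Stand-ins for the real listings; in practice these would come from
#   rados -p POOL ls | sort > rados.txt
#   radosgw-admin bucket radoslist --bucket BUCKET | sort > radoslist.txt
printf '%s\n' obj_a obj_b obj_c | sort > rados.txt
printf '%s\n' obj_a obj_c | sort > radoslist.txt

# Objects present in the pool but not referenced by any bucket's radoslist.
comm -23 rados.txt radoslist.txt
```

Both inputs must be sorted for `comm`; on a real cluster the pool listing also contains bucket-index and multipart objects, so hits are candidates to investigate, not objects to delete blindly.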

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-16 Thread Boris Behrens
stats that can confirm that the data has been > deleted and/or is still there. (at the pool level maybe?) > Hoping for you that it's just a data/index/shard mismatch... > > On 7/16/21 12:44 PM, Boris Behrens wrote: > > [Externe UL*] > > > > Hi Jean-Sebas

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-16 Thread Boris Behrens
Hi Jean-Sebastien, I have the exact opposite. Files can be listed (they are in the bucket index), but are not available anymore. On Fri, 16 Jul 2021 at 18:41, Jean-Sebastien Landry wrote: > > Hi Boris, I don't have any answer for you, but I have a situation similar > to yo

[ceph-users] Re: Files listed in radosgw BI but is not available in ceph

2021-07-16 Thread Boris Behrens
Is there a way to remove a file from a bucket without removing it from the bucket index? On Fri, 16 Jul 2021 at 17:36, Boris Behrens wrote: > > Hi everybody, > a customer mentioned that he got problems accessing his rgw data. > I checked the bucket index and the file should

[ceph-users] Files listed in radosgw BI but is not available in ceph

2021-07-16 Thread Boris Behrens
might be somewhere else in ceph?" how can this happen? We do occasional orphan objects cleanups but this does not take the bucket index into account. It is a large bucket with 2.1m files in it and with 34 shards. Cheers and happy weekend Boris

[ceph-users] Re: best practice balance mode in HAproxy in front of RGW?

2021-05-27 Thread Boris Behrens
On Thu, 27 May 2021 at 07:47, Janne Johansson wrote: > > On Wed, 26 May 2021 at 16:33, Boris Behrens wrote: > > > > Hi Janne, > > do you know if there can be data duplication which leads to orphan objects? > > > > I am currently hunting strange err

[ceph-users] Re: best practice balance mode in HAproxy in front of RGW?

2021-05-26 Thread Boris Behrens
Janne Johansson wrote: > > I guess normal round robin should work out fine too, regardless of if > there are few clients making several separate connections or many > clients making a few. > > On Wed, 26 May 2021 at 12:32, Boris Behrens wrote: > > > > Hello together, > > >

[ceph-users] best practice balance mode in HAproxy in front of RGW?

2021-05-26 Thread Boris Behrens
Hello together, is there any best practice on the balance mode when I have a HAproxy in front of my rgw_frontend? Currently we use "balance leastconn". Cheers Boris
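For stateless HTTP backends like radosgw, both `leastconn` and `roundrobin` work; `leastconn` copes better with long-running uploads holding connections open. A hypothetical minimal HAProxy fragment (all names, addresses, ports, and the health-check path are assumptions, not taken from this thread):

```
frontend rgw_front
    bind *:443 ssl crt /etc/haproxy/certs/rgw.pem
    default_backend rgw_back

backend rgw_back
    balance leastconn
    option httpchk GET /
    server rgw1 10.0.0.11:8080 check
    server rgw2 10.0.0.12:8080 check
```

With `check` enabled, HAProxy drops an rgw instance from rotation when the health probe fails, which matters more in practice than the exact balance algorithm.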

[ceph-users] Re: summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
The more files I delete, the more space is used. How can this be? On Tue, 25 May 2021 at 14:41, Boris Behrens wrote: > > On Tue, 25 May 2021 at 09:23, Boris Behrens wrote: > > > > Hi, > > I am still searching for a reason why these two values diff

[ceph-users] Re: summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
On Tue, 25 May 2021 at 09:23, Boris Behrens wrote: > > Hi, > I am still searching for a reason why these two values differ so much. > > I am currently deleting a giant amount of orphan objects (43mio, most > of them under 64kb), but the difference gets larger instead of sma

[ceph-users] Re: summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
On Tue, 25 May 2021 at 09:39, Konstantin Shalygin wrote: > > Hi, > > On 25 May 2021, at 10:23, Boris Behrens wrote: > > I am still searching for a reason why these two values differ so much. > > I am currently deleting a giant amount of orphan objects (43mio, m

[ceph-users] summarized radosgw size_kb_actual vs pool stored value doesn't add up

2021-05-25 Thread Boris Behrens
Hi, I am still searching for a reason why these two values differ so much. I am currently deleting a giant amount of orphan objects (43mio, most of them under 64kb), but the difference gets larger instead of smaller. This was the state two days ago: > > [root@s3db1 ~]# radosgw-admin bucket stats |
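The bucket-side number in that comparison can be summed straight out of `radosgw-admin bucket stats`; a sketch that sums `size_kb_actual` with awk, using a canned two-bucket stats fragment in place of the real command output:

```shell
#!/bin/sh
# Canned fragment standing in for: radosgw-admin bucket stats
stats='
        "size_kb_actual": 1024,
        "size_kb_actual": 2048,
'

# Sum the per-bucket size_kb_actual values.
echo "$stats" |
  grep '"size_kb_actual"' |
  awk -F': ' '{gsub(/,/, "", $2); sum += $2} END {print sum " KiB"}'
```

Comparing that total against the `stored` column of `ceph df` for the data pool (before replication) is the usual sanity check; a persistent gap points at orphans, garbage-collection backlog, or allocation overhead from min_alloc_size.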

[ceph-users] question regarding markers in radosgw

2021-05-21 Thread Boris Behrens
can I delete them fast? Cheers Boris

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
Reading through the bugtracker: https://tracker.ceph.com/issues/50293 Thanks for your patience. On Thu, 20 May 2021 at 15:10, Boris Behrens wrote: > I try to bump it once more, because it makes finding orphan objects nearly > impossible. > > On Tue, 11 May 2021 at 13:03

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
I try to bump it once more, because it makes finding orphan objects nearly impossible. On Tue, 11 May 2021 at 13:03, Boris Behrens wrote: > Hi together, > > I still search for orphan objects and came across a strange bug: > There is a huge multipart upload happening (arou

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-19 Thread Boris Behrens
your support Igor <3 On Tue, 18 May 2021 at 09:54, Boris Behrens wrote: > One more question: > How do I get rid of the bluestore spillover message? > osd.68 spilled over 64 KiB metadata from 'db' device (13 GiB used of > 50 GiB) to slow device > >

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-18 Thread Boris Behrens
One more question: How do I get rid of the bluestore spillover message? osd.68 spilled over 64 KiB metadata from 'db' device (13 GiB used of 50 GiB) to slow device I tried an offline compaction, which did not help. On Mon, 17 May 2021 at 15:56, Boris Behrens wrote:
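If an offline compaction does not pull the spilled metadata back, one route is ceph-bluestore-tool's bluefs-bdev-migrate, which moves BlueFS files off the slow device; a sketch assuming the default OSD directory layout and osd.68 from the message above, to be run only with the OSD stopped:

```shell
systemctl stop ceph-osd@68

# Move BlueFS files that spilled onto the main device back to the block.db device.
ceph-bluestore-tool bluefs-bdev-migrate \
    --path /var/lib/ceph/osd/ceph-68 \
    --devs-source /var/lib/ceph/osd/ceph-68/block \
    --dev-target /var/lib/ceph/osd/ceph-68/block.db

systemctl start ceph-osd@68
```

This is an ops sketch, not a tested recipe; if the db device is simply too small for the RocksDB level layout, the spillover will come back and the real fix is a larger block.db.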

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
on. > > > Thanks, > > Igor > > On 5/17/2021 3:47 PM, Boris Behrens wrote: > > The FSCK looks good: > > > > [root@s3db10 export-bluefs2]# ceph-bluestore-tool --path > > /var/lib/ceph/osd/ceph-68 fsck > > fsck success > > > > Am Mo., 17

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
See my last mail :) On Mon, 17 May 2021 at 14:52, Igor Fedotov wrote: > Would you try fsck without standalone DB? > > On 5/17/2021 3:39 PM, Boris Behrens wrote: > > Here is the new output. I kept both for now. > > > > [root@s3db10 export-bluefs2]# ls *

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
The FSCK looks good: [root@s3db10 export-bluefs2]# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-68 fsck fsck success On Mon, 17 May 2021 at 14:39, Boris Behrens wrote: > Here is the new output. I kept both for now. > > [root@s3db10 export-bluefs2]# ls * > db: > 018

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
.sst 020006.sst 020041.sst 020064.sst 020096.sst 020114.sst db.slow: db.wal: 020085.log 020088.log [root@s3db10 export-bluefs2]# du -hs 12G . [root@s3db10 export-bluefs2]# cat db/CURRENT MANIFEST-020084 On Mon, 17 May 2021 at 14:28, Igor Fedotov wrote: > On 5/17/2021 2:53 PM, Bo

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
lid BlueFS directory > structure - multiple .sst files, CURRENT and IDENTITY files etc? > > If so then please check and share the content of /db/CURRENT > file. > > > Thanks, > > Igor > > On 5/17/2021 1:32 PM, Boris Behrens wrote: > > Hi Igor, > > I posted it on

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Hi Igor, I posted it on pastebin: https://pastebin.com/Ze9EuCMD Cheers Boris On Mon, 17 May 2021 at 12:22, Igor Fedotov wrote: > Hi Boris, > > could you please share full OSD startup log and file listing for > '/var/lib/ceph/osd/ceph-68'? > > > Thanks,

[ceph-users] Re: Process for adding a separate block.db to an osd

2021-05-17 Thread Boris Behrens
Hi, sorry for replying to this old thread: I tried to add a block.db to an OSD but now the OSD can not start with the error: Mai 17 09:50:38 s3db10.fra2.gridscale.it ceph-osd[26038]: -7> 2021-05-17 09:50:38.362 7fc48ec94a80 -1 rocksdb: Corruption: CURRENT file does not end with newline Mai 17 09:5

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
It actually WAS the amount of watchers... narf.. This is so embarrassing.. Thanks a lot for all your input. On Tue, 11 May 2021 at 13:54, Boris Behrens wrote: > I tried to debug it with --debug-ms=1. > Maybe someone could help me to wrap my head around it? > https://pastebin.com

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
I tried to debug it with --debug-ms=1. Maybe someone could help me to wrap my head around it? https://pastebin.com/LD9qrm3x On Tue, 11 May 2021 at 11:17, Boris Behrens wrote: > Good call. I just restarted the whole cluster, but the problem still > persists. > I don't

[ceph-users] "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-11 Thread Boris Behrens
Hi together, I still search for orphan objects and came across a strange bug: There is a huge multipart upload happening (around 4TB), and listing the rados objects in the bucket loops over the multipart upload.

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
>> >> Kind regards, >> Thomas >> >> >> On 11 May 2021 08:47:06 CEST, Boris Behrens wrote: >> >Hi Amit, >> > >> >I just pinged the mons from every system and they are all available. >> > >> >On Mon, 10 May 2021 at

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-11 Thread Boris Behrens
and the problem was gone. > I have no good way to debug the problem since it never occurred again after > we restarted the OSDs. > > Kind regards, > Thomas > > > On 11 May 2021 08:47:06 CEST, Boris Behrens wrote: > >Hi Amit, > > > >I just pinged the mons fr

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
all nodes ping successfully. > > > -AmitG > > > On Tue, 11 May 2021 at 12:12 AM, Boris Behrens wrote: > >> Hi guys, >> >> does someone got any idea? >> >> On Wed, 5 May 2021 at 16:16, Boris Behrens wrote: > >> > Hi,

[ceph-users] Re: radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-10 Thread Boris Behrens
Hi guys, does someone have any idea? On Wed, 5 May 2021 at 16:16, Boris Behrens wrote: > Hi, > since a couple of days we have experienced a strange slowness on some > radosgw-admin operations. > What is the best way to debug this? > > For example creating a user takes over

[ceph-users] radosgw-admin user create takes a long time (with failed to distribute cache message)

2021-05-05 Thread Boris Behrens
ral-1-s3db2, eu-central-1-s3db3) * We also added dedicated rgw daemons for garbage collection, because the current one were not able to keep up. * So basically ceph status went from "rgw: 1 daemon active (eu-central-1)" to "rgw: 14 daemons active (eu-central-1-s3

[ceph-users] global multipart lc policy in radosgw

2021-05-02 Thread Boris Behrens
Hi, I have a lot of multipart uploads that look like they never finished. Some of them date back to 2019. Is there a way to clean them up when they didn't finish in 28 days? I know I can implement a LC policy per bucket, but how do I implement it cluster wide? Cheers Boris --
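radosgw has no cluster-wide lifecycle setting, so the common workaround is to push the same abort-incomplete-multipart rule into every bucket through the S3 API; a sketch of the standard lifecycle document (the 28-day window and all names are illustrative):

```json
{
  "Rules": [
    {
      "ID": "abort-stale-multipart",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 28 }
    }
  ]
}
```

Applied per bucket with something like `aws --endpoint-url https://rgw.example.com s3api put-bucket-lifecycle-configuration --bucket "$bucket" --lifecycle-configuration file://lc.json`, looping over `radosgw-admin bucket list`; note this needs credentials that can write to each bucket, which is exactly the access question raised in the cleanup thread above.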

[ceph-users] Re: how to handle rgw leaked data (aka data that is not available via buckets but eats diskspace)

2021-04-27 Thread Boris Behrens
Uhr schrieb Boris Behrens : > Hi Anthony, > > yes we are using replication, the lost space is calculated before it's > replicated. > RAW STORAGE: > CLASS SIZEAVAIL USEDRAW USED %RAW USED > hdd 1.1 PiB 191 TiB 968 TiB

[ceph-users] Re: how to handle rgw leaked data (aka data that is not available via buckets but eats diskspace)

2021-04-26 Thread Boris Behrens
ns is mixed, but the most amount of data is in huge files. We store our platform's RBD snapshots in it. Cheers Boris On Tue, 27 Apr 2021 at 06:49, Anthony D'Atri < anthony.da...@gmail.com> wrote: > Are you using Replication? EC? How many copies / which profile? > On w

[ceph-users] how to handle rgw leaked data (aka data that is not available via buckets but eats diskspace)

2021-04-26 Thread Boris Behrens
Hi, we still have the problem that our rgw eats more diskspace than it should. Summing up the "size_kb_actual" of all buckets shows only half of the used diskspace. There are 312TiB stored according to "ceph df" but we only need around 158TB. I've already written to this ML with the problem, but the

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Boris Behrens
On Fri, 23 Apr 2021 at 12:16, Ilya Dryomov wrote: > On Fri, Apr 23, 2021 at 12:03 PM Boris Behrens wrote: > > > > > > > > On Fri, 23 Apr 2021 at 11:52, Ilya Dryomov < idryo...@gmail.com> wrote: > >> > >> > >> This

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Boris Behrens
On Fri, 23 Apr 2021 at 11:52, Ilya Dryomov wrote: > > This snippet confirms my suspicion. Unfortunately without a verbose > log from that VM from three days ago (i.e. when it got into this state) > it's hard to tell what exactly went wrong. > > The problem is that the VM doesn't consider

[ceph-users] Re: s3 requires twice the space it should use

2021-04-23 Thread Boris Behrens
ge": { "search_stage": "comparing", "shard": 0, "marker": "" } } } }, On Fri, 16 Apr 2021 at 10:57, Boris Behrens wrote: > Could this also be failed multipart

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Boris Behrens
On Thu, 22 Apr 2021 at 17:27, Ilya Dryomov wrote: > On Thu, Apr 22, 2021 at 5:08 PM Boris Behrens wrote: > > > > > > > > On Thu, 22 Apr 2021 at 16:43, Ilya Dryomov < idryo...@gmail.com> wrote: >> > >> On Thu, Apr 22,

[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Boris Behrens
On Thu, 22 Apr 2021 at 16:43, Ilya Dryomov wrote: > On Thu, Apr 22, 2021 at 4:20 PM Boris Behrens wrote: > > > > Hi, > > > > I have a customer VM that is running fine, but I can not make snapshots > > anymore. > > rbd snap create rbd/IMAGE@test-bb

[ceph-users] rbd snap create not working and just hangs forever

2021-04-22 Thread Boris Behrens
Hi, I have a customer VM that is running fine, but I cannot make snapshots anymore. "rbd snap create rbd/IMAGE@test-bb-1" just hangs forever. When I checked the status with "rbd status rbd/IMAGE", it showed one watcher: the cpu node where the VM is running. What can I do to investigate further, witho

[ceph-users] Re: [Suspicious newsletter] cleanup multipart in radosgw

2021-04-19 Thread Boris Behrens
Hi Istvan, both of them require bucket access, correct? Is there a way to add the LC policy globally? Cheers Boris On Mon, Apr 19, 2021 at 11:58, Szabo, Istvan (Agoda) < istvan.sz...@agoda.com> wrote: > Hi, > > You have 2 ways: > > First is using s3vrowser app and i

[ceph-users] cleanup multipart in radosgw

2021-04-19 Thread Boris Behrens
Hi, is there a way to remove multipart uploads that are older than X days? It doesn't need to be built into ceph or fully automated; just something I don't need to build on my own. I am currently trying to debug a problem where ceph reports a lot more used space than it actually requires ( http

[ceph-users] Re: s3 requires twice the space it should use

2021-04-16 Thread Boris Behrens
Could this also be failed multipart uploads? On Thu, Apr 15, 2021 at 18:23, Boris Behrens wrote: > Cheers, > > [root@s3db1 ~]# ceph daemon osd.23 perf dump | grep numpg > "numpg": 187, > "numpg_primary": 64, > "nump

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
Cheers, [root@s3db1 ~]# ceph daemon osd.23 perf dump | grep numpg "numpg": 187, "numpg_primary": 64, "numpg_replica": 121, "numpg_stray": 2, "numpg_removing": 0, On Thu, Apr 15, 2021 at 18:18, wrote

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
ues, bluestore_min_alloc_size_hdd & > bluestore_min_alloc_size_ssd. If you are using an HDD disk then > bluestore_min_alloc_size_hdd is applicable. > > On Thu, Apr 15, 2021 at 8:06 PM Boris Behrens wrote: > >> So, I need to live with it? A value of zero leads to use the

[ceph-users] Re: s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
e actually the bucket object size, but on the OSD level the > bluestore_min_alloc_size default is 64KB and for SSD it is 16KB > > https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html/administration_guide/osd-bluestore > > -AmitG > > On Thu, Apr 15, 2021 at 7:29 PM Boris
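The overhead this reply points to can be quantified: every allocation is rounded up to bluestore_min_alloc_size, so small objects waste most of their allocation unit. A minimal sketch, assuming the 64 KiB HDD default mentioned above and a hypothetical replication factor of 3:

```python
import math

def allocated_bytes(object_size: int, min_alloc: int = 64 * 1024) -> int:
    """Space an object occupies on one OSD when every allocation is
    rounded up to bluestore_min_alloc_size (64 KiB HDD default here)."""
    if object_size == 0:
        return 0
    return math.ceil(object_size / min_alloc) * min_alloc

# A 1 KiB S3 object still consumes a full 64 KiB allocation unit,
# and with 3x replication that becomes 192 KiB of raw disk.
print(allocated_bytes(1024))      # 65536
print(allocated_bytes(1024) * 3)  # 196608
```

This rounding is one plausible source of the gap between "size_kb_actual" and the raw usage reported by "ceph df" when a bucket holds many small objects (or many small multipart leftovers).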

[ceph-users] s3 requires twice the space it should use

2021-04-15 Thread Boris Behrens
Hi, maybe it is just a problem in my understanding, but it looks like our s3 requires twice the space it should use. I ran "radosgw-admin bucket stats", added up all "size_kb_actual" values and divided to get TB (/1024/1024/1024). The resulting space is 135.1636733 TB. When I triple it because o

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
90% default limit. > > -- dan > > On Tue, Mar 30, 2021 at 3:18 PM Boris Behrens wrote: > > > > The output from ceph osd pool ls detail tell me nothing, except that the > pgp_num is not where it should be. Can you help me to read the output? How > do I estimate

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
is current splitting should take. This will help: > > ceph status > ceph osd pool ls detail > > -- dan > > On Tue, Mar 30, 2021 at 3:00 PM Boris Behrens wrote: > > > > I would think due to splitting, because the balancer doesn't refuses > it'

[ceph-users] Re: forceful remap PGs

2021-03-30 Thread Boris Behrens
> On 3/30/21 12:55 PM, Boris Behrens wrote: > > I just move one PG away from the OSD, but the diskspace will not get > freed. > > How did you move? I would suggest you use upmap: > > ceph osd pg-upmap-items > Invalid command: missing required parameter pgid() > osd pg

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
y to pause them with > upmap or any other trick. > > -- dan > > On Tue, Mar 30, 2021 at 2:07 PM Boris Behrens wrote: > > > > One week later the ceph is still balancing. > > What worries me like hell is the %USE on a lot of those OSDs. Does ceph > > resolve this on

[ceph-users] Re: should I increase the amount of PGs?

2021-03-30 Thread Boris Behrens
27739 0.95000 7.3 TiB 6.7 TiB 6.7 TiB 322 MiB 16 GiB 548 GiB 92.64 1.18 121 up osd.66 46 hdd 7.27739 1.0 7.3 TiB 6.8 TiB 6.7 TiB 316 MiB 16 GiB 536 GiB 92.81 1.18 119 up osd.46 On Tue, Mar 23, 2021 at 19:59, Boris Behrens wrote: > Good point. Than

[ceph-users] Re: forceful remap PGs

2021-03-30 Thread Boris Behrens
I just moved one PG away from the OSD, but the disk space will not get freed. Do I need to do something to clean obsolete objects from the OSD? On Tue, Mar 30, 2021 at 11:47, Boris Behrens wrote: > Hi, > I have a couple OSDs that currently get a lot of data, and are running > t

[ceph-users] forceful remap PGs

2021-03-30 Thread Boris Behrens
Hi, I have a couple of OSDs that currently receive a lot of data and are running towards a 95% fill rate. I would like to forcefully remap some PGs (they are around 100GB each) to emptier OSDs and drop them from the full OSDs. I know this would lead to degraded objects, but I am not sure how long the cluster
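The remap itself can be done with the upmap interface suggested elsewhere in this thread: `ceph osd pg-upmap-items <pgid> <from-osd> <to-osd> [...]` takes the PG id followed by one or more from/to OSD pairs (the "missing required parameter pgid" error quoted in a reply comes from omitting the PG id). A small sketch that only builds the command line; the PG id and OSD numbers are made up:

```python
def upmap_cmd(pgid: str, moves: list[tuple[int, int]]) -> list[str]:
    """Build a `ceph osd pg-upmap-items` command line for one PG.

    moves is a list of (from_osd, to_osd) pairs; each pair tells the
    OSDMap to place the PG on to_osd instead of from_osd.
    """
    cmd = ["ceph", "osd", "pg-upmap-items", pgid]
    for src, dst in moves:
        cmd += [str(src), str(dst)]
    return cmd

# Move the hypothetical PG 11.45 from a nearly full osd.66 to an emptier osd.3:
print(" ".join(upmap_cmd("11.45", [(66, 3)])))
# ceph osd pg-upmap-items 11.45 66 3
```

Unlike re-weighting, this moves exactly the chosen PG without triggering degraded objects, since the data is backfilled before the old copy is removed.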
