[ceph-users] Re: quincy rgw with ceph orch and two realms only get answers from first realm

2024-05-24 Thread Boris
Fri, 24 May 2024 at 14:19, Boris wrote: > Hi, > > we are currently in the process of adopting the main s3 cluster to > the orchestrator. > We have two realms (one for us and one for the customer). > > The old config worked fine and depending on the port I requested, I

[ceph-users] quincy rgw with ceph orch and two realms only get answers from first realm

2024-05-24 Thread Boris
Hi, we are currently in the process of adopting the main s3 cluster to the orchestrator. We have two realms (one for us and one for the customer). The old config worked fine and, depending on the port I requested, I got a different x-amz-request-id header back: x-amz-request-id:

[ceph-users] Re: purging already destroyed OSD leads to degraded and misplaced objects?

2024-04-04 Thread Boris
Hi Tobias, what we usually do when we want to remove an OSD is to reweight it to 0 in the CRUSH map (see the sketch below). This stops the rebalancing after removing the OSD from the CRUSH map. Setting an OSD to out keeps it weighted in the CRUSH map, and when it gets removed the cluster will rebalance the PGs to reflect
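A minimal sketch of that drain-first removal, with osd.12 as a placeholder for the OSD to retire:
  ceph osd crush reweight osd.12 0          # drain PGs off the OSD first
  # wait until backfill finishes, then remove it without a second rebalance
  ceph osd out osd.12
  systemctl stop ceph-osd@12
  ceph osd purge 12 --yes-i-really-mean-it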

[ceph-users] Re: [quincy 17.2.7] ceph orchestrator not doing anything

2024-01-16 Thread Boris
Good morning Eugen, I just found this thread and saw that I had a test image for rgw in the config. After removing the global and the rgw config value, everything was instantly fine. Cheers and a happy week Boris On Tue, 16 Jan 2024 at 10:20, Eugen Block wrote: > Hi, > > t

[ceph-users] [quincy 17.2.7] ceph orchestrator not doing anything

2024-01-11 Thread Boris
/troubleshooting/#ssh-errors) The logs always show a message like "took the task" but then nothing happens. Cheers Boris

[ceph-users] Re: Assistance Needed with Ceph Cluster Slow Ops Issue

2023-12-06 Thread Boris
e were hit with this issue: Stop snaptrim, update to pacific, do an offline RocksDB compaction before the OSDs start after the upgrade, start the OSDs and hate our lives while they started, wait a week, slowly start the snaptrim and hope for the best. :-) Kind regards - Boris Beh

[ceph-users] Re: Assistance Needed with Ceph Cluster Slow Ops Issue

2023-12-06 Thread Boris
Hi Peter, try to set the cluster to nosnaptrim. If this helps, you might need to upgrade to pacific, because you are hit by the pg dups bug. See: https://www.clyso.com/blog/how-to-identify-osds-affected-by-pg-dup-bug/ Kind regards - Boris Behrens > On 06.12.2023 at 19:01,
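For reference, the snaptrim pause mentioned above is a cluster-wide flag (a sketch; it pauses the symptom, it does not fix the pg dups issue):
  ceph osd set nosnaptrim       # pause snapshot trimming on all OSDs
  ceph osd unset nosnaptrim     # re-enable it after the upgrade/repair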

[ceph-users] RadosGW public HA traffic - best practices?

2023-11-17 Thread Boris Behrens
of RGW instances. So long story short: What are your easy setups to serve public RGW traffic with some sort of HA and LB (without using a big HW LB that is capable of 100GBit traffic)? And have you experienced problems when you do not shift around IP addresses? Cheers Boris

[ceph-users] Re: how to disable ceph version check?

2023-11-07 Thread Boris
You can mute it with "ceph health mute ALERT", where ALERT is the capitalized keyword from "ceph health detail". But I would update ASAP. Cheers Boris > On 08.11.2023 at 02:02, zxcs wrote: > > Hi, Experts, > > we have a ceph cluster report HEALTH_ERR due to mu
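A hedged example of that mute workflow; ALERT_NAME stands for whatever code "ceph health detail" reports:
  ceph health detail                  # find the capitalized alert code
  ceph health mute ALERT_NAME 1w      # silence it, optionally with a TTL
  ceph health unmute ALERT_NAME       # lift the mute once the cluster is updated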

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disks are safe

2023-11-02 Thread Boris Behrens
this in an orchestrator environment) Good luck. On Thu, 2 Nov 2023 at 12:48, Mohamed LAMDAOUAR <mohamed.lamdao...@enyx.fr> wrote: > Hello Boris, > > I have one server monitor up and two other servers of the cluster are also > up (these two servers are not monitors). > I have fo

[ceph-users] Re: Emergency, I lost 4 monitors but all osd disks are safe

2023-11-02 Thread Boris Behrens
Hi Mohamed, are all mons down, or do you still have at least one that is running? AFAIK the mons save their DB on the normal OS disks, and not within the ceph cluster. So if all mons are dead, which means the disks that contained the mon data are unrecoverably dead, you might need to bootstrap a

[ceph-users] Re: RGW access logs with bucket name

2023-10-30 Thread Boris Behrens
Hi Dan, we are currently moving all the logging into lua scripts, so it is not an issue for us anymore. Thanks. PS: the ceph analyzer is really cool. plusplus On Sat, 28 Oct 2023 at 22:03, Dan van der Ster <dan.vanders...@clyso.com> wrote: > Hi Boris, > > I found that

[ceph-users] traffic by IP address / bucket / user

2023-10-18 Thread Boris Behrens
Hi, does someone have a solution ready to monitor traffic by IP address? Cheers Boris

[ceph-users] Re: Autoscaler problems in pacific

2023-10-05 Thread Boris Behrens
ug > reports to improve it. > > Quoting Boris Behrens: > > > Hi, > > I've just upgraded our object storages to the latest pacific version > > (16.2.14) and the autoscaler is acting weird. > > On one cluster it just shows nothing: > > ~# ceph osd pool autoscal

[ceph-users] Re: Autoscaler problems in pacific

2023-10-04 Thread Boris Behrens
Also found what the 2nd problem was: when there are pools using the default replicated_ruleset while there are multiple rulesets with different device classes, the autoscaler does not produce any output. Should I open a bug for that? On Wed, 4 Oct 2023 at 14:36, Boris Behrens wrote

[ceph-users] Re: Autoscaler problems in pacific

2023-10-04 Thread Boris Behrens
Found the bug for the TOO_MANY_PGS: https://tracker.ceph.com/issues/62986 But I am still not sure why I don't have any output on that one cluster. On Wed, 4 Oct 2023 at 14:08, Boris Behrens wrote: > Hi, > I've just upgraded our object storages to the latest pacific version >

[ceph-users] Autoscaler problems in pacific

2023-10-04 Thread Boris Behrens
Hi, I've just upgraded our object storages to the latest pacific version (16.2.14) and the autoscaler is acting weird. On one cluster it just shows nothing: ~# ceph osd pool autoscale-status ~# On the other clusters it shows this when it is set to warn: ~# ceph health detail ... [WRN]

[ceph-users] multiple rgw instances with same cephx key

2023-09-22 Thread Boris Behrens
, that get removed when the RGW shuts down. I don't want to use the orchestrator for this, because I would need to add all the compute nodes to it and there might be other processes in place that add FW rules in our provisioning. Cheers Boris

[ceph-users] ceph orch osd data_allocate_fraction does not work

2023-09-21 Thread Boris Behrens
I have a use case where I want to use only a small portion of the disk for the OSD, and the documentation states that I can use data_allocation_fraction [1]. But cephadm cannot use this and throws this error: /usr/bin/podman: stderr ceph-volume lvm batch: error: unrecognized arguments:

[ceph-users] Re: Make ceph orch daemons reboot safe

2023-09-18 Thread Boris Behrens
-de57f40ba853.target; enabled; vendor preset: enabled) Active: inactive (dead) On Sat, 16 Sep 2023 at 13:29, Boris wrote: > The other hosts are still online and the cluster only lost 1/3 of its > services. > > > > > On 16.09.2023 at 12:53, Eugen Block wrote:

[ceph-users] Re: Make ceph orch daemons reboot safe

2023-09-16 Thread Boris
r > daemons are down. The orchestrator is a mgr module, so that’s a bit weird, > isn’t it? > > Quoting Boris Behrens: > >> Hi Eugen, >> the test-test cluster where we started with simple ceph and the adoption >> went straight forward are working fine. >

[ceph-users] Re: Make ceph orch daemons reboot safe

2023-09-16 Thread Boris Behrens
a node where I had to "play around" a bit with removed and > redeployed osd containers. At some point they didn't react to > systemctl commands anymore, but a reboot fixed that. But I haven't > seen that in a production cluster yet, so some more details would be > useful.

[ceph-users] Make ceph orch daemons reboot safe

2023-09-15 Thread Boris Behrens
cluster and there the pods start very fast. On the legacy test cluster, which got adopted to cephadm, it does not. Cheers Boris

[ceph-users] Re: ceph orchestrator managed daemons do not use authentication (was: ceph orchestrator pulls strange images from docker.io)

2023-09-15 Thread Boris Behrens
none * global advanced auth_service_required none On Fri, 15 Sep 2023 at 13:01, Boris Behrens wrote: > Oh, we found the issue. A very old update was stuck in the pipeline. We > canceled it and then the correct images got

[ceph-users] Re: ceph orchestrator managed daemons do not use authentication (was: ceph orchestrator pulls strange images from docker.io)

2023-09-15 Thread Boris Behrens
.0cc47a6df330@-1(probing) e0 handle_auth_bad_method hmm, they didn't like 2 result (95) Operation not supported I added the mon via: ceph orch daemon add mon FQDN:[IPv6_address] On Fri, 15 Sep 2023 at 09:21, Boris Behrens wrote: > Hi Stefan, > > the cluster is running 17.6.

[ceph-users] Re: ceph orchestrator pulls strange images from docker.io

2023-09-15 Thread Boris Behrens
alling the hosts, but as I have to adopt 17 clusters to the orchestrator, I'd rather get some learnings from the one that isn't working :) On Fri, 15 Sep 2023 at 08:26, Stefan Kooman wrote: > On 14-09-2023 17:49, Boris Behrens wrote: > > Hi, > > I currently try to adopt our

[ceph-users] ceph orchestrator pulls strange images from docker.io

2023-09-14 Thread Boris Behrens
Hi, I am currently trying to adopt our stage cluster, but some hosts just pull strange images. root@0cc47a6df330:/var/lib/containers/storage/overlay-images# podman ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-23 Thread Boris Behrens
ility of both: old and new network, until end of migration > > k > Sent from my iPhone > > > On 22 Aug 2023, at 10:43, Boris Behrens wrote: > > > > The OSDs are still only bound to one IP address. > > -- Die Selbsthilfegruppe "UTF-

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Boris Behrens
IP, > I'm not aware of a way to have them bind to multiple public IPs like > the MONs can. You'll probably need to route the compute node traffic > towards the new network. Please correct me if I misunderstood your > response. > > Quoting Boris Behrens: > > > The OSDs ar

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-22 Thread Boris Behrens
o have both old and new network in there, but I'd try on one > host first and see if it works. > > Quoting Boris Behrens: > > > We're working on the migration to cephadm, but it requires some > > prerequisites that still need planning. > > > > root@host:~#

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-21 Thread Boris Behrens
via cephadm / > > orchestrator. > > I just assumed that with Quincy it already would be managed by > cephadm. So what does the ceph.conf currently look like on an OSD host > (mask sensitive data)? > > Quoting Boris Behrens: > > > Hey Eugen, > > I don't ha

[ceph-users] Re: [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-21 Thread Boris Behrens
tps://www.spinics.net/lists/ceph-users/msg75162.html > [2] > > https://docs.ceph.com/en/quincy/cephadm/services/mon/#moving-monitors-to-a-different-network > > Quoting Boris Behrens: > > > Hi, > > I need to migrate a storage cluster to a new network. > > > > I adde

[ceph-users] [quincy] Migrating ceph cluster to new network, bind OSDs to multiple public_network

2023-08-21 Thread Boris Behrens
Hi, I need to migrate a storage cluster to a new network. I added the new network to the ceph config via: ceph config set global public_network "old_network/64, new_network/64" I've added a set of new mon daemons with IP addresses in the new network and they are added to the quorum and seem to
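A sketch of how to check which network the daemons actually bound to after such a change (osd.12 and the per-OSD restart are assumptions for a non-cephadm deployment):
  ceph config dump | grep public_network
  ceph osd metadata 12 | grep -E 'front_addr|back_addr'    # OSDs only pick up the new network on restart
  systemctl restart ceph-osd@12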

[ceph-users] Re: Upgrading nautilus / centos7 to octopus / ubuntu 20.04. - Suggestions and hints?

2023-08-01 Thread Boris Behrens
are okayish but ugly :D ). And because of the bug, we went another route with the last cluster. I reinstalled all hosts with Ubuntu 18.04, then updated straight to pacific, and then upgraded to Ubuntu 20.04. Hope that helped. Cheers Boris On Tue, 1 Aug 2023 at 20:06, Götz Reinicke wrote

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-07-04 Thread Boris Behrens
Are there any ideas how to work with this? We disabled the logging so we do not run out of disk space, but the rgw daemon still requires A LOT of CPU because of this. On Wed, 21 Jun 2023 at 10:45, Boris Behrens wrote: > I've updated the dc3 site from octopus to pacific and the prob

[ceph-users] Re: list of rgw instances in ceph status

2023-07-03 Thread Boris Behrens
Hi Mahnoosh, that helped. Thanks a lot! On Mon, 3 Jul 2023 at 13:46, mahnoosh shahidi <mahnooosh@gmail.com> wrote: > Hi Boris, > > You can list your rgw daemons with the following command > > ceph service dump -f json-pretty | jq '.services.rgw.daemons' >

[ceph-users] list of rgw instances in ceph status

2023-07-03 Thread Boris Behrens
but we are not at the stage that we are going to implement the orchestrator yet. Cheers Boris

[ceph-users] Re: device class for nvme disk is ssd

2023-06-29 Thread Boris Behrens
So basically it does not matter unless I want to have that split up. Thanks for all the answers. I am still lobbying to phase out SATA SSDs and replace them with NVMe disks. :) On Wed, 28 Jun 2023 at 18:14, Anthony D'Atri <a...@dreamsnake.net> wrote: > Even when you factor in density,
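If you do want NVMe split out from SATA SSDs later, the device class can be overridden manually; a sketch, with osd.12 as a placeholder for an NVMe-backed OSD:
  ceph osd crush rm-device-class osd.12
  ceph osd crush set-device-class nvme osd.12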

[ceph-users] device class for nvme disk is ssd

2023-06-28 Thread Boris Behrens
"nvme0n1", "distro": "ubuntu", "distro_description": "Ubuntu 20.04.6 LTS", "distro_version": "20.04", ... "journal_rotational": "0", "kernel_description": "#169-Ubuntu SMP Tue Jun 6 22:23:09 UTC 2023", "kernel_version": "5.4.0-152-generic", "mem_swap_kb": "0", "mem_total_kb": "196668116", "network_numa_unknown_ifaces": "back_iface,front_iface", "objectstore_numa_node": "0", "objectstore_numa_nodes": "0", "os": "Linux", "osd_data": "/var/lib/ceph/osd/ceph-0", "osd_objectstore": "bluestore", "osdspec_affinity": "", "rotational": "0" } Cheers Boris ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-21 Thread Boris Behrens
I've updated the dc3 site from octopus to pacific and the problem is still there. I find it very weird that it only happens from one single zonegroup to the master and not from the other two. On Wed, 21 Jun 2023 at 01:59, Boris Behrens wrote: > I recreated the site and the problem st

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris Behrens
a-861b-78f0adeba8ad.2297866866.29 On Tue, 20 Jun 2023 at 19:29, Boris wrote: > Hi Casey, > already did restart all RGW instances. Only helped for 2 minutes. We now > stopped the new site. > > I will remove and recreate it later. > As two other sites don't have the problem

[ceph-users] Re: radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris
Hi Casey, I already restarted all RGW instances. It only helped for 2 minutes. We now stopped the new site. I will remove and recreate it later. As two other sites don't have the problem, I currently think I made a mistake in the process. Kind regards - Boris Behrens >

[ceph-users] radosgw new zonegroup hammers master with metadata sync

2023-06-20 Thread Boris Behrens
Hi, yesterday I added a new zonegroup and it seems to cycle over the same requests over and over again. In the log of the main zone I see these requests: 2023-06-20T09:48:37.979+ 7f8941fb3700 1 beast: 0x7f8a602f3700: fd00:2380:0:24::136 - - [2023-06-20T09:48:37.979941+]

[ceph-users] Re: Bucket empty after resharding on multisite environment

2023-04-27 Thread Boris Behrens
E:OLD_BUCKET_ID < bucket.instance:BUCKET_NAME:NEW_BUCKET_ID.json On Thu, 27 Apr 2023 at 13:32, Boris Behrens wrote: > To clarify a bit: > The bucket data is not in the main zonegroup. > I wanted to start the reshard in the zonegroup where the bucket and the > data is located, but rgw told me to

[ceph-users] Re: Bucket empty after resharding on multisite environment

2023-04-27 Thread Boris Behrens
les On Thu, 27 Apr 2023 at 13:08, Boris Behrens wrote: > Hi, > I just resharded a bucket on an octopus multisite environment from 11 to > 101. > > I did it on the master zone and it went through very fast. > But now the index is empty. > > The files are still there

[ceph-users] Bucket empty after resharding on multisite environment

2023-04-27 Thread Boris Behrens
Hi, I just resharded a bucket on an octopus multisite environment from 11 to 101. I did it on the master zone and it went through very fast. But now the index is empty. The files are still there when doing a radosgw-admin bucket radoslist --bucket-id Do I just need to wait or do I need to

[ceph-users] Re: How to find the bucket name from Radosgw log?

2023-04-27 Thread Boris Behrens
Cheers Dan, would it be an option to enable the ops log? I still haven't figured out how it actually works. But I am also thinking about moving to log parsing in HAProxy and disabling the access log on the RGW instances. On Wed, 26 Apr 2023 at 18:21, Dan van der Ster <

[ceph-users] Re: Veeam backups to radosgw seem to be very slow

2023-04-27 Thread Boris Behrens
Thanks Janne, I will hand that to the customer. > Look at https://community.veeam.com/blogs-and-podcasts-57/sobr-veeam > -capacity-tier-calculations-and-considerations-in-v11-2548 > for "extra large blocks" to make them 8M at least. > We had one Veeam installation vomit millions of files onto our

[ceph-users] Veeam backups to radosgw seem to be very slow

2023-04-25 Thread Boris Behrens
om the s3cmd/aws cli standpoint. Has anyone here ever experienced Veeam problems with rgw? Cheers Boris

[ceph-users] Re: radosgw-admin bucket stats doesn't show real num_objects and size

2023-04-11 Thread Boris Behrens
I don't think you can exclude that. We've built a notification in the customer panel that there are incomplete multipart uploads, which will be added to the bill as used space. We also added a button to create an LC policy for these objects. On Tue, 11 Apr 2023 at 19:07, wrote: > The
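A hedged example of such an LC rule, here applied with the AWS CLI (endpoint, bucket name and the 7-day window are placeholders):
  aws --endpoint-url https://s3.example.com s3api put-bucket-lifecycle-configuration \
    --bucket BUCKET \
    --lifecycle-configuration '{"Rules":[{"ID":"abort-incomplete-mpu","Status":"Enabled","Filter":{"Prefix":""},"AbortIncompleteMultipartUpload":{"DaysAfterInitiation":7}}]}'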

[ceph-users] Re: RGW can't create bucket

2023-03-31 Thread Boris Behrens
x_buckets": 1000, and those users have the same access_denied issue > when creating a bucket. > > We also tried other bucket names and it is the same issue. > > On Thu, Mar 30, 2023 at 6:28 PM Boris Behrens wrote: > >> Hi Kamil, >> is this with all new buckets o

[ceph-users] Re: OSD down causes all OSD slow ops

2023-03-30 Thread Boris Behrens
Hi, you might suffer from the same bug we suffered: https://tracker.ceph.com/issues/53729 https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/KG35GRTN4ZIDWPLJZ5OQOKERUIQT5WQ6/#K45MJ63J37IN2HNAQXVOOT3J6NTXIHCA Basically there is a bug that prevents the removal of PGlog items. You need

[ceph-users] Re: Excessive occupation of small OSDs

2023-03-30 Thread Boris Behrens
Hi Nicola, can you send the output of 'ceph osd df tree' and 'ceph df'? Cheers Boris On Thu, 30 Mar 2023 at 16:36, Nicola Mori wrote: > Dear Ceph users, > > my cluster is made up of 10 old machines, with an uneven number of disks and > disk sizes. Essentially I have just one big da

[ceph-users] Re: RGW can't create bucket

2023-03-30 Thread Boris Behrens
Hi Kamil, is this with all new buckets or only the 'test' bucket? Maybe the name is already taken? Can you check with s3cmd --debug whether you are connecting to the correct endpoint? Also I see that the user seems to not be allowed to create buckets ... "max_buckets": 0, ... Cheers Boris
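If max_buckets really is the culprit, it can be raised per user; a sketch with placeholder values:
  radosgw-admin user info --uid=USER | grep max_buckets
  radosgw-admin user modify --uid=USER --max-buckets=1000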

[ceph-users] Re: RGW access logs with bucket name

2023-03-30 Thread Boris Behrens
frastructure Engineer > --- > Agoda Services Co., Ltd. > e: istvan.sz...@agoda.com > --- > > On 2023. Mar 30., at 17:44, Boris Behrens wrote: > > Email received from the internet. If in doubt, d

[ceph-users] Re: RGW access logs with bucket name

2023-03-30 Thread Boris Behrens
Bringing up that topic again: is it possible to log the bucket name in the rgw client logs? Currently I only get to know the bucket name when someone accesses the bucket via https://TLD/bucket/object instead of https://bucket.TLD/object. On Tue, 3 Jan 2023 at 10:25, Boris Behrens wrote

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-30 Thread Boris Behrens
. After idling over night it is back up to 120 IOPS. On Thu, 30 Mar 2023 at 09:45, Boris Behrens wrote: > After some digging in the nautilus cluster I see that the disks with the > exceptionally high IOPS performance are actually SAS-attached NVMe disks > (these

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-30 Thread Boris Behrens
(4h resolution) goes up again (2023-03-01 upgrade to pacific, the dip around the 25th was the redeploy, and now it seems to go up again) [image: image.png] On Mon, 27 Mar 2023 at 17:24, Igor Fedotov <igor.fedo...@croit.io> wrote: > > On 3/27/2023 12:19 PM, Boris Behrens wrote:

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Boris Behrens
stay at the current level or degrade over time. On Mon, 27 Mar 2023 at 13:42, Igor Fedotov <igor.fedo...@croit.io> wrote: > Hi Boris, > > I wouldn't recommend taking absolute "osd bench" numbers too seriously. > It's definitely not a full-scale quality

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-27 Thread Boris Behrens
? @marc If I interpret the linked bug correctly, you might want to have the metadata on an SSD, because the write amplification might hit very hard on HDDs. But maybe someone else from the mailing list can say more about it. Cheers Boris On Wed, 22 Mar 2023 at 22:45, Boris Behrens wrote: >

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-22 Thread Boris Behrens
Cluster before recreating OSDs: osd.40 = 8TB, osd.162 = 2TB https://pastebin.com/wKMmSW9T Pacific cluster after recreating OSDs: osd.40 = 8TB, osd.162 = 2TB https://pastebin.com/80eMwwBW On Wed, 22 Mar 2023 at 11:09, Igor Fedotov <igor.fedo...@croit.io> wrote: > Hi Boris, > > f

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-22 Thread Boris Behrens
Might be. Josh also pointed in that direction. I am currently searching for ways to mitigate it. On Wed, 22 Mar 2023 at 10:30, Konstantin Shalygin <k0...@k0ste.ru> wrote: > Hi, > > > Maybe [1] ? > > > [1] https://tracker.ceph.com/issues/58530 > k > > On

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-21 Thread Boris Behrens
>=5. On Tue, 21 Mar 2023 at 10:46, Igor Fedotov <igor.fedo...@croit.io> wrote: > Hi Boris, > > additionally you might want to manually compact RocksDB for every OSD. > > > Thanks, > > Igor > On 3/21/2023 12:22 PM, Boris Behrens
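The manual RocksDB compaction Igor refers to can be done per OSD; a sketch, with osd.12 as a placeholder:
  ceph tell osd.12 compact                                          # online compaction
  # or offline, with the OSD stopped:
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact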

[ceph-users] Re: Changing os to ubuntu from centos 8

2023-03-21 Thread Boris Behrens
he configs. Afterwards I just start all the OSDs (ceph-volume lvm activate --all) and wait for the cluster to become green again. Cheers Boris On Tue, 21 Mar 2023 at 08:54, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote: > Hi, > > I'd like to change the os to ubuntu 2
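A sketch of that reinstall workflow (assuming ceph-volume managed, non-cephadm OSDs):
  ceph osd set noout                  # before taking the host down
  # reinstall the OS, restore /etc/ceph and the bootstrap-osd keyring, then:
  ceph-volume lvm activate --all      # brings all OSDs on the host back up
  ceph osd unset noout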

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-03-21 Thread Boris Behrens
? Cheers Boris On Tue, 28 Feb 2023 at 22:46, Boris Behrens wrote: > Hi Josh, > thanks a lot for the breakdown and the links. > I disabled the write cache but it didn't change anything. Tomorrow I will > try to disable bluefs_buffered_io. > > It doesn't sound like I can mi

[ceph-users] Re: radosgw SSE-C is not working (InvalidRequest)

2023-03-17 Thread Boris Behrens
Ha, found the error and now I feel just a tiny bit stupid: haproxy did not add the X-Forwarded-Proto header. On Fri, 17 Mar 2023 at 12:03, Boris Behrens wrote: > Hi, > I am trying to evaluate SSE-C (so the customer provides keys) for our object > storages. > We do not provide
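For reference, a minimal haproxy sketch of the missing piece (certificate path and backend name are placeholders):
  frontend rgw_https
      bind :443 ssl crt /etc/haproxy/certs/rgw.pem
      http-request set-header X-Forwarded-Proto https
      default_backend rgw_backend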

[ceph-users] radosgw SSE-C is not working (InvalidRequest)

2023-03-17 Thread Boris Behrens
ample and is willing to share it with me, or has also encountered this problem and knows what to do? It's an octopus cluster. Cheers Boris -- The self-help group "UTF-8-Probleme" is meeting in the big hall this time, as an exception.

[ceph-users] Re: Concerns about swap in ceph nodes

2023-03-16 Thread Boris Behrens
Maybe worth mentioning, because it caught me by surprise: Ubuntu creates a swap file (/swap.img) if you do not specify a swap partition (check /etc/fstab). Cheers Boris On Wed, 15 Mar 2023 at 22:11, Anthony D'Atri <a...@dreamsnake.net> wrote: > > With CentOS/Rocky 7-8 I
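A quick way to check for and disable that swap file, if you do not want it (a sketch):
  swapon --show          # shows /swap.img if it is active
  swapoff /swap.img
  # then remove or comment the /swap.img line in /etc/fstab to make it permanent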

[ceph-users] radosgw - octopus - 500 Bad file descriptor on upload

2023-03-09 Thread Boris Behrens
e had >20k PUT requests that were working fine on other buckets. This error also seems to happen to other buckets, but only very sporadically. Did someone encounter this issue or know what it could be? Cheers Boris

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Boris Behrens
aer...@digitalocean.com>: > Hi Boris, > > OK, what I'm wondering is whether > https://tracker.ceph.com/issues/58530 is involved. There are two > aspects to that ticket: > * A measurable increase in the number of bytes written to disk in > Pacific as compared to Nautilus > *

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Boris Behrens
Hi Josh, we upgraded 15.2.17 -> 16.2.11 and we only run an RBD workload. On Tue, 28 Feb 2023 at 15:00, Josh Baergen <jbaer...@digitalocean.com> wrote: > Hi Boris, > > Which version did you upgrade from and to, specifically? And what > workload are you running

[ceph-users] avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Boris Behrens
Hi, today I did the first update from octopus to pacific, and it looks like the avg apply latency went up from 1ms to 2ms. All 36 OSDs are 4TB SSDs and nothing else changed. Does someone know if this is an issue, or am I just missing a config value? Cheers Boris

[ceph-users] Re: growing osd_pglog_items (was: increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior)

2023-02-23 Thread Boris Behrens
. Is there anything I can do with an octopus cluster, or is upgrading the only way? And why does it happen? On Tue, 21 Feb 2023 at 18:31, Boris Behrens wrote: > Thanks a lot Josh. That really seems like my problem. > That does not look healthy in the cluster. oof. > ~# ceph tell osd.* perf d

[ceph-users] Re: increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior

2023-02-21 Thread Boris Behrens
"osd_pglog_bytes": 541849048, "osd_pglog_items": 3880437, ... Am Di., 21. Feb. 2023 um 18:21 Uhr schrieb Josh Baergen < jbaer...@digitalocean.com>: > Hi Boris, > > This sounds a bit like https://tracker.ceph.com/issues/53729. > https://tracker.c

[ceph-users] increasing PGs OOM kill SSD OSDs (octopus) - unstable OSD behavior

2023-02-21 Thread Boris Behrens
e up (the 8TB OSDs take around 10 minutes with "load_pgs"). I am unsure what I can do to restore normal cluster performance. Any ideas or suggestions or maybe even known bugs? Maybe a line I can search for in the logs. Cheers Boris

[ceph-users] Re: Very slow snaptrim operations blocking client I/O

2023-02-20 Thread Boris Behrens
Hi, we've encountered the same issue after upgrading to octopus on one of our RBD clusters, and now it reappears after the autoscaler lowered the PGs from 8k to 2k for the RBD pool. What we've done in the past: - recreate all OSDs after our 2nd incident with slow OPS in a single week after the ceph

[ceph-users] Re: [RGW - octopus] too many omapkeys on versioned bucket

2023-02-13 Thread Boris Behrens
I've tried it the other way around and let cat print all escaped chars and then did the grep: # cat -A omapkeys_list | grep -aFn '/' 9844:/$ 9845:/^@v913^@$ 88010:M-^@1000_/^@$ 128981:M-^@1001_/$ Did anyone ever see something like this? On Mon, 13 Feb 2023 at 14:31, Boris Behrens wrote

[ceph-users] Re: [RGW - octopus] too many omapkeys on versioned bucket

2023-02-13 Thread Boris Behrens
rminal) <80>1000_//^@ Any idea what this is? On Mon, 13 Feb 2023 at 13:57, Boris Behrens wrote: > Hi, > I have one bucket that showed up with a large omap warning, but the amount > of objects in the bucket does not align with the amount of omap keys. The > buck

[ceph-users] [RGW - octopus] too many omapkeys on versioned bucket

2023-02-13 Thread Boris Behrens
p -F '"idx":' | wc -l) I have tried to fix it with radosgw-admin bucket check --check-objects --fix --bucket BUCKET but this did not change anything. Is this a known bug or might there be something else going on. How can I investigate further? Cheers Boris -- Die Selbsthilfegruppe "UT

[ceph-users] Re: Migrate a bucket from replicated pool to ec pool

2023-02-13 Thread Boris Behrens
Hi Casey, changes to the user's default placement target/storage class don't > apply to existing buckets, only newly-created ones. a bucket's default > placement target/storage class can't be changed after creation > so I can easily update the placement rules for this user and can migrate

[ceph-users] Migrate a bucket from replicated pool to ec pool

2023-02-11 Thread Boris Behrens
to the existing user data? How do I move the existing data to the new pool? Does it somehow interfere with ongoing data uploads (it is one internal user, with 800 buckets which constantly get new data added and old data removed)? Cheers Boris PS: Can't wait to see some of you at the Cephalocon :) -- Die

[ceph-users] Re: PG_BACKFILL_FULL

2023-01-16 Thread Boris Behrens
data to disks that are free enough (e.g. `ceph osd reweight osd.60 0.9`) - if you have enough capacity in the cluster (577+ OSDs should be able to take that :) ) Cheers Boris On Mon, 16 Jan 2023 at 15:01, Iztok Gregori <iztok.greg...@elettra.eu> wrote: > Hi to al

[ceph-users] RGW - large omaps even when buckets are sharded

2023-01-16 Thread Boris Behrens
Hi, since last week the scrubbing results in a large omap warning. After some digging I've got these results: # searching for indexes with large omaps: $ for i in `rados -p eu-central-1.rgw.buckets.index ls`; do rados -p eu-central-1.rgw.buckets.index listomapkeys $i | wc -l | tr -d '\n' >>

[ceph-users] radosgw ceph.conf question

2023-01-13 Thread Boris Behrens
ld be very nice if someone could help me clarify this. Cheers and happy weekend Boris

[ceph-users] Octopus RGW large omaps in usage

2023-01-10 Thread Boris Behrens
Hi, I am currently trying to figure out how to resolve the "large objects found in pool 'rgw.usage'" error. In the past I trimmed the usage log, but now I am at the point that I need to trim it down to two weeks. I checked the amount of omap keys and the distribution is quite off: # for OBJECT
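The trim itself is a one-liner once you know the window you want to keep; the dates below are placeholders:
  radosgw-admin usage trim --start-date=2022-01-01 --end-date=2022-12-27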

[ceph-users] Re: docs.ceph.com -- Do you use the header navigation bar? (RESPONSES REQUESTED)

2023-01-09 Thread Boris Behrens
I actually do not mind if I need to scroll up a line, but I also think it is a good idea to remove it. On Mon, 9 Jan 2023 at 11:06, Frank Schilder wrote: > > Hi John, > > firstly, image attachments are filtered out by the list. How about you upload > the image somewhere like

[ceph-users] Re: rgw - unable to remove some orphans

2023-01-03 Thread Boris Behrens
Hi Andrei, happy new year to you too. The file might already be removed. You can check if the rados object is there with `rados -p ls ...` You can also check if the file is still in the bucket with `radosgw-admin bucket radoslist --bucket BUCKET` Cheers Boris On Tue, 3 Jan 2023 at 13:47
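A sketch of those two checks; pool, object and bucket names are placeholders:
  rados -p default.rgw.buckets.data ls | grep OBJECT_NAME
  radosgw-admin bucket radoslist --bucket=BUCKET | grep OBJECT_NAME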

[ceph-users] RGW access logs with bucket name

2023-01-03 Thread Boris Behrens
Hi, I am looking to move our logs from /var/log/ceph/ceph-client...log to our log aggregator. Is there a way to have the bucket name in the log file? Or can I write the ops log (rgw_enable_ops_log) to a file? Maybe I could work with this. Cheers and happy new year Boris

[ceph-users] Re: How to shutdown a ceph node

2022-12-31 Thread Boris
tri : > > Are you using size=3 replication and failure domain = host? If so you’ll be > ok. > We see folks sometimes using an EC profile that will result in PGs down, > especially with such a small cluster. > >> On Dec 31, 2022, at 4:11 AM, Boris wrote: >> >>

[ceph-users] Re: How to shutdown a ceph node

2022-12-31 Thread Boris
Hi, I usually do 'ceph osd set noout' and 'ceph osd set norebalance' and then shut down the OS normally. After everything is done I unset both values and let the objects recover. Cheers and happy new year. > On 31.12.2022 at 08:52, Bülent ŞENGÜLER wrote: > > Hello, > > I have a ceph
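For reference, the flag dance described above (a sketch):
  ceph osd set noout
  ceph osd set norebalance
  # ... shut the node down, do the maintenance, boot it again ...
  ceph osd unset norebalance
  ceph osd unset noout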

[ceph-users] Re: nautilus mgr dies when the balancer runs

2022-12-13 Thread Boris
After some manual rebalancing, all PGs went into a clean state and I was able to start the balancer again. ¯\_(ツ)_/¯ > On 14.12.2022 at 01:18, Boris Behrens wrote: > > Hi, > we had an issue with an old cluster, where we put disks from one host > to another. > We d

[ceph-users] Re: octopus (15.2.16) OSDs crash or don't answer heartbeats (and get marked as down)

2022-12-13 Thread Boris Behrens
rgw_enable_gc_threads = false [client.gc-host1] rgw_frontends = "beast endpoint=[::1]:7489" rgw_enable_gc_threads = true On Wed, 14 Dec 2022 at 01:14, Jakub Jaszewski wrote: > > Hi Boris, many thanks for the link! > > I see that the GC list keeps growing on my cluster and t
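To see whether the garbage collector is actually falling behind, the GC queue can be inspected and drained manually; a sketch:
  radosgw-admin gc list --include-all | head
  radosgw-admin gc process --include-all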

[ceph-users] nautilus mgr dies when the balancer runs

2022-12-13 Thread Boris Behrens
Hi, we had an issue with an old cluster, where we put disks from one host to another. We destroyed the disks and added them as new OSDs, but since then the mgr daemons were restarting in 120s intervals. I tried to debug it a bit, and it looks like the balancer is the problem. I tried to disable it
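A sketch of how to toggle the balancer while debugging a crashing mgr:
  ceph balancer status
  ceph balancer off      # stop the balancer module from generating new plans
  ceph balancer on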

[ceph-users] Re: radosgw - limit maximum file size

2022-12-09 Thread Boris Behrens
nuts.[2]) [1]: https://tracker.ceph.com/issues/53585 [2]: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/5XSUELNB64VTKRYRN6TXB5CU7VITPBVP/ On Fri, 9 Dec 2022 at 11:45, Eric Goirand wrote: > Hello Boris, > > I think you may be looking for these RGW daemon p

[ceph-users] radosgw - limit maximum file size

2022-12-09 Thread Boris Behrens
. Cheers Boris -- The self-help group "UTF-8-Probleme" is meeting in the big hall this time, as an exception.

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-09 Thread Boris Behrens
Hello everyone, @Alex: I am not sure what to look for in /sys/block//device. There are a lot of files. Is there anything I should check in particular? You have sysfs access in /sys/block//device - this will show a lot > of settings. You can go to this directory on CentOS vs. Ubuntu, and see if >

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Boris Behrens
< icepic...@gmail.com>: > Perhaps run "iostat -xtcy 5" on the OSD hosts to > see if any of the drives have weirdly high utilization despite low > iops/requests? > > > On Tue, 6 Dec 2022 at 10:02, Boris Behrens wrote: > > Hi Sven, > > I am

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-06 Thread Boris Behrens
Sven Kieske wrote: > On Sat, 2022-12-03 at 01:54 +0100, Boris Behrens wrote: > > hi, > > maybe someone here can help me to debug an issue we faced today. > > > > Today one of our clusters came to a grinding halt with 2/3 of our OSDs > > reporting slow ops. >

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-04 Thread Boris Behrens
@Marius: no swap at all. I'd rather buy more memory than use swap :) On Sun, 4 Dec 2022 at 20:10, Marius Leustean <marius.l...@gmail.com> wrote: > Hi Boris > > Do you have swap enabled on any of the OSD hosts? That may slow down > RocksDB drastically. > > On Sun

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-04 Thread Boris Behrens
ago. (sync out, destroy, wipe, create, sync in). Maybe I will reinstall with Ubuntu 20.04 (currently CentOS 7) for a newer kernel. On Sun, 4 Dec 2022 at 19:58, Alex Gorbachev <a...@iss-integration.com> wrote: > Hi Boris, > > These waits seem to be all over the place. Usually, i
