[ceph-users] Re: Clients failing to advance oldest client?

2024-03-25 Thread David Yang
You can use the "ceph health detail" command to see which clients are not responding.
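For reference, a minimal sketch of that workflow (the filesystem name "cephfs" and rank 0 are placeholders; adjust to your setup):

    $ ceph health detail                    # lists the client/session IDs behind the warning
    $ ceph tell mds.cephfs:0 session ls     # maps those client IDs to addresses and client metadata

The session listing includes the client address and hostname metadata, which is usually enough to find the machine to remount or evict.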

[ceph-users] Re: Clients failing to advance oldest client?

2024-03-25 Thread Erich Weiler
Ok! Thank you. Is there a way to tell which client is slow?
> On Mar 25, 2024, at 9:06 PM, David Yang wrote:
> It is recommended to disconnect the client first and then observe whether the cluster's slow requests recover.
> Erich Weiler wrote on Tue, Mar 26, 2024 at 05:02:
>> Hi Y'all,

[ceph-users] Re: Clients failing to advance oldest client?

2024-03-25 Thread David Yang
It is recommended to disconnect the client first and then observe whether the cluster's slow requests recover.
Erich Weiler wrote on Tue, Mar 26, 2024 at 05:02:
> Hi Y'all,
> I'm seeing this warning via 'ceph -s' (this is on Reef):
> # ceph -s
>   cluster:
>     id:

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2024-03-25 Thread Marc
> "complexity, OMG!!!111!!!" is not enough of a statement. You have to explain what complexity you gain and what complexity you reduce.
> Installing SeaweedFS consists of the following: `cd seaweedfs/weed && make install`
> This is the type of problem that Ceph is trying to solve, and starting

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Torkil Svensgaard
On 25-03-2024 23:07, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 10:58:24PM +0100, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 09:28:01PM +0100, Torkil Svensgaard wrote:
My tally came to 412 out of 539 OSDs showing up in a blocked_by list and that is about every OSD with data prior

[ceph-users] put bucket notification configuration - access denied

2024-03-25 Thread Giada Malatesta
Hello everyone, we are facing a problem regarding the S3 operation put bucket notification configuration. We are using Ceph version 17.2.6. We are trying to configure buckets in our cluster so that a notification message is sent via the amqps protocol when the content of the bucket changes. To
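For context, bucket notifications in RGW are configured in two steps: a topic is created through the SNS-compatible API with the broker endpoint, and the bucket is then pointed at that topic with put-bucket-notification-configuration. A rough sketch with the AWS CLI (the endpoint, broker, bucket name and topic ARN zonegroup below are hypothetical):

    $ aws --endpoint-url https://rgw.example.com sns create-topic --name bucket-events \
          --attributes '{"push-endpoint":"amqps://user:pass@broker.example.com:5671","amqp-exchange":"ceph","amqp-ack-level":"broker"}'
    $ aws --endpoint-url https://rgw.example.com s3api put-bucket-notification-configuration \
          --bucket mybucket \
          --notification-configuration '{"TopicConfigurations":[{"Id":"notif1","TopicArn":"arn:aws:sns:default::bucket-events","Events":["s3:ObjectCreated:*","s3:ObjectRemoved:*"]}]}'

An AccessDenied on the second call typically points at the credentials used: the caller generally has to own the bucket or be granted the relevant permissions via bucket/topic policy.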

[ceph-users] ceph RGW reply "ERROR: S3 error: 404 (NoSuchKey)" but rgw object metadata exist

2024-03-25 Thread xuchenhuig
Hi, My Ceph cluster has 9 nodes for Ceph Object Store. Recently I have experienced data loss where 's3cmd get xxx' replies 404 (NoSuchKey). However, I can still get metadata info with 's3cmd ls xxx'. The RGW object is above 1 GB in size and consists of many multipart parts. Running 'rados -p
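When an S3 GET returns NoSuchKey but the object still lists, one way to narrow it down is to compare the RGW manifest with what actually exists in RADOS. A sketch, assuming the default data pool name (bucket/object names are placeholders):

    $ radosgw-admin object stat --bucket=mybucket --object=bigfile     # head object and multipart manifest
    $ radosgw-admin bucket radoslist --bucket=mybucket                 # RADOS objects the bucket should map to
    $ rados -p default.rgw.buckets.data ls | grep <object-marker>      # check whether the tail/part objects are still there

Missing tail objects for a multipart upload would explain the metadata (ls) succeeding while the data read fails.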

[ceph-users] Quincy/Dashboard: Object Gateway not accessible after applying self-signed cert to rgw service

2024-03-25 Thread stephan . budach
Hi, I am running a Ceph cluster and configured RGW for S3, initially w/o SSL. The service works nicely and I updated the service using SSL certs, signed by our own CA, just as I already did for the dashboard itself. However, as soon as I applied the new config, the dashboard wasn't able to
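If the dashboard stops talking to RGW because RGW now presents a certificate signed by an internal CA, one workaround people use is telling the dashboard not to verify the RGW certificate (or, better, making the CA trusted on the mgr hosts). A sketch, to be adapted to your environment:

    $ ceph dashboard set-rgw-api-ssl-verify False
    $ ceph mgr module disable dashboard && ceph mgr module enable dashboard   # restart the dashboard module

Disabling verification is only a stop-gap; adding your CA to the mgr hosts' trust store is the cleaner fix.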

[ceph-users] Re: Mounting A RBD Via Kernal Modules

2024-03-25 Thread Alwin Antreich
Hi,
March 24, 2024 at 8:19 AM, "duluxoz" wrote:
> Hi,
> Yeah, I've been testing various configurations since I sent my last email - all to no avail.
> So I'm back to the start with a brand new 4T image which is rbd-mapped to /dev/rbd0.
> It's not formatted (yet) and so not
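For completeness, once the image is mapped the usual next step is simply to put a filesystem on the block device and mount it; a minimal sketch (filesystem choice and mount point are arbitrary):

    # mkfs.xfs /dev/rbd0
    # mkdir -p /mnt/rbd0 && mount /dev/rbd0 /mnt/rbd0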

[ceph-users] Linux Laptop Losing CephFS mounts on Sleep/Hibernate

2024-03-25 Thread matthew
Hi All, So I've got a Ceph Reef Cluster (latest version) with a CephFS system set up with a number of directories on it. On a Laptop (running Rocky Linux (latest version)) I've used fstab to mount a number of those directories - all good, everything works, happy happy joy joy! :-) However,
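For comparison, a kernel-mount fstab entry along these lines is commonly used; the mon addresses, path, user and secret file are placeholders, and options like nofail and recover_session=clean are there to make suspend/resume less painful:

    mon1,mon2,mon3:/some/dir  /mnt/some/dir  ceph  name=laptopuser,secretfile=/etc/ceph/laptopuser.secret,noatime,_netdev,nofail,recover_session=clean  0 0

If the client gets blocklisted while the laptop sleeps, recover_session=clean lets the kernel client re-establish its session instead of leaving the mount hung.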

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Torkil Svensgaard
On 25-03-2024 22:58, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 09:28:01PM +0100, Torkil Svensgaard wrote:
My tally came to 412 out of 539 OSDs showing up in a blocked_by list and that is about every OSD with data prior to adding ~100 empty OSDs.
How 400 read targets and 100 write

[ceph-users] Mounting A RBD Image via Kernal Modules

2024-03-25 Thread matthew
Hi All, I'm looking for a bit of advice on the subject of this post. I've been "staring at the trees so long I can't see the forest any more". :-) Rocky Linux Client latest version. Ceph Reef latest version. I have read *all* the doco on the Ceph website. I have created a pool (my_pool) and
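As a rough sketch of the kernel-client path (the image and client names are hypothetical):

    $ rbd create my_pool/my_image --size 4T
    $ rbd feature disable my_pool/my_image object-map fast-diff deep-flatten   # only if the kernel rejects these features
    $ rbd map my_pool/my_image --id my_client
    $ rbd device list                                                          # shows the /dev/rbdX it was mapped to

Older kernels refuse to map images with newer features enabled, which is a common stumbling block here.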

[ceph-users] Re: Why a lot of pgs are degraded after host(+osd) restarted?

2024-03-25 Thread jaemin joo
I understood the mechanism better thanks to your answer. I'm using erasure coding and the backfilling step took quite a long time :( If there were just a lot of PGs peering, I think that would be reasonable, but I was curious why there were a lot of backfill_wait instead of peering (e.g. pg 9.5a is stuck undersized
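A quick way to see why a particular PG is waiting rather than peering is to query it directly; a sketch (jq is only there for readability):

    $ ceph pg 9.5a query | jq '.recovery_state'

The recovery_state section shows whether the PG is waiting for a backfill reservation, for a peer, or for something else.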

[ceph-users] Cephadm host keeps trying to set osd_memory_target to less than minimum

2024-03-25 Thread mads2a
I have a virtual Ceph cluster running 17.2.6 with 4 Ubuntu 22.04 hosts in it, each with 4 OSDs attached. The first 2 servers, hosting the mgrs, have 32 GB of RAM each, and the remaining have 24 GB. For some reason I am unable to identify, the first host in the cluster appears to constantly be trying
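A few commands that help pin down where the value is coming from and stop the autotuner from fighting you; the OSD id, host name and target value below are placeholders:

    $ ceph config show osd.0 osd_memory_target                      # effective value and its source
    $ ceph config set osd.0 osd_memory_target_autotune false        # stop cephadm autotuning for this OSD
    $ ceph config set osd/host:host1 osd_memory_target 2147483648   # explicit per-host override

On memory-constrained hosts it can also be worth lowering mgr/cephadm/autotune_memory_target_ratio so the autotuned value stays above the OSD's minimum.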

[ceph-users] #1359 (update) Ceph filesystem failure | Ceph filesystem probleem

2024-03-25 Thread Postmaster C (Simon)
## Update 2024-03-19 Good news: we are now copying data out of CephFS. We were able to mount the filesystem again with help from 42on, our support partner. We are copying the data now and it appears to be going well. At any moment we could run into problematic metadata

[ceph-users] S3 Partial Reads from Erasure Pool

2024-03-25 Thread E3GH75
I am dealing with a cluster that is having terrible performance with partial reads from an erasure-coded pool. Warp tests and s3bench tests result in acceptable performance, but when the application hits the data, performance plummets. Can anyone clear this up for me: when radosgw gets a partial

[ceph-users] Ceph Dashboard Clear Cache

2024-03-25 Thread ashar . khan
Hello Ceph members, How do I clear the Ceph dashboard cache? Kindly guide me on how to do this. Thanks

[ceph-users] Re: Upgrading from Pacific to Quincy fails with "Unexpected error"

2024-03-25 Thread Aaron Moate
We were having this same error; after some troubleshooting it turned out that the 17.2.7 cephadm orchestrator's SSH client was choking on the keyboard-interactive AuthenticationMethod (which is really PAM). Our sshd configuration was: AuthenticationMethods keyboard-interactive
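For anyone hitting the same thing, one sshd_config variant that keeps keyboard-interactive for humans while letting cephadm's key-based client in is to list the methods as alternatives (space-separated lists are ORed in OpenSSH); adapt to your own policy:

    # /etc/ssh/sshd_config
    AuthenticationMethods publickey keyboard-interactive

followed by a reload of sshd.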

[ceph-users] Ceph-Cluster integration with Ovirt-Cluster

2024-03-25 Thread ankit
Hi Guys, I have a running oVirt 4.3 cluster with 1 manager and 4 hypervisor nodes, using traditional SAN storage connected via iSCSI, where I can create VMs and assign storage from the SAN. This has been running fine for a decade, but now I want to move from traditional SAN

[ceph-users] Re: MANY_OBJECT_PER_PG on 1 pool which is cephfs_metadata

2024-03-25 Thread e . fazenda
Dear Eugen, Sorry I forgot to update the case. I have upgraded to the latest Pacific release 16.2.15 and I have done the necessary for the pg_num :) Thanks for the follow-up on this. Topic can be closed.
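For anyone landing here via the MANY_OBJECTS_PER_PG warning, the fix is raising pg_num on the affected pool (or letting the autoscaler do it); a sketch, assuming the pool is literally named cephfs_metadata and with an example target only:

    $ ceph osd pool set cephfs_metadata pg_num 64
    $ ceph osd pool set cephfs_metadata pg_autoscale_mode on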

[ceph-users] Adding new OSD's - slow_ops and other issues.

2024-03-25 Thread jskr
Hi. We have a cluster that has been working very nicely since it was put up more than a year ago. Now we needed to add more NVMe drives to expand. After setting all the "no" flags, we added them using "ceph orch osd add". The twist is that we have managed to get the default weights set to 1 for all
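If the new OSDs came in with a CRUSH weight of 1 instead of their size, the usual remedy is to reweight them to the device size in TiB; a sketch with a hypothetical OSD id and size:

    $ ceph osd df tree                          # compare CRUSH weights against device sizes
    $ ceph osd crush reweight osd.42 1.74660    # weight is conventionally the size in TiB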

[ceph-users] Re: PG damaged "failed_repair"

2024-03-25 Thread romain . lebbadi-breteau
Hi, Sorry for the broken formatting. Here are the outputs again.

ceph osd df:

ID  CLASS  WEIGHT   REWEIGHT  SIZE  RAW USE  DATA  OMAP  META  AVAIL  %USE  VAR  PGS  STATUS
 3  hdd    1.81879         0   0 B      0 B   0 B   0 B   0 B    0 B     0    0    0

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2024-03-25 Thread maxadamo
Dear Nico, do you think it is a sensible and precise statement to say that "we can't reduce complexity by adding a layer of complexity"? Containers always add a so-called layer, but people keep using them, and in some cases they offload complexity from another side. Claiming the

[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-25 Thread e . fazenda
Dear Eugen, Thanks again for the help. We managed to upgrade to the minor release 16.2.3; next week we will upgrade to the latest, 16.2.15. You were right about the number of managers, which was blocking the update. Thanks again for the help. Topic solved. Best Regards.
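For reference, on a cephadm-managed cluster the jump to 16.2.15 is the usual orchestrated upgrade; a sketch:

    $ ceph orch upgrade start --ceph-version 16.2.15
    $ ceph orch upgrade status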

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Kai Stian Olstad
On Mon, Mar 25, 2024 at 10:58:24PM +0100, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 09:28:01PM +0100, Torkil Svensgaard wrote:
My tally came to 412 out of 539 OSDs showing up in a blocked_by list and that is about every OSD with data prior to adding ~100 empty OSDs.
How 400 read targets

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Kai Stian Olstad
On Mon, Mar 25, 2024 at 09:28:01PM +0100, Torkil Svensgaard wrote:
My tally came to 412 out of 539 OSDs showing up in a blocked_by list and that is about every OSD with data prior to adding ~100 empty OSDs.
How 400 read targets and 100 write targets can only equal ~60 backfills with

[ceph-users] quincy-> reef upgrade non-cephadm

2024-03-25 Thread Christopher Durham
Hi, I am upgrading my test cluster from 17.2.6 (quincy) to 18.2.2 (reef). As it was an rpm install, I am following the directions here: Reef — Ceph Documentation. The upgrade worked, but I have some observations and questions before I move to

[ceph-users] Clients failing to advance oldest client?

2024-03-25 Thread Erich Weiler
Hi Y'all, I'm seeing this warning via 'ceph -s' (this is on Reef):

# ceph -s
  cluster:
    id:     58bde08a-d7ed-11ee-9098-506b4b4da440
    health: HEALTH_WARN
            3 clients failing to advance oldest client/flush tid
            1 MDSs report slow requests
            1 MDSs behind on

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Torkil Svensgaard
Neither downing nor restarting the OSD cleared the bogus blocked_by. I guess it makes no sense to look further at blocked_by as the cause when the data can't be trusted and there is no obvious smoking gun like a few OSDs blocking everything. My tally came to 412 out of 539 OSDs showing up in a

[ceph-users] mark direct Zabbix support deprecated? Re: Ceph versus Zabbix: failure: no data sent

2024-03-25 Thread John Jasen
Well, at least on my RHEL Ceph cluster, turns out zabbix-sender, zabbix-agent, etc aren't in the container image. Doesn't explain why it didn't work with the Debian/proxmox version, but *shrug*. It appears there is no interest in adding them back in, per:

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Anthony D'Atri
First try "ceph osd down 89"
> On Mar 25, 2024, at 15:37, Alexander E. Patrakov wrote:
> On Mon, Mar 25, 2024 at 7:37 PM Torkil Svensgaard wrote:
>> On 24/03/2024 01:14, Torkil Svensgaard wrote:
>>> On 24-03-2024 00:31, Alexander E. Patrakov wrote:
>>> Hi Torkil,
>>> Hi

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-25 Thread John Mulligan
On Monday, March 25, 2024 3:22:26 PM EDT Alexander E. Patrakov wrote: > On Mon, Mar 25, 2024 at 11:01 PM John Mulligan > > wrote: > > On Friday, March 22, 2024 2:56:22 PM EDT Alexander E. Patrakov wrote: > > > Hi John, > > > > > > > A few major features we have planned include: > > > > *

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Alexander E. Patrakov
On Mon, Mar 25, 2024 at 7:37 PM Torkil Svensgaard wrote: > > > > On 24/03/2024 01:14, Torkil Svensgaard wrote: > > On 24-03-2024 00:31, Alexander E. Patrakov wrote: > >> Hi Torkil, > > > > Hi Alexander > > > >> Thanks for the update. Even though the improvement is small, it is > >> still an

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-25 Thread Alexander E. Patrakov
On Mon, Mar 25, 2024 at 11:01 PM John Mulligan wrote: > > On Friday, March 22, 2024 2:56:22 PM EDT Alexander E. Patrakov wrote: > > Hi John, > > > > > A few major features we have planned include: > > > * Standalone servers (internally defined users/groups) > > > > No concerns here > > > > > *

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-25 Thread John Mulligan
On Monday, March 25, 2024 1:46:26 PM EDT Ralph Boehme wrote: > Hi John, > > On 3/21/24 20:12, John Mulligan wrote: > > > I'd like to formally let the wider community know of some work I've been > > involved with for a while now: adding Managed SMB Protocol Support to > > Ceph. SMB being the

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-25 Thread Ralph Boehme
Hi John, On 3/21/24 20:12, John Mulligan wrote: I'd like to formally let the wider community know of some work I've been involved with for a while now: adding Managed SMB Protocol Support to Ceph. SMB being the well known network file protocol native to Windows systems and supported by MacOS

[ceph-users] Re: Spam in log file

2024-03-25 Thread Patrick Donnelly
Nope.
On Mon, Mar 25, 2024 at 8:33 AM Albert Shih wrote:
> On 25/03/2024 at 08:28:54 -0400, Patrick Donnelly wrote:
> Hi,
>> The fix is in one of the next releases. Check the tracker ticket:
>> https://tracker.ceph.com/issues/63166
> Oh thanks. Didn't find it with Google.
> Is they

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-25 Thread John Mulligan
On Friday, March 22, 2024 2:56:22 PM EDT Alexander E. Patrakov wrote: > Hi John, > > > A few major features we have planned include: > > * Standalone servers (internally defined users/groups) > > No concerns here > > > * Active Directory Domain Member Servers > > In the second case, what is

[ceph-users] March Ceph Science Virtual User Group

2024-03-25 Thread Kevin Hrpcek
Hey All, We will be having a Ceph science/research/big cluster call on Wednesday March 27th. If anyone wants to discuss something specific they can add it to the pad linked below. If you have questions or comments you can contact me. This is an informal open call of community members mostly

[ceph-users] Re: Spam in log file

2024-03-25 Thread Albert Shih
On 25/03/2024 at 08:28:54 -0400, Patrick Donnelly wrote:
Hi,
> The fix is in one of the next releases. Check the tracker ticket:
> https://tracker.ceph.com/issues/63166
Oh thanks. Didn't find it with Google. Are there any risks/impacts for the cluster? Regards. -- Albert SHIH

[ceph-users] Re: Spam in log file

2024-03-25 Thread Patrick Donnelly
Hi Albert, The fix is in one of the next releases. Check the tracker ticket: https://tracker.ceph.com/issues/63166
On Mon, Mar 25, 2024 at 8:23 AM Albert Shih wrote:
> Hi everyone.
> On my cluster I am getting spammed with messages like
> Mar 25 13:10:13 cthulhu2 ceph-mgr[2843]: mgr

[ceph-users] Spam in log file

2024-03-25 Thread Albert Shih
Hi everyone. My cluster is spamming me with messages like:

Mar 25 13:10:13 cthulhu2 ceph-mgr[2843]: mgr finish mon failed to return metadata for mds.cephfs.cthulhu2.dqahyt: (2) No such file or directory
Mar 25 13:10:13 cthulhu2 ceph-mgr[2843]: mgr finish mon failed to return

[ceph-users] Re: Large number of misplaced PGs but little backfill going on

2024-03-25 Thread Torkil Svensgaard
On 24/03/2024 01:14, Torkil Svensgaard wrote:
On 24-03-2024 00:31, Alexander E. Patrakov wrote:
Hi Torkil,
Hi Alexander
Thanks for the update. Even though the improvement is small, it is still an improvement, consistent with the osd_max_backfills value, and it proves that there are still
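One thing worth checking in this situation is whether the configured backfill limits are actually in effect; on Quincy/Reef the mClock scheduler ignores osd_max_backfills unless overrides are explicitly allowed. A sketch:

    $ ceph config show osd.0 osd_max_backfills
    $ ceph config set osd osd_mclock_override_recovery_settings true
    $ ceph config set osd osd_max_backfills 3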

[ceph-users] Re: ceph cluster extremely unbalanced

2024-03-25 Thread Alexander E. Patrakov
Hi Denis, As the vast majority of OSDs have bluestore_min_alloc_size = 65536, I think you can safely ignore https://tracker.ceph.com/issues/64715. The only consequence will be that 58 OSDs will be less full than others. In other words, please use either the hybrid approach or the built-in

[ceph-users] Re: ceph cluster extremely unbalanced

2024-03-25 Thread Denis Polom
Hi Alexander, that sounds pretty promising to me. I've checked bluestore_min_alloc_size and most of the 1370 OSDs have the value 65536. You mentioned: "You will have to do that weekly until you redeploy all OSDs that were created with 64K bluestore_min_alloc_size". Is it the only way to approach this,
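For counting how many OSDs still carry the old 64K value, the on-disk allocation size is reported in the OSD metadata on recent releases; a sketch (assumes jq is available and that your release exposes the field):

    $ ceph osd metadata | jq -r '.[].bluestore_min_alloc_size' | sort | uniq -c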

[ceph-users] Re: Call for Interest: Managed SMB Protocol Support

2024-03-25 Thread Robert Sander
Hi, On 3/22/24 19:56, Alexander E. Patrakov wrote: In fact, I am quite skeptical, because, at least in my experience, every customer's SAMBA configuration as a domain member is a unique snowflake, and cephadm would need an ability to specify arbitrary UID mapping configuration to match what