[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Eugen Block
Okay, so the first thing I would do is to stop the upgrade. Then make sure that you have two running MGRs with the current version of the rest of the cluster (.1). If no other daemons have been upgraded it shouldn't be a big issue. If necessary you can modify the unit.run file and specify
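For reference, a rough sketch of what that could look like — the daemon name, FSID and image tag below are placeholders, not taken from this thread:

```
# stop the running upgrade first
ceph orch upgrade stop

# option A: pin the cluster-wide default image back to the current version,
# then redeploy the crash-looping MGR so it picks the pinned image up
ceph config set global container_image docker.io/ceph/ceph:v16.2.1
ceph orch daemon redeploy mgr.<host>.<suffix>

# option B (the unit.run route): edit the image reference in the daemon's
# unit.run on its host, then restart the systemd unit
vi /var/lib/ceph/<FSID>/mgr.<name>/unit.run
systemctl restart ceph-<FSID>@mgr.<name>.service
```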

[ceph-users] Re: Monitoring Ceph Bucket and overall ceph cluster remaining space

2024-03-06 Thread Michael Worsham
SW is SolarWinds (www.solarwinds.com), a network and application monitoring and alerting platform. It's not very open source at all, but it's what we use for monitoring all of our physical and virtual servers, network switches, SAN and NAS devices, and anything else with a network card in it.

[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Edouard FAZENDA
Dear Eugen, I have removed one mgr on node 3; the second one is still crash-looping, and on node 1 the mgr is at 16.2.2. I'm not sure I understand your workaround. * Stopping the current upgrade to roll back if possible, and afterwards upgrading to the latest release of Pacific? Best Regards, Edouard

[ceph-users] Re: How to build ceph without QAT?

2024-03-06 Thread Ilya Dryomov
On Wed, Mar 6, 2024 at 7:41 AM Feng, Hualong wrote: > > Hi Dongchuan > > Could I know which version or which commit that you are building and your > environment: system, CPU, kernel? > > ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo this command should be OK > without QAT. Hi Hualong, I
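For reference, a hedged sketch of disabling QAT explicitly at configure time — whether the checkout exposes a WITH_QAT option is an assumption here, so verify it first:

```
# confirm which QAT-related options this tree actually defines (assumption: WITH_QAT exists)
grep -Rn "WITH_QAT" CMakeLists.txt

# configure with QAT explicitly off, RelWithDebInfo as in the thread
./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo -DWITH_QAT=OFF
```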

[ceph-users] Re: reef 18.2.2 (hot-fix) QE validation status

2024-03-06 Thread Patrick Donnelly
On Wed, Mar 6, 2024 at 2:55 AM Venky Shankar wrote: > > +Patrick Donnelly > > On Tue, Mar 5, 2024 at 9:18 PM Yuri Weinstein wrote: > > > > Details of this release are summarized here: > > > > https://tracker.ceph.com/issues/64721#note-1 > > Release Notes - TBD > > LRC upgrade - TBD > > > >

[ceph-users] ceph Quincy to Reef non cephadm upgrade

2024-03-06 Thread sarda . ravi
I want to perform a non-cephadm upgrade from Quincy to Reef. The reason for not using cephadm is that I do not want to run Ceph in containers. My test deployment is as given below. Total cluster hosts: 5; ceph-mon hosts: 3; ceph-mgr hosts: 3 (ceph-mgr active on one node, and other ceph-mgr each on

[ceph-users] bluestore_min_alloc_size and bluefs_shared_alloc_size

2024-03-06 Thread Joel Davidow
Summary -- The relationship of the values configured for bluestore_min_alloc_size and bluefs_shared_alloc_size is reported to impact space amplification, partial overwrites in erasure-coded pools, and storage capacity as an OSD becomes more fragmented and/or more full. Previous

[ceph-users] Re: Hanging request in S3

2024-03-06 Thread Casey Bodley
ost;x-amz-content-sha256;x-amz-date > e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 > -- > DEBUG: signature-v4 headers: {'x-amz-date': '20240306T183435Z', > 'Authorization': 'AWS4-HMAC-SHA256 > Credential=VL0FRB7CYGMHBGCD419M/20240306/[...snip...]/s3/aws4_request,SignedHe

[ceph-users] Re: Ceph-storage slack access

2024-03-06 Thread Gregory Farnum
Has the link on the website broken? https://ceph.com/en/community/connect/ We've had trouble keeping it alive in the past (getting a non-expiring invite), but I thought that was finally sorted out. -Greg On Wed, Mar 6, 2024 at 8:46 AM Matthew Vernon wrote: > > Hi, > > How does one get an invite

[ceph-users] Re: Ceph-storage slack access

2024-03-06 Thread Gregory Farnum
On Wed, Mar 6, 2024 at 8:56 AM Matthew Vernon wrote: > > Hi, > > On 06/03/2024 16:49, Gregory Farnum wrote: > > Has the link on the website broken? https://ceph.com/en/community/connect/ > > We've had trouble keeping it alive in the past (getting a non-expiring > > invite), but I thought that was

[ceph-users] Slow RGW multisite sync due to "304 Not Modified" responses on primary zone

2024-03-06 Thread praveenkumargpk17
Hi, We have 2 clusters (v18.2.1) primarily used for RGW, which hold over 2 billion RGW objects. They are in a multisite configuration totaling 2 zones, and we have around 2 Gbps of bandwidth dedicated (P2P) for the multisite traffic. We see that using "radosgw-admin sync status" on the
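For anyone following along, a first pass at this kind of investigation usually looks roughly like the sketch below — the zone name is a placeholder, not taken from this report:

```
# overall multisite sync state as seen from this zone
radosgw-admin sync status

# per-shard data sync progress against the other zone (placeholder zone name)
radosgw-admin data sync status --source-zone=primary-zone

# any recorded sync errors
radosgw-admin sync error list
```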

[ceph-users] InvalidAccessKeyId

2024-03-06 Thread ashar . khan
Dear Team, I am facing an issue which could be a bug, but I am not able to find any solution. We are unable to create a bucket on Ceph from the ceph-dashboard. A bucket gets created with a fresh/different name, but when I try with a previously deleted bucket name it is not getting created.

[ceph-users] Re: ceph-volume fails when adding separate DATA and DATA.DB volumes

2024-03-06 Thread Adam King
If you want to be directly setting up the OSDs using ceph-volume commands (I'll pretty much always recommend following https://docs.ceph.com/en/latest/cephadm/services/osd/#dedicated-wal-db over manual ceph-volume stuff in cephadm deployments unless what you're doing can't be done with the spec
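A minimal sketch of the spec-based approach referred to above — the hostname matches the one quoted in the ceph-volume thread, but the device paths are placeholders:

```
# write an OSD service spec with dedicated DB devices, then preview what cephadm would do
cat > osd-dedicated-db.yaml <<'EOF'
service_type: osd
service_id: osd-with-dedicated-db
placement:
  hosts:
    - ceph-uvm2
spec:
  data_devices:
    paths:
      - /dev/sdb          # placeholder data device
  db_devices:
    paths:
      - /dev/nvme0n1      # placeholder DB device
EOF
ceph orch apply -i osd-dedicated-db.yaml --dry-run
```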

[ceph-users] Re: Ceph storage project for virtualization

2024-03-06 Thread egoitz
Hi Eneko! Sorry for the delay answering. Thank you so much for your time, really, mate :) . I reply inline between your lines, in green bold, so it's easier to follow what I'm talking about. Moving forward below! On 2024-03-05 12:26, Eneko Lacunza wrote: > Hi Egoitz, > > I don't

[ceph-users] Re: Ceph-storage slack access

2024-03-06 Thread Matthew Vernon
Hi, On 06/03/2024 16:49, Gregory Farnum wrote: Has the link on the website broken? https://ceph.com/en/community/connect/ We've had trouble keeping it alive in the past (getting a non-expiring invite), but I thought that was finally sorted out. Ah, yes, that works. Sorry, I'd gone to

[ceph-users] Re: Ceph reef mon is not starting after host reboot

2024-03-06 Thread Adam King
When you ran this, was it directly on the host, or did you run `cephadm shell` first? The two things you tend to need to connect to the cluster (that "RADOS timed out" error is generally what you get when connecting to the cluster fails. A bunch of different causes all end with that error) are a
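A quick way to tell the two situations apart, as a sketch:

```
# run the CLI inside the cephadm container, which mounts the matching conf and keyring
cephadm shell -- ceph -s

# if the bare `ceph` CLI on the host is what times out, check that it can actually
# find a config file and an admin keyring
ls -l /etc/ceph/ceph.conf /etc/ceph/ceph.client.admin.keyring
```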

[ceph-users] Ceph is constantly scrubbing 1/4 of all PGs and still has PGs not scrubbed in time

2024-03-06 Thread thymus_03fumbler
I recently switched from 16.2.x to 18.2.x and migrated to cephadm. Since the switch, the cluster has been scrubbing constantly, 24/7, with up to 50 PGs and up to 20 deep scrubs running simultaneously in a cluster that has only 12 (in-use) OSDs. Furthermore, it still manages to regularly have a warning

[ceph-users] Re: ambigous mds behind on trimming and slowops (ceph 17.2.5 and rook operator 1.10.8)

2024-03-06 Thread a . warkhade98
Thanks Dhairya for the response. It's ceph 17.2.5. I don't have the exact output of ceph -s currently as it is a past issue, but it was like below and all PGs were active+clean AFAIR: mds slow requests, MDS behind on trimming. I don't know the root cause of why the MDS crashed, but I am suspecting it's something to

[ceph-users] ceph-volume fails when adding separate DATA and DATA.DB volumes

2024-03-06 Thread service . plant
Hi all! I've faced an issue I couldn't even google. Trying to create an OSD with two separate LVs for data.db and data gives me an interesting error ``` root@ceph-uvm2:/# ceph-volume lvm prepare --bluestore --data ceph-block-0/block-0 --block.db ceph-db-0/db-0 --> Incompatible flags were found, some

[ceph-users] PGs with status active+clean+laggy

2024-03-06 Thread mori . ricardo
Dear community, I have a Ceph Quincy cluster with 5 nodes currently, but only 3 with SSDs and the others with NVMe, on separate pools. I have had many alerts for PGs with active+clean+laggy status. This has caused problems with slow writes. I wanted to know how to troubleshoot properly. I
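Without knowing more about the cluster, a first round of triage might look like this sketch (the OSD id is a placeholder):

```
# which PGs are currently flagged laggy, and on which OSDs
ceph health detail | grep -i laggy
ceph pg dump pgs_brief | grep laggy

# per-OSD latency, to spot a slow device
ceph osd perf

# inspect in-flight ops on a suspect OSD (placeholder id)
ceph tell osd.12 dump_ops_in_flight
```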

[ceph-users] Re: PGs with status active+clean+laggy

2024-03-06 Thread mori . ricardo
I expressed myself wrong. I only have SSDs in my cluster. I meant that in 2 of 3 nodes I have nvme in another pool.

[ceph-users] Re: reef 18.2.2 (hot-fix) QE validation status

2024-03-06 Thread Redouane Kachach
Looks good to me. Testing went OK without any issues. Thanks, Redo. On Tue, Mar 5, 2024 at 5:22 PM Travis Nielsen wrote: > Looks great to me, Redo has tested this thoroughly. > > Thanks! > Travis > > On Tue, Mar 5, 2024 at 8:48 AM Yuri Weinstein wrote: > >> Details of this release are

[ceph-users] Re: reef 18.2.2 (hot-fix) QE validation status

2024-03-06 Thread Laura Flores
Went over the rados results with Radek. All looks good for the hotfix, and we are ready to upgrade the LRC. Rados approved! Laura Flores She/Her/Hers Software Engineer, Ceph Storage Chicago, IL lflo...@ibm.com | lflo...@redhat.com M: +17087388804 On Wed, Mar 6, 2024 at

[ceph-users] Re: Ceph-storage slack access

2024-03-06 Thread Wesley Dillingham
At the very bottom of this page is a link https://ceph.io/en/community/connect/ Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn On Wed, Mar 6, 2024 at 11:45 AM Matthew Vernon wrote: > Hi, > > How does one get an invite to the

[ceph-users] PG damaged "failed_repair"

2024-03-06 Thread Romain Lebbadi-Breteau
Hi, We're a student club from Montréal where we host an OpenStack cloud with a Ceph backend for storage of virtual machines and volumes using rbd. Two weeks ago we received an email from our Ceph cluster saying that some PGs were damaged. We ran "sudo ceph pg repair " but then there was
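For context, the usual flow for an inconsistent PG looks roughly like this — the PG id is a placeholder, since the one from the original report isn't in the quoted text:

```
# find the affected PG id(s)
ceph health detail

# see which object copies are actually failing, and on which OSDs
rados list-inconsistent-obj <pg.id> --format=json-pretty

# only then re-issue the repair
ceph pg repair <pg.id>
```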

[ceph-users] Number of pgs

2024-03-06 Thread ndandoul
Hi all, Pretty sure this is not the very first time you've seen a thread like this. Our cluster consists of 12 nodes / 153 OSDs / 1.2 PiB used, 708 TiB of 1.9 PiB available. The data pool is 2048 PGs big, exactly the number it had when the cluster started. We have no issues with the cluster, everything runs as expected
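A quick sanity check before changing pg_num, sketched with no cluster-specific values assumed beyond the pool name placeholder:

```
# what the autoscaler thinks pg_num should be for each pool
ceph osd pool autoscale-status

# current pg_num of the data pool (pool name is a placeholder)
ceph osd pool get <data-pool> pg_num

# PG count per OSD (PGS column), to judge how evenly 2048 PGs spread over 153 OSDs
ceph osd df tree
```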

[ceph-users] Hanging request in S3

2024-03-06 Thread Christian Kugler
', 'Authorization': 'AWS4-HMAC-SHA256 Credential=VL0FRB7CYGMHBGCD419M/20240306/[...snip...]/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date,Signature=45b133675535ab611bbf2b9a7a6e40f9f510c0774bf155091dc9a05b76856cb7', 'x-amz-content-sha256

[ceph-users] ceph commands on host cannot connect to cluster after cephx disabling

2024-03-06 Thread service . plant
Hello everybody, I've suddenly been faced with a problem with (probably) authorization while playing with cephx. So, long story short: 1) Rolled out a completely new testing cluster with cephadm, with only one node 2) According to the docs I've set this in /etc/ceph/ceph.conf: auth_cluster_required = none
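For comparison, the usual trio of settings for disabling cephx looks like the sketch below. Note that on a cephadm cluster each daemon also reads its own minimal config under /var/lib/ceph/<fsid>/, so editing only the host's /etc/ceph/ceph.conf may not reach the daemons — that last part is an assumption about this setup, not something stated in the mail:

```
# all three auth options normally need to agree; appending to the host conf as shown
# here is only half the story on a cephadm-managed cluster
cat >> /etc/ceph/ceph.conf <<'EOF'
[global]
auth_cluster_required = none
auth_service_required = none
auth_client_required = none
EOF
```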

[ceph-users] Re: change ip node and public_network in cluster

2024-03-06 Thread farhad khedriyan
Hi, thanks. But this document is for old versions and cannot be used for the containerized version. I use the Reef version, and when I retrieve the monmap and edit it, I can't use ceph-mon to change the monmap, nor find any other way to change it. I tried to solve this problem by adding a new node and deleting the old

[ceph-users] Re: has anyone enabled bdev_enable_discard?

2024-03-06 Thread jsterr
Is there any update on this? Did someone test the option and have performance numbers from before and after? Is there any good documentation regarding this option?
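For reference, toggling the option centrally would look like the sketch below; whether running OSDs pick it up without a restart is not something I can confirm, so treat an OSD restart as likely required:

```
# enable discard (and its async variant) for all OSDs via the central config store
ceph config set osd bdev_enable_discard true
ceph config set osd bdev_async_discard true

# verify the stored value, then restart OSDs for the bluestore-level setting to apply
ceph config get osd bdev_enable_discard
```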

[ceph-users] Re: Ceph Cluster Config File Locations?

2024-03-06 Thread matthew
Thanks Eugen, you pointed me in the right direction :-) Yes, the config files I mentioned were the ones in `/var/lib/ceph/{FSID}/mgr.{MGR}/config` - I wasn't aware there were others (well, I suspected there were, hence my Q). The `global public-network` was (re-)set to the old subnet, while

[ceph-users] Re: Ceph is constantly scrubbing 1/4 of all PGs and still has PGs not scrubbed in time

2024-03-06 Thread Anthony D'Atri
I don't see these in the config dump. I think you might have to apply them to `global` for them to take effect, not just `osd`, FWIW. > I have tried various settings, like osd_deep_scrub_interval, osd_max_scrubs, > mds_max_scrub_ops_in_progress etc. > All those get ignored.
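Following that suggestion, applying and verifying the settings would look roughly like this — the interval values are examples, not recommendations:

```
# apply at the global level, as suggested above (values are just examples)
ceph config set global osd_deep_scrub_interval 1209600   # 14 days, in seconds
ceph config set global osd_max_scrubs 1

# confirm what the OSDs actually resolve the options to
ceph config get osd osd_deep_scrub_interval
ceph config get osd osd_max_scrubs
```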

[ceph-users] Ceph-storage slack access

2024-03-06 Thread Matthew Vernon
Hi, How does one get an invite to the ceph-storage slack, please? Thanks, Matthew

[ceph-users] Re: Ceph-storage slack access

2024-03-06 Thread Marc
Is it possible to access this also with xmpp? > > At the very bottom of this page is a link > https://ceph.io/en/community/connect/ > > Respectfully, > > *Wes Dillingham* > w...@wesdillingham.com > LinkedIn > > > On Wed, Mar 6, 2024 at 11:45 AM

[ceph-users] Re: pg repair doesn't fix "got incorrect hash on read" / "candidate had an ec hash mismatch"

2024-03-06 Thread Kai Stian Olstad
Hi Eugen, thank you for the reply. The OSD was drained over the weekend, so OSD 223 and 269 have only the problematic PG 404.bc. I don't think moving the PG would help since I don't have any empty OSD to move it to, and a move would not fix the hash mismatch. The reason I just want to have

[ceph-users] Ceph reef mon is not starting after host reboot

2024-03-06 Thread ankit
Hi guys, I am a complete newbie to Ceph, but after multiple attempts I was able to install a Ceph Reef cluster on Debian 12 with the cephadm tool in a test environment, with 2 MONs and 3 OSDs on VMs. All seemed good and I was exploring it further, so I rebooted the cluster and found that now I am

[ceph-users] Re: Slow RGW multisite sync due to "304 Not Modified" responses on primary zone

2024-03-06 Thread praveenkumargpk17
Hi All, Regarding our earlier email about "Slow RGW multisite sync due to '304 Not Modified' responses on primary zone," we just wanted to quickly follow up. We wanted to make it clear that we are still having problems and that we desperately need your help to find a solution. Thank you

[ceph-users] Re: bluestore_min_alloc_size and bluefs_shared_alloc_size

2024-03-06 Thread Anthony D'Atri
> On Feb 28, 2024, at 17:55, Joel Davidow wrote: > > Current situation > - > We have three Ceph clusters that were originally built via cephadm on octopus > and later upgraded to pacific. All osds are HDD (will be moving to wal+db on > SSD) and were resharded after the

[ceph-users] Re: change ip node and public_network in cluster

2024-03-06 Thread Eugen Block
Hi, your response arrived in my inbox today, so sorry for the delay. I wrote a blog post [1] just two weeks ago for that procedure with cephadm, Zac adopted that and updated the docs [2]. Can you give that a try and let me know if it worked? I repeated that procedure a couple of times to

[ceph-users] Ceph Leadership Team Meeting Minutes - March 6, 2024

2024-03-06 Thread Ernesto Puerta
Hi Cephers, These are the topics covered in today's meeting:
- *Releases*
  - *Hot-fix releases*
    - *18.2.2*
      - https://github.com/ceph/ceph/pull/55491 - reef: mgr/prometheus: fix orch check to prevent Prometheus crash
      - https://github.com/ceph/ceph/pull/55709 - reef:

[ceph-users] Re: Ceph-storage slack access

2024-03-06 Thread Zac Dover
Greg & co, https://github.com/ceph/ceph/pull/56010 contains the updated link. As soon as it passes its tests, I'll merge and backport it. Zac Dover Upstream Docs Ceph Foundation On Thursday, March 7th, 2024 at 3:01 AM, Gregory Farnum wrote: > > > On Wed, Mar 6, 2024 at 8:56 AM Matthew

[ceph-users] Re: How to build ceph without QAT?

2024-03-06 Thread 张东川
Hi Hualong and Ilya, Thanks for your help. More info: I am trying to build ceph on a Milk-V Pioneer board (RISC-V arch, OS is fedora-riscv 6.1.55). The ceph code being used was downloaded from GitHub last week (master branch). Currently I am working on environment cleanup (I suspect my work

[ceph-users] Re: ceph Quincy to Reef non cephadm upgrade

2024-03-06 Thread Konstantin Shalygin
Hi, Yes, you upgrade the ceph-common package, then restart your mons. k Sent from my iPhone > On 6 Mar 2024, at 21:55, sarda.r...@gmail.com wrote: > > My question is - does this mean I need to upgrade all ceph packages (ceph, > ceph-common) and restart only the monitor daemon first?
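On a Debian-based, non-cephadm cluster that would look roughly like the sketch below — the repository setup and exact package names are assumptions, and the usual order (mons first, then mgrs, OSDs, and finally MDS/RGW) still applies:

```
# on each monitor host, after pointing apt at the Reef repository
apt update
apt install --only-upgrade ceph-common ceph-mon

# restart the local monitor
systemctl restart ceph-mon@$(hostname -s)

# confirm the restarted mon reports the new version before moving on
ceph versions
```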

[ceph-users] Unable to map RBDs after running pg-upmap-primary on the pool

2024-03-06 Thread Torkil Svensgaard
Hi I tried to do offline read optimization[1] this morning but I am now unable to map the RBDs in the pool. I did this prior to running the pg-upmap-primary commands suggested by the optimizer, as suggested by the latest documentation[2]: " ceph osd set-require-min-compat-client reef "
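If the kernel client is tripping over the new pg-upmap-primary entries — an assumption, but older clients do not understand them — backing them out per PG would look like this sketch (the PG id is a placeholder):

```
# list the primary mappings the optimizer installed
ceph osd dump | grep -i upmap_primary

# remove them one PG at a time, then retry the rbd map
ceph osd rm-pg-upmap-primary <pg.id>
```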

[ceph-users] Re: Ceph Cluster Config File Locations?

2024-03-06 Thread Eugen Block
You're welcome, great that your cluster is healthy again. Zitat von matt...@peregrineit.net: Thanks Eugen, you pointed me in the right direction :-) Yes, the config files I mentioned were the ones in `/var/lib/ceph/{FSID}/mgr.{MGR}/config` - I wasn't aware there were others (well, I

[ceph-users] Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Edouard FAZENDA
Dear Ceph Community, I am in the process of upgrading Ceph Pacific 16.2.1 to 16.2.2. I have followed the documentation: https://docs.ceph.com/en/pacific/cephadm/upgrade/ My cluster is in a healthy state, but the upgrade is not going forward; in the cephadm logs I have the following:

[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Edouard FAZENDA
The process has now started, but I have the following error for the mgr on the second node: root@rke-sh1-1:~# ceph orch ps NAME HOST PORTS STATUS REFRESHED AGE VERSION IMAGE ID CONTAINER ID crash.rke-sh1-1 rke-sh1-1 running

[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Eugen Block
Hi, a couple of things. First, is there any specific reason why you're upgrading from .1 to .2? Why not directly to .15? It seems unnecessary, and you're risking upgrading to a "bad" version (I believe it was 16.2.7) if you're applying every minor release. Or why not upgrade to Quincy or

[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Eugen Block
There was another issue when having more than two MGRs, maybe you're hitting that (https://tracker.ceph.com/issues/57675, https://github.com/ceph/ceph/pull/48258). I believe my workaround was to set the global config to a newer image (target version) and then deployed a new mgr. Zitat
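A sketch of that workaround — the image tag and daemon name are placeholders for the actual upgrade target:

```
# point the cluster-wide default image at the target version
ceph config set global container_image docker.io/ceph/ceph:v16.2.2

# then redeploy one MGR so a daemon running the new code exists
ceph orch daemon redeploy mgr.<host>.<suffix>
```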

[ceph-users] Re: Upgrade from 16.2.1 to 16.2.2 pacific stuck

2024-03-06 Thread Edouard FAZENDA
Dear Eugen, Thanks again for the help. We wanted to go smoothly; as we unfortunately have no test clusters, the risk of getting a bad version is effectively high. You are right, we will look at upgrading to the latest Pacific release for the next steps. I have waited about 30 minutes. Still looking into why