[ceph-users] Re: Custom CRUSH maps HOWTO?

2023-05-30 Thread Thorne Lawler

Thanks to Anthony D'Atri, Joshua Beaman and Nino Kotur.

TL;DR- I need to ditch the spinning rust.

As long as all my pools are using all the OSDs (currently necessary), 
this is not really a tuning problem - just a consequence of adding awful 
old recycled disks to my shiny NVMe.


To answer a few questions:

 * I've tried KRBD and librbd, as well as iSCSI, NFS and CephFS, on
   multiple different physical and virtual OSes, both *nix and Windows.
 * I've benchtested with fio and rbd bench (a sample invocation is
   sketched after this list).
 * Yes, I'm using the default replicas and min_size.
 * Yes, I already set primary_affinity to zero for the spinning disks.
 * No, I can't move disks around or add flash disks to the older
   machines with the spinning storage in them. Our hardware vendors are
   kinda butts.
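
For reference, a minimal rbd bench invocation along these lines (pool and 
image names below are placeholders, not the exact commands used):

rbd bench --io-type write --io-pattern rand --io-size 4K --io-threads 16 --io-total 1G testpool/testimage
rbd bench --io-type read --io-pattern rand --io-size 4K --io-threads 16 --io-total 1G testpool/testimage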

I have gone back to my hardware vendor to see if they can do a much 
better price on more NVMe, 12 months later. Fingers crossed.


Thanks again for everyone's quick responses!

On 31/05/2023 12:51 am, Thorne Lawler wrote:

Hi folks!

I have a Ceph production 17.2.6 cluster with 6 machines in it - four 
newer, faster machines with 4x3.84TB NVME drives each, and two with 
24x1.68TB SAS disks each.


I know I should have done something smart with the CRUSH maps for this 
up front, but until now I have shied away from CRUSH maps as they 
sound really complex.


Right now my cluster's performance, especially write performance, is 
not what it needs to be, and I am looking for advice:


1. How should I be structuring my crush map, and why?

2. How does one actually edit and manage a CRUSH map? What /commands/ 
does one use? This isn't clear at all in the documentation. Are there 
any GUI tools out there for managing CRUSH?


3. Is this going to impact production performance or availability 
while I'm configuring it? I have tens of thousands of users relying on 
this thing, so I can't take any risks.


Thanks in advance!


--

Regards,

Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170

_DDNS

/_*Please note:* The information contained in this email message and any 
attached files may be confidential information, and may also be the 
subject of legal professional privilege. _If you are not the intended 
recipient any use, disclosure or copying of this email is unauthorised. 
_If you received this email in error, please notify Discount Domain Name 
Services Pty Ltd on 03 9815 6868 to report this matter and delete all 
copies of this transmission together with any attachments. /

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph client version vs server version inter-operability

2023-05-30 Thread Mark Kirkwood

Hi,

We are running a ceph cluster that is currently on Luminous. At this 
point most of our clients are also Luminous, but as we provision new 
client hosts we are using client versions that are more recent (e.g. 
Octopus, Pacific and more recently Quincy). Is this safe? Is there a 
known list of which client versions are compatible with which server versions?


We are only using RBD and we specify the same rbd_default_features on all 
server and client hosts.
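
For context, a quick way to compare what the cluster daemons and the connected
clients report (a sketch, not an authoritative compatibility matrix):

ceph versions   # release of each mon/mgr/osd daemon
ceph features   # feature/release level reported by connected clients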


regards

Mark
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef v18.1.0 QE Validation status

2023-05-30 Thread Travis Nielsen
Rook daily CI is passing against the image
quay.io/ceph/daemon-base:latest-main-devel, which means the Reef release is
looking good from Rook's perspective.


With the Reef release we need to have the tags soon:

quay.io/ceph/daemon-base:latest-reef-devel

quay.io/ceph/ceph:v18


Guillaume, will these happen automatically, or do we need some work done in
ceph-container?


Thanks,

Travis



On Tue, May 30, 2023 at 10:54 AM Yuri Weinstein  wrote:

> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/61515#note-1
> Release Notes - TBD
>
> Seeking approvals/reviews for:
>
> rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> merge https://github.com/ceph/ceph/pull/51788 for
> the core)
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya
> upgrade/octopus-x - deprecated
> upgrade/pacific-x - known issues, Ilya, Laura?
> upgrade/reef-p2p - N/A
> clients upgrades - not run yet
> powercycle - Brad
> ceph-volume - in progress
>
> Please reply to this email with approval and/or trackers of known
> issues/PRs to address them.
>
> gibba upgrade was done and will need to be done again this week.
> LRC upgrade TBD
>
> TIA
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BlueStore fragmentation woes

2023-05-30 Thread Fox, Kevin M
Ok, I restarted it May 25th, ~11:30, let it run over the long weekend and just 
checked on it. Data attached.

May 21 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-21T18:24:34.040+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  allocation stats probe 107: cnt: 17991 frags: 17991 size: 32016760832
May 21 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-21T18:24:34.040+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -1: 20267,  20267, 39482425344
May 21 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-21T18:24:34.040+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -3: 19737,  19737, 37299027968
May 21 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-21T18:24:34.040+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -7: 18498,  18498, 32395558912
May 21 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-21T18:24:34.040+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -11: 20373,  20373, 35302801408
May 21 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-21T18:24:34.040+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -27: 19072,  19072, 33645854720
May 22 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-22T18:24:34.057+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  allocation stats probe 108: cnt: 24594 frags: 24594 size: 56951898112
May 22 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-22T18:24:34.057+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -1: 17991,  17991, 32016760832
May 22 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-22T18:24:34.057+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -2: 20267,  20267, 39482425344
May 22 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-22T18:24:34.057+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -4: 19737,  19737, 37299027968
May 22 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-22T18:24:34.057+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -12: 20373,  20373, 35302801408
May 22 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-22T18:24:34.057+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -28: 19072,  19072, 33645854720
May 23 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-23T18:24:34.095+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  allocation stats probe 109: cnt: 24503 frags: 24503 size: 58141900800
May 23 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-23T18:24:34.095+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -1: 24594,  24594, 56951898112
May 23 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-23T18:24:34.095+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -3: 20267,  20267, 39482425344
May 23 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-23T18:24:34.095+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -5: 19737,  19737, 37299027968
May 23 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-23T18:24:34.095+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -13: 20373,  20373, 35302801408
May 23 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-23T18:24:34.095+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -29: 19072,  19072, 33645854720
May 24 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-24T18:24:34.105+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  allocation stats probe 110: cnt: 27637 frags: 27637 size: 63777406976
May 24 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-24T18:24:34.105+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -1: 24503,  24503, 58141900800
May 24 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-24T18:24:34.105+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -2: 24594,  24594, 56951898112
May 24 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-24T18:24:34.105+ 7f53603fc700  0 bluestore(/var/lib/ceph/osd/ceph-183)  probe -6: 19737,  19737, 37299027968
May 24 11:24:34 cf8 ceph-4e4184f5-7733-453b-b72c-2b43422fd027-osd-183[2282674]: debug 2023-05-24T18:24:34.105+ 7f53603fc700  0

[ceph-users] Re: Custom CRUSH maps HOWTO?

2023-05-30 Thread Nino Kotur
What kind of pools are you using, and do you have different pools for
different purposes? Do you have CephFS or RBD-only pools, etc.? Please
describe your setup.
It is generally best practice to create new rules and apply them to pools
rather than modify existing pools, though the latter is possible as well.
Below is one relatively simple thing to do, but it is just a proposal and may
not fit your needs, so take it with CAUTION!!!

If I did the math right you have roughly 51TB SAS and 61TB NVMe. The easiest
thing to do - and you can do it even from the web GUI - is to create a new
CRUSH rule for a replicated or EC pool (depending on which one you're
currently using), set the failure domain to host and the device class to
NVMe, then repeat the process for an HDD-only rule.

After that you can apply the new CRUSH rule to the existing pool.
Doing so will cause a lot of data movement, which may be short or long
depending on your network and hard drive speeds. Also, depending on your
client needs - if the cluster is usually under heavy load - clients will
definitely notice this action.

Doing it that way you would have two sets of disks used for different
purposes: one for fast storage and one for slow storage.

Before doing any action of this sort I'd test it in at least a VM environment
if you don't have a test cluster to run it on.

However, if what you need is one large pool, there are certain
configurations that tell the cluster to place one or two replicas on fast
drives and the remaining replicas on the other device type. Don't take this
for granted - I'm not 100% sure - but as far as I know Ceph waits for all
OSDs in the acting set to finish the write before acknowledging to the client
that the file/object is stored, so I'm not sure you would benefit from a
setup like that.
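
A rough sketch of the device-class approach (rule names here are made up, and
you should check "ceph osd crush class ls" first - NVMe drives sometimes
register under the "ssd" class rather than "nvme"):

ceph osd crush rule create-replicated nvme-only default host nvme
ceph osd crush rule create-replicated hdd-only default host hdd
# switching an existing pool onto a class-restricted rule triggers the data
# movement described above
ceph osd pool set <your-pool> crush_rule nvme-only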




Kind regards,
Nino


On Tue, May 30, 2023 at 4:53 PM Thorne Lawler  wrote:

> Hi folks!
>
> I have a Ceph production 17.2.6 cluster with 6 machines in it - four
> newer, faster machines with 4x3.84TB NVME drives each, and two with
> 24x1.68TB SAS disks each.
>
> I know I should have done something smart with the CRUSH maps for this
> up front, but until now I have shied away from CRUSH maps as they sound
> really complex.
>
> Right now my cluster's performance, especially write performance, is not
> what it needs to be, and I am looking for advice:
>
> 1. How should I be structuring my crush map, and why?
>
> 2. How does one actually edit and manage a CRUSH map? What /commands/
> does one use? This isn't clear at all in the documentation. Are there
> any GUI tools out there for managing CRUSH?
>
> 3. Is this going to impact production performance or availability while
> I'm configuring it? I have tens of thousands of users relying on this
> thing, so I can't take any risks.
>
> Thanks in advance!
>
> --
>
> Regards,
>
> Thorne Lawler - Senior System Administrator
> *DDNS* | ABN 76 088 607 265
> First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
> P +61 499 449 170
>
> _DDNS
>
> /_*Please note:* The information contained in this email message and any
> attached files may be confidential information, and may also be the
> subject of legal professional privilege. _If you are not the intended
> recipient any use, disclosure or copying of this email is unauthorised.
> _If you received this email in error, please notify Discount Domain Name
> Services Pty Ltd on 03 9815 6868 to report this matter and delete all
> copies of this transmission together with any attachments. /
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: reef v18.1.0 QE Validation status

2023-05-30 Thread Ilya Dryomov
On Tue, May 30, 2023 at 6:54 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/61515#note-1
> Release Notes - TBD
>
> Seeking approvals/reviews for:
>
> rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
> merge https://github.com/ceph/ceph/pull/51788 for
> the core)
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya

rbd and krbd approved.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: slow mds requests with random read test

2023-05-30 Thread Patrick Donnelly
On Tue, May 30, 2023 at 8:42 AM Ben  wrote:
>
> Hi,
>
> We are performing couple performance tests on CephFS using fio. fio is run
> in k8s pod and 3 pods will be up running mounting the same pvc to CephFS
> volume. Here is command line for random read:
> fio -direct=1 -iodepth=128 -rw=randread -ioengine=libaio -bs=4k -size=1G
> -numjobs=5 -runtime=500 -group_reporting -directory=/tmp/cache
> -name=Rand_Read_Testing_$BUILD_TIMESTAMP
> The random read is performed very slow. Here is the cluster log from
> dashboard:
> [...]
> Any suggestions on the problem?

Your random read workload is too extreme for your cluster of OSDs.
It's causing slow metadata ops for the MDS. To resolve this we would
normally suggest allocating a set of OSDs on SSDs for use by the
CephFS metadata pool to isolate the workloads.
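
A minimal sketch of that isolation, assuming the SSD OSDs carry the "ssd"
device class and the metadata pool is named cephfs_metadata (adjust both to
your cluster):

ceph osd crush rule create-replicated meta-ssd default host ssd
ceph osd pool set cephfs_metadata crush_rule meta-ssd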

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Red Hat Partner Engineer
IBM, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] reef v18.1.0 QE Validation status

2023-05-30 Thread Yuri Weinstein
Details of this release are summarized here:

https://tracker.ceph.com/issues/61515#note-1
Release Notes - TBD

Seeking approvals/reviews for:

rados - Neha, Radek, Travis, Ernesto, Adam King (we still have to
merge https://github.com/ceph/ceph/pull/51788 for
the core)
rgw - Casey
fs - Venky
orch - Adam King
rbd - Ilya
krbd - Ilya
upgrade/octopus-x - deprecated
upgrade/pacific-x - known issues, Ilya, Laura?
upgrade/reef-p2p - N/A
clients upgrades - not run yet
powercycle - Brad
ceph-volume - in progress

Please reply to this email with approval and/or trackers of known
issues/PRs to address them.

gibba upgrade was done and will need to be done again this week.
LRC upgrade TBD

TIA
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERNAL] Custom CRUSH maps HOWTO?

2023-05-30 Thread Beaman, Joshua
I’m going to start by assuming your pool(s) are deployed with the default 3 
replicas and a min_size of 2.

The quickest and safest thing you can do to potentially realize some 
improvement is to set the primary-affinity for all of your HDD-based OSDs to zero.
https://docs.ceph.com/en/quincy/rados/operations/crush-map/#primary-affinity

Something like:
for osd in $(ceph osd ls-tree SAS-NODE1); do ceph osd primary-affinity $osd 0.0; done

And of course repeat that for the other node.

That will have low impact on your users as ceph will start prioritizing reads 
from the fast NVMe drives, and the slow ones will only have to do writes.  However, 
ceph may already be doing that, and if your SAS-based hosts do not have fast 
disks for the block DB and WAL (write-ahead log), any time 2 (or more) SAS 
disks are involved in a PG, your writes will still be as slow as the fastest 
HDD.

It is best when ceph has OSDs of identical size and performance.  When you're 
going to mix very fast disks with relatively slow disks, the next best thing is 
to have twice as much fast storage as slow.  If you have enough capacity 
available such that the total data STORED (add up from ceph df) is < 
3.84*4*2*0.7 = ~21.5TB, I’d suggest creating rack buckets in your crush map, so 
there’s 3 racks, each with 2 hosts, so that each PG will only have one slow 
disk.  The down side to that is, you are basically abandoning ~50TB of HDD 
capacity, your effective maximum RAW capacity ends up only ~92TB, and you’ll 
start getting near-full warnings between 75 and 80TB RAW or around 25-27TB 
stored.

The process for setting that would be adding 3 rack buckets, and then moving 
the host buckets into the rack buckets:
https://docs.ceph.com/en/quincy/rados/operations/crush-map/#add-a-bucket
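
In command form that would be roughly the following (rack and host names are
placeholders - take the real host bucket names from your own "ceph osd tree"):

ceph osd crush add-bucket rack1 rack
ceph osd crush add-bucket rack2 rack
ceph osd crush add-bucket rack3 rack
ceph osd crush move rack1 root=default
ceph osd crush move rack2 root=default
ceph osd crush move rack3 root=default
ceph osd crush move nvme-host1 rack=rack1
ceph osd crush move nvme-host2 rack=rack1
ceph osd crush move nvme-host3 rack=rack2
ceph osd crush move nvme-host4 rack=rack2
ceph osd crush move sas-host1 rack=rack3
ceph osd crush move sas-host2 rack=rack3
# plus a rule with rack as the failure domain, applied to your pool(s):
ceph osd crush rule create-replicated replicated_rack default rack
ceph osd pool set <pool> crush_rule replicated_rack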

That will cause a lot of data movement, so you should try to do it at a time 
when client i/o is expected to be low.  Ceph will do its best to limit the 
impact to client i/o caused by this backfill, but if your writes are already 
poor, they’ll definitely be worse during the movement.

If that capacity is going to be an issue, the recommended fixes get more 
complicated and risky.  However, the best thing you can do, even if you do add 
the suggested racks to your crush map, would be to get 2 NVMEs (or SSDs) for 
each of your SAS hosts to serve as db_devices for the HDDs.  You’ll have to 
remove and recreate those OSDs, but you can do them in smaller batches.
https://docs.ceph.com/en/quincy/cephadm/services/osd/#creating-new-osds
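
As a rough sketch (the service id, host pattern, and rotational-based filters
here are assumptions - check "ceph orch device ls" and adjust before applying):

cat > sas-osd-spec.yaml <<'EOF'
service_type: osd
service_id: sas_hdd_with_fast_db
placement:
  host_pattern: 'sas-host*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    rotational: 0
EOF
ceph orch apply -i sas-osd-spec.yaml --dry-run   # review the proposed layout first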

There is a GUI ceph dashboard available.  
https://docs.ceph.com/en/quincy/mgr/dashboard/
It is very limited in the changes that can be made, and these types of crush 
map changes are definitely not for the dashboard.  But it may help you get a 
useful view of the state of your cluster.
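
For completeness, the low-level way to inspect and hand-edit the CRUSH map (a
sketch - most of the changes above can be done with the higher-level "ceph osd
crush ..." commands instead) is:

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# edit crushmap.txt, then recompile and sanity-check before injecting it
crushtool -c crushmap.txt -o crushmap-new.bin
crushtool -i crushmap-new.bin --test --show-statistics
ceph osd setcrushmap -i crushmap-new.bin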

Best of luck,
Josh Beaman

From: Thorne Lawler 
Date: Tuesday, May 30, 2023 at 9:52 AM
To: ceph-users@ceph.io 
Subject: [EXTERNAL] [ceph-users] Custom CRUSH maps HOWTO?
Hi folks!

I have a Ceph production 17.2.6 cluster with 6 machines in it - four
newer, faster machines with 4x3.84TB NVME drives each, and two with
24x1.68TB SAS disks each.

I know I should have done something smart with the CRUSH maps for this
up front, but until now I have shied away from CRUSH maps as they sound
really complex.

Right now my cluster's performance, especially write performance, is not
what it needs to be, and I am looking for advice:

1. How should I be structuring my crush map, and why?

2. How does one actually edit and manage a CRUSH map? What /commands/
does one use? This isn't clear at all in the documentation. Are there
any GUI tools out there for managing CRUSH?

3. Is this going to impact production performance or availability while
I'm configuring it? I have tens of thousands of users relying on this
thing, so I can't take any risks.

Thanks in advance!

--

Regards,

Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170

_DDNS

/_*Please note:* The information contained in this email message and any
attached files may be confidential information, and may also be the
subject of legal professional privilege. _If you are not the intended
recipient any use, disclosure or copying of this email is unauthorised.
_If you received this email in error, please notify Discount Domain Name
Services Pty Ltd on 03 9815 6868 to report this matter and delete all
copies of this transmission together with any attachments. /
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CEPH Version choice

2023-05-30 Thread Frank Schilder
Hi Marc,

I uploaded all scripts and a rudimentary readme to 
https://github.com/frans42/cephfs-bench . I hope it is sufficient to get 
started. I'm afraid it's very much tailored to our deployment and I can't make 
it fully configurable anytime soon. I hope it serves a purpose though - at 
least I discovered a few bugs with it.

We actually kept the benchmark running through an upgrade from mimic to 
octopus. Was quite interesting to see how certain performance properties change 
with that. This benchmark makes it possible to compare versions with live 
timings coming in.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Marc 
Sent: Monday, May 15, 2023 11:28 PM
To: Frank Schilder
Subject: RE: [ceph-users] Re: CEPH Version choice

> I planned to put it on-line. The hold-back is that the main test is un-
> taring a nasty archive and this archive might contain personal
> information, so I can't just upload it as is. I can try to put together
> a similar archive from public sources. Please give me a bit of time. I'm
> also a bit under stress right now with our users being hit by an FS meta
> data corruption. That's also why I'm a bit trigger happy.
>

Ok thanks, very nice, no hurry!!!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Custom CRUSH maps HOWTO?

2023-05-30 Thread Thorne Lawler

Hi folks!

I have a Ceph production 17.2.6 cluster with 6 machines in it - four 
newer, faster machines with 4x3.84TB NVME drives each, and two with 
24x1.68TB SAS disks each.


I know I should have done something smart with the CRUSH maps for this 
up front, but until now I have shied away from CRUSH maps as they sound 
really complex.


Right now my cluster's performance, especially write performance, is not 
what it needs to be, and I am looking for advice:


1. How should I be structuring my crush map, and why?

2. How does one actually edit and manage a CRUSH map? What /commands/ 
does one use? This isn't clear at all in the documentation. Are there 
any GUI tools out there for managing CRUSH?


3. Is this going to impact production performance or availability while 
I'm configuring it? I have tens of thousands of users relying on this 
thing, so I can't take any risks.


Thanks in advance!

--

Regards,

Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170

_DDNS

/_*Please note:* The information contained in this email message and any 
attached files may be confidential information, and may also be the 
subject of legal professional privilege. _If you are not the intended 
recipient any use, disclosure or copying of this email is unauthorised. 
_If you received this email in error, please notify Discount Domain Name 
Services Pty Ltd on 03 9815 6868 to report this matter and delete all 
copies of this transmission together with any attachments. /

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RBD image mirroring doubt

2023-05-30 Thread Work Ceph
Hello guys,
What would happen if we set up an RBD mirroring configuration, and in the
target system (the system where the RBD image is mirrored) we create
snapshots of this image? Would that cause some problems?

Also, what happens if we delete the source RBD image? Would that trigger a
deletion in the target system RBD image as well?

Thanks in advance!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Important: RGW multisite bug may silently corrupt encrypted objects on replication

2023-05-30 Thread Casey Bodley
On Tue, May 30, 2023 at 8:22 AM Tobias Urdin  wrote:
>
> Hello Casey,
>
> Thanks for the information!
>
> Can you please confirm that this is only an issue when using 
> “rgw_crypt_default_encryption_key”
> config opt that says “testing only” in the documentation [1] to enable 
> encryption and not when using
> Barbican or Vault as KMS or using SSE-C with the S3 API?

unfortunately, all flavors of server-side encryption (SSE-C, SSE-KMS,
SSE-S3, and rgw_crypt_default_encryption_key) are affected by this
bug, as they share the same encryption logic. the main difference is
where they get the key
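
for a single object, a rough check with the AWS CLI against your RGW
endpoint could look like this (endpoint, bucket and key are placeholders):

aws --endpoint-url https://rgw.example.com s3api head-object --bucket mybucket --key path/to/object
# look for a ServerSideEncryption/SSECustomerAlgorithm field in the output and
# an ETag of the multipart form "<md5>-<parts>" (longer than 32 hex characters)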

>
> [1] 
> https://docs.ceph.com/en/quincy/radosgw/encryption/#automatic-encryption-for-testing-only
>
> > On 26 May 2023, at 22:45, Casey Bodley  wrote:
> >
> > Our downstream QE team recently observed an md5 mismatch of replicated
> > objects when testing rgw's server-side encryption in multisite. This
> > corruption is specific to s3 multipart uploads, and only affects the
> > replicated copy - the original object remains intact. The bug likely
> > affects Ceph releases all the way back to Luminous where server-side
> > encryption was first introduced.
> >
> > To expand on the cause of this corruption: Encryption of multipart
> > uploads requires special handling around the part boundaries, because
> > each part is uploaded and encrypted separately. In multisite, objects
> > are replicated in their encrypted form, and multipart uploads are
> > replicated as a single part. As a result, the replicated copy loses
> > its knowledge about the original part boundaries required to decrypt
> > the data correctly.
> >
> > We don't have a fix yet, but we're tracking it in
> > https://tracker.ceph.com/issues/46062. The fix will only modify the
> > replication logic, so won't repair any objects that have already
> > replicated incorrectly. We'll need to develop a radosgw-admin command
> > to search for affected objects and reschedule their replication.
> >
> > In the meantime, I can only advise multisite users to avoid using
> > encryption for multipart uploads. If you'd like to scan your cluster
> > for existing encrypted multipart uploads, you can identify them with a
> > s3 HeadObject request. The response would include a
> > x-amz-server-side-encryption header, and the ETag header value (with
> > "s removed) would be longer than 32 characters (multipart ETags are in
> > the special form "-"). Take care not to delete the
> > corrupted replicas, because an active-active multisite configuration
> > would go on to delete the original copy.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Seeking feedback on Improving cephadm bootstrap process

2023-05-30 Thread Michel Jouvin

+1

Michel

Le 30/05/2023 à 11:23, Frank Schilder a écrit :

What I have in mind is the case where the command is already in the shell history. A wrong history 
reference can execute a command with "--yes-i-really-mean-it" even though you 
really don't mean it. Been there. For an OSD this is maybe tolerable, but for an entire 
cluster ... not really. Some things need to be hard, to limit the blast radius of a typo 
(or attacker).

For example, when issuing such a command the first time, the cluster could print a nonce 
that needs to be included in such a command to make it happen and which is only valid 
once for this exact command, so one actually needs to type something new every time to 
destroy stuff. An exception could be if a "safe-to-destroy" query for any 
daemon (pool etc.) returns true.

I would still not allow an entire cluster to be wiped with a single command. In 
a single step, only allow to destroy what could be recovered in some way (there 
has to be some form of undo). And there should be notifications to all admins 
about what is going on to be able to catch malicious execution of destructive 
commands.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Nico Schottelius 
Sent: Tuesday, May 30, 2023 10:51 AM
To: Frank Schilder
Cc: Nico Schottelius; Redouane Kachach; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Seeking feedback on Improving cephadm bootstrap 
process


Hey Frank,

in regards to destroying a cluster, I'd suggest to reuse the old
--yes-i-really-mean-it parameter, as it is already in use by ceph osd
destroy [0]. Then it doesn't matter whether it's prod or not, if you
really mean it ... ;-)

Best regards,

Nico

[0] https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/

Frank Schilder  writes:


Hi, I would like to second Nico's comment. What happened to the idea that a 
deployment tool should be idempotent? The most natural option would be:

1) start install -> something fails
2) fix problem
3) repeat exact same deploy command -> deployment picks up at current state 
(including cleaning up failed state markers) and tries to continue until next 
issue (go to 2)

I'm not sure (meaning: its a terrible idea) if its a good idea to
provide a single command to wipe a cluster. Just for the fat finger
syndrome. This seems safe only if it would be possible to mark a
cluster as production somehow (must be sticky, that is, cannot be
unset), which prevents a cluster destroy command (or any too dangerous
command) from executing. I understand the test case in the tracker,
but having such test-case utils that can run on a production cluster
and destroy everything seems a bit dangerous.

I think destroying a cluster should be a manual and tedious process
and figuring out how to do it should be part of the learning
experience. So my answer to "how do I start over" would be "go figure
it out, its an important lesson".

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Nico Schottelius 
Sent: Friday, May 26, 2023 10:40 PM
To: Redouane Kachach
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Seeking feedback on Improving cephadm bootstrap 
process


Hello Redouane,

much appreciated kick-off for improving cephadm. I was wondering why
cephadm does not use a similar approach to rook in the sense of "repeat
until it is fixed?"

For the background, rook uses a controller that checks the state of the
cluster, the state of monitors, whether there are disks to be added,
etc. It periodically restarts the checks and when needed shifts
monitors, creates OSDs, etc.

My question is, why not have a daemon or checker subcommand of cephadm
that a) checks what the current cluster status is (i.e. cephadm
verify-cluster) and b) fixes the situation (i.e. cephadm 
verify-and-fix-cluster)?

I think that option would be much more beneficial than the other two
suggested ones.

Best regards,

Nico


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] slow mds requests with random read test

2023-05-30 Thread Ben
Hi,

We are performing a couple of performance tests on CephFS using fio. fio is run
in a k8s pod, and 3 pods are up and running, mounting the same PVC to a CephFS
volume. Here is the command line for random read:
fio -direct=1 -iodepth=128 -rw=randread -ioengine=libaio -bs=4k -size=1G
-numjobs=5 -runtime=500 -group_reporting -directory=/tmp/cache
-name=Rand_Read_Testing_$BUILD_TIMESTAMP
The random read test performs very slowly. Here is the cluster log from the
dashboard:

5/30/23 8:13:16 PM [INF] Health check cleared: MDS_SLOW_REQUEST (was: 1 MDSs report slow requests)
5/30/23 8:13:16 PM [INF] Health check cleared: MDS_SLOW_METADATA_IO (was: 1 MDSs report slow metadata IOs)
5/30/23 8:13:16 PM [INF] MDS health message cleared (mds.?): 1 slow metadata IOs are blocked > 30 secs, oldest blocked for 33 secs
5/30/23 8:13:16 PM [INF] MDS health message cleared (mds.?): 1 slow requests are blocked > 30 secs
5/30/23 8:13:14 PM [WRN] Health check update: 2 MDSs report slow requests (MDS_SLOW_REQUEST)
5/30/23 8:13:13 PM [INF] MDS health message cleared (mds.?): 1 slow requests are blocked > 30 secs
5/30/23 8:13:08 PM [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST)
5/30/23 8:13:08 PM [WRN] Health check failed: 1 MDSs report slow metadata IOs (MDS_SLOW_METADATA_IO)
5/30/23 8:13:08 PM [WRN] slow request 34.213327 seconds old, received at 2023-05-30T12:12:33.951399+: client_request(client.270564:1406144 getattr pAsLsXsFs #0x70103d0 2023-05-30T12:12:33.947323+ caller_uid=0, caller_gid=0{}) currently failed to rdlock, waiting
5/30/23 8:13:08 PM [WRN] 1 slow requests, 1 included below; oldest blocked for > 34.213328 secs
5/30/23 8:13:07 PM [WRN] slow request 33.169703 seconds old, received at 2023-05-30T12:12:33.952078+: peer_request:client.270564:1406144 currently dispatched
5/30/23 8:13:07 PM [WRN] 1 slow requests, 1 included below; oldest blocked for > 33.169704 secs
5/30/23 8:13:04 PM [INF] Cluster is now healthy
5/30/23 8:13:04 PM [INF] Health check cleared: MDS_SLOW_REQUEST (was: 1 MDSs report slow requests)
5/30/23 8:13:04 PM [INF] Health check cleared: MDS_SLOW_METADATA_IO (was: 1 MDSs report slow metadata IOs)
5/30/23 8:13:04 PM [INF] MDS health message cleared (mds.?): 9 slow metadata IOs are blocked > 30 secs, oldest blocked for 45 secs
5/30/23 8:13:04 PM [INF] MDS health message cleared (mds.?): 2 slow requests are blocked > 30 secs
5/30/23 8:12:57 PM [WRN] 2 slow requests, 0 included below; oldest blocked for > 44.954377 secs
5/30/23 8:12:52 PM [WRN] 2 slow requests, 0 included below; oldest blocked for > 39.954313 secs
5/30/23 8:12:48 PM [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST)
5/30/23 8:12:47 PM [WRN] slow request 34.935921 seconds old, received at 2023-05-30T12:12:12.185614+: client_request(client.270564:1406139 create #0x701045b/atomic7966567911433736706tmp 2023-05-30T12:12:12.182999+ caller_uid=0, caller_gid=0{}) currently submit entry: journal_and_reply
5/30/23 8:12:47 PM [WRN] slow request 34.954254 seconds old, received at 2023-05-30T12:12:12.167281+: client_request(client.270564:1406138 rename #0x7010457/build.xml #0x7010457/atomic6590865221269854506tmp 2023-05-30T12:12:12.162999+ caller_uid=0, caller_gid=0{}) currently submit entry: journal_and_reply
5/30/23 8:12:47 PM [WRN] 2 slow requests, 2 included below; oldest blocked for > 34.954254 secs
5/30/23 8:12:44 PM [WRN] Health check failed: 1 MDSs report slow metadata IOs (MDS_SLOW_METADATA_IO)
5/30/23 8:12:41 PM [INF] Cluster is now healthy
5/30/23 8:12:41 PM [INF] Health check cleared: MDS_SLOW_REQUEST (was: 1 MDSs report slow requests)
5/30/23 8:12:41 PM [INF] MDS health message cleared (mds.?): 1 slow requests are blocked > 30 secs
5/30/23 8:12:40 PM [INF] Health check cleared: MDS_SLOW_METADATA_IO (was: 1 MDSs report slow metadata IOs)
5/30/23 8:12:40 PM [INF] MDS health message cleared (mds.?): 1 slow metadata IOs are blocked > 30 secs, oldest blocked for 38 secs

However, the random write test performs very well.

Any suggestions on the problem?

Thanks,
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Important: RGW multisite bug may silently corrupt encrypted objects on replication

2023-05-30 Thread Tobias Urdin
Hello Casey,

Thanks for the information!

Can you please confirm that this is only an issue when using 
“rgw_crypt_default_encryption_key”
config opt that says “testing only” in the documentation [1] to enable 
encryption and not when using
Barbican or Vault as KMS or using SSE-C with the S3 API?

[1] 
https://docs.ceph.com/en/quincy/radosgw/encryption/#automatic-encryption-for-testing-only

> On 26 May 2023, at 22:45, Casey Bodley  wrote:
> 
> Our downstream QE team recently observed an md5 mismatch of replicated
> objects when testing rgw's server-side encryption in multisite. This
> corruption is specific to s3 multipart uploads, and only affects the
> replicated copy - the original object remains intact. The bug likely
> affects Ceph releases all the way back to Luminous where server-side
> encryption was first introduced.
> 
> To expand on the cause of this corruption: Encryption of multipart
> uploads requires special handling around the part boundaries, because
> each part is uploaded and encrypted separately. In multisite, objects
> are replicated in their encrypted form, and multipart uploads are
> replicated as a single part. As a result, the replicated copy loses
> its knowledge about the original part boundaries required to decrypt
> the data correctly.
> 
> We don't have a fix yet, but we're tracking it in
> https://tracker.ceph.com/issues/46062. The fix will only modify the
> replication logic, so won't repair any objects that have already
> replicated incorrectly. We'll need to develop a radosgw-admin command
> to search for affected objects and reschedule their replication.
> 
> In the meantime, I can only advise multisite users to avoid using
> encryption for multipart uploads. If you'd like to scan your cluster
> for existing encrypted multipart uploads, you can identify them with a
> s3 HeadObject request. The response would include a
> x-amz-server-side-encryption header, and the ETag header value (with
> "s removed) would be longer than 32 characters (multipart ETags are in
> the special form "-"). Take care not to delete the
> corrupted replicas, because an active-active multisite configuration
> would go on to delete the original copy.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Seeking feedback on Improving cephadm bootstrap process

2023-05-30 Thread Frank Schilder
What I have in mind is the case where the command is already in the shell history. A wrong 
history reference can execute a command with "--yes-i-really-mean-it" even 
though you really don't mean it. Been there. For an OSD this is maybe 
tolerable, but for an entire cluster ... not really. Some things need to be 
hard, to limit the blast radius of a typo (or attacker).

For example, when issuing such a command the first time, the cluster could 
print a nonce that needs to be included in such a command to make it happen and 
which is only valid once for this exact command, so one actually needs to type 
something new every time to destroy stuff. An exception could be if a 
"safe-to-destroy" query for any daemon (pool etc.) returns true.

I would still not allow an entire cluster to be wiped with a single command. In 
a single step, only allow to destroy what could be recovered in some way (there 
has to be some form of undo). And there should be notifications to all admins 
about what is going on to be able to catch malicious execution of destructive 
commands.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Nico Schottelius 
Sent: Tuesday, May 30, 2023 10:51 AM
To: Frank Schilder
Cc: Nico Schottelius; Redouane Kachach; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Seeking feedback on Improving cephadm bootstrap 
process


Hey Frank,

in regards to destroying a cluster, I'd suggest to reuse the old
--yes-i-really-mean-it parameter, as it is already in use by ceph osd
destroy [0]. Then it doesn't matter whether it's prod or not, if you
really mean it ... ;-)

Best regards,

Nico

[0] https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/

Frank Schilder  writes:

> Hi, I would like to second Nico's comment. What happened to the idea that a 
> deployment tool should be idempotent? The most natural option would be:
>
> 1) start install -> something fails
> 2) fix problem
> 3) repeat exact same deploy command -> deployment picks up at current state 
> (including cleaning up failed state markers) and tries to continue until next 
> issue (go to 2)
>
> I'm not sure (meaning: its a terrible idea) if its a good idea to
> provide a single command to wipe a cluster. Just for the fat finger
> syndrome. This seems safe only if it would be possible to mark a
> cluster as production somehow (must be sticky, that is, cannot be
> unset), which prevents a cluster destroy command (or any too dangerous
> command) from executing. I understand the test case in the tracker,
> but having such test-case utils that can run on a production cluster
> and destroy everything seems a bit dangerous.
>
> I think destroying a cluster should be a manual and tedious process
> and figuring out how to do it should be part of the learning
> experience. So my answer to "how do I start over" would be "go figure
> it out, its an important lesson".
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Nico Schottelius 
> Sent: Friday, May 26, 2023 10:40 PM
> To: Redouane Kachach
> Cc: ceph-users@ceph.io
> Subject: [ceph-users] Re: Seeking feedback on Improving cephadm bootstrap 
> process
>
>
> Hello Redouane,
>
> much appreciated kick-off for improving cephadm. I was wondering why
> cephadm does not use a similar approach to rook in the sense of "repeat
> until it is fixed?"
>
> For the background, rook uses a controller that checks the state of the
> cluster, the state of monitors, whether there are disks to be added,
> etc. It periodically restarts the checks and when needed shifts
> monitors, creates OSDs, etc.
>
> My question is, why not have a daemon or checker subcommand of cephadm
> that a) checks what the current cluster status is (i.e. cephadm
> verify-cluster) and b) fixes the situation (i.e. cephadm 
> verify-and-fix-cluster)?
>
> I think that option would be much more beneficial than the other two
> suggested ones.
>
> Best regards,
>
> Nico


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Seeking feedback on Improving cephadm bootstrap process

2023-05-30 Thread Nico Schottelius

Hey Frank,

in regards to destroying a cluster, I'd suggest to reuse the old
--yes-i-really-mean-it parameter, as it is already in use by ceph osd
destroy [0]. Then it doesn't matter whether it's prod or not, if you
really mean it ... ;-)

Best regards,

Nico

[0] https://docs.ceph.com/en/latest/rados/operations/add-or-rm-osds/

Frank Schilder  writes:

> Hi, I would like to second Nico's comment. What happened to the idea that a 
> deployment tool should be idempotent? The most natural option would be:
>
> 1) start install -> something fails
> 2) fix problem
> 3) repeat exact same deploy command -> deployment picks up at current state 
> (including cleaning up failed state markers) and tries to continue until next 
> issue (go to 2)
>
> I'm not sure (meaning: its a terrible idea) if its a good idea to
> provide a single command to wipe a cluster. Just for the fat finger
> syndrome. This seems safe only if it would be possible to mark a
> cluster as production somehow (must be sticky, that is, cannot be
> unset), which prevents a cluster destroy command (or any too dangerous
> command) from executing. I understand the test case in the tracker,
> but having such test-case utils that can run on a production cluster
> and destroy everything seems a bit dangerous.
>
> I think destroying a cluster should be a manual and tedious process
> and figuring out how to do it should be part of the learning
> experience. So my answer to "how do I start over" would be "go figure
> it out, its an important lesson".
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: Nico Schottelius 
> Sent: Friday, May 26, 2023 10:40 PM
> To: Redouane Kachach
> Cc: ceph-users@ceph.io
> Subject: [ceph-users] Re: Seeking feedback on Improving cephadm bootstrap 
> process
>
>
> Hello Redouane,
>
> much appreciated kick-off for improving cephadm. I was wondering why
> cephadm does not use a similar approach to rook in the sense of "repeat
> until it is fixed?"
>
> For the background, rook uses a controller that checks the state of the
> cluster, the state of monitors, whether there are disks to be added,
> etc. It periodically restarts the checks and when needed shifts
> monitors, creates OSDs, etc.
>
> My question is, why not have a daemon or checker subcommand of cephadm
> that a) checks what the current cluster status is (i.e. cephadm
> verify-cluster) and b) fixes the situation (i.e. cephadm 
> verify-and-fix-cluster)?
>
> I think that option would be much more beneficial than the other two
> suggested ones.
>
> Best regards,
>
> Nico


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Seeking feedback on Improving cephadm bootstrap process

2023-05-30 Thread Frank Schilder
Hi, I would like to second Nico's comment. What happened to the idea that a 
deployment tool should be idempotent? The most natural option would be:

1) start install -> something fails
2) fix problem
3) repeat exact same deploy command -> deployment picks up at current state 
(including cleaning up failed state markers) and tries to continue until next 
issue (go to 2)

I'm not sure (meaning: it's a terrible idea) if it's a good idea to provide a 
single command to wipe a cluster, just for the fat-finger syndrome. This seems 
safe only if it would be possible to mark a cluster as production somehow (must 
be sticky, that is, cannot be unset), which prevents a cluster destroy command 
(or any too dangerous command) from executing. I understand the test case in 
the tracker, but having such test-case utils that can run on a production 
cluster and destroy everything seems a bit dangerous.

I think destroying a cluster should be a manual and tedious process and 
figuring out how to do it should be part of the learning experience. So my 
answer to "how do I start over" would be "go figure it out, it's an important 
lesson".

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Nico Schottelius 
Sent: Friday, May 26, 2023 10:40 PM
To: Redouane Kachach
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: Seeking feedback on Improving cephadm bootstrap 
process


Hello Redouane,

much appreciated kick-off for improving cephadm. I was wondering why
cephadm does not use a similar approach to rook in the sense of "repeat
until it is fixed?"

For the background, rook uses a controller that checks the state of the
cluster, the state of monitors, whether there are disks to be added,
etc. It periodically restarts the checks and when needed shifts
monitors, creates OSDs, etc.

My question is, why not have a daemon or checker subcommand of cephadm
that a) checks what the current cluster status is (i.e. cephadm
verify-cluster) and b) fixes the situation (i.e. cephadm 
verify-and-fix-cluster)?

I think that option would be much more beneficial than the other two
suggested ones.

Best regards,

Nico


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Newer linux kernel cephfs clients is more trouble?

2023-05-30 Thread Stefan Kooman

On 5/29/23 20:25, Dan van der Ster wrote:

Hi,

Sorry for poking this old thread, but does this issue still persist in
the 6.3 kernels?


We are running a mail cluster setup with the 6.3.1 kernel and it's not 
giving us any performance issues. We have not yet upgraded our shared 
webhosting platform (where we experienced this issue) to this kernel, 
but it does not seem like it is still an issue. A CephFS user 
asked this question during Cephalocon 2023, see [1], and Patrick answered 
that Xiubo Li got it fixed (supposedly a read-ahead issue) but did not 
know if it was already mainlined.


Gr. Stefan

[1]: https://youtu.be/MRXbU96BMoE?t=1773
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io