[ceph-users] Re: How does mclock work?

2024-01-09 Thread Sridhar Seshasayee
Hello Frédéric, Please see answers below.
> Could someone please explain how mclock works regarding reads and writes?
> Does mclock intervene on both read and write iops? Or only on reads or only
> on writes?
mClock schedules both read and write ops.
> And what type of underlying

[ceph-users] Re: About ceph disk slowops effect to cluster

2024-01-09 Thread David Yang
The 2*10Gbps shared network seems to be full (1.9GB/s). Is it possible to reduce part of the workload and wait for the cluster to return to a healthy state? Tip: Erasure coding needs to collect all data blocks when recovering data, so it takes up a lot of network card bandwidth and processor

[ceph-users] Re: RGW rate-limiting or anti-hammering for (external) auth requests // Anti-DoS measures

2024-01-09 Thread Szabo, Istvan (Agoda)
Hi, I'm using the following in the frontend https config on haproxy, and it has worked well so far:
    stick-table type ip size 1m expire 10s store http_req_rate(10s)
    tcp-request inspect-delay 10s
    tcp-request content track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 1 }
Istvan

[ceph-users] Join us for the User + Dev Monthly Meetup - January 18th!

2024-01-09 Thread Laura Flores
Hi Ceph users and developers, You are invited to join us at the User + Dev meeting this week Thursday, January 18th at 10:00 AM Eastern Time! See below for more meeting details. The focus topic, "Ceph Feature Request from the DKIST Data Center: Add a service backed by tape that is analogous to

[ceph-users] Re: mds crashes after up:replay state

2024-01-09 Thread Paul Mezzanini
There isn't one specific thing I can point my finger at that would be "_this_ is where all the pain comes from". Some of these issues are also our own doing. We have been getting too comfortable seeing the cluster in HEALTH_WARN with "1 clients failing to respond to capability release" and

[ceph-users] Re: How does mclock work?

2024-01-09 Thread Anthony D'Atri
There was a client SSD sorta like that, a bit of Optane with TLC or QLC, but it didn't seem to sell well. Optane was groovy tech, but with certain challenges as well.
> On Jan 9, 2024, at 14:30, Mark Nelson wrote:
>
> With HDDs and a lot of metadata, it's tough to get away from it imho. In

[ceph-users] Re: How does mclock work?

2024-01-09 Thread Mark Nelson
With HDDs and a lot of metadata, it's tough to get away from it imho. In an alternate universe it would have been really neat if Intel could have worked with the HDD vendors to put like 16GB of user-accessible Optane on every HDD. Enough for the WAL and L0 (and maybe L1). Mark On 1/9/24

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2024-01-09 Thread Frank Schilder
Quick answers: * ... osd_deep_scrub_randomize_ratio ... but not on Octopus: is it still a valid parameter? Yes, this parameter exists and can be used to prevent premature deep-scrubs. The effect is dramatic. * ... essentially by playing with osd_scrub_min_interval,... The main

[ceph-users] Re: How does mclock work?

2024-01-09 Thread Anthony D'Atri
Not strictly an answer to your worthy question, but IMHO this supports my stance that hybrid OSDs aren't worth the hassle.
> On Jan 9, 2024, at 06:13, Frédéric Nass wrote:
>
> With hybrid setups (RocksDB+WAL on SSDs or NVMes and Data on HDD), if mclock
> only considers write performance,

[ceph-users] Re: Stuck in upgrade process to reef

2024-01-09 Thread Igor Fedotov
Hi Marek, I haven't looked through those upgrade logs yet, but here are some comments regarding the last OSD startup attempt. First, answering your question: "_init_alloc::NCB::restore_allocator() failed! Run Full Recovery from ONodes (might take a while)" Is it a mandatory part of fsck?

[ceph-users] How does mclock work?

2024-01-09 Thread Frédéric Nass
Hello, Could someone please explain how mclock works regarding reads and writes? Does mclock intervene on both read and write iops? Or only on reads or only on writes? And what type of underlying hardware performance is calculated and considered by mclock? Seems to be only write

[ceph-users] Re: RGW rate-limiting or anti-hammering for (external) auth requests // Anti-DoS measures

2024-01-09 Thread Christian Rohmann
Happy New Year Ceph-Users! With the holidays and people likely being away, I'm taking the liberty of bluntly BUMPing this question about protecting RGW from DoS below: On 22.12.23 10:24, Christian Rohmann wrote: Hey Ceph-Users, RGW does have options [1] to rate limit ops or bandwidth per bucket