I faced a similar issue. The PG just would never finish recovery. Changing
all OSDs in the PG to "osd_op_queue wpq" and then restarting them serially
ultimately allowed the PG to recover. Seemed to be some issue with mclock.
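For anyone wanting to try the same, a minimal sketch of that workaround
(assuming cephadm-managed OSDs; the OSD ids are placeholders):

# switch the scheduler back to wpq; this only takes effect on restart
ceph config set osd osd_op_queue wpq
# then restart the OSDs in the PG one at a time, letting each come back up
ceph orch daemon restart osd.1
ceph orch daemon restart osd.2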
Respectfully,
*Wes Dillingham*
w...@wesdillingham.com
Hello Frank.
I have 84 clients (high-end servers) with: Ubuntu 20.04.5 LTS - Kernel:
Linux 5.4.0-125-generic
My cluster is on 17.2.6 Quincy.
I have some client nodes with "ceph-common/stable,now 17.2.7-1focal". I
wonder if using newer-version clients is the main problem?
Maybe I have a communication error.
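One quick way to see which client releases are actually connected (a
sketch; the output buckets clients and daemons by feature/release):

ceph features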
For what it's worth, we saw this last week at Clyso on two separate
customer clusters on 17.2.7 and also solved it by moving back to wpq.
We've been traveling this week so haven't created an upstream tracker
for it yet, but we're back to recommending wpq to our customers for all
production clusters.
I've seen C-states impact mons by dropping a bunch of packets, on nodes that
were lightly utilized and so transitioned a lot. Curiously, both CPU and NIC
generation seemed to be factors, as it only happened on one cluster out of a
dozen or so.
If by SSD you mean SAS/SATA SSDs, then the
I started to investigate my clients.
for example:
root@ud-01:~# ceph health detail
HEALTH_WARN 1 clients failing to respond to cache pressure
[WRN] MDS_CLIENT_RECALL: 1 clients failing to respond to cache pressure
mds.ud-data.ud-02.xcoojt(mds.0): Client bmw-m4 failing to respond to cache pressure
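To spot which client is sitting on caps, the sessions can be listed on the
MDS named in the warning (a sketch, using the mds name from the output above):

ceph tell mds.ud-data.ud-02.xcoojt client ls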
Wow I noticed something!
To prevent RAM overflow with GPU training allocations, I'm using a 2TB
Samsung 870 EVO for swap.
As you can see below, swap usage was 18Gi while the server was idle, which
means the ceph client may be hitting latency because of swap usage.
I decided to tune the CephFS clients' kernels and increase the network
buffers to improve speed.
This time my client has 1x 10Gbit DAC cable.
Client version is 1 step ahead: ceph-common/stable,now 17.2.7-1focal amd64
[installed]
The kernel tunings:
root@maradona:~# cat /etc/sysctl.conf
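A representative sketch of such tunables (all values are assumptions, not
the ones actually used; applied with sysctl -p after editing):

# larger socket buffers for the 10Gbit link
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# keep the ceph client out of swap as much as possible
vm.swappiness = 10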
> Just curious, can decreasing rocksdb_cf_compact_on_deletion_trigger 16384 >
> 4096 hurt performance of HDD OSDs in any way? I have no growing latency on
> HDD OSDs, where data is stored, but it would be easier to set it in the [osd]
> section without cherry-picking only SSD/NVMe OSDs, but for all at once.
On 1/26/24 11:26, Roman Pashin wrote:
Unfortunately they cannot. You'll want to set them in centralized conf
and then restart OSDs for them to take effect.
Got it. Thank you Josh! Will put it in the config of affected OSDs and restart
them.
Just curious, can decreasing rocksdb_cf_compact_on_deletion_trigger hurt
performance of HDD OSDs in any way?
Hi,
The following article:
https://ceph.io/en/news/blog/2024/ceph-a-journey-to-1tibps/
suggests disabling C-states on your CPUs (on the OSD nodes) as one method
to improve performance. The article seems to indicate that the scenario
being addressed was with NVMe OSDs.
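For reference, a common way to do this (an assumption about the method, not
taken from the article) is either adding intel_idle.max_cstate=1
processor.max_cstate=1 to the kernel command line, or using tuned:

# the latency-performance profile caps C-states via its force_latency setting
tuned-adm profile latency-performance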
Hi
A few years ago we were really strapped for space, so we tweaked pg_num
for some pools to ensure all PGs were as close to the same size as
possible while still observing the power-of-2 rule, in order to get the
most mileage space-wise. We set the auto-scaler to off for the tweaked
pools.
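Roughly along these lines, with the pool name and PG count as placeholders:

# pin the pool at the nearest power of 2 and stop the autoscaler touching it
ceph osd pool set mypool pg_num 512
ceph osd pool set mypool pg_autoscale_mode off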
Yes, my dashboard looks good here as well. :-)
Quoting Martin:
Hi Eugen,
Yes, you are right.
After the upgrade from v18.2.0 ---> v18.2.1 it is necessary to create
the ceph-exporter service manually and deploy it to all hosts.
The dashboard is fine as well.
Thanks for help.
Martin
On
Hi Mark,
>> In v17.2.7 we enabled a feature that automatically performs a compaction
>> if too many tombstones are present during iteration in RocksDB. It
>> might be worth upgrading to see if it helps (you might have to try
>> tweaking the settings if the defaults aren't helping enough). The PR
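The settings referred to should be the deletion-compaction tunables discussed
later in this thread; a sketch of checking their current values:

ceph config get osd rocksdb_cf_compact_on_deletion_trigger
ceph config get osd rocksdb_cf_compact_on_deletion_sliding_window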
Hi Eugen,
Yes, you are right.
After the upgrade from v18.2.0 ---> v18.2.1 it is necessary to create the
ceph-exporter service manually and deploy it to all hosts.
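For anyone hitting the same thing, a sketch of the manual step (assuming a
cephadm-managed cluster):

# deploy ceph-exporter on all hosts via the orchestrator
ceph orch apply ceph-exporter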
The dashboard is fine as well.
Thanks for help.
Martin
On 26/01/2024 00:17, Eugen Block wrote:
Ah, there they are (different port):
Performance for small files is more about IOPS than throughput,
and the IOPS in your fio tests look okay to me. What you could try is
to split the PGs to get around 150 or 200 PGs per OSD. You're
currently at around 60 according to the ceph osd df output. Before you
do that, can you
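If you do end up splitting, it is a single setting per pool (pool name and
target count are placeholders; since Nautilus, pgp_num follows automatically):

ceph osd pool set mypool pg_num 2048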
If you ask me or Joachim, we'll tell you to disable autoscaler. ;-) It
doesn't seem mature enough yet, especially with many pools. There have
been multiple threads in the past discussing this topic, I'd suggest
to leave it disabled. Or you could help improve it, maybe create a
tracker issue.
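For completeness, a sketch of turning it off, per pool and as the default
for new pools:

ceph osd pool set mypool pg_autoscale_mode off
ceph config set global osd_pool_default_pg_autoscale_mode off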
Hi Matt
Thanks for your answer.
Should I open a bug report then?
How would I be able to read more from it? Have multiple threads access
it and read from it simultaneously?
Marc
On 1/25/24 20:25, Matt Benjamin wrote:
Hi Marc,
No, the only thing you need to do with the Unix socket is to
Hi, this message is one of those that are often spurious. I don't recall in
which thread/PR/tracker I read it, but the story was something like this:
if an MDS gets under memory pressure it will request dentry items back from
*all* clients, not just the active ones or the ones holding many of them.
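If that is what is going on, one knob people try is giving the MDS more cache
headroom so it recalls caps less aggressively (a sketch; the 8 GiB value is
an arbitrary assumption):

ceph config set mds mds_cache_memory_limit 8589934592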
Hi,
This is a cluster running 17.2.7, upgraded from 16.2.6 on 15 January
2024.
On Monday 22 January we had 4 HDDs, all on different servers, with I/O
errors because of some damaged sectors. The OSDs are hybrid, so the DB is
on SSD, with 5 HDDs sharing 1 SSD.
I set the OSDs out: ceph osd out 223 269 290
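A typical follow-up once backfill finishes (an assumption, not from this
report) is to confirm the OSDs can be removed without risking data:

ceph osd safe-to-destroy 223 269 290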
Hi Marc,
1. if you can, yes, create a tracker issue on tracker.ceph.com
2. you might be able to get more throughput with some number of
additional threads; the first thing I would try is prioritization (nice)
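A sketch of the prioritization idea, with the reader binary and socket path
as placeholders:

# run the unix-socket reader at a higher CPU scheduling priority
nice -n -10 ./opslog-reader /var/run/ceph/opslog.sock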
regards,
Matt
On Fri, Jan 26, 2024 at 6:08 AM Marc Singer wrote:
> Hi Matt
>
>
> Do you know if rocksdb_cf_compact_on_deletion_trigger and
> rocksdb_cf_compact_on_deletion_sliding_window can be changed in runtime
> without OSD restart?
Unfortunately they cannot. You'll want to set them in centralized conf
and then restart OSDs for them to take effect.
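In practice something like this (a sketch; the 4096 value is from the
question above, the OSD id is a placeholder):

ceph config set osd rocksdb_cf_compact_on_deletion_trigger 4096
ceph orch daemon restart osd.0   # repeat per affected OSD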
Josh
On Fri, Jan
> Unfortunately they cannot. You'll want to set them in centralized conf
> and then restart OSDs for them to take effect.
>
Got it. Thank you Josh! Will put it in the config of affected OSDs and restart
them.
Just curious, can decreasing rocksdb_cf_compact_on_deletion_trigger 16384 >
4096 hurt performance of HDD OSDs in any way?
On Fri, Jan 26, 2024 at 3:35 AM Torkil Svensgaard wrote:
>
> The weirdest one:
>
> Pool rbd_ec_data stores 683TB in 4096 pgs -> warn should be 1024
> Pool rbd_internal stores 86TB in 1024 pgs -> warn should be 2048
>
> That makes no sense to me based on the amount of data stored. Is this a
> bug?
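One way to see how the autoscaler arrives at those numbers is to dump its
view of the pools (a sketch):

ceph osd pool autoscale-status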