[ceph-users] Re: Stable and fastest ceph version for RBD cluster.

2024-08-12 Thread Mark Nelson
Hi Özkan, I've written a couple of articles that might be helpful: https://ceph.io/en/news/blog/2023/reef-osds-per-nvme/ https://ceph.io/en/news/blog/2023/reef-freeze-rbd-performance/ https://ceph.io/en/news/blog/2023/reef-freeze-rgw-performance/ https://ceph.io/en/news/blog/2024/ceph-a-journey-

[ceph-users] Re: Ceph ARM providing storage for x86

2024-05-27 Thread Mark Nelson
Once upon a time there were some endian issues running clusters with mixed hardware, though I don't think it affected clients.  As far as I know those were all resolved many years ago. Mark On 5/25/24 08:46, Anthony D'Atri wrote: Why not? The hwarch doesn't matter. On May 25, 2024, at 07

[Bug 2064999] Re: Prevent soft lockups during IOMMU streaming DMA mapping by limiting nvme max_hw_sectors_kb to cache optimised size

2024-05-09 Thread Mark Nelson
Hey folks, I think we may have encountered this or a variant of this while running extremely strenuous Ceph performance tests on a very high speed cluster we designed for a customer. We have a write-up that includes a section on needing to disable iommu here: https://ceph.io/en/news/blog/2024/ce

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Mark Nelson
after the more recent updates landed in 17.2.6+ though. Mark On 5/2/24 00:05, Sridhar Seshasayee wrote: Hi Mark, On Thu, May 2, 2024 at 3:18 AM Mark Nelson wrote: For our customers we are still disabling mclock and using wpq. Might be worth trying. Could you please elaborate a bit on t
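
A minimal sketch of switching the scheduler back to wpq via centralized config (the OSD id is illustrative; osd_op_queue only takes effect after the OSDs are restarted):

  ceph config set osd osd_op_queue wpq
  ceph orch daemon restart osd.0    # per OSD; or systemctl restart ceph-osd@0 on non-cephadm hosts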

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Mark Nelson
mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and

[ceph-users] Re: Migrating from S3 to Ceph RGW (Cloud Sync Module)

2024-04-15 Thread Mark Nelson
At Clyso we've been building a tool that can migrate S3 data around called Chorus. Normally I wouldn't promote it here, but it's open source and sounds like it might be useful in this case. I don't work on it myself, but thought I'd mention it: https://github.com/clyso/chorus One problem wi

[ceph-users] Re: Are we logging IRC channels?

2024-03-22 Thread Mark Nelson
e a lot of discussions have migrated into different channels, though #ceph still gets some community traffic (and a lot of hardware design discussion). Mark On 3/22/24 02:15, Alvaro Soto wrote: Should we bring to life this again? On Tue, Mar 19, 2024, 8:14 PM Mark Nelson <mailto:mark.a.nel...@gmai

[ceph-users] Re: Are we logging IRC channels?

2024-03-19 Thread Mark Nelson
A long time ago Wido used to have a bot logging IRC afaik, but I think that's been gone for some time. Mark On 3/19/24 19:36, Alvaro Soto wrote: Hi Community!!! Are we logging IRC channels? I ask this because a lot of people only use Slack, and the Slack we use doesn't have a subscription, s

[ceph-users] Re: Number of pgs

2024-03-05 Thread Mark Nelson
There are both pros and cons to having more PGs. Here are a couple of considerations: Pros: 1) Better data distribution prior to balancing (and maybe after) 2) Fewer objects/data per PG 3) Lower per-PG lock contention Cons: 1) Higher PG log memory usage until you hit the osd target unless you
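
A hedged example of inspecting and raising a pool's PG count (the pool name "testpool" is illustrative; on Nautilus and later, pgp_num is adjusted gradually to follow pg_num):

  ceph osd pool get testpool pg_num
  ceph osd pool set testpool pg_num 256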

[ceph-users] Re: Performance improvement suggestion

2024-03-04 Thread Mark Nelson
ep up with the WAL ingestion rate. Mark ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minne

[ceph-users] Re: High IO utilization for bstore_kv_sync

2024-02-22 Thread Mark Nelson
r SSD disks. Is there any parameter that we can change/tune to better handle the call "fdatasync"? Maybe using NVMes for the RocksDB? On Thu, Feb 22, 2024 at 2:24 PM Mark Nelson wrote: Most likely you are seeing time spent waiting on fdatasync in bstore_kv_sync if the drives you
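
As a hedged sketch, a new OSD can be built with its RocksDB/WAL on a faster NVMe partition using ceph-volume (device paths are illustrative; existing OSDs would need their DB migrated instead, e.g. with ceph-volume lvm migrate):

  ceph-volume lvm create --data /dev/sdb --block.db /dev/nvme0n1p1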

[ceph-users] Re: High IO utilization for bstore_kv_sync

2024-02-22 Thread Mark Nelson
nt of data being read and written to the OSD/disks? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552

[ceph-users] Re: PSA: Long Standing Debian/Ubuntu build performance issue (fixed, backports in progress)

2024-02-09 Thread Mark Nelson
mes to the projects which is critical to the performance. RocksDB is one of them."/ Do we have similar issues with other sub-projects ? boost ? spdk .. ? 2) The chart shown on "rocksdb submit latency", going from over 10 ms to below 5 ms..is this during write i/o under heavy loa

[ceph-users] PSA: Long Standing Debian/Ubuntu build performance issue (fixed, backports in progress)

2024-02-08 Thread Mark Nelson
github.com/ceph/ceph/pull/55501 https://github.com/ceph/ceph/pull/55502 Please feel free to reply if you have any questions! Thanks, Mark -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota, USA w: https://clyso.com | e: mark.nel...@clyso.

[ceph-users] Re: OSD read latency grows over time

2024-02-02 Thread Mark Nelson
min usage trim” does SingleDelete() in the RocksDB layer when deleting objects that could be bulk deleted (RangeDelete?) due to them having the same prefix (name + date). Best regards On 26 Jan 2024, at 23:18, Mark Nelson wrote: On 1/26/24 11:26, Roman Pashin wrote: Unfortunately they cann

[ceph-users] Re: OSD read latency grows over time

2024-01-26 Thread Mark Nelson
On 1/26/24 11:26, Roman Pashin wrote: Unfortunately they cannot. You'll want to set them in centralized conf and then restart OSDs for them to take effect. Got it. Thank you Josh! WIll put it to config of affected OSDs and restart them. Just curious, can decreasing rocksdb_cf_compact_on_delet
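
A minimal sketch of the centralized-config-plus-restart workflow described above (the option name and value are illustrative placeholders for the compact-on-deletion settings being tuned; verify exact names with ceph config help):

  ceph config set osd bluestore_rocksdb_cf_compact_on_deletion true
  ceph orch daemon restart osd.12    # repeat for each affected OSD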

[ceph-users] Re: 17.2.7: Backfilling deadlock / stall / stuck / standstill

2024-01-26 Thread Mark Nelson
For what it's worth, we saw this last week at Clyso on two separate customer clusters on 17.2.7 and also solved it by moving back to wpq.  We've been traveling this week so haven't created an upstream tracker for it yet, but we're back to recommending wpq to our customers for all production clu

[ceph-users] Re: OSD read latency grows over time

2024-01-19 Thread Mark Nelson
ter as failure domain. 7x HDD (data), 2x SSD (index), 1x NVME (wal + OS). ceph config - https://pastebin.com/pCqxXhT3 OSD read latency graph - https://postimg.cc/5YHk9bby -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota, USA w: https://

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-19 Thread Mark Nelson
n, and have been rocking with single OSD/NVMe, but as always everything is workload dependant and there is sometimes a need for doubling up 😊 Regards, Bailey -Original Message- From: Maged Mokhtar Sent: January 17, 2024 4:59 PM To: Mark Nelson ; ceph-users@ceph.io Subject: [ceph-users]

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Mark Nelson
7, 2024 4:59 PM To: Mark Nelson ; ceph-users@ceph.io Subject: [ceph-users] Re: Performance impact of Heterogeneous environment Very informative article you did Mark. IMHO if you find yourself with very high per-OSD core count, it may be logical to just pack/add more nvmes per host, you'd

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Mark Nelson
mmunity/bluestore-default-vs-tuned-performance-comparison/ ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 2

[ceph-users] Re: How does mclock work?

2024-01-09 Thread Mark Nelson
With HDDs and a lot of metadata, it's tough to get away from it imho.  In an alternate universe it would have been really neat if Intel could have worked with the HDD vendors to put like 16GB of user accessible optane on every HDD.  Enough for the WAL and L0 (and maybe L1). Mark On 1/9/24 0

[ceph-users] Re: After upgrading from 17.2.6 to 18.2.0, OSDs are very frequently restarting due to livenessprobe failures

2023-09-28 Thread Mark Nelson
ive such a reversion hence better/safer to do that gradually. Hope this helps, and awaiting your feedback. Thanks, Igor On 27/09/2023 22:04, sbeng...@gmail.com wrote: Hi Igor, I have copied three OSD logs to https://drive.google.com/file/d/1aQxibFJR6Dzvr3RbuqnpPhaSMhPSL--F/vie

[ceph-users] Re: Ceph MDS OOM in combination with 6.5.1 kernel client

2023-09-20 Thread Mark Nelson
ceph-users@ceph.io/thread/YR5UNKBOKDHPL2PV4J75ZIUNI4HNMC2W/ ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391

[ceph-users] Re: ceph_leadership_team_meeting_s18e06.mkv

2023-09-07 Thread Mark Nelson
y already tried: - Disabling modules (like the restful one, which was a suspect) - Enabling debug 20 - Turning the pg autoscaler off Debugging will continue to characterize this issue: - Enable profiling (Mark Nelson) - Try Bloomberg's Python mem profiler <https://github.c

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-07 Thread Mark Nelson
s the difference between a cluster that gives me enough I/Os to work properly and a cluster that murders my performances. I hope this helps. Feel free to ask us if you need further details and I'll see what I can do. On 9/7/23 13:59, Mark Nelson wrote: Ok, good to know.  Please feel f

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-07 Thread Mark Nelson
're 95% sure that, contrary to our previous beliefs, it's an issue with changes to the bluestore_allocator and not the compaction process. That said, I will keep this email in mind as we will want to test optimizations to compaction on our test environment. On 9/7/23 12:32, Mark Nelson w

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-07 Thread Mark Nelson
ry $timeperiod to fix any potential RocksDB degradation. That's what we do. What kind of workload do you run (i.e. RBD, CephFS, RGW)? Do you also see these timeouts occur during deep-scrubs? Gr. Stefan -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 215523

[ceph-users] Re: snaptrim number of objects

2023-08-22 Thread Mark Nelson
.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&

[ceph-users] Re: question about OSD onode hits ratio

2023-08-04 Thread Mark Nelson
iling list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-16 Thread Mark Nelson
On 7/10/23 11:19 AM, Matthew Booth wrote: On Thu, 6 Jul 2023 at 12:54, Mark Nelson wrote: On 7/6/23 06:02, Matthew Booth wrote: On Wed, 5 Jul 2023 at 15:18, Mark Nelson wrote: I'm sort of amazed that it gave you symbols without the debuginfo packages installed. I'll need to fi

[ceph-users] Re: OSD memory usage after cephadm adoption

2023-07-11 Thread Mark Nelson
aggregate memory size is decided on, it goes through a process of looking at how hot the different caches are and assigns memory based on where it thinks the memory would be most useful.  Again this is based on mapped memory though.  It can't force the kernel to reclaim memory that has alread

[ceph-users] Re: OSD memory usage after cephadm adoption

2023-07-11 Thread Mark Nelson
limit? Thanks Luis Domingues Proton AG ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 Mü

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-06 Thread Mark Nelson
On 7/6/23 06:02, Matthew Booth wrote: On Wed, 5 Jul 2023 at 15:18, Mark Nelson wrote: I'm sort of amazed that it gave you symbols without the debuginfo packages installed. I'll need to figure out a way to prevent that. Having said that, your new traces look more accurate to me.

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-05 Thread Mark Nelson
On 7/4/23 10:39, Matthew Booth wrote: On Tue, 4 Jul 2023 at 10:00, Matthew Booth wrote: On Mon, 3 Jul 2023 at 18:33, Ilya Dryomov wrote: On Mon, Jul 3, 2023 at 6:58 PM Mark Nelson wrote: On 7/3/23 04:53, Matthew Booth wrote: On Thu, 29 Jun 2023 at 14:11, Mark Nelson wrote: This

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-03 Thread Mark Nelson
On 7/3/23 04:53, Matthew Booth wrote: On Thu, 29 Jun 2023 at 14:11, Mark Nelson wrote: This container runs: fio --rw=write --ioengine=sync --fdatasync=1 --directory=/var/lib/etcd --size=100m --bs=8000 --name=etcd_perf --output-format=json --runtime=60 --time_based=1 And extracts

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-06-29 Thread Mark Nelson
1952 hit_bytes: 6 KiB / 0% miss_bytes: 349 MiB -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.clyso.com/jobs/ ___

[ceph-users] Re: osd memory target not work

2023-06-20 Thread Mark Nelson
arget 204800 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://

[ceph-users] Re: radosgw hang under pressure

2023-06-12 Thread Mark Nelson
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.ne

[ceph-users] Re: reef v18.1.0 QE Validation status

2023-06-01 Thread Mark Nelson
iling list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.clyso

[ceph-users] Re: BlueStore fragmentation woes

2023-05-31 Thread Mark Nelson
y+. Hence you're free to go with Pacific and enable 4K for BlueFS later in Quincy. Ah, that's good to know. Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best

[ceph-users] Re: RGW versioned bucket index issues

2023-05-31 Thread Mark Nelson
e plugged all of the holes. Thanks, Cory Snyder 11:11 Systems ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a:

[ceph-users] Re: BlueStore fragmentation woes

2023-05-25 Thread Mark Nelson
On 5/24/23 09:18, Hector Martin wrote: On 24/05/2023 22.07, Mark Nelson wrote: Yep, bluestore fragmentation is an issue.  It's sort of a natural result of using copy-on-write and never implementing any kind of defragmentation scheme.  Adam and I have been talking about doing it now, pro
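
To gauge how fragmented an OSD's main device allocator currently is, a hedged example via the admin socket (osd.0 is illustrative, and the command name is from memory, so confirm with ceph daemon osd.0 help; the returned score runs from 0 for none to 1 for severe):

  ceph daemon osd.0 bluestore allocator score block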

[ceph-users] Re: BlueStore fragmentation woes

2023-05-24 Thread Mark Nelson
ing a defrag tool of some sort for bluestore? - Hector ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w:

[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-05-23 Thread Mark Nelson
rt after the second(?) shot, aren't they? yes, actually they start after issuing systemctl ceph-osd@xx restart, it just takes long time performing log recovery.. If I can provide more info, please let me know BR nik -- - Ing. Nikola CIPRICH LinuxB

[ceph-users] Re: Discussion thread for Known Pacific Performance Regressions

2023-05-17 Thread Mark Nelson
ively!) during the weekly ceph community performance call over the past year or two. There's been no intention to hide them, they were just never really summarized on the mailing list until now. On 11 May 2023, at 17:38, Mark Nelson wrote: Hi Everyone, This email was originally p

[ceph-users] Re: CEPH Version choice

2023-05-15 Thread Mark Nelson
d be nice if you could put more effort into upgrade-tests/QA as well as into releasing stuff that actually compiles on non-RHEL/current Linux distributions. Regards, Daniel ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to

[ceph-users] Discussion thread for Known Pacific Performance Regressions

2023-05-11 Thread Mark Nelson
e may be other performance issues that I'm not remembering, but these are the big ones I can think of off the top of my head at the moment.  Hopefully this helps clarify what's going on if people are seeing a regression, what to look for, and if they are hitting it, the why behind it.

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Mark Nelson
Is it possible that you are storing object (chunks if EC) that are smaller than the min_alloc size? This cheat sheet might help: https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit?usp=sharing Mark On 3/14/23 12:34, Gaël THEROND wrote: Hi everyone, I’ve g
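
A hedged example of checking the configured min_alloc sizes and comparing stored vs. allocated space (osd.0 is illustrative; note that min_alloc_size is baked in when an OSD is created, so config changes only affect newly built OSDs):

  ceph daemon osd.0 config get bluestore_min_alloc_size_hdd
  ceph daemon osd.0 config get bluestore_min_alloc_size_ssd
  ceph df detail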

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Mark Nelson
One thing to watch out for with bluefs_buffered_io is that disabling it can greatly impact certain rocksdb workloads. From what I remember it was a huge problem during certain iteration workloads for things like collection listing. I think the block cache was being invalidated or simply never
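
A minimal sketch of inspecting and setting the option cluster-wide (whether OSDs must be restarted for it to apply depends on the release, so check ceph config help bluefs_buffered_io):

  ceph config get osd bluefs_buffered_io
  ceph config set osd bluefs_buffered_io true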

[ceph-users] Re: Ceph OSD imbalance and performance

2023-02-28 Thread Mark Nelson
On 2/28/23 13:11, Dave Ingram wrote: On Tue, Feb 28, 2023 at 12:56 PM Reed Dier wrote: I think a few other things that could help would be `ceph osd df tree` which will show the hierarchy across different crush domains. Good idea: https://pastebin.com/y07TKt52 Yeah, it looks like OSD.14
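
For reference, a hedged sketch of the usual commands for inspecting and correcting this kind of imbalance (upmap mode assumes all clients are Luminous or newer):

  ceph osd df tree
  ceph balancer mode upmap
  ceph balancer on
  ceph balancer status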

[ceph-users] Re: mons excessive writes to local disk and SSD wearout

2023-02-27 Thread Mark Nelson
On 2/27/23 03:22, Andrej Filipcic wrote: On 2/24/23 15:18, Dan van der Ster wrote: Hi Andrej, That doesn't sound right -- I checked a couple of our clusters just now and the mon filesystem is writing at just a few 100kBps. most of the time it's few 10kB/s, but then it jumps a lot, few times a

[ceph-users] Re: RadosGW - Performance Expectations

2023-02-10 Thread Mark Nelson
For reference, with parallel writes using the S3 Go API (via hsbench: https://github.com/markhpc/hsbench), I was recently doing about 600ish MB/s to a single RGW instance from one client. RadosGW used around 3ish HW threads from a 2016 era Xeon to do that. Didn't try single-file tests in that

[ceph-users] Re: ceph cluster iops low

2023-01-23 Thread Mark Nelson
Hi Peter, I'm not quite sure if your cluster is fully backed by NVMe drives based on your description, but you might be interested in the CPU scaling article we posted last fall. It's available here: https://ceph.io/en/news/blog/2022/ceph-osd-cpu-scaling/ That gives a good overview of wh

[ceph-users] Re: why does 3 copies take so much more time than 2?

2023-01-04 Thread Mark Nelson
Hi Charles, Going from 40s to 4.5m seems excessive to me at least.  Can you tell if the drives or OSDs are hitting their limits?  Tools like iostat, sar, or collectl might help. Longer answer: There are a couple of potential issues.  One is that you are bound by the latency of writing the
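
A minimal example of watching per-device utilization and latency while the test runs (the 1-second interval is illustrative; both tools come from the sysstat package on most distros):

  iostat -x 1    # %util and await per device
  sar -d -p 1    # similar view with device names resolved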

[Mingw-w64-public] undefined reference for __mingw_vfprintf

2022-12-22 Thread Mark Nelson via Mingw-w64-public
I'm trying to build a program in Cygwin using the x86_64-w64-mingw32 stuff. When I compile and link my program just with x86_64-w64-mingw32-gcc.exe, it compiles and links fine. But when I try to use a makefile to compile and link in separate steps, it gives the following errors: $ make -f spi_c

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-12-14 Thread Mark Nelson
On 12/14/22 10:09 AM, Stefan Kooman wrote: On 11/21/22 10:07, Stefan Kooman wrote: On 11/8/22 21:20, Mark Nelson wrote: 2.     https://ceph.io/en/news/blog/2022/qemu-kvm-tuning/     <https://ceph.io/en/news/blog/2022/qemu-kvm-tuning/> You tested network encryption impact on performan

[ceph-users] Re: Tuning CephFS on NVME for HPC / IO500

2022-12-01 Thread Mark Nelson
Hi Manuel, I did the IO500 runs back in 2020 and wrote the cephfs aiori backend for IOR/mdtest.  Not sure about the segfault, it's been a while since I've touched that code.  It was working the last time I used it. :D  Having said that, I don't think that's your issue.   The userland backend

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-21 Thread Mark Nelson
On 11/21/22 11:34, Sven Kieske wrote: On Fr, 2022-11-11 at 10:11 +0100, huxia...@horebdata.cn wrote: Thanks a lot for your insightful blogs on Ceph performance. It is really very informative and interesting. When i read Ceph OSD CPU Scaling, i am wondering by which way you  scale CPU cores p

[ceph-users] Re: Impact of DB+WAL undersizing in Pacific and later

2022-11-13 Thread Mark Nelson
Hi Gregor, DB space usage will be mostly governed by the number of onodes and blobs/extents/etc (potentially caused by fragmentation).  If you are primarily using RBD and/or large files in CephFS and you aren't doing a ton of small overwrites, your DB usage could remain below 1%.  It's possib
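
To see how much DB space OSDs are actually consuming, a hedged example (the META column in ceph osd df reflects BlueFS/DB usage; the admin socket dump gives a finer per-OSD breakdown, with osd.0 as an illustrative id):

  ceph osd df
  ceph daemon osd.0 perf dump bluefs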

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-10 Thread Mark Nelson
no concurrency overhead. nevertheless, the concurrent write flag will allow those operation that do need to share the same cf to do it in a safe way. was this option ever considered? ---- *From:* Mark Nelson *Sent:* Wednesday, Nov

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Mark Nelson
On 11/9/22 4:48 AM, Stefan Kooman wrote: On 11/8/22 21:20, Mark Nelson wrote: Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people: For sure, thanks a lot, it's really informative! C

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Mark Nelson
------ *From:* Mark Nelson *Sent:* Tuesday, November 8, 2022 10:20 PM *To:* ceph-users@ceph.io *Subject:* [ceph-users] Recent ceph.io Performance Blog Posts CAUTION: External Sender Hi Folks, I thought I would mention that I've released a couple of pe

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-08 Thread Mark Nelson
On 11/8/22 14:59, Marc wrote: 2. https://ceph.io/en/news/blog/2022/qemu-kvm-tuning/ Very nice! Thanks Marc! ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send a

[ceph-users] Recent ceph.io Performance Blog Posts

2022-11-08 Thread Mark Nelson
Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people: 1. https://ceph.io/en/news/blog/2022/rocksdb-tuning-deep-dive/ 2. https:

[ceph-users] Correction: 10/27/2022 perf meeting with guest speaker Peter Desnoyers today!

2022-10-27 Thread Mark Nelson
Hi Folks, The weekly performance meeting will be starting in approximately 55 minutes at 8AM PST.  Peter Desnoyers from Khoury College of Computer Sciences, Northeastern University will be speaking today about his work on local storage for RBD caching.  A short architectural overview is avail

[ceph-users] 10/20/2022 perf meeting with guest speaker Peter Desnoyers today!

2022-10-27 Thread Mark Nelson
Hi Folks, The weekly performance meeting will be starting in approximately 70 minutes at 8AM PST.  Peter Desnoyers from Khoury College of Computer Sciences, Northeastern University will be speaking today about his work on local storage for RBD caching.  A short architectural overview is avail

[ceph-users] Re: osd_memory_target for low-memory machines

2022-10-03 Thread Mark Nelson
Hi Nicola, I wrote the autotuning code in the OSD.  Janne's response is absolutely correct.  Right now we just control the size of the caches in the OSD and rocksdb to try to keep the OSD close to a certain memory limit.  By default this works down to around 2GB, but the smaller the limit, th
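
A minimal sketch of raising the limit where more memory is available (the value is in bytes, 4 GiB shown; cephadm deployments can alternatively let the orchestrator size it via the osd_memory_target_autotune flag):

  ceph config set osd osd_memory_target 4294967296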

[ceph-users] Re: weird performance issue on ceph

2022-09-26 Thread Mark Nelson
in your deployments? Thanks, Zoltan Am 17.09.22 um 06:58 schrieb Mark Nelson: CAUTION: This email originated from outside the organization. Do not click links unless you can confirm the sender and know the content is safe. Hi Zoltan, So kind of

[ceph-users] Re: weird performance issue on ceph

2022-09-16 Thread Mark Nelson
a good state we have to recreate the whole thing, so we thought we'd start with the bad state, maybe something obvious is already visible for someone who knows the osd internals well. You can find the file here: https://pastebin.com/0HdNapLQ Thanks a lot in advance, Zoltan Am 12.08.22 um 18:25 schrieb

[ceph-users] Re: Wide variation in osd_mclock_max_capacity_iops_hdd

2022-09-08 Thread Mark Nelson
FWIW, I'd be for trying to take periodic samples of actual IO happening on the drive during operation.  You can get a much better idea of latency and throughput characteristics across different IO sizes over time (though you will need to account for varying levels of concurrency at the device l
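
For reference, a hedged example of inspecting and overriding the measured capacity for a single OSD when the auto-benchmarked value looks implausible (osd.0 and 350 IOPS are illustrative):

  ceph config show osd.0 osd_mclock_max_capacity_iops_hdd
  ceph config set osd.0 osd_mclock_max_capacity_iops_hdd 350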

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-18 Thread Mark Nelson
compress_success_count": 41071482, "compress_rejected_count": 1895058, "bluestore_allocated": 171709562880, "bluestore_stored": 304405529094, "bluestore_compressed": 30506295169, "bluestore_compressed_al

[ceph-users] Re: weird performance issue on ceph

2022-08-12 Thread Mark Nelson
test: https://pastebin.com/dEv05eGV Do you see anything obvious that could give us a clue what is going on? Many thanks! Zoltan Am 02.08.22 um 19:01 schrieb Mark Nelson: Ah, too bad!  I suppose that was too easy. :) Ok, so my two lines of thought: 1) Something related to the weird performance i

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-11 Thread Mark Nelson
re, I just meant the bluestore_compression_mode option you specify in the ceph.conf file. Mark Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Mark Nelson Sent: 10 August 2022 22:28 To: Frank Schilder; ceph-use

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-10 Thread Mark Nelson
ould require extremely heavy lifting though.  I suspect we'd shoot for easier performance wins first where we can get them. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Mark Nelson Sent: 09 August 2022 16

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-09 Thread Mark Nelson
tuning - make OSD create- and tune parameters as orthogonal as possible Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Mark Nelson Sent: 08 August 2022 20:30:49 To: ceph-users@ceph.io Subject: [ceph-users] Request fo

[ceph-users] Request for Info: bluestore_compression_mode?

2022-08-08 Thread Mark Nelson
Hi Folks, We are trying to get a sense for how many people are using bluestore_compression_mode or the per-pool compression_mode options (these were introduced early in bluestore's life, but afaik may not widely be used).  We might be able to reduce complexity in bluestore's blob code if we
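
A hedged example of checking whether compression is active globally or per pool (the pool name "mypool" is illustrative; ceph df detail also reports compressed-vs-stored bytes):

  ceph config get osd bluestore_compression_mode
  ceph osd pool get mypool compression_mode
  ceph df detail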

[ceph-users] Re: weird performance issue on ceph

2022-08-02 Thread Mark Nelson
s for it, however I am still struggling with this. Any other recommendation or idea what to check? Thanks a lot, Zoltan Am 01.08.22 um 17:53 schrieb Mark Nelson: Hi Zoltan, It doesn't look like your pictures showed up for me at least. Very interesting results though!  Are (or were) th

[ceph-users] Re: weird performance issue on ceph

2022-08-01 Thread Mark Nelson
't seen any mention of these options in the official docs, just in pull requests. Is it safe to use these options in production at all? Many thanks, Zoltan Am 25.07.22 um 21:42 schrieb Mark Nelson: I don't think so if this is just plain old RBD.  RBD shouldn't require a bunch of Roc

[ceph-users] Re: weird performance issue on ceph

2022-07-25 Thread Mark Nelson
=== Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Mark Nelson Sent: 25 July 2022 18:50 To: ceph-users@ceph.io Subject: [ceph-users] Re: weird performance issue on ceph Hi Zoltan, We have a very similar setup with one of our upstr

[ceph-users] Re: weird performance issue on ceph

2022-07-25 Thread Mark Nelson
Hi Zoltan, We have a very similar setup with one of our upstream community performance test clusters.  60 4TB PM983 drives spread across 10 nodes.  We get similar numbers to what you are initially seeing (scaled down to 60 drives) though with somewhat lower random read IOPS (we tend to max o

[ceph-users] Re: How much IOPS can be expected on NVME OSDs

2022-05-12 Thread Mark Nelson
Hi Felix, Those are pretty good drives and shouldn't have too much trouble with O_DSYNC writes which can often be a bottleneck for lower end NVMe drives.  Usually if the drives are fast enough, it comes down to clock speed and cores.  Clock speed helps the kv sync thread write metadata to the
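
A hedged way to sanity-check a drive's sync-write latency before building OSDs on it, writing to a throwaway file rather than the raw device (the mount point and size are illustrative; the fdatasync-per-write pattern roughly mirrors what the kv sync thread does):

  fio --name=synctest --directory=/mnt/testdrive --size=1g \
      --rw=write --bs=4k --iodepth=1 --numjobs=1 --direct=1 \
      --fdatasync=1 --runtime=60 --time_based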

[ceph-users] Re: Using CephFS in High Performance (and Throughput) Compute Use Cases

2022-04-13 Thread Mark Nelson
GB/sec and read bandwidth of 75GB/sec with 10 server nodes and 8 disks each, if I understand correctly, I assume they used 2-rep, though. They used 10 client nodes with 32 threads each... Best wishes, Manuel [1] https://croit.io/blog/ceph-performance-test-and-optimization [2] https://io500.org/submiss

[ceph-users] Re: Low performance on format volume

2022-04-12 Thread Mark Nelson
Hi Iban, Most of these options fall under the osd section.  You can get descriptions of what they do here: https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/ The journal settings are for the old filestore backend and aren't relevant unless you are using it.  Still, you can

[ceph-users] Re: osd with unlimited ram growth

2022-04-12 Thread Mark Nelson
Hi Joachim, Thank you much for the great writeup!  This definitely has been a major source of frustration. Thanks, Mark On 4/12/22 05:23, Joachim Kraftmayer (Clyso GmbH) wrote: Hi all, In the last few weeks we have discovered an error for which there have been several tickets and error

[ceph-users] Re: What's the relationship between osd_memory_target and bluestore_cache_size?

2022-03-29 Thread Mark Nelson
ytes": 0 } }, "total": { "items": 205850092, "bytes": 9282297783 } } } The total bytes at the end is much less than what the OS reports. Is this something I can control by adjusting the calc

[ceph-users] Re: What's the relationship between osd_memory_target and bluestore_cache_size?

2022-03-29 Thread Mark Nelson
On 3/29/22 11:44, Anthony D'Atri wrote: [osd] bluestore_cache_autotune = 0 Why are you turning autotuning off? FWIW I’ve encountered the below assertions. I neither support nor deny them, pasting here for discussion. One might interpret this to only apply to OSDs with DB on a separate (fas

[ceph-users] Re: Ceph-CSI and OpenCAS

2022-03-14 Thread Mark Nelson
Hi Martin, I believe RH's reference architecture team has deployed ceph with CAS (and perhaps open CAS when it was open sourced), but I'm not sure if there's been any integration work done yet with ceph-csi. Theoretically it should be fairly easy though since the OSD will just treat it as ge

[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Mark Nelson
Hi Eugen, Thanks for the great feedback.  Is there anything specific about the cache tier itself that you like vs hypothetically having caching live below the OSDs?  There are some real advantages to the cache tier concept, but eviction over the network has definitely been one of the tougher

[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Mark Nelson
On a related note,  Intel will be presenting about their Open CAS software that provides caching at the block layer under the OSD at the weekly performance meeting on 2/24/2022 (similar to dm-cache, but with differences regarding the implementation).  This isn't a replacement for cache tiering,

[ceph-users] Re: Advice on enabling autoscaler

2022-02-07 Thread Mark Nelson
On 2/7/22 12:34 PM, Alexander E. Patrakov wrote: пн, 7 февр. 2022 г. в 17:30, Robert Sander : And keep in mind that when PGs are increased that you also may need to increase the number of OSDs as one OSD should carry a max of around 200 PGs. But I do not know if that is still the case with curr
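
A minimal sketch of reviewing what the autoscaler would do before turning it on for a pool (the pool name is illustrative):

  ceph osd pool autoscale-status
  ceph osd pool set mypool pg_autoscale_mode on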

[ceph-users] Re: NVME Namespaces vs SPDK

2022-02-05 Thread Mark Nelson
4 OSDs per NVMe drive only really makes sense if you have extra CPU to spare and very fast drives.  It can have higher absolute performance when you give OSDs unlimited CPUs, but it tends to be slower (and less efficient) in CPU limited scenarios in our testing (ymmv).  We've got some semi-rece
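
If you do run multiple OSDs per NVMe device, a hedged sketch using ceph-volume's batch mode (the device path is illustrative; cephadm OSD service specs expose the same knob as osds_per_device):

  ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1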

[ceph-users] Re: Cephadm Deployment with io_uring OSD

2022-01-10 Thread Mark Nelson
Hi Gene, Unfortunately when the io_uring code was first implemented there were no stable centos kernels in our test lab that included io_uring support so it hasn't gotten a ton of testing.  I agree that your issue looks similar to what was reported in issue #47661, but it looks like you are
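
For anyone experimenting with it anyway, the io_uring backend is toggled with a BlueStore option plus an OSD restart; a hedged sketch (the option name is from memory, so verify with ceph config help bdev_ioring on your release):

  ceph config set osd bdev_ioring true
  ceph orch daemon restart osd.0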

[ceph-users] Re: 50% IOPS performance drop after upgrade from Nautilus 14.2.22 to Octopus 15.2.15

2021-12-22 Thread Mark Nelson
On 12/22/21 4:23 AM, Marc wrote: I guess what caused the issue was high latencies on our “big” SSDs (7TB drives), which got really high after the upgrade to Octopus. We split them into 4 OSDs each a few days ago and since then the high commit latencies on the OSDs and on bluestore are gone. Hmm, but

[ceph-users] Re: Large latency for single thread

2021-12-21 Thread Mark Nelson
client is sensitive to latency. P.S. crimson can be used in production now or not ? On 12/16/21 3:53 AM, Mark Nelson wrote: FWIW, we ran single OSD, iodepth=1 O_DSYNC write tests against classic and crimson bluestore OSDs in our Q3 crimson slide deck. You can see the results starting on sli

[ceph-users] Re: Large latency for single thread

2021-12-15 Thread Mark Nelson
FWIW, we ran single OSD, iodepth=1 O_DSYNC write tests against classic and crimson bluestore OSDs in our Q3 crimson slide deck. You can see the results starting on slide 32 here: https://docs.google.com/presentation/d/1eydyAFKRea8n-VniQzXKW8qkKM9GLVMJt2uDjipJjQA/edit#slide=id.gf880cf6296_1_73

[ceph-users] Re: OSD huge memory consumption

2021-12-06 Thread Mark Nelson
Hi Marius, Have you changed any of the default settings?  You've got a huge number of pglog entries.  Do you have any other pools as well? Even though pglog is only taking up 6-7GB of the 37GB used, that's a bit of a red flag for me.  Something we don't track via the mempools is taking up a t
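
A minimal example of seeing where an OSD's memory is going and what the pglog length caps are set to (osd.0 is illustrative):

  ceph daemon osd.0 dump_mempools
  ceph config get osd osd_max_pg_log_entries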

[ceph-users] Re: Best settings bluestore_rocksdb_options for my workload

2021-12-02 Thread Mark Nelson
read. Is there anything to set or tune on the osd so that these pg movements don't have this effect? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original M

[ceph-users] Re: Best settings bluestore_rocksdb_options for my workload

2021-12-02 Thread Mark Nelson
ement from changing the PG count? Mark Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com ------- -Original Message- From: Mark Nelson Sent: Thursda

[ceph-users] Re: Best settings bluestore_rocksdb_options for my workload

2021-12-02 Thread Mark Nelson
Hi Istvan, Is that 1-1.2 billion 40KB rgw objects?  If you are running EC 4+2 on a 42 OSD cluster with that many objects (and a heavily write oriented workload), that could be hitting rocksdb pretty hard.  FWIW, you might want to look at the compaction stats provided in the OSD log.  You can
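
A hedged example of pulling those RocksDB compaction statistics from a running OSD and timing a manual compaction (the log path and OSD id are illustrative):

  grep -A 30 "Compaction Stats" /var/log/ceph/ceph-osd.0.log
  ceph tell osd.0 compact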
