[ceph-users] Re: Ceph ARM providing storage for x86

2024-05-27 Thread Mark Nelson
Once upon a time there were some endian issues running clusters with mixed hardware, though I don't think it affected clients.  As far as I know those were all resolved many years ago. Mark On 5/25/24 08:46, Anthony D'Atri wrote: Why not? The hwarch doesn't matter. On May 25, 2024, at

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-02 Thread Mark Nelson
the more recent updates landed in 17.2.6+ though. Mark On 5/2/24 00:05, Sridhar Seshasayee wrote: Hi Mark, On Thu, May 2, 2024 at 3:18 AM Mark Nelson wrote: For our customers we are still disabling mclock and using wpq. Might be worth trying. Could you please elaborate a bit on the issue(s
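A minimal sketch of the wpq fallback being discussed, assuming a cluster-wide change via the centralized config (OSDs must be restarted for osd_op_queue to take effect, and the restart command depends on the deployment):

  ceph config get osd osd_op_queue          # show the active scheduler
  ceph config set osd osd_op_queue wpq      # switch from mclock_scheduler back to wpq
  systemctl restart ceph-osd.target         # per host, on non-cephadm deployments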

[ceph-users] Re: Ceph reef and (slow) backfilling - how to speed it up

2024-05-01 Thread Mark Nelson
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research

[ceph-users] Re: Migrating from S3 to Ceph RGW (Cloud Sync Module)

2024-04-15 Thread Mark Nelson
At Clyso we've been building a tool that can migrate S3 data around called Chorus. Normally I wouldn't promote it here, but it's open source and sounds like it might be useful in this case. I don't work on it myself, but thought I'd mention it: https://github.com/clyso/chorus One problem

[ceph-users] Re: Are we logging IRC channels?

2024-03-22 Thread Mark Nelson
of discussions have migrated into different channels, though #ceph still gets some community traffic (and a lot of hardware design discussion). Mark On 3/22/24 02:15, Alvaro Soto wrote: Should we bring this to life again? On Tue, Mar 19, 2024, 8:14 PM Mark Nelson <mark.a.nel...@gmail.com>

[ceph-users] Re: Are we logging IRC channels?

2024-03-19 Thread Mark Nelson
A long time ago Wido used to have a bot logging IRC afaik, but I think that's been gone for some time. Mark On 3/19/24 19:36, Alvaro Soto wrote: Hi Community!!! Are we logging IRC channels? I ask this because a lot of people only use Slack, and the Slack we use doesn't have a subscription,

[ceph-users] Re: Number of pgs

2024-03-05 Thread Mark Nelson
There are both pros and cons to having more PGs. Here are a couple of considerations: Pros: 1) Better data distribution prior to balancing (and maybe after) 2) Fewer objects/data per PG 3) Lower per-PG lock contention Cons: 1) Higher PG log memory usage until you hit the osd target unless you
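Related to the pros/cons above, a hedged sketch of checking and bumping a pool's PG count (pool name and target value are placeholders; the autoscaler can also be left to manage this):

  ceph osd pool autoscale-status          # current vs. suggested PG counts per pool
  ceph osd pool get <pool> pg_num
  ceph osd pool set <pool> pg_num 256     # splits happen gradually in the background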

[ceph-users] Re: Performance improvement suggestion

2024-03-04 Thread Mark Nelson
WAL ingestion rate. Mark ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota, USA w: ht

[ceph-users] Re: High IO utilization for bstore_kv_sync

2024-02-22 Thread Mark Nelson
disks. Is there any parameter that we can change/tune to better handle the call "fdatasync"? Maybe using NVMEs for the RocksDB? On Thu, Feb 22, 2024 at 2:24 PM Mark Nelson wrote: Most likely you are seeing time spent waiting on fdatasync in bstore_kv_sync if the drives you are u
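As a rough illustration of the "NVMe for RocksDB" suggestion: a new OSD can be laid out with its DB/WAL on a faster device at creation time (device paths are placeholders; moving the DB of an existing OSD is a separate step, e.g. ceph-volume lvm migrate on recent releases):

  ceph-volume lvm create --bluestore --data /dev/sdX --block.db /dev/nvme0n1p1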

[ceph-users] Re: High IO utilization for bstore_kv_sync

2024-02-22 Thread Mark Nelson
being read and written to the OSD/disks? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12

[ceph-users] Re: PSA: Long Standing Debian/Ubuntu build performance issue (fixed, backports in progress)

2024-02-09 Thread Mark Nelson
which is critical to the performance. RocksDB is one of them." Do we have similar issues with other sub-projects? boost? spdk...? 2) The chart shown on "rocksdb submit latency", going from over 10 ms to below 5 ms... is this during write I/O under heavy load? /Maged On 08

[ceph-users] PSA: Long Standing Debian/Ubuntu build performance issue (fixed, backports in progress)

2024-02-08 Thread Mark Nelson
/ceph/pull/55501 https://github.com/ceph/ceph/pull/55502 Please feel free to reply if you have any questions! Thanks, Mark -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota, USA w: https://clyso.com | e: mark.nel...@clyso.com We are hir

[ceph-users] Re: OSD read latency grows over time

2024-02-02 Thread Mark Nelson
e RocksDB layer when deleting objects that could be bulk deleted (RangeDelete?) due to them having the same prefix (name + date). Best regards On 26 Jan 2024, at 23:18, Mark Nelson wrote: On 1/26/24 11:26, Roman Pashin wrote: Unfortunately they cannot. You'll want to set them in centralized co

[ceph-users] Re: OSD read latency grows over time

2024-01-26 Thread Mark Nelson
On 1/26/24 11:26, Roman Pashin wrote: Unfortunately they cannot. You'll want to set them in centralized conf and then restart OSDs for them to take effect. Got it. Thank you Josh! WIll put it to config of affected OSDs and restart them. Just curious, can decreasing
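A generic sketch of the workflow Josh describes: set the option in the centralized config, confirm it, then restart the affected OSDs (option name, value, and OSD ids are placeholders; use the restart mechanism that matches your deployment):

  ceph config set osd <option_name> <value>
  ceph config show osd.0 <option_name>      # what the daemon will pick up
  systemctl restart ceph-osd@0              # or: ceph orch daemon restart osd.0 with cephadm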

[ceph-users] Re: 17.2.7: Backfilling deadlock / stall / stuck / standstill

2024-01-26 Thread Mark Nelson
For what it's worth, we saw this last week at Clyso on two separate customer clusters on 17.2.7 and also solved it by moving back to wpq.  We've been traveling this week so haven't created an upstream tracker for it yet, but we're back to recommending wpq to our customers for all production

[ceph-users] Re: OSD read latency grows over time

2024-01-19 Thread Mark Nelson
SSD (index), 1x NVME (wal + OS). ceph config - https://pastebin.com/pCqxXhT3 OSD read latency graph - https://postimg.cc/5YHk9bby -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota, USA w: https://clyso.com | e: mark.nel...@clyso.com We

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-19 Thread Mark Nelson
/NVMe, but as always everything is workload dependant and there is sometimes a need for doubling up  Regards, Bailey -Original Message- From: Maged Mokhtar Sent: January 17, 2024 4:59 PM To: Mark Nelson ; ceph-users@ceph.io Subject: [ceph-users] Re: Performance impact of Heterogeneous

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Mark Nelson
To: Mark Nelson ; ceph-users@ceph.io Subject: [ceph-users] Re: Performance impact of Heterogeneous environment Very informative article you did Mark. IMHO if you find yourself with very high per-OSD core count, it may be logical to just pack/add more nvmes per host, you'd be getting the best price

[ceph-users] Re: Performance impact of Heterogeneous environment

2024-01-17 Thread Mark Nelson
-vs-tuned-performance-comparison/ ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota

[ceph-users] Re: How does mclock work?

2024-01-09 Thread Mark Nelson
With HDDs and a lot of metadata, it's tough to get away from it imho.  In an alternate universe it would have been really neat if Intel could have worked with the HDD vendors to put like 16GB of user accessible optane on every HDD.  Enough for the WAL and L0 (and maybe L1). Mark On 1/9/24

[ceph-users] Re: After upgrading from 17.2.6 to 18.2.0, OSDs are very frequently restarting due to livenessprobe failures

2023-09-28 Thread Mark Nelson
gradually. Hope this helps and awaiting your feedback. Thanks, Igor On 27/09/2023 22:04, sbeng...@gmail.com wrote: Hi Igor, I have copied three OSD logs to https://drive.google.com/file/d/1aQxibFJR6Dzvr3RbuqnpPhaSMhPSL--F/view?usp=sharing Hopefully they include some mean

[ceph-users] Re: Ceph MDS OOM in combination with 6.5.1 kernel client

2023-09-20 Thread Mark Nelson
.io/thread/YR5UNKBOKDHPL2PV4J75ZIUNI4HNMC2W/ ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnes

[ceph-users] Re: ceph_leadership_team_meeting_s18e06.mkv

2023-09-07 Thread Mark Nelson
s (like the restful one, which was a suspect) - Enabling debug 20 - Turning the pg autoscaler off Debugging will continue to characterize this issue: - Enable profiling (Mark Nelson) - Try Bloomberg's Python mem profiler <https://github.com/bloomberg/memray> (Matthew Leonard) *

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-07 Thread Mark Nelson
that gives me enough I/Os to work properly and a cluster that murders my performances. I hope this helps. Feel free to ask us if you need further details and I'll see what I can do. On 9/7/23 13:59, Mark Nelson wrote: Ok, good to know.  Please feel free to update us here with what you

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-07 Thread Mark Nelson
that, contrary to our previous beliefs, it's an issue with changes to the bluestore_allocator and not the compaction process. That said, I will keep this email in mind as we will want to test optimizations to compaction on our test environment. On 9/7/23 12:32, Mark Nelson wrote: Hello

[ceph-users] Re: Rocksdb compaction and OSD timeout

2023-09-07 Thread Mark Nelson
What kind of workload do you run (i.e. RBD, CephFS, RGW)? Do you also see these timeouts occur during deep-scrubs? Gr. Stefan -- Best Regards, Mark Nelson Head of Research and Development Clyso GmbH p: +49 89 21552391 12 | a: Minnesota, USA w: https://clyso.com | e: mark.nel...@clyso.com We are hir

[ceph-users] Re: snaptrim number of objects

2023-08-22 Thread Mark Nelson
_ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com

[ceph-users] Re: question about OSD onode hits ratio

2023-08-04 Thread Mark Nelson
-- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.clyso.com/jobs

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-16 Thread Mark Nelson
On 7/10/23 11:19 AM, Matthew Booth wrote: On Thu, 6 Jul 2023 at 12:54, Mark Nelson wrote: On 7/6/23 06:02, Matthew Booth wrote: On Wed, 5 Jul 2023 at 15:18, Mark Nelson wrote: I'm sort of amazed that it gave you symbols without the debuginfo packages installed. I'll need to figure out

[ceph-users] Re: OSD memory usage after cephadm adoption

2023-07-11 Thread Mark Nelson
decided on, it goes through a process of looking at how hot the different caches are and assigns memory based on where it thinks the memory would be most useful.  Again this is based on mapped memory though.  It can't force the kernel to reclaim memory that has already been released. Thanks, Mark --

[ceph-users] Re: OSD memory usage after cephadm adoption

2023-07-11 Thread Mark Nelson
limit? Thanks Luis Domingues Proton AG ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 Münche

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-06 Thread Mark Nelson
On 7/6/23 06:02, Matthew Booth wrote: On Wed, 5 Jul 2023 at 15:18, Mark Nelson wrote: I'm sort of amazed that it gave you symbols without the debuginfo packages installed. I'll need to figure out a way to prevent that. Having said that, your new traces look more accurate to me. The thing

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-05 Thread Mark Nelson
On 7/4/23 10:39, Matthew Booth wrote: On Tue, 4 Jul 2023 at 10:00, Matthew Booth wrote: On Mon, 3 Jul 2023 at 18:33, Ilya Dryomov wrote: On Mon, Jul 3, 2023 at 6:58 PM Mark Nelson wrote: On 7/3/23 04:53, Matthew Booth wrote: On Thu, 29 Jun 2023 at 14:11, Mark Nelson wrote

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-07-03 Thread Mark Nelson
On 7/3/23 04:53, Matthew Booth wrote: On Thu, 29 Jun 2023 at 14:11, Mark Nelson wrote: This container runs: fio --rw=write --ioengine=sync --fdatasync=1 --directory=/var/lib/etcd --size=100m --bs=8000 --name=etcd_perf --output-format=json --runtime=60 --time_based=1 And extracts

[ceph-users] Re: RBD with PWL cache shows poor performance compared to cache device

2023-06-29 Thread Mark Nelson
6 KiB / 0% miss_bytes: 349 MiB -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.clyso.com/jobs/ ___ ceph-use

[ceph-users] Re: osd memory target not work

2023-06-20 Thread Mark Nelson
204800 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e

[ceph-users] Re: radosgw hang under pressure

2023-06-12 Thread Mark Nelson
users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.c

[ceph-users] Re: reef v18.1.0 QE Validation status

2023-06-01 Thread Mark Nelson
list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hiring: https://www.clyso.com/jobs

[ceph-users] Re: BlueStore fragmentation woes

2023-05-31 Thread Mark Nelson
s good to know. Gr. Stefan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w

[ceph-users] Re: RGW versioned bucket index issues

2023-05-31 Thread Mark Nelson
Systems ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com

[ceph-users] Re: BlueStore fragmentation woes

2023-05-25 Thread Mark Nelson
On 5/24/23 09:18, Hector Martin wrote: On 24/05/2023 22.07, Mark Nelson wrote: Yep, bluestore fragmentation is an issue.  It's sort of a natural result of using copy-on-write and never implementing any kind of defragmentation scheme.  Adam and I have been talking about doing it now, probably

[ceph-users] Re: BlueStore fragmentation woes

2023-05-24 Thread Mark Nelson
- Hector ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 München | Germany w: https://clyso.com | e: mark.nel...@clyso.com We are hir

[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-05-23 Thread Mark Nelson
temctl ceph-osd@xx restart, it just takes a long time performing log recovery... If I can provide more info, please let me know. BR nik -- - Ing. Nikola CIPRICH LinuxBox.cz, s.r.o. 28.rijna 168, 709 00 Ostrava tel.: +420 591 166 214 fax: +420 596 62

[ceph-users] Re: Discussion thread for Known Pacific Performance Regressions

2023-05-17 Thread Mark Nelson
!) during the weekly ceph community performance call over the past year or two. There's been no intention to hide them, they were just never really summarized on the mailing list until now. On 11 May 2023, at 17:38, Mark Nelson wrote: Hi Everyone, This email was originally posted to d

[ceph-users] Re: CEPH Version choice

2023-05-15 Thread Mark Nelson
y compiles on non-RHEL/current Linux distributions. Regards, Daniel ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 2

[ceph-users] Discussion thread for Known Pacific Performance Regressions

2023-05-11 Thread Mark Nelson
f my head at the moment.  Hopefully this helps clarify what's going on if people are seeing a regression, what to look for, and if they are hitting it, the why behind it. Thanks, Mark -- Best Regards, Mark Nelson Head of R&D (USA) Clyso GmbH p: +49 89 21552391 12 a: Loristraße 8 | 80335 Münche

[ceph-users] Re: 10x more used space than expected

2023-03-14 Thread Mark Nelson
Is it possible that you are storing object (chunks if EC) that are smaller than the min_alloc size? This cheat sheet might help: https://docs.google.com/spreadsheets/d/1rpGfScgG-GLoIGMJWDixEkqs-On9w8nAUToPQjN8bDI/edit?usp=sharing Mark On 3/14/23 12:34, Gaël THEROND wrote: Hi everyone, I’ve
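A quick way to compare object sizes against the allocation unit on an existing OSD (osd.0 is a placeholder; bluestore_min_alloc_size_* only takes effect for OSDs created after it is changed):

  ceph daemon osd.0 config get bluestore_min_alloc_size_hdd
  ceph daemon osd.0 config get bluestore_min_alloc_size_ssd
  ceph config set osd bluestore_min_alloc_size_hdd 4096   # applies to newly created OSDs only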

[ceph-users] Re: avg apply latency went up after update from octopus to pacific

2023-02-28 Thread Mark Nelson
One thing to watch out for with bluefs_buffered_io is that disabling it can greatly impact certain rocksdb workloads. From what I remember it was a huge problem during certain iteration workloads for things like collection listing. I think the block cache was being invalidated or simply
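For reference, the option being discussed is an OSD-level setting; a minimal sketch of checking and flipping it (the default and the read-path behaviour have changed between releases, so verify against your version before relying on it):

  ceph config get osd bluefs_buffered_io
  ceph config set osd bluefs_buffered_io true    # route BlueFS reads through the page cache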

[ceph-users] Re: Ceph OSD imbalance and performance

2023-02-28 Thread Mark Nelson
On 2/28/23 13:11, Dave Ingram wrote: On Tue, Feb 28, 2023 at 12:56 PM Reed Dier wrote: I think a few other things that could help would be `ceph osd df tree` which will show the hierarchy across different crush domains. Good idea: https://pastebin.com/y07TKt52 Yeah, it looks like

[ceph-users] Re: mons excessive writes to local disk and SSD wearout

2023-02-27 Thread Mark Nelson
On 2/27/23 03:22, Andrej Filipcic wrote: On 2/24/23 15:18, Dan van der Ster wrote: Hi Andrej, That doesn't sound right -- I checked a couple of our clusters just now and the mon filesystem is writing at just a few 100 kB/s. Most of the time it's a few 10 kB/s, but then it jumps a lot, a few times a

[ceph-users] Re: RadosGW - Performance Expectations

2023-02-10 Thread Mark Nelson
For reference, with parallel writes using the S3 Go API (via hsbench: https://github.com/markhpc/hsbench), I was recently doing about 600ish MB/s to a single RGW instance from one client. RadosGW used around 3ish HW threads from a 2016 era Xeon to do that. Didn't try single-file tests in
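A hedged example of driving a single RGW endpoint with hsbench as referenced above (endpoint, credentials, object size, and thread/bucket counts are placeholders; check the hsbench README for the full flag list):

  hsbench -a <access_key> -s <secret_key> -u http://rgw.example.com:7480 \
          -z 4M -t 16 -b 8 -d 60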

[ceph-users] Re: ceph cluster iops low

2023-01-23 Thread Mark Nelson
Hi Peter, I'm not quite sure if your cluster is fully backed by NVMe drives based on your description, but you might be interested in the CPU scaling article we posted last fall. It's available here: https://ceph.io/en/news/blog/2022/ceph-osd-cpu-scaling/ That gives a good overview of

[ceph-users] Re: why does 3 copies take so much more time than 2?

2023-01-04 Thread Mark Nelson
Hi Charles, Going from 40s to 4.5m seems excessive to me at least.  Can you tell if the drives or OSDs are hitting their limits?  Tools like iostat, sar, or collectl might help. Longer answer: There are a couple of potential issues.  One is that you are bound by the latency of writing
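Standard sysstat tooling is enough to see whether the drives saturate during the size=3 test (not Ceph-specific; run on the OSD hosts while the writes are in flight):

  iostat -xmt 1      # per-device utilization, queue size and await, every second
  sar -d -p 1        # similar per-device view via sar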

[ceph-users] Re: Tuning CephFS on NVME for HPC / IO500

2022-12-01 Thread Mark Nelson
Hi Manuel, I did the IO500 runs back in 2020 and wrote the cephfs aiori backend for IOR/mdtest.  Not sure about the segfault, it's been a while since I've touched that code.  It was working the last time I used it. :D  Having said that, I don't think that's your issue.   The userland backend

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-21 Thread Mark Nelson
On 11/21/22 11:34, Sven Kieske wrote: On Fri, 2022-11-11 at 10:11 +0100, huxia...@horebdata.cn wrote: Thanks a lot for your insightful blogs on Ceph performance. It is really very informative and interesting. When I read Ceph OSD CPU Scaling, I am wondering in which way you scale CPU cores

[ceph-users] Re: Impact of DB+WAL undersizing in Pacific and later

2022-11-13 Thread Mark Nelson
Hi Gregor, DB space usage will be mostly governed by the number of onodes and blobs/extents/etc (potentially caused by fragmentation).  If you are primarily using RBD and/or large files in CephFS and you aren't doing a ton of small overwrites, your DB usage could remain below 1%.  It's
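A hedged way to see how much DB space an OSD is actually consuming (osd.5 is a placeholder; the counters live under the bluefs section of perf dump):

  ceph daemon osd.5 perf dump bluefs | grep -E "db_(total|used)_bytes"
  ceph osd df          # the META column gives a quick per-OSD view of metadata usage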

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-10 Thread Mark Nelson
. Nevertheless, the concurrent write flag will allow those operations that do need to share the same cf to do it in a safe way. Was this option ever considered? From: Mark Nelson Sent: Wednesday, November 9, 2022 3:09 PM

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Mark Nelson
On 11/9/22 4:48 AM, Stefan Kooman wrote: On 11/8/22 21:20, Mark Nelson wrote: Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people: For sure, thanks a lot, it's really informative! Can we also

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-09 Thread Mark Nelson
From: Mark Nelson Sent: Tuesday, November 8, 2022 10:20 PM To: ceph-users@ceph.io Subject: [ceph-users] Recent ceph.io Performance Blog Posts CAUTION: External Sender Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that

[ceph-users] Re: Recent ceph.io Performance Blog Posts

2022-11-08 Thread Mark Nelson
On 11/8/22 14:59, Marc wrote: 2. https://ceph.io/en/news/blog/2022/qemu-kvm-tuning/ Very nice! Thanks Marc! ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send

[ceph-users] Recent ceph.io Performance Blog Posts

2022-11-08 Thread Mark Nelson
Hi Folks, I thought I would mention that I've released a couple of performance articles on the Ceph blog recently that might be of interest to people: 1. https://ceph.io/en/news/blog/2022/rocksdb-tuning-deep-dive/ 2.

[ceph-users] Correction: 10/27/2022 perf meeting with guest speaker Peter Desnoyers today!

2022-10-27 Thread Mark Nelson
Hi Folks, The weekly performance meeting will be starting in approximately 55 minutes at 8AM PST.  Peter Desnoyers from Khoury College of Computer Sciences, Northeastern University will be speaking today about his work on local storage for RBD caching.  A short architectural overview is

[ceph-users] 10/20/2022 perf meeting with guest speaker Peter Desnoyers today!

2022-10-27 Thread Mark Nelson
Hi Folks, The weekly performance meeting will be starting in approximately 70 minutes at 8AM PST.  Peter Desnoyers from Khoury College of Computer Sciences, Northeastern University will be speaking today about his work on local storage for RBD caching.  A short architectural overview is

[ceph-users] Re: osd_memory_target for low-memory machines

2022-10-03 Thread Mark Nelson
Hi Nicola, I wrote the autotuning code in the OSD.  Janne's response is absolutely correct.  Right now we just control the size of the caches in the OSD and rocksdb to try to keep the OSD close to a certain memory limit.  By default this works down to around 2GB, but the smaller the limit,
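A sketch of what that looks like on a memory-constrained host (values are illustrative; as noted above, pushing the target much below ~2 GB leaves very little room for the caches):

  ceph config set osd osd_memory_target 2147483648      # 2 GiB for all OSDs
  ceph config set osd.3 osd_memory_target 1610612736    # or override a single OSD
  ceph config get osd osd_memory_target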

[ceph-users] Re: weird performance issue on ceph

2022-09-26 Thread Mark Nelson
? Thanks, Zoltan On 17.09.22 at 06:58, Mark Nelson wrote: CAUTION: This email originated from outside the organization. Do not click links unless you can confirm the sender and know the content is safe. Hi Zoltan, So kind of interesting results

[ceph-users] Re: weird performance issue on ceph

2022-09-16 Thread Mark Nelson
to recreate the whole thing, so we thought we'd start with the bad state, maybe something obvious is already visible for someone who knows the osd internals well. You can find the file here: https://pastebin.com/0HdNapLQ Thanks a lot in advance, Zoltan On 12.08.22 at 18:25, Mark Nelson wrote: CAUTION

[ceph-users] Re: Wide variation in osd_mclock_max_capacity_iops_hdd

2022-09-08 Thread Mark Nelson
FWIW, I'd be for trying to take periodic samples of actual IO happening on the drive during operation.  You can get a much better idea of latency and throughput characteristics across different IO sizes over time (though you will need to account for varying levels of concurrency at the device
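For reference, the measured value can be inspected per OSD and, if it looks implausible, pinned manually (OSD id and IOPS value are placeholders):

  ceph config show osd.12 osd_mclock_max_capacity_iops_hdd
  ceph config set osd.12 osd_mclock_max_capacity_iops_hdd 350
  ceph config rm osd.12 osd_mclock_max_capacity_iops_hdd    # fall back to the measured value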

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-18 Thread Mark Nelson
ed_count": 1895058, "bluestore_allocated": 171709562880, "bluestore_stored": 304405529094, "bluestore_compressed": 30506295169, "bluestore_compressed_allocated": 132702666752, "bluestore_compressed_origi

[ceph-users] Re: weird performance issue on ceph

2022-08-12 Thread Mark Nelson
: https://pastebin.com/dEv05eGV Do you see anything obvious that could give us a clue what is going on? Many thanks! Zoltan On 02.08.22 at 19:01, Mark Nelson wrote: Ah, too bad!  I suppose that was too easy. :) Ok, so my two lines of thought: 1) Something related to the weird performance issues

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-11 Thread Mark Nelson
meant the bluestore_compression_mode option you specify in the ceph.conf file. Mark Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Mark Nelson Sent: 10 August 2022 22:28 To: Frank Schilder; ceph-users@ceph.io

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-10 Thread Mark Nelson
ot for easier performance wins first where we can get them. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Mark Nelson Sent: 09 August 2022 16:56:19 To: Frank Schilder; ceph-users@ceph.io Subject: Re: [ceph-u

[ceph-users] Re: Request for Info: bluestore_compression_mode?

2022-08-09 Thread Mark Nelson
parameters as orthogonal as possible Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 ____ From: Mark Nelson Sent: 08 August 2022 20:30:49 To: ceph-users@ceph.io Subject: [ceph-users] Request for Info: bluestore_compression_mode?

[ceph-users] Request for Info: bluestore_compression_mode?

2022-08-08 Thread Mark Nelson
Hi Folks, We are trying to get a sense for how many people are using bluestore_compression_mode or the per-pool compression_mode options (these were introduced early in bluestore's life, but afaik may not widely be used).  We might be able to reduce complexity in bluestore's blob code if we
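For anyone answering the survey, these are the two forms in question (pool name, mode, and algorithm are illustrative):

  # global/per-OSD form (ceph.conf or the mon config store)
  ceph config set osd bluestore_compression_mode aggressive
  ceph config set osd bluestore_compression_algorithm snappy
  # per-pool form
  ceph osd pool set <pool> compression_mode aggressive
  ceph osd pool set <pool> compression_algorithm snappy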

[ceph-users] Re: weird performance issue on ceph

2022-08-02 Thread Mark Nelson
gling with this. Any other recommendation or idea what to check? Thanks a lot, Zoltan On 01.08.22 at 17:53, Mark Nelson wrote: Hi Zoltan, It doesn't look like your pictures showed up for me at least. Very interesting results though!  Are (or were) the drives particularly full when you've

[ceph-users] Re: weird performance issue on ceph

2022-08-01 Thread Mark Nelson
l requests. Is it safe to use these options in production at all? Many thanks, Zoltan On 25.07.22 at 21:42, Mark Nelson wrote: I don't think so if this is just plain old RBD.  RBD shouldn't require a bunch of RocksDB iterator seeks in the read/write hot path and writes should pretty quickly

[ceph-users] Re: weird performance issue on ceph

2022-07-25 Thread Mark Nelson
AIT Risø Campus Bygning 109, rum S14 From: Mark Nelson Sent: 25 July 2022 18:50 To: ceph-users@ceph.io Subject: [ceph-users] Re: weird performance issue on ceph Hi Zoltan, We have a very similar setup with one of our upstream community performance test

[ceph-users] Re: weird performance issue on ceph

2022-07-25 Thread Mark Nelson
Hi Zoltan, We have a very similar setup with one of our upstream community performance test clusters.  60 4TB PM983 drives spread across 10 nodes.  We get similar numbers to what you are initially seeing (scaled down to 60 drives) though with somewhat lower random read IOPS (we tend to max

[ceph-users] Re: How much IOPS can be expected on NVME OSDs

2022-05-12 Thread Mark Nelson
Hi Felix, Those are pretty good drives and shouldn't have too much trouble with O_DSYNC writes which can often be a bottleneck for lower end NVMe drives.  Usually if the drives are fast enough, it comes down to clock speed and cores.  Clock speed helps the kv sync thread write metadata to
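A common sanity check for the O_DSYNC-style write path on a candidate drive, loosely modelled on the etcd-style fio test quoted elsewhere in this archive (destructive if pointed at a raw device; use a scratch device or file, and treat the numbers only as a relative guide):

  fio --name=dsync-test --filename=/dev/nvme0n1 --rw=write --bs=4k \
      --iodepth=1 --numjobs=1 --direct=1 --fdatasync=1 --runtime=60 --time_based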

[ceph-users] Re: Using CephFS in High Performance (and Throughput) Compute Use Cases

2022-04-13 Thread Mark Nelson
server nodes and 8 disks each, if I understand correctly, I assume they used 2-rep, though. They used 10 client nodes with 32 threads each... Best wishes, Manuel [1] https://croit.io/blog/ceph-performance-test-and-optimization [2] https://io500.org/submissions/view/82 On Tue, Jul 27, 2021 at 7:21 P

[ceph-users] Re: Low performance on format volume

2022-04-12 Thread Mark Nelson
Hi Iban, Most of these options fall under the osd section.  You can get descriptions of what they do here: https://docs.ceph.com/en/latest/rados/configuration/osd-config-ref/ The journal settings are for the old filestore backend and aren't relevant unless you are using it.  Still, you

[ceph-users] Re: osd with unlimited ram growth

2022-04-12 Thread Mark Nelson
Hi Joachim, Thank you much for the great writeup!  This definitely has been a major source of frustration. Thanks, Mark On 4/12/22 05:23, Joachim Kraftmayer (Clyso GmbH) wrote: Hi all, In the last few weeks we have discovered an error for which there have been several tickets and error

[ceph-users] Re: What's the relationship between osd_memory_target and bluestore_cache_size?

2022-03-29 Thread Mark Nelson
t;: { "items": 205850092, "bytes": 9282297783 } } } The total bytes at the end is much less than what the OS reports. Is this something I can control by adjusting the calculation frequency as Mark suggests? Looking at your numbers he

[ceph-users] Re: What's the relationship between osd_memory_target and bluestore_cache_size?

2022-03-29 Thread Mark Nelson
On 3/29/22 11:44, Anthony D'Atri wrote: [osd] bluestore_cache_autotune = 0 Why are you turning autotuning off? FWIW I’ve encountered the below assertions. I neither support nor deny them, pasting here for discussion. One might interpret this to only apply to OSDs with DB on a separate

[ceph-users] Re: Ceph-CSI and OpenCAS

2022-03-14 Thread Mark Nelson
Hi Martin, I believe RH's reference architecture team has deployed ceph with CAS (and perhaps open CAS when it was open sourced), but I'm not sure if there's been any integration work done yet with ceph-csi. Theoretically it should be fairly easy though since the OSD will just treat it as

[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Mark Nelson
Hi Eugen, Thanks for the great feedback.  Is there anything specific about the cache tier itself that you like vs hypothetically having caching live below the OSDs?  There are some real advantages to the cache tier concept, but eviction over the network has definitely been one of the

[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Mark Nelson
On a related note,  Intel will be presenting about their Open CAS software that provides caching at the block layer under the OSD at the weekly performance meeting on 2/24/2022 (similar to dm-cache, but with differences regarding the implementation).  This isn't a replacement for cache

[ceph-users] Re: Advice on enabling autoscaler

2022-02-07 Thread Mark Nelson
On 2/7/22 12:34 PM, Alexander E. Patrakov wrote: Mon, 7 Feb 2022 at 17:30, Robert Sander: And keep in mind that when PGs are increased you may also need to increase the number of OSDs, as one OSD should carry a max of around 200 PGs. But I do not know if that is still the case with

[ceph-users] Re: NVME Namspaces vs SPDK

2022-02-05 Thread Mark Nelson
4 OSDs per NVMe drive only really makes sense if you have extra CPU to spare and very fast drives.  It can have higher absolute performance when you give OSDs unlimited CPUs, but it tends to be slower (and less efficient) in CPU limited scenarios in our testing (ymmv).  We've got some

[ceph-users] Re: Cephadm Deployment with io_uring OSD

2022-01-10 Thread Mark Nelson
Hi Gene, Unfortunately when the io_uring code was first implemented there were no stable centos kernels in our test lab that included io_uring support so it hasn't gotten a ton of testing.  I agree that your issue looks similar to what was reported in issue #47661, but it looks like you are

[ceph-users] Re: 50% IOPS performance drop after upgrade from Nautilus 14.2.22 to Octopus 15.2.15

2021-12-22 Thread Mark Nelson
On 12/22/21 4:23 AM, Marc wrote: I guess what caused the issue was high latencies on our “big” SSD’s (7TB drives), which got really high after the upgrade to Octopus. We split them into 4 OSDs some days ago and since then the high commit latencies on the OSD’s and on bluestore are gone Hmm,
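For reference, splitting a large flash device into several OSDs is usually done at deployment time with ceph-volume (device path and count are placeholders):

  ceph-volume lvm batch --osds-per-device 4 /dev/nvme0n1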

[ceph-users] Re: Large latency for single thread

2021-12-21 Thread Mark Nelson
is sensitive to latency. P.S. Can crimson be used in production now or not? On 12/16/21 3:53 AM, Mark Nelson wrote: FWIW, we ran single OSD, iodepth=1 O_DSYNC write tests against classic and crimson bluestore OSDs in our Q3 crimson slide deck. You can see the results starting on slide 32 here

[ceph-users] Re: Large latency for single thread

2021-12-15 Thread Mark Nelson
FWIW, we ran single OSD, iodepth=1 O_DSYNC write tests against classic and crimson bluestore OSDs in our Q3 crimson slide deck. You can see the results starting on slide 32 here: https://docs.google.com/presentation/d/1eydyAFKRea8n-VniQzXKW8qkKM9GLVMJt2uDjipJjQA/edit#slide=id.gf880cf6296_1_73

[ceph-users] Re: OSD huge memory consumption

2021-12-06 Thread Mark Nelson
Hi Marius, Have you changed any of the default settings?  You've got a huge number of pglog entries.  Do you have any other pools as well? Even though pglog is only taking up 6-7GB of the 37GB used, that's a bit of a red flag for me.  Something we don't track via the mempools is taking up a
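The per-category accounting referenced here comes from the mempool dump on the admin socket (osd.0 is a placeholder):

  ceph daemon osd.0 dump_mempools              # osd_pglog, buffer_anon, bluestore caches, ...
  ceph daemon osd.0 config get osd_min_pg_log_entries
  ceph daemon osd.0 config get osd_max_pg_log_entries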

[ceph-users] Re: Best settings bluestore_rocksdb_options for my workload

2021-12-02 Thread Mark Nelson
on the osd to not have these pg movements this effect? Istvan Szabo Senior Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Mark Nelson Sent: Thu

[ceph-users] Re: Best settings bluestore_rocksdb_options for my workload

2021-12-02 Thread Mark Nelson
Infrastructure Engineer --- Agoda Services Co., Ltd. e: istvan.sz...@agoda.com --- -Original Message- From: Mark Nelson Sent: Thursday, December 2, 2021 7:33 PM To: ceph-users@ceph.io Subject: [ceph

[ceph-users] Re: Best settings bluestore_rocksdb_options for my workload

2021-12-02 Thread Mark Nelson
Hi Istvan, Is that 1-1.2 billion 40KB rgw objects?  If you are running EC 4+2 on a 42 OSD cluster with that many objects (and a heavily write oriented workload), that could be hitting rocksdb pretty hard.  FWIW, you might want to look at the compaction stats provided in the OSD log.  You can
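The compaction statistics mentioned show up in the OSD log as periodic RocksDB stats dumps; a hedged way to pull them out (log path and the exact header string depend on the release and deployment):

  grep -n -A 20 "Compaction Stats" /var/log/ceph/ceph-osd.12.log
  ceph tell osd.12 compact     # trigger a manual compaction to compare before/after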

[ceph-users] Re: OSDs get killed by OOM when other host goes down

2021-11-16 Thread Mark Nelson
Yeah, if it's not memory reported by the mempools, that means it's something we aren't tracking.  Perhaps temporary allocations in some dark corner of the code, or possibly rocksdb (though 38GB of ram is obviously excessive).  heap stats are a good idea.  it's possible if neither the heap
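The heap stats mentioned can be pulled from the running daemon via tell (osd.7 is a placeholder; heap release asks tcmalloc to hand freed pages back to the OS):

  ceph tell osd.7 heap stats
  ceph tell osd.7 heap release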

[ceph-users] Re: How to minimise the impact of compaction in ‘rocksdb options’?

2021-11-15 Thread Mark Nelson
Hi, Compaction can block reads, but on the write path you should be able to absorb a certain amount of writes via the WAL before rocksdb starts throttling writes.  The larger and more WAL buffers you have, the more writes you can absorb, but bigger buffers also take more CPU to keep in
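As a rough illustration of the WAL-buffer trade-off described here, the relevant RocksDB knobs are usually carried in bluestore_rocksdb_options (values below are examples only, not recommendations, and the string replaces the default option string rather than merging with it):

  [osd]
  bluestore_rocksdb_options = compression=kNoCompression,max_write_buffer_number=16,min_write_buffer_number_to_merge=2,write_buffer_size=67108864,max_background_compactions=2

More and larger write buffers absorb bigger write bursts before RocksDB starts throttling, at the cost of extra memory and CPU, which matches the trade-off described above.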

[ceph-users] Re: Question if WAL/block.db partition will benefit us

2021-11-11 Thread Mark Nelson
On 11/11/21 1:09 PM, Anthony D'Atri wrote: it in the documentation. This sounds like a horrible SPoF. How can you recover from it? Purge the OSD, wipe the disk and readd it? All flash cluster is sadly not an option for our s3, as it is just too large and we just bought around 60x 8TB Disks (in

[ceph-users] Re: High ceph_osd_commit_latency_ms on Toshiba MG07ACA14TE HDDs

2021-11-04 Thread Mark Nelson
Hi Dan, I can't speak for those specific Toshiba drives, but we have absolutely seen very strange behavior (sometimes with cache enabled and sometimes not) with different drives and firmwares over the years from various manufacturers.  There was one especially bad case from back in the
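When chasing behaviour like this, the drive's volatile write cache can at least be inspected and toggled for testing (SATA example via hdparm; SAS drives use sdparm/smartctl instead, and the device path is a placeholder):

  hdparm -W /dev/sdX       # show the current write-cache setting
  hdparm -W 0 /dev/sdX     # disable the volatile write cache for testing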
