[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin
On 10/09/2021 15:54, Janne Johansson wrote: On Fri, 10 Sep 2021 at 14:39, George Shuklin wrote: On 10/09/2021 14:49, George Shuklin wrote: Hello. I wonder if there is a way to see how many replicas are available for each object (or, at least, PG-level statistics). Basically, if I have
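
For reference, a minimal sketch of pulling per-PG degradation counts out of the JSON dump, assuming a Nautilus-or-later layout where the stats live under .pg_map.pg_stats (older releases keep the array at the top level):

    # Rank PGs by degraded object count, most degraded first.
    ceph pg dump --format=json 2>/dev/null \
      | jq -r '.pg_map.pg_stats[]
               | select(.stat_sum.num_objects_degraded > 0)
               | [.pgid, .state, .stat_sum.num_objects_degraded, .stat_sum.num_objects]
               | @tsv' \
      | sort -t$'\t' -k3 -rn | head -20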

[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin
On 10/09/2021 15:37, Janne Johansson wrote: On Fri, 10 Sep 2021 at 14:27, George Shuklin wrote: On 10/09/2021 15:19, Janne Johansson wrote: Is there a way? pg list is not very informative, as it does not show how badly 'unreplicated' the data is. ceph pg dump should list all PGs and how many

[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin
On 10/09/2021 14:49, George Shuklin wrote: Hello. I wonder if there is a way to see how many replicas are available for each object (or, at least, PG-level statistics). Basically, if I have a damaged cluster, I want to see the scale of damage, and I want to see the most degraded objects (which

[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread George Shuklin
On 10/09/2021 15:19, Janne Johansson wrote: On Fri, 10 Sep 2021 at 13:55, George Shuklin wrote: Hello. I wonder if there is a way to see how many replicas are available for each object (or, at least, PG-level statistics). Basically, if I have a damaged cluster, I want to see the scale of damage

[ceph-users] List pg with heavily degraded objects

2021-09-10 Thread George Shuklin
Hello. I wonder if there is a way to see how many replicas are available for each object (or, at least, PG-level statistics). Basically, if I have a damaged cluster, I want to see the scale of damage, and I want to see the most degraded objects (those that have 1 copy, then objects with 2 copies,
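
For a quick look without JSON, a sketch of the plain commands (column names vary a little between releases):

    ceph pg ls degraded      # PGs in the "degraded" state, with DEGRADED/UNFOUND counts
    ceph pg ls undersized    # PGs currently holding fewer replicas than the pool size
    ceph health detail       # per-check summary of degraded/unfound objects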

[ceph-users] Permissions for OSD

2021-01-25 Thread George Shuklin
The docs for permissions are super vague. What does each flag do? What does 'x' permit? What's the difference between class-write and write? And the last question: can we limit a user to reading/writing only existing objects in the pool? Thanks!
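
A sketch of the cap syntax in question (client and pool names are made up): 'x' permits calling object-class methods, and class-read/class-write are its finer-grained halves.

    ceph auth get-or-create client.writer \
        mon 'allow r' \
        osd 'allow rw pool=mypool'
    ceph auth get-or-create client.classuser \
        mon 'allow r' \
        osd 'allow class-read class-write pool=mypool'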

[ceph-users] Re: How to make HEALTH_ERR quickly and pain-free

2021-01-21 Thread George Shuklin
On 21/01/2021 12:57, George Shuklin wrote: I have a hell of a question: how do I get a HEALTH_ERR status on a cluster without consequences? I'm working on CI tests and I need to check that our reaction to HEALTH_ERR is good. For this I need to take an empty cluster with an empty pool and do

[ceph-users] Re: How to make HEALTH_ERR quickly and pain-free

2021-01-21 Thread George Shuklin
On 21/01/2021 13:02, Eugen Block wrote: But HEALTH_ERR is a bit more tricky. Any ideas? I think if you set a very low quota for a pool (e.g. 1000 bytes or so) and fill it up, it should create a HEALTH_ERR status, IIRC. Cool idea. Unfortunately, even with a 1-byte quota (and some data in the
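
The quota idea spelled out, with a hypothetical pool name. Note that on many releases a pool hitting its quota raises POOL_FULL as HEALTH_WARN rather than HEALTH_ERR, which may be why it fell short here.

    ceph osd pool set-quota testpool max_bytes 1000
    dd if=/dev/urandom of=/tmp/blob bs=4K count=1
    rados -p testpool put blob /tmp/blob
    ceph health detail
    # revert:
    ceph osd pool set-quota testpool max_bytes 0
    rados -p testpool rm blob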

[ceph-users] How to make HEALTH_ERR quickly and pain-free

2021-01-21 Thread George Shuklin
I have a hell of a question: how do I get a HEALTH_ERR status on a cluster without consequences? I'm working on CI tests and I need to check that our reaction to HEALTH_ERR is good. For this I need to take an empty cluster with an empty pool and do something. Preferably something quick and reversible.
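
Not taken from this thread, but one commonly used, reversible knob on Luminous-or-later test clusters is to lower the full ratios so every OSD is treated as full (OSD_FULL is an error-level health check). A sketch, with the usual defaults shown for the revert (adjust if yours differ):

    ceph osd set-nearfull-ratio 0.0001
    ceph osd set-backfillfull-ratio 0.0005
    ceph osd set-full-ratio 0.001       # cluster should go HEALTH_ERR shortly
    ceph health detail
    # revert:
    ceph osd set-nearfull-ratio 0.85
    ceph osd set-backfillfull-ratio 0.90
    ceph osd set-full-ratio 0.95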

[ceph-users] Decoding pgmap

2021-01-14 Thread George Shuklin
There is a command `ceph pg getmap`. It produces a binary file. Is there a utility to decode it?
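
One possible route, assuming the PGMap type is registered in your build of ceph-dencoder (check list_types first); decoding generally needs a tool from the same Ceph release:

    ceph pg getmap -o /tmp/pgmap.bin
    ceph-dencoder list_types | grep -i pgmap
    ceph-dencoder type PGMap import /tmp/pgmap.bin decode dump_json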

[ceph-users] Namespace usability for mutitenancy

2020-12-17 Thread George Shuklin
Hello. Has anyone started using namespaces in real production for multi-tenancy? How good are they at isolating tenants from each other? Can they see each other's presence, quotas, etc.? Is it safe to give access via cephx to (possibly hostile to each other) users to the same pool with
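
A sketch of per-tenant cephx caps on a shared pool (client, pool and namespace names are made up); each key is confined to its own RADOS namespace, and clients have to name the namespace explicitly:

    ceph auth get-or-create client.tenant-a \
        mon 'allow r' \
        osd 'allow rw pool=shared namespace=tenant-a'
    ceph auth get-or-create client.tenant-b \
        mon 'allow r' \
        osd 'allow rw pool=shared namespace=tenant-b'
    rados -p shared --namespace tenant-a --id tenant-a put obj1 /etc/hostname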

[ceph-users] Re: NVMe's

2020-09-23 Thread George Shuklin
I've just finished doing our own benchmarking, and I can say that what you want to do is very unbalanced and CPU-bound. 1. Ceph consumes a LOT of CPU. My peak value was around 500% CPU per ceph-osd at top performance (see the recent thread on 'ceph on brd'), with more realistic numbers

[ceph-users] Re: Low level bluestore usage

2020-09-23 Thread George Shuklin
On 23/09/2020 04:09, Alexander E. Patrakov wrote: Sometimes this doesn't help. For data recovery purposes, the most helpful step if you get the "bluefs enospc" error is to add a separate db device, like this: systemctl disable --now ceph-osd@${OSDID}; truncate -s 32G
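
A reconstruction of the general approach rather than the exact commands above, with a hypothetical OSD id; the idea is to give the stuck OSD a temporary separate DB device so bluefs has room to start:

    OSDID=12
    systemctl disable --now ceph-osd@${OSDID}
    truncate -s 32G /var/tmp/osd-${OSDID}-db.img
    DBDEV=$(losetup --find --show /var/tmp/osd-${OSDID}-db.img)
    ceph-bluestore-tool bluefs-bdev-new-db \
        --path /var/lib/ceph/osd/ceph-${OSDID} --dev-target ${DBDEV}
    # some releases also want bluestore_block_db_size set in the config for this step
    chown -h ceph:ceph /var/lib/ceph/osd/ceph-${OSDID}/block.db
    systemctl enable --now ceph-osd@${OSDID}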

[ceph-users] Re: NVMe's

2020-09-23 Thread George Shuklin
On 23/09/2020 10:54, Marc Roos wrote: Depends on your expected load, no? I have already read here numerous times that OSDs cannot keep up with NVMes, which is why people put 2 OSDs on a single NVMe. So on a busy node, you probably run out of cores? (But better verify this with someone that

[ceph-users] Re: Low level bluestore usage

2020-09-22 Thread George Shuklin
As far as I know, bluestore doesn't like super small sizes. Normally the OSD should stop doing funny things at the full mark, but if the device is too small it may be too late and bluefs runs out of space. Two things: 1. Don't use too-small OSDs. 2. Have a spare area on the drive. I usually reserve 1% for

[ceph-users] Re: Benchmark WAL/DB on SSD and HDD for RGW RBD CephFS

2020-09-18 Thread George Shuklin
On 17/09/2020 17:37, Mark Nelson wrote: Does fio handle S3 objects spread across many buckets well? I think bucket listing performance was maybe missing too, but it's been a while since I looked at fio's S3 support. Maybe they have those use cases covered now. I wrote a Go-based benchmark
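
For what it's worth, a sketch of driving S3 from fio via its http ioengine; option names are as of recent fio releases (verify with fio --enghelp=http), and the endpoint, bucket and credentials are placeholders:

    fio --name=s3-write --ioengine=http --http_mode=s3 --https=off \
        --http_host=rgw.example.com:8080 \
        --http_s3_keyid=ACCESSKEY --http_s3_key=SECRETKEY \
        --filename=/mybucket/fio-test-object \
        --rw=write --bs=4m --size=64m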

[ceph-users] disk scheduler for SSD

2020-09-18 Thread George Shuklin
I'm starting to wonder (again) which scheduler is better for Ceph on SSD. My reasoning. None: 1. Reduces latency for requests. The lower the latency, the higher the perceived performance for an unbounded workload with a fixed queue depth (hello, benchmarks). 2. Causes possible spikes in latency for
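
A sketch of pinning "none" for non-rotational devices via udev (the rule file name and match are a common pattern, not anything Ceph-specific):

    # /etc/udev/rules.d/60-ssd-scheduler.rules
    ACTION=="add|change", KERNEL=="sd[a-z]*", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"

    udevadm control --reload && udevadm trigger
    # or one-off per device:
    echo none > /sys/block/sda/queue/scheduler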

[ceph-users] Re: Benchmark WAL/DB on SSD and HDD for RGW RBD CephFS

2020-09-17 Thread George Shuklin
On 16/09/2020 07:26, Danni Setiawan wrote: Hi all, I'm trying to find the performance penalty for HDD OSDs when using WAL/DB on a faster device (SSD/NVMe) vs WAL/DB on the same device (HDD) for different workloads (RBD, RGW with the index bucket in an SSD pool, and CephFS with metadata in an SSD pool). I want to
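
For reference, a sketch of how the two variants are usually deployed (device paths are placeholders):

    # WAL/DB on a faster device:
    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
    # everything on the HDD:
    ceph-volume lvm create --bluestore --data /dev/sdb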

[ceph-users] Re: ceph-osd performance on ram disk

2020-09-14 Thread George Shuklin
On 11/09/2020 17:44, Mark Nelson wrote: On 9/11/20 4:15 AM, George Shuklin wrote: On 10/09/2020 19:37, Mark Nelson wrote: On 9/10/20 11:03 AM, George Shuklin wrote: ... Are there any knobs to tweak to see higher performance for ceph-osd? I'm pretty sure it's not any kind of leveling, GC

[ceph-users] Re: Is it possible to assign osd id numbers?

2020-09-14 Thread George Shuklin
On 11/09/2020 22:43, Shain Miley wrote: Thank you for your answer below. I'm not looking to reuse them as much as I am trying to control which unused number is actually used. For example, if I have 20 OSDs and 2 have failed... when I replace a disk in one server I don't want it to automatically
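
A sketch of pinning the id when replacing a failed disk (id and device are hypothetical): mark the old OSD destroyed so its id stays reserved, then ask ceph-volume to reuse it.

    ceph osd destroy 17 --yes-i-really-mean-it
    ceph-volume lvm create --bluestore --data /dev/sdq --osd-id 17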

[ceph-users] Re: Is it possible to assign osd id numbers?

2020-09-11 Thread George Shuklin
On 11/09/2020 16:11, Shain Miley wrote: Hello, I have been wondering for quite some time whether or not it is possible to influence the osd.id numbers that are assigned during an install. I have made an attempt to keep our osds in order over the last few years, but it is a losing battle

[ceph-users] Re: ceph-osd performance on ram disk

2020-09-11 Thread George Shuklin
On 10/09/2020 19:37, Mark Nelson wrote: On 9/10/20 11:03 AM, George Shuklin wrote: ... Are there any knobs to tweak to see higher performance for ceph-osd? I'm pretty sure it's not any kind of leveling, GC or other 'iops-related' issue (brd has performance two orders of magnitude higher

[ceph-users] Re: ceph-osd performance on ram disk

2020-09-11 Thread George Shuklin
On 10/09/2020 22:35, vita...@yourcmc.ru wrote: Hi George. Author of Ceph_performance here! :) I suspect you're running tests with 1 PG. Every PG's requests are always serialized, which is why the OSD doesn't utilize all threads with 1 PG. You need something like 8 PGs per OSD. More than 8 usually
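
Following that advice, a sketch of a benchmark pool with enough PGs per OSD (the numbers are illustrative for a small test cluster):

    ceph osd pool create bench 128 128 replicated
    ceph osd pool application enable bench rbd
    # keep the autoscaler from resizing it mid-run (Nautilus+):
    ceph osd pool set bench pg_autoscale_mode off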

[ceph-users] Re: ceph-osd performance on ram disk

2020-09-10 Thread George Shuklin
Latency from the client side is not an issue. It just combines with the other latencies in the stack. The more the client lags, the easier it is for the cluster. Here, the thing I'm talking about is slightly different. When you want to establish baseline performance for the osd daemon (disregarding the block device and

[ceph-users] Re: ceph-osd performance on ram disk

2020-09-10 Thread George Shuklin
I know. I tested fio before testing Ceph with fio. With the null ioengine, fio can handle up to 14M IOPS (on my dusty lab's R220). On blk_null it goes down to 2.4-2.8M IOPS. On brd it drops to a sad 700k IOPS. BTW, never run synthetic high-performance benchmarks on KVM. My old server with

[ceph-users] Re: ceph-osd performance on ram disk

2020-09-10 Thread George Shuklin
Thank you! I know that article, but they promise 6 cores used per OSD, and I got barely over three, and all this in a totally synthetic environment with no SSD to blame (brd is more than fast and has very consistent latency under any kind of load). On Thu, Sep 10, 2020, 19:39 Marc Roos wrote: >

[ceph-users] ceph-osd performance on ram disk

2020-09-10 Thread George Shuklin
I'm creating a benchmark suite for Ceph. While benchmarking the benchmark itself, I checked how fast ceph-osd works. I decided to skip all the 'SSD mess' and use brd (block ram disk, modprobe brd) as the underlying storage. Brd itself can yield up to 2.7M IOPS in fio. In single-thread mode (iodepth=1) it
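
A sketch of that kind of baseline: a 4 GiB ram-backed block device and a single-threaded 4k random-write fio run against it (sizes and options are illustrative):

    modprobe brd rd_nr=1 rd_size=4194304     # rd_size is in KiB, so ~4 GiB
    fio --name=brd-baseline --filename=/dev/ram0 \
        --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
        --iodepth=1 --numjobs=1 --runtime=30 --time_based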