[ceph-users] Re: Module 'cephadm' has failed: 'NoneType' object has no attribute 'split'

2021-02-02 Thread Tony Liu
File \"/usr/share/ceph/mgr/cephadm/module.py\", line 442, in serve serve.serve() File \"/usr/share/ceph/mgr/cephadm/serve.py\", line 66, in serve self.mgr.rm_util.process_removal_queue() File \"/usr/share/ceph/mgr/cephadm/services/osd.py\", line 348, in process_removal_queue self.mgr._remove

[ceph-users] Module 'cephadm' has failed: 'NoneType' object has no attribute 'split'

2021-02-02 Thread Tony Liu
Hi, After upgrading from 15.2.5 to 15.2.8, I see this health error. Has anyone seen this? "ceph log last cephadm" doesn't show anything about it. How can I trace it? Thanks! Tony
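
One way to trace this, sketched here on the assumption that the stock cephadm logging options of 15.2.x are available, is to raise the module's cluster-log level and then re-check the log:

```
# Raise cephadm logging to the cluster log, then look again
ceph config set mgr mgr/cephadm/log_to_cluster_level debug
ceph log last 200 debug cephadm

# The stored module error and any recent crashes
ceph health detail
ceph crash ls

# Revert the logging level afterwards
ceph config set mgr mgr/cephadm/log_to_cluster_level info
```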

[ceph-users] replace OSD without PG remapping

2021-02-02 Thread Tony Liu
Hi, There are multiple different procedures to replace an OSD. What I want is to replace an OSD without PG remapping. #1 I tried "orch osd rm --replace", which sets the OSD reweight to 0 and its status to "destroyed". "orch osd rm status" shows "draining". All PGs on this OSD are remapped. Checked "pg dump", c
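
For comparison, the classic non-orchestrator procedure keeps the OSD's CRUSH entry and id, so PGs go degraded instead of being remapped. A rough sketch only, with OSD id 12 and /dev/sdX as placeholders; on a cephadm-managed host the stop/recreate steps go through the orchestrator instead.

```
ceph osd set norebalance                      # optional: avoid data movement while working
systemctl stop ceph-osd@12                    # placeholder OSD id
ceph osd destroy 12 --yes-i-really-mean-it    # keeps the CRUSH entry and id; PGs stay mapped, degraded

# physically swap the disk, then recreate the OSD with the same id
ceph-volume lvm create --osd-id 12 --data /dev/sdX   # placeholder device

ceph osd unset norebalance
```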

[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-02 Thread Tony Liu
All mon, mgr, crash and osd are upgraded to 15.2.8. It actually fixed another issue (no device listed after adding host). But this issue remains.
```
# cat osd-spec.yaml
service_type: osd
service_id: osd-spec
placement:
  host_pattern: ceph-osd-[1-3]
data_devices:
  rotational: 1
db_devices:
  ro
```
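
A quick way to check whether the spec above is accepted with its db_devices section, sketched on the assumption that the service id stays osd-spec (--dry-run may not be present in every 15.2.x point release):

```
ceph orch apply osd -i osd-spec.yaml --dry-run      # preview which disks would be picked up
ceph orch apply osd -i osd-spec.yaml
ceph orch ls --service_name osd.osd-spec --export   # does db_devices show up here?
```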

[ceph-users] Re: no device listed after adding host

2021-02-02 Thread Tony Liu
This works after upgrading from 15.2.5 to 15.2.8. I see an improvement: "orch host add" now does some checking and shows explicit messages if anything is missing. But I am still not sure how 15.2.5 worked initially when building the cluster. Anyways, I am good now. Thanks! Tony

[ceph-users] Re: is unknown pg going to be active after osds are fixed?

2021-02-02 Thread Tony Liu
Thank you all for the kind responses! This problem didn't happen naturally; it was caused by an operational mistake. Anyways, 3 OSDs were replaced with zapped disks. That caused two unknown PGs. Data on those 2 PGs is permanently lost, unfortunately. "pg dump" shows unknown. "pg map " shows those 3 replaced OSD
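
If the data really is unrecoverable, the usual last resort, sketched here with a placeholder pg id and deliberately destructive, is to recreate the affected PGs empty so the pool can go active again:

```
ceph pg map 2.1a                                        # placeholder pg id: confirm it maps to the replaced OSDs
ceph osd force-create-pg 2.1a --yes-i-really-mean-it    # recreates the PG empty; anything left in it is gone
```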

[ceph-users] Re: radosgw bucket index issue

2021-02-02 Thread Fox, Kevin M
Ping From: Fox, Kevin M Sent: Tuesday, December 29, 2020 3:17 PM To: ceph-users@ceph.io Subject: [ceph-users] radosgw bucket index issue We have a fairly old cluster that has over time been upgraded to nautilus. We were digging through some things and fo

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-02 Thread Loïc Dachary
Hi Greg, On 02/02/2021 20:34, Gregory Farnum wrote: > Packing's obviously a good idea for storing these kinds of artifacts > in Ceph, and hacking through the existing librbd might indeed be > easier than building something up from raw RADOS, especially if you > want to use stuff like rbd-mirror. >

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-02 Thread Anthony D'Atri
I’d be nervous about a plan to utilize a single volume, growing indefinitely. I would think that, from a blast-radius perspective, you’d want to strike a balance between a single monolithic blockchain-style volume and a zillion tiny files. Perhaps a strategy to shard into, say, 10 TB volumes

[ceph-users] Re: Using RBD to pack billions of small files

2021-02-02 Thread Gregory Farnum
Packing's obviously a good idea for storing these kinds of artifacts in Ceph, and hacking through the existing librbd might indeed be easier than building something up from raw RADOS, especially if you want to use stuff like rbd-mirror. My main concern would just be, as Dan points out, that we don'

[ceph-users] Re: is unknown pg going to be active after osds are fixed?

2021-02-02 Thread Jeremy Austin
I'm in a similar but not identical situation. I was in the middle of a rebalance on a small test cluster, with about 1% of pgs degraded, and shut the cluster entirely down for maintenance. On startup, many pgs are entirely unknown, and most stale. In fact most pgs can't be queried! No mon failu
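
A sketch of the usual first checks after a full shutdown, assuming nothing more exotic than OSDs that have not come back up; the pg id is a placeholder:

```
ceph osd tree down              # any OSDs that did not come back?
ceph pg dump_stuck stale        # stale PGs and their acting sets
ceph pg dump_stuck inactive
ceph pg 2.1a query              # placeholder pg id; only answers once an OSD in the set is up
```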

[ceph-users] Re: `cephadm` not deploying OSDs from a storage spec

2021-02-02 Thread Juan Miguel Olmo Martinez
Hi Davor, Use "ceph orch ls osd --format yaml" to have more info about the problems deploying the osd service, probably that will give you clues about what is happening. Share the input if you cannot solve the problem:-) The same command can be used for other services like the node-exporter, alth

[ceph-users] Re: no device listed after adding host

2021-02-02 Thread Juan Miguel Olmo Martinez
Hi Eugen Block, useful tips to create OSDs: 1. Check device availability in your cluster hosts: # ceph orch device ls 2. Devices not available: this usually means that you have created LVs on these devices (I mean the de
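
To make such a device usable again, the usual step is to zap it through the orchestrator; a sketch with placeholder host and device names, and note that zapping destroys whatever is on the disk:

```
ceph orch device ls                                 # shows which devices are available or rejected
ceph orch device zap ceph-osd-1 /dev/sdb --force    # placeholder host and device; wipes LVM/partitions
```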

[ceph-users] Re: osd recommended scheduler

2021-02-02 Thread Andrei Mikhailovsky
Thanks for your reply, Wido. Isn't CFQ being deprecated in the latest kernel versions? From what I've read on the Ubuntu support pages, cfq, deadline, and noop are no longer supported since 2019 / kernel version 5.3 and later. There are, however, the following schedulers: bfq, kyber, mq-dea
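
For reference, on blk-mq kernels the scheduler is set per block device via sysfs; a small sketch with sdX as a placeholder (a udev rule would make the choice persistent):

```
cat /sys/block/sdX/queue/scheduler             # e.g. [mq-deadline] kyber bfq none
echo bfq  > /sys/block/sdX/queue/scheduler     # HDD: bfq keeps I/O priorities (ionice) working
echo none > /sys/block/sdX/queue/scheduler     # fast SSD/NVMe: 'none' is a common choice
```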

[ceph-users] pg repair or pg deep-scrub does not start

2021-02-02 Thread Marcel Kuiper
Hi, I've got an old cluster running ceph 10.2.11 with a filestore backend. Last week a PG was reported inconsistent with a scrub error:
# ceph health detail
HEALTH_ERR 1 pgs inconsistent; 1 scrub errors
pg 38.20 is active+clean+inconsistent, acting [1778,1640,1379]
1 scrub errors
I first tried 'ceph
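
For context, the commands usually involved on a Jewel/filestore cluster look like this; the pg id and primary OSD come from the health output above, the daemon commands must be run on the host where osd.1778 lives, and whether the repair actually starts depends on scrub scheduling limits:

```
ceph pg repair 38.20                    # ask the primary to scrub and repair the PG
ceph pg deep-scrub 38.20
rados list-inconsistent-obj 38.20 --format=json-pretty   # which objects are inconsistent

# on the host of the primary OSD: are scrub slots exhausted?
ceph daemon osd.1778 config get osd_max_scrubs
ceph daemon osd.1778 config get osd_scrub_during_recovery
```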

[ceph-users] Re: osd recommended scheduler

2021-02-02 Thread Dan van der Ster
cfq and now bfq are the only IO schedulers that implement fair share across processes, and also they are the only schedulers that implement io priorities (e.g. ionice). We run this script via rc.local on all our ceph clusters: https://gist.github.com/dvanders/968d5862f227e0dd988eb5db8fbba203

[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-02 Thread Eugen Block
Hi, I would recommend updating (again); here's my output from a 15.2.8 test cluster:
host1:~ # ceph orch ls --service_name osd.default --export
service_type: osd
service_id: default
service_name: osd.default
placement:
  hosts:
  - host4
  - host3
  - host1
  - host2
spec:
  block_db_size:

[ceph-users] XFS block size on RBD / EC vs space amplification

2021-02-02 Thread Gilles Mocellin
Hello, As we know, with 64k for bluestore_min_alloc_size_hdd (I'm only using HDDs), in certain conditions, especially with erasure coding, there's a waste of space when writing objects smaller than 64k x k (EC k+m). Every object is divided into k elements, written to different OSDs. My main us
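
A small worked example of the amplification being described, purely illustrative and assuming an EC 4+2 pool with the 64 KiB default:

```
# A 16 KiB object in a 4+2 EC pool becomes k=4 data chunks plus m=2 coding chunks;
# each chunk is tiny but still allocates bluestore_min_alloc_size_hdd = 64 KiB,
# so the object occupies 6 * 64 = 384 KiB on disk instead of roughly 24 KiB (16 * 6/4).
echo $(( (4 + 2) * 64 ))   # KiB actually allocated: 384
```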

[ceph-users] Re: osd recommended scheduler

2021-02-02 Thread mj
Hi! Interesting. Didn't know that. We are running cfq on the OSDs, and added to ceph.conf:
[osd]
osd_disk_thread_ioprio_class = idle
osd_disk_thread_ioprio_priority = 7
Since we recently switched from HDD to SSD OSDs, I guess we should also change from CFQ to noop. Is the

[ceph-users] Increasing QD=1 performance (lowering latency)

2021-02-02 Thread Wido den Hollander
Hi, There are many talks and presentations out there about Ceph's performance. Ceph is great when it comes to parallel I/O, large queue depths and many applications sending I/O towards Ceph. One thing where Ceph isn't the fastest is 4k blocks written at Queue Depth 1. Some applications benefit
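
For anyone who wants to measure the same thing, a hedged fio sketch using the rbd engine; pool, image and client names are placeholders:

```
fio --name=qd1-4k-randwrite --ioengine=rbd --clientname=admin \
    --pool=rbd --rbdname=testimg \
    --rw=randwrite --bs=4k --iodepth=1 --numjobs=1 \
    --runtime=60 --time_based
```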

[ceph-users] Re: CephFS per client monitoring

2021-02-02 Thread Venky Shankar
On Tue, Feb 2, 2021 at 1:55 PM Erwin Bogaard wrote: > > Hi, > > we're using mainly CephFS to give access to storage. > At all times we can see that all clients combines use "X MiB/s" and "y > op/s" for read and write by using the cli or ceph dashboard. > With a tool like iftop, I can get a bit of

[ceph-users] Re: db_devices doesn't show up in exported osd service spec

2021-02-02 Thread Eugen Block
Have you tried with a newer version of Ceph? There has been a major rewrite of ceph-volume in 15.2.8 [1], maybe this was already resolved? [1] https://docs.ceph.com/en/latest/releases/octopus/#notable-changes Quoting Tony Liu: Hi, When building the cluster with Octopus 15.2.5 initially, here is the

[ceph-users] Re: no device listed after adding host

2021-02-02 Thread Eugen Block
Just a note: you don't need to install any additional package to run ceph-volume: host1:~ # cephadm ceph-volume lvm list Did you resolve the missing OSDs since you posted a follow-up question? If not, did you check all the logs on the OSD host, e.g. 'journalctl -f' or ceph-volume.log in /va

[ceph-users] CephFS per client monitoring

2021-02-02 Thread Erwin Bogaard
Hi, we're using mainly CephFS to give access to storage. At all times we can see that all clients combined use "X MiB/s" and "y op/s" for read and write by using the cli or ceph dashboard. With a tool like iftop, I can get a bit of insight into which clients most data 'flows' to, but it isn't really pr
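
A hedged starting point for per-client visibility on the MDS side; the mds rank/name is a placeholder, and session listings expose per-client request counters rather than true bandwidth:

```
ceph tell mds.0 session ls            # placeholder rank: per-client sessions, ids and mount info
ceph daemon mds.<name> session ls     # same, run locally on the MDS host
ceph daemon mds.<name> perf dump      # MDS-wide counters for correlation
```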