[ceph-users] Re: Does dynamic resharding block I/Os by design?

2021-05-20 Thread Satoru Takeuchi
On Tue, May 18, 2021 at 14:09, Satoru Takeuchi wrote: > On Tue, May 18, 2021 at 9:23, Satoru Takeuchi wrote: > > Hi, > > I have a Ceph cluster used for RGW and RBD. I found that all I/Os to RGW seemed to be blocked during dynamic resharding. Could you tell me whether this behavior is by design or not?
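
For reference, resharding activity can be inspected with stock radosgw-admin commands, and dynamic resharding can be turned off if it needs to be ruled out as the cause. A minimal sketch (the bucket name is a placeholder):

  radosgw-admin reshard list                         # reshards queued or in progress
  radosgw-admin reshard status --bucket=<bucket>     # per-shard state for one bucket
  radosgw-admin bucket limit check                   # objects per shard vs. the reshard threshold
  # To rule resharding out entirely, rgw_dynamic_resharding = false can be set in the
  # [client.rgw.<name>] section of ceph.conf, followed by an RGW restart.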

[ceph-users] Application for mirror.csclub.uwaterloo.ca as an official mirror

2021-05-20 Thread Zachary Seguin
Hello, I am contacting you on behalf of the Computer Science Club of the University of Waterloo (https://csclub.uwaterloo.ca) to add our mirror (https://mirror.csclub.uwaterloo.ca) as an official mirror of the Ceph project. Our mirror is located at the University of Waterloo in Waterloo,

[ceph-users] Stray hosts and daemons

2021-05-20 Thread Vladimir Brik
I am not sure how to interpret CEPHADM_STRAY_HOST and CEPHADM_STRAY_DAEMON warnings. They seem to be inconsistent. I converted my cluster to be managed by cephadm by adopting mon and all other daemons, and they show up in ceph orch ps, but ceph health says mons are stray: [WRN]
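
For comparison, a few commands show what cephadm believes it manages versus what the health check flags as stray (daemon and host names are placeholders):

  ceph health detail                               # names the stray host/daemon
  ceph orch ps --daemon_type mon                   # mons cephadm actually manages
  ceph orch host ls                                # hosts known to cephadm
  cephadm adopt --style legacy --name mon.<host>   # re-run on the host if a mon was missed during adoption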

[ceph-users] MDS Stuck in Replay Loop (Segfault) after subvolume creation

2021-05-20 Thread Carsten Feuls
Hello, I want to test something with a CephFS subvolume: how to mount it and set a quota. After some "ceph fs" commands I got an e-mail from Prometheus that the cluster is in HEALTH_WARN. The error was that every MDS crashed with a segfault. Following is some information about my cluster. The cluster
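
For context, the subvolume workflow being tested typically looks roughly like the following (a sketch with placeholder names; it does not by itself explain the MDS segfault):

  ceph fs subvolumegroup create cephfs mygroup
  ceph fs subvolume create cephfs mysub --group_name mygroup --size 10737418240
  ceph fs subvolume getpath cephfs mysub --group_name mygroup   # path to mount relative to the cephfs root
  ceph fs subvolume info cephfs mysub --group_name mygroup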

[ceph-users] OSD's still UP after power loss

2021-05-20 Thread by morphin
Hello, I have a weird problem on a 3-node cluster ("Nautilus 14.2.9"). When I test a power failure, the OSDs are not marked DOWN and the MDS does not respond anymore. If I manually set the OSDs down, the MDS becomes active again. BTW: only 2 nodes have OSDs; the third node is only for MON. I've set
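
One thing worth checking with only two OSD hosts is the monitor's down-reporting settings: by default the mons want failure reports from at least two distinct hosts before marking OSDs down, which a two-OSD-host cluster cannot always provide. A sketch, not verified against this cluster:

  ceph config get mon mon_osd_min_down_reporters      # default 2
  ceph config get mon mon_osd_reporter_subtree_level  # default "host"
  ceph osd down <id>                                   # the manual override already used here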

[ceph-users] Re: [EXTERNAL] Re: fsck error: found stray omap data on omap_head

2021-05-20 Thread Pickett, Neale T
You are correct, even though the repair reports an error, I was able to join the disk back into the cluster, and it stopped reporting the legacy omap warning. I had assumed an "error" was something that needed to be rectified before anything could proceed, but apparently it's more like
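
For anyone following along, the check/repair cycle referred to here is the ceph-bluestore-tool one, run with the OSD stopped (a sketch for a package-based OSD; the id is a placeholder):

  systemctl stop ceph-osd@<id>
  ceph-bluestore-tool fsck   --path /var/lib/ceph/osd/ceph-<id>
  ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<id>
  systemctl start ceph-osd@<id>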

[ceph-users] Re: ceph df: pool stored vs bytes_used -- raw or not?

2021-05-20 Thread Dan van der Ster
I can confirm that we still occasionally see stored==used even with 14.2.21, but I haven't had time yet to debug the pattern behind the observations. I'll let you know if we find anything useful. .. Dan On Thu, May 20, 2021, 6:56 PM Konstantin Shalygin wrote: > On 20 May 2021, at

[ceph-users] Re: ceph df: pool stored vs bytes_used -- raw or not?

2021-05-20 Thread Igor Fedotov
This patch (https://github.com/ceph/ceph/pull/38354) should be present in Nautilus starting with v14.2.21. Perhaps you're facing a different issue; could you please share the "ceph osd tree" output? Thanks, Igor On 5/19/2021 6:18 PM, Konstantin Shalygin wrote: Dan, Igor, it seems this wasn't
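
The outputs being compared and requested in this thread are, roughly (a sketch):

  ceph df detail    # per-pool STORED vs USED
  ceph osd tree     # requested above, to check the CRUSH layout and device classes
  ceph versions     # confirms every OSD is really running 14.2.21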

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Dan van der Ster
Hold on: 8+4 needs 12 osds but you only show 10 there. Shouldn't you choose 6 type host and then chooseleaf 2 type osd? .. Dan On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote: > Hallo Dan, Bryan, > I have a rule similar to yours, for an 8+4 pool, with only > difference that I
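
For an 8+4 profile spread over 6 hosts with exactly 2 chunks per host, the rule Dan describes would look roughly like this (a sketch only, not a tested rule; the id and name are placeholders):

  rule ec-8-4-two-per-host {
          id 7
          type erasure
          min_size 3
          max_size 12
          step set_chooseleaf_tries 5
          step set_choose_tries 100
          step take default
          step choose indep 6 type host
          step chooseleaf indep 2 type osd
          step emit
  }

The 6+2 case from the subject would use "step choose indep 4 type host" instead (4 hosts x 2 OSDs = 8 chunks).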

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Nathan Fish
The obvious thing to do is to set 4+2 instead - is that not an option? On Wed, May 12, 2021 at 11:58 AM Bryan Stillwell wrote: > > I'm trying to figure out a CRUSH rule that will spread data out across my > cluster as much as possible, but not more than 2 chunks per host. > > If I use the
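
For completeness, a 4+2 profile with host failure domain, as suggested, would be created along these lines (a sketch; profile and pool names are placeholders):

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd pool create mypool 128 128 erasure ec-4-2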

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Dan van der Ster
Hi Fulvio, That's strange... It doesn't seem right to me. Are there any upmaps for that PG? ceph osd dump | grep upmap | grep 116.453 Cheers, Dan On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote: > Hallo Dan, Bryan, > I have a rule similar to yours, for an 8+4 pool, with only
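
If an upmap entry does turn out to be pinning that PG, it can be listed and removed like this (a sketch):

  ceph osd dump | grep 'pg_upmap_items 116.453'
  ceph osd rm-pg-upmap-items 116.453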

[ceph-users] mgr+Prometheus/grafana (+consul)

2021-05-20 Thread Jeremy Austin
I recently configured Prometheus to scrape the mgr /metrics endpoint and added Grafana dashboards. All daemons are at 15.2.11. I use HashiCorp Consul to advertise the active mgr in DNS, and Prometheus points at a single DNS target. (Is anyone else using this method, or just statically pointing Prometheus at all
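
For anyone reproducing the setup, the mgr side is just the prometheus module plus its listen address and port; Prometheus (or Consul) then only has to find the active mgr. A sketch:

  ceph mgr module enable prometheus
  ceph config set mgr mgr/prometheus/server_addr 0.0.0.0
  ceph config set mgr mgr/prometheus/server_port 9283
  ceph mgr services     # prints the URL of the active mgr's /metrics endpoint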

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread Sebastian Luna Valero
Hi Eugen, Here it is: # ceph mgr module ls | jq -r '.enabled_modules[]' cephadm dashboard diskprediction_local iostat prometheus restful Should "crash" and "orchestrator" be part of the list? Why would they have disappeared in the first place? Best regards, Sebastian On Thu, 20 May 2021 at
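
Note that on 15.x several modules, including crash and orchestrator, are "always on" and therefore do not appear under enabled_modules; the cephadm backend still has to be enabled and selected. A sketch of what to check (assuming the commands respond):

  ceph mgr module ls | jq -r '.always_on_modules[]'   # crash and orchestrator normally show up here
  ceph orch status                                    # reports the configured backend
  ceph mgr module enable cephadm                      # only if cephadm were missing from enabled_modules
  ceph orch set backend cephadm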

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread Eugen Block
Which mgr modules are enabled? Can you share (if it responds): ceph mgr module ls | jq -r '.enabled_modules[]' We have checked the call made from the container by checking the DEBUG logs and I see that it is correct; some commands work but others hang: Do you see those shell sessions on

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
Reading through the bug tracker: https://tracker.ceph.com/issues/50293 Thanks for your patience. On Thu, May 20, 2021 at 15:10, Boris Behrens wrote: > I try to bump it once more, because it makes finding orphan objects nearly impossible. > > On Tue, May 11, 2021 at 13:03,

[ceph-users] Re: "radosgw-admin bucket radoslist" loops when a multipart upload is happening

2021-05-20 Thread Boris Behrens
I try to bump it once more, because it makes finding orphan objects nearly impossible. On Tue, May 11, 2021 at 13:03, Boris Behrens wrote: > Hi all, > I am still searching for orphan objects and came across a strange bug: > There is a huge multipart upload happening (around 4 TB), and
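
For reference, the workflow being blocked here is roughly a per-bucket radoslist pass whose output is compared against the data pool contents; objects in the pool that no bucket claims are orphan candidates (a sketch; bucket and pool names are placeholders, and newer releases ship an rgw-orphan-list script that automates much of this):

  radosgw-admin bucket radoslist --bucket=<bucket> > expected_rados_objects.txt
  rados -p default.rgw.buckets.data ls > actual_rados_objects.txt
  # anything in actual_rados_objects.txt that appears in no bucket's radoslist output is an orphan candidate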

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread ManuParra
Hi Eugen, thank you very much for your reply. I'm Manuel, a colleague of Sebastián. To complete what you asked us: we have checked more ceph commands, not only ceph crash and ceph orch, and many other commands hang in the same way: [spsrc-mon-1 ~]# cephadm shell -- ceph pg stat hangs forever
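
When every mgr-routed command hangs like this, a common first step is to fail over to a standby mgr and inspect its log (a sketch; names are placeholders):

  ceph mgr dump | jq -r '.active_name'   # works as long as the mons respond
  ceph mgr fail <active_mgr_name>        # forces a standby to take over
  cephadm logs --name mgr.<host>         # run on the new active mgr's host to inspect its log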

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Fulvio Galeazzi
Hello Dan, Bryan, I have a rule similar to yours, for an 8+4 pool, with the only difference being that I replaced the second "choose" with "chooseleaf", which I understand should make no difference: rule default.rgw.buckets.data { id 6 type erasure min_size 3

[ceph-users] Re: fsck error: found stray omap data on omap_head

2021-05-20 Thread Igor Fedotov
I think there is no way to fix that at the moment other than to manually identify and remove the relevant record(s) in RocksDB with ceph-kvstore-tool, which might be pretty tricky... It looks like we should implement removal of these stray records when repairing BlueStore... On 5/19/2021 11:12 PM,
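
For the record, the manual surgery Igor mentions would be done with the OSD stopped, along these lines (a sketch only; the exact prefix and key must be identified first, and a mistake at this level can damage the OSD):

  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> list <prefix>      # inspect before touching anything
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> rm <prefix> <key>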

[ceph-users] Bucket index OMAP keys unevenly distributed among shards

2021-05-20 Thread James, GleSYS
Hi, we're running 15.2.7 and our cluster is warning us about LARGE_OMAP_OBJECTS (1 large omap objects). Here is what the distribution looks like for the bucket in question, and as you can see all but 3 of the keys reside in shard 2. .dir.5a5c812a-3d31-4d79-87e6-1a17206228ac.18635192.221.0
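
Some commands that help map a large index object back to its bucket and shard layout (a sketch; names are placeholders):

  ceph health detail                                  # names the pool and object with the large omap
  radosgw-admin bucket limit check                    # objects per shard for every bucket
  radosgw-admin bucket stats --bucket=<bucket>        # overall object count for the bucket
  rados -p <index-pool> listomapkeys .dir.<marker>.<shard-id> | wc -l   # keys held by one shard object
  # radosgw-admin reshard add --bucket=<bucket> --num-shards=<n> queues a manual reshard if more shards are needed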

[ceph-users] Re: ceph-ansible in Pacific and beyond?

2021-05-20 Thread Gregory Orange
Hi, On 19/3/21 1:11 pm, Stefan Kooman wrote: Is it going to continue to be supported? We use it (and uncontainerised packages) for all our clusters, so I'd be a bit alarmed if it was going to go away... Just a reminder to all of you. Please fill in the Ceph-user survey and make your

[ceph-users] Re: ceph orch status hangs forever

2021-05-20 Thread Eugen Block
Hi, HEALTH_WARN 2 failed cephadm daemon(s) [WRN] CEPHADM_FAILED_DAEMON: 2 failed cephadm daemon(s) daemon mon.spsrc-mon-1-safe on spsrc-mon-1 is in error state daemon mon.spsrc-mon-2-safe on spsrc-mon-2 is in error state I don't think these containers are crucial, right? I did ask a
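
To see why the two extra mon containers are in an error state, the usual cephadm-side checks are (a sketch; run on the host in question, using the daemon names from the health output):

  cephadm ls | jq '.[] | select(.name=="mon.spsrc-mon-1-safe")'
  cephadm logs --name mon.spsrc-mon-1-safe
  ceph orch daemon restart mon.spsrc-mon-1-safe   # once the orchestrator responds again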