[ceph-users] Help with setting-up Influx MGR module: ERROR - queue is full

2024-02-13 Thread Fulvio Galeazzi
... these matters, namely storing configuration and metrics "somewhere"? Thanks a lot! (for your patience in reading this, at least) Fulvio
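Enabling the influx mgr module and pointing it at an InfluxDB instance generally looks like the sketch below; the hostname, database and interval values are placeholders, and the exact mgr/influx/* keys should be double-checked against the docs for the release in use:

    ceph mgr module enable influx
    ceph config set mgr mgr/influx/hostname influxdb.example.org   # placeholder InfluxDB host
    ceph config set mgr mgr/influx/database ceph                   # target database name
    ceph config set mgr mgr/influx/interval 30                     # seconds between pushes of queued stats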

[ceph-users] Re: How to configure something like osd_deep_scrub_min_interval?

2024-01-08 Thread Fulvio Galeazzi
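The scrub-interval options usually tuned for this kind of requirement, as a hedged sketch (the option names are standard OSD settings, the values are examples only and not taken from the thread):

    ceph config set osd osd_scrub_min_interval 86400       # do not scrub a PG more often than once a day
    ceph config set osd osd_scrub_max_interval 604800      # force a shallow scrub at least weekly
    ceph config set osd osd_deep_scrub_interval 1209600    # deep scrub roughly every two weeks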

[ceph-users] Re: Question about recovery priority

2022-09-23 Thread Fulvio Galeazzi
...whether it's worth trying different m+n.) Thanks again! Fulvio Josh On Thu, Sep 22, 2022 at 6:35 AM Fulvio Galeazzi wrote: Hallo all, taking advantage of the redundancy of my EC pool, I destroyed a couple of servers in order to reinstall them with a new operating system...
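The commands usually suggested for nudging recovery order, as a hedged sketch (available since Luminous; <pgid> is a placeholder):

    ceph pg force-recovery <pgid>          # move a specific PG to the front of the recovery queue
    ceph pg force-backfill <pgid>          # same idea for backfill
    ceph pg cancel-force-recovery <pgid>   # undo the priority bump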

[ceph-users] Question about recovery priority

2022-09-22 Thread Fulvio Galeazzi
..."guide" the process? Thanks for your hints Fulvio

[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
...you may also need to "ceph pg repeer $pgid" for each of the PGs stuck activating. Josh On Thu, Sep 15, 2022 at 8:42 AM Fulvio Galeazzi wrote: > Hallo, I am on Nautilus and today, after upgrading...
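A minimal sketch of that repeer step, assuming the stuck PG IDs are taken from ceph pg dump_stuck:

    # repeer every PG currently reported as stuck in 'activating'
    for pgid in $(ceph pg dump_stuck inactive 2>/dev/null | awk '/activating/ {print $1}'); do
        ceph pg repeer "$pgid"
    done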

[ceph-users] Re: Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
..."ceph pg repeer $pgid" for each of the PGs stuck activating. Josh On Thu, Sep 15, 2022 at 8:42 AM Fulvio Galeazzi wrote: Hallo, I am on Nautilus and today, after upgrading the operating system (from CentOS 7 to CentOS 8 Stream) on a couple of OSD servers and adding them back to the cluster, I noticed some...

[ceph-users] Nautilus: PGs stuck "activating" after adding OSDs. Please help!

2022-09-15 Thread Fulvio Galeazzi
...https://pastebin.ubuntu.com/p/VWhT7FWf6m/ ceph osd lspools ; ceph pg dump_stuck inactive https://pastebin.ubuntu.com/p/9f6rXRYMh4/ Thanks a lot! Fulvio

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-04 Thread Fulvio Galeazzi
...if you use that type, it can break things. But since you're not actually using "storage" at the moment, it probably isn't causing any issue. So -- could you go ahead with that chooseleaf fix then let us know how it goes? Cheers, Dan On Mon, Apr 4, 2022 at 10:01 AM Fulvio Gale...
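Applying a chooseleaf change of that kind is normally done by round-tripping the CRUSH map; a sketch under the assumption that the rule is edited by hand (file names are placeholders):

    ceph osd getcrushmap -o crush.bin      # dump the current CRUSH map
    crushtool -d crush.bin -o crush.txt    # decompile to editable text
    # edit the relevant rule in crush.txt (e.g. switch a 'step choose ...' to 'step chooseleaf ...')
    crushtool -c crush.txt -o crush.new    # recompile
    ceph osd setcrushmap -i crush.new      # inject the updated map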

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-04 Thread Fulvio Galeazzi
...you can safely ignore as they are no longer present in any crush_rule. I think they may be relevant, as mentioned earlier. Please also don't worry about the funny weights, as I am preparing for hardware replacement and am freeing up space. As a general rule, never drain osds (never decrease...

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Fulvio Galeazzi
...degraded, remapped, or whatever! They must all be active+clean to consider big changes like injecting a new crush rule!! Ok, now I think I learned it. In my mind it was a sort of optimization: as I was moving stuff around due to the additional servers, why not update the crush rule at the same time...
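A quick pre-flight check for that precondition (a small sketch; nothing cluster-specific assumed):

    ceph -s                       # overall health and PG state summary
    ceph pg dump_stuck unclean    # should list no PGs before injecting a new crush rule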

[ceph-users] Re: PG down, due to 3 OSD failing

2022-04-01 Thread Fulvio Galeazzi
Ciao Dan, thanks for your time! So you are suggesting that my problems with PG 85.25 may somehow resolve if I manage to bring up the three OSDs currently "down" (possibly due to PG 85.12, and other PGs)? Looking for the string 'start interval does not contain the required bound' I found

[ceph-users] Re: PG down, due to 3 OSD failing

2022-03-30 Thread Fulvio Galeazzi
...which one is the right copy? Thanks! Fulvio On 3/29/2022 9:35 AM, Fulvio Galeazzi wrote: Thanks a lot, Dan! > The EC pgs have a naming convention like 85.25s1 etc. for the various > k/m EC shards. That was the bit of information I was missing... I was looking for the wrong object.

[ceph-users] Re: PG down, due to 3 OSD failing

2022-03-29 Thread Fulvio Galeazzi
...Fulvio On 3/29/2022 9:35 AM, Fulvio Galeazzi wrote: Thanks a lot, Dan! > The EC pgs have a naming convention like 85.25s1 etc. for the various > k/m EC shards. That was the bit of information I was missing... I was looking for the wrong object. I can now go on and export/import that...
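A rough sketch of that export/import step with ceph-objectstore-tool; the destination data path is an assumption (only /var/lib/ceph/osd/cephpa1-158 appears in the thread), and both OSD daemons must be stopped while the tool runs:

    # on the source OSD host, daemon stopped: export the shard
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158 \
        --pgid 85.25s1 --op export --file /tmp/85.25s1.export
    # on the destination OSD host, daemon stopped: import it
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-200 \
        --op import --file /tmp/85.25s1.export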

[ceph-users] Re: PG down, due to 3 OSD failing

2022-03-29 Thread Fulvio Galeazzi
...Thanks again! Fulvio On 28/03/2022 16:27, Dan van der Ster wrote: Hi Fulvio, You can check (offline) which PGs are on an OSD with the list-pgs op, e.g. ceph-objectstore-tool --data-path /var/lib/ceph/osd/cephpa1-158/ --op list-pgs -- dan On Mon, Mar 28, 2022 at 2:29 PM Fulvio Galeazzi wrote:

[ceph-users] PG down, due to 3 OSD failing

2022-03-28 Thread Fulvio Galeazzi
... { "osd": 90, "shard": 0 }, { "osd": 91, "shard": 4 }, ...
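That fragment looks like the per-shard osd/shard pairs an EC PG reports in its JSON status; assuming the PG 85.25 discussed later in the thread, the full output it appears to come from can be regenerated with:

    ceph pg 85.25 query > pg-85.25.json   # detailed JSON state for the PG, including per-shard peer information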

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-06-04 Thread Fulvio Galeazzi
...crush rule? Or perhaps you're running an old version of Ceph which had a buggy balancer implementation? Cheers, Dan On Thu, May 27, 2021 at 5:16 PM Fulvio Galeazzi wrote: Hallo Dan, Nathan, thanks for your replies and apologies for my silence. Sorry, I had made a typo... the rule i...

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-27 Thread Fulvio Galeazzi
...chooseleaf 2 type osd? .. Dan On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi wrote: Hallo Dan, Bryan, I have a rule similar to yours, for an 8+4 pool, with the only difference that I replaced the second "choose" with "chooseleaf"...
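For context, the pattern being discussed for an EC 6+2 pool spread two shards per host usually looks like the rule below; the rule name, id and exact step values are a sketch, not taken from the thread:

    rule ec62_two_per_host {
        id 10                                # example rule id
        type erasure
        step set_chooseleaf_tries 5
        step set_choose_tries 100
        step take default
        step choose indep 4 type host        # pick 4 distinct hosts
        step chooseleaf indep 2 type osd     # then 2 OSDs (leaves) in each, 8 shards total
        step emit
    }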

[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-05-20 Thread Fulvio Galeazzi
... OSDs in each. Then a normal host-wise rule should work. Cheers, Dan

[ceph-users] Kubernetes Luminous client acting on Nautilus pool: protocol feature mismatch: missing 200000 (CEPH_FEATURE_MON_GV ?)

2020-10-07 Thread Fulvio Galeazzi
...upgrading Ceph packages on Kubernetes workers (now at Luminous) would help, maybe? Thanks! Fulvio
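For diagnosing this kind of feature mismatch, the usual first step is to compare what each side advertises (a hedged sketch; interpretation of the output is cluster-specific):

    ceph features    # per group (mon/osd/client): the release name and feature bitmask each connected party reports
    # groups still reporting an older release may be missing bits that a newer cluster requires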

[ceph-users] Re: Luminous, OSDs down: "osd init failed" and "failed to load OSD map for epoch ... got 0 bytes"

2020-05-27 Thread Fulvio Galeazzi
...update to Nautilus in 1 month or so, so I decided to consider it as "history". Thanks again, Dan, for your help! Fulvio On 5/22/2020 10:43 PM, Fulvio Galeazzi wrote: Hallo Dan, thanks for your patience! On 5/22/2020 1:57 PM, Dan van der Ster wrote...

[ceph-users] Re: Luminous, OSDs down: "osd init failed" and "failed to load OSD map for epoch ... got 0 bytes"

2020-05-22 Thread Fulvio Galeazzi
...input/output error ... still going on, but I am not confident it will end up in anything good, we will see. Thanks! Fulvio -- dan On Fri, May 22, 2020 at 1:02 PM Fulvio Galeazzi wrote: Hallo Dan, thanks for your reply! Very good to know about compression...

[ceph-users] Re: Luminous, OSDs down: "osd init failed" and "failed to load OSD map for epoch ... got 0 bytes"

2020-05-22 Thread Fulvio Galeazzi
...Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.usage'
Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users'
Error ENOENT: option 'compression_mode' is not set on pool 'testrgw.users.email'
Error ENOENT: opti...
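Messages of that form are what a per-pool option query prints when the option was never set on the pool; a hedged sketch of checking and, if wanted, setting it (pool names follow the testrgw.* examples above):

    ceph osd pool get testrgw.usage compression_mode                 # ENOENT just means the pool inherits the global default
    ceph osd pool set testrgw.usage compression_mode aggressive      # example: enable per-pool compression explicitly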

[ceph-users] Luminous, OSDs down: "osd init failed" and "failed to load OSD map for epoch ... got 0 bytes"

2020-05-21 Thread Fulvio Galeazzi
Hallo all, hope you can help me with some very strange problems which arose suddenly today. I tried searching, also in this mailing list, but could not find anything relevant. At some point today, without any action on my side, I noticed that some OSDs in my production cluster would go down and never...