[ceph-users] Re: Is there a way to find out which client uses which version of ceph?

2023-12-21 Thread Simon Oosthoek
"release": "luminous", "num": 147 } ], "mgr": [ { "features": "0x3f01cfbf7ffd", "release": "luminous"
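The fragment above looks like output from ceph features. To see which individual clients still report an old release, one option is to list the monitor sessions, which include client addresses and feature bits (a sketch; the mon name is a placeholder):

  # summary of connected clients per reported feature release
  ceph features

  # per-session detail on a monitor, including client addresses
  ceph daemon mon.ceph-mon1 sessions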

[ceph-users] Is there a way to find out which client uses which version of ceph?

2023-12-21 Thread Simon Oosthoek
Hi, Our cluster is currently running quincy, and I want to set the minimal client version to luminous, to enable upmap balancer, but when I tried to, I got this: # ceph osd set-require-min-compat-client luminous Error EPERM: cannot set require_min_compat_client to luminous: 2 connected client(s)
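A quick way to see how many connected clients report each release, and, only if cutting off the remaining old clients is acceptable, to force the setting anyway, might look like this sketch:

  # count connected clients per reported release
  ceph features

  # force the minimum despite older connected clients (they will be dropped)
  ceph osd set-require-min-compat-client luminous --yes-i-really-mean-it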

[ceph-users] planning upgrade from pacific to quincy

2023-11-16 Thread Simon Oosthoek
Hi All (apologies if you get this again, I suspect mails from my @science.ru.nl account get dropped by most receiving mail servers, due to the strict DMARC policy (p=reject) in place) after a long while being in health_err state (due to an unfound object, which we eventually decided to "forget"), we

[ceph-users] planning upgrade from pacific to quincy

2023-11-15 Thread Simon Oosthoek
Hi All (apologies if you get this twice, I suspect mails from my @science.ru.nl account get dropped by most receiving mail servers, due to the strict DMARC policies in place) after a long while being in health_err state (due to an unfound object, which we eventually decided to "forget"), we

[ceph-users] planning upgrade from pacific to quincy

2023-11-15 Thread Simon Oosthoek
Hi All after a long while being in health_err state (due to an unfound object, which we eventually decided to "forget"), we are now planning to upgrade our cluster which is running Pacific (at least on the mons/mdss/osds, the gateways are by accident running quincy already). The installation
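For a package-based cluster, the usual upgrade order is a rough sketch like the following (not taken from the thread; restart mons first, then mgrs, osds, and finally mds/rgw, one node at a time):

  ceph osd set noout                        # avoid rebalancing while daemons restart
  apt update && apt install --only-upgrade ceph-common ceph-mon ceph-mgr ceph-osd ceph-mds
  systemctl restart ceph-mon@$(hostname -s) # per node, mons first
  ceph versions                             # confirm every daemon reports 17.2.x
  ceph osd require-osd-release quincy       # once all OSDs run quincy
  ceph osd unset noout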

[ceph-users] compounded problems interfering with recovery

2023-10-08 Thread Simon Oosthoek
Hi, we're still struggling to get our ceph cluster to HEALTH_OK. We have compounded issues interfering with recovery, as I understand it. To summarize, we have a cluster of 22 osd nodes running ceph 16.2.x. About a month back we had one of the OSDs break down (just the OS disk, but we

[ceph-users] Re: cannot repair a handful of damaged pg's

2023-10-06 Thread Simon Oosthoek
On Fri, Oct 6, 2023 at 11:02 AM Simon Oosthoek wrote: On 06/10/2023 16:09, Simon Oosthoek wrote: Hi, we're still in

[ceph-users] Re: cannot repair a handful of damaged pg's

2023-10-06 Thread Simon Oosthoek
On 06/10/2023 16:09, Simon Oosthoek wrote: Hi we're still in HEALTH_ERR state with our cluster, this is the top of the output of `ceph health detail` HEALTH_ERR 1/846829349 objects unfound (0.000%); 248 scrub errors; Possible data damage: 1 pg recovery_unfound, 2 pgs inconsistent; Degraded

[ceph-users] cannot repair a handful of damaged pg's

2023-10-06 Thread Simon Oosthoek
Hi we're still in HEALTH_ERR state with our cluster, this is the top of the output of `ceph health detail` HEALTH_ERR 1/846829349 objects unfound (0.000%); 248 scrub errors; Possible data damage: 1 pg recovery_unfound, 2 pgs inconsistent; Degraded data redundancy: 6/7118781559 objects
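The commands typically involved when chasing inconsistent and unfound PGs look roughly like this (the pgid is a placeholder; mark_unfound_lost is a last resort that discards data):

  ceph health detail                          # lists the affected PG ids
  rados list-inconsistent-obj <pgid> --format=json-pretty
  ceph pg repair <pgid>                       # ask the primary to repair the PG
  ceph pg <pgid> list_unfound                 # inspect the unfound object
  ceph pg <pgid> mark_unfound_lost revert     # or 'delete'; last resort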

[ceph-users] Re: ceph osd down doesn't seem to work

2023-10-03 Thread Simon Oosthoek
On Tue, Oct 3, 2023 at 10:14 AM Simon Oosthoek wrote: Hi I'm trying to mark one OSD as down, so we can clean it out and replace it. It keeps getting medium read errors, so it's bound to fail sooner rather than later. When I command ceph from the mon to mark the osd down, it doesn't actually do

[ceph-users] ceph osd down doesn't seem to work

2023-10-03 Thread Simon Oosthoek
Hi I'm trying to mark one OSD as down, so we can clean it out and replace it. It keeps getting medium read errors, so it's bound to fail sooner rather than later. When I command ceph from the mon to mark the osd down, it doesn't actually do it. When the service on the osd stops, it is also
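Marking an OSD down from a mon is only transient while its daemon keeps running and re-asserts itself; draining a failing disk usually looks more like this sketch (osd id is a placeholder):

  ceph osd down osd.12              # transient: a running daemon comes back up
  systemctl stop ceph-osd@12        # on the OSD host: actually stop the daemon
  ceph osd out osd.12               # start draining its data to other OSDs
  ceph osd safe-to-destroy osd.12   # check before final removal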

[ceph-users] Re: v16.2.12 Pacific (hot-fix) released

2023-04-24 Thread Simon Oosthoek
Dear List, we upgraded to 16.2.12 on April 17th; since then we've seen some unexplained downed osd services in our cluster (264 osds). Is there any risk of data loss? If so, would it be possible to downgrade, or is a fix expected soon? And if so, when? ;-) FYI, we are running a cluster without
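To investigate unexplained OSD daemon exits, the crash module and the unit logs are the usual starting points (osd id and crash id are placeholders):

  ceph crash ls                    # recent daemon crashes recorded by the crash module
  ceph crash info <crash-id>       # backtrace and metadata for one crash
  journalctl -u ceph-osd@42 -e     # systemd log of the affected OSD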

[ceph-users] dashboard version of ceph versions shows N/A

2022-12-01 Thread Simon Oosthoek
Dear list Yesterday we updated our ceph cluster from 15.2.17 to 16.2.10 using packages. Our cluster is a mix of ubuntu 18 and ubuntu 20 with ceph coming from packages in the ceph.com repo. All went well and we now have all nodes running Pacific. However, there's something odd in the
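When the dashboard shows stale or missing version data after an upgrade, a common first step (an assumption, not necessarily what resolved this thread) is to compare with the CLI and restart the dashboard module or fail over the mgr:

  ceph versions                     # per-daemon version summary from the CLI
  ceph mgr module disable dashboard
  ceph mgr module enable dashboard
  ceph mgr fail                     # fail over to a standby mgr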

[ceph-users] Re: how to upgrade host os under ceph

2022-10-28 Thread Simon Oosthoek
issue. Also useful to have at our fingertips when e.g. the OS disk fails for some reason. So that would be my reason to still want to upgrade, even though there may not be an urgent reason... Cheers /Simon On Oct 27, 2022, at 03:16, Simon Oosthoek wrote: Dear list thanks for the answers

[ceph-users] Re: how to upgrade host os under ceph

2022-10-27 Thread Simon Oosthoek
elease for focal. Reed > On Oct 26, 2022, at 9:14 AM, Simon Oosthoek wrote: > Dear list, > I'm looking for some guide or pointers to how people upgrade the underlying host OS in a ceph

[ceph-users] how to upgrade host os under ceph

2022-10-26 Thread Simon Oosthoek
Dear list, I'm looking for some guide or pointers to how people upgrade the underlying host OS in a ceph cluster (if this is the right way to proceed, I don't even know...) Our cluster is nearing 4.5 years of age and our ubuntu 18.04 is now nearing its end-of-support date. We have a
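One node at a time, the usual pattern for an in-place OS upgrade under a running cluster is roughly (a sketch, assuming the ceph.com repository for the new release is configured beforehand):

  ceph osd set noout                # keep data in place while the node is down
  systemctl stop ceph-osd.target    # on the node being upgraded
  do-release-upgrade                # Ubuntu 18.04 -> 20.04 in place
  # reboot, confirm the OSDs rejoin and the cluster is healthy, then:
  ceph osd unset noout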

[ceph-users] Re: post-mortem of a ceph disruption

2022-10-26 Thread Simon Oosthoek
On 26/10/2022 10:57, Stefan Kooman wrote: On 10/25/22 17:08, Simon Oosthoek wrote: At this point, one of us noticed that a strange IP address was mentioned: 169.254.0.2; it turns out that a recently added package (openmanage) and some configuration had added this interface and address

[ceph-users] post-mortem of a ceph disruption

2022-10-25 Thread Simon Oosthoek
Dear list, recently we experienced a short outage of our ceph storage. It had a surprising cause and probably indicates a subtle misconfiguration on our part; I'm hoping for a useful suggestion ;-) We are running a 3PB cluster with 21 osd nodes (spread across 3 datacenters), 3 mon/mgrs and
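To check which addresses the daemons actually advertise, and to pin the intended networks explicitly so a stray link-local interface cannot be picked up, a sketch might be (subnets are placeholders; mon addresses additionally live in the monmap):

  ceph mon dump                 # addresses the monitors advertise
  ceph osd dump | grep "^osd"   # addresses registered per OSD
  ceph config set global public_network 192.168.10.0/24
  ceph config set global cluster_network 192.168.20.0/24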

[ceph-users] crush rule for 4 copy over 3 failure domains?

2021-12-17 Thread Simon Oosthoek
Dear ceph users, Since recently we have 3 locations with ceph osd nodes. For 3-copy pools it is trivial to create a crush rule that uses all 3 datacenters for each block, but 4-copy is harder. Our current "replicated" rule is this: rule replicated_rule { id 0 type replicated
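Whatever rule ends up being right, the usual edit-and-test loop with crushtool (file names and rule id are placeholders) is:

  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt     # decompile to edit the rule
  # edit crushmap.txt, then recompile and test before injecting:
  crushtool -c crushmap.txt -o crushmap.new
  crushtool -i crushmap.new --test --num-rep 4 --rule 1 --show-mappings | tail
  ceph osd setcrushmap -i crushmap.new          # only once the mappings look right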

[ceph-users] Re: crushtool -i; more info from output?

2021-12-09 Thread Simon Oosthoek
13:23, Simon Oosthoek wrote: On 02/12/2021 10:20, Simon Oosthoek wrote: Dear ceph-users, We want to optimise our crush rules further and to test adjustments without impact to the cluster, we use crushtool to show the mappings. eg: crushtool -i crushmap.16  --test --num-rep 4 --show-mappings

[ceph-users] Re: crushtool -i; more info from output?

2021-12-02 Thread Simon Oosthoek
On 02/12/2021 10:20, Simon Oosthoek wrote: Dear ceph-users, We want to optimise our crush rules further and to test adjustments without impact to the cluster, we use crushtool to show the mappings. eg: crushtool -i crushmap.16  --test --num-rep 4 --show-mappings --rule 0|tail -n 10 CRUSH

[ceph-users] crushtool -i; more info from output?

2021-12-02 Thread Simon Oosthoek
Dear ceph-users, We want to optimise our crush rules further and to test adjustments without impact to the cluster, we use crushtool to show the mappings. eg: crushtool -i crushmap.16 --test --num-rep 4 --show-mappings --rule 0|tail -n 10 CRUSH rule 0 x 1014 [121,125,195,197] CRUSH rule 0
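Besides --show-mappings, crushtool has two other outputs that are useful when evaluating a candidate rule offline:

  # show only mappings that do not yield the requested number of replicas
  crushtool -i crushmap.16 --test --num-rep 4 --rule 0 --show-bad-mappings

  # show how evenly the rule spreads data over the devices
  crushtool -i crushmap.16 --test --num-rep 4 --rule 0 --show-utilization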

[ceph-users] Re: ceph-ansible and crush location

2021-11-09 Thread Simon Oosthoek
On 03/11/2021 16:03, Simon Oosthoek wrote: > On 03/11/2021 15:48, Stefan Kooman wrote: >> On 11/3/21 15:35, Simon Oosthoek wrote: >>> Dear list, >>> >>> I've recently found it is possible to supply ceph-ansible with >>> information about a cru

[ceph-users] Re: ceph-ansible and crush location

2021-11-03 Thread Simon Oosthoek
On 03/11/2021 15:48, Stefan Kooman wrote: On 11/3/21 15:35, Simon Oosthoek wrote: Dear list, I've recently found it is possible to supply ceph-ansible with information about a crush location, however I fail to understand how this is actually used. It doesn't seem to have any effect when

[ceph-users] ceph-ansible and crush location

2021-11-03 Thread Simon Oosthoek
Dear list, I've recently found it is possible to supply ceph-ansible with information about a crush location, however I fail to understand how this is actually used. It doesn't seem to have any effect when creating a cluster from scratch (I'm testing on a bunch of vm's generated by vagrant and
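Independent of ceph-ansible, a crush location can be set per OSD host in ceph.conf (read when the OSD starts and updates its crush position) or imposed afterwards from the CLI; a sketch, with bucket names as placeholders:

  # ceph.conf on the OSD host, [osd] section:
  #   crush location = root=default datacenter=dc1 host=cephnode01

  # or move an existing host bucket from the CLI:
  ceph osd crush move cephnode01 datacenter=dc1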

[ceph-users] ceph-ansible stable-5.0 repository must be quincy?

2021-10-20 Thread Simon Oosthoek
Hi we're trying to get ceph-ansible working again for our current version of ceph (octopus), in order to be able to add some osd nodes to our cluster. (Obviously there's a longer story here, but just a quick question for now...) When we add in all.yml ceph_origin: repository

[ceph-users] Re: upgrade problem nautilus 14.2.15 -> 14.2.18? (Broken ceph!)

2021-04-09 Thread Simon Oosthoek
On 25/03/2021 21:08, Simon Oosthoek wrote: > I'll wait a bit before upgrading the remaining nodes. I hope 14.2.19 will be available quickly. Hi Dan, Just FYI, I upgraded the cluster this week to 14.2.19 and all systems are good now. I've removed the workaround configuration

[ceph-users] Re: upgrade problem nautilus 14.2.15 -> 14.2.18? (Broken ceph!)

2021-03-25 Thread Simon Oosthoek
ceph on ubuntu 18.04 LTS this seems to be happening on the mons/mgrs and osds Cheers /Simon -- dan On Thu, Mar 25, 2021 at 8:34 PM Simon Oosthoek wrote: Hi I'm in a bit of a panic :-( Recently we started attempting to configure a radosgw to our ceph cluster, which was until now only doing

[ceph-users] Re: upgrade problem nautilus 14.2.15 -> 14.2.18? (Broken ceph!)

2021-03-25 Thread Simon Oosthoek
On 25/03/2021 20:42, Dan van der Ster wrote: netstat -anp | grep LISTEN | grep mgr
# netstat -anp | grep LISTEN | grep mgr
tcp   0   0 127.0.0.1:6801   0.0.0.0:*   LISTEN   1310/ceph-mgr
tcp   0   0 127.0.0.1:6800   0.0.0.0:*   LISTEN   1310/ceph-mgr
tcp6
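If the mgr ends up bound to 127.0.0.1 as above, one possible workaround (an assumption, not necessarily the fix used in this thread; address and daemon name are placeholders) is to pin its public address and restart it:

  ceph config set mgr.cephmon01 public_addr 10.0.0.21
  systemctl restart ceph-mgr.target
  ss -tlnp | grep ceph-mgr          # verify the new bind address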

[ceph-users] upgrade problem nautilus 14.2.15 -> 14.2.18? (Broken ceph!)

2021-03-25 Thread Simon Oosthoek
Hi, I'm in a bit of a panic :-( Recently we started attempting to configure a radosgw for our ceph cluster, which was until now only doing cephfs (and rbd was working as well). We were messing about with ceph-ansible, as this was how we originally installed the cluster. Anyway, it installed

[ceph-users] Re: ceph slow at 80% full, mds nodes lots of unused memory

2021-02-25 Thread Simon Oosthoek
On 25/02/2021 11:19, Dylan McCulloch wrote: > Simon Oosthoek wrote: >> On 24/02/2021 22:28, Patrick Donnelly wrote: >>> Hello Simon, >>> On Wed, Feb 24, 2021 at 7:43 AM Simon Oosthoek wrote:

[ceph-users] Re: ceph slow at 80% full, mds nodes lots of unused memory

2021-02-25 Thread Simon Oosthoek
On 24/02/2021 22:28, Patrick Donnelly wrote: > Hello Simon, > > On Wed, Feb 24, 2021 at 7:43 AM Simon Oosthoek > wrote: >> >> On 24/02/2021 12:40, Simon Oosthoek wrote: >>> Hi >>> >>> we've been running our Ceph cluster for nearly 2 years

[ceph-users] Re: ceph slow at 80% full, mds nodes lots of unused memory

2021-02-24 Thread Simon Oosthoek
On 24/02/2021 12:40, Simon Oosthoek wrote: > Hi > > we've been running our Ceph cluster for nearly 2 years now (Nautilus) > and recently, due to a temporary situation the cluster is at 80% full. > > We are only using CephFS on the cluster. > > Normally, I realize we sh

[ceph-users] ceph slow at 80% full, mds nodes lots of unused memory

2021-02-24 Thread Simon Oosthoek
Hi we've been running our Ceph cluster for nearly 2 years now (Nautilus) and recently, due to a temporary situation the cluster is at 80% full. We are only using CephFS on the cluster. Normally, I realize we should be adding OSD nodes, but this is a temporary situation, and I expect the cluster
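Commands typically used to inspect fullness and, temporarily and with care, adjust the thresholds, plus the MDS cache size knob (all values here are examples only):

  ceph df
  ceph osd df tree                         # per-OSD utilisation and variance
  ceph osd set-nearfull-ratio 0.85
  ceph osd set-full-ratio 0.95             # raise only with great care
  ceph config set mds mds_cache_memory_limit 17179869184   # 16 GiB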

[ceph-users] Re: BlueFS spillover detected, why, what?

2020-08-20 Thread Simon Oosthoek
, reality is always different. We also struggle with small files which lead to further problems. Accordingly, the right initial setting is pretty important and depends on your individual usecase. Regards, Michael On 20.08.20, 10:40, "Simon Oosthoek" wrote: Hi Michael,

[ceph-users] Re: BlueFS spillover detected, why, what?

2020-08-20 Thread Simon Oosthoek
is depends on the actual amount and file sizes. I hope this helps. Regards, Michael On 20.08.20, 09:10, "Simon Oosthoek" wrote: Hi Recently our ceph cluster (nautilus) is experiencing bluefs spillovers, just 2 osd's and I disabled the warning for these osds. (cep

[ceph-users] BlueFS spillover detected, why, what?

2020-08-20 Thread Simon Oosthoek
Hi Recently our ceph cluster (nautilus) is experiencing bluefs spillovers, just 2 osd's and I disabled the warning for these osds. (ceph config set osd.125 bluestore_warn_on_bluefs_spillover false) I'm wondering what causes this and how this can be prevented. As I understand it the rocksdb
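Two common ways of dealing with spillover, besides silencing the warning, are compacting the OSD's RocksDB and enlarging the DB device so BlueFS can use the extra space (a sketch; the OSD id and path are placeholders, and bluefs-bdev-expand requires the OSD to be stopped):

  ceph daemon osd.125 compact                     # on the OSD host: online compaction
  # after enlarging the DB LV, with the OSD stopped:
  ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-125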

[ceph-users] Re: Combining erasure coding and replication?

2020-03-27 Thread Simon Oosthoek
On 27/03/2020 09:56, Eugen Block wrote: > Hi, > >> I guess what you are suggesting is something like k+m with m>=k+2, for >> example k=4, m=6. Then, one can distribute 5 shards per DC and sustain >> the loss of an entire DC while still having full access to redundant >> storage. > > that's
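A hedged sketch of creating such an erasure-code profile (profile and pool names, k/m and pg counts are illustrative; spreading the shards evenly over the datacenters still needs a matching crush rule, which the profile alone does not provide):

  ceph osd erasure-code-profile set ec-3dc k=4 m=5 crush-failure-domain=host
  ceph osd erasure-code-profile get ec-3dc
  ceph osd pool create ecpool 128 128 erasure ec-3dc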

[ceph-users] Re: v15.2.0 Octopus released

2020-03-25 Thread Simon Oosthoek
On 25/03/2020 10:10, konstantin.ilya...@mediascope.net wrote: > That is why I am asking that question about upgrade instructions. > I really don't understand how to upgrade/reinstall CentOS 7 to 8 without > affecting the work of the cluster. > As I know, this process is easier on Debian, but we