[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-26 Thread Frank Schilder
Hi David,

We recently had the same/similar problem: a failing SFP transceiver. We got 
"long ping time" warnings and it took a while to find the source. Strange that 
you didn't have ping time warnings. Are your thresholds too high?
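
In case it helps, the relevant knobs and the measured values can be checked 
roughly like this (option/command names from memory of Nautilus/Octopus, so 
please verify against your release's documentation):

  # ratio of osd_heartbeat_grace above which the "long ping time" warning fires
  ceph config get osd mon_warn_on_slow_ping_ratio
  # absolute threshold in milliseconds; if non-zero it overrides the ratio
  ceph config get osd mon_warn_on_slow_ping_time
  # per-OSD heartbeat ping times to its peers; threshold 0 ms shows everything
  ceph daemon osd.0 dump_osd_network 0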

I learned that our switches have flapping protection; it is called "link 
dampening". Our switches are Dell OS9 (S4048 and Z9100). It might be worth 
checking whether your switches support something like that as well.
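
As an illustration only (I don't have the exact OS9 syntax at hand, so treat the 
command and the values below as assumptions and check the switch manual), enabling 
it on a port looks roughly like this:

  ! illustrative sketch; parameters are half-life/reuse/suppress/max-suppress
  interface hundredGigE 1/1
   dampening 5 750 2500 20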

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: David Orman 
Sent: 26 February 2021 19:57:45
To: ceph-users
Subject: [ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

We figured this out - it was a leg of an LACP-based interface that was
misbehaving. Once we dropped it, everything went back to normal. Does
anybody know a good way to get a sense of what might be slowing down a
cluster in this regard, with EC? We didn't see any indication of a single
host as a problem until digging into the socket statistics and seeing high
sendqs to that host.

On Thu, Feb 25, 2021 at 7:33 PM David Orman  wrote:

> Hi,
>
> We've got an interesting issue we're running into on Ceph 15.2.9. We're
> experiencing VERY slow performance from the cluster, and extremely slow
> misplaced object correction, with very little cpu/disk/network utilization
> (almost idle) across all nodes in the cluster.
>
> We have 7 servers in this cluster, each with 24 rotational OSDs and two
> NVMes holding the DB/WAL files for 12 OSDs apiece. The OSDs are all equally
> weighted, so the tree is pretty straightforward:
>
> root@ceph01:~# ceph osd tree
>
> Inferring fsid 41bb9256-c3bf-11ea-85b9-9e07b0435492
>
> Inferring config
> /var/lib/ceph/41bb9256-c3bf-11ea-85b9-9e07b0435492/mon.ceph01/config
>
> Using recent ceph image
> docker.io/ceph/ceph@sha256:4e710662986cf366c282323bfb4c4ca507d7e117c5ccf691a8273732073297e5
>
> ID  CLASS  WEIGHT      TYPE NAME        STATUS  REWEIGHT  PRI-AFF
> -1         2149.39062  root default
> -2         2149.39062    rack rack1
> -5          307.05579      host ceph01
>  0    hdd    12.79399        osd.0        up     1.0       1.0
>  1    hdd    12.79399        osd.1        up     1.0       1.0
>  2    hdd    12.79399        osd.2        up     1.0       1.0
>  3    hdd    12.79399        osd.3        up     1.0       1.0
>  4    hdd    12.79399        osd.4        up     1.0       1.0
>  5    hdd    12.79399        osd.5        up     1.0       1.0
>  6    hdd    12.79399        osd.6        up     1.0       1.0
>  7    hdd    12.79399        osd.7        up     1.0       1.0
>  8    hdd    12.79399        osd.8        up     1.0       1.0
>  9    hdd    12.79399        osd.9        up     1.0       1.0
> 10    hdd    12.79399        osd.10       up     1.0       1.0
> 11    hdd    12.79399        osd.11       up     1.0       1.0
> 12    hdd    12.79399        osd.12       up     1.0       1.0
> 13    hdd    12.79399        osd.13       up     1.0       1.0
> 14    hdd    12.79399        osd.14       up     1.0       1.0
> 15    hdd    12.79399        osd.15       up     1.0       1.0
> 16    hdd    12.79399        osd.16       up     1.0       1.0
> 17    hdd    12.79399        osd.17       up     1.0       1.0
> 18    hdd    12.79399        osd.18       up     1.0       1.0
> 19    hdd    12.79399        osd.19       up     1.0       1.0
> 20    hdd    12.79399        osd.20       up     1.0       1.0
> 21    hdd    12.79399        osd.21       up     1.0       1.0
> 22    hdd    12.79399        osd.22       up     1.0       1.0
> 23    hdd    12.79399        osd.23       up     1.0       1.0
> -7          307.05579      host ceph02
> 24    hdd    12.79399        osd.24       up     1.0       1.0
> 25    hdd    12.79399        osd.25       up     1.0       1.0
> 26    hdd    12.79399        osd.26       up     1.0       1.0
> 27    hdd    12.79399        osd.27       up     1.0       1.0
> 28    hdd    12.79399        osd.28       up     1.0       1.0
> 29    hdd    12.79399        osd.29       up     1.0       1.0
> 30    hdd    12.79399        osd.30       up     1.0       1.0
> 31    hdd    12.79399        osd.31       up     1.0       1.0
> 32    hdd    12.79399        osd.32       up     1.0       1.0
> 33    hdd    12.79399        osd.33       up     1.0       1.0
> 34    hdd    12.79399        osd.34       up     1.0       1.0
> 35    hdd    12.79399        osd.35       up     1.0       1.0
> 36    hdd    12.79399        osd.36       up     1.0       1.0
> 37    hdd    12.79399        osd

[ceph-users] Re: MDSs report damaged metadata

2021-02-26 Thread ricardo.re.azevedo
Thanks for the advice and info regarding the error. 

I tried `ceph tell mds.database-0 scrub start / recursive repair force` and it 
didn't help. Is there anything else I can try? Or manually fix the links?

Best,
Ricardo


-Original Message-
From: Patrick Donnelly  
Sent: Thursday, February 25, 2021 12:06 PM
To: ricardo.re.azev...@gmail.com
Cc: ceph-users 
Subject: Re: [ceph-users] MDSs report damaged metadata

Hello Ricardo,

On Thu, Feb 25, 2021 at 11:51 AM  wrote:
>
> Hi all,
>
>
>
> My cephfs MDS is reporting damaged metadata following the addition (and
> remapping) of 12 new OSDs.
> `ceph tell mds.database-0 damage ls` reports ~85 files damaged, all of
> type "backtrace", which is very concerning.

It is not concerning, actually. This just indicates that the reverse link of 
the file's object data to its path in the file system is incorrect.

> `ceph tell mds.database-0 scrub start / recursive repair` seems to 
> have no effect on the damage. What does this sort of damage mean? Is 
> there anything I can do to recover these files?

Scrubbing should correct it. Try "recursive repair force" to see if that helps. 
"force" will cause the MDS to revisit metadata that has been scrubbed 
previously but unchanged since then.
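
To follow up on progress, something like this should work (a sketch, using your 
MDS name; the damage id is whatever "damage ls" reports):

  ceph tell mds.database-0 scrub status
  ceph tell mds.database-0 damage ls
  # if an entry lingers after a successful repair it can be cleared by id
  ceph tell mds.database-0 damage rm <damage_id>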

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D



[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-26 Thread David Orman
Hi Martin,

We've already got the collection in place, and we (in retrospect) see
some errors on the sub-interface in question. We'll be adding alerting
for this specific scenario as it was missed in more general alerting,
and the bonded interfaces themselves don't show the errors - only the
underlying interfaces. We only discovered the issue when looking at
the SendQs across the cluster and noting that many were to a specific
host, at which point we discovered errors with a sub-interface of a
bonded NIC.
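
For anyone else chasing something similar, the per-slave counters can be pulled 
with standard tools (interface names below are placeholders for your bond members):

  cat /proc/net/bonding/bond0           # per-slave MII status and link failure counts
  ip -s link show enp1s0f0              # RX/TX error counters on a single slave
  ethtool -S enp1s0f0 | grep -iE 'err|crc|fcs'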

Thank you for the suggestion!

I am curious, though, how one might have pinpointed a troublesome
host/OSD prior to this. Looking back at some of the detail when
attempting to diagnose, we do see some ops taking longer in
sub_op_committed, but not really a lot else. We'd get an occasional
"slow operation on OSD" warning, but the OSDs were spread across various
ceph nodes, not just the one with issues; I'm assuming that's due to EC.

There was no real clarity on where the 'jam' was happening, at least
in anything we looked at. I'm wondering if there's a better way to see
what, specifically, is "slow" on a cluster. Looking at even the OSD
perf output wasn't helpful, because all of that was fine - it was
likely due to EC and write operations to OSDs on that specific node in
question. Is there some way to look at a cluster and see which hosts
are problematic/leading to slowness in an EC-based setup?
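
For reference, some per-daemon views that might help narrow it down next time 
(a sketch, not exhaustive; osd.12 is just an example id):

  ceph osd perf                               # commit/apply latency per OSD
  ceph daemon osd.12 dump_historic_slow_ops   # recent slow ops with per-event timings
  ceph daemon osd.12 dump_osd_network 0       # heartbeat ping times to peer OSDs

With EC, a single slow peer tends to show up as time spent in sub-ops on many 
otherwise healthy primaries, so comparing these numbers host by host is probably 
the best bet for finding the culprit.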

Thanks,
David


On Fri, Feb 26, 2021 at 1:16 PM Martin Verges  wrote:
>
> Hello,
>
> Within croit, we have network latency monitoring that would have
> shown you the packet loss.
> We therefore suggest installing something like smokeping on your
> infrastructure to monitor the quality of your network.
>
> Why does it affect your cluster?
>
> The network is the central component of a Ceph cluster. If it does
> not function stably and reliably, Ceph cannot work properly either. It
> is practically the backbone of the scale-out cluster and cannot be
> replaced by anything. Even a single lost packet leads to
> retransmits, increased latency and thus reduced data
> throughput. This in turn has a higher impact during replication
> work, which is particularly pronounced with EC. With EC, not only
> writes but also reads must be served by several OSDs.
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
> Am Fr., 26. Feb. 2021 um 20:00 Uhr schrieb David Orman :
> >
> > We figured this out - it was a leg of an LACP-based interface that was
> > misbehaving. Once we dropped it, everything went back to normal. Does
> > anybody know a good way to get a sense of what might be slowing down a
> > cluster in this regard, with EC? We didn't see any indication of a single
> > host as a problem until digging into the socket statistics and seeing high
> > sendqs to that host.
> >
> > On Thu, Feb 25, 2021 at 7:33 PM David Orman  wrote:
> >
> > > Hi,
> > >
> > > We've got an interesting issue we're running into on Ceph 15.2.9. We're
> > > experiencing VERY slow performance from the cluster, and extremely slow
> > > misplaced object correction, with very little cpu/disk/network utilization
> > > (almost idle) across all nodes in the cluster.
> > >
> > > We have 7 servers in this cluster, each with 24 rotational OSDs and two
> > > NVMes holding the DB/WAL files for 12 OSDs apiece. The OSDs are all
> > > equally weighted, so the tree is pretty straightforward:
> > >
> > > root@ceph01:~# ceph osd tree
> > >
> > > Inferring fsid 41bb9256-c3bf-11ea-85b9-9e07b0435492
> > >
> > > Inferring config
> > > /var/lib/ceph/41bb9256-c3bf-11ea-85b9-9e07b0435492/mon.ceph01/config
> > >
> > > Using recent ceph image
> > > docker.io/ceph/ceph@sha256:4e710662986cf366c282323bfb4c4ca507d7e117c5ccf691a8273732073297e5
> > >
> > > ID  CLASS  WEIGHT      TYPE NAME        STATUS  REWEIGHT  PRI-AFF
> > > -1         2149.39062  root default
> > > -2         2149.39062    rack rack1
> > > -5          307.05579      host ceph01
> > >  0    hdd    12.79399        osd.0        up     1.0       1.0
> > >  1    hdd    12.79399        osd.1        up     1.0       1.0
> > >  2    hdd    12.79399        osd.2        up     1.0       1.0
> > >  3    hdd    12.79399        osd.3        up     1.0       1.0
> > >  4    hdd    12.79399        osd.4        up     1.0       1.0
> > >  5    hdd    12.79399        osd.5        up     1.0       1.0
> > >  6    hdd    12.79399        osd.6        up     1.0       1.0
> > >  7    hdd    12.79399        osd.7        up     1.0       1.0
> > >  8    hdd    12.79399        os

[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-26 Thread Martin Verges
Hello,

Within croit, we have network latency monitoring that would have
shown you the packet loss.
We therefore suggest installing something like smokeping on your
infrastructure to monitor the quality of your network.

Why does it affect your cluster?

The network is the central component of a Ceph cluster. If it does
not function stably and reliably, Ceph cannot work properly either. It
is practically the backbone of the scale-out cluster and cannot be
replaced by anything. Even a single lost packet leads to
retransmits, increased latency and thus reduced data
throughput. This in turn has a higher impact during replication
work, which is particularly pronounced with EC. With EC, not only
writes but also reads must be served by several OSDs.
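
As a lightweight stop-gap until something like smokeping is in place, a periodic 
fping round between the Ceph hosts already surfaces loss and latency outliers 
(hostnames below are placeholders):

  # 100 probes per host, 20 ms apart, summary statistics only
  fping -q -C 100 -p 20 ceph01 ceph02 ceph03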

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx

Am Fr., 26. Feb. 2021 um 20:00 Uhr schrieb David Orman :
>
> We figured this out - it was a leg of an LACP-based interface that was
> misbehaving. Once we dropped it, everything went back to normal. Does
> anybody know a good way to get a sense of what might be slowing down a
> cluster in this regard, with EC? We didn't see any indication of a single
> host as a problem until digging into the socket statistics and seeing high
> sendqs to that host.
>
> On Thu, Feb 25, 2021 at 7:33 PM David Orman  wrote:
>
> > Hi,
> >
> > We've got an interesting issue we're running into on Ceph 15.2.9. We're
> > experiencing VERY slow performance from the cluster, and extremely slow
> > misplaced object correction, with very little cpu/disk/network utilization
> > (almost idle) across all nodes in the cluster.
> >
> > We have 7 servers in this cluster, each with 24 rotational OSDs and two
> > NVMes holding the DB/WAL files for 12 OSDs apiece. The OSDs are all equally
> > weighted, so the tree is pretty straightforward:
> >
> > root@ceph01:~# ceph osd tree
> >
> > Inferring fsid 41bb9256-c3bf-11ea-85b9-9e07b0435492
> >
> > Inferring config
> > /var/lib/ceph/41bb9256-c3bf-11ea-85b9-9e07b0435492/mon.ceph01/config
> >
> > Using recent ceph image
> > docker.io/ceph/ceph@sha256:4e710662986cf366c282323bfb4c4ca507d7e117c5ccf691a8273732073297e5
> >
> > ID  CLASS  WEIGHT      TYPE NAME        STATUS  REWEIGHT  PRI-AFF
> > -1         2149.39062  root default
> > -2         2149.39062    rack rack1
> > -5          307.05579      host ceph01
> >  0    hdd    12.79399        osd.0        up     1.0       1.0
> >  1    hdd    12.79399        osd.1        up     1.0       1.0
> >  2    hdd    12.79399        osd.2        up     1.0       1.0
> >  3    hdd    12.79399        osd.3        up     1.0       1.0
> >  4    hdd    12.79399        osd.4        up     1.0       1.0
> >  5    hdd    12.79399        osd.5        up     1.0       1.0
> >  6    hdd    12.79399        osd.6        up     1.0       1.0
> >  7    hdd    12.79399        osd.7        up     1.0       1.0
> >  8    hdd    12.79399        osd.8        up     1.0       1.0
> >  9    hdd    12.79399        osd.9        up     1.0       1.0
> > 10    hdd    12.79399        osd.10       up     1.0       1.0
> > 11    hdd    12.79399        osd.11       up     1.0       1.0
> > 12    hdd    12.79399        osd.12       up     1.0       1.0
> > 13    hdd    12.79399        osd.13       up     1.0       1.0
> > 14    hdd    12.79399        osd.14       up     1.0       1.0
> > 15    hdd    12.79399        osd.15       up     1.0       1.0
> > 16    hdd    12.79399        osd.16       up     1.0       1.0
> > 17    hdd    12.79399        osd.17       up     1.0       1.0
> > 18    hdd    12.79399        osd.18       up     1.0       1.0
> > 19    hdd    12.79399        osd.19       up     1.0       1.0
> > 20    hdd    12.79399        osd.20       up     1.0       1.0
> > 21    hdd    12.79399        osd.21       up     1.0       1.0
> > 22    hdd    12.79399        osd.22       up     1.0       1.0
> > 23    hdd    12.79399        osd.23       up     1.0       1.0
> > -7          307.05579      host ceph02
> > 24    hdd    12.79399        osd.24       up     1.0       1.0
> > 25    hdd    12.79399        osd.25       up     1.0       1.0
> > 26    hdd    12.79399        osd.26       up     1.0       1.0
> > 27    hdd    12.79399        osd.27       up     1.0       1.0
> > 28    hdd    12.79399        osd.28       up     1.0       1.0

[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-26 Thread David Orman
We figured this out - it was a leg of an LACP-based interface that was
misbehaving. Once we dropped it, everything went back to normal. Does
anybody know a good way to get a sense of what might be slowing down a
cluster in this regard, with EC? We didn't see any indication of a single
host as a problem until digging into the socket statistics and seeing high
sendqs to that host.
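
For anyone hitting something similar, a quick way to spot a host like this is to 
list established TCP connections with a non-empty send queue and group them by 
peer (a sketch; run it on a few OSD hosts and look for one peer dominating):

  # Send-Q is column 3 of `ss -tn`; print queue size and peer, largest first
  ss -tn | awk 'NR > 1 && $3 > 0 {print $3, $5}' | sort -rn | head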

On Thu, Feb 25, 2021 at 7:33 PM David Orman  wrote:

> Hi,
>
> We've got an interesting issue we're running into on Ceph 15.2.9. We're
> experiencing VERY slow performance from the cluster, and extremely slow
> misplaced object correction, with very little cpu/disk/network utilization
> (almost idle) across all nodes in the cluster.
>
> We have 7 servers in this cluster, each with 24 rotational OSDs and two
> NVMes holding the DB/WAL files for 12 OSDs apiece. The OSDs are all equally
> weighted, so the tree is pretty straightforward:
>
> root@ceph01:~# ceph osd tree
>
> Inferring fsid 41bb9256-c3bf-11ea-85b9-9e07b0435492
>
> Inferring config
> /var/lib/ceph/41bb9256-c3bf-11ea-85b9-9e07b0435492/mon.ceph01/config
>
> Using recent ceph image
> docker.io/ceph/ceph@sha256:4e710662986cf366c282323bfb4c4ca507d7e117c5ccf691a8273732073297e5
>
> ID  CLASS  WEIGHT      TYPE NAME        STATUS  REWEIGHT  PRI-AFF
> -1         2149.39062  root default
> -2         2149.39062    rack rack1
> -5          307.05579      host ceph01
>  0    hdd    12.79399        osd.0        up     1.0       1.0
>  1    hdd    12.79399        osd.1        up     1.0       1.0
>  2    hdd    12.79399        osd.2        up     1.0       1.0
>  3    hdd    12.79399        osd.3        up     1.0       1.0
>  4    hdd    12.79399        osd.4        up     1.0       1.0
>  5    hdd    12.79399        osd.5        up     1.0       1.0
>  6    hdd    12.79399        osd.6        up     1.0       1.0
>  7    hdd    12.79399        osd.7        up     1.0       1.0
>  8    hdd    12.79399        osd.8        up     1.0       1.0
>  9    hdd    12.79399        osd.9        up     1.0       1.0
> 10    hdd    12.79399        osd.10       up     1.0       1.0
> 11    hdd    12.79399        osd.11       up     1.0       1.0
> 12    hdd    12.79399        osd.12       up     1.0       1.0
> 13    hdd    12.79399        osd.13       up     1.0       1.0
> 14    hdd    12.79399        osd.14       up     1.0       1.0
> 15    hdd    12.79399        osd.15       up     1.0       1.0
> 16    hdd    12.79399        osd.16       up     1.0       1.0
> 17    hdd    12.79399        osd.17       up     1.0       1.0
> 18    hdd    12.79399        osd.18       up     1.0       1.0
> 19    hdd    12.79399        osd.19       up     1.0       1.0
> 20    hdd    12.79399        osd.20       up     1.0       1.0
> 21    hdd    12.79399        osd.21       up     1.0       1.0
> 22    hdd    12.79399        osd.22       up     1.0       1.0
> 23    hdd    12.79399        osd.23       up     1.0       1.0
> -7          307.05579      host ceph02
> 24    hdd    12.79399        osd.24       up     1.0       1.0
> 25    hdd    12.79399        osd.25       up     1.0       1.0
> 26    hdd    12.79399        osd.26       up     1.0       1.0
> 27    hdd    12.79399        osd.27       up     1.0       1.0
> 28    hdd    12.79399        osd.28       up     1.0       1.0
> 29    hdd    12.79399        osd.29       up     1.0       1.0
> 30    hdd    12.79399        osd.30       up     1.0       1.0
> 31    hdd    12.79399        osd.31       up     1.0       1.0
> 32    hdd    12.79399        osd.32       up     1.0       1.0
> 33    hdd    12.79399        osd.33       up     1.0       1.0
> 34    hdd    12.79399        osd.34       up     1.0       1.0
> 35    hdd    12.79399        osd.35       up     1.0       1.0
> 36    hdd    12.79399        osd.36       up     1.0       1.0
> 37    hdd    12.79399        osd.37       up     1.0       1.0
> 38    hdd    12.79399        osd.38       up     1.0       1.0
> 39    hdd    12.79399        osd.39       up     1.0       1.0
> 40    hdd    12.79399        osd.40       up     1.0       1.0
> 41    hdd    12.79399        osd.41       up     1.0       1.0
> 42    hdd    12.79399        osd.42       up     1.0       1.0
> 43    hdd    12.79399        osd.43       up     1.0       1.0
> 44    hdd    12.79399        osd.44       up     1.0       1.0
> 45    hdd    12.79399        osd.45       up     1.0       1.0
> 46    hdd    12.79399        osd.46       up     1.0       1.0

[ceph-users] Re: Nautilus Cluster Struggling to Come Back Online

2021-02-26 Thread Wout van Heeswijk
The issue has been found and is fixed in 15.2.3.

Thanks for your response Igor!

Kind regards,

Wout
42on


From: Wout van Heeswijk 
Sent: Friday, 26 February 2021 16:10
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Nautilus Cluster Struggling to Come Back Online

For those interested in this issue: we've been seeing OSDs with corrupted WALs 
after they had a suicide timeout. I've updated the ticket created by William 
with some of our logs.
https://tracker.ceph.com/issues/48827#note-16

We're using Ceph 15.2.2 in this cluster. Currently we are contemplating a way 
forward, but it looks like the WALs are being corrupted under load.

Kind regards,

Wout
42on



From: William Law 
Sent: Tuesday, 19 January 2021 18:48
To: ceph-users@ceph.io
Subject: [ceph-users] Nautilus Cluster Struggling to Come Back Online

I guess this is a sort of follow-up to my previous post. Our Nautilus (14.2.16 on 
Ubuntu 18.04) cluster had some sort of event that caused many of the machines 
to have memory errors. The aftermath is that initially some OSDs had (and 
continue to have) this error: https://tracker.ceph.com/issues/48827; others 
won't start for various reasons.

The OSDs that *will* start are badly behind the current epoch for the most part.

It sounds very similar to this:
https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/

We are having trouble getting things back online.

I think the path forward is to:
- set noup/nodown/noout/nobackfill and wait for the OSDs that run to come up; we 
were making good progress yesterday until some of the OSDs crashed with OOM 
errors. We are again moving forward but understandably nervous.
- export the PGs from questionable OSDs and then rebuild the OSDs; import 
the PGs if necessary (very likely). Repeat until we are up.

Any suggestions for increasing speed?  We are using 
noup/nobackfill/norebalance/pause but the epoch catchup is taking a very long 
time.  Any tips for keeping the epoch from moving forward or speeding up the 
OSDs catching up? How can we estimate how long it should take?
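
One rough way to estimate it (a sketch; osd.7 is just an example id) is to compare 
each OSD's newest map with the cluster's current epoch and sample the gap over time:

  ceph osd dump | head -1        # first line shows the current osdmap epoch
  ceph daemon osd.7 status       # "oldest_map"/"newest_map" for this OSD

The rate at which the gap closes between two samples a few minutes apart gives an 
estimate of the remaining catch-up time.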

Thank you for any ideas or assistance anyone can provide.

Will


[ceph-users] Re: Nautilus Cluster Struggling to Come Back Online

2021-02-26 Thread Wout van Heeswijk
For those interested in this issue: we've been seeing OSDs with corrupted WALs 
after they had a suicide timeout. I've updated the ticket created by William 
with some of our logs.
https://tracker.ceph.com/issues/48827#note-16

We're using Ceph 15.2.2 in this cluster. Currently we are contemplating a way 
forward, but it looks like the WALs are being corrupted under load.

Kind regards,

Wout
42on



From: William Law 
Sent: Tuesday, 19 January 2021 18:48
To: ceph-users@ceph.io
Subject: [ceph-users] Nautilus Cluster Struggling to Come Back Online

I guess this is a sort of follow-up to my previous post. Our Nautilus (14.2.16 on 
Ubuntu 18.04) cluster had some sort of event that caused many of the machines 
to have memory errors. The aftermath is that initially some OSDs had (and 
continue to have) this error: https://tracker.ceph.com/issues/48827; others 
won't start for various reasons.

The OSDs that *will* start are badly behind the current epoch for the most part.

It sounds very similar to this:
https://blog.noc.grnet.gr/2016/10/18/surviving-a-ceph-cluster-outage-the-hard-way/

We are having trouble getting things back online.

I think the path forward is to:
- set noup/nodown/noout/nobackfill and wait for the OSDs that run to come up; we 
were making good progress yesterday until some of the OSDs crashed with OOM 
errors. We are again moving forward but understandably nervous.
- export the PGs from questionable OSDs and then rebuild the OSDs; import 
the PGs if necessary (very likely). Repeat until we are up.

Any suggestions for increasing speed?  We are using 
noup/nobackfill/norebalance/pause but the epoch catchup is taking a very long 
time.  Any tips for keeping the epoch from moving forward or speeding up the 
OSDs catching up? How can we estimate how long it should take?

Thank you for any ideas or assistance anyone can provide.

Will


[ceph-users] "optimal" tunables on release upgrade

2021-02-26 Thread Matthew Vernon

Hi,

Having been slightly caught out by tunables on my Octopus upgrade[0], 
can I just check that if I do

ceph osd crush tunables optimal

that will update the tunables on the cluster to the current "optimal" 
values (and move a lot of data around), but that this doesn't mean 
they'll change the next time I upgrade the cluster or anything like that?


It's not quite clear from the documentation whether the next time 
"optimal" tunables change that'll be applied to a cluster where I've set 
tunables thus, or if tunables are only ever changed by a fresh 
invocation of ceph osd crush tunables...


[I assume the same answer applies to "default"?]
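
[For what it's worth, the values and the profile they currently correspond to can 
be checked before and after with:

  ceph osd crush show-tunables

so any unexpected change after a future upgrade should at least be visible.]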

Regards,

Matthew

[0] I foolishly thought a cluster initially installed as Jewel would 
have jewel tunables



--
The Wellcome Sanger Institute is operated by Genome Research 
Limited, a charity registered in England with number 1021457 and a 
company registered in England with number 2742969, whose registered 
office is 215 Euston Road, London, NW1 2BE. 


[ceph-users] Re: ceph version of new daemons deployed with orchestrator

2021-02-26 Thread Tobias Fischer
Hi Kenneth,

check the config db to see which image is set:

ceph config dump
WHO     MASK  LEVEL  OPTION           VALUE                        RO
global        basic  container_image  docker.io/ceph/ceph:v15.2.9  *

Probably you have the v15 tag configured, which means the orchestrator will fetch 
the latest v15 image - so as of today this would be v15.2.9.

So either you change the setting in the config DB or you can do it like this if 
you have v15 configured:

- log in to the host that is going to be added beforehand
- get your preferred image:
docker pull ceph/ceph:v15.2.6
- retag it
docker tag ceph/ceph:v15.2.6 ceph/ceph:v15
- remove the original image
docker rmi ceph/ceph:v15.2.6
- add host as usual

The orchestrator will use the configured v15 image, which on the new host 
corresponds to v15.2.6.
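
Alternatively, if you prefer not to juggle local image tags, you should be able to 
pin an exact version (or an @sha256 digest) in the config db so that new hosts pull 
the same image:

ceph config set global container_image docker.io/ceph/ceph:v15.2.8

That is the same container_image option shown in the dump above, just pointed at an 
immutable reference instead of the moving v15 tag.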

hope it helps

best,
tobi

> Am 26.02.2021 um 11:16 schrieb Kenneth Waegeman :
> 
> Hi all,
> 
> I am running a cluster managed by orchestrator/cephadm. I installed new host 
> for OSDS yesterday, the osd daemons were automatically created using 
> drivegroups service specs 
> (https://docs.ceph.com/en/latest/cephadm/drivegroups/#drivegroups 
> ) and they 
> started with a 15.2.9 image, instead of 15.2.8 which all daemons of the 
> cluster are running.
> I did not yet run ceph orch upgrade to 15.2.9.
> 
> Is there a way to lock the version of OSDS/daemons created by 
> orchestrator/cephadm?
> 
> Thanks!
> Kenneth
> 


[ceph-users] Re: MON slow ops and growing MON store

2021-02-26 Thread Janek Bevendorff
Since the full cluster restart and disabling logging to syslog, it's not 
a problem any more (for now).


Unfortunately, just disabling clog_to_monitors didn't have the desired 
effect when I tried it yesterday. But I also believe that it is somehow 
related. I could not find any specific reason for yesterday's incident 
in the logs besides a few more RocksDB status and compact messages than 
usual, but that's more symptomatic.
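
For reference, the store growth itself can be watched and a manual compaction 
triggered like this (path layout as on a package-based install, adjust as needed):

  du -sh /var/lib/ceph/mon/*/store.db      # on-disk size of the mon store
  ceph tell mon.$(hostname -s) compact     # ask this mon to compact its RocksDB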



On 26/02/2021 13:05, Mykola Golub wrote:

On Thu, Feb 25, 2021 at 08:58:01PM +0100, Janek Bevendorff wrote:


On the first MON, the command doesn’t even return, but I was able to
get a dump from the one I restarted most recently. The oldest ops
look like this:

 {
 "description": "log(1000 entries from seq 17876238 at 
2021-02-25T15:13:20.306487+0100)",
 "initiated_at": "2021-02-25T20:40:34.698932+0100",
 "age": 183.762551121,
 "duration": 183.762599201,

The mon stores cluster log messages in the mon db. You mentioned
problems with osds flooding with log messages. It looks like related.

If you still observe the db growth you may try temporarily disable
clog_to_monitors, i.e. set for all osds:

  clog_to_monitors = false

And see if it stops growing after this and if it helps with the slow
ops (it might make sense to restart mons if some look stuck). You can
apply the config option on the fly (without restarting the osds, e.g.
with injectargs), but when re-enabling it you will have to restart the
osds to avoid crashes due to this bug [1].

[1] https://tracker.ceph.com/issues/48946


--

Bauhaus-Universität Weimar
Bauhausstr. 9a, R308
99423 Weimar, Germany

Phone: +49 3643 58 3577
www.webis.de


[ceph-users] Re: MON slow ops and growing MON store

2021-02-26 Thread Mykola Golub
On Thu, Feb 25, 2021 at 08:58:01PM +0100, Janek Bevendorff wrote:

> On the first MON, the command doesn’t even return, but I was able to
> get a dump from the one I restarted most recently. The oldest ops
> look like this:
>
> {
> "description": "log(1000 entries from seq 17876238 at 
> 2021-02-25T15:13:20.306487+0100)",
> "initiated_at": "2021-02-25T20:40:34.698932+0100",
> "age": 183.762551121,
> "duration": 183.762599201,

The mon stores cluster log messages in the mon db. You mentioned
problems with osds flooding with log messages. It looks like related.

If you still observe the db growth you may try temporarily disable
clog_to_monitors, i.e. set for all osds:

 clog_to_monitors = false

And see if it stops growing after this and if it helps with the slow
ops (it might make sense to restart mons if some look stuck). You can
apply the config option on the fly (without restarting the osds, e.g.
with injectargs), but when re-enabling it you will have to restart the
osds to avoid crashes due to this bug [1].

[1] https://tracker.ceph.com/issues/48946
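
A concrete sketch of both variants (the first is runtime only and is lost on 
restart, the second persists in the config db):

  ceph tell osd.* injectargs '--clog_to_monitors=false'
  ceph config set osd clog_to_monitors false

Either way, when re-enabling it later, plan for the osd restarts because of [1].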

-- 
Mykola Golub


[ceph-users] Re: ceph version of new daemons deployed with orchestrator

2021-02-26 Thread Kenneth Waegeman

Hi Tobi,

I didn't know about that config option, but that did the trick!

Thank you!

Kenneth

On 26/02/2021 11:30, Tobias Fischer wrote:

Hi Kenneth,

check the config db to see which image is set:

ceph config dump
WHO     MASK  LEVEL  OPTION           VALUE                        RO
global        basic  container_image  docker.io/ceph/ceph:v15.2.9  *

Probably you have the v15 tag configured, which means the orchestrator will fetch 
the latest v15 image - so as of today this would be v15.2.9.

So either you change the setting in the config DB or you can do it like this if 
you have v15 configured:

- log in to the host that is going to be added beforehand
- get your preferred image:
docker pull ceph/ceph:v15.2.6
- retag it
docker tag ceph/ceph:v15.2.6 ceph/ceph:v15
- remove the original image
docker rmi ceph/ceph:v15.2.6
- add host as usual

The orchestrator will use the configured v15 image, which on the new host 
corresponds to v15.2.6.

hope it helps

best,
tobi


Am 26.02.2021 um 11:16 schrieb Kenneth Waegeman :

Hi all,

I am running a cluster managed by orchestrator/cephadm. I installed new host for OSDS 
yesterday, the osd daemons were automatically created using drivegroups service specs 
(https://docs.ceph.com/en/latest/cephadm/drivegroups/#drivegroups 
) and they 
started with a 15.2.9 image, instead of 15.2.8 which all daemons of the cluster are 
running.
I did not yet run ceph orch upgrade to 15.2.9.

Is there a way to lock the version of OSDS/daemons created by 
orchestrator/cephadm?

Thanks!
Kenneth



[ceph-users] ceph version of new daemons deployed with orchestrator

2021-02-26 Thread Kenneth Waegeman

Hi all,

I am running a cluster managed by orchestrator/cephadm. I installed new 
host for OSDS yesterday, the osd daemons were automatically created 
using drivegroups service specs 
(https://docs.ceph.com/en/latest/cephadm/drivegroups/#drivegroups 
) and 
they started with a 15.2.9 image, instead of 15.2.8 which all daemons of 
the cluster are running.

I did not yet run ceph orch upgrade to 15.2.9.

Is there a way to lock the version of OSDS/daemons created by 
orchestrator/cephadm?


Thanks!
Kenneth

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io