On Sat, Apr 10, 2021 at 2:10 AM Robert LeBlanc wrote:
>
> On Fri, Apr 9, 2021 at 4:04 PM Dan van der Ster wrote:
> > [...]
On Tue, Apr 13, 2021 at 8:40 AM Robert LeBlanc wrote:
>
> Do you think it would be possible to build Nautilus FUSE or newer on
> 14.04, or do you think the toolchain has evolved too much since then?
>
An interesting question.
# cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.6 LTS, Trusty Tahr"
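Before attempting a Nautilus build on Trusty, it is worth checking the compiler first: Nautilus is C++17 code, and 14.04's stock gcc is 4.8. A minimal sketch of that feasibility check (the gcc 7 minimum and the PPA name are assumptions, verify them against the Nautilus build docs):

```shell
# Sketch: compare Trusty's default gcc against an assumed C++17-capable
# minimum of gcc 7. On a real box, take the version from `gcc -dumpversion`.
required_major=7          # assumed minimum for building Ceph Nautilus
trusty_gcc="4.8"          # default gcc shipped with Ubuntu 14.04
major=${trusty_gcc%%.*}   # strip everything after the first dot -> "4"
if [ "$major" -ge "$required_major" ]; then
  echo "toolchain looks ok"
else
  echo "need a newer gcc (e.g. the ubuntu-toolchain-r/test PPA)"
fi
```

So the short answer is likely "not with the stock toolchain", though a newer gcc side-installed from a PPA may get further.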
On Mon, Apr 12, 2021 at 3:41 PM Brad Hubbard wrote:
>
> Sure Robert,
>
> I understand the realities of maintaining large installations which
> may have many reasons holding them back from upgrading any of the
> interdependent software they run. The other side of the page however
> is that we can [...]
On Mon, Apr 12, 2021 at 11:35 AM Robert LeBlanc wrote:
>
> On Sun, Apr 11, 2021 at 4:19 PM Brad Hubbard wrote:
> > [...]
>
> I'm very [...]
On Sun, Apr 11, 2021 at 4:19 PM Brad Hubbard wrote:
>
> PSA.
>
> https://docs.ceph.com/en/latest/releases/general/#lifetime-of-stable-releases
>
> https://docs.ceph.com/en/latest/releases/#ceph-releases-index
I'm very well aware that we are living on the dying edge (well, past
dead), but a good [...]
On Sat, Apr 10, 2021 at 10:11 AM Robert LeBlanc wrote:
>
> On Fri, Apr 9, 2021 at 4:04 PM Dan van der Ster wrote:
> > [...]
On Fri, Apr 9, 2021 at 4:04 PM Dan van der Ster wrote:
>
> Here's what you should look for, with debug_mon=10. It shows clearly
> that it takes the mon 23 seconds to run through
> get_removed_snaps_range.
> So if this is happening every 30s, it explains at least part of why
> this mon is busy.
>
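One way to spot a stall like that in the mon log is to print the time delta between consecutive debug lines, so a long get_removed_snaps_range pass shows up as a single large gap. A minimal sketch (the two log lines are fabricated stand-ins for real `debug_mon=10` output, not actual mon messages):

```shell
# Fabricated sample of two mon debug lines 23 seconds apart.
cat > mon.sample.log <<'EOF'
2021-04-09 15:10:01.000000 7f0 10 mon.a@0(leader).osd e100 get_removed_snaps_range start
2021-04-09 15:10:24.000000 7f0 10 mon.a@0(leader).osd e100 get_removed_snaps_range done
EOF

# Print the gap (in seconds) between each line and the one before it.
awk '{
  split($2, t, "[:.]")                  # HH MM SS from the timestamp
  now = t[1]*3600 + t[2]*60 + t[3]
  if (prev != "") printf "%ds gap before: %s\n", now - prev, $0
  prev = now
}' mon.sample.log
```

Run against a real mon log, any line prefixed with a 20 s+ gap is a good place to start reading.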
On Fri, Apr 9, 2021 at 11:50 PM Robert LeBlanc wrote:
>
> On Fri, Apr 9, 2021 at 2:04 PM Dan van der Ster wrote:
> > [...]
On Fri, Apr 9, 2021 at 2:04 PM Dan van der Ster wrote:
>
> On Fri, Apr 9, 2021 at 9:37 PM Dan van der Ster wrote:
> > [...]
On Fri, Apr 9, 2021 at 9:37 PM Dan van der Ster wrote:
>
> On Fri, Apr 9, 2021 at 8:39 PM Robert LeBlanc wrote:
> > [...]
On Fri, Apr 9, 2021 at 8:39 PM Robert LeBlanc wrote:
>
> On Fri, Apr 9, 2021 at 11:49 AM Dan van der Ster wrote:
> > [...]
On 4/9/21 3:40 PM, Robert LeBlanc wrote:
> I'm attempting to deep scrub all the PGs to see if that helps clear up
> some accounting issues, but that's going to take a really long time on
> 2PB of data.
Are you running with 1 mon now? Have you tried adding mons from scratch?
So with a fresh database? And then maybe after they have joined, kill
the donor mon and start from scratch.
On Fri, Apr 9, 2021 at 11:49 AM Dan van der Ster wrote:
>
> Thanks. I didn't see anything ultra obvious to me.
>
> But I did notice the nearfull warnings so I wonder if this cluster is
> churning through osdmaps? Did you see a large increase in inbound or
> outbound network traffic on this mon [...]
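Osdmap churn can be put in numbers by sampling the osdmap epoch twice and taking the difference. A sketch, untested against a live cluster; `ceph osd dump` printing `epoch N` on its first line is the assumed interface:

```shell
# On a live cluster, sample the epoch twice, 60 s apart:
#   e1=$(ceph osd dump | awk '/^epoch/ {print $2}')
#   sleep 60
#   e2=$(ceph osd dump | awk '/^epoch/ {print $2}')
# Here two example epochs stand in for real samples:
e1=48210 e2=48390
interval=60
echo "$(( (e2 - e1) * 60 / interval )) new osdmaps/min"   # -> 180 new osdmaps/min
```

A rate of hundreds of new maps per minute would fit the "churning" theory; a handful per hour would not.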
On Fri, Apr 9, 2021 at 7:24 PM Robert LeBlanc wrote:
>
> On Fri, Apr 9, 2021 at 11:05 AM Dan van der Ster wrote:
> >
> > Hi Robert,
> >
> > Have you checked a log with debug_mon=20 yet to try to see what it's doing?
> >
> I've posted the logs with debug_mon=20 for a period during high CPU
> here: https://owncloud.leblancnet.us/owncloud/index.php/s/OtHsBAYN9r5eSbU
On Fri, Apr 9, 2021 at 11:05 AM Dan van der Ster wrote:
> [...]
I've posted the logs with debug_mon=20 for a period during high CPU
here https://owncloud.leblancnet.us/owncloud/index.php/s/OtHsBAYN9r5eSbU
Hi Robert,
Have you checked a log with debug_mon=20 yet to try to see what it's doing?
.. Dan
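Raising and then restoring mon debug at runtime can be done without a restart; the injectargs syntax below is the assumed interface (Nautilus also accepts `ceph tell mon.* config set debug_mon 20/20`), and 1/5 is the assumed default to restore. Echoed as a dry run here:

```shell
# Dry run: print the two commands that would raise debug_mon to 20/20
# and later restore the (assumed) default of 1/5.
level="20/20"
for cmd in \
  "ceph tell mon.* injectargs --debug_mon $level" \
  "ceph tell mon.* injectargs --debug_mon 1/5"; do
  echo "$cmd"                      # drop the echo to apply for real
done
```

Remember to turn it back down: debug_mon=20 logs are large enough to fill a small log partition quickly.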
On Fri, Apr 9, 2021, 7:02 PM Robert LeBlanc wrote:
> [...]
The only step not yet taken was to move to straw2. That was the last
step we were going to do next.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
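The straw-to-straw2 step can be sketched as follows; the cluster commands are shown as comments (the assumed interface, double-check against your Ceph version), and only the text filtering on a fabricated crush dump is runnable here:

```shell
# On a live cluster (assumed commands):
#   ceph osd crush dump > crush.json               # inspect bucket algs
#   ceph osd crush set-all-straw-buckets-to-straw2 # convert everything
# Counting buckets still on the old algorithm in a dump:
cat > crush.sample.json <<'EOF'
{ "buckets": [
  { "name": "rack1", "alg": "straw" },
  { "name": "rack2", "alg": "straw2" }
] }
EOF
grep -c '"alg": "straw"' crush.sample.json   # -> 1 bucket left to convert
```

Note the conversion changes CRUSH placement slightly, so expect some data movement when it runs.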
On Fri, Apr 9, 2021 at 10:41 AM Robert LeBlanc wrote:
>
> On Fri, Apr 9, 2021 at 9:25 AM Stefan Kooman wrote: [...]
On Fri, Apr 9, 2021 at 9:25 AM Stefan Kooman wrote:
> Are you running with 1 mon now? Have you tried adding mons from scratch?
> So with a fresh database? And then maybe after they have joined, kill
> the donor mon and start from scratch.
>
> You have for sure not missed a step during the upgrade [...]
I'm attempting to deep scrub all the PGs to see if that helps clear up
some accounting issues, but that's going to take a really long time on
2PB of data.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
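Kicking off deep scrubs across every PG is usually done with a loop over PG ids. A sketch; `ceph pg ls -f json` with a `.pg_stats[].pgid` field is the assumed way to list them, and the loop below runs against a fabricated id list so the mechanics are visible:

```shell
# On a live cluster (assumed): ceph pg ls -f json | jq -r '.pg_stats[].pgid'
cat > pgids.txt <<'EOF'
2.1a
2.1b
2.1c
EOF
while read -r pgid; do
  echo "ceph pg deep-scrub $pgid"   # drop the echo to actually run it
done < pgids.txt
```

On 2 PB, pacing these (a batch at a time, respecting osd_max_scrubs) matters more than the loop itself.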
On Thu, Apr 8, 2021 at 9:48 PM Robert LeBlanc wrote: [...]
Good thought. The storage for the monitor data is a RAID-0 over three
NVMe devices. Watching iostat, they are completely idle, maybe 0.8% to
1.4% for a second every minute or so.
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
On Thu, Apr 8, 2021 [...]
On 4/8/21 6:22 PM, Robert LeBlanc wrote:
> I upgraded our Luminous cluster to Nautilus a couple of weeks ago and
> converted the last batch of FileStore OSDs to BlueStore about 36 hours
> ago. Yesterday our monitor cluster went nuts and started constantly
> calling elections because monitor nodes were [...]
I found this thread that matches a lot of what I'm seeing. I see the
ms_dispatch thread going to 100%, but I'm at a single MON, the
recovery is done and the rocksdb MON database is ~300MB. I've tried
all the settings mentioned in that thread with no noticeable
improvement. I was hoping that once [...]
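To confirm which mon thread is pinned (the one Robert sees is ms_dispatch), thread names can be read straight out of /proc and matched against per-thread CPU from `top -H -p $(pidof ceph-mon)`. The demo below lists this shell's own threads, since no ceph-mon runs here:

```shell
# List TID and thread name for a process; substitute $(pidof ceph-mon)
# for $$ on a mon host, then compare against `top -H` output.
pid=$$
for tid in /proc/$pid/task/*; do
  printf '%s %s\n' "${tid##*/}" "$(cat "$tid/comm")"
done
```

Once the hot TID is known, a `gdb -p`/`thread apply bt` or `perf top -t <tid>` on it shows what ms_dispatch is actually chewing on.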
On Thu, Apr 8, 2021 at 11:24 AM Robert LeBlanc wrote:
>
> On Thu, Apr 8, 2021 at 10:22 AM Robert LeBlanc wrote:
> > [...]
On Thu, Apr 8, 2021 at 10:22 AM Robert LeBlanc wrote:
>
> I upgraded our Luminous cluster to Nautilus a couple of weeks ago and
> converted the last batch of FileStore OSDs to BlueStore about 36 hours ago.
> Yesterday our monitor cluster went nuts and started constantly calling
> elections because monitor nodes were [...]