Thanks, Eugen. It is similar in the sense that the mgr is getting
OOM-killed.
It started happening in our cluster after the upgrade to 16.2.14. We
haven't had this issue with earlier Pacific releases.
/Z
On Tue, 21 Nov 2023, 21:53 Eugen Block wrote:
> Just checking it on the phone, but isn’t
I encountered mgr ballooning multiple times with Luminous, but have not since.
At the time, I could often achieve relief by sending the admin socket a heap
release - it would show large amounts of memory unused but not yet released.
That experience is one reason I got Rook recently to allow
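(For reference, the admin socket heap commands are something like the following; the mgr daemon name is a placeholder:
    ceph daemon mgr.<name> heap stats
    ceph daemon mgr.<name> heap release
heap stats shows how much memory tcmalloc is holding but has not returned to the OS, and heap release hands it back.)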
Just checking it on the phone, but isn’t this quite similar?
https://tracker.ceph.com/issues/45136
Quoting Zakhar Kirpichenko:
Hi,
I'm facing a rather new issue with our Ceph cluster: from time to time
ceph-mgr on one of the two mgr nodes gets oom-killed after consuming over
100 GB RAM:
[Nov21 15:02] tp_osd_tp invoked oom-killer:
gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
[ +0.10]
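(If anyone wants to check their own nodes for the same pattern, the kernel OOM entries can be pulled with something like:
    journalctl -k | grep -iE 'out of memory|oom-killer|killed process'
which shows which process was killed and how much memory it had mapped at the time.)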
Hi Yuri,
On Fri, Nov 10, 2023 at 1:22 PM Venky Shankar wrote:
>
> Hi Yuri,
>
> On Fri, Nov 10, 2023 at 4:55 AM Yuri Weinstein wrote:
> >
> > I've updated all approvals and merged PRs in the tracker and it looks
> > like we are ready for gibba, LRC upgrades pending approval/update from
> >
On 15-11-2023 07:09, Brent Kennedy wrote:
Greetings group!
We recently reloaded a cluster from scratch using cephadm and reef. The
cluster came up, no issues. We then decided to upgrade two existing cephadm
clusters that were on quincy. Those two clusters came up just fine but
there is
Hi,
I have a setup with one default tenant and the following user/bucket structure:
user1
bucket1
bucket11
user2
bucket2
user3
bucket3
IAM and STS APIs are enabled, user1 has roles=* capabilities.
When user1 permits user2 to assume a role with
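(For context, the caps and role setup was done roughly like this; the role name and policy document below are illustrative, not the exact ones used:
    radosgw-admin caps add --uid="user1" --caps="roles=*"
    radosgw-admin role create --role-name=S3Access \
        --assume-role-policy-doc='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"AWS":["arn:aws:iam:::user/user2"]},"Action":["sts:AssumeRole"]}]}'
user2 can then call sts:AssumeRole against the RGW STS endpoint to get temporary credentials scoped by the role's permission policy.)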
Hi,
were you able to resolve that situation in the meantime? If not, I
would probably try to 'umount -l' and see if that helps. If not, you
can check if the client is still blacklisted:
ceph osd blocklist ls (or blacklist)
If it's still blocklisted, you could try to remove it:
ceph osd
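(The removal command would be something like the following, where the address is whatever 'ceph osd blocklist ls' prints for that client:
    ceph osd blocklist rm 192.168.0.10:0/3710147553
After that the client should be able to reconnect, or the mount can be retried.)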
Hi,
I guess you could just redeploy the third MON which fails to start
(after the orchestrator is responding again) unless you figured it out
already. What is it logging?
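(If this is a cephadm deployment, redeploying would be something along the lines of:
    ceph orch daemon redeploy mon.<host>
and the daemon log on the host, e.g. 'cephadm logs --name mon.<host>', usually shows why it fails to start.)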
1 osds exist in the crush map but not in the osdmap
This could be due to the input/output error, but it's just a
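(To see which OSD is meant, comparing the crush map and osdmap views usually narrows it down, e.g.:
    ceph osd tree                    # OSDs as the crush map sees them
    ceph osd dump | grep '^osd\.'    # OSDs in the osdmap
If the stray entry is confirmed to be gone for good, it can be removed from the crush map with 'ceph osd crush remove osd.<id>'.)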