This problem of file systems being inaccessible after an upgrade to clients
other than client.admin dates back to v14 and carries on through v17. It also
applies to any case where something other than the default pool names is
specified for new file systems. Solved because Curt remembered a link on this
list. (Thanks, Curt!) Here
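If it helps anyone else, here is a minimal sketch of regenerating client caps
scoped to a named file system (fs and client names are made up):
# hypothetical names; grants a non-admin client access to one file system
ceph fs authorize myfs client.myuser / rw
# the resulting OSD cap should be tag-based rather than naming pools, e.g.
#   osd: allow rw tag cephfs data=myfs
ceph auth get client.myuser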
Hey Angelo,
Ya, we are using the RBD driver for quite a few customers in production, and it
is working quite well!
Hahaha, I think I'm familiar with the bug you are talking about; I believe
it may have been resolved by now.
I believe the driver is either out of beta now, or soon will be.
Hi folks,
With a multi-site environment, when I create a bucket-level sync policy with a
symmetric flow between the master zone and another zone, "bucket sync status"
immediately shows that the sync is now enabled in the master zone. But it takes
a while for it to show that in the other zone. I
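For context, this is roughly how I create the policy (zone, bucket, and id
names are examples):
# bucket-level sync group with a symmetric flow between the two zones
radosgw-admin sync group create --bucket=mybucket --group-id=mybucket-grp --status=enabled
radosgw-admin sync group flow create --bucket=mybucket --group-id=mybucket-grp \
    --flow-id=flow-sym --flow-type=symmetrical --zones=zone-master,zone-b
radosgw-admin sync group pipe create --bucket=mybucket --group-id=mybucket-grp \
    --pipe-id=pipe1 --source-zones='*' --dest-zones='*'
# then check on each side:
radosgw-admin bucket sync status --bucket=mybucket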
I saw two untracked failures in the upgrade/octopus-x suite. Both failures
seem to indicate a problem with containers, unrelated to the Ceph code.
However, if anyone else could take a look and confirm, I would
appreciate it.
upgrade/octopus-x (pacific)
https://pulpito.ceph.com/yuriw-2023-04-25
In 17.2.6, is there a security requirement that the pool names backing a
CephFS file system match <name>.data for the data pool and <name>.meta for
the associated metadata pool? (Multiple file systems are enabled.)
I have file systems from older versions with the data pool name matching
the
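For illustration, the kind of layout I mean, created with non-default pool
names (all names made up):
# pools that do not follow the <fsname>.meta/<fsname>.data convention
ceph osd pool create legacy_meta
ceph osd pool create legacy_data
ceph fs new myfs legacy_meta legacy_data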
Hello Igor,
On Tue, May 02, 2023 at 05:41:04PM +0300, Igor Fedotov wrote:
> Hi Nikola,
>
> I'd suggest starting to monitor perf counters for your OSDs,
> op_w_lat/subop_w_lat specifically. I presume they rise eventually,
> don't they?
OK, I've started collecting those for all OSDs...
currently
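For the record, I'm sampling them with something along these lines (osd.0 is an
example; perf dump spells the counters out as op_w_latency/subop_w_latency):
# run on the OSD's host; prints avgcount/sum/avgtime for the two counters
ceph daemon osd.0 perf dump | jq '.osd.op_w_latency, .osd.subop_w_latency'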
Hi all,
On one server with a cache tier on Samsung PM983 SSDs in front of an EC base
tier on HDDs, I find that the cache tier stops flushing or evicting when the
cache tier is near full. After quite a bit of gdb debugging, I suspect the
problem is in the throttling mechanism. When the write traffic is
high,
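For reference, these are the tiering thresholds I'd expect to drive flushing
and eviction (pool name and values are examples):
# flush/evict targets for a cache pool
ceph osd pool set cachepool cache_target_dirty_ratio 0.4
ceph osd pool set cachepool cache_target_dirty_high_ratio 0.6
ceph osd pool set cachepool cache_target_full_ratio 0.8
ceph osd pool set cachepool target_max_bytes 1099511627776   # 1 TiB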
The behaviour with the number of mgr daemons is expected. The way it works is
that it first upgrades all the standby mgrs (which will be all but one) and
then fails over so the previously active mgr can be upgraded as well. Only
after that failover is it actually running the newer cephadm code, which
is when
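You can watch the failover happen during an upgrade with something like:
# upgrade progress, the mgr daemons, and which mgr is currently active
ceph orch upgrade status
ceph orch ps --daemon-type mgr
ceph mgr stat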
dashboard approved!
Regards,
Nizam
On Tue, May 2, 2023, 20:48 Yuri Weinstein wrote:
> Please review the Release Notes - https://github.com/ceph/ceph/pull/51301
>
> Still seeking approvals for:
>
> rados - Neha, Radek, Laura
> rook - Sébastien Han
> dashboard - Ernesto
>
> fs - Venky, Patric
Hi Patrick,
Please be careful resetting the journal. It was not necessary. You can
try to recover the missing inode using cephfs-data-scan [2].
Yes. I did that very reluctantly after trying everything else as a last
resort. But since it only gave me another error, I restored the previous
sta
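For anyone following along, the cephfs-data-scan sequence from the
disaster-recovery docs looks roughly like this (data pool name is an example;
run it with the file system offline):
cephfs-data-scan init
cephfs-data-scan scan_extents cephfs_data
cephfs-data-scan scan_inodes cephfs_data
cephfs-data-scan scan_links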
Hi.
I am currently using Ceph for replicated storage to store many objects across 5
nodes with 3x replication.
When I generate ~1000 read requests to a single object, they all get serviced
by the same primary OSD. I would like to balance the reads across the replicas.
So I use the following:
On Tue, May 2, 2023 at 10:31 AM Janek Bevendorff
wrote:
>
> Hi,
>
> After a patch version upgrade from 16.2.10 to 16.2.12, our rank 0 MDS
> fails to start. After replaying the journal, it just crashes with
>
> [ERR] : MDS abort because newly corrupt dentry to be committed: [dentry
> #0x1/storag
Thank you for the explanation, Frank.
I also agree with you that Ceph is not designed for this kind of use case,
but I tried to continue with what I know.
My idea was exactly what you described: I was trying to automate
cleaning up or recreating on any failure.
As you can see below, rep1 pool is very fast:
- C
Please review the Release Notes - https://github.com/ceph/ceph/pull/51301
Still seeking approvals for:
rados - Neha, Radek, Laura
rook - Sébastien Han
dashboard - Ernesto
fs - Venky, Patrick
(upgrade/octopus-x (pacific) - Laura (looks the same as in 16.2.8))
ceph-volume - Guillaume
On Tue,
Thanks!
I tried downgrading to 16.2.10 and was able to get it running again, but
after a reboot I got a warning that two of the OSDs on that host had
broken Bluestore compression. Restarting the two OSDs again got rid of
it, but that's still a bit concerning.
On 02/05/2023 16:48, Dan van der
On Tue, May 2, 2023 at 7:54 AM Igor Fedotov wrote:
>
>
> On 5/2/2023 11:32 AM, Nikola Ciprich wrote:
> > I updated the cluster to 17.2.6 some time ago, but the problem persists.
> > This is
> > especially annoying in connection with https://tracker.ceph.com/issues/56896
> > as restarting OSDs is q
On Thu, Apr 27, 2023 at 5:21 PM Yuri Weinstein wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/59542#note-1
> Release Notes - TBD
>
> Seeking approvals for:
>
> smoke - Radek, Laura
> rados - Radek, Laura
> rook - Sébastien Han
> cephadm - Adam K
>
Hi Adam,
I'm still struggling with this issue. I also checked it one more time with
newer versions: upgrading the cluster from 16.2.11 to 16.2.12 was
successful, but from 16.2.12 to 17.2.6 it failed again with the same SSH errors
(I checked
https://docs.ceph.com/en/quincy/cephadm/troubleshooting/#ssh-
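In case it is useful, this is how I verified cephadm's SSH access by hand,
roughly following that page (<host> is a placeholder):
# extract cephadm's ssh config and identity key, then test the connection
ceph cephadm get-ssh-config > ssh_config
ceph config-key get mgr/cephadm/ssh_identity_key > cephadm_key
chmod 0600 cephadm_key
ssh -F ssh_config -i cephadm_key root@<host>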
Venky, I plan to cherry-pick this PR if you approve it (this PR
was used for a rerun)
On Tue, May 2, 2023 at 7:51 AM Venky Shankar wrote:
>
> Hi Yuri,
>
> On Fri, Apr 28, 2023 at 2:53 AM Yuri Weinstein wrote:
> >
> > Details of this release are summarized here:
> >
> > https://tracker.ceph
Hi Yuri,
On Fri, Apr 28, 2023 at 2:53 AM Yuri Weinstein wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/59542#note-1
> Release Notes - TBD
>
> Seeking approvals for:
>
> smoke - Radek, Laura
> rados - Radek, Laura
> rook - Sébastien Han
> cephadm -
On 5/2/2023 11:32 AM, Nikola Ciprich wrote:
I updated the cluster to 17.2.6 some time ago, but the problem persists. This is
especially annoying in connection with https://tracker.ceph.com/issues/56896,
as restarting OSDs is quite painful when half of them crash...
with best regards
Feel free to
Hi Janek,
That assert is part of a new corruption check added in 16.2.12 -- see
https://github.com/ceph/ceph/commit/1771aae8e79b577acde749a292d9965264f20202
The abort is controlled by a new option:
+Option("mds_abort_on_newly_corrupt_dentry", Option::TYPE_BOOL,
Option::LEVEL_ADVANCED)
+.
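If you need the MDS up while you investigate, the check can be disabled at
runtime (at your own risk, as a stopgap only):
# turn off the new abort-on-corruption behaviour
ceph config set mds mds_abort_on_newly_corrupt_dentry false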
Hi Nikola,
I'd suggest starting to monitor perf counters for your OSDs,
op_w_lat/subop_w_lat specifically. I presume they rise eventually,
don't they?
Does subop_w_lat grow for every OSD or just a subset of them? How large
is the delta between the best and the worst OSDs after a one we
Hi,
After a patch version upgrade from 16.2.10 to 16.2.12, our rank 0 MDS
fails to start. After replaying the journal, it just crashes with
[ERR] : MDS abort because newly corrupt dentry to be committed: [dentry
#0x1/storage [2,head] auth (dversion lock)
Immediately after the upgrade, I
Hi,
disclaimer: I haven't used LRC in a real setup yet, so there might be
some misunderstandings on my side. But I tried to play around with one
of my test clusters (Nautilus). Because I'm limited in the number of
hosts (6 across 3 virtual DCs) I tried two different profiles with
lower nu
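For a concrete starting point, a profile of this shape can be set up like so
(names and numbers are just an example that fits 6 hosts):
# LRC: k data chunks, m coding chunks, an extra local parity per l chunks
ceph osd erasure-code-profile set lrc_test plugin=lrc k=4 m=2 l=3 crush-failure-domain=host
ceph osd pool create lrc_pool 32 32 erasure lrc_test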
To follow up on this issue, I saw the additional comments on
https://tracker.ceph.com/issues/59580 regarding mgr caps.
By setting the mgr user caps back to the default, I was able to reduce
the memory leak from several hundred MB/h to just a few MB/h.
As the other commenter had posted, in order fo
Hi,
while your assumptions are correct (you can use the rest of the pool
for other, non-mirrored images; at least I'm not aware of any
limitations), can I ask about the motivation behind this question? Mixing
different use cases doesn't seem like a good idea to me. There's
always a chance th
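For completeness, the mixed setup would look something like this with
per-image mirroring (pool/image names made up):
# enable mirroring per image rather than for the whole pool
rbd mirror pool enable mypool image
rbd mirror image enable mypool/important_vm snapshot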
Hi Arnaud,
thanks, that's a good one. The inode in question should be in cache at this
time. It actually accepts the hex code given in the log message and is really
fast.
I hope I remember that for next time.
Best regards,
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi,
Or you can query the MDS(s) with:
ceph tell mds.* dump inode 2>/dev/null | grep path
for example:
user@server:~$ ceph tell mds.* dump inode 1099836155033 2>/dev/null | grep path
"path": "/ec42/default/joliot/gipsi/gpu_burn.sif",
"stray_prior_path": "",
Arnaud
On 01/05/2023 15:07
Hello dear CEPH users and developers,
we're dealing with strange problems. We have a 12-node Alma Linux 9 cluster,
initially installed with Ceph 15.2.16, then upgraded to 17.2.5. It's running a
bunch of KVM virtual machines accessing volumes using RBD.
Everything is working well, but there is strang
On May 1, 2023 9:30 pm, Peter wrote:
> Hi Fabian,
>
> Thank you for your prompt response. It's crucial to understand how things
> work, and I appreciate your assistance.
>
> After replacing the switch for our Ceph environment, we experienced three
> days of normalcy before the issue recurred th