[ceph-users] Error adding OSD

2023-09-20 Thread Budai Laszlo

Hi all,

I am trying to add an OSD using cephadm, but it fails with the message found 
below. Do you have any idea what may be wrong? The given device used to be in 
the cluster but has since been removed, and it now appears as available in 
the `ceph orch device ls` output.

Thank you,
Laszlo

root@monitor1:~# ceph orch device ls | grep storage3
storage3  /dev/sdb  hdd  ATA_QEMU_HARDDISK_QM2  10.7G  No   15m ago  Insufficient space (<10 extents) on vgs, LVM detected, locked
storage3  /dev/sdc  hdd  ATA_QEMU_HARDDISK_QM3  10.7G  Yes  15m ago
storage3  /dev/sdd  hdd  ATA_QEMU_HARDDISK_QM4  8589M  No   15m ago  locked
root@monitor1:~#
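
For reference, one way to check whether the device still carries leftover LVM 
metadata from its earlier life in the cluster (a sketch, run on storage3; the 
cephadm form is an assumption and may need adjusting):

lsblk /dev/sdc
pvs; vgs; lvs
cephadm shell -- ceph-volume inventory /dev/sdc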

Here is the error:


Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1756, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in 
handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in 
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # 
noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 843, in 
_daemon_add_osd
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 228, in 
raise_if_exception
    raise e
RuntimeError: cephadm exited with an error code: 1, stderr:Inferring config 
/var/lib/ceph/314d068c-56ee-11ee-87e2-cd6d389cbfb8/config/ceph.conf
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232 -e NODE_NAME=storage3 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/314d068c-56ee-11ee-87e2-cd6d389cbfb8:/var/run/ceph:z -v /var/log/ceph/314d068c-56ee-11ee-87e2-cd6d389cbfb8:/var/log/ceph:z -v /var/lib/ceph/314d068c-56ee-11ee-87e2-cd6d389cbfb8/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp4r8kteec:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmppw___l6k:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:6b0a24e3146d4723700ce6579d40e6016b2c63d9bf90422653f2d4caa49be232 lvm batch --no-auto /dev/sdc 
--yes --no-systemd

/usr/bin/docker: stderr --> passed data devices: 1 physical, 0 LVM
/usr/bin/docker: stderr --> relative data size: 1.0
/usr/bin/docker: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/usr/bin/docker: stderr Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - 
osd new d90dcdee-035c-4f3c-80f6-5d3eed25d598
/usr/bin/docker: stderr Running command: nsenter --mount=/rootfs/proc/1/ns/mnt 
--ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net 
--uts=/rootfs/proc/1/ns/uts /sbin/vgcreate --force --yes 
ceph-cf156193-5f39-4bfd-91c0-4e1d50fe0e4e /dev/sdc
/usr/bin/docker: stderr  stdout: Physical volume "/dev/sdc" successfully 
created.
/usr/bin/docker: stderr  stdout: Volume group 
"ceph-cf156193-5f39-4bfd-91c0-4e1d50fe0e4e" successfully created
/usr/bin/docker: stderr Running command: nsenter --mount=/rootfs/proc/1/ns/mnt 
--ipc=/rootfs/proc/1/ns/ipc --net=/rootfs/proc/1/ns/net 
--uts=/rootfs/proc/1/ns/uts /sbin/lvcreate --yes -l 2559 -n 
osd-block-d90dcdee-035c-4f3c-80f6-5d3eed25d598 
ceph-cf156193-5f39-4bfd-91c0-4e1d50fe0e4e
/usr/bin/docker: stderr  stdout: Logical volume 
"osd-block-d90dcdee-035c-4f3c-80f6-5d3eed25d598" created.
/usr/bin/docker: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/usr/bin/docker: stderr Running command: /usr/bin/mount -t tmpfs tmpfs 
/var/lib/ceph/osd/ceph-6
/usr/bin/docker: stderr Running command: /usr/bin/chown -h ceph:ceph 
/dev/ceph-cf156193-5f39-4bfd-91c0-4e1d50fe0e4e/osd-block-d90dcdee-035c-4f3c-80f6-5d3eed25d598
/usr/bin/docker: stderr Running command: /usr/bin/chown -R ceph:ceph /dev/dm-1
/usr/bin/docker: stderr Running command: /usr/bin/ln -s 
/dev/ceph-cf156193-5f39-4bfd-91c0-4e1d50fe0e4e/osd-block-d90dcdee-035c-4f3c-80f6-5d3eed25d598
 /var/lib/ceph/osd/ceph-6/block
/usr/bin/docker: stderr Running command: /usr/bin/ceph --cluster ceph --name 
client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon 
getmap -o /var/lib/ceph/osd/ceph-6/activate.monmap
/usr/bin/docker: stderr  stderr: got monmap epoch 3
/usr/bin/docker: stderr --> Creating keyring file for osd.6
/usr/bin/docker: stderr Running command: /usr/bin/chown -R ceph:ceph 
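
Note that the run above already created a PV/VG/LV on /dev/sdc before failing 
(the output is truncated here), so a plain retry may now report the device as 
having LVM. A cleanup sketch before retrying (destructive; double-check the 
host and device first):

ceph orch device zap storage3 /dev/sdc --force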

[ceph-users] Re: millions of hex 80 0_0000 omap keys in single index shard for single bucket

2023-09-20 Thread Casey Bodley
These keys starting with "<80>0_" appear to be replication log entries
for multisite. Can you confirm that this is a multisite setup? Is the
'bucket sync status' mostly caught up on each zone? In a healthy
multisite configuration, these log entries would eventually get
trimmed automatically.
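
For reference, the sync state mentioned above can be checked per zone with 
something like this (the bucket name is a placeholder):

radosgw-admin sync status
radosgw-admin bucket sync status --bucket=<bucketname>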

On Wed, Sep 20, 2023 at 7:08 PM Christopher Durham  wrote:
>
> I am using ceph 17.2.6 on Rocky 8.
> I have a system that started giving me large omap object warnings.
>
> I tracked this down to a specific index shard for a single s3 bucket.
>
> rados -p  listomapkeys .dir..bucketid.nn.shardid
> shows over 3 million keys for that shard. There are only about 2
> million objects in the entire bucket according to a listing of the bucket
> and radosgw-admin bucket stats --bucket bucketname. No other shard
> has anywhere near this many index objects. Perhaps it should be noted that 
> this
> shard is the highest numbered shard for this bucket. For a bucket with
> 16 shards, this is shard 15.
>
> If I look at the list of omapkeys generated, there are *many*
> beginning with "<80>0_", almost the entire set of the three + million
> keys in the shard. These are index objects in the so-called 'ugly' namespace. 
> The rest of the omap keys appear to be normal.
>
> The 0_ after the <80> indicates some sort of 'bucket log index' according 
> to src/cls/rgw/cls_rgw.cc.
> However, using some sed magic previously discussed here, I ran:
>
> rados -p  getomapval .dir..bucketid.nn.shardid 
> --omap-key-file /tmp/key.txt
>
> Where /tmp/key.txt contains only the funny <80>0_ key name without a 
> newline
>
> The output of this shows, in a hex dump, the object name to which the index
> refers, which was at one time a valid object.
>
> However, that object no longer exists in the bucket, and based on expiration 
> policy, was
> previously deleted. Let's say, in the hex dump, that the object was:
>
> foo/bar/baz/object1.bin
>
> The prefix foo/bar/baz/ used to have 32 objects, say 
> foo/bar/baz/{object1.bin, object2.bin, ... }
> An s3api listing shows that those objects no longer exist (and that is OK, as 
> they  were previously deleted).
> BUT, now, there is a weirdo object left in the bucket:
>
> foo/bar/baz/ <- with the slash at the end, and it is an object not a PRE 
> (fix).
>
> All objects under foo/ have a 3-day lifecycle expiration. If I wait (at most)
> 3 days, the weirdo object with '/'
> at the end will be deleted, or I can delete it manually using aws s3api. But 
> either way, the log index
> objects, <80>0_ remain.
>
> The bucket in question is heavily used. But with over 3 million of these 
> <80>0_ objects (and growing)
> in a single shard, I am currently at a loss as to what to do or how to stop 
> this from occurring.
> I've poked around at a few other buckets, and I found a few others that have
> this problem, but not enough to cause a large omap warning. (A few hundred
> <80>0_000 index objects in a shard), nowhere near enough to cause the
> large omap warning that led me to this post.
>
> Any ideas?
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] After upgrading from 17.2.6 to 18.2.0, OSDs are very frequently restarting due to livenessprobe failures

2023-09-20 Thread sbengeri
Since upgrading to 18.2.0, OSDs are very frequently restarting due to 
livenessprobe failures, making the cluster unusable. Has anyone else seen this 
behavior?

Upgrade path: ceph 17.2.6 to 18.2.0 (and rook from 1.11.9 to 1.12.1) 
on ubuntu 20.04 kernel 5.15.0-79-generic

Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: S3website range requests - possible issue

2023-09-20 Thread Ondřej Kukla
I was checking the tracker again and I found an already-fixed issue that seems to 
be connected with this one.

https://tracker.ceph.com/issues/44508

Here is the PR that fixes it https://github.com/ceph/ceph/pull/33807

What I still don't understand is why this only happens when using the 
s3website API.

Is there someone who could shed some light on this?

Regards,

Ondrej
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] millions of hex 80 0_0000 omap keys in single index shard for single bucket

2023-09-20 Thread Christopher Durham
I am using ceph 17.2.6 on Rocky 8.
I have a system that started giving me large omap object warnings.

I tracked this down to a specific index shard for a single s3 bucket.

rados -p  listomapkeys .dir..bucketid.nn.shardid
shows over 3 million keys for that shard. There are only about 2
million objects in the entire bucket according to a listing of the bucket
and radosgw-admin bucket stats --bucket bucketname. No other shard
has anywhere near this many index objects. Perhaps it should be noted that this
shard is the highest numbered shard for this bucket. For a bucket with
16 shards, this is shard 15.

If I look at the list of omapkeys generated, there are *many*
beginning with "<80>0_", almost the entire set of the three + million
keys in the shard. These are index objects in the so-called 'ugly' namespace. 
The rest of the omap keys appear to be normal.

The 0_ after the <80> indicates some sort of 'bucket log index' according 
to src/cls/rgw/cls_rgw.cc. 
However, using some sed magic previously discussed here, I ran:

rados -p  getomapval .dir..bucketid.nn.shardid 
--omap-key-file /tmp/key.txt

Where /tmp/key.txt contains only the funny <80>0_ key name without a newline
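
For reference, one way to build that key file without relying on the sed approach 
(the key shown is a placeholder; the real key has to be copied byte-for-byte from 
the listomapkeys output):

printf '\x80%s' '0_<rest-of-key>' > /tmp/key.txt   # raw 0x80 byte, no trailing newline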

The output of this shows, in a hex dump, the object name to which the index
refers, which was at one time a valid object.

However, that object no longer exists in the bucket, and based on expiration 
policy, was
previously deleted. Let's say, in the hex dump, that the object was:

foo/bar/baz/object1.bin

The prefix foo/bar/baz/ used to have 32 objects, say foo/bar/baz/{object1.bin, 
object2.bin, ... }
An s3api listing shows that those objects no longer exist (and that is OK, as 
they  were previously deleted).
BUT, now, there is a weirdo object left in the bucket:

foo/bar/baz/ <- with the slash at the end, and it is an object not a PRE (fix).

All objects under foo/ have a 3-day lifecycle expiration. If I wait (at most) 3 
days, the weirdo object with '/'
at the end will be deleted, or I can delete it manually using aws s3api. But 
either way, the log index
objects, <80>0_ remain.

The bucket in question is heavily used. But with over 3 million of these 
<80>0_ objects (and growing)
in a single shard, I am currently at a loss as to what to do or how to stop 
this from occurring.
I've poked around at a few other buckets, and I found a few others that have 
this problem, but not enough to cause a large omap warning. (A few hundred 
<80>0_000 index objects in a shard), nowhere near enough to cause the large 
omap warning that led me to this post.

Any ideas?


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: S3website range requests - possible issue

2023-09-20 Thread Ondřej Kukla
When checking the RGW logs I can confirm that it is in fact the same issue as 
the one in the tracker.

2023-09-20T12:52:06.670+ 7f216d702700 1 -- xxx.xxx.58.15:0/758879303 --> 
[v2:xxx.xxx.58.2:6816/8556,v1:xxx.xxx.58.2:6817/8556] -- osd_op(unknown.0.0:238 
18.651 
18:8a75a7b2:::39078a70-7768-48c8-96a5-1e13ced83b5b.58017020.1_videos%2f7.mp4:head
 [getxattrs,stat,read 0~4194304] snapc 0=[] 
ondisk+read+known_if_redirected+supports_pool_eio e60419) v8 -- 0x7f21dc00a420 
con 0x7f21dc007820

You can find the OSD part of the log here - https://pastebin.com/nGQw4ugd
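
A simpler way to see the effect without wrk is a single small range request 
against both endpoint types while watching cluster read throughput (hostnames, 
port and object path are placeholders):

curl -s -o /dev/null -w '%{http_code} %{size_download}\n' -H 'Range: bytes=0-1023' http://s3.example.com:8080/bucket/videos/1.mp4
curl -s -o /dev/null -w '%{http_code} %{size_download}\n' -H 'Range: bytes=0-1023' http://website.example.com:8080/bucket/videos/1.mp4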

For the record, the version of the cluster where I'm able to replicate this is:

ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)

Regards,

Ondrej


> On 20. 9. 2023, at 11:25, Ondřej Kukla  wrote:
> 
> I was checking the tracker again and I found an already-fixed issue that seems
> to be connected with this one.
> 
> https://tracker.ceph.com/issues/44508
> 
> Here is the PR that fixes it https://github.com/ceph/ceph/pull/33807
> 
> What I still don't understand is why this only happens when using the
> s3website API.
> 
> Is there someone who could shed some light on this?
> 
> Regards,
> 
> Ondrej
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


> On 19. 9. 2023, at 10:49, Ondřej Kukla  wrote:
> 
> Hello,
> 
> In our deployment we are using a mix of s3 and s3website RGW. I've noticed
> strange behaviour when sending range requests to the s3website RGWs that I’m 
> not able to replicate on the s3 ones.
> 
> I’ve created a simple wrk LUA script to test sending range requests on tiny 
> ranges so the issue is easily seen.
> 
> When sending these requests against the s3 RGW I can see that the amount of data
> read from Ceph is roughly equivalent to what the RGW sends to the client. This
> changes very dramatically when I'm doing the same test against the s3website RGW.
> The read from Ceph is huge (3 Gb/s compared to ~22 Mb/s on the s3 RGW). It seems to
> me like the RGW is reading the whole files and then sending just the range,
> which is different from what s3 does.
> 
> I do not understand why s3website would need to read that much from Ceph, and
> I believe this is a bug. I was looking through the tracker and wasn't able
> to find anything related to s3website and range requests.
> 
> Did anyone else notice this issue?
> 
> You can replicate it by running this wrk command: wrk -t56 -c500 -d5m
> http://${rgwipaddress}:8080/${bucket}/videos/ -s wrk-range-small.lua
> 
> wrk script
> 
> -- Initialize the pseudo random number generator
> math.randomseed( os.time())
> math.random(); math.random(); math.random()
> 
> i = 1
> 
> function request()
>     if i == 8 then
>         i = 1
>     end
> 
>     local nrangefrom = math.random()
>     local nrangeto = math.random(100)
>     local path = wrk.path
>     url = path..i..".mp4"
>     wrk.headers["Range"] = nrangefrom.."-"..nrangeto
>     i = i + 1
>     return wrk.format(nil, url)
> end
> 
> Kind regards,
> 
> Ondrej
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Clients failing to respond to capability release

2023-09-20 Thread Tim Bishop
Hi Stefan,

On Wed, Sep 20, 2023 at 11:00:12AM +0200, Stefan Kooman wrote:
> On 19-09-2023 13:35, Tim Bishop wrote:
> > The Ceph cluster is running Pacific 16.2.13 on Ubuntu 20.04. Almost all
> > clients are working fine, with the exception of our backup server. This
> > is using the kernel CephFS client on Ubuntu 22.04 with kernel 6.2.0 [1]
> > (so I suspect a newer Ceph version?).
> > 
> > The backup server has multiple (12) CephFS mount points. One of them,
> > the busiest, regularly causes this error on the cluster:
> > 
> > HEALTH_WARN 1 clients failing to respond to capability release
> > [WRN] MDS_CLIENT_LATE_RELEASE: 1 clients failing to respond to capability 
> > release
> >  mds.mds-server(mds.0): Client backupserver:cephfs-backupserver failing 
> > to respond to capability release client_id: 521306112
> > 
> > And occasionally, which may be unrelated, but occurs at the same time:
> > 
> > [WRN] MDS_SLOW_REQUEST: 1 MDSs report slow requests
> >  mds.mds-server(mds.0): 1 slow requests are blocked > 30 secs
> > 
> > The second one clears itself, but the first sticks until I can unmount
> > the filesystem on the client after the backup completes.
> 
> You are not alone. We also have a backup server running 22.04 and 6.2 and
> occasionally hit this issue. We hit this with mainly 5.12.19 clients and a
> 6.2 backup server. We're on 16.2.11.
> 
> 
> Sidenote:
> 
> For those of you who are wondering why you would want to use the latest
> (greatest?) linux kernel for CephFS ... this is why. To try to get rid of 1)
> slow requests because of some deadlock / locking issue, 2) clients failing to
> respond to capability release, and 3) to pick up bug fixes / improvements (thx devs!).
> 
> 
> Questions:
> 
> Do you have the filesystem read only mounted and given the backup server
> CephFS client read only caps on the MDS?

Yes, mounted read-only and the caps for the client are read-only for the
MDS.

I do have multiple mounts from the same CephFS filesystem though, and
I've been wondering if that could be causing more parallel requests from
the backup server. I'd been thinking about doing it through a single
mount, but then all the paths change which doesn't make the backups
overly happy.

> Are you running a multiple active MDS setup?

No. We tried it for a while but after seeing some issues like this we
backtracked to a single active MDS to rule out multiple active being the
issue.

> > It appears that whilst it's in this stuck state there may be one or more
> > directory trees that are inaccessible to all clients. The backup server
> > is walking the whole tree but never gets stuck itself, so either the
> > inaccessible directory entry is caused after it has gone past, or it's
> > not affected. Maybe the backup server is holding a directory when it
> > shouldn't?
> 
> We have seen both cases, yet most of the time the backup server would not be
> able to make progress and be stuck on a file.

Interesting. Backups have never got stuck for us, whilst we regularly, 
pretty much daily, see the above-mentioned error.

But because nothing we're directly running gets stuck, I only find out that
a directory somewhere is inaccessible if a user reports it to us from
one of our other client machines, usually an HPC node.

> > It may be that an upgrade to Quincy resolves this, since it's more
> > likely to be inline with the kernel client version wise, but I don't
> > want to knee-jerk upgrade just to try and fix this problem.
> 
> We are testing with 6.5 kernel clients (see other recent threads about
> this). We have not seen this issue there (but time will tell, it does not
> happen *that* often, but hit other issues).
> 
> The MDS server itself is indeed older than the newer kernel clients. It
> might certainly be a factor. And that raises the question what kind of
> interoperability / compatibility tests (if any) are done between CephFS
> (kernel) clients and MDS server versions. This might be a good "focus topic"
> for a ceph User + Dev meeting ...
> 
> > Thanks for any advice.
> 
> You might want to try 6.5.x kernel on the clients. But might run into other
> issues. Not sure about that, these might be only relevant for one of our
> workloads, only one way to find out ...

I've been sticking with what's available in Ubuntu - the 6.2 kernel is
part of their HWE enablement stack, which is handy. It won't be long
until 23.10 is out with the 6.5 kernel though. I'll definitely give it a
try then.

> Enable debug logging on the MDS to gather logs that might shine some light
> on what is happening with that request.
> 
> ceph daemon mds.name dump_ops_in_flight might help here to get client id and
> request.

I've done both of these in the past, but I should look again (of course,
it's not broken right now!). From what I recall there was nothing
unusual looking about the request, and certainly nothing that Googling
and searching list archives and bug reports led me to anything 

[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Dhairya Parmar
Hi Janek,

The PR Venky mentioned makes use of the OSDs' laggy parameters (laggy_interval
and laggy_probability) to find out whether any OSD is laggy. These laggy
parameters reset to 0 if the interval between the last modification done to the
OSDMap and the timestamp when the OSD was marked down exceeds the grace interval
threshold, which is `mon_osd_laggy_halflife * 48`. mon_osd_laggy_halflife is a
configurable value, 3600 by default, so the laggy parameters only reset to 0
once that interval exceeds 172800 seconds. I'd recommend taking a look at what
your configured value is (using cmd:
ceph config get osd mon_osd_laggy_halflife).

There is also a "hack" to reset the parameters manually(
*Not recommended, justfor info*): set mon_osd_laggy_weight to 1 using `ceph
config set osd
mon_osd_laggy_weight 1` and reboot the OSD(s) which is/are being said laggy
and
you will see the lagginess go away.
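
Put together, the checks above look something like this (the OSD id is a
placeholder, and the restart command assumes a cephadm-managed cluster, which
may not apply to your setup):

ceph config get osd mon_osd_laggy_halflife   # grace interval = this value * 48
ceph config set osd mon_osd_laggy_weight 1   # the manual reset described above
ceph orch daemon restart osd.12              # restart the OSD(s) reported as laggy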


*Dhairya Parmar*

Associate Software Engineer, CephFS

Red Hat Inc. 

dpar...@redhat.com



On Wed, Sep 20, 2023 at 3:25 PM Venky Shankar  wrote:

> Hey Janek,
>
> I took a closer look at various places where the MDS would consider a
> client as laggy and it seems like a wide variety of reasons are taken
> into consideration and not all of them might be a reason to defer client
> eviction, so the warning is a bit misleading. I'll post a PR for this. In
> the meantime, could you share the debug logs stated in my previous email?
>
> On Wed, Sep 20, 2023 at 3:07 PM Venky Shankar  wrote:
>
> > Hi Janek,
> >
> > On Tue, Sep 19, 2023 at 4:44 PM Janek Bevendorff <
> > janek.bevendo...@uni-weimar.de> wrote:
> >
> >> Hi Venky,
> >>
> >> As I said: There are no laggy OSDs. The maximum ping I have for any OSD
> >> in ceph osd perf is around 60ms (just a handful, probably aging disks). The
> >> vast majority of OSDs have ping times of less than 1ms. Same for the host
> >> machines, yet I'm still seeing this message. It seems that the affected
> >> hosts are usually the same, but I have absolutely no clue why.
> >>
> >
> > It's possible that you are running into a bug which does not clear the
> > laggy clients list which the MDS sends to monitors via beacons. Could you
> > help us out with debug mds logs (by setting debug_mds=20) for the active
> > mds for around 15-20 seconds and share the logs please? Also reset the log
> > level once done since it can hurt performance.
> >
> > # ceph config set mds.<> debug_mds 20
> >
> > and reset via
> >
> > # ceph config rm mds.<> debug_mds
> >
> >
> >> Janek
> >>
> >>
> >> On 19/09/2023 12:36, Venky Shankar wrote:
> >>
> >> Hi Janek,
> >>
> >> On Mon, Sep 18, 2023 at 9:52 PM Janek Bevendorff <
> >> janek.bevendo...@uni-weimar.de> wrote:
> >>
> >>> Thanks! However, I still don't really understand why I am seeing this.
> >>>
> >>
> >> This is due to a change that was merged recently in pacific
> >>
> >> https://github.com/ceph/ceph/pull/52270
> >>
> >> The MDS would not evict laggy clients if the OSDs report as laggy. Laggy
> >> OSDs can cause cephfs clients to not flush dirty data (during cap revokes
> >> by the MDS) and thereby show up as laggy and get evicted by the MDS.
> >> This behaviour was changed and therefore you get warnings that some clients
> >> are laggy but they are not evicted since the OSDs are laggy.
> >>
> >>
> >>> The first time I had this, one of the clients was a remote user dialling
> >>> in via VPN, which could indeed be laggy. But I am also seeing it from
> >>> neighbouring hosts that are on the same physical network with reliable ping
> >>> times way below 1ms. How is that considered laggy?
> >>>
> >> Are some of your OSDs reporting laggy? This can be checked via `perf dump`
> >>
> >> > ceph tell mds.<> perf dump
> >> (search for op_laggy/osd_laggy)
> >>
> >>
> >>> On 18/09/2023 18:07, Laura Flores wrote:
> >>>
> >>> Hi Janek,
> >>>
> >>> There was some documentation added about it here:
> >>> https://docs.ceph.com/en/pacific/cephfs/health-messages/
> >>>
> >>> There is a description of what it means, and it's tied to an mds
> >>> configurable.
> >>>
> >>> On Mon, Sep 18, 2023 at 10:51 AM Janek Bevendorff <
> >>> janek.bevendo...@uni-weimar.de> wrote:
> >>>
>  Hey all,
> 
>  Since the upgrade to Ceph 16.2.14, I keep seeing the following
> warning:
> 
>  10 client(s) laggy due to laggy OSDs
> 
>  ceph health detail shows it as:
> 
>  [WRN] MDS_CLIENTS_LAGGY: 10 client(s) laggy due to laggy OSDs
>   mds.***(mds.3): Client *** is laggy; not evicted because some
>  OSD(s) is/are laggy
>   more of this...
> 
>  When I restart the client(s) or the affected MDS daemons, the message
>  goes away and then comes back after a while. ceph osd perf does not
>  list
>  any laggy OSDs (a few with 10-60ms ping, but overwhelmingly < 1ms), so
>  I'm at a total loss as to what this even means.
> 

[ceph-users] Re: cephfs mount 'stalls'

2023-09-20 Thread Marc
> 
> William, this is fuse client, not the kernel
> 
> Mark, you can use kernel client. Stock c7 or install, for example, kernel-
> ml from ELrepo [1], and use the latest krbd version
> 
> 

I think I had to move to the fuse client because, with one of the latest releases of 
Luminous, I was getting issues with kernel cephfs mounts on OSD nodes / 
hyper-converged VMs.


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Venky Shankar
Hey Janek,

I took a closer look at various places where the MDS would consider a
client as laggy and it seems like a wide variety of reasons are taken
into consideration and not all of them might be a reason to defer client
eviction, so the warning is a bit misleading. I'll post a PR for this. In
the meantime, could you share the debug logs stated in my previous email?

On Wed, Sep 20, 2023 at 3:07 PM Venky Shankar  wrote:

> Hi Janek,
>
> On Tue, Sep 19, 2023 at 4:44 PM Janek Bevendorff <
> janek.bevendo...@uni-weimar.de> wrote:
>
>> Hi Venky,
>>
>> As I said: There are no laggy OSDs. The maximum ping I have for any OSD
>> in ceph osd perf is around 60ms (just a handful, probably aging disks). The
>> vast majority of OSDs have ping times of less than 1ms. Same for the host
>> machines, yet I'm still seeing this message. It seems that the affected
>> hosts are usually the same, but I have absolutely no clue why.
>>
>
> It's possible that you are running into a bug which does not clear the
> laggy clients list which the MDS sends to monitors via beacons. Could you
> help us out with debug mds logs (by setting debug_mds=20) for the active
> mds for around 15-20 seconds and share the logs please? Also reset the log
> level once done since it can hurt performance.
>
> # ceph config set mds.<> debug_mds 20
>
> and reset via
>
> # ceph config rm mds.<> debug_mds
>
>
>> Janek
>>
>>
>> On 19/09/2023 12:36, Venky Shankar wrote:
>>
>> Hi Janek,
>>
>> On Mon, Sep 18, 2023 at 9:52 PM Janek Bevendorff <
>> janek.bevendo...@uni-weimar.de> wrote:
>>
>>> Thanks! However, I still don't really understand why I am seeing this.
>>>
>>
>> This is due to a change that was merged recently in pacific
>>
>> https://github.com/ceph/ceph/pull/52270
>>
>> The MDS would not evict laggy clients if the OSDs report as laggy. Laggy
>> OSDs can cause cephfs clients to not flush dirty data (during cap revokes
>> by the MDS) and thereby show up as laggy and get evicted by the MDS.
>> This behaviour was changed and therefore you get warnings that some clients
>> are laggy but they are not evicted since the OSDs are laggy.
>>
>>
>>> The first time I had this, one of the clients was a remote user dialling
>>> in via VPN, which could indeed be laggy. But I am also seeing it from
>>> neighbouring hosts that are on the same physical network with reliable ping
>>> times way below 1ms. How is that considered laggy?
>>>
>> Are some of your OSDs reporting laggy? This can be checked via `perf dump`
>>
>> > ceph tell mds.<> perf dump
>> (search for op_laggy/osd_laggy)
>>
>>
>>> On 18/09/2023 18:07, Laura Flores wrote:
>>>
>>> Hi Janek,
>>>
>>> There was some documentation added about it here:
>>> https://docs.ceph.com/en/pacific/cephfs/health-messages/
>>>
>>> There is a description of what it means, and it's tied to an mds
>>> configurable.
>>>
>>> On Mon, Sep 18, 2023 at 10:51 AM Janek Bevendorff <
>>> janek.bevendo...@uni-weimar.de> wrote:
>>>
 Hey all,

 Since the upgrade to Ceph 16.2.14, I keep seeing the following warning:

 10 client(s) laggy due to laggy OSDs

 ceph health detail shows it as:

 [WRN] MDS_CLIENTS_LAGGY: 10 client(s) laggy due to laggy OSDs
  mds.***(mds.3): Client *** is laggy; not evicted because some
 OSD(s) is/are laggy
  more of this...

 When I restart the client(s) or the affected MDS daemons, the message
 goes away and then comes back after a while. ceph osd perf does not
 list
 any laggy OSDs (a few with 10-60ms ping, but overwhelmingly < 1ms), so
>  I'm at a total loss as to what this even means.

 I have never seen this message before nor was I able to find anything
 about it. Do you have any idea what this message actually means and how
 I can get rid of it?

 Thanks
 Janek

 ___
 ceph-users mailing list -- ceph-users@ceph.io
 To unsubscribe send an email to ceph-users-le...@ceph.io

>>>
>>>
>>> --
>>>
>>> Laura Flores
>>>
>>> She/Her/Hers
>>>
>>> Software Engineer, Ceph Storage 
>>>
>>> Chicago, IL
>>>
>>> lflo...@ibm.com | lflo...@redhat.com 
>>> M: +17087388804
>>>
>>>
>>> --
>>> Bauhaus-Universität Weimar
>>> Bauhausstr. 9a, R308
>>> 99423 Weimar, Germany
>>>
>>> Phone: +49 3643 58 3577  www.webis.de
>>>
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
>>
>> --
>> Cheers,
>> Venky
>>
>> --
>> Bauhaus-Universität Weimar
>> Bauhausstr. 9a, R308
>> 99423 Weimar, Germany
>>
>> Phone: +49 3643 58 3577  www.webis.de
>>
>>
>
> --
> Cheers,
> Venky
>


-- 
Cheers,
Venky
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: CephFS warning: clients laggy due to laggy OSDs

2023-09-20 Thread Venky Shankar
Hi Janek,

On Tue, Sep 19, 2023 at 4:44 PM Janek Bevendorff <
janek.bevendo...@uni-weimar.de> wrote:

> Hi Venky,
>
> As I said: There are no laggy OSDs. The maximum ping I have for any OSD in
> ceph osd perf is around 60ms (just a handful, probably aging disks). The
> vast majority of OSDs have ping times of less than 1ms. Same for the host
> machines, yet I'm still seeing this message. It seems that the affected
> hosts are usually the same, but I have absolutely no clue why.
>

It's possible that you are running into a bug which does not clear the
laggy clients list which the MDS sends to monitors via beacons. Could you
help us out with debug mds logs (by setting debug_mds=20) for the active
mds for around 15-20 seconds and share the logs please? Also reset the log
level once done since it can hurt performance.

# ceph config set mds.<> debug_mds 20

and reset via

# ceph config rm mds.<> debug_mds


> Janek
>
>
> On 19/09/2023 12:36, Venky Shankar wrote:
>
> Hi Janek,
>
> On Mon, Sep 18, 2023 at 9:52 PM Janek Bevendorff <
> janek.bevendo...@uni-weimar.de> wrote:
>
>> Thanks! However, I still don't really understand why I am seeing this.
>>
>
> This is due to a change that was merged recently in pacific
>
> https://github.com/ceph/ceph/pull/52270
>
> The MDS would not evict laggy clients if the OSDs report as laggy. Laggy
> OSDs can cause cephfs clients to not flush dirty data (during cap revokes
> by the MDS) and thereby show up as laggy and get evicted by the MDS.
> This behaviour was changed and therefore you get warnings that some clients
> are laggy but they are not evicted since the OSDs are laggy.
>
>
>> The first time I had this, one of the clients was a remote user dialling
>> in via VPN, which could indeed be laggy. But I am also seeing it from
>> neighbouring hosts that are on the same physical network with reliable ping
>> times way below 1ms. How is that considered laggy?
>>
> Are some of your OSDs reporting laggy? This can be checked via `perf dump`
>
> > ceph tell mds.<> perf dump
> (search for op_laggy/osd_laggy)
>
>
>> On 18/09/2023 18:07, Laura Flores wrote:
>>
>> Hi Janek,
>>
>> There was some documentation added about it here:
>> https://docs.ceph.com/en/pacific/cephfs/health-messages/
>>
>> There is a description of what it means, and it's tied to an mds
>> configurable.
>>
>> On Mon, Sep 18, 2023 at 10:51 AM Janek Bevendorff <
>> janek.bevendo...@uni-weimar.de> wrote:
>>
>>> Hey all,
>>>
>>> Since the upgrade to Ceph 16.2.14, I keep seeing the following warning:
>>>
>>> 10 client(s) laggy due to laggy OSDs
>>>
>>> ceph health detail shows it as:
>>>
>>> [WRN] MDS_CLIENTS_LAGGY: 10 client(s) laggy due to laggy OSDs
>>>  mds.***(mds.3): Client *** is laggy; not evicted because some
>>> OSD(s) is/are laggy
>>>  more of this...
>>>
>>> When I restart the client(s) or the affected MDS daemons, the message
>>> goes away and then comes back after a while. ceph osd perf does not list
>>> any laggy OSDs (a few with 10-60ms ping, but overwhelmingly < 1ms), so
>>> I'm at a total loss as to what this even means.
>>>
>>> I have never seen this message before nor was I able to find anything
>>> about it. Do you have any idea what this message actually means and how
>>> I can get rid of it?
>>>
>>> Thanks
>>> Janek
>>>
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>
>>
>> --
>>
>> Laura Flores
>>
>> She/Her/Hers
>>
>> Software Engineer, Ceph Storage 
>>
>> Chicago, IL
>>
>> lflo...@ibm.com | lflo...@redhat.com 
>> M: +17087388804
>>
>>
>> --
>> Bauhaus-Universität Weimar
>> Bauhausstr. 9a, R308
>> 99423 Weimar, Germany
>>
>> Phone: +49 3643 58 3577  www.webis.de
>>
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
>
> --
> Cheers,
> Venky
>
> --
> Bauhaus-Universität Weimar
> Bauhausstr. 9a, R308
> 99423 Weimar, Germany
>
> Phone: +49 3643 58 3577  www.webis.de
>
>

-- 
Cheers,
Venky
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph MDS OOM in combination with 6.5.1 kernel client

2023-09-20 Thread Mark Nelson

Hi Stefan,

Can you tell if the memory being used is due to the cache not being 
trimmed fast enough, or something else?  You might want to try to track down 
whether the 6.5.1 client isn't releasing caps properly or 
something.  Dan Van der Ster might have some insight here as well.
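
One way to look at this from the MDS side is to list sessions and sort by cap 
count (a sketch; the mds name is a placeholder and the exact field names may 
differ slightly between releases):

ceph tell mds.0 session ls | jq -r '.[] | [.id, .num_caps, .client_metadata.hostname] | @tsv' | sort -k2 -n | tail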


Mark

On 9/19/23 03:57, Stefan Kooman wrote:

Hi List,

For those of you that are brave enough to run the 6.5 CephFS kernel client, 
we are seeing some interesting things happening. Some of this might be 
related to this thread [1]. On a couple of shared webhosting platforms 
we are running CephFS with the 6.5.1 kernel. We have set 
"workqueue.cpu_intensive_thresh_us=0" (to prevent CephFS work items from 
being flagged as cpu intensive). We have seen two MDS OOM situations after that. 
The MDS allocates ~ 60 GiB of RAM above baseline in ~ 50 seconds. In 
both OOM situations, a little before the OOM happens, there is a spike 
of network traffic going out of the MDS to a kernel client (6.5.1). That 
node gets ~ 700 MiB/s of MDS traffic for also ~ 50 seconds before the 
MDS process gets killed. Nothing is logged about this. Ceph is 
HEALTH_OK, no logging by kernel client or MDS whatsoever. The MDS 
rejoins and is up and active after a couple of minutes. There is no 
increased load on the MDS or the client that explain this (for as far as 
we can see).


At this point I don't expect anyone to tell me based on these symptoms 
what the issue is. But if you encounter similar issues, please update 
this thread. I'm pretty certain we are hitting a bug (or bugs), as the 
MDS should not blow itself up like that in any case (but rather evict the 
client that misbehaves?).


Ceph MDS 16.2.11, MDS_MEMORY_TARGET=160GiB.

Gr. Stefan

[1]: 
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/YR5UNKBOKDHPL2PV4J75ZIUNI4HNMC2W/

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: libceph: mds1 IP+PORT wrong peer at address

2023-09-20 Thread Frank Schilder
Hi, in our case the problem was on the client side. When you write "logs from a 
host", do you mean an OSD host or a host where client connections come from? 
It's not clear from your problem description *who* is requesting the wrong peer.

The reason for this message is that something tries to talk to an older 
instance of a daemon. The number after the / is a nonce that is assigned after 
each restart to be able to distinguish different instances of the same service 
(say, OSD or MDS). Usually, these get updated on peering.
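
For illustration, the current nonce of a daemon can be read from the cluster 
maps and compared with the one in the error message (the osd id below is a 
placeholder; addresses look like v2:IP:port/nonce):

ceph osd dump | grep '^osd.12 '
ceph fs dump | grep addr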

If it's clients that are stuck, you probably need to reboot the client hosts.

Please specify what is trying to reach the outdated OSD instances. Then a 
relevant developer is more likely to look at it. Since it's not MDS-kclient 
interaction, it might be useful to open a new case.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: U S 
Sent: Tuesday, September 19, 2023 5:35 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: libceph: mds1 IP+PORT wrong peer at address

Hi,

We are unable to resolve these issues and OSD restarts have made the ceph 
cluster unusable. We are wondering about downgrading the ceph version from 18.2.0 to 
17.2.6. Please let us know if this is supported, and if so, please point me to 
the procedure to do the same.

Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: No snap_schedule module in Octopus

2023-09-20 Thread Patrick Begou

Hi Patrick,

I agree that learning Ceph today with Octopus is not a good idea, but, 
as a newbie with this tool, I was not able to solve the HDD detection 
problem, and my post about it on this forum did not get any help 
(https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/OPMWHJ4ZFCOOPUY6ST4WAJ4G4ASJFALM/). 
I've also looked for a list of newly unsupported hardware between Octopus 
and Pacific, without success. I've also received a private mail from a 
Swedish user reading the forum last week who has the same HDD 
detection problem with 17.2.6. He was asking if I had solved it. He 
tells me he will try to debug.


In my mind, an old version of Ceph on old hardware had a better chance 
of being stable and bug-free too.


Yes I have created file systems (datacfs) and I can create a snapshot by 
hand using cephadm. I've just tested:

# ceph fs set datacfs allow_new_snaps true
# ceph-fuse /mnt
# mkdir /mnt/.snap/$(TZ=CET date +%Y-%m-%d:%H-%M-%S)
and I have a snapshot. I can remove it too.

Maybe today my goal should be:
1- try to undo   "ceph mgr module enable snap_schedule --force" (always 
a bad idea in my mind to use options like "--force")
2- launch update to nautilus now that all HDDs are configured. In my 
Ceph learning process there is also the step to test update procedures.

3- try again to use snap_schedule (see the sketch below)
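
A sketch of what step 3 could look like, based on the linked snap-schedule 
documentation (path and interval are examples, and the exact syntax may differ 
between releases):

ceph fs snap-schedule add / 1h
ceph fs snap-schedule list /
ceph fs snap-schedule status /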

Thanks for the time spent on my problem

Patrick

Le 19/09/2023 à 19:46, Patrick Donnelly a écrit :

I'm not sure off-hand. The module did have several changes as recently
as pacific so it's possible something is broken. Perhaps you don't
have a file system created yet? I would still expect to see the
commands however...

I suggest you figure out why Ceph Pacific+ can't detect your hard disk
drives (???). That seems more productive than debugging a long-EOLed
release.

On Tue, Sep 19, 2023 at 8:49 AM Patrick Begou
 wrote:

Hi Patrick,

sorry for the bad copy/paste.  As it was not working I have also tried
with the module name 

[ceph: root@mostha1 /]# ceph fs snap-schedule
no valid command found; 10 closest matches:
fs status []
fs volume ls
fs volume create  []
fs volume rm  []
fs subvolumegroup ls 
fs subvolumegroup create   []
[] [] []
fs subvolumegroup rm   [--force]
fs subvolume ls  []
fs subvolume create   [] []
[] [] [] [] [--namespace-isolated]
fs subvolume rm   [] [--force]
[--retain-snapshots]
Error EINVAL: invalid command

I'm reading the same documentation, but for Octopus:
https://docs.ceph.com/en/octopus/cephfs/snap-schedule/#

I think that if  "ceph mgr module enable snap_schedule" was not working
without the "--force" option, it was because something was wrong in my
Ceph install.

Patrick

Le 19/09/2023 à 14:29, Patrick Donnelly a écrit :

https://docs.ceph.com/en/quincy/cephfs/snap-schedule/#usage

ceph fs snap-schedule

(note the hyphen!)

On Tue, Sep 19, 2023 at 8:23 AM Patrick Begou
 wrote:

Hi,

still some problems with snap_schedule as as the ceph fs snap-schedule
namespace is not available on my nodes.

[ceph: root@mostha1 /]# ceph mgr module ls | jq -r '.enabled_modules []'
cephadm
dashboard
iostat
prometheus
restful
snap_schedule

[ceph: root@mostha1 /]# ceph fs snap_schedule
no valid command found; 10 closest matches:
fs status []
fs volume ls
fs volume create  []
fs volume rm  []
fs subvolumegroup ls 
fs subvolumegroup create   []
[] [] []
fs subvolumegroup rm   [--force]
fs subvolume ls  []
fs subvolume create   [] []
[] [] [] [] [--namespace-isolated]
fs subvolume rm   [] [--force]
[--retain-snapshots]
Error EINVAL: invalid command

I think I need your help to go further 

Patrick
Le 19/09/2023 à 10:23, Patrick Begou a écrit :

Hi,

bad question, sorry.
I've just run

ceph mgr module enable snap_schedule --force

to solve this problem. I was just afraid to use "--force"   but as I
can break this test configuration

Patrick

Le 19/09/2023 à 09:47, Patrick Begou a écrit :

Hi,

I'm working on a small POC for a Ceph setup on 4 old PowerEdge C6100
nodes. I had to install Octopus since the latest versions were
unable to detect the HDDs (too old hardware??). No matter, this is
only for training and understanding the Ceph environment.

My installation is based on
https://download.ceph.com/rpm-15.2.12/el8/noarch/cephadm-15.2.12-0.el8.noarch.rpm
bootstrapped.

I'm reaching the point of automating snapshots (I can create
snapshots by hand without any problem). The documentation
(https://docs.ceph.com/en/octopus/cephfs/snap-schedule/)
says to use the snap_schedule module, but this module does not exist.

# ceph mgr module ls | jq -r '.enabled_modules []'
cephadm
dashboard
iostat
prometheus
restful

Have I missed something? Are there some additional install steps
for this module?

Thanks for your help.

Patrick
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- 

[ceph-users] Re: Status of IPv4 / IPv6 dual stack?

2023-09-20 Thread Robert Sander

On 9/18/23 11:19, Stefan Kooman wrote:


IIIRC, the "enable dual" stack PR's were more or less "accidentally"
merged


So this looks like a big NO on the dual stack support for Ceph.

I just need an answer, I do not need dual stack support.

It would be nice if the documentation were a little bit clearer on this topic.

Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io