[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-05-09 Thread Nikola Ciprich
Hello Igor,
> You didn't reset the counters every hour, do you? So having average
> subop_w_latency growing that way means the current values were much higher
> than before.

bummer, I didn't.. I've updated the gather script to reset stats, wait 10m and then
gather perf data, each hour. It's been running since yesterday, so now we'll have to
wait about one week for the problem to appear again..
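For reference, a minimal sketch of such a gather loop (output path, sleep values and
the exact reset/dump invocations are assumptions, not the script actually used here;
on releases where "ceph tell" can't reach these asok commands, the equivalent
"ceph daemon osd.N ..." form on each host does the same):

  #!/bin/bash
  OUTDIR=/var/log/ceph-perf-gather
  mkdir -p "$OUTDIR"
  while true; do
      ceph tell osd.\* perf reset all             # zero the counters
      sleep 600                                   # let them accumulate for 10 minutes
      ts=$(date +%Y%m%d-%H%M)
      for osd in $(ceph osd ls); do
          ceph tell osd."$osd" perf dump > "$OUTDIR/osd.$osd.$ts.json"
      done
      sleep 3000                                  # wait out the rest of the hour
  done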


> 
> Curious if subop latencies were growing for every OSD or just a subset (may
> be even just a single one) of them?
since I only have long-term averages, it's not easy to say, but based on what we
have:

only two OSDs have an average sub_w_lat > 0.0006, and there is no clear relation
between them.
19 OSDs have an average sub_w_lat > 0.0005 - this is more interesting - 15 of them
are on the later-installed nodes (note that those nodes have almost no VMs running,
so they are much less used!), the other 4 are on other nodes. Also note that not all
OSDs on the suspicious nodes are over the threshold: it's 6, 6 and 3 out of 7 OSDs
per node. But still it's strange..
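For reference, a hedged way to pull the per-OSD averages back out of saved perf
dumps (the JSON path below matches recent releases but may differ slightly):

  for f in osd.*.json; do
      printf '%s %s\n' "$f" "$(jq -r '.osd.subop_w_latency.avgtime' "$f")"
  done | sort -k2 -n      # worst OSDs end up at the bottom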

> 
> 
> Next time you reach the bad state please do the following if possible:
> 
> - reset perf counters for every OSD
> 
> -  leave the cluster running for 10 mins and collect perf counters again.
> 
> - Then start restarting OSDs one-by-one starting with the worst OSD (in terms
> of subop_w_lat from the prev step). Wouldn't it be sufficient to restart just a
> few OSDs before the cluster is back to normal?

will do once it slows down again.
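For reference, a minimal sketch of that restart-worst-first loop (a non-cephadm
systemd deployment and a worst_first.txt containing one OSD id per line, worst
first and produced from the sorted dumps, are assumptions):

  while read osd; do
      systemctl restart ceph-osd@"$osd"
      sleep 120            # let the OSD rejoin and settle
      # re-check client latency here and stop once the cluster is back to normal
  done < worst_first.txt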


> > 
> > I see very similar crash reported here:https://tracker.ceph.com/issues/56346
> > so I'm not reporting..
> > 
> > Do you think this might somehow be the cause of the problem? Anything else 
> > I should
> > check in perf dumps or elsewhere?
> 
> Hmm... don't know yet. Could you please share the last 20K lines prior to the crash
> from e.g. two sample OSDs?

https://storage.linuxbox.cz/index.php/s/o5bMaGMiZQxWadi

> 
> And the crash isn't permanent, OSDs are able to start after the second(?)
> shot, aren't they?
yes, actually they start after issuing systemctl ceph-osd@xx restart, it just takes
a long time performing log recovery..

If I can provide more info, please let me know

BR

nik

-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: non root deploy ceph 17.2.5 failed

2023-05-09 Thread Ben
Curious, what is the umask and directory permission in your case? Could you add a
host to the cluster for a further try?
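A quick, hedged way to check both on the target host (the fsid below is taken from
the error further down):

  sudo -u deployer bash -lc umask
  ls -ld /tmp/var /tmp/var/lib/ceph /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9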

Eugen Block wrote on Tue, 9 May 2023 at 14:59:

> Hi,
>
> I just retried without the single-host option and it worked. Also
> everything under /tmp/var belongs to root in my case. Unfortunately, I
> can't use the curl-based cephadm but the contents are identical, I
> compared. Not sure what it could be at the moment.
>
> Zitat von Ben :
>
> > Hi, It is uos v20(with kernel 4.19), one linux distribution among others.
> > no matter since cephadm deploys things in containers by default. cephadm
> is
> > pulled by curl from Quincy branch of github.
> >
> > I think you could see some sort of errors if you remove parameter
> > --single-host-defaults.
> >
> > More investigation shows it looks like a bug with cephadm.
> > during the deploying procedure
> >
> ,/tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new
> > is created through sudo ssh session remotely(with owner of root) and
> > /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/ is changed to
> owner
> > of ssh user deployer. The correct thing to do instead is,  /tmp/var/ be
> > changed to the owner deployer recursively so that following scp can have
> > access permission.
> > I will see if having time to wire up a PR to fix it.
> >
> > Thanks for help on this.
> > Ben
> >
> >
Eugen Block wrote on Mon, 8 May 2023 at 21:01:
> >
> >> Hi,
> >>
> >> could you provide some more details about your host OS? Which cephadm
> >> version is it? I was able to bootstrap a one-node cluster with both
> >> 17.2.5 and 17.2.6 with a non-root user with no such error on openSUSE
> >> Leap 15.4:
> >>
> >> quincy:~ # rpm -qa | grep cephadm
> >> cephadm-17.2.6.248+gad656d572cb-lp154.2.1.noarch
> >>
> >> deployer@quincy:~> sudo cephadm --image quay.io/ceph/ceph:v17.2.5
> >> bootstrap --mon-ip 172.17.2.3 --skip-monitoring-stack --ssh-user
> >> deployer --single-host-defaults
> >> Verifying ssh connectivity ...
> >> Adding key to deployer@localhost authorized_keys...
> >> Verifying podman|docker is present...
> >> Verifying lvm2 is present...
> >> Verifying time synchronization is in place...
> >> Unit chronyd.service is enabled and running
> >> Repeating the final host check...
> >> podman (/usr/bin/podman) version 4.4.4 is present
> >> [...]
> >> Ceph version: ceph version 17.2.5
> >> (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
> >> Extracting ceph user uid/gid from container image...
> >> Creating initial keys...
> >> Creating initial monmap...
> >> Creating mon...
> >> Waiting for mon to start...
> >> Waiting for mon...
> >> mon is available
> >> [...]
> >> Adding key to deployer@localhost authorized_keys...
> >> Adding host quincy...
> >> Deploying mon service with default placement...
> >> Deploying mgr service with default placement...
> >> [...]
> >> Bootstrap complete.
> >>
> >> Zitat von Ben :
> >>
> >> > Hi,
> >> >
> >> > with following command:
> >> >
> >> > sudo cephadm  --docker bootstrap --mon-ip 10.1.32.33
> >> --skip-monitoring-stack
> >> >   --ssh-user deployer
> >> > the user deployer has passwordless sudo configuration.
> >> > I can see the error below:
> >> >
> >> > debug 2023-05-04T12:46:43.268+ 7fc5ddc2e700  0 [cephadm ERROR
> >> > cephadm.ssh] Unable to write
> >> >
> >>
> szhyf-xx1d002-hx15w:/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e:
> >> > scp:
> >> >
> >>
> /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new:
> >> > Permission denied
> >> >
> >> > Traceback (most recent call last):
> >> >
> >> >   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 222, in
> >> _write_remote_file
> >> >
> >> > await asyncssh.scp(f.name, (conn, tmp_path))
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in
> scp
> >> >
> >> > await source.run(srcpath)
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in
> run
> >> >
> >> > self.handle_error(exc)
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
> >> > handle_error
> >> >
> >> > raise exc from None
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in
> run
> >> >
> >> > await self._send_files(path, b'')
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in
> >> > _send_files
> >> >
> >> > self.handle_error(exc)
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
> >> > handle_error
> >> >
> >> > raise exc from None
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in
> >> > _send_files
> >> >
> >> > await self._send_file(srcpath, dstpath, attrs)
> >> >
> >> >   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in
> >> > 

[ceph-users] Re: non root deploy ceph 17.2.5 failed

2023-05-09 Thread Ben
The umask here is 027. The PR should fix the problem above; no further fix is needed,
we'll just wait for a point release.
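For reference, a small illustration of why a 027 umask bites here (an assumption
about the mechanism, in line with the linked PR): directories created with it end
up 0750, so a later write by a different, non-owning user is denied:

  # as root, with the 027 umask inherited (fsid is a placeholder):
  umask 027
  mkdir -p /tmp/demo/var/lib/ceph/fsid
  ls -ld /tmp/demo/var/lib/ceph/fsid                                 # drwxr-x--- root root
  su - deployer -c 'touch /tmp/demo/var/lib/ceph/fsid/cephadm.new'   # Permission denied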

Adam King wrote on Wed, 10 May 2023 at 05:52:

> What's the umask for the "deployer" user? We saw an instance of someone
> hitting something like this, but for them it seemed to only happen when
> they had changed the umask to 027. We had patched in
> https://github.com/ceph/ceph/pull/50736 to address it, which I don't
> think was merged too late for the 17.2.6 release.
>
> On Mon, May 8, 2023 at 5:24 AM Ben  wrote:
>
>> Hi,
>>
>> with following command:
>>
>> sudo cephadm  --docker bootstrap --mon-ip 10.1.32.33
>> --skip-monitoring-stack
>>   --ssh-user deployer
>> the user deployer has passwordless sudo configuration.
>> I can see the error below:
>>
>> debug 2023-05-04T12:46:43.268+ 7fc5ddc2e700  0 [cephadm ERROR
>> cephadm.ssh] Unable to write
>>
>> szhyf-xx1d002-hx15w:/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e:
>> scp:
>>
>> /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new:
>> Permission denied
>>
>> Traceback (most recent call last):
>>
>>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 222, in
>> _write_remote_file
>>
>> await asyncssh.scp(f.name, (conn, tmp_path))
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
>>
>> await source.run(srcpath)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
>>
>> self.handle_error(exc)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
>> handle_error
>>
>> raise exc from None
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
>>
>> await self._send_files(path, b'')
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in
>> _send_files
>>
>> self.handle_error(exc)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
>> handle_error
>>
>> raise exc from None
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in
>> _send_files
>>
>> await self._send_file(srcpath, dstpath, attrs)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in
>> _send_file
>>
>> await self._make_cd_request(b'C', attrs, size, srcpath)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in
>> _make_cd_request
>>
>> self._fs.basename(path))
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in
>> make_request
>>
>> raise exc
>>
>> Any ideas on this?
>>
>> Thanks,
>> Ben
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v16.2.13 Pacific released

2023-05-09 Thread Zakhar Kirpichenko
Thanks!

An upgrade from 16.2.12 on Ubuntu 20.04 LTS went smoothly.

/Z

On Wed, 10 May 2023 at 00:45, Yuri Weinstein  wrote:

> We're happy to announce the 13th backport release in the Pacific series.
>
> https://ceph.io/en/news/blog/2023/v16-2-13-pacific-released/
>
> Notable Changes
> ---
>
> * CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
>   `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
>   mds-client.yaml.in because this option was only used by MDS client from
> its
>   birth.
>
> * `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
>   `active_clients` fields at the top level.  Previously, these fields were
>   output under `always_on_modules` field.
>
> Getting Ceph
> 
> * Git at git://github.com/ceph/ceph.git
> * Tarball at https://download.ceph.com/tarballs/ceph-16.2.13.tar.gz
> * Containers at https://quay.io/repository/ceph/ceph
> * For packages, see https://docs.ceph.com/en/latest/install/get-packages/
> * Release git sha1: 5378749ba6be3a0868b51803968ee9cde4833a3e
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-05-09 Thread Zakhar Kirpichenko
Thank you, Igor. I will try to see how to collect the perf values. Not sure
about restarting all OSDs as it's a production cluster, is there a less
invasive way?

/Z

On Tue, 9 May 2023 at 23:58, Igor Fedotov  wrote:

> Hi Zakhar,
>
> Let's leave questions regarding cache usage/tuning to a different topic
> for now. And concentrate on performance drop.
>
> Could you please do the same experiment I asked from Nikola once your
> cluster reaches "bad performance" state (Nikola, could you please use this
> improved scenario as well?):
>
> - collect perf counters for every OSD
>
> - reset perf counters for every OSD
>
> -  leave the cluster running for 10 mins and collect perf counters again.
>
> - Then restart OSDs one-by-one starting with the worst OSD (in terms of
> subop_w_lat from the prev step). Wouldn't be sufficient to reset just a few
> OSDs before the cluster is back to normal?
>
> - if partial OSD restart is sufficient - please leave the remaining OSDs
> run as-is without reboot.
>
> - after the restart (no matter partial or complete one - the key thing is that
> it should be successful) reset all the perf counters and leave the cluster
> running for 30 mins and collect perf counters again.
>
> - wait 24 hours and collect the counters one more time
>
> - share all four counters snapshots.
>
>
> Thanks,
>
> Igor
>
> On 5/8/2023 11:31 PM, Zakhar Kirpichenko wrote:
>
> Don't mean to hijack the thread, but I may be observing something similar
> with 16.2.12: OSD performance noticeably peaks after OSD restart and then
> gradually reduces over 10-14 days, while commit and apply latencies
> increase across the board.
>
> Non-default settings are:
>
> "bluestore_cache_size_hdd": {
> "default": "1073741824",
> "mon": "4294967296",
> "final": "4294967296"
> },
> "bluestore_cache_size_ssd": {
> "default": "3221225472",
> "mon": "4294967296",
> "final": "4294967296"
> },
> ...
> "osd_memory_cache_min": {
> "default": "134217728",
> "mon": "2147483648",
> "final": "2147483648"
> },
> "osd_memory_target": {
> "default": "4294967296",
> "mon": "17179869184",
> "final": "17179869184"
> },
> "osd_scrub_sleep": {
> "default": 0,
> "mon": 0.10001,
> "final": 0.10001
> },
> "rbd_balance_parent_reads": {
> "default": false,
> "mon": true,
> "final": true
> },
>
> All other settings are default, the usage is rather simple Openstack /
> RBD.
>
> I also noticed that OSD cache usage doesn't increase over time (see my
> message "Ceph 16.2.12, bluestore cache doesn't seem to be used much" dated
> 26 April 2023, which received no comments), despite OSDs are being used
> rather heavily and there's plenty of host and OSD cache / target memory
> available. It may be worth checking if available memory is being used in a
> good way.
>
> /Z
>
> On Mon, 8 May 2023 at 22:35, Igor Fedotov  wrote:
>
>> Hey Nikola,
>>
>> On 5/8/2023 10:13 PM, Nikola Ciprich wrote:
>> > OK, starting collecting those for all OSDs..
>> > I have hour samples of all OSDs perf dumps loaded in DB, so I can
>> easily examine,
>> > sort, whatever..
>> >
>> You didn't reset the counters every hour, do you? So having average
>> subop_w_latency growing that way means the current values were much
>> higher than before.
>>
>> Curious if subop latencies were growing for every OSD or just a subset
>> (may be even just a single one) of them?
>>
>>
>> Next time you reach the bad state please do the following if possible:
>>
>> - reset perf counters for every OSD
>>
>> -  leave the cluster running for 10 mins and collect perf counters again.
>>
>> - Then start restarting OSD one-by-one starting with the worst OSD (in
>> terms of subop_w_lat from the prev step). Wouldn't be sufficient to
>> reset just a few OSDs before the cluster is back to normal?
>>
>> >> currently values for avgtime are around 0.0003 for subop_w_lat and
>> 0.001-0.002
>> >> for op_w_lat
>> > OK, so there is no visible trend on op_w_lat, still between 0.001 and
>> 0.002
>> >
>> > subop_w_lat seems to have increased since yesterday though! I see
>> values from
>> > 0.0004 to as high as 0.001
>> >
>> > If some other perf data might be interesting, please let me know..
>> >
>> > During OSD restarts, I noticed strange thing - restarts on first 6
>> machines
>> > went smooth, but then on another 3, I saw rocksdb logs recovery on all
>> SSD
>> > OSDs. but first didn't see any mention of daemon crash in ceph -s
>> >
>> > later, crash info appeared, but only about 3 daemons (in total, at
>> least 20
>> > of them crashed though)
>> >
>> > crash report was similar for all three OSDs:
>> >
>> > [root@nrbphav4a ~]# ceph crash info
>> 2023-05-08T17:45:47.056675Z_a5759fe9-60c6-423a-88fc-57663f692bd3

[ceph-users] Re: client isn't responding to mclientcaps(revoke), pending pAsLsXsFsc issued pAsLsXsFsc

2023-05-09 Thread Xiubo Li


On 5/9/23 16:23, Frank Schilder wrote:

Dear Xiubo,

both issues will cause problems, the one reported in the subject 
(https://tracker.ceph.com/issues/57244) and the potential follow-up on MDS 
restart 
(https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LYY7TBK63XPR6X6TD7372I2YEPJO2L6F).
 Either one will cause compute jobs on our HPC cluster to hang and users will 
need to run the jobs again. Our queues are full, so it's not very popular to lose
your spot.

The process in D-state is a user process. Interestingly it is often possible to 
kill it despite the D-state (if one can find the process) and the stuck recall 
gets resolved. If I restart the MDS, the stuck process might continue working, 
but we run a significant risk of other processes getting stuck due to the 
libceph/MDS wrong peer issue. We actually have these kind of messages

[Mon Mar  6 12:56:46 2023] libceph: mds1 192.168.32.87:6801 wrong peer at 
address
[Mon Mar  6 13:05:18 2023] libceph: wrong peer, want 
192.168.32.87:6801/-223958753, got
192.168.32.87:6801/-1572619386

all over the HPC cluster and each of them means that some files/dirs are 
inaccessible on the compute node and jobs either died or are/got stuck there. 
Every MDS restart bears the risk of such events happening and with many nodes 
this probability approaches 1 - every time we restart an MDS jobs get stuck.

I have a reproducer for an instance of https://tracker.ceph.com/issues/57244. 
Unfortunately, this is a big one that I would need to pack into a container. I 
was not able to reduce it to something small, it seems to depend on a very 
specific combination of codes with certain internal latencies between threads 
that trigger a race.

It sounds like you have a patch for https://tracker.ceph.com/issues/57244 
although it's not linked from the tracker item.


IMO evicting the corresponding client could also resolve this issue 
instead of restarting the MDS.


Have you tried this ?
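For reference, a hedged sketch of what that eviction could look like (MDS name and
client id are placeholders):

  ceph tell mds.<active-mds> client ls             # find the session / client id
  ceph tell mds.<active-mds> client evict id=12345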

Thanks

- Xiubo



Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Xiubo Li 
Sent: Friday, May 5, 2023 2:40 AM
To: Frank Schilder; ceph-users@ceph.io
Subject: Re: [ceph-users] client isn't responding to mclientcaps(revoke), 
pending pAsLsXsFsc issued pAsLsXsFsc


On 5/1/23 17:35, Frank Schilder wrote:

Hi all,

I think we might be hitting a known problem 
(https://tracker.ceph.com/issues/57244). I don't want to fail the mds yet, 
because we have troubles with older kclients that miss the mds restart and hold 
on to cache entries referring to the killed instance, leading to hanging jobs 
on our HPC cluster.

Will this cause any issue in your case ?


I have seen this issue before and there was a process in D-state that 
dead-locked itself. Usually, killing this process succeeded and resolved the 
issue. However, this time I can't find such a process.

BTW, what's the D-state process ? A ceph one ?

Thanks


The tracker mentions that one can delete the file/folder. I have the inode 
number, but really don't want to start a find on a 1.5PB file system. Is there 
a better way to find what path is causing the issue (ask the MDS directly, look 
at a cache dump, or similar)? Is there an alternative to deletion or MDS fail?

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: non root deploy ceph 17.2.5 failed

2023-05-09 Thread Adam King
which I think was merged too late* (as in the patch wouldn't be in 17.2.6)

On Tue, May 9, 2023 at 5:52 PM Adam King  wrote:

> What's the umask for the "deployer" user? We saw an instance of someone
> hitting something like this, but for them it seemed to only happen when
> they had changed the umask to 027. We had patched in
> https://github.com/ceph/ceph/pull/50736 to address it, which I don't
> think was merged too late for the 17.2.6 release.
>
> On Mon, May 8, 2023 at 5:24 AM Ben  wrote:
>
>> Hi,
>>
>> with following command:
>>
>> sudo cephadm  --docker bootstrap --mon-ip 10.1.32.33
>> --skip-monitoring-stack
>>   --ssh-user deployer
>> the user deployer has passwordless sudo configuration.
>> I can see the error below:
>>
>> debug 2023-05-04T12:46:43.268+ 7fc5ddc2e700  0 [cephadm ERROR
>> cephadm.ssh] Unable to write
>>
>> szhyf-xx1d002-hx15w:/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e:
>> scp:
>>
>> /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new:
>> Permission denied
>>
>> Traceback (most recent call last):
>>
>>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 222, in
>> _write_remote_file
>>
>> await asyncssh.scp(f.name, (conn, tmp_path))
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
>>
>> await source.run(srcpath)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
>>
>> self.handle_error(exc)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
>> handle_error
>>
>> raise exc from None
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
>>
>> await self._send_files(path, b'')
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in
>> _send_files
>>
>> self.handle_error(exc)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
>> handle_error
>>
>> raise exc from None
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in
>> _send_files
>>
>> await self._send_file(srcpath, dstpath, attrs)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in
>> _send_file
>>
>> await self._make_cd_request(b'C', attrs, size, srcpath)
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in
>> _make_cd_request
>>
>> self._fs.basename(path))
>>
>>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in
>> make_request
>>
>> raise exc
>>
>> Any ideas on this?
>>
>> Thanks,
>> Ben
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: non root deploy ceph 17.2.5 failed

2023-05-09 Thread Adam King
What's the umask for the "deployer" user? We saw an instance of someone
hitting something like this, but for them it seemed to only happen when
they had changed the umask to 027. We had patched in
https://github.com/ceph/ceph/pull/50736 to address it, which I don't think
was merged too late for the 17.2.6 release.

On Mon, May 8, 2023 at 5:24 AM Ben  wrote:

> Hi,
>
> with following command:
>
> sudo cephadm  --docker bootstrap --mon-ip 10.1.32.33
> --skip-monitoring-stack
>   --ssh-user deployer
> the user deployer has passwordless sudo configuration.
> I can see the error below:
>
> debug 2023-05-04T12:46:43.268+ 7fc5ddc2e700  0 [cephadm ERROR
> cephadm.ssh] Unable to write
>
> szhyf-xx1d002-hx15w:/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e:
> scp:
>
> /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new:
> Permission denied
>
> Traceback (most recent call last):
>
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 222, in
> _write_remote_file
>
> await asyncssh.scp(f.name, (conn, tmp_path))
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
>
> await source.run(srcpath)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
>
> self.handle_error(exc)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
> handle_error
>
> raise exc from None
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
>
> await self._send_files(path, b'')
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in
> _send_files
>
> self.handle_error(exc)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
> handle_error
>
> raise exc from None
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in
> _send_files
>
> await self._send_file(srcpath, dstpath, attrs)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in
> _send_file
>
> await self._make_cd_request(b'C', attrs, size, srcpath)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in
> _make_cd_request
>
> self._fs.basename(path))
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in
> make_request
>
> raise exc
>
> Any ideas on this?
>
> Thanks,
> Ben
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v16.2.13 Pacific released

2023-05-09 Thread Yuri Weinstein
We're happy to announce the 13th backport release in the Pacific series.

https://ceph.io/en/news/blog/2023/v16-2-13-pacific-released/

Notable Changes
---

* CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
  `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
  mds-client.yaml.in because this option was only used by MDS client from its
  birth.

* `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
  `active_clients` fields at the top level.  Previously, these fields were
  output under `always_on_modules` field.
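  A quick way to see the relocated fields (illustrative, assuming jq is available):

  ceph mgr dump | jq '{last_failure_osd_epoch, active_clients: (.active_clients | length)}'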

Getting Ceph

* Git at git://github.com/ceph/ceph.git
* Tarball at https://download.ceph.com/tarballs/ceph-16.2.13.tar.gz
* Containers at https://quay.io/repository/ceph/ceph
* For packages, see https://docs.ceph.com/en/latest/install/get-packages/
* Release git sha1: 5378749ba6be3a0868b51803968ee9cde4833a3e
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-05-09 Thread Igor Fedotov

Hi Zakhar,

Let's leave questions regarding cache usage/tuning to a different topic 
for now. And concentrate on performance drop.


Could you please do the same experiment I asked from Nikola once your 
cluster reaches "bad performance" state (Nikola, could you please use 
this improved scenario as well?):


- collect perf counters for every OSD

- reset perf counters for every OSD

-  leave the cluster running for 10 mins and collect perf counters again.

- Then restart OSDs one-by-one starting with the worst OSD (in terms of
subop_w_lat from the prev step). Wouldn't it be sufficient to restart just a
few OSDs before the cluster is back to normal?


- if partial OSD restart is sufficient - please leave the remaining OSDs 
run as-is without reboot.


- after the restart (no matter partial or complete one - the key thing is that
it should be successful) reset all the perf counters, then leave the
cluster running for 30 mins and collect perf counters again.


- wait 24 hours and collect the counters one more time

- share all four counters snapshots.


Thanks,

Igor

On 5/8/2023 11:31 PM, Zakhar Kirpichenko wrote:
Don't mean to hijack the thread, but I may be observing something 
similar with 16.2.12: OSD performance noticeably peaks after OSD 
restart and then gradually reduces over 10-14 days, while commit and 
apply latencies increase across the board.


Non-default settings are:

        "bluestore_cache_size_hdd": {
            "default": "1073741824",
            "mon": "4294967296",
            "final": "4294967296"
        },
        "bluestore_cache_size_ssd": {
            "default": "3221225472",
            "mon": "4294967296",
            "final": "4294967296"
        },
...
        "osd_memory_cache_min": {
            "default": "134217728",
            "mon": "2147483648",
            "final": "2147483648"
        },
        "osd_memory_target": {
            "default": "4294967296",
            "mon": "17179869184",
            "final": "17179869184"
        },
        "osd_scrub_sleep": {
            "default": 0,
            "mon": 0.10001,
            "final": 0.10001
        },
        "rbd_balance_parent_reads": {
            "default": false,
            "mon": true,
            "final": true
        },

All other settings are default, the usage is rather simple Openstack / 
RBD.


I also noticed that OSD cache usage doesn't increase over time (see my 
message "Ceph 16.2.12, bluestore cache doesn't seem to be used much" 
dated 26 April 2023, which received no comments), despite OSDs are 
being used rather heavily and there's plenty of host and OSD cache / 
target memory available. It may be worth checking if available memory 
is being used in a good way.


/Z

On Mon, 8 May 2023 at 22:35, Igor Fedotov  wrote:

Hey Nikola,

On 5/8/2023 10:13 PM, Nikola Ciprich wrote:
> OK, starting collecting those for all OSDs..
> I have hour samples of all OSDs perf dumps loaded in DB, so I
can easily examine,
> sort, whatever..
>
You didn't reset the counters every hour, do you? So having average
subop_w_latency growing that way means the current values were much
higher than before.

Curious if subop latencies were growing for every OSD or just a
subset
(may be even just a single one) of them?


Next time you reach the bad state please do the following if possible:

- reset perf counters for every OSD

-  leave the cluster running for 10 mins and collect perf counters
again.

- Then start restarting OSD one-by-one starting with the worst OSD
(in
terms of subop_w_lat from the prev step). Wouldn't be sufficient to
reset just a few OSDs before the cluster is back to normal?

>> currently values for avgtime are around 0.0003 for subop_w_lat
and 0.001-0.002
>> for op_w_lat
> OK, so there is no visible trend on op_w_lat, still between
0.001 and 0.002
>
> subop_w_lat seems to have increased since yesterday though! I
see values from
> 0.0004 to as high as 0.001
>
> If some other perf data might be interesting, please let me know..
>
> During OSD restarts, I noticed strange thing - restarts on first
6 machines
> went smooth, but then on another 3, I saw rocksdb logs recovery
on all SSD
> OSDs. but first didn't see any mention of daemon crash in ceph -s
>
> later, crash info appeared, but only about 3 daemons (in total,
at least 20
> of them crashed though)
>
> crash report was similar for all three OSDs:
>
> [root@nrbphav4a ~]# ceph crash info
2023-05-08T17:45:47.056675Z_a5759fe9-60c6-423a-88fc-57663f692bd3
> {
>      "backtrace": [
>          "/lib64/libc.so.6(+0x54d90) [0x7f64a6323d90]",
>          "(BlueStore::_txc_create(BlueStore::Collection*,
BlueStore::OpSequencer*, std::__cxx11::list >*,
boost::intrusive_ptr)+0x413) [0x55a1c9d07c43]",
>


[ceph-users] Re: Lua scripting in the rados gateway

2023-05-09 Thread Thomas Bennett
Hi Yuval,

Just a follow up on this.

An issue I’ve just resolved is getting scripts into the cephadm shell. As
it turns out - I didn’t know this but it seems the host file system is
mounted into the cephadm shell at /rootfs/.

So I've been editing a /tmp/preRequest.lua on my host and then running:

cephadm shell radosgw-admin script put --infile=/rootfs/tmp/preRequest.lua
--context=preRequest

This injects the lua script into the pre request context.
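To double-check that the script landed in that context, something like this should
work (hedged, using the same script subcommand family):

cephadm shell radosgw-admin script get --context=preRequest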

Cheers,
Tom

On Fri, 28 Apr 2023 at 15:19, Thomas Bennett  wrote:

> Hey Yuval,
>
> No problem. It was interesting to me to figure out how it all fits
> together and works.  Thanks for opening an issue on the tracker.
>
> Cheers,
> Tom
>
> On Thu, 27 Apr 2023 at 15:03, Yuval Lifshitz  wrote:
>
>> Hi Thomas,
>> Thanks for the detailed info!
>> RGW lua scripting was never tested in a cephadm deployment :-(
>> Opened a tracker: https://tracker.ceph.com/issues/59574 to make sure
>> this would work out of the box.
>>
>> Yuval
>>
>>
>> On Tue, Apr 25, 2023 at 10:25 PM Thomas Bennett  wrote:
>>
>>> Hi ceph users,
>>>
>>> I've been trying out the lua scripting for the rados gateway (thanks
>>> Yuval).
>>>
>>> As in my previous email I mentioned that there is an error when trying to
>>> load the luasocket module. However, I thought it was a good time to
>>> report
>>> on my progress.
>>>
>>> My 'hello world' example below is called *test.lua* below includes the
>>> following checks:
>>>
>>>1. Can I write to the debug log?
>>>2. Can I use the lua socket package to do something stupid but
>>>interesting, like connect to a webservice?
>>>
>>> Before you continue reading this, you might need to know that I run all
>>> ceph processes in a *CentOS Stream release 8 *container deployed using
>>> ceph
>>> orchestrator running *Ceph v17.2.5*, so please view the information below
>>> in that context.
>>>
>>> For anyone looking for a reference, I suggest going to the ceph lua rados
>>> gateway documentation at radosgw/lua-scripting.
>>>
>>> There are two new switches you need to know about in the radosgw-admin:
>>>
>>>- *script* -> loading your lua script
>>>- *script-package* -> loading supporting packages for your script - i.e.
>>>luasocket in this case.
>>>
>>> For a basic setup, you'll need to have a few dependencies in your
>>> containers:
>>>
>>>- cephadm container: requires luarocks (I've checked the code - it
>>> runs
>>>a luarocks search command)
>>>- radosgw container: requires luarocks, gcc, make,  m4, wget (wget
>>> just
>>>in case).
>>>
>>> To achieve the above, I updated the container image for our running
>>> system.
>>> I needed to do this because I needed to redeploy the rados gateway
>>> container to inject the lua script packages into the radosgw runtime
>>> process. This will start with a fresh container based on the global
>>> config
>>> *container_image* setting on your running system.
>>>
>>> For us this is currently captured in *quay.io/tsolo/ceph:v17.2.5-3
>>> * and included the following extra
>>> steps (including installing the lua dev from an rpm because there is no
>>> centos package in yum):
>>> yum install luarocks gcc make wget m4
>>> rpm -i
>>>
>>> https://rpmfind.net/linux/centos/8-stream/PowerTools/x86_64/os/Packages/lua-devel-5.3.4-12.el8.x86_64.rpm
>>>
>>> You will notice that I've included a compiler and compiler support into
>>> the
>>> image. This is because luarocks on the radosgw needs to compile luasocket (the
>>> package I want to install). This will happen at start time when the
>>> radosgw
>>> is restarted from ceph orch.
>>>
>>> In the cephadm container I still need to update our cephadm shell so I
>>> need
>>> to install luarocks by hand:
>>> yum install luarocks
>>>
>>> Then set thew updated image to use:
>>> ceph config set global container_image quay.io/tsolo/ceph:v17.2.5-3
>>>
>>> I now create a file called: *test.lua* in the cephadm container. This
>>> contains the following lines to write to the log and then do a get
>>> request
>>> to google. This is not practical in production, but it serves the purpose
>>> of testing the infrastructure:
>>>
>>> RGWDebugLog("Tsolo start lua script")
>>> local LuaSocket = require("socket")
>>> client = LuaSocket.connect("google.com", 80)
>>> client:send("GET / HTTP/1.0\r\nHost: google.com\r\n\r\n")
>>> while true do
>>>   s, status, partial = client:receive('*a')
>>>   RGWDebugLog(s or partial)
>>>   if status == "closed" then
>>> break
>>>   end
>>> end
>>> client:close()
>>> RGWDebugLog("Tsolo stop lua")
>>>
>>> Next I run:
>>> radosgw-admin script-package add --package=luasocket --allow-compilation
>>>
>>> And then list the added package to make sure it is there:
>>> radosgw-admin script-package list
>>>
>>> Note - at this point the radosgw has not been modified, it must first be
>>> restarted.
>>>
>>> Then I put the *test.lua *script into the pre 

[ceph-users] Re: Radosgw multisite replication issues

2023-05-09 Thread Tarrago, Eli (RIS-BCT)
East and West Clusters have been upgraded to quincy, 17.2.6.

We are still seeing replication failures. Deep diving the logs, I found the 
following interesting items.

What is the best way to continue to troubleshoot this?
What is the curl attempting to fetch, but failing to obtain?
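A few hedged next steps that may help narrow this down (zone and bucket names as in
the output below; the debug levels are just examples):

radosgw-admin sync error list --rgw-zone=rgw-east
ceph config set client.rgw debug_rgw 20
ceph config set client.rgw debug_ms 1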

-
root@east01:~# radosgw-admin bucket sync --bucket=ceph-bucket 
--source-zone=rgw-west run
2023-05-09T15:22:43.582+ 7f197d7fa700  0 WARNING: curl operation 
timed out, network average transfer speed less than 1024 Bytes per second 
during 300 seconds.
2023-05-09T15:22:43.582+ 7f1a48dd9e40  0 data sync: ERROR: failed 
to fetch bucket index status
2023-05-09T15:22:43.582+ 7f1a48dd9e40  0 
RGW-SYNC:bucket[ceph-bucket:ddd66ab8-0417---.93706683.1:119<-ceph-bucket:ddd66ab8-0417---.93706683.93706683.1:119]:
 ERROR: init sync on bucket failed, retcode=-5
2023-05-09T15:24:54.652+ 7f197d7fa700  0 WARNING: curl operation 
timed out, network average transfer speed less than 1024 Bytes per second 
during 300 seconds.
2023-05-09T15:27:05.725+ 7f197d7fa700  0 WARNING: curl operation 
timed out, network average transfer speed less than 1024 Bytes per second 
during 300 seconds.
-

radosgw-admin bucket sync --bucket=ceph-bucket-prd info
  realm 98e0e391- (rgw-blobs)
  zonegroup 0e0faf4e- (WestEastCeph)
   zone ddd66ab8- (rgw-east)
 bucket :ceph-bucket[ddd66ab8-.93706683.1])

source zone b2a4a31c-
 bucket :ceph-bucket[ddd66ab8-.93706683.1])
root@bctlpmultceph01:~# radosgw-admin bucket sync --bucket=ceph-bucket 
status
  realm 98e0e391- (rgw-blobs)
  zonegroup 0e0faf4e- (WestEastCeph)
   zone ddd66ab8- (rgw-east)
 bucket :ceph-bucket[ddd66ab8.93706683.1])

source zone b2a4a31c- (rgw-west)
  source bucket :ceph-bucket[ddd66ab8-.93706683.1])
full sync: 0/120 shards
incremental sync: 120/120 shards
bucket is behind on 112 shards
behind shards: 
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,77,78,80,81,82,83,84,85,86,89,90,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119]


-


2023-05-09T15:46:21.069+ 7f1fc7fff700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20857f2700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f2092ffd700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f2080fe9700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20817ea700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f208b7fe700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f20867f4700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f2086ff5700  0 rgw async rados processor: 
store->fetch_remote_obj() returned r=-5
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per second during 300 
seconds.
2023-05-09T15:46:21.069+ 7f20b12b8700  0 WARNING: curl operation timed out, 
network average transfer speed less than 1024 Bytes per 

[ceph-users] Re: Upgrade Ceph cluster + radosgw from 14.2.18 to latest 15

2023-05-09 Thread Marc
Because pacific has performance issues

> 
> Curious, why not go to Pacific? You can upgrade up to 2 major releases
> in a go.
> 
> 
> The upgrade process to pacific is here:
> https://docs.ceph.com/en/latest/releases/pacific/#upgrading-non-cephadm-
> clusters
> The upgrade to Octopus is here:
> https://docs.ceph.com/en/latest/releases/octopus/#upgrading-from-mimic-
> or-nautilus
> 
> 
>   >
>   > Hi, I want to upgrade my old Ceph cluster + Radosgw from v14 to
> v15. But
>   > I'm not using cephadm and I'm not sure how to limit errors as
> much as
>   > possible during the upgrade process?
> 
>   Maybe check the changelog, check upgrading notes, and continuosly
> monitor the mailing list?
>   I have to do the same upgrade and eg. I need to recreate one
> monitor so it has the rocksdb before upgrading.
> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade Ceph cluster + radosgw from 14.2.18 to latest 15

2023-05-09 Thread Wesley Dillingham
Curious, why not go to Pacific? You can upgrade up to 2 major releases in a
go.

The upgrade process to pacific is here:
https://docs.ceph.com/en/latest/releases/pacific/#upgrading-non-cephadm-clusters
The upgrade to Octopus is here:
https://docs.ceph.com/en/latest/releases/octopus/#upgrading-from-mimic-or-nautilus

Respectfully,

*Wes Dillingham*
w...@wesdillingham.com
LinkedIn 


On Tue, May 9, 2023 at 3:25 AM Marc  wrote:

> >
> > Hi, I want to upgrade my old Ceph cluster + Radosgw from v14 to v15. But
> > I'm not using cephadm and I'm not sure how to limit errors as much as
> > possible during the upgrade process?
>
> Maybe check the changelog, check upgrading notes, and continuosly monitor
> the mailing list?
> I have to do the same upgrade and eg. I need to recreate one monitor so it
> has the rocksdb before upgrading.
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephadm docker registry

2023-05-09 Thread Satish Patel
Folks,

I am trying to install ceph on a 10-node cluster and planning to use
cephadm. My question is: if I add new nodes to this cluster next year,
what docker image version will cephadm use to add the new nodes?

Is there a local registry, or can I create one to copy images locally? How
does cephadm control images?
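For reference, a hedged sketch of how the image can be pinned and pointed at a local
registry (registry URL and credentials are examples; option names as in recent
cephadm releases):

ceph config get mgr mgr/cephadm/container_image_base   # default image for new daemons
ceph cephadm registry-login registry.local:5000 <user> <password>
ceph config set global container_image registry.local:5000/ceph/ceph:v17.2.6
ceph orch upgrade start --image registry.local:5000/ceph/ceph:v17.2.6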
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: client isn't responding to mclientcaps(revoke), pending pAsLsXsFsc issued pAsLsXsFsc

2023-05-09 Thread Frank Schilder
Dear Xiubo,

both issues will cause problems, the one reported in the subject 
(https://tracker.ceph.com/issues/57244) and the potential follow-up on MDS 
restart 
(https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/LYY7TBK63XPR6X6TD7372I2YEPJO2L6F).
 Either one will cause compute jobs on our HPC cluster to hang and users will 
need to run the jobs again. Our queues are full, so it's not very popular to lose
your spot.

The process in D-state is a user process. Interestingly it is often possible to 
kill it despite the D-state (if one can find the process) and the stuck recall 
gets resolved. If I restart the MDS, the stuck process might continue working, 
but we run a significant risk of other processes getting stuck due to the 
libceph/MDS wrong peer issue. We actually have these kind of messages

[Mon Mar  6 12:56:46 2023] libceph: mds1 192.168.32.87:6801 wrong peer at 
address
[Mon Mar  6 13:05:18 2023] libceph: wrong peer, want 
192.168.32.87:6801/-223958753, got
192.168.32.87:6801/-1572619386

all over the HPC cluster and each of them means that some files/dirs are 
inaccessible on the compute node and jobs either died or are/got stuck there. 
Every MDS restart bears the risk of such events happening and with many nodes 
this probability approaches 1 - every time we restart an MDS jobs get stuck.

I have a reproducer for an instance of https://tracker.ceph.com/issues/57244. 
Unfortunately, this is a big one that I would need to pack into a container. I 
was not able to reduce it to something small, it seems to depend on a very 
specific combination of codes with certain internal latencies between threads 
that trigger a race.

It sounds like you have a patch for https://tracker.ceph.com/issues/57244 
although it's not linked from the tracker item.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Xiubo Li 
Sent: Friday, May 5, 2023 2:40 AM
To: Frank Schilder; ceph-users@ceph.io
Subject: Re: [ceph-users] client isn't responding to mclientcaps(revoke), 
pending pAsLsXsFsc issued pAsLsXsFsc


On 5/1/23 17:35, Frank Schilder wrote:
> Hi all,
>
> I think we might be hitting a known problem 
> (https://tracker.ceph.com/issues/57244). I don't want to fail the mds yet, 
> because we have troubles with older kclients that miss the mds restart and 
> hold on to cache entries referring to the killed instance, leading to hanging 
> jobs on our HPC cluster.

Will this cause any issue in your case ?

> I have seen this issue before and there was a process in D-state that 
> dead-locked itself. Usually, killing this process succeeded and resolved the 
> issue. However, this time I can't find such a process.

BTW, what's the D-state process ? A ceph one ?

Thanks

> The tracker mentions that one can delete the file/folder. I have the inode 
> number, but really don't want to start a find on a 1.5PB file system. Is 
> there a better way to find what path is causing the issue (ask the MDS 
> directly, look at a cache dump, or similar)? Is there an alternative to 
> deletion or MDS fail?
>
> Thanks and best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How can I use not-replicated pool (replication 1 or raid-0)

2023-05-09 Thread Frank Schilder
When you say cache device, do you mean a ceph cache pool as a tier to a rep-2 
pool? If so, you might want to reconsider, cache pools are deprecated and will 
be removed from ceph at some point.

If you have funds to buy new drives, you can just as well deploy a beegfs (or 
something else) on these. It is no problem to run ceph and beegfs on the same 
hosts. The disks should not be shared, but that's all. This might still be a 
simpler config than introducing a cache tier just to cover up for rep-2 
overhead.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: mhnx 
Sent: Friday, May 5, 2023 9:26 PM
To: Frank Schilder
Cc: Janne Johansson; Ceph Users
Subject: Re: [ceph-users] Re: How can I use not-replicated pool (replication 1 
or raid-0)

Hello Frank.

>If your only tool is a hammer ...
>Sometimes its worth looking around.

You are absolutely right! But I have limitations because my customer
is a startup and they want to create a hybrid system with current
hardware for all their needs. That's why I'm spending time to find a
work around. They are using cephfs on their Software and I moved them
on this path from NFS. At the beginning they were only looking for a
rep2 pool for their important data and Ceph was an absolutely great
idea. Now the system is running smoothly but they also want to move
the [garbage data] on the same system but as I told you, the data flow
is different and the current hardware (non plp sata ssd's without
bluestore cache) can not supply the required speed with replication 2.
They are happy with replication 1 speed but I'm not because when any
network, disk, or node goes down, the cluster will be suspended due to
rep1.

Now I advised at least adding low latency PCI-Nvme's as a cache device
to force rep2 pool. I will solve the Write latency with PLP low
latency nvme's but still I need to solve deletion speed too. Actually
with the random write-delete I was trying to tell the difference on
delete speed. You are right, /dev/random requires cpu power and it
will create latency and it should not used for write speed tests.

Currently I'm working on development of an automation script to fix
any problem for replication 1 pool.
It is what it is.

Best regards.




Frank Schilder wrote on Wed, 3 May 2023 at 11:50:


>
> Hi mhnx.
>
> > I also agree with you, Ceph is not designed for this kind of use case
> > but I tried to continue what I know.
> If your only tool is a hammer ...
> Sometimes its worth looking around.
>
> While your tests show that a rep-1 pool is faster than a rep-2 pool, the 
> values are not exactly impressive. There are 2 things that are relevant here: 
> ceph is a high latency system, its software stack is quite heavy-weight. Even 
> for a rep-1 pool its doing a lot to ensure data integrity. BeeGFS is a 
> lightweight low-latency system skipping a lot of magic, which makes it very 
> suited for performance critical tasks but less for long-term archival 
> applications.
>
> The second is that the device /dev/urandom is actually very slow (and even 
> unpredictable on some systems, it might wait for more entropy to be created). 
> Your times are almost certainly affected by that. If you want to have 
> comparable and close to native storage performance, create the files you want 
> to write to storage first in RAM and then copy from RAM to storage. Using 
> random data is a good idea to bypass potential built-in accelerations for 
> special data, like all-zeros. However, exclude the random number generator 
> from the benchmark and generate the data first before timing its use.
>
> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
>
> 
> From: mhnx 
> Sent: Tuesday, May 2, 2023 5:25 PM
> To: Frank Schilder
> Cc: Janne Johansson; Ceph Users
> Subject: Re: [ceph-users] Re: How can I use not-replicated pool (replication 
> 1 or raid-0)
>
> Thank you for the explanation Frank.
>
> I also agree with you, Ceph is not designed for this kind of use case
> but I tried to continue what I know.
> My idea was exactly what you described, I was trying to automate
> cleaning or recreating on any failure.
>
> As you can see below, rep1 pool is very fast:
> - Create: time for i in {1..9}; do head -c 1K  >randfile$i; done
> replication 2 : 31m59.917s
> replication 1 : 7m6.046s
> 
> - Delete: time rm -rf testdir/
> replication 2 : 11m56.994s
> replication 1 : 0m40.756s
> -
>
> I started learning DRBD, I will also check BeeGFS thanks for the advice.
>
> Regards.
>
> Frank Schilder wrote on Mon, 1 May 2023 at 10:27:
> >
> > I think you misunderstood Janne's reply. The main statement is at the end, 
> > ceph is not designed for an "I don't care about data" use case. If you need 
> > speed for temporary data where you can sustain data 

[ceph-users] Re: Change in DMARC handling for the list

2023-05-09 Thread Frank Schilder
Dear Dan,

I'm one of the users for whom this is an on-off experience. I had a period 
where everything worked fine only to get bad again; see my reply from October 
25 2022 to the dev-thread "Ceph Leadership Team meeting 2022-09-14". Over the 
last few days I had a similar experience. For one day, I think Friday/Saturday,
all ceph-user messages made it to my inbox. Since Sunday I have to pull them 
out of MS quarantine again.

They are usually reported as violating some sender authentication scheme. 
Unfortunately, since our e-mail service was moved to a cloud service I can't 
extract the real reason for quarantine any more, it just says "phishing policy", 
which usually means something along the lines of sender could not be verified.

It would be great if you could get this working for everyone, also for the 
unfortunate souls who have to live with artificially intelligent microsoft 
policies.

Thanks and best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Dan Mick 
Sent: Friday, May 5, 2023 1:46 AM
To: ceph-users
Subject: [ceph-users] Change in DMARC handling for the list

Several users have complained for some time that our DMARC/DKIM handling
is not correct.  I've recently had time to go study DMARC, DKIM, SPF,
SRS, and other tasty morsels of initialisms, and have thus made a change
to how Mailman handles DKIM signatures for the list:

If a domain advertises that it will reject or quarantine messages that
fail DKIM (through its DMARC policy in the DNS text record
_dmarc.<domain>), the message will be rewritten to be "From" ceph.io,
and SPF should be correct.  I do not know if it will regenerate a DKIM
signature in that case for what is now its own message.  The From:
address will say something like "From Original Sender via ceph-users
<ceph-users@ceph.io>" so it's somewhat clear who first sent the message,
and Reply-To will be set to Original Sender.

Again, this will only happen for senders from domains that advertise a
strict DMARC policy.  This does not include gmail.com, surprisingly.
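A domain's published policy can be checked with something like this (the reject
example is illustrative):

dig +short TXT _dmarc.example.com   # e.g. "v=DMARC1; p=reject" would trigger the rewrite
dig +short TXT _dmarc.gmail.com     # gmail publishes p=none, i.e. no strict policy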

Let me know if you notice anything that seems to have gotten worse.

Next on the list is to investigate if DKIM-signing outbound messages, or
at least ones that don't already have an ARC-Seal, is appropriate and/or
workable.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade Ceph cluster + radosgw from 14.2.18 to latest 15

2023-05-09 Thread Marc
> 
> Hi, I want to upgrade my old Ceph cluster + Radosgw from v14 to v15. But
> I'm not using cephadm and I'm not sure how to limit errors as much as
> possible during the upgrade process?

Maybe check the changelog, check upgrading notes, and continuosly monitor the 
mailing list?
I have to do the same upgrade and eg. I need to recreate one monitor so it has 
the rocksdb before upgrading.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: non root deploy ceph 17.2.5 failed

2023-05-09 Thread Eugen Block

And I just tried with docker as well, works too.

Zitat von Eugen Block :


Hi,

I just retried without the single-host option and it worked. Also  
everything under /tmp/var belongs to root in my case. Unfortunately,  
I can't use the curl-based cephadm but the contents are identical, I  
compared. Not sure what it could be at the moment.


Zitat von Ben :


Hi, It is uos v20(with kernel 4.19), one linux distribution among others.
no matter since cephadm deploys things in containers by default. cephadm is
pulled by curl from Quincy branch of github.

I think you could see some sort of errors if you remove parameter
--single-host-defaults.

More investigation shows it looks like a bug in cephadm.
During the deploy procedure,
/tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new
is created remotely through a sudo ssh session (so it is owned by root), while
/tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/ is changed to be owned
by the ssh user deployer. The correct thing to do instead is to change
/tmp/var/ to the owner deployer recursively, so that the subsequent scp has
the access permission it needs.
I will see if I have time to wire up a PR to fix it.
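
For anyone hitting this before a fix lands, a minimal manual workaround
sketch (untested, assuming the fsid/staging path from above and that the
bootstrap is simply retried afterwards):

  # on the target host, as root: hand the cephadm staging dir to the ssh user
  chown -R deployer:deployer /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9
  # then retry the bootstrap as the deployer user

This only illustrates the ownership change described above; it is not the
proper fix in cephadm itself.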

Thanks for help on this.
Ben


Eugen Block  wrote on Monday, May 8, 2023 at 21:01:


Hi,

could you provide some more details about your host OS? Which cephadm
version is it? I was able to bootstrap a one-node cluster with both
17.2.5 and 17.2.6 with a non-root user with no such error on openSUSE
Leap 15.4:

quincy:~ # rpm -qa | grep cephadm
cephadm-17.2.6.248+gad656d572cb-lp154.2.1.noarch

deployer@quincy:~> sudo cephadm --image quay.io/ceph/ceph:v17.2.5
bootstrap --mon-ip 172.17.2.3 --skip-monitoring-stack --ssh-user
deployer --single-host-defaults
Verifying ssh connectivity ...
Adding key to deployer@localhost authorized_keys...
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman (/usr/bin/podman) version 4.4.4 is present
[...]
Ceph version: ceph version 17.2.5
(98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
[...]
Adding key to deployer@localhost authorized_keys...
Adding host quincy...
Deploying mon service with default placement...
Deploying mgr service with default placement...
[...]
Bootstrap complete.

Quoting Ben :


Hi,

with following command:

sudo cephadm  --docker bootstrap --mon-ip 10.1.32.33

--skip-monitoring-stack

  --ssh-user deployer
the user deployer has passwordless sudo configuration.
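
For reference, the passwordless sudo setup mentioned here is just the usual
one-line sudoers drop-in; a sketch, assuming the account is literally named
deployer:

  # edit with: visudo -f /etc/sudoers.d/deployer
  deployer ALL=(ALL) NOPASSWD: ALL
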
I can see the error below:

debug 2023-05-04T12:46:43.268+ 7fc5ddc2e700  0 [cephadm ERROR
cephadm.ssh] Unable to write
szhyf-xx1d002-hx15w:/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e:
scp: /tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new:
Permission denied

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/ssh.py", line 222, in _write_remote_file
    await asyncssh.scp(f.name, (conn, tmp_path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
    await source.run(srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
    await self._send_files(path, b'')
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in _send_files
    self.handle_error(exc)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in handle_error
    raise exc from None
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in _send_files
    await self._send_file(srcpath, dstpath, attrs)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in _send_file
    await self._make_cd_request(b'C', attrs, size, srcpath)
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in _make_cd_request
    self._fs.basename(path))
  File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in make_request
    raise exc

Any ideas on this?

Thanks,
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: non root deploy ceph 17.2.5 failed

2023-05-09 Thread Eugen Block

Hi,

I just retried without the single-host option and it worked. Also  
everything under /tmp/var belongs to root in my case. Unfortunately, I  
can't use the curl-based cephadm but the contents are identical, I  
compared. Not sure what it could be at the moment.


Quoting Ben :


Hi, it is UOS v20 (with kernel 4.19), one Linux distribution among others.
It shouldn't matter much, since cephadm deploys everything in containers by
default. cephadm is pulled by curl from the Quincy branch on GitHub.

I think you might see some sort of error if you removed the
--single-host-defaults parameter.

More investigation shows it looks like a bug in cephadm.
During the deploy procedure,
/tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new
is created remotely through a sudo ssh session (so it is owned by root), while
/tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/ is changed to be owned
by the ssh user deployer. The correct thing to do instead is to change
/tmp/var/ to the owner deployer recursively, so that the subsequent scp has
the access permission it needs.
I will see if I have time to wire up a PR to fix it.

Thanks for help on this.
Ben


Eugen Block  wrote on Monday, May 8, 2023 at 21:01:


Hi,

could you provide some more details about your host OS? Which cephadm
version is it? I was able to bootstrap a one-node cluster with both
17.2.5 and 17.2.6 with a non-root user with no such error on openSUSE
Leap 15.4:

quincy:~ # rpm -qa | grep cephadm
cephadm-17.2.6.248+gad656d572cb-lp154.2.1.noarch

deployer@quincy:~> sudo cephadm --image quay.io/ceph/ceph:v17.2.5
bootstrap --mon-ip 172.17.2.3 --skip-monitoring-stack --ssh-user
deployer --single-host-defaults
Verifying ssh connectivity ...
Adding key to deployer@localhost authorized_keys...
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman (/usr/bin/podman) version 4.4.4 is present
[...]
Ceph version: ceph version 17.2.5
(98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Waiting for mon to start...
Waiting for mon...
mon is available
[...]
Adding key to deployer@localhost authorized_keys...
Adding host quincy...
Deploying mon service with default placement...
Deploying mgr service with default placement...
[...]
Bootstrap complete.

Quoting Ben :

> Hi,
>
> with following command:
>
> sudo cephadm  --docker bootstrap --mon-ip 10.1.32.33
--skip-monitoring-stack
>   --ssh-user deployer
> the user deployer has passwordless sudo configuration.
> I can see the error below:
>
> debug 2023-05-04T12:46:43.268+ 7fc5ddc2e700  0 [cephadm ERROR
> cephadm.ssh] Unable to write
>
szhyf-xx1d002-hx15w:/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e:
> scp:
>
/tmp/var/lib/ceph/ad3a132e-e9ee-11ed-8a19-043f72fb8bf9/cephadm.059bfc99f5cf36ed881f2494b104711faf4cbf5fc86a9594423cc105cafd9b4e.new:
> Permission denied
>
> Traceback (most recent call last):
>
>   File "/usr/share/ceph/mgr/cephadm/ssh.py", line 222, in
_write_remote_file
>
> await asyncssh.scp(f.name, (conn, tmp_path))
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 922, in scp
>
> await source.run(srcpath)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 458, in run
>
> self.handle_error(exc)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
> handle_error
>
> raise exc from None
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 456, in run
>
> await self._send_files(path, b'')
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 438, in
> _send_files
>
> self.handle_error(exc)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 307, in
> handle_error
>
> raise exc from None
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 434, in
> _send_files
>
> await self._send_file(srcpath, dstpath, attrs)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 365, in
> _send_file
>
> await self._make_cd_request(b'C', attrs, size, srcpath)
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 343, in
> _make_cd_request
>
> self._fs.basename(path))
>
>   File "/lib/python3.6/site-packages/asyncssh/scp.py", line 224, in
> make_request
>
> raise exc
>
> Any ideas on this?
>
> Thanks,
> Ben
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io