[ceph-users] Re: Issue with CephFS (mds stuck in clientreplay status) since upgrade to 18.2.0.

2023-11-27 Thread Lo Re Giuseppe
me host "monitor-02"? Regards, *David CASIER* On Mon, 27 Nov 2023 at 10:09, Lo Re Giuseppe <giuseppe.l...@cscs.ch> wrote: > Hi, > We have upgraded one ceph cluster from 17.2.7 to 18.2.0. Since then we

[ceph-users] Issue with CephFS (mds stuck in clientreplay status) since upgrade to 18.2.0.

2023-11-27 Thread Lo Re Giuseppe
Hi, We have upgraded one ceph cluster from 17.2.7 to 18.2.0. Since then we are having CephFS issues. For example this morning:
“””
[root@naret-monitor01 ~]# ceph -s
  cluster:
    id:     63334166-d991-11eb-99de-40a6b72108d0
    health: HEALTH_WARN
            1 filesystem is degraded
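
Since the preview cuts off after the health warning, here is a minimal diagnostic sketch of the commands typically used to see which MDS rank is stuck in clientreplay and what it is still waiting on; the filesystem name "cephfs" and the MDS daemon name are assumptions, not taken from the thread:
“””
# Show MDS ranks and their states (look for a rank stuck in "clientreplay")
ceph fs status cephfs
ceph health detail
# Ask the stuck MDS which operations it is still replaying (daemon name is a placeholder)
ceph tell mds.<mds-daemon-name> dump_ops_in_flight
“””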

[ceph-users] Re: librbd 4k read/write?

2023-08-11 Thread Lo Re Giuseppe
Hi, In the case of a cluster where most pools use erasure code 4+2, what would you consider as the value for cluster_size? Giuseppe On 10.08.23, 21:06, "Zakhar Kirpichenko" <zak...@gmail.com> wrote: Hi, You can use the following formula to roughly calculate the IOPS you can get
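
The preview ends just before the formula itself. A commonly used back-of-the-envelope estimate (my paraphrase, not necessarily the exact formula Zakhar posted) divides the aggregate drive IOPS by the write amplification, which for an EC k+m pool is roughly k+m; a sketch with assumed numbers:
“””
# Assumptions: 100 write IOPS per HDD, 720 HDDs, EC 4+2 (write penalty ~6)
echo $(( 720 * 100 / 6 ))   # ~12000 aggregate 4k write IOPS, a very rough upper bound
“””
For replicated pools the same estimate would divide by the replication size instead of k+m.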

[ceph-users] Re: Upgrade from 16.2.7. to 16.2.11 failing on OSDs

2023-03-30 Thread Lo Re Giuseppe
disk each, so the issue seems really to be about the number of devices and nodes. Regards, Giuseppe On 30.03.23, 16:56, "Lo Re Giuseppe" <giuseppe.l...@cscs.ch> wrote: Dear all, On one of our clusters I started the upgrade process from 16.2.7 to 16.2.11. Mon and mgr

[ceph-users] Upgrade from 16.2.7. to 16.2.11 failing on OSDs

2023-03-30 Thread Lo Re Giuseppe
Dear all, On one of our clusters I started the upgrade process from 16.2.7 to 16.2.11. The mon, mgr and crash processes were done easily and quickly, then at the first attempt to upgrade an OSD container the upgrade process stopped because the OSD process is not able to start after the upgrade.
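
A minimal sketch of how a stalled cephadm upgrade like this is usually inspected; "osd.0" is a placeholder for whichever daemon failed to start:
“””
ceph orch upgrade status              # target image and current progress/message
ceph orch ps --daemon-type osd        # which OSD containers are in an error state
ceph orch upgrade pause               # stop cephadm from retrying while investigating
# On the host running the failing daemon, dump its container journal:
cephadm logs --name osd.0
“””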

[ceph-users] Mds crash at cscs

2023-01-19 Thread Lo Re Giuseppe
Dear all, We have started to use cephfs more intensively for some wlcg related workloads. We have 3 active mds instances spread across 3 servers, mds_cache_memory_limit=12G, and most of the other configs are the default ones. One of them crashed last night, leaving the log below. Do you have any hint on
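
For reference, a sketch of how the setting mentioned above is applied and how crash reports are pulled afterwards; the crash id is a placeholder:
“””
ceph config set mds mds_cache_memory_limit 12884901888   # 12 GiB, the value quoted in the post
ceph config get mds mds_cache_memory_limit               # verify the active value
ceph crash ls                                             # list recorded daemon crashes
ceph crash info <crash-id>                                # full backtrace of a specific crash
“””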

[ceph-users] Re: MGR failures and pg autoscaler

2022-10-25 Thread Lo Re Giuseppe
-4b53e28d-2f59-11ed-8aa5-9aa9e2c5aae2","vol_name":"cephfs"}]: dispatch
debug 2022-10-25T05:06:08.884+ 7f4106e90700 -1 Traceback (most recent call last):
  File "/usr/share/ceph/mgr/progress/module.py", line 716, in serve
    self._process_pg_summary()
  File "/

[ceph-users] MGR failures and pg autoscaler

2022-10-25 Thread Lo Re Giuseppe
Hi, Some weeks ago we started to use pg autoscale on our pools. We run version 16.2.7. Maybe a coincidence, maybe not, but since some weeks we have been experiencing mgr progress module failures:
“””
[root@naret-monitor01 ~]# ceph -s
  cluster:
    id:     63334166-d991-11eb-99de-40a6b72108d0
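
A sketch of the checks commonly used when the progress module starts throwing tracebacks on 16.2.x; whether they clear this particular failure is an assumption:
“””
ceph osd pool autoscale-status          # what the pg autoscaler is currently doing per pool
ceph health detail                      # shows the "Module 'progress' has failed" message, if any
ceph progress off && ceph progress on   # reset the progress module's internal state
ceph mgr fail                           # fail over to a standby mgr as a heavier-handed reset
“””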

[ceph-users] Re: Upgrade from v15.2.16 to v16.2.7 not starting

2022-05-19 Thread Lo Re Giuseppe
again the upgrade procedure with cephadm and test if this time it starts... Giuseppe On 18.05.22, 14:19, "Eugen Block" wrote: Do you see anything suspicious in /var/log/ceph/cephadm.log? Also check the mgr logs for any hints. Quoting Lo Re Giuseppe: > Hi,

[ceph-users] Re: S3 and RBD backup

2022-05-19 Thread Lo Re Giuseppe
: Thursday, 19 May 2022 at 09:41 To: Lo Re Giuseppe, stéphane chalansonnet Cc: "ceph-users@ceph.io" Subject: Re: [ceph-users] Re: S3 and RBD backup Hi Giuseppe, Thanks for your suggestion. Could you please elaborate more on the term "exporting bucket as NFS share"? How you are

[ceph-users] Re: Upgrade from v15.2.16 to v16.2.7 not starting

2022-05-19 Thread Lo Re Giuseppe
wonder if this behaviour of mgr/cephadm is itself wrong and might cause the stall of the upgrade. Thanks, Giuseppe On 18.05.22, 14:19, "Eugen Block" wrote: Do you see anything suspicious in /var/log/ceph/cephadm.log? Also check the mgr logs for any hints. Quoting Lo

[ceph-users] Re: S3 and RBD backup

2022-05-19 Thread Lo Re Giuseppe
Hi, We are doing exactly the same: exporting the bucket as an NFS share and running our backup software on it to get the data to tape. Given the data volumes, replication to another S3 disk-based endpoint is not viable for us. Regards, Giuseppe On 18.05.22, 23:14, "stéphane chalansonnet" wrote: Hello,
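
For context, a hedged sketch of one way an RGW bucket can be exposed over NFS with the cephadm-managed nfs-ganesha service; the cluster id, bucket name, hosts and paths are placeholders, and the exact flags vary between Ceph releases:
“””
ceph nfs cluster create backup-nfs "host1,host2"     # deploy an nfs-ganesha cluster
ceph nfs export create rgw --cluster-id backup-nfs --pseudo-path /tape-staging --bucket my-bucket
mount -t nfs4 host1:/tape-staging /mnt/tape-staging  # mount it where the backup software runs
“””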

[ceph-users] Upgrade from v15.2.16 to v16.2.7 not starting

2022-05-18 Thread Lo Re Giuseppe
Hi, We have happily tested the upgrade from v15.2.16 to v16.2.7 with cephadm on a test cluster made of 3 nodes and everything went smoothly. Today we started the very same operation on the production one (20 OSD servers, 720 HDDs) and the upgrade process doesn’t do anything at all… To be more
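
A minimal sketch of the commands typically used to see why ceph orch upgrade appears idle; the assumption is that the upgrade was started as shown in the first line:
“””
ceph orch upgrade start --ceph-version 16.2.7   # the operation discussed in this thread
ceph orch upgrade status                        # target image, in_progress flag and last message
ceph -W cephadm                                 # watch cephadm events from the active mgr live
ceph mgr fail                                   # failing over the mgr often un-sticks the orchestrator
“””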

[ceph-users] Re: RBD map issue

2022-02-14 Thread Lo Re Giuseppe
Unfortunately nothing. Is there any way to make it more verbose? On 14.02.22, 11:48, "Eugen Block" wrote: What does 'dmesg' reveal? Quoting Lo Re Giuseppe: > root@fulen-w006:~# ll client.fulen.keyring > -rw-r--r-- 1 root root 69 Feb 11 15:30 clie

[ceph-users] Re: RBD map issue

2022-02-11 Thread Lo Re Giuseppe
of the client keyring on both systems? Quoting Lo Re Giuseppe: > Hi, > > It's a single ceph cluster, I'm testing from 2 different client nodes. > The caps are below. > I think it is unlikely that the caps are the cause as they work from one > client node,

[ceph-users] Re: RBD map issue

2022-02-11 Thread Lo Re Giuseppe
both clusters and redact sensitive information. Quoting Lo Re Giuseppe: > Hi all, > > This is my first post to this user group, I’m not a ceph expert, > sorry if I say/ask anything trivial. > > On a Kubernetes cluster I have an issue in creatin

[ceph-users] RBD map issue

2022-02-11 Thread Lo Re Giuseppe
Hi all, This is my first post to this user group, I’m not a ceph expert, sorry if I say/ask anything trivial. On a Kubernetes cluster I have an issue creating volumes from a (csi) ceph EC pool. I can reproduce the problem from the rbd CLI like this from one of the k8s worker nodes: “””
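
Since the pasted reproduction is cut off, here is a sketch of what a manual rbd test against an EC-backed pool typically looks like; the pool and image names are placeholders, while the client id "fulen" comes from the keyring mentioned later in the thread:
“””
# The EC data pool must allow overwrites for RBD to work on it
ceph osd pool set fulen-ec-data allow_ec_overwrites true
# Image metadata lives in a replicated pool, data goes to the EC pool
rbd create --size 1G --data-pool fulen-ec-data fulen-meta/testimg --id fulen --keyring ./client.fulen.keyring
# Map it with the same credentials; a failure usually leaves a hint in dmesg
rbd map fulen-meta/testimg --id fulen --keyring ./client.fulen.keyring
dmesg | tail
“””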