[ceph-users] How to recover from an MDs rank in state 'failed'

2024-05-29 Thread Noe P.
Hi,

after our disaster yesterday, it seems that we got our MONs back.
One of the filesystems, however, seems in a strange state:

  % ceph fs status

  
  fs_cluster - 782 clients
  ==
  RANK  STATE   MDS       ACTIVITY     DNS    INOS   DIRS   CAPS
   0    active  cephmd6a  Reqs: 5 /s   13.2M  13.2M  1425k  51.4k
   1    failed
        POOL         TYPE     USED   AVAIL
  fs_cluster_meta  metadata  3594G  53.5T
  fs_cluster_data    data     421T  53.5T
  
  STANDBY MDS
cephmd6b
cephmd4b
  MDS version: ceph version 17.2.7 (b12291d110049b2f35e32e0de30d70e9a4c060d2) 
quincy (stable)


  % ceph fs dump
  
  Filesystem 'fs_cluster' (3)
  fs_name fs_cluster
  epoch   3068261
  flags   12 joinable allow_snaps allow_multimds_snaps
  created 2022-08-26T15:55:07.186477+0200
  modified        2024-05-29T12:43:30.606431+0200
  tableserver 0
  root0
  session_timeout 60
  session_autoclose   300
  max_file_size   4398046511104
  required_client_features{}
  last_failure0
  last_failure_osd_epoch  1777109
  compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable 
ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses 
versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no 
anchor table,9=file layout v2,10=snaprealm v2}
  max_mds 2
  in  0,1
  up  {0=911794623}
  failed
  damaged
  stopped 2,3
  data_pools  [32]
  metadata_pool   33
  inline_data disabled
  balancer
  standby_count_wanted1
  [mds.cephmd6a{0:911794623} state up:active seq 44701 addr 
[v2:10.13.5.6:6800/189084355,v1:10.13.5.6:6801/189084355] compat 
{c=[1],r=[1],i=[7ff]}]


We would like to get rid of the failed rank 1 (without crashing the MONs)
and have a second MDS from the standbys step in.

Does anyone have an idea how to do this?
I'm a bit reluctant to try 'ceph mds rmfailed', as this seems to have
triggered the MONs to crash.

Regards,
  Noe
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to recover from an MDs rank in state 'failed'

2024-05-29 Thread Eugen Block

I'm not really sure either, what about this?

ceph mds repaired 

The docs state:

Mark the file system rank as repaired. Unlike the name suggests,  
this command does not change a MDS; it manipulates the file system  
rank which has been marked damaged.


Maybe that could bring it back up? Did you set max_mds to 1 at some  
point? If you do it now (and you currently have only one active MDS),  
maybe that would clean up the failed rank as well?
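
For reference, a minimal sketch of the two commands being discussed here
(assuming the affected filesystem and rank are fs_cluster:1; adjust to your
setup, this is only an illustration, not a tested recovery procedure):

   # clear a rank that the MDSMap considers damaged
   ceph mds repaired fs_cluster:1

   # or shrink back to a single active MDS so that rank 1 is no longer expected
   ceph fs set fs_cluster max_mds 1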



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Error EINVAL: check-host failed - Failed to add host

2024-05-29 Thread Suryanarayana Raju
Hello,

Please help: this is a Ceph cluster deployed using Docker.

I am using the command below for the bootstrap, with the provided private and public keys:

cephadm -v bootstrap --mon-ip  --allow-overwrite --ssh-private-key 
id_ed25519 --ssh-public-key id_ed25519.pub

I am able to SSH directly with the id_ed25519 key.


RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host 
--stop-signal=SIGTERM --ulimit nofile=1048576 --net=host --entrypoint 
/usr/bin/ceph --init -e CONTAINER_IMAGE=quay.io/ceph/ceph:v18 -e 
NODE_NAME= -e CEPH_USE_RANDOM_NONCE=1 -v 
/var/log/ceph/85c044a8-1d82-11ef-9ea1-a73759ab75e5:/var/log/ceph:z -v 
/tmp/ceph-tmp1sw6a5s0:/etc/ceph/ceph.client.admin.keyring:z -v 
/tmp/ceph-tmpethxlwxr:/etc/ceph/ceph.conf:z quay.io/ceph/ceph:v18 orch host add 
 : Error EINVAL: check-host failed:
Unable to write 
:/var/lib/ceph/85c044a8-1d82-11ef-9ea1-a73759ab75e5/cephadm.2b9d7d139a9cb40289f2358faf49a109fc297c0a258bde893227c262c30bca8d:
 Session request failed

I also validated the following, and it was successful:

> ceph cephadm get-ssh-config > ssh_config
> ceph config-key get mgr/cephadm/ssh_identity_key > key
> ssh -F ssh_config -i key root@

---
root@:~/.ssh# ceph orch host add  
Error EINVAL: check-host failed:
Unable to write 
:/var/lib/ceph/fe8ecd30-1da2-11ef-9ea1-a73759ab75e5/cephadm.2b9d7d139a9cb40289f2358faf49a109fc297c0a258bde893227c262c30bca8d:
 Session request failed
root@:~/.ssh# 


Thanks,
Surya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to recover from an MDs rank in state 'failed'

2024-05-29 Thread Noe P.



On Wed, 29 May 2024, Eugen Block wrote:

> I'm not really sure either, what about this?
>
> ceph mds repaired 

I think that works only for 'damaged' MDS ranks.

N.

> The docs state:
>
> >Mark the file system rank as repaired. Unlike the name suggests, this command
> >does not change a MDS; it manipulates the file system rank which has been
> >marked damaged.
>
> Maybe that could bring it back up? Did you set max_mds to 1 at some point? If
> you do it now (and you currently have only one active MDS), maybe that would
> clean up the failed rank as well?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Bug Found in Reef Releases - Action Required for pg-upmap-primary Interface Users

2024-05-29 Thread Laura Flores
Dear Ceph Users,

We have discovered a bug with the pg-upmap-primary interface (related to
the offline read balancer [1]) that affects all Reef releases.

In all Reef versions, users are required to set
`require-min-compat-client=reef` in order to use the pg-upmap-primary
interface to prevent pre-reef clients from connecting and not understanding
the new interface. We found this setting is simply not enforced [2], which
leads to miscommunication between older and newer peers or, depending on
version, to an assert in the mons and/or osds [3]. However, the fundamental
precondition is making use of the new `pg-upmap-primary` feature.

If you have not yet upgraded to v18.2.2, we recommend that you refrain from
upgrading to v18.2.2 until a later version is out with a fix. We also
recommend removing any existing pg-upmap-primary mappings to prevent
hitting the assert [3], as well as to prevent any miscommunication between
older and newer peers about pg primaries [2].
Remove mappings by:
$ `ceph osd dump`
For each pg_upmap_primary entry in the above output:
$ `ceph osd rm-pg-upmap-primary `
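
A scripted version of the same cleanup, as a sketch only (it assumes the plain
`ceph osd dump` output prints one "pg_upmap_primary <pgid> <osd>" line per
mapping; please verify the line format on your cluster before running it):

$ ceph osd dump | awk '/^pg_upmap_primary/ {print $2}' | \
    while read pgid; do ceph osd rm-pg-upmap-primary "$pgid"; done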

If you have already upgraded to v18.2.2, your cluster is more likely to hit
the osd/mon assert [3] when you set a `pg-upmap-primary` mapping (this
would involve explicitly setting a mapping via the osdmaptool or the CLI
command). As long as you refrain from setting any pg-upmap-primary
mappings, your cluster will NOT be affected by [3].

Follow the trackers below for further updates.

1. pg-upmap-primary documentation:
https://docs.ceph.com/en/reef/rados/operations/read-balancer/
2. mon, osd, *: require-min-compat-client is not really honored -
https://tracker.ceph.com/issues/66260
3. Failed assert "pg_upmap_primaries.empty()" in the read balancer
 - https://tracker.ceph.com/issues/61948

Thanks,
Laura Flores

-- 

Laura Flores

She/Her/Hers

Software Engineer, Ceph Storage 

Chicago, IL

lflo...@ibm.com | lflo...@redhat.com 
M: +17087388804
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Bug Found in Reef Releases - Action Required for pg-upmap-primary Interface Users

2024-05-29 Thread Fox, Kevin M
How do you know if it's safe to set `require-min-compat-client=reef` if you
have kernel clients?

Thanks,
Kevin


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-29 Thread Mazzystr
Each simultaneously.  These are SATA3 drives with 128 MB of cache.  The bus is
6 Gb/s.  I expect utilization to be in the 90+% range, not the 50% range.

On Mon, May 27, 2024 at 5:37 PM Anthony D'Atri  wrote:

>
>
>
>   hdd iops on the three discs hover around 80 +/- 5.
>
>
> Each or total?  I wouldn’t expect much more than 80 per drive.
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Lousy recovery for mclock and reef

2024-05-29 Thread Anthony D'Atri


> 
> Each simultaneously.  These are SATA3 with 128mb cache drives.

Turn off the cache.
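
For example (a sketch only; /dev/sdX is a placeholder, and whether this helps
depends on the drive and HBA):

  # show the current volatile write cache setting
  smartctl -g wcache /dev/sdX
  # turn the volatile write cache off
  hdparm -W 0 /dev/sdX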

>  The bus is 6 gb/s.  I expect usage to be in the 90+% range not the 50% range.

"usage" as measured how?

> 
> On Mon, May 27, 2024 at 5:37 PM Anthony D'Atri  wrote:
> 
>> 
>> 
>> 
>>  hdd iops on the three discs hover around 80 +/- 5.
>> 
>> 
>> Each or total?  I wouldn’t expect much more than 80 per drive.
>> 
>> 
>> 
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Ceph Reef v18.2.3 - release date?

2024-05-29 Thread Peter Razumovsky
Hello! We're waiting for the new minor release 18.2.3 because of 
https://github.com/ceph/ceph/pull/56004. Why? Timing is a tough thing in our 
work. Could you kindly share an estimate of the 18.2.3 release timeframe? It 
has been 16 days since the original tag was created, and I want to understand 
when it will be released so we can plan the upgrade.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RadosGW Multisite Zonegroup redirect

2024-05-29 Thread rabe
Hi folks,

I am currently in the process of building a multisite cluster (v18.2.2) with 
two Ceph clusters (cluster1 and cluster2). Each cluster is its own zone and has 
its own zonegroup. The idea is to sync metadata but not data. The user should 
select the zonegroup (region) via the location constraint. I figured out how to 
set up the two zonegroups, but now the following problem appears:
When I create a bucket via the cluster2 RGW with the cluster1 zonegroup as the 
target, my MinIO client shows the following error:
mc mb test/test1 --region cluster1
mc:  Unable to make bucket `test/test1`. The specified 
location-constraint is not valid

The cluster1 RGW shows the following error:
s3:create_bucket location constraint (cluster1) doesn't match zonegroup 
(cluster2)

The cluster2 RGW shows no logs.

When I do the same with the AWS CLI, I get no error, but the RGW seems to 
ignore the location constraint and just creates the bucket in cluster2:
aws s3api --endpoint http://cluster2 --profile test create-bucket --bucket 
test2 --region cluster1
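
One way to double-check where such a bucket actually landed seems to be the
bucket metadata (a sketch, assuming the zonegroup id is part of the stats
output on this version):
radosgw-admin bucket stats --bucket test2 | grep zonegroup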

When I copy a file to a bucket in cluster2 via the cluster1 RGW, I get the 
following error on the cluster1 RGW:
op->ERRORHANDLER: err_no=-2024 new_err_no=-2024

The period is synced on both clusters and the RGWs have been restarted:

{
"id": "7189ba3c-c7de-4ea2-8d14-307e4355a9e3",
"epoch": 6,
"predecessor_uuid": "0253dec1-d804-470f-92b6-90a64ffb5548",
"sync_status": [],
"period_map": {
"id": "7189ba3c-c7de-4ea2-8d14-307e4355a9e3",
"zonegroups": [
{
"id": "4daa58d4-3eb5-4e59-bffe-e929f64c70b5",
"name": "cluster1",
"api_name": "cluster1",
"is_master": true,
"endpoints": [
"http://172.31.4.12:80";
],
"hostnames": [
"172.31.4.12",
"172.31.4.16"
],
"hostnames_s3website": [],
"master_zone": "ae3aa0e9-c580-410a-8ed6-2f01657a06f8",
"zones": [
{
"id": "ae3aa0e9-c580-410a-8ed6-2f01657a06f8",
"name": "cluster1-a",
"endpoints": [
"http://172.31.4.12:80";
],
"log_meta": false,
"log_data": false,
"bucket_index_max_shards": 11,
"read_only": false,
"tier_type": "",
"sync_from_all": true,
"sync_from": [],
"redirect_zone": "",
"supported_features": [
"compress-encrypted",
"resharding"
]
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "5e8f4256-cf89-40db-b2a9-bb717f980b4f",
"sync_policy": {
"groups": []
},
"enabled_features": [
"resharding"
]
},
{
"id": "8d1d8050-0dea-4c5a-9bc3-d6631ef132f8",
"name": "cluster2",
"api_name": "cluster2",
"is_master": false,
"endpoints": [
"http://172.31.4.16:80";
],
"hostnames": [
"172.31.4.12",
"172.31.4.16"
],
"hostnames_s3website": [],
"master_zone": "2a882ccc-b540-4de2-abe3-f026fccb9aef",
"zones": [
{
"id": "2a882ccc-b540-4de2-abe3-f026fccb9aef",
"name": "cluster2-a",
"endpoints": [
"http://172.31.4.16:80";
],
"log_meta": false,
"log_data": false,
"bucket_index_max_shards": 11,
"read_only": false,
"tier_type": "",
"sync_from_all": true,
"sync_from": [],
"redirect_zone": "",
"supported_features": [
"compress-encrypted",
"resharding"
]
}
],
"placement_targets": [
{
"name": "default-placement",

[ceph-users] MDS crashing

2024-05-29 Thread Johan

Hi,

I have a small cluster with 11 OSDs and 4 filesystems. Each server 
(Debian 11, Ceph 17.2.7) usually runs several services.


After troubles with a host holding OSDs, I removed the OSDs and let the 
cluster repair itself (x3 replica). After a while it returned to a 
healthy state and everything was well. This might not be important for 
what followed, but I mention it just in case.


A couple of days later one of the MDS services gave a health warning. The 
first one was (2024-05-28 10:02)

  mds.cloudfs.stugan6.ywuomz(mds.0): 1 slow requests are blocked > 30 secs

followed by filesystem being degraded (2024-05-28 10:22)
  fs cloudfs is degraded
The other filesystems have been marked degraded from time to time, but those 
warnings later cleared.


at 2024-05-28 10:28
  daemon mds.mediafs.stugan7.zzxavs on stugan7 is in error state
  daemon mds.cloudfs.stugan7.cmjbun on stugan7 is in error state
at 2024-05-28 10:33
  daemon mds.cloudfs.stugan4.qxwzox on stugan4 is in error state
  daemon mds.cloudfs.stugan5.hbkkad on stugan5 is in error state
  daemon mds.oxylfs.stugan7.iazekf on stugan7 is in error state

MDS services kept on crashing ...

I put the OSDs on pause and set the nodown, noout, nobackfill, norebalance 
and norecover flags; at present only the flags remain set, as I have been 
trying to get the system up and running again.
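
For reference, those were roughly the following commands (a sketch):

   ceph osd pause
   ceph osd set nodown
   ceph osd set noout
   ceph osd set nobackfill
   ceph osd set norebalance
   ceph osd set norecover

with 'ceph osd unpause' and 'ceph osd unset ...' to revert them later.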




While the OSDs were paused, I could 'clear up' the mess and remove all 
services in the error state. The monitors and managers seem to function 
well. I could also get the MDS services running again. BUT when I removed 
the pause from the OSDs, the MDS services once again started to go into 
the error state.


For now I have removed the mds label from all Ceph servers, and therefore it 
has calmed down. But if I let the services be recreated, the crashes will 
start over again.


If I check the filesystem status (I have marked them down for now), 
cloudfs looks strange...


oxylfs - 0 clients
==
  POOL TYPE USED  AVAIL
oxylfs_metadata  metadata   154M  20.8T
  oxylfs_data  data1827G  20.8T
cloudfs - 0 clients
===
RANK  STATE          MDS                      ACTIVITY  DNS  INOS  DIRS  CAPS
 0    replay(laggy)  backupfs.stugan6.bgcltx            0    0     0     0

  POOL  TYPE USED  AVAIL
cloudfs_metadata  metadata   337M  20.8T
  cloudfs_data  data 356G  20.8T
mediafs - 0 clients
===
  POOL  TYPE USED  AVAIL
mediafs_metadata  metadata  66.0M  20.8T
  mediafs_data  data2465G  20.8T
backupfs - 0 clients

POOLTYPE USED  AVAIL
backupfsnew_metadata  metadata   221M  20.8T
  backupfsnew_data  data8740G  20.8T
MDS version: ceph version 17.2.7 
(b12291d110049b2f35e32e0de30d70e9a4c060d2) quincy (stable)



Why is the MDS (in the error state) for the backupfs filesystem shown under 
the cloudfs filesystem?



Now... Is there a way back to normal?

/Johan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: stretched cluster new pool and second pool with nvme

2024-05-29 Thread ronny.lippold

Hi Stefan ... I did the next step and need your help.

My idea was to stretch the cluster without stretch mode, so we decided 
to reserve a size of 4 on each side.


The setup is the same as in stretch mode, including the crush rule, location, 
election_strategy and tie breaker.
Only "ceph mon enable_stretch_mode e stretch_rule datacenter" wasn't run.


Now in my test I created a split brain and expected that, on the remaining 
side, the cluster would rebuild the 4 replicas.

But that did not happen.
Actually the cluster is doing the same thing as with stretch mode enabled: 
it stays writeable with 2 replicas.


Can you explain to me why? I'm going around in circles.

This is the status during the split brain:

##
pve-test02-01:~# ceph -s
  cluster:
id: 376fcdef-bba0-4e58-b63e-c9754dc948fa
health: HEALTH_WARN
6/13 mons down, quorum 
pve-test01-01,pve-test01-03,pve-test01-05,pve-test02-01,pve-test02-03,pve-test02-05,tie-breaker

1 datacenter (8 osds) down
8 osds down
6 hosts (8 osds) down
Degraded data redundancy: 2116/4232 objects degraded 
(50.000%), 95 pgs degraded, 113 pgs undersized


  services:
mon: 13 daemons, quorum 
pve-test01-01,pve-test01-03,pve-test01-05,pve-test02-01,pve-test02-03,pve-test02-05,tie-breaker 
(age 54m), out of quorum: pve-test01-02, pve-test01-04, pve-test01-06, 
pve-test02-02, pve-test02-04, pve-test02-06
mgr: pve-test02-05(active, since 53m), standbys: pve-test01-05, 
pve-test01-01, pve-test01-03, pve-test02-01, pve-test02-03

mds: 1/1 daemons up, 1 standby
osd: 16 osds: 8 up (since 54m), 16 in (since 77m)

  data:
volumes: 1/1 healthy
pools:   5 pools, 113 pgs
objects: 1.06k objects, 3.9 GiB
usage:   9.7 GiB used, 580 GiB / 590 GiB avail
pgs: 2116/4232 objects degraded (50.000%)
 95 active+undersized+degraded
 18 active+undersized

  io:
client:   17 KiB/s wr, 0 op/s rd, 10 op/s wr
##
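
(side note: the settings that should explain the behaviour can be dumped like 
this, a sketch with a placeholder pool name:

   ceph osd pool get <pool> size
   ceph osd pool get <pool> min_size
   ceph osd crush rule dump stretch_rule
)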

thanks a lot,
ronny

On 2024-04-30 11:42, Stefan Kooman wrote:

On 30-04-2024 11:22, ronny.lippold wrote:

hi stefan ... you are the hero of the month ;)


:p.



I don't know why I did not find your bug report.

I have the exact same problem and resolved the HEALTH warning only with "ceph 
osd force_healthy_stretch_mode --yes-i-really-mean-it".

I will comment on the report soon.

Actually, we are thinking about a 4/2 size without stretch mode enabled.

What was your solution?


This specific setup (on which I did the testing) is going to be full 
flash (SSD). So the HDDs are going to be phased out. And only the 
default non-device-class crush rule will be used. While that will work 
for this (small) cluster, it is not a solution. This issue should be 
fixed, as I figure there are quite a few clusters that want to use 
device-classes and use stretch mode at the same time.


Gr. Stefan

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problems adding a new host via orchestration. (solved)

2024-05-29 Thread mcpherson
Hi everyone.

I had the very same problem, and I believe I've figured out what is happening.  
Many admins advise using "ufw limit ssh" to help protect your system against 
brute-force password guessing.  Well, "ceph orch host add" makes multiple ssh 
connections very quickly and triggers the ufw limit.  I switched to "ufw allow 
ssh" and everything works perfectly.
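
For anyone hitting the same thing, this is essentially all it took (a sketch; 
the exact syntax may differ slightly between ufw versions):

  # check whether a LIMIT rule is active for ssh
  ufw status verbose
  # replace the rate-limited rule with a plain allow
  ufw delete limit ssh
  ufw allow ssh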

Mike
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] rgw can't find zone

2024-05-29 Thread stephan . budach
I am running two Ceph clusters on 18.2.2. On both clusters, I initially 
configured rgw using the following config from a yaml file:

service_type: rgw
service_id: s3.jvm.gh79
placement:
label: rgw
count_per_host: 1
spec:
rgw_realm: jvm
rgw_zone: gh79
ssl: true
rgw_frontend_port: 9443
rgw_frontend_type: beast
rgw_frontend_ssl_certificate: |

Note that I didn't configure anything else regarding realms, zonegroups or 
zones. I then modified the tags on 3 of my cephnodes to include rgw, and the 
services got installed, configured and started. I am pretty sure that I did 
this after I updated Ceph to 18.2.2. A couple of days ago, one of my cephnodes 
hosting an RGW rebooted and the RGW service didn't come up again. When I looked 
at the logs I saw that the RGW service had an error when looking up the zone, 
which seems to be non-existent at this point:

Mai 23 12:43:00 bceph06 radosgw[3608419]: deferred set uid:gid to 167:167 
(ceph:ceph)
Mai 23 12:43:00 bceph06 radosgw[3608419]: ceph version 18.2.2 
(531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable), process radosgw, pid 2
Mai 23 12:43:00 bceph06 radosgw[3608419]: framework: beast
Mai 23 12:43:00 bceph06 radosgw[3608419]: framework conf key: ssl_port, val: 
9443
Mai 23 12:43:00 bceph06 radosgw[3608419]: framework conf key: ssl_certificate, 
val: config://rgw/cert/rgw.s3.jvm.gh79
Mai 23 12:43:00 bceph06 radosgw[3608419]: init_numa not setting numa affinity
Mai 23 12:43:01 bceph06 radosgw[3608419]: rgw main: ERROR: could not find zone 
(gh79)
Mai 23 12:43:01 bceph06 radosgw[3608419]: rgw main: ERROR: failed to start 
notify service ((2) No such file or directory
Mai 23 12:43:01 bceph06 radosgw[3608419]: rgw main: ERROR: failed to init 
services (ret=(2) No such file or directory)
Mai 23 12:43:01 bceph06 
ceph-5ccaf98c-ec2a-11ee-8293-bc2411f9733d-rgw-s3-jvm-gh79-bceph06-rwkczj[3608415]:
 2024-05-23T10:43:01.014+ 7fe0b87b3a80 -1 Couldn't init storage provider 
(RADOS)
Mai 23 12:43:01 bceph06 radosgw[3608419]: Couldn't init storage provider (RADOS)

The rados zone configs look like this at the moment:

root@cephnode01:/# radosgw-admin zone list
{
"default_info": "e67d13fc-8343-4c88-9b1f-a8286f46f44a",
"zones": [
"default"
]
}

So, why did it work in the first place and what would be the easiest way of 
resolving this?
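
For completeness, this is how I would compare what the spec expects (realm jvm, 
zone gh79) with what actually exists, as a sketch:

root@cephnode01:/# radosgw-admin realm list
root@cephnode01:/# radosgw-admin zonegroup list
root@cephnode01:/# radosgw-admin zone list
root@cephnode01:/# radosgw-admin period get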
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] CephFS HA: mgr finish mon failed to return metadata for mds

2024-05-29 Thread kvesligaj
Hi,

we have a stretched cluster (Reef 18.2.1) with 5 nodes (2 nodes on each side + 
witness). You can see our daemon placement below.

[admin]
ceph-admin01 labels="['_admin', 'mon', 'mgr']"

[nodes]
[DC1]
ceph-node01 labels="['mon', 'mgr', 'mds', 'osd']"
ceph-node02 labels="['mon', 'rgw', 'mds', 'osd']"
[DC2]
ceph-node03 labels="['mon', 'mgr', 'mds', 'osd']"
ceph-node04 labels="['mon', 'rgw', 'mds', 'osd']"

We have been testing CephFS HA and noticed that when the active MDS (we have 
two active MDS daemons at all times) and the active MGR (the MGR is either on 
the admin node or in one of the DCs) are in the same DC and we shut down that 
site, we have a problem where the metadata of one of the MDS daemons can't be 
retrieved, showing up in the logs as:

"mgr finish mon failed to return metadata for mds"

After we turn that site back on, the problem persists and the metadata of the 
MDS in question still can't be retrieved with "ceph mds metadata".

After I manually fail the MDS daemon in question with "ceph mds fail", the 
problem is solved and I can retrieve the MDS metadata.
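
For clarity, the check and the workaround look roughly like this (a sketch, 
with a placeholder daemon name):

# cannot retrieve the metadata while the problem is present
ceph mds metadata <mds-daemon-name>
# manually failing the daemon makes the metadata retrievable again
ceph mds fail <mds-daemon-name>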

My question is: could this be related to the following bug 
(https://tracker.ceph.com/issues/63166)? I can see that it is shown as 
backported to 18.2.1, but I can't find it in the release notes for Reef.

The second question is whether this should work in the current configuration 
at all, since the MDS and MGR are both disconnected from the rest of the 
cluster at the same moment.

And the final question would be: what would be the solution here, and is there 
any loss of data when this happens?

Any help is appreciated.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cephadm quincy 17.2.5 always shows slowops in all OSDs and Ceph orch stuck

2024-05-29 Thread pahrialtkj
Hi all,

I'm running a Quincy 17.2.5 Ceph cluster with 3 nodes that each have 3 disks, 
with replica size 3 and min_size 2. My cluster was running fine before, and 
suddenly it can't read or write because of Ceph slow ops on all OSDs. I tried 
restarting the OSDs; then degraded and misplaced data appeared, and the slow 
ops reappeared. Any suggestions to solve this issue?

# ceph -s
  cluster:
id: bb6590a9-155c-4830-aea7-ef4b5a098466
health: HEALTH_ERR
2 failed cephadm daemon(s)
1/282019 objects unfound (0.000%)
noout flag(s) set
Reduced data availability: 79 pgs inactive, 47 pgs peering
Possible data damage: 1 pg recovery_unfound
Degraded data redundancy: 26340/838103 objects degraded (3.143%), 
16 pgs degraded, 11 pgs undersized
302 slow ops, oldest one blocked for 226773 sec, daemons 
[osd.2,osd.3,osd.4,osd.6,osd.7] have slow ops.

  services:
mon: 3 daemons, quorum r2c2,r2c3,r2c1 (age 2h)
mgr: r2c3(active, since 15m), standbys: r2c2, r2c1.ivwibd
osd: 9 osds: 9 up (since 9m), 9 in (since 3d); 30 remapped pgs
 flags noout

  data:
pools:   12 pools, 329 pgs
objects: 282.02k objects, 1.3 TiB
usage:   4.9 TiB used, 94 TiB / 99 TiB avail
pgs: 24.924% pgs not active
 26340/838103 objects degraded (3.143%)
 5411/838103 objects misplaced (0.646%)
 1/282019 objects unfound (0.000%)
 240 active+clean
 31  peering
 23  activating
 16  remapped+peering
 7   activating+undersized+degraded+remapped
 2   activating+remapped
 2   active+undersized+degraded+remapped+backfilling
 2   active+recovering+degraded
 2   activating+degraded
 1   active+recovery_unfound+undersized+degraded+remapped
 1   activating+degraded+remapped
 1   active+clean+scrubbing+deep
 1   active+recovery_wait+undersized+degraded+remapped

  io:
client:   12 KiB/s rd, 1.3 KiB/s wr, 7 op/s rd, 1 op/s wr

Thanks,
Pahrial
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: tuning for backup target cluster

2024-05-29 Thread Darren Soothill
So a few questions I have around this.

What is the network you have for this cluster?

Changing the bluestore_min_alloc_size would be the last thing I would even 
consider. In fact I wouldn't change it at all, as you would be in untested territory.

The challenge with making this sort of thing perform is to generate lots of 
parallel streams, so whatever is doing the uploading needs to be doing parallel 
multipart uploads. There is no mention of the uploading code that is being used.

So with 7 nodes each with 12 disks, and doing large files like this, I would 
expect to see 50-70 MB/s per usable HDD. By usable I mean that if you are doing 
replicas you would divide the number of disks by the replica count, or in your 
case with EC I would divide the number of disks by the EC size and multiply by 
the data part. So divide by 6 and multiply by 4.

So allowing for EC overhead you could in theory get beyond 2.8 GB/s. That is 
the theoretical disk limit I would be looking to exceed.
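
As a quick back-of-the-envelope check (assuming 7 nodes x 12 HDDs, EC 4+2 and 
50-70 MB/s per usable drive):

  # 84 drives / 6 (k+m) * 4 (data chunks) = 56 "data" drives
  echo "$(( 7 * 12 / 6 * 4 * 50 )) - $(( 7 * 12 / 6 * 4 * 70 )) MB/s"
  # prints 2800 - 3920 MB/s, i.e. roughly 2.8-3.9 GB/s aggregate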

So now the question is whether you have enough streams running in parallel. 
Have you tried a benchmarking tool such as MinIO Warp to see what it can 
achieve?

You haven't mentioned the number of PGs you have for each of the pools in 
question. You need to ensure that every pool that is being used has more PGs 
than the number of disks. If that's not the case, then individual disks could 
be slowing things down.

You also have the metadata pools used by RGW that ideally need to be on NVME.

Because you are using EC, there is also the buckets.non-ec pool, which is used 
to manage the OMAPs for multipart uploads; this usually sits at only 8 PGs, and 
that will be limiting things as well.
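
A quick way to sanity-check the PG counts (a sketch; pool names differ per 
cluster):

  # compare each pool's pg_num with the number of OSDs
  ceph osd pool ls detail
  # or let the autoscaler show its view of things
  ceph osd pool autoscale-status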



Darren Soothill

Want a meeting with me: https://calendar.app.google/MUdgrLEa7jSba3du9

Looking for help with your Ceph cluster? Contact us at https://croit.io/
 
croit GmbH, Freseniusstr. 31h, 81247 Munich 
CEO: Martin Verges - VAT-ID: DE310638492 
Com. register: Amtsgericht Munich HRB 231263 
Web: https://croit.io/ | YouTube: https://goo.gl/PGE1Bx




> On 25 May 2024, at 14:56, Anthony D'Atri  wrote:
> 
> 
> 
>> Hi Everyone,
>> 
>> I'm putting together a HDD cluster with an ECC pool dedicated to the backup
>> environment. Traffic via s3. Version 18.2,  7 OSD nodes, 12 * 12TB HDD +
>> 1NVME each,
> 
> QLC, man.  QLC.  That said, I hope you're going to use that single NVMe SSD 
> for at least the index pool.  Is this a chassis with universal slots, or is 
> that NVMe device maybe M.2 or rear-cage?
> 
>> Wondering if there is some general guidance for startup setup/tuning in
>> regards to s3 object size.
> 
> Small objects are the devil of any object storage system.
> 
> 
>> Files are read from fast storage (SSD/NVME) and
>> written to s3. Files sizes are 10MB-1TB, so it's not standard s3. traffic.
> 
> Nothing nonstandard about that, though your 1TB objects presumably are going 
> to be MPU.  Having the .buckets.non-ec pool on HDD with objects that large 
> might be really slow to assemble them, you might need to increase timeouts 
> but I'm speculating.
> 
> 
>> Backup for big files took hours to complete.
> 
> Spinners gotta spin.  They're a false economy.
> 
>> My first shot would be to increase default bluestore_min_alloc_size_hdd, to
>> reduce the number of stored objects, but I'm not sure if it's a
>> good direccion?  
> 
> With that workload you *could* increase that to like 64KB, but I don't think 
> it'd gain you much.  
> 
> 
>> Any other parameters worth checking to support such a
>> traffic pattern?
> 
> `ceph df`
> `ceph osd dump | grep pool`
> 
> So we can see what's going on HDD and what's on NVMe.
> 
>> 
>> Thanks!
>> 
>> -- 
>> Łukasz
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] We are using ceph octopus environment. For client can we use ceph quincy?

2024-05-29 Thread s . dhivagar . cse
We are using a Ceph Octopus environment. For the clients, can we use Ceph Quincy?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Unable to Install librados2 18.2.0 on RHEL 7 from Ceph Repository

2024-05-29 Thread abdel . douichi
The Ceph repository at https://download.ceph.com/ does not seem to have the 
librados2 package version 18.2.0 for RHEL 7. The directory  
https://download.ceph.com/rpm-18.2.0/el7/ is empty, and the specific 
package URL 
https://download.ceph.com/rpm-18.2.0/el7/x86_64/librados2-18.2.0-0.el7.x86_64.rpm
 returns a 404 error.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RBD Mirror - implicit snapshot cleanup

2024-05-29 Thread scott . cairns
With RBD Mirroring configured from Site A to Site B, we typically see 2 
snapshots on the source side (current and 1 previous) and 1 snapshot on the 
destination side (most recent snapshot from source) - this has been the case 
for over a year now.

When testing a DR scenario (i.e. demoting the primary, promoting the secondary, 
testing, resyncing, demoting the secondary, promoting the primary) we end up 
with a lot more snapshots on each side - currently around 6 on the source 
side (demoted image, then current and 4 previous) and 4 on the destination side 
(original snapshot, then snapshot from the promote, demote, then the current 
snapshot from source).

Is it possible to clear these existing snapshots, or is there some sort of 
schedule that'll automatically purge them, so that we return to 2 snapshots on 
source and 1 snapshot on destination?
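
For reference, this is how the per-image snapshots can be listed to see what is 
actually left over (a sketch with placeholder pool/image names):

rbd snap ls --all <pool>/<image>
rbd mirror image status <pool>/<image>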

Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RBD Mirror - Failed to unlink peer

2024-05-29 Thread scott . cairns
I've had rbd mirror deployed for over a year copying images from Site A to Site 
B for DR.

After introducing an additional node into the cluster a few weeks back and 
creating OSD's on this node (no mon/mgr) we've started to see snapshots 
throwing an error 'librbd::mirror::snapshot::CreatePrimaryRequest: 
0x7f24fc001bf0 handle_unlink_peer: failed to unlink peer: (2) No such file or 
directory'

However, the snapshot still completes (provides a Snapshot ID) and RBD mirror 
still pulls the snapshot on the remote end, so it appears to be working as 
expected, however every snapshot is generating this error.

I can't see anything else logged when this error is generated. It has 
progressively got worse since the start (on the first day we saw around 2 
errors on 150 snapshots; it's now about 145 errors on 150 snapshots).

The only other reference I can see to this is from 2 years ago and was related 
to a bug which was fixed a while ago.

I'm running Ceph 17.2.7.

Any idea what file/directory it's having issues with?

Thanks.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Bug Found in Reef Releases - Action Required for pg-upmap-primary Interface Users

2024-05-29 Thread Radoslaw Zarzynski
> How do you know if it's safe to set `require-min-compat-client=reef` if you
> have kernel clients?

At the moment it is not enforced when it comes to accepting connections
from clients. There is an ongoing discussion about what was intended and
what the contract in librados finally became.

Regardless of this, upmap-primary genuinely requires clients
to understand the concept. It shouldn't be turned on if an operator is
not sure that older clients can be neglected. What is connected at a given
moment can be dumped with the mon's `sessions` command.
I would expect the feature to become "universally useful" only a long time
from now, together with the proliferation of modern-enough kernels.
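
(As a sketch, assuming a mon id of "a", the connected clients and their
feature bits can be inspected with:

  ceph tell mon.a sessions
  ceph features
)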

For right now upmap-primary has several defects, also around version
checks, and the best thing is simply not to enable it (by adding mappings).
If mappings are already present, it's recommended to purge them.
Another reason for that is that incompatible clients may get exposed (by
failing connection attempts) when the version / feature bit enforcement
gets fixed.

Regards,
Radek



On Wed, May 29, 2024 at 5:36 PM Fox, Kevin M  wrote:
>
> How do you know if its safe to set `require-min-compat-client=reef` if you 
> have kernel clients?
>
> Thanks,
> Kevin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: tuning for backup target cluster

2024-05-29 Thread Anthony D'Atri



> You also have the metadata pools used by RGW that ideally need to be on NVME.

The OP seems to intend shared NVMe for WAL+DB, so that the omaps are on NVMe 
that way.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



[ceph-users] About placement group scrubbing state

2024-05-29 Thread Phong Tran Thanh
Hi everyone,

I want to ask about the placement group scrubbing state. My cluster has too
many PGs in the scrubbing state, and the number is increasing over time; maybe
scrubbing is taking too long.

I'm using the Reef version, and the cluster itself reports no problem:
root@n1s1:~# ceph health detail
HEALTH_OK

I want to ask: if PGs stay in the scrubbing state for too long, what could be
the matter with my cluster? A scrubbing error, or another problem?
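
For context, this is roughly how the scrubbing PGs and the relevant settings
can be checked (a sketch):

root@n1s1:~# ceph pg dump pgs | grep scrubbing | wc -l
root@n1s1:~# ceph config get osd osd_max_scrubs
root@n1s1:~# ceph config get osd osd_scrub_sleep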

Thanks to the community.



*Tran Thanh Phong*

Email: tranphong...@gmail.com
Skype: tranphong079
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to Install librados2 18.2.0 on RHEL 7 from Ceph Repository

2024-05-29 Thread Konstantin Shalygin
Hi,

The last release for EL7 is Octopus (version 15); you are trying to fetch version 18.
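
If you really need librados2 on EL7, something like this should point yum at
the last EL7 builds (a sketch, assuming the rpm-octopus el7 tree is still
published on download.ceph.com):

cat > /etc/yum.repos.d/ceph-octopus.repo <<'EOF'
[ceph-octopus]
name=Ceph Octopus for el7
baseurl=https://download.ceph.com/rpm-octopus/el7/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
EOF
yum install librados2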


k
Sent from my iPhone

> On 29 May 2024, at 22:34, abdel.doui...@gmail.com wrote:
> 
> The Ceph repository at https://download.ceph.com/ does not seem to have the 
> librados2 package version 18.2.0 for RHEL 7. The directory  
> https://download.ceph.com/rpm-18.2.0/el7/ is empty, and the specific 
> package URL 
> https://download.ceph.com/rpm-18.2.0/el7/x86_64/librados2-18.2.0-0.el7.x86_64.rpm
>  returns a 404 error.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Error EINVAL: check-host failed - Failed to add host

2024-05-29 Thread isnraju26
Please give some suggestions on this issue.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Missing ceph data

2024-05-29 Thread Prabu GJ
Hi Team,


We are using the Ceph Octopus version with a total disk size of 136 TB, 
configured with two replicas. Currently, our usage is 57 TB, and the available 
size is 5.3 TB. An incident occurred yesterday where around 3 TB of data was 
deleted automatically. Upon analysis, we couldn't find the reason for the 
deletion. All OSDs are functioning properly and actively running.

We have 3 MDS daemons, and we tried restarting all MDS services. Is there any 
solution to recover that data? Can anyone please help us find the issue?





  cluster:
    id:     0d605d58-5caf-4f76-b6bd-e12402a22296
    health: HEALTH_WARN
            insufficient standby MDS daemons available
            5 nearfull osd(s)
            3 pool(s) nearfull
            1 pool(s) have non-power-of-two pg_num

  services:
    mon: 4 daemons, quorum download-mon3,download-mon4,download-mon1,download-mon2 (age 14h)
    mgr: download-mon2(active, since 14h), standbys: download-mon1, download-mon3
    mds: integdownload:2 {0=download-mds3=up:active,1=download-mds1=up:active}
    osd: 39 osds: 39 up (since 16h), 39 in (since 4d)

  data:
    pools:   3 pools, 1087 pgs
    objects: 71.76M objects, 51 TiB
    usage:   105 TiB used, 31 TiB / 136 TiB avail
    pgs:     1087 active+clean

  io:
    client:   414 MiB/s rd, 219 MiB/s wr, 513 op/s rd, 1.22k op/s wr



ID  HOST            USED   AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
 0  download-osd1   2995G   581G      14    4785k       6    6626k  exists,up
 1  download-osd2   2578G   998G      84    3644k      18    10.1M  exists,up
 2  download-osd3   3093G   483G      17    5114k       5    4152k  exists,nearfull,up
 3  download-osd4   2757G   819G      12     996k       2    4107k  exists,up
 4  download-osd5   2889G   687G      28    3355k      20    8660k  exists,up
 5  download-osd6   2448G  1128G     183    3312k      10    9435k  exists,up
 6  download-osd7   2814G   762G       7    1667k       4    6354k  exists,up
 7  download-osd8   2872G   703G      14    1672k      15    10.5M  exists,up
 8  download-osd9   2577G   999G      10    6615k       3    6960k  exists,up
 9  download-osd10  2651G   924G      16    4736k       3    7378k  exists,up
10  download-osd11  2889G   687G      15    4810k       6    8980k  exists,up
11  download-osd12  2912G   664G      11    2516k       2    4106k  exists,up
12  download-osd13  2785G   791G      74    4643k      11    3717k  exists,up
13  download-osd14  3150G   426G     214    6133k       4    7389k  exists,nearfull,up
14  download-osd15  2728G   848G      11    4959k       4    6603k  exists,up
15  download-osd16  2682G   894G      13    3170k       3    2503k  exists,up
16  download-osd17  2555G  1021G      53    2183k       7    5058k  exists,up
17  download-osd18  3013G   563G      18    3497k       3    4427k  exists,up
18  download-osd19  2924G   651G      24    3534k      12    10.4M  exists,up
19  download-osd20  3003G   573G      19    5149k       3    2531k  exists,up
20  download-osd21  2757G   819G      16    3707k       9    9816k  exists,up
21  download-osd22  2576G   999G      15    2526k       8    7739k  exists,up
22  download-osd23  2758G   818G      13    4412k      16    7125k  exists,up
23  download-osd24  2862G   714G      18    4424k       6    5787k  exists,up
24  download-osd25  2792G   783G      16    1972k       9    9749k  exists,up
25  download-osd26  2397G  1179G      14    4296k       9    12.0M  exists,up
26  download-osd27  2308G  1267G       8    3149k      22    6280k  exists,up
27  download-osd29  2732G   844G      12    3357k       3    7372k  exists,up
28  download-osd28  2814G   761G      11     476k       5    3316k  exists,up
29  download-osd30  3069G   507G      15    9043k      17    5628k  exists,nearfull,up
30  download-osd31  2660G   916G      15     841k      14    7798k  exists,up
31  download-osd32  2037G  1539G      10    1153k      15    3719k  exists,up
32  download-osd33  3116G   460G      20    7704k      12    9041k  exists,nearfull,up
33  download-osd34  2847G   728G      19    5788k       4    9014k  exists,up
34  download-osd35  3088G   488G      17    7178k       7    5730k  exists,nearfull,up
35  download-osd36  2414G  1161G      27    2017k      14    7612k  exists,up
36  download-osd37  2760G   815G      17    4292k       5    10.6M  exists,up
37  download-osd38  2679G   897G      12    2610k       5    10.0M  exists,up
38  download-osd39  3013G   563G      18    1804k       7    9235k  exists,up

[ceph-users] MDs stuck in rejoin with '[ERR] : loaded dup inode'

2024-05-29 Thread Noe P.


Hi,

I'm still unable to get our filesystem back.
I now have this:

fs_cluster - 0 clients
==
RANK  STATE   MDS       ACTIVITY  DNS    INOS   DIRS   CAPS
 0    rejoin  cephmd4b            90.0k  89.4k  14.7k     0
 1    rejoin  cephmd6b            105k   105k   21.3k     0
 2    failed
      POOL         TYPE     USED  AVAIL
 fs_cluster_meta  metadata   288G  55.2T
 fs_cluster_data    data     421T  55.2T


I still cannot get rid of the 3rd, failed rank, and the other two currently
stay in the rejoin state forever. After all clients were stopped, the log
complains about a 'dup inode':

  2024-05-30T07:59:46.252+0200 7f2fe9146700 -1 log_channel(cluster) log [ERR] :
  loaded dup inode 0x1001710ea1d [12bc6a,head] v1432525092 at
  /homes/YYY/ZZZ/.bash_history-21032.tmp, but inode 0x1001710ea1d.head
  v1432525109 already exists at /homes/YYY/ZZZ/.bash_history

Questions:
 - Is there a way to scan/repair the metadata without any MDS in the 'active' state?

 - Is there a way to remove (or otherwise fix) the inode in question, given the
   above inode number?

 - Is the 'rejoin' state due to the inode error or because of that 3rd rank?


Regards,
  N.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: We are using ceph octopus environment. For client can we use ceph quincy?

2024-05-29 Thread Robert Sander

On 5/27/24 09:28, s.dhivagar@gmail.com wrote:

We are using ceph octopus environment. For client can we use ceph quincy?


Yes.

--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Rebalance OSDs after adding disks?

2024-05-29 Thread tpDev Tester

Hi,


I have been curious about Ceph for a long time, and now I have started to 
experiment to find out how it works. The idea I like most is that Ceph 
can provide growing storage without the need to move from storage x to 
storage y on the consumer side.


I started with a 3-node cluster where each node got one OSD (2x 500 GB, 
1x 1 TB). I created the pool and the filesystem, mounted the filesystem 
and filled it with data. All three OSDs filled up evenly, and when the 
pool reached 70% I added two 1 TB OSDs (one on each of the nodes with a 
500 GB OSD). Now I expected some activity to rebalance the fill level of 
the OSDs, but nothing special happened. The new OSDs got some data, but 
the 500 GB OSDs ran into 95%, the pool reached 100%, and everything got 
stuck with 2 OSDs filled only about 20% and 80% of their space left free.


I searched the documentation for how to initiate the 
rebalance/redistribution by hand, but was unable to find anything.


Can someone please point me to the docs on how I can expand the capacity of 
the pool without such problems?
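
For reference, these are the commands that appear to be relevant for inspecting 
the distribution (a sketch; I am not sure whether they are the intended way to 
trigger a rebalance):

   ceph osd df tree
   ceph balancer status
   ceph osd reweight-by-utilization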



Thanks in advance

Thomas
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Rebalance OSDs after adding disks?

2024-05-29 Thread Robert Sander

On 5/30/24 08:53, tpDev Tester wrote:

Can someone please point me to the docs how I can expand the capacity of 
the pool without such problems.


Please show the output of

ceph status

ceph df

ceph osd df tree

ceph osd crush rule dump

ceph osd pool ls detail


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Amtsgericht Berlin-Charlottenburg - HRB 220009 B
Geschäftsführer: Peer Heinlein - Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io