[ceph-users] label or pseudo name for cephfs volume path

2024-05-10 Thread Adiga, Anantha
Hi,

If a ceph fs subvolume has to be recreated, the uuid in its path will change and we then have to update every source that references the volume path.

Is there a way to assign a label/tag to the volume path that can be used for pv_root_path, so that we do not have to change any references? Or is there any other suggestion to address this need?


Spec containing subvolume path:
k8s_rook_pv:
  - name: inventory-pv
    subvolumegroup: cephfs_data_pool_ec21_subvolumegroup
    subvolume: cluster_inventory_subvolume
    data_pool: cephfs_data_pool_ec21
    size: 10Gi
    pvc_name: prometheus-inventory-pvc
    pv_root_path: /volumes/cephfs_data_pool_ec21_subvolumegroup/cluster_inventory_subvolume/ecfdf977-1fdb-4474-b8dd-c3bedb42620e
    mode: ReadWriteMany
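
One possible workaround (a rough sketch only, untested) is to resolve the UUID-bearing path at deployment time with "ceph fs subvolume getpath" and substitute the result into pv_root_path, instead of hard-coding it, e.g.

    # resolve the current subvolume path at deploy time; the volume/subvolume/group
    # names here are the ones from the spec above
    pv_root_path=$(ceph fs subvolume getpath cephfs cluster_inventory_subvolume cephfs_data_pool_ec21_subvolumegroup)

That way only the generated manifest carries the UUID, and recreating the subvolume just means re-running the templating step.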


Thank you,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph status not showing correct monitor services

2024-04-03 Thread Adiga, Anantha

Removed the config setting for mon.a001s016.
 
Here it is 
# ceph config get mon container_image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586
# ceph config get osd container_image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586
# ceph config get mgr mgr/cephadm/container_image_base
docker.io/ceph/daemon

Thank you,
Anantha

-Original Message-
From: Eugen Block  
Sent: Wednesday, April 3, 2024 12:27 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph status not showing correct monitor services

I have no idea what you did there ;-) I would remove that config though and 
rather configure the ceph image globally; there have been several issues when 
cephadm tries to launch daemons with different ceph versions. Although in your 
case they appear to be the same image according to the digest (and also in the 
ceph orch ps output), it might cause some trouble anyway, so I'd recommend 
removing the individual config for mon.a001s016 and only using the global 
config. Can you add these outputs (mask sensitive data)?

ceph config get mon container_image
ceph config get osd container_image
ceph config get mgr mgr/cephadm/container_image_base
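
A minimal sketch of the cleanup suggested above (assuming the override really is only the container_image entry shown in the config dump):

    # drop the per-daemon override
    ceph config rm mon.a001s016 container_image
    # optionally pin the image globally instead, so all daemons agree on one image
    ceph config set global container_image docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586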

Zitat von "Adiga, Anantha" :

> Hi Eugen,
>
>
> Noticed this in the config dump:  Why  only   "mon.a001s016 "  
> listed?And this is the one that is not listed in "ceph -s"
>
>
>   mon  advanced   
> auth_allow_insecure_global_id_reclaim  false
>   mon  advanced   
> auth_expose_insecure_global_id_reclaim false
>   mon  advanced   
> mon_compact_on_start   true
> mon.a001s016   basic container_image  
> 
> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586
>
> *
>   mgr  advanced   
> mgr/cephadm/container_image_base   docker.io/ceph/daemon
>   mgr  advanced   
> mgr/cephadm/container_image_node_exporter   
> docker.io/prom/node-exporter:v0.17.0
>
>
>   cluster:
> id: 604d56db-2fab-45db-a9ea-c418f9a8cca8
> health: HEALTH_OK
>
>   services:
> mon: 2 daemons, quorum a001s018,a001s017 (age 45h)
> mgr: a001s016.ctmoay(active, since 28h), standbys: a001s017.bpygfm
> mds: 1/1 daemons up, 2 standby
> osd: 36 osds: 36 up (since 29h), 36 in (since 2y)
> rgw: 3 daemons active (3 hosts, 1 zones)
>
> var lib mon unit.image
> 
>
> a001s016:
> # cat
> /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s016/unit.i
> mage
> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a
> 2cb7ff7668f776b61b9d586
>
> a001s017:
> # cat
> /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s017/unit.i
> mage
> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a
> 2cb7ff7668f776b61b9d586
> a001s018:
> # cat
> /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s018/unit.i
> mage
> docker.io/ceph/daemon:latest-pacific
>
> ceph image tag, digest from docker inspect of:  ceph/daemon   
> latest-pacific   6e73176320aa   2 years ago 1.27GB
> ==
> 
> a001s016:
> "Id":  
> "sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
> "RepoTags": [
> "ceph/daemon:latest-pacific"
> "RepoDigests": [
>  
> "ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
>
> a001s017:
> "Id":  
> "sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
> "RepoTags": [
> "ceph/daemon:latest-pacific"
> "RepoDigests": [
>  
> "ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
>
> a001s018:
> "Id":  
> "sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
> "RepoTags": [
> "ceph/daemon:latest-pacific"
> "RepoDigests": [
>  
> "ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
>
> -Original Message-
> From: Adiga, Anantha
> Sent: Tuesday, April 2, 2024 10:42 AM

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-02 Thread Adiga, Anantha
Hi Eugen,


Noticed this in the config dump: why is only "mon.a001s016" listed? And that is 
the one that is not listed in "ceph -s".
 

mon            advanced  auth_allow_insecure_global_id_reclaim      false
mon            advanced  auth_expose_insecure_global_id_reclaim     false
mon            advanced  mon_compact_on_start                       true
mon.a001s016   basic     container_image                            docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586  *
mgr            advanced  mgr/cephadm/container_image_base           docker.io/ceph/daemon
mgr            advanced  mgr/cephadm/container_image_node_exporter  docker.io/prom/node-exporter:v0.17.0



  cluster:
id: 604d56db-2fab-45db-a9ea-c418f9a8cca8
health: HEALTH_OK

  services:
mon: 2 daemons, quorum a001s018,a001s017 (age 45h)
mgr: a001s016.ctmoay(active, since 28h), standbys: a001s017.bpygfm
mds: 1/1 daemons up, 2 standby
osd: 36 osds: 36 up (since 29h), 36 in (since 2y)
rgw: 3 daemons active (3 hosts, 1 zones)

var lib mon unit.image


a001s016: 
# cat /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s016/unit.image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586

a001s017:
# cat /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s017/unit.image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586
a001s018:
# cat /var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s018/unit.image
docker.io/ceph/daemon:latest-pacific

ceph image tag, digest from docker inspect of:  ceph/daemon  
latest-pacific   6e73176320aa   2 years ago 1.27GB
==
a001s016:
"Id": 
"sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": [ 
"ceph/daemon:latest-pacific"
"RepoDigests": [

"ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"

a001s017:
"Id": 
"sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": [
"ceph/daemon:latest-pacific"
"RepoDigests": [

"ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"

a001s018:
"Id": 
"sha256:6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"RepoTags": [
"ceph/daemon:latest-pacific"
"RepoDigests": [

"ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"

-Original Message-
From: Adiga, Anantha 
Sent: Tuesday, April 2, 2024 10:42 AM
To: Eugen Block 
Cc: ceph-users@ceph.io
Subject: RE: [ceph-users] Re: ceph status not showing correct monitor services

Hi Eugen,

Currently there are only three nodes, but I can add  a node to the cluster and 
check it out. I will take a look at the mon logs 


Thank you,
Anantha

-Original Message-
From: Eugen Block 
Sent: Tuesday, April 2, 2024 12:19 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph status not showing correct monitor services

You can add a mon manually to the monmap, but that requires a downtime of the 
mons. Here's an example [1] how to modify the monmap (including network change 
which you don't need, of course). But that would be my last resort, first I 
would try to find out why the MON fails to join the quorum. What is that 
mon.a001s016 logging, and what are the other two logging?
Do you have another host where you could place a mon daemon to see if that 
works?


[1]
https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#example-procedure

Zitat von "Adiga, Anantha" :

> # ceph mon stat
> e6: 2 mons at
> {a001s017=[v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0],a001s018=[v2
> :10.45.128.28:3300/0,v1:10.45.128.28:6789/0]}, election epoch 162, 
> leader 0 a001s018, quorum 0,1
> a001s018,a001s017
>
> # ceph orch ps | grep mon
> mon.a001s016a001s016   running  
> (3h)  6m ago   3h 527M2048M  16.2.5   
> 6e73176320aa  39db8cfba7e1
> mon.a001s017a001s017   running  
>

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-02 Thread Adiga, Anantha
Hi Eugen,

Currently there are only three nodes, but I can add a node to the cluster and 
check it out. I will take a look at the mon logs.


Thank you,
Anantha

-Original Message-
From: Eugen Block  
Sent: Tuesday, April 2, 2024 12:19 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph status not showing correct monitor services

You can add a mon manually to the monmap, but that requires a downtime of the 
mons. Here's an example [1] how to modify the monmap (including network change 
which you don't need, of course). But that would be my last resort, first I 
would try to find out why the MON fails to join the quorum. What is that 
mon.a001s016 logging, and what are the other two logging?
Do you have another host where you could place a mon daemon to see if that 
works?


[1]
https://docs.ceph.com/en/latest/rados/operations/add-or-rm-mons/#example-procedure
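
Roughly, the procedure behind [1] looks like the following sketch (untested here; the host names and the 10.45.128.26 address are taken from this thread, and it means stopping all mons, i.e. a short downtime):

    # on each mon host, stop the mon unit first
    systemctl stop ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.$(hostname -s).service

    # then, inside a cephadm shell for one of the surviving mons, e.g. on a001s017:
    cephadm shell --name mon.a001s017
    ceph-mon -i a001s017 --extract-monmap /tmp/monmap
    monmaptool --print /tmp/monmap
    monmaptool --add a001s016 10.45.128.26:3300 /tmp/monmap
    ceph-mon -i a001s017 --inject-monmap /tmp/monmap

    # inject the same map into the other mons, then start all mon units again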

Zitat von "Adiga, Anantha" :

> # ceph mon stat
> e6: 2 mons at
> {a001s017=[v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0],a001s018=[v2
> :10.45.128.28:3300/0,v1:10.45.128.28:6789/0]}, election epoch 162, 
> leader 0 a001s018, quorum 0,1
> a001s018,a001s017
>
> # ceph orch ps | grep mon
> mon.a001s016a001s016   running  
> (3h)  6m ago   3h 527M2048M  16.2.5   
> 6e73176320aa  39db8cfba7e1
> mon.a001s017a001s017   running  
> (22h)47s ago   1h 993M2048M  16.2.5   
> 6e73176320aa  e5e5cb6c256c
> mon.a001s018a001s018   running  
> (5w) 48s ago   2y1167M2048M  16.2.5   
> 6e73176320aa  7d2bb6d41f54
>
> # ceph mgr stat
> {
> "epoch": 1130365,
> "available": true,
> "active_name": "a001s016.ctmoay",
> "num_standby": 1
> }
>
> # ceph orch ps | grep mgr
> mgr.a001s016.ctmoay a001s016  *:8443   running  
> (18M)   109s ago  23M 518M-  16.2.5   
> 6e73176320aa  169cafcbbb99
> mgr.a001s017.bpygfm a001s017  *:8443   running  
> (19M) 5m ago  23M 501M-  16.2.5   
> 6e73176320aa  97257195158c
> mgr.a001s018.hcxnef a001s018  *:8443   running  
> (20M) 5m ago  23M 113M-  16.2.5   
> 6e73176320aa  21ba5896cee2
>
> # ceph orch ls --service_name=mgr --export
> service_type: mgr
> service_name: mgr
> placement:
>   count: 3
>   hosts:
>   - a001s016
>   - a001s017
>   - a001s018
>
> # ceph orch ls --service_name=mon --export
> service_type: mon
> service_name: mon
> placement:
>   count: 3
>   hosts:
>   - a001s016
>   - a001s017
>   - a001s018
>
> -Original Message-
> From: Adiga, Anantha
> Sent: Monday, April 1, 2024 6:06 PM
> To: Eugen Block 
> Cc: ceph-users@ceph.io
> Subject: RE: [ceph-users] Re: ceph status not showing correct monitor 
> services
>
> # ceph tell mon.a001s016 mon_status Error ENOENT: problem getting 
> command descriptions from mon.a001s016
>
> a001s016 is outside quorum see below
>
> # ceph tell mon.a001s017 mon_status {
> "name": "a001s017",
> "rank": 1,
> "state": "peon",
> "election_epoch": 162,
> "quorum": [
> 0,
> 1
> ],
> "quorum_age": 79938,
> "features": {
> "required_con": "2449958747317026820",
> "required_mon": [
> "kraken",
> "luminous",
> "mimic",
> "osdmap-prune",
> "nautilus",
> "octopus",
> "pacific",
> "elector-pinging"
> ],
> "quorum_con": "4540138297136906239",
> "quorum_mon": [
> "kraken",
> "luminous",
> "mimic",
> "osdmap-prune",
> "nautilus",
> "octopus",
> "pacific",
> "elector-pinging"
> ]
> },
> "outside_quorum": [],
> "extra_probe_peers": [
> {
> "addrvec": [
> {
> "type": "v2",
> "addr": "10.45.128.26:3300",
> "nonce": 0
> },
> {
> &q

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
# ceph mon stat
e6: 2 mons at 
{a001s017=[v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0],a001s018=[v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0]},
 election epoch 162, leader 0 a001s018, quorum 0,1 a001s018,a001s017

# ceph orch ps | grep mon
mon.a001s016a001s016   running (3h)  6m 
ago   3h 527M2048M  16.2.5  6e73176320aa  39db8cfba7e1
mon.a001s017a001s017   running (22h)47s 
ago   1h 993M2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018a001s018   running (5w) 48s 
ago   2y1167M2048M  16.2.5  6e73176320aa  7d2bb6d41f54

# ceph mgr stat
{
"epoch": 1130365,
"available": true,
"active_name": "a001s016.ctmoay",
"num_standby": 1
}

# ceph orch ps | grep mgr
mgr.a001s016.ctmoay a001s016  *:8443   running (18M)   109s 
ago  23M 518M-  16.2.5  6e73176320aa  169cafcbbb99
mgr.a001s017.bpygfm a001s017  *:8443   running (19M) 5m 
ago  23M 501M-  16.2.5  6e73176320aa  97257195158c
mgr.a001s018.hcxnef a001s018  *:8443   running (20M) 5m 
ago  23M 113M-  16.2.5  6e73176320aa  21ba5896cee2

# ceph orch ls --service_name=mgr --export
service_type: mgr
service_name: mgr
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

# ceph orch ls --service_name=mon --export
service_type: mon
service_name: mon
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

-Original Message-
From: Adiga, Anantha 
Sent: Monday, April 1, 2024 6:06 PM
To: Eugen Block 
Cc: ceph-users@ceph.io
Subject: RE: [ceph-users] Re: ceph status not showing correct monitor services

# ceph tell mon.a001s016 mon_status
Error ENOENT: problem getting command descriptions from mon.a001s016

a001s016 is outside the quorum, see below:
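
A possible additional check (a sketch; run on host a001s016) is to query the daemon's own view over its admin socket, which works even while it is outside the quorum:

    cephadm enter --name mon.a001s016
    ceph daemon mon.a001s016 mon_status   # shows its state and which peers it is probing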

# ceph tell mon.a001s017 mon_status
{
"name": "a001s017",
"rank": 1,
"state": "peon",
"election_epoch": 162,
"quorum": [
0,
1
],
"quorum_age": 79938,
"features": {
"required_con": "2449958747317026820",
"required_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging"
],
"quorum_con": "4540138297136906239",
"quorum_mon": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging"
]
},
"outside_quorum": [],
"extra_probe_peers": [
{
"addrvec": [
{
"type": "v2",
"addr": "10.45.128.26:3300",
"nonce": 0
},
{
"type": "v1",
"addr": "10.45.128.26:6789",
"nonce": 0
}
]
}
],
"sync_provider": [],
"monmap": {
"epoch": 6,
"fsid": "604d56db-2fab-45db-a9ea-c418f9a8cca8",
"modified": "2024-03-31T23:54:18.692983Z",
"created": "2021-09-30T16:15:12.884602Z",
"min_mon_release": 16,
"min_mon_release_name": "pacific",
"election_strategy": 1,
"disallowed_leaders: ": "",
"stretch_mode": false,
"features": {
"persistent": [
"kraken",
"luminous",
"mimic",
"osdmap-prune",
"nautilus",
"octopus",
"pacific",
"elector-pinging"
],
"optional": []
},
"mons": [
{
"rank": 0,
"name": "a001s018",
"public_addrs": {
"addrvec": [
{
"type": "v2",
"ad

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
   "mon": [
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 1
}
],
"mds": [
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 3
}
],
"osd": [
    {
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 15
}
],
"client": [
{
"features": "0x2f018fb86aa42ada",
"release": "luminous",
"num": 50
},
{
"features": "0x2f018fb87aa4aafe",
"release": "luminous",
"num": 40
},
{
"features": "0x3f01cfb8ffed",
"release": "luminous",
"num": 1
},
{
"features": "0x3f01cfb9fffd",
"release": "luminous",
"num": 72
}
]
},
"stretch_mode": false
}
root@a001s016:/var/run/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8# ceph mon dump
dumped monmap epoch 6
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
last_changed 2024-03-31T23:54:18.692983+
created 2021-09-30T16:15:12.884602+
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017
root@a001s016:/var/run/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8#

-Original Message-
From: Adiga, Anantha 
Sent: Monday, April 1, 2024 3:20 PM
To: Adiga, Anantha ; Eugen Block 
Cc: ceph-users@ceph.io
Subject: RE: [ceph-users] Re: ceph status not showing correct monitor services

Neither method updated the mon map. Is there a way to inject mon.a001s016 into 
the current mon map?

# ceph mon dump
dumped monmap epoch 6
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
last_changed 2024-03-31T23:54:18.692983+ created 
2021-09-30T16:15:12.884602+ min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017

# ceph tell mon.a001s016 mon_status
Error ENOENT: problem getting command descriptions from mon.a001s016

# ceph tell mon.a001s016 mon_status
Error ENOENT: problem getting command descriptions from mon.a001s016

# ceph tell mon.a001s017 mon_status
{
"name": "a001s017",
"rank": 1,
"state": "peon",
"election_epoch": 162,
"quorum": [
0,
1
],
"quorum_age": 69551,
"features": {
..
..


# ceph orch ls --service_name=mon --export > mon3.yml
service_type: mon
service_name: mon
placement:
  count: 3
  hosts:
  - a001s016
  - a001s017
  - a001s018

# cp mon3.yml mon2.yml
# vi mon2.yml
# cat mon2.yml
service_type: mon
service_name: mon
placement:
  count: 2
  hosts:
  - a001s017
  - a001s018

# ceph orch apply -i mon2.yml --dry-run
WARNING! Dry-Runs are snapshots of a certain point in time and are bound to the 
current inventory setup. If any on these conditions changes, the preview will 
be invalid. Please make sure to have a minimal timeframe between planning and 
applying the specs.

SERVICESPEC PREVIEWS

+-+--++--+
|SERVICE  |NAME  |ADD_TO  |REMOVE_FROM   |
+-+--++--+
|mon  |mon   ||mon.a001s016  |
+-+--++--+

OSDSPEC PREVIEWS

+-+--+--+--++-+
|SERVICE  |NAME  |HOST  |DATA  |DB  |WAL  |
+-+--+--+--++-+
+-+--+--+--++-+

# ceph orch ls --service_name=mon --refresh
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mon          3/3      5m ago     18h  a001s016;a001s017;a001s018;count:3

# ceph orch ps --refresh | grep mon
mon.a001s016a001s016   running (21h) 2s 
ago  21h 734M2048M  16.2.5  6e73176320aa  8484a912f96a
mon.a001s017a001s017   running (18h) 2s 
ago  21h 976M2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018a001s018   running (5w)  2s 
ago   2y1164M2048M  16.2.5

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
)30s 
ago  22h 976M2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018a001s018   running (5w)  0s 
ago   2y1166M2048M  16.2.5  6e73176320aa  7d2bb6d41f54
#ceph orch ps --refresh | grep mon
mon.a001s017a001s017   running (19h) 2s 
ago  22h 982M2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018a001s018   running (5w)  3s 
ago   2y1166M2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s016a001s016   starting 
  ---2048M  
mon.a001s017a001s017   running (19h) 6s 
ago  22h 982M2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018a001s018   running (5w)  7s 
ago   2y1166M2048M  16.2.5  6e73176320aa  7d2bb6d41f54
# ceph orch ps --refresh | grep mon
mon.a001s016a001s016   running (8s)  2s 
ago   8s14.4M2048M  16.2.5  6e73176320aa  39db8cfba7e1
mon.a001s017a001s017   running (19h) 1s 
ago  22h 987M2048M  16.2.5  6e73176320aa  e5e5cb6c256c
mon.a001s018a001s018   running (5w)  2s 
ago   2y1171M2048M  16.2.5  6e73176320aa  7d2bb6d41f54

# ceph mon dump
dumped monmap epoch 6
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
last_changed 2024-03-31T23:54:18.692983+
created 2021-09-30T16:15:12.884602+
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017

#

-Original Message-
From: Adiga, Anantha  
Sent: Monday, April 1, 2024 2:01 PM
To: Eugen Block 
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: ceph status not showing correct monitor services

Thank you. I will try the  export and import method first.

Thank you,
Anantha

-Original Message-
From: Eugen Block 
Sent: Monday, April 1, 2024 1:57 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph status not showing correct monitor services

I have two approaches in mind, first one (and preferred) would be to edit the 
mon spec to first remove mon.a001s016 and have a clean state.  
Get the current spec with:

ceph orch ls mon --export > mon-edit.yaml

Edit the spec file so that mon.a001s016 is not part of it, then apply:

ceph orch apply -i mon-edit.yaml

This should remove the mon.a001s016 daemon. Then wait a few minutes or so 
(until the daemon is actually gone, check locally on the node with 'cephadm ls' 
and in /var/lib/ceph//removed) and add it back to the spec file, then 
apply again. I would expect a third MON to be deployed. If that doesn't work 
for some reason you'll need to inspect logs to find the root cause.

The second approach would be to remove and add the daemon manually:

ceph orch daemon rm mon.a001s016

Wait until it's really gone, then add it:

ceph orch daemon add mon a001s016

Not entirely sure about the daemon add mon command, you might need to provide 
something else, I'm typing this by heart.

Zitat von "Adiga, Anantha" :

> Hi Eugen,
>
> Yes that is it. OSDs were restarted since mon a001s017 was reporting  
> is low on available space.  How  to update the mon map to add   
> mon.a001s016  as it is already online?
> And how to update mgr map to  include standby mgr.a001s018 as it is 
> also running.
>
>
> ceph mon dump
> dumped monmap epoch 6
> epoch 6
> fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
> last_changed 2024-03-31T23:54:18.692983+ created
> 2021-09-30T16:15:12.884602+ min_mon_release 16 (pacific)
> election_strategy: 1
> 0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
> 1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017
>
>
> Thank you,
> Anantha
>
> -Original Message-
> From: Eugen Block 
> Sent: Monday, April 1, 2024 1:10 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: ceph status not showing correct monitor 
> services
>
> Maybe it’s just not in the monmap? Can you show the output of:
>
> ceph mon dump
>
> Did you do any maintenance (apparently OSDs restarted recently) and 
> maybe accidentally removed a MON from the monmap?
>
>
> Zitat von "Adiga, Anantha" :
>
>> Hi Anthony,
>>
>> Seeing it since last after noon.  It is same with mgr services as , 
>> "ceph -s" is reporting only TWO instead of THREE
>>
>> Also  mon and mgr shows " is_active: false" see below.
>>
>> # ceph orch ps --daemon_type=mgr
>> NAME HOST 

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Thank you. I will try the  export and import method first.

Thank you,
Anantha

-Original Message-
From: Eugen Block  
Sent: Monday, April 1, 2024 1:57 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ceph status not showing correct monitor services

I have two approaches in mind, first one (and preferred) would be to edit the 
mon spec to first remove mon.a001s016 and have a clean state.  
Get the current spec with:

ceph orch ls mon --export > mon-edit.yaml

Edit the spec file so that mon.a001s016 is not part of it, then apply:

ceph orch apply -i mon-edit.yaml

This should remove the mon.a001s016 daemon. Then wait a few minutes or so 
(until the daemon is actually gone, check locally on the node with 'cephadm ls' 
and in /var/lib/ceph//removed) and add it back to the spec file, then 
apply again. I would expect a third MON to be deployed. If that doesn't work 
for some reason you'll need to inspect logs to find the root cause.

The second approach would be to remove and add the daemon manually:

ceph orch daemon rm mon.a001s016

Wait until it's really gone, then add it:

ceph orch daemon add mon a001s016

Not entirely sure about the daemon add mon command, you might need to provide 
something else, I'm typing this by heart.
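
(For reference, and to be verified against the cephadm documentation for the installed release: the documented form takes the host plus an IP or network, e.g.

    ceph orch daemon add mon a001s016:10.45.128.26

where 10.45.128.26 is the address this thread shows for a001s016.)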

Zitat von "Adiga, Anantha" :

> Hi Eugen,
>
> Yes that is it. OSDs were restarted since mon a001s017 was reporting  
> is low on available space.  How  to update the mon map to add   
> mon.a001s016  as it is already online?
> And how to update mgr map to  include standby mgr.a001s018 as it is 
> also running.
>
>
> ceph mon dump
> dumped monmap epoch 6
> epoch 6
> fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
> last_changed 2024-03-31T23:54:18.692983+ created 
> 2021-09-30T16:15:12.884602+ min_mon_release 16 (pacific)
> election_strategy: 1
> 0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
> 1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017
>
>
> Thank you,
> Anantha
>
> -Original Message-
> From: Eugen Block 
> Sent: Monday, April 1, 2024 1:10 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Re: ceph status not showing correct monitor 
> services
>
> Maybe it’s just not in the monmap? Can you show the output of:
>
> ceph mon dump
>
> Did you do any maintenance (apparently OSDs restarted recently) and 
> maybe accidentally removed a MON from the monmap?
>
>
> Zitat von "Adiga, Anantha" :
>
>> Hi Anthony,
>>
>> Seeing it since last after noon.  It is same with mgr services as , 
>> "ceph -s" is reporting only TWO instead of THREE
>>
>> Also  mon and mgr shows " is_active: false" see below.
>>
>> # ceph orch ps --daemon_type=mgr
>> NAME HOST  PORTS   STATUS REFRESHED  AGE
>>  MEM USE  MEM LIM  VERSION  IMAGE ID  CONTAINER ID
>> mgr.a001s016.ctmoay  a001s016  *:8443  running (18M) 3m ago  23M
>> 206M-  16.2.5   6e73176320aa  169cafcbbb99
>> mgr.a001s017.bpygfm  a001s017  *:8443  running (19M) 3m ago  23M
>> 332M-  16.2.5   6e73176320aa  97257195158c
>> mgr.a001s018.hcxnef  a001s018  *:8443  running (20M) 3m ago  23M
>> 113M-  16.2.5   6e73176320aa  21ba5896cee2
>>
>> # ceph orch ls --service_name=mgr
>> NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
>> mgr  3/3  3m ago 23M  a001s016;a001s017;a001s018;count:3
>>
>>
>> # ceph orch ps --daemon_type=mon --format=json-pretty
>>
>> [
>>   {
>> "container_id": "8484a912f96a",
>> "container_image_digests": [
>>
>> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>
>> ],
>> "container_image_id":
>> "6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
>> "container_image_name":
>> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>,
>> "created": "2024-03-31T23:55:16.164155Z",
>> "daemon_id": "a001s016",
>> "daemon_type": "mon",
>> "hostname": "a001s016",
>> "is_active": false,
>><== why is it false
>> "last_refresh": "2024-04-01T19:38:30.929014Z",
>> "memory_request": 2147483648,
>> "memory_usage&q

[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Hi Eugen,

Yes, that is it. OSDs were restarted since mon a001s017 was reporting that it is 
low on available space. How do I update the mon map to add mon.a001s016, as it 
is already online?
And how do I update the mgr map to include standby mgr.a001s018, as it is also 
running?


ceph mon dump
dumped monmap epoch 6
epoch 6
fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8
last_changed 2024-03-31T23:54:18.692983+
created 2021-09-30T16:15:12.884602+
min_mon_release 16 (pacific)
election_strategy: 1
0: [v2:10.45.128.28:3300/0,v1:10.45.128.28:6789/0] mon.a001s018
1: [v2:10.45.128.27:3300/0,v1:10.45.128.27:6789/0] mon.a001s017


Thank you,
Anantha

-Original Message-
From: Eugen Block  
Sent: Monday, April 1, 2024 1:10 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: ceph status not showing correct monitor services

Maybe it’s just not in the monmap? Can you show the output of:

ceph mon dump

Did you do any maintenance (apparently OSDs restarted recently) and maybe 
accidentally removed a MON from the monmap?


Zitat von "Adiga, Anantha" :

> Hi Anthony,
>
> Seeing it since last after noon.  It is same with mgr services as , 
> "ceph -s" is reporting only TWO instead of THREE
>
> Also  mon and mgr shows " is_active: false" see below.
>
> # ceph orch ps --daemon_type=mgr
> NAME HOST  PORTS   STATUS REFRESHED  AGE  
>  MEM USE  MEM LIM  VERSION  IMAGE ID  CONTAINER ID
> mgr.a001s016.ctmoay  a001s016  *:8443  running (18M) 3m ago  23M  
> 206M-  16.2.5   6e73176320aa  169cafcbbb99
> mgr.a001s017.bpygfm  a001s017  *:8443  running (19M) 3m ago  23M  
> 332M-  16.2.5   6e73176320aa  97257195158c
> mgr.a001s018.hcxnef  a001s018  *:8443  running (20M) 3m ago  23M  
> 113M-  16.2.5   6e73176320aa  21ba5896cee2
>
> # ceph orch ls --service_name=mgr
> NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
> mgr  3/3  3m ago 23M  a001s016;a001s017;a001s018;count:3
>
>
> # ceph orch ps --daemon_type=mon --format=json-pretty
>
> [
>   {
> "container_id": "8484a912f96a",
> "container_image_digests": [
>
> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>
> ],
> "container_image_id":  
> "6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
> "container_image_name":  
> docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>,
> "created": "2024-03-31T23:55:16.164155Z",
> "daemon_id": "a001s016",
> "daemon_type": "mon",
> "hostname": "a001s016",
> "is_active": false,   
><== why is it false
> "last_refresh": "2024-04-01T19:38:30.929014Z",
>     "memory_request": 2147483648,
> "memory_usage": 761685606,
> "ports": [],
> "service_name": "mon",
> "started": "2024-03-31T23:55:16.268266Z",
> "status": 1,
> "status_desc": "running",
> "version": "16.2.5"
>   },
>
>
> Thank you,
> Anantha
>
> From: Anthony D'Atri 
> Sent: Monday, April 1, 2024 12:25 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] ceph status not showing correct monitor 
> services
>
>
>
>
>  a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay
>
> Looks like you just had an mgr failover?  Could be that the secondary 
> mgr hasn't caught up with current events.
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha


Hi Anthony,

Seeing it since last afternoon. It is the same with the mgr services: "ceph -s" 
is reporting only TWO instead of THREE.

Also, mon and mgr show "is_active": false, see below.

# ceph orch ps --daemon_type=mgr
NAME HOST  PORTS   STATUS REFRESHED  AGE  MEM USE  
MEM LIM  VERSION  IMAGE ID  CONTAINER ID
mgr.a001s016.ctmoay  a001s016  *:8443  running (18M) 3m ago  23M 206M   
 -  16.2.5   6e73176320aa  169cafcbbb99
mgr.a001s017.bpygfm  a001s017  *:8443  running (19M) 3m ago  23M 332M   
 -  16.2.5   6e73176320aa  97257195158c
mgr.a001s018.hcxnef  a001s018  *:8443  running (20M) 3m ago  23M 113M   
 -  16.2.5   6e73176320aa  21ba5896cee2

# ceph orch ls --service_name=mgr
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mgr  3/3  3m ago 23M  a001s016;a001s017;a001s018;count:3


# ceph orch ps --daemon_type=mon --format=json-pretty

[
  {
"container_id": "8484a912f96a",
"container_image_digests": [
  
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>
],
"container_image_id": 
"6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"container_image_name": 
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>,
"created": "2024-03-31T23:55:16.164155Z",
"daemon_id": "a001s016",
"daemon_type": "mon",
"hostname": "a001s016",
"is_active": false, <== why 
is it false
"last_refresh": "2024-04-01T19:38:30.929014Z",
"memory_request": 2147483648,
"memory_usage": 761685606,
"ports": [],
"service_name": "mon",
"started": "2024-03-31T23:55:16.268266Z",
"status": 1,
"status_desc": "running",
"version": "16.2.5"
  },


Thank you,
Anantha

From: Anthony D'Atri 
Sent: Monday, April 1, 2024 12:25 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] ceph status not showing correct monitor services




 a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay

Looks like you just had an mgr failover?  Could be that the secondary mgr 
hasn't caught up with current events.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph status not showing correct monitor services

2024-04-01 Thread Adiga, Anantha
Hi

Why is "ceph -s" showing only two monitors while three monitor services are 
running ?

# ceph versions
{
    "mon":     { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2 },
    "mgr":     { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 2 },
    "osd":     { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 36 },
    "mds":     { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 1 },
    "rgw":     { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 3 },
    "overall": { "ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)": 44 }
}

# ceph orch ls
NAME PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
crash   3/3  8m ago 2y   label:ceph
ingress.nfs.nfs  10.45.128.8:2049,9049  4/4  8m ago 2y   count:2
mds.cephfs  3/3  8m ago 2y   
count:3;label:mdss
mgr 3/3  8m ago 23M  
a001s016;a001s017;a001s018;count:3
mon 3/3  8m ago 16h  
a001s016;a001s017;a001s018;count:3   <== [ 3 monitor services running]
nfs.nfs  ?:120493/3  8m ago 2y   
a001s016;a001s017;a001s018;count:3
node-exporter?:9100 3/3  8m ago 2y   *
osd.unmanaged 36/36  8m ago -
prometheus   ?:9095 1/1  10s ago23M  count:1
rgw.ceph ?:8080 3/3  8m ago 19h  
count-per-host:1;label:rgws
root@a001s017:~# ceph -s
  cluster:
id: 604d56db-2fab-45db-a9ea-c418f9a8cca8
health: HEALTH_OK

  services:
mon: 2 daemons, quorum a001s018,a001s017 (age 16h)  <== [ shows ONLY 2 
monitors running]
mgr: a001s017.bpygfm(active, since 13M), standbys: a001s016.ctmoay
mds: 1/1 daemons up, 2 standby
osd: 36 osds: 36 up (since 54s), 36 in (since 2y)
rgw: 3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   43 pools, 1633 pgs
objects: 51.81M objects, 77 TiB
usage:   120 TiB used, 131 TiB / 252 TiB avail
pgs: 1631 active+clean
 2active+clean+scrubbing+deep

  io:
client:   220 MiB/s rd, 448 MiB/s wr, 251 op/s rd, 497 op/s wr

# ceph orch ls --service_name=mon
NAME  PORTS  RUNNING  REFRESHED  AGE  PLACEMENT
mon  3/3  8m ago 16h  a001s016;a001s017;a001s018;count:3  <== [ 
3 monitors running ]

# ceph orch ps --daemon_type=mon
NAME  HOST  PORTS  STATUS REFRESHED  AGE  MEM USE  MEM LIM  
VERSION  IMAGE ID  CONTAINER ID
mon.a001s016  a001s016 running (19h) 9m ago  19h 706M2048M  
16.2.5   6e73176320aa  8484a912f96a
mon.a001s017  a001s017 running (16h)66s ago  19h 949M2048M  
16.2.5   6e73176320aa  e5e5cb6c256c   <== [ 3  mon daemons running ]
mon.a001s018  a001s018 running (5w)  2m ago   2y1155M2048M  
16.2.5   6e73176320aa  7d2bb6d41f54

a001s016# systemctl --type=service | grep @mon
  
ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.a001s016.service
loaded active running Ceph mon.a001s016 for 
604d56db-2fab-45db-a9ea-c418f9a8cca8
a001s017# systemctl --type=service | grep @mon
  
ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.a001s017.service
loaded active running Ceph mon.a001s017 for 
604d56db-2fab-45db-a9ea-c418f9a8cca8
a001s018# systemctl --type=service | grep @mon
  
ceph-604d56db-2fab-45db-a9ea-c418f9a8cca8@mon.a001s018.service
loaded active running Ceph mon.a001s018 for 
604d56db-2fab-45db-a9ea-c418f9a8cca8
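
A quick cross-check (sketch) is to compare what the cluster map itself knows with what the orchestrator reports; the monitor missing from quorum should show up in the second and third outputs but not the first:

    ceph mon dump                             # monitors in the monmap
    ceph quorum_status --format json-pretty   # current quorum membership
    ceph orch ps --daemon_type=mon            # mon daemons cephadm is running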


Thank you,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] recreating a cephfs subvolume with the same absolute path

2024-03-29 Thread Adiga, Anantha
Hi,

ceph fs subvolume getpath cephfs cluster_A_subvolume 
cephfs_data_pool_ec21_subvolumegroup
/volumes/cephfs_data_pool_ec21_subvolumegroup/cluster_A_subvolume/0f90806d-0d70-4fe1-9e2b-f958056ef0c9

If the subvolume got deleted, is it possible to recreate the subvolume with the 
same absolute path, so that YAML specs that use the volume paths need not 
change?

Thank you,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephadm Adding OSD wal device on a new node

2023-12-16 Thread Adiga, Anantha
Hi,

After adding a node to the cluster (3 nodes) with cephadm, how do I add OSDs 
with the same configuration as on the other nodes?
The other nodes have 12 drives for data (osd-block) and 2 drives for WAL 
(osd-wal). There are 6 LVs on each WAL disk for the 12 data drives.
I have added the OSDs with
   ceph orch daemon add osd hostname:/dev/nvme0n1

How do I attach the wal devices to the OSDs?
I have the WAL volumes created
nvme3n1 
  259:50 349.3G  0 disk
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--dad0df4e--149a--4e80--b451--79f9b81838b8
   253:12   0  58.2G  0 lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--a5f2e93a--7bf0--4904--a233--3946b855c764
   253:13   0  58.2G  0 lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--cc949e1b--2560--4d38--bc27--558550881726
   253:14   0  58.2G  0 lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--8846f50e--7e92--4f66--a738--ce3a89650019
   253:15   0  58.2G  0 lvm
|-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--6d646762--483a--40ca--8c51--ea54e0684a94
   253:16   0  58.2G  0 lvm
`-ceph--75d65cd1--91e4--4a8f--869b--e2a550f83104-osd--wal--74e58163--de1d--4062--a658--5b0356d43a87
   253:17   0  58.2G  0 lvm

How do I attach the WAL volumes to the OSDs?


osd.36 nvme0n1  259:10   5.8T  0 disk   
`-ceph--3df7c5c3--c2c0--4498--9e17--2af79e448abc-osd--block--804b50ea--d44c--4cad--9177--8d722f737df9
 253:00   5.8T  0 lvm
Osd.37 nvme1n1  259:30   5.8T  0 disk   
`-ceph--30858acc--c48b--4a08--bb98--4c9b59112c59-osd--block--0a3a198b--66ec--4ed9--94da--fb171e190e38
 253:10   5.8T  0 lvm
nvme3n1
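
One possible way to get the same layout on the new node (a rough sketch, untested; the service_id and host name are placeholders) is a cephadm OSD service spec that pairs data devices with WAL devices, applied with "ceph orch apply -i osd-wal.yml":

    service_type: osd
    service_id: osd_with_wal        # placeholder name
    placement:
      hosts:
        - <new_host>                # the newly added node
    spec:
      data_devices:
        paths:
          - /dev/nvme0n1
          - /dev/nvme1n1
      wal_devices:
        paths:
          - /dev/nvme3n1

For OSDs that were already created without a WAL (like osd.36/osd.37 above), the low-level route would be something like "ceph-bluestore-tool bluefs-bdev-new-wal" on the stopped OSD, but that is worth checking against the documentation before trying it.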



Thank you,

Anantha



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs export over RGW issue in Pacific

2023-12-07 Thread Adiga, Anantha
Thank you Adam!!

Anantha

From: Adam King 
Sent: Thursday, December 7, 2023 10:46 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] nfs export over RGW issue in Pacific

The first handling of nfs exports over rgw in the nfs module, including the 
`ceph nfs export create rgw` command, wasn't added to the nfs module in pacific 
until 16.2.7.
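
For reference, on releases that do have it the command looks roughly like the following (the exact flags should be confirmed with "ceph nfs export create rgw -h" on the installed version; the bucket and pseudo path here are simply the ones from the dashboard export quoted below):

    ceph nfs export create rgw --cluster-id nfs-1 --pseudo-path /rgwnfs_cluster_inventory --bucket buc-cluster-inventory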

On Thu, Dec 7, 2023 at 1:35 PM Adiga, Anantha <anantha.ad...@intel.com> wrote:
Hi,


root@a001s016:~# cephadm version

Using recent ceph image 
ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586

ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)

root@a001s016:~#



root@a001s016:~# cephadm shell

Inferring fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8

Inferring config 
/var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s016/config

Using recent ceph image 
ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586



root@a001s016:~# ceph version

ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)

-
But, Cephadm does not show "nfs export create rgw"


nfs export create cephfs[--readonly] []

nfs export rm  

nfs export delete  

nfs export ls  [--detailed]

nfs export get  

nfs export update

-

However, Ceph Dashboard allows to create the export see below:

Access Type RW
Cluster nfs-1
Daemons nfs-1.0.0.zp3110b001a0101.uckows, nfs-1.1.0.zp3110b001a0102.hhpebb, 
nfs-1.2.0.zp3110b001a0103.bbkpcb, nfs-1.3.0.zp3110b001a0104.zujkso
NFS Protocol NFSv4
Object Gateway User admin
Path buc-cluster-inventory
Pseudo /rgwnfs_cluster_inventory
Squash no_root_squash
Storage Backend Object Gateway
Transport TCP
-

While nfs export is created, pseudo "/rgwnfs_cluster_inventory",   cephadm is 
not listing it

# ceph nfs export ls nfs-1
[
  "/cluster_inventory",
  "/oob_crashdump"
]
#

Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] nfs export over RGW issue in Pacific

2023-12-07 Thread Adiga, Anantha
Hi,


root@a001s016:~# cephadm version

Using recent ceph image 
ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586

ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)

root@a001s016:~#



root@a001s016:~# cephadm shell

Inferring fsid 604d56db-2fab-45db-a9ea-c418f9a8cca8

Inferring config 
/var/lib/ceph/604d56db-2fab-45db-a9ea-c418f9a8cca8/mon.a001s016/config

Using recent ceph image 
ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586



root@a001s016:~# ceph version

ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)

-
But, Cephadm does not show "nfs export create rgw"


nfs export create cephfs[--readonly] []

nfs export rm  

nfs export delete  

nfs export ls  [--detailed]

nfs export get  

nfs export update

-

However, the Ceph Dashboard allows creating the export, see below:

Access Type RW
Cluster nfs-1
Daemons nfs-1.0.0.zp3110b001a0101.uckows, nfs-1.1.0.zp3110b001a0102.hhpebb, 
nfs-1.2.0.zp3110b001a0103.bbkpcb, nfs-1.3.0.zp3110b001a0104.zujkso
NFS Protocol NFSv4
Object Gateway User admin
Path buc-cluster-inventory
Pseudo /rgwnfs_cluster_inventory
Squash no_root_squash
Storage Backend Object Gateway
Transport TCP
-

While the NFS export is created with pseudo "/rgwnfs_cluster_inventory", cephadm 
is not listing it:

# ceph nfs export ls nfs-1
[
  "/cluster_inventory",
  "/oob_crashdump"
]
#

Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-30 Thread Adiga, Anantha
Hi Venky,

“peer-bootstrap import” is working fine now. It was port 3300 being blocked by a 
firewall.
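
For anyone hitting the same hang: a quick reachability check of the remote mon ports from the mirror daemon's host, e.g.

    nc -zv <remote-mon-ip> 3300
    nc -zv <remote-mon-ip> 6789

(one check per monitor listed in the peer token's mon_host; the <remote-mon-ip> placeholder is deliberate) would have shown the blocked msgr v2 port right away.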
Thank you for your help.

Regards,
Anantha

From: Adiga, Anantha
Sent: Monday, August 7, 2023 1:29 PM
To: Venky Shankar ; ceph-users@ceph.io
Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Venky,

Could this be the reason that the peer-bootstrap import is hanging? How do I 
upgrade cephfs-mirror to Quincy?
root@fl31ca104ja0201:/# cephfs-mirror --version
ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
root@fl31ca104ja0201:/# ceph version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
root@fl31ca104ja0201:/#
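
One way to bring the mirror daemon onto the same release (sketched here, not verified on this cluster) is to check which image the cephfs-mirror service is running, then either upgrade everything to one image or redeploy just that service once the global container_image points at the newer image:

    ceph orch ps --daemon_type=cephfs-mirror
    ceph orch upgrade start --image <the 17.2.6 image the rest of the cluster uses>
    # or, to touch only the mirror daemon:
    ceph orch redeploy cephfs-mirror

The image placeholder is deliberate; use whatever image the other daemons report.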


Thank you,
Anantha
From: Adiga, Anantha
Sent: Monday, August 7, 2023 11:21 AM
To: 'Venky Shankar' <vshan...@redhat.com>; 'ceph-users@ceph.io' <ceph-users@ceph.io>
Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Venky,

I tried on another secondary Quincy cluster and it is the same problem. The 
peer_bootstrap import command hangs.



root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==

……

…….

... the command does not complete; it just waits here.
^C to exit.
Thereafter some commands do not complete…
root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
fl31ca104ja0202, fl31ca104ja0203
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.9 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

  io:
client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr

root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/# ceph fs status cephfs
This command also waits. ……

I have attached the mgr log
root@fl31ca104ja0201:/# ceph service status
{
"cephfs-mirror": {
"5306346": {
"status_stamp": "2023-08-07T17:35:56.884907+",
"last_beacon": "2023-08-07T17:45:01.903540+",
"status": {
"status_json": 
"{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
}
}

Quincy secondary cluster


root@a001s008-zz14l47008:/# ceph mgr module enable mirroring

root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / rwps

[client.mirror_remote]

key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==

root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote

[client.mirror_remote]

key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==

caps mds = "allow rwps fsname=cephfs"

caps mon = "allow r fsname=cephfs"

caps osd = "allow rw tag cephfs data=cephfs"

root@a001s008-zz14l47008:/#

root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
cephfs client.mirror_remote shgR-site

{"token": 
"eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}

root@a001s008-zz14l47008:/#

Thank you,
Anantha

From: Adiga, Anantha
Sent: Friday, August 4, 2023 11:55 AM
To: Venky Shankar <vshan...@redhat.com>; ceph-users@ceph.io
Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung


Hi Venky,



Thank you so much for the guidance. Attached is the mgr log.



Note: the 4th node in the primary cluster has smaller capacity  drives, the 
other 3 nodes have the larger capacity drives.

32   ssd   6.98630   1.0   7.0 TiB   44 GiB   44 GiB   183 KiB   148 MiB   6.9 TiB   0.62   0.64   40

[ceph-users] Re: radosgw mulsite multi zone configuration: current period realm name not same as in zonegroup

2023-08-30 Thread Adiga, Anantha
Update: there was a networking issue between the sites; after fixing it, the 
issue reported below did not occur.
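
For reference, the kind of check that exposes this: from each site, confirm that the other zones' RGW endpoints listed in the zonegroup are reachable, e.g.

    curl -sv http://10.45.128.139:8080/
    curl -sv http://172.18.55.71:8080/

(endpoint URLs taken from the zonegroup below); a hung or refused connection points at networking rather than realm/period configuration.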

Thank you,
Anantha

From: Adiga, Anantha
Sent: Thursday, August 24, 2023 2:40 PM
To: ceph-users@ceph.io
Subject: radosgw mulsite multi zone configuration: current period realm name 
not same as in zonegroup

Hi,

I have a multi zone configuration with 4 zones.

While adding a secondary zone, getting this error:

root@cs17ca101ja0702:/# radosgw-admin realm pull --rgw-realm=global 
--url=http://10.45.128.139:8080 --default --access-key=sync_user 
--secret=sync_secret
request failed: (13) Permission denied
If the realm has been changed on the master zone, the master zone's gateway may 
need to be restarted to recognize this user.
root@cs17ca101ja0702:/#

The realm name is "global". Is the cause of the error due to the primary 
cluster having a current period listing the realm name as "default" instead of 
"global" ?  However, the realm id is of realm "global" AND the zonegroup does 
not list realm name but has the correct realm id. See below.

How to fix this issue.

root@fl31ca104ja0201:/# radosgw-admin realm get
{
"id": "3da7b5ea-c44b-4d44-aced-fae2aabce97b",
"name": "global",
"current_period": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6",
"epoch": 2
}
root@fl31ca104ja0201:/# radosgw-admin realm get --rgw-realm=global
{
"id": "3da7b5ea-c44b-4d44-aced-fae2aabce97b",
"name": "global",
"current_period": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6",
"epoch": 2
}

root@fl31ca104ja0201:/# radosgw-admin zonegroup list
{
"default_info": "ec8b68db-1900-464f-a21a-2f6e8c107e94",
"zonegroups": [
"alldczg"
]
}

root@fl31ca104ja0201:/# radosgw-admin zonegroup get --rgw-zonegroup=alldczg
{
"id": "ec8b68db-1900-464f-a21a-2f6e8c107e94",
"name": "alldczg",
"api_name": "alldczg",
"is_master": "true",
"endpoints": [
http://10.45.128.139:8080,
http://172.18.55.71:8080,
http://10.239.155.23:8080
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "ae267592-7cd8-4d67-8792-adc57d104cd6",
"zones": [
{
"id": "0962f0b4-beb6-4d07-a64d-07046b81529e",
"name": "CRsite",
"endpoints": [
http://172.18.55.71:8080
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "9129d118-55ac-4859-b339-b8afe0793a80",
"name": "BArga",
"endpoints": [
http://10.208.11.26:8080
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "ae267592-7cd8-4d67-8792-adc57d104cd6",
"name": "ORflex2",
"endpoints": [
http://10.45.128.139:8080
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "f5edeb4b-2a37-413b-8587-0ff40d7647ea",
"name": "SHGrasp",
"endpoints": [
http://10.239.155.23:8080
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_a

[ceph-users] radosgw mulsite multi zone configuration: current period realm name not same as in zonegroup

2023-08-24 Thread Adiga, Anantha
Hi,

I have a multi zone configuration with 4 zones.

While adding a secondary zone, getting this error:

root@cs17ca101ja0702:/# radosgw-admin realm pull --rgw-realm=global 
--url=http://10.45.128.139:8080 --default --access-key=sync_user 
--secret=sync_secret
request failed: (13) Permission denied
If the realm has been changed on the master zone, the master zone's gateway may 
need to be restarted to recognize this user.
root@cs17ca101ja0702:/#

The realm name is "global". Is the cause of the error due to the primary 
cluster having a current period listing the realm name as "default" instead of 
"global" ?  However, the realm id is of realm "global" AND the zonegroup does 
not list realm name but has the correct realm id. See below.

How to fix this issue.

root@fl31ca104ja0201:/# radosgw-admin realm get
{
"id": "3da7b5ea-c44b-4d44-aced-fae2aabce97b",
"name": "global",
"current_period": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6",
"epoch": 2
}
root@fl31ca104ja0201:/# radosgw-admin realm get --rgw-realm=global
{
"id": "3da7b5ea-c44b-4d44-aced-fae2aabce97b",
"name": "global",
"current_period": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6",
"epoch": 2
}

root@fl31ca104ja0201:/# radosgw-admin zonegroup list
{
"default_info": "ec8b68db-1900-464f-a21a-2f6e8c107e94",
"zonegroups": [
"alldczg"
]
}

root@fl31ca104ja0201:/# radosgw-admin zonegroup get --rgw-zonegroup=alldczg
{
"id": "ec8b68db-1900-464f-a21a-2f6e8c107e94",
"name": "alldczg",
"api_name": "alldczg",
"is_master": "true",
"endpoints": [
        "http://10.45.128.139:8080",
        "http://172.18.55.71:8080",
        "http://10.239.155.23:8080"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "ae267592-7cd8-4d67-8792-adc57d104cd6",
"zones": [
{
"id": "0962f0b4-beb6-4d07-a64d-07046b81529e",
"name": "CRsite",
"endpoints": [
                "http://172.18.55.71:8080"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "9129d118-55ac-4859-b339-b8afe0793a80",
"name": "BArga",
"endpoints": [
                "http://10.208.11.26:8080"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "ae267592-7cd8-4d67-8792-adc57d104cd6",
"name": "ORflex2",
"endpoints": [
                "http://10.45.128.139:8080"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "f5edeb4b-2a37-413b-8587-0ff40d7647ea",
"name": "SHGrasp",
"endpoints": [
                "http://10.239.155.23:8080"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "3da7b5ea-c44b-4d44-aced-fae2aabce97b",
"sync_policy": {
"groups": []
}
}

root@fl31ca104ja0201:/# radosgw-admin period get-current
{
"current_period": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6"
}
root@fl31ca104ja0201:/# radosgw-admin period get
{
"id": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6",
"epoch": 42,
"predecessor_uuid": "2df86f9a-d267-4b52-a13b-def8e5e612a2",
"sync_status": [],
"period_map": {
"id": "b8bc1187-2a2d-4d9e-b7be-c4f4667e3fa6",
"zonegroups": [
{
"id": "ec8b68db-1900-464f-a21a-2f6e8c107e94",
"name": "alldczg",
"api_name": "alldczg",
"is_master": "true",
"endpoints": [
                    "http://10.45.128.139:8080",
                    "http://172.18.55.71:8080",
                    "http://10.239.155.23:8080"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "ae267592-7cd8-4d67-8792-adc57d104cd6",
"zones": [

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Here, should I send the mgr log?

root@fl31ca104ja0201:/etc/ceph# ceph -c remote_ceph.conf --id=mirror_remote  
status --verbose
parsed_args: Namespace(admin_socket=None, block=False, 
cephconf='remote_ceph.conf', client_id='mirror_remote', client_name=None, 
cluster=None, cluster_timeout=None, completion=False, help=False, 
input_file=None, output_file=None, output_format=None, period=1, setgroup=None, 
setuser=None, status=False, verbose=True, version=False, watch=False, 
watch_channel=None, watch_debug=False, watch_error=False, watch_info=False, 
watch_sec=False, watch_warn=False), childargs: ['status']
^CCluster connection aborted

root@fl31ca104ja0201:/etc/ceph#  cat remote_ceph.client.mirror_remote.keyring
[client.mirror_remote]
key = AQCfwMlkM90pLBAAwXtvpp8j04IvC8tqpAG9bA==
caps mds = "allow rwps fsname=cephfs"
caps mon = "allow r fsname=cephfs"
caps osd = "allow rw tag cephfs data=cephfs"

root@fl31ca104ja0201:/etc/ceph# cat remote_ceph.conf
[client.libvirt]
admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok # must be 
writable by QEMU and allowed by SELinux or AppArmor
log file = /var/log/ceph/qemu-guest-$pid.log # must be writable by QEMU and 
allowed by SELinux or AppArmor

[client.rgw.cr21meg16ba0101.rgw0]
host = cr21meg16ba0101
keyring = /var/lib/ceph/radosgw/ceph-rgw.cr21meg16ba0101.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-cr21meg16ba0101.rgw0.log
rgw frontends = beast endpoint=172.18.55.71:8080
rgw thread pool size = 512

# Please do not change this file directly since it is managed by Ansible and 
will be overwritten
[global]
cluster network = 172.18.55.71/24
fsid = a6f52598-e5cd-4a08-8422-7b6fdb1d5dbe
mon host = 
[v2:172.18.55.71:3300,v1:172.18.55.71:6789],[v2:172.18.55.72:3300,v1:172.18.55.72:6789],[v2:172.18.55.73:3300,v1:172.18.55.73:6789]
mon initial members = cr21meg16ba0101,cr21meg16ba0102,cr21meg16ba0103
osd pool default crush rule = -1
public network = 172.18.55.0/24

[mon]
auth_allow_insecure_global_id_reclaim = False
auth_expose_insecure_global_id_reclaim = False

[osd]
osd memory target = 23630132019

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 9:26 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

On Tue, Aug 8, 2023 at 9:16 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
> Is this correct?
> (copied ceph.conf from secondary cluster to /etc/ceph/crsite directory in 
> primary cluster, copied ceph.mon.keyring from secondary as 
> ceph.client.crsite.mon.keyring in /etc/ceph on primary)
> root@fl31ca104ja0201:/etc/ceph# ls
> ceph.client.admin.keyring  ceph.client.crsite.admin.keyring  
> ceph.client.mirror_remote.keying  crsitefio-fs.test   fs-mnt   rbdmap
> ceph.client.crash.keyring  ceph.client.crsite.mon.keyringceph.conf
>  fio-bsd.test  fio-nfs.test  nfs-mnt  remote_ceph.conf
> root@fl31ca104ja0201:/etc/ceph# ls crsite ceph.conf  ceph.mon.keyring
>
> root@fl31ca104ja0201:/etc/ceph/crsite# ceph -c ceph.conf 
> --id=crsite.mon --cluster=ceph --verbose
> parsed_args: Namespace(admin_socket=None, block=False, 
> cephconf='ceph.conf', client_id='crsite.mon', client_name=None, 
> cluster='ceph', cluster_timeout=None, completion=False, help=False, 
> input_file=None, output_file=None, output_format=None, period=1, 
> setgroup=None, setuser=None, status=False, verbose=True, 
> version=False, watch=False, watch_channel=None, watch_debug=False, 
> watch_error=False, watch_info=False, watch_sec=False, 
> watch_warn=False), childargs: [] ^CCluster connection aborted
>
> Not sure if the --id (CLIENT_ID) is correct.. not able to connect

use `remote_ceph.conf` and id as `mirror_remote` (since I guess these are the 
secondary clusters' conf given the names).
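
Concretely, that check would look roughly like this when run from the primary
ceph-mgr host (the --keyring path here is an assumption based on the file names
shown earlier in the thread):

ceph -c /etc/ceph/remote_ceph.conf \
 --keyring /etc/ceph/remote_ceph.client.mirror_remote.keyring \
 --id mirror_remote status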

>
> Thank you,
> Anantha
>
> -Original Message-
> From: Venky Shankar 
> Sent: Monday, August 7, 2023 7:05 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap 
> import hung
>
> Hi Anantha,
>
> On Tue, Aug 8, 2023 at 6:29 AM Adiga, Anantha  wrote:
> >
> > Hi Venky,
> >
> > The primary and secondary clusters both have the same cluster name "ceph" 
> > and both have a single filesystem by name "cephfs".
>
> That's not an issue.
>
> > How do I check the connection from primary to secondary using mon addr and 
> > key?   What is command line
>
> A quick way to check this would be to place the secondary cluster ceph 
> config file and the user key on one of the primary node (preferably, 
> the ceph-mgr host, just for tests - so purge these when done) and then 
> running
>
> ceph -c /path/to/secondary/ceph.conf -

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Is this correct? 
(copied ceph.conf from secondary cluster to /etc/ceph/crsite directory in 
primary cluster, copied ceph.mon.keyring from secondary as 
ceph.client.crsite.mon.keyring in /etc/ceph on primary)
root@fl31ca104ja0201:/etc/ceph# ls
ceph.client.admin.keyring  ceph.client.crsite.admin.keyring  
ceph.client.mirror_remote.keying  crsitefio-fs.test   fs-mnt   rbdmap
ceph.client.crash.keyring  ceph.client.crsite.mon.keyringceph.conf  
   fio-bsd.test  fio-nfs.test  nfs-mnt  remote_ceph.conf
root@fl31ca104ja0201:/etc/ceph# ls crsite
ceph.conf  ceph.mon.keyring

root@fl31ca104ja0201:/etc/ceph/crsite# ceph -c ceph.conf --id=crsite.mon 
--cluster=ceph --verbose
parsed_args: Namespace(admin_socket=None, block=False, cephconf='ceph.conf', 
client_id='crsite.mon', client_name=None, cluster='ceph', cluster_timeout=None, 
completion=False, help=False, input_file=None, output_file=None, 
output_format=None, period=1, setgroup=None, setuser=None, status=False, 
verbose=True, version=False, watch=False, watch_channel=None, 
watch_debug=False, watch_error=False, watch_info=False, watch_sec=False, 
watch_warn=False), childargs: []
^CCluster connection aborted

Not sure if the --id (CLIENT_ID) is correct.. not able to connect

Thank you,
Anantha

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 7:05 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Anantha,

On Tue, Aug 8, 2023 at 6:29 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
> The primary and secondary clusters both have the same cluster name "ceph" and 
> both have a single filesystem by name "cephfs".

That's not an issue.

> How do I check the connection from primary to secondary using mon addr and 
> key?   What is command line

A quick way to check this would be to place the secondary cluster ceph config 
file and the user key on one of the primary node (preferably, the ceph-mgr 
host, just for tests - so purge these when done) and then running

ceph -c /path/to/secondary/ceph.conf --id <> status

If that runs all fine, then the mirror daemon is probably hitting some bug.
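
If it does look like a daemon-side bug, two read-only checks worth capturing are
the mirror status as seen by the mgr and by the daemon's admin socket (the .asok
name below is a placeholder; the real one sits under /var/run/ceph on the host
running cephfs-mirror):

ceph fs snapshot mirror daemon status
ceph --admin-daemon /var/run/ceph/ceph-client.cephfs-mirror.<host>.<id>.asok status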

> These two clusters are configured for rgw multisite and is functional.
>
> Thank you,
> Anantha
>
> -Original Message-
> From: Venky Shankar 
> Sent: Monday, August 7, 2023 5:46 PM
> To: Adiga, Anantha 
> Cc: ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap 
> import hung
>
> Hi Anantha,
>
> On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  
> wrote:
> >
> > Hi Venky,
> >
> >
> >
> > I tried on another secondary Quincy cluster and it is the same problem. The 
> > peer_bootstrap import command hangs.
>
> A pacific cluster generated peer token should be importable in a quincy 
> source cluster. Looking at the logs, I suspect that the perceived hang is the 
> mirroring module blocked on connecting to the secondary cluster (to set 
> mirror info xattr). Are you able to connect to the secondary cluster from the 
> host running ceph-mgr on the primary cluster using its monitor address (and a 
> key)?
>
> The primary and secondary clusters both have the same cluster name "ceph" and 
> both have a single filesystem by name "cephfs".  How do I check that 
> connection from primary to secondary using mon addr and key?
> These two clusters are configured for rgw multisite and is functional.
>
> >
> >
> >
> >
> >
> > root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap 
> > import cephfs 
> > eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJm
> > aW 
> > xlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIi
> > wg 
> > InNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFX
> > bW 
> > V6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNT
> > Uu 
> > MTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4x
> > OT 
> > ozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOj
> > Mz MDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
> >
> > ……
> >
> > …….
> >
> > ..command does not complete..waits here
> >
> > ^C  to exit.
> >
> > Thereafter some commands do not complete…
> >
> > root@fl31ca104ja0201:/# ceph -s
> >
> >   cluster:
> >
> > id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
> >
> > health: HEALTH_OK
> >
> >
> >
> >   services:
> >
> > mon:   3 daemons, quorum 
> > fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja020

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky, 

The primary and secondary clusters both have the same cluster name "ceph" and 
both have a single filesystem named "cephfs". How do I check the connection 
from primary to secondary using the mon address and key? What is the command line?
These two clusters are configured for RGW multisite and are functional.

Thank you,
Anantha

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 5:46 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Anantha,

On Mon, Aug 7, 2023 at 11:52 PM Adiga, Anantha  wrote:
>
> Hi Venky,
>
>
>
> I tried on another secondary Quincy cluster and it is the same problem. The 
> peer_bootstrap import command hangs.

A pacific cluster generated peer token should be importable in a quincy source 
cluster. Looking at the logs, I suspect that the perceived hang is the 
mirroring module blocked on connecting to the secondary cluster (to set mirror 
info xattr). Are you able to connect to the secondary cluster from the host 
running ceph-mgr on the primary cluster using its monitor address (and a key)?

The primary and secondary clusters both have the same cluster name "ceph" and 
both have a single filesystem by name "cephfs".  How do I check that connection 
from primary to secondary using mon addr and key?   
These two clusters are configured for rgw multisite and is functional.  

>
>
>
>
>
> root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import 
> cephfs 
> eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaW
> xlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwg
> InNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbW
> V6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUu
> MTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOT
> ozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMz
> MDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
>
> ……
>
> …….
>
> ..command does not complete..waits here
>
> ^C  to exit.
>
> Thereafter some commands do not complete…
>
> root@fl31ca104ja0201:/# ceph -s
>
>   cluster:
>
> id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
>
> health: HEALTH_OK
>
>
>
>   services:
>
> mon:   3 daemons, quorum 
> fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
>
> mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> fl31ca104ja0202, fl31ca104ja0203
>
> mds:   1/1 daemons up, 2 standby
>
> osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
>
> cephfs-mirror: 1 daemon active (1 hosts)
>
> rgw:   3 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
> volumes: 1/1 healthy
>
> pools:   25 pools, 769 pgs
>
> objects: 614.40k objects, 1.9 TiB
>
> usage:   2.9 TiB used, 292 TiB / 295 TiB avail
>
> pgs: 769 active+clean
>
>
>
>   io:
>
> client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
>
>
>
> root@fl31ca104ja0201:/#
>
> root@fl31ca104ja0201:/# ceph fs status cephfs
>
> This command also waits. ……
>
>
>
> I have attached the mgr log
>
> root@fl31ca104ja0201:/# ceph service status
>
> {
>
> "cephfs-mirror": {
>
> "5306346": {
>
> "status_stamp": "2023-08-07T17:35:56.884907+",
>
> "last_beacon": "2023-08-07T17:45:01.903540+",
>
> "status": {
>
> "status_json": 
> "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
>
> }
>
> }
>
>
>
> Quincy secondary cluster
>
>
>
> root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
>
> root@a001s008-zz14l47008:/# ceph fs authorize cephfs 
> client.mirror_remote / rwps
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> caps mds = "allow rwps fsname=cephfs"
>
> caps mon = "allow r fsname=cephfs"
>
> caps osd = "allow rw tag cephfs data=cephfs"
>
> root@a001s008-zz14l47008:/#
>
> root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap 
> create cephfs client.mirror_remote shgR-site
>
> {"token": 
> "eyJmc2lkIjogIjJlYWMwZWEw

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Thank you very much.

Anantha

-Original Message-
From: Venky Shankar  
Sent: Monday, August 7, 2023 5:23 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Anantha,

On Tue, Aug 8, 2023 at 1:59 AM Adiga, Anantha  wrote:
>
> Hi Venky,
>
>
>
> Could this be the reason that the peer-bootstrap import is hanging?  how do I 
> upgrade cephfs-mirror to Quincy?

I was on leave yesterday -- will have a look at the log and update.

>
> root@fl31ca104ja0201:/# cephfs-mirror --version
>
> ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific 
> (stable)
>
> root@fl31ca104ja0201:/# ceph version
>
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
>
> root@fl31ca104ja0201:/#
>
>
>
>
>
> Thank you,
>
> Anantha
>
> From: Adiga, Anantha
> Sent: Monday, August 7, 2023 11:21 AM
> To: 'Venky Shankar' ; 'ceph-users@ceph.io' 
> 
> Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import 
> hung
>
>
>
> Hi Venky,
>
>
>
> I tried on another secondary Quincy cluster and it is the same problem. The 
> peer_bootstrap import command hangs.
>
>
>
>
>
> root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
> eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==
>
> ……
>
> …….
>
> ..command does not complete..waits here
>
> ^C  to exit.
>
> Thereafter some commands do not complete…
>
> root@fl31ca104ja0201:/# ceph -s
>
>   cluster:
>
> id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
>
> health: HEALTH_OK
>
>
>
>   services:
>
> mon:   3 daemons, quorum 
> fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
>
> mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
> fl31ca104ja0202, fl31ca104ja0203
>
> mds:   1/1 daemons up, 2 standby
>
> osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
>
> cephfs-mirror: 1 daemon active (1 hosts)
>
> rgw:   3 daemons active (3 hosts, 1 zones)
>
>
>
>   data:
>
> volumes: 1/1 healthy
>
> pools:   25 pools, 769 pgs
>
> objects: 614.40k objects, 1.9 TiB
>
> usage:   2.9 TiB used, 292 TiB / 295 TiB avail
>
> pgs: 769 active+clean
>
>
>
>   io:
>
> client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr
>
>
>
> root@fl31ca104ja0201:/#
>
> root@fl31ca104ja0201:/# ceph fs status cephfs
>
> This command also waits. ……
>
>
>
> I have attached the mgr log
>
> root@fl31ca104ja0201:/# ceph service status
>
> {
>
> "cephfs-mirror": {
>
> "5306346": {
>
> "status_stamp": "2023-08-07T17:35:56.884907+",
>
> "last_beacon": "2023-08-07T17:45:01.903540+",
>
> "status": {
>
> "status_json": 
> "{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
>
> }
>
> }
>
>
>
> Quincy secondary cluster
>
>
>
> root@a001s008-zz14l47008:/# ceph mgr module enable mirroring
>
> root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / 
> rwps
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote
>
> [client.mirror_remote]
>
> key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==
>
> caps mds = "allow rwps fsname=cephfs"
>
> caps mon = "allow r fsname=cephfs"
>
> caps osd = "allow rw tag cephfs data=cephfs"
>
> root@a001s008-zz14l47008:/#
>
> root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
> cephfs client.mirror_remote shgR-site
>
> {"token": 
> "eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0i

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-07 Thread Adiga, Anantha
Hi Venky,

Could this be the reason that the peer_bootstrap import is hanging? How do I 
upgrade cephfs-mirror to Quincy?
root@fl31ca104ja0201:/# cephfs-mirror --version
ceph version 16.2.13 (5378749ba6be3a0868b51803968ee9cde4833a3e) pacific (stable)
root@fl31ca104ja0201:/# ceph version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
root@fl31ca104ja0201:/#
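
If the cluster is cephadm-managed, one way to bring the mirror daemon onto the
same release is to let the orchestrator redeploy it with the cluster's current
image; a sketch (the service name cephfs-mirror is an assumption, check ceph
orch ps for the actual name):

ceph orch ps | grep cephfs-mirror   # which image/version the daemon runs now
ceph orch redeploy cephfs-mirror    # redeploy with the currently configured image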


Thank you,
Anantha
From: Adiga, Anantha
Sent: Monday, August 7, 2023 11:21 AM
To: 'Venky Shankar' ; 'ceph-users@ceph.io' 

Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi Venky,

I tried on another secondary Quincy cluster and it is the same problem. The 
peer_bootstrap import command hangs.



root@fl31ca104ja0201:/# ceph fs  snapshot mirror peer_bootstrap import cephfs 
eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ==

……

…….

..command does not complete..waits here
^C  to exit.
Thereafter some commands do not complete…
root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 2d)
mgr:   fl31ca104ja0201.kkoono(active, since 3d), standbys: 
fl31ca104ja0202, fl31ca104ja0203
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 2d), 44 in (since 5w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.9 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

  io:
client:   32 KiB/s rd, 0 B/s wr, 33 op/s rd, 1 op/s wr

root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/# ceph fs status cephfs
This command also waits. ……

I have attached the mgr log
root@fl31ca104ja0201:/# ceph service status
{
"cephfs-mirror": {
"5306346": {
"status_stamp": "2023-08-07T17:35:56.884907+",
"last_beacon": "2023-08-07T17:45:01.903540+",
"status": {
"status_json": 
"{\"1\":{\"name\":\"cephfs\",\"directory_count\":0,\"peers\":{}}}"
}
}

Quincy secondary cluster


root@a001s008-zz14l47008:/# ceph mgr module enable mirroring

root@a001s008-zz14l47008:/# ceph fs authorize cephfs client.mirror_remote / rwps

[client.mirror_remote]

key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==

root@a001s008-zz14l47008:/# ceph auth get client.mirror_remote

[client.mirror_remote]

key = AQCIFtFkI+SLNhAAWmez2DJpH9eGrbxA9efdoA==

caps mds = "allow rwps fsname=cephfs"

caps mon = "allow r fsname=cephfs"

caps osd = "allow rw tag cephfs data=cephfs"

root@a001s008-zz14l47008:/#

root@a001s008-zz14l47008:/# ceph fs snapshot mirror peer_bootstrap create 
cephfs client.mirror_remote shgR-site

{"token": 
"eyJmc2lkIjogIjJlYWMwZWEwLTYwNDgtNDQ0Zi04NGIyLThjZWVmZWQyN2E1YiIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJzaGdSLXNpdGUiLCAia2V5IjogIkFRQ0lGdEZrSStTTE5oQUFXbWV6MkRKcEg5ZUdyYnhBOWVmZG9BPT0iLCAibW9uX2hvc3QiOiAiW3YyOjEwLjIzOS4xNTUuMTg6MzMwMC8wLHYxOjEwLjIzOS4xNTUuMTg6Njc4OS8wXSBbdjI6MTAuMjM5LjE1NS4xOTozMzAwLzAsdjE6MTAuMjM5LjE1NS4xOTo2Nzg5LzBdIFt2MjoxMC4yMzkuMTU1LjIwOjMzMDAvMCx2MToxMC4yMzkuMTU1LjIwOjY3ODkvMF0ifQ=="}

root@a001s008-zz14l47008:/#

Thank you,
Anantha

From: Adiga, Anantha
Sent: Friday, August 4, 2023 11:55 AM
To: Venky Shankar <vshan...@redhat.com>; ceph-users@ceph.io
Subject: RE: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung


Hi Venky,



Thank you so much for the guidance. Attached is the mgr log.



Note: the 4th node in the primary cluster has smaller capacity  drives, the 
other 3 nodes have the larger capacity drives.

32ssd6.98630   1.0  7.0 TiB   44 GiB   44 GiB   183 KiB  148 MiB  
6.9 TiB  0.62  0.64   40  up  osd.32

-7  76.84927 -   77 TiB  652 GiB  648 GiB20 MiB  3.0 GiB   
76 TiB  0.83  0.86-  host fl31ca104ja0203

  1ssd6.98630   1.0  7.0 TiB   73 GiB   73 GiB   8.0 MiB  333 MiB  
6.9 TiB  1.02  1.06   54  up  osd.1

  4ssd6.98630   1.0  7.0 TiB   77 GiB   77 GiB   1.1 MiB  174 MiB  
6.9 TiB  1.07  

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-03 Thread Adiga, Anantha
Attached log file

-Original Message-
From: Adiga, Anantha  
Sent: Thursday, August 3, 2023 5:50 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Adding additional info:

Clusters A and B both have the same name, "ceph", and each has a single 
filesystem with the same name, "cephfs". Is that the issue? 


Tried using peer_add command and it is hanging as well:

root@fl31ca104ja0201:/#ls /etc/ceph/
cr_ceph.conf  client.mirror_remote.keying ceph.client.admin.keyring  ceph.conf

(remote cluster)
root@cr21meg16ba0101:/etc/ceph# ls /etc/ceph
ceph.client.admin.keyring  ceph.conf   ceph.mon.keyring
  

root@fl31ca104ja0201:/# ceph fs snapshot mirror peer_add cephfs 
client.mirror_remote@cr_ceph  cephfs 
v2:172.18.55.71:3300,v1:172.18.55.71:6789],[v2:172.18.55.72:3300,v1:172.18.55.72:6789],[v2:172.18.55.73:3300,v1:172.18.55.73:6789
 AQCfwMlkM90pLBAAwXtvpp8j04IvC8tqpAG9bA==



Hi

Could you please  provide guidance on how to diagnose this issue:

In this case, there are two Ceph clusters: cluster A (4 nodes) and cluster B (3 
nodes), in different locations. Both are already running RGW multi-site; A is 
the master.

CephFS snapshot mirroring is being configured on the clusters. Cluster A is 
the primary, cluster B is the peer. The bootstrap import step on the primary 
node hangs.

On the target cluster :
---
"version": "16.2.5",
"release": "pacific",
"release_type": "stable"

root@cr21meg16ba0101:/# ceph fs snapshot mirror peer_bootstrap create cephfs 
client.mirror_remote flex2-site
{"token": 
"eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0="}
root@cr21meg16ba0101:/var/run/ceph#

On the source cluster:

"version": "17.2.6",
"release": "quincy",
"release_type": "stable"

root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 111m)
mgr:   fl31ca104ja0201.nwpqlh(active, since 11h), standbys: 
fl31ca104ja0203, fl31ca104ja0202
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 111m), 44 in (since 4w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.8 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

root@fl31ca104ja0302:/# ceph mgr module enable mirroring module 'mirroring' is 
already enabled root@fl31ca104ja0302:/# ceph fs snapshot mirror peer_bootstrap 
import cephfs 
eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0=

root@fl31ca104ja0201:/# ceph fs snapshot mirror daemon status
[{"daemon_id": 5300887, "filesystems": [{"filesystem_id": 1, "name": "cephfs", 
"directory_count": 0, "peers": []}]}]

root@fl31ca104ja0302:/var/run/ceph# ceph --admin-daemon 
/var/run/ceph/ceph-client.cephfs-mirror.fl31ca104ja0302.sypagt.7.94083135960976.asok
 status {
"metadata": {
"ceph_sha1": "d7ff0d10654d2280e08f1ab989c7cdf3064446a5",
"ceph_version": "ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)",
"entity_id": "cephfs-mirror.fl31ca104ja0302.sypagt",
"hostname": "fl31ca104ja0302",
"pid": "7",
"root": "/"
},
"dentry_count": 0,
"dentry_pinned_count": 0,
"id": 5194553,
"inst": {
"name": {
"type": "client",
"num": 5194553
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
}
},
"addr": {
"

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-03 Thread Adiga, Anantha
Adding additional info:

Clusters A and B both have the same name, "ceph", and each has a single 
filesystem with the same name, "cephfs". Is that the issue? 


Tried using peer_add command and it is hanging as well:

root@fl31ca104ja0201:/#ls /etc/ceph/
cr_ceph.conf  client.mirror_remote.keying ceph.client.admin.keyring  ceph.conf

(remote cluster)
root@cr21meg16ba0101:/etc/ceph# ls /etc/ceph
ceph.client.admin.keyring  ceph.conf   ceph.mon.keyring
  

root@fl31ca104ja0201:/# ceph fs snapshot mirror peer_add cephfs 
client.mirror_remote@cr_ceph  cephfs 
v2:172.18.55.71:3300,v1:172.18.55.71:6789],[v2:172.18.55.72:3300,v1:172.18.55.72:6789],[v2:172.18.55.73:3300,v1:172.18.55.73:6789
 AQCfwMlkM90pLBAAwXtvpp8j04IvC8tqpAG9bA==



Hi

Could you please  provide guidance on how to diagnose this issue:

In this case, there are two Ceph clusters: cluster A (4 nodes) and cluster B (3 
nodes), in different locations. Both are already running RGW multi-site; A is 
the master.

CephFS snapshot mirroring is being configured on the clusters. Cluster A is 
the primary, cluster B is the peer. The bootstrap import step on the primary 
node hangs.

On the target cluster :
---
"version": "16.2.5",
"release": "pacific",
"release_type": "stable"

root@cr21meg16ba0101:/# ceph fs snapshot mirror peer_bootstrap create cephfs 
client.mirror_remote flex2-site
{"token": 
"eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0="}
root@cr21meg16ba0101:/var/run/ceph#

On the source cluster:

"version": "17.2.6",
"release": "quincy",
"release_type": "stable"

root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 111m)
mgr:   fl31ca104ja0201.nwpqlh(active, since 11h), standbys: 
fl31ca104ja0203, fl31ca104ja0202
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 111m), 44 in (since 4w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.8 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

root@fl31ca104ja0302:/# ceph mgr module enable mirroring module 'mirroring' is 
already enabled root@fl31ca104ja0302:/# ceph fs snapshot mirror peer_bootstrap 
import cephfs 
eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0=

root@fl31ca104ja0201:/# ceph fs snapshot mirror daemon status
[{"daemon_id": 5300887, "filesystems": [{"filesystem_id": 1, "name": "cephfs", 
"directory_count": 0, "peers": []}]}]

root@fl31ca104ja0302:/var/run/ceph# ceph --admin-daemon 
/var/run/ceph/ceph-client.cephfs-mirror.fl31ca104ja0302.sypagt.7.94083135960976.asok
 status {
"metadata": {
"ceph_sha1": "d7ff0d10654d2280e08f1ab989c7cdf3064446a5",
"ceph_version": "ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)",
"entity_id": "cephfs-mirror.fl31ca104ja0302.sypagt",
"hostname": "fl31ca104ja0302",
"pid": "7",
"root": "/"
},
"dentry_count": 0,
"dentry_pinned_count": 0,
"id": 5194553,
"inst": {
"name": {
"type": "client",
"num": 5194553
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
}
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
},
"inst_str": "client.5194553 10.45.129.5:0/2497002034",
"addr_str": "10.45.129.5:0/2497002034",
"inode_count": 1,
"mds_epoch": 118,
"osd_epoch": 6266,
"osd_epoch_barrier": 0,
"blocklisted": false,
"fs_name": "cephfs"
}

root@fl31ca104ja0302:/home/general# docker logs 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-cephfs-mirror-fl31ca104ja0302-sypagt 
--tail  10 debug 2023-08-03T05:24:27.413+ 7f8eb6fc0280  0 ceph version 
17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable), process 
cephfs-mirror, pid 7 debug 2023-08-03T05:24:27.413+ 7f8eb6fc0280  0 
pidfile_write: 

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-03 Thread Adiga, Anantha


Tried using peer_add command and it is hanging as well:
root@fl31ca104ja0201:/# ceph fs snapshot mirror peer_add cephfs 
client.mirror_remote@cr_ceph  cephfs 
v2:172.18.55.71:3300,v1:172.18.55.71:6789],[v2:172.18.55.72:3300,v1:172.18.55.72:6789],[v2:172.18.55.73:3300,v1:172.18.55.73:6789
 AQCfwMlkM90pLBAAwXtvpp8j04IvC8tqpAG9bA==
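
One thing that stands out above is the monitor list: it is missing its outer
brackets and is unquoted, so the shell may not pass it as a single argument.
Whether or not that explains the hang, a sketch of the same call with the list
fully bracketed and quoted (same addresses and key as above) is:

ceph fs snapshot mirror peer_add cephfs client.mirror_remote@cr_ceph cephfs \
 '[v2:172.18.55.71:3300,v1:172.18.55.71:6789],[v2:172.18.55.72:3300,v1:172.18.55.72:6789],[v2:172.18.55.73:3300,v1:172.18.55.73:6789]' \
 AQCfwMlkM90pLBAAwXtvpp8j04IvC8tqpAG9bA==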



-Original Message-
From: Adiga, Anantha  
Sent: Thursday, August 3, 2023 2:31 PM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

Hi

Could you please  provide guidance on how to diagnose this issue:

In this case, there are two Ceph clusters: cluster A (4 nodes) and cluster B (3 
nodes), in different locations. Both are already running RGW multi-site; A is 
the master.

CephFS snapshot mirroring is being configured on the clusters. Cluster A is 
the primary, cluster B is the peer. The bootstrap import step on the primary 
node hangs.

On the target cluster :
---
"version": "16.2.5",
"release": "pacific",
"release_type": "stable"

root@cr21meg16ba0101:/# ceph fs snapshot mirror peer_bootstrap create cephfs 
client.mirror_remote flex2-site
{"token": 
"eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0="}
root@cr21meg16ba0101:/var/run/ceph#

On the source cluster:

"version": "17.2.6",
"release": "quincy",
"release_type": "stable"

root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 111m)
mgr:   fl31ca104ja0201.nwpqlh(active, since 11h), standbys: 
fl31ca104ja0203, fl31ca104ja0202
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 111m), 44 in (since 4w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.8 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

root@fl31ca104ja0302:/# ceph mgr module enable mirroring module 'mirroring' is 
already enabled root@fl31ca104ja0302:/# ceph fs snapshot mirror peer_bootstrap 
import cephfs 
eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0=

root@fl31ca104ja0201:/# ceph fs snapshot mirror daemon status
[{"daemon_id": 5300887, "filesystems": [{"filesystem_id": 1, "name": "cephfs", 
"directory_count": 0, "peers": []}]}]

root@fl31ca104ja0302:/var/run/ceph# ceph --admin-daemon 
/var/run/ceph/ceph-client.cephfs-mirror.fl31ca104ja0302.sypagt.7.94083135960976.asok
 status {
"metadata": {
"ceph_sha1": "d7ff0d10654d2280e08f1ab989c7cdf3064446a5",
"ceph_version": "ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)",
"entity_id": "cephfs-mirror.fl31ca104ja0302.sypagt",
"hostname": "fl31ca104ja0302",
"pid": "7",
"root": "/"
},
"dentry_count": 0,
"dentry_pinned_count": 0,
"id": 5194553,
"inst": {
"name": {
"type": "client",
"num": 5194553
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
}
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
},
"inst_str": "client.5194553 10.45.129.5:0/2497002034",
"addr_str": "10.45.129.5:0/2497002034",
"inode_count": 1,
"mds_epoch": 118,
"osd_epoch": 6266,
"osd_epoch_barrier": 0,

[ceph-users] Re: cephfs snapshot mirror peer_bootstrap import hung

2023-08-03 Thread Adiga, Anantha
Hi

Could you please  provide guidance on how to diagnose this issue:

In this case, there are two Ceph clusters: cluster A (4 nodes) and cluster B (3 
nodes), in different locations. Both are already running RGW multi-site; A is 
the master.

CephFS snapshot mirroring is being configured on the clusters. Cluster A is 
the primary, cluster B is the peer. The bootstrap import step on the primary 
node hangs.

On the target cluster :
---
"version": "16.2.5",
"release": "pacific",
"release_type": "stable"

root@cr21meg16ba0101:/# ceph fs snapshot mirror peer_bootstrap create cephfs 
client.mirror_remote flex2-site
{"token": 
"eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0="}
root@cr21meg16ba0101:/var/run/ceph#

On the source cluster:

"version": "17.2.6",
"release": "quincy",
"release_type": "stable"

root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 111m)
mgr:   fl31ca104ja0201.nwpqlh(active, since 11h), standbys: 
fl31ca104ja0203, fl31ca104ja0202
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 111m), 44 in (since 4w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.8 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

root@fl31ca104ja0302:/# ceph mgr module enable mirroring module 'mirroring' is 
already enabled root@fl31ca104ja0302:/# ceph fs snapshot mirror peer_bootstrap 
import cephfs 
eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0=

root@fl31ca104ja0201:/# ceph fs snapshot mirror daemon status
[{"daemon_id": 5300887, "filesystems": [{"filesystem_id": 1, "name": "cephfs", 
"directory_count": 0, "peers": []}]}]

root@fl31ca104ja0302:/var/run/ceph# ceph --admin-daemon 
/var/run/ceph/ceph-client.cephfs-mirror.fl31ca104ja0302.sypagt.7.94083135960976.asok
 status {
"metadata": {
"ceph_sha1": "d7ff0d10654d2280e08f1ab989c7cdf3064446a5",
"ceph_version": "ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)",
"entity_id": "cephfs-mirror.fl31ca104ja0302.sypagt",
"hostname": "fl31ca104ja0302",
"pid": "7",
"root": "/"
},
"dentry_count": 0,
"dentry_pinned_count": 0,
"id": 5194553,
"inst": {
"name": {
"type": "client",
"num": 5194553
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
}
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
},
"inst_str": "client.5194553 10.45.129.5:0/2497002034",
"addr_str": "10.45.129.5:0/2497002034",
"inode_count": 1,
"mds_epoch": 118,
"osd_epoch": 6266,
"osd_epoch_barrier": 0,
"blocklisted": false,
"fs_name": "cephfs"
}

root@fl31ca104ja0302:/home/general# docker logs 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-cephfs-mirror-fl31ca104ja0302-sypagt 
--tail  10 debug 2023-08-03T05:24:27.413+ 7f8eb6fc0280  0 ceph version 
17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable), process 
cephfs-mirror, pid 7 debug 2023-08-03T05:24:27.413+ 7f8eb6fc0280  0 
pidfile_write: ignore empty --pid-file debug 2023-08-03T05:24:27.445+ 
7f8eb6fc0280  1 mgrc service_daemon_register cephfs-mirror.5184622 metadata 
{arch=x86_64,ceph_release=quincy,ceph_version=ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy 
(stable),ceph_version_short=17.2.6,container_hostname=fl31ca104ja0302,container_image=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e,cpu=Intel(R)
 Xeon(R) Gold 6252 CPU @ 2.10GHz,distro=centos,distro_description=CentOS Stream 
8,distro_version=8,hostname=fl31ca104ja0302,id=fl31ca104ja0302.sypagt,instance_id=5184622,kernel_description=#82-Ub
 untu SMP Tue Jun 6 23:10:23 UTC 

[ceph-users] cephfs snapshot mirror peer_bootstrap import hung

2023-08-03 Thread Adiga, Anantha
Hi

Could you please  provide guidance on how to diagnose this issue:

In this case, there are two Ceph clusters: cluster A (4 nodes) and cluster B (3 
nodes), in different locations. Both are already running RGW multi-site; A is 
the master.

CephFS snapshot mirroring is being configured on the clusters. Cluster A is 
the primary, cluster B is the peer. The bootstrap import step on the primary 
node hangs.

On the target cluster :
---
"version": "16.2.5",
"release": "pacific",
"release_type": "stable"

root@cr21meg16ba0101:/# ceph fs snapshot mirror peer_bootstrap create cephfs 
client.mirror_remote flex2-site
{"token": 
"eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0="}
root@cr21meg16ba0101:/var/run/ceph#

On the source cluster:

"version": "17.2.6",
"release": "quincy",
"release_type": "stable"

root@fl31ca104ja0201:/# ceph -s
  cluster:
id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
health: HEALTH_OK

  services:
mon:   3 daemons, quorum 
fl31ca104ja0202,fl31ca104ja0203,fl31ca104ja0201 (age 111m)
mgr:   fl31ca104ja0201.nwpqlh(active, since 11h), standbys: 
fl31ca104ja0203, fl31ca104ja0202
mds:   1/1 daemons up, 2 standby
osd:   44 osds: 44 up (since 111m), 44 in (since 4w)
cephfs-mirror: 1 daemon active (1 hosts)
rgw:   3 daemons active (3 hosts, 1 zones)

  data:
volumes: 1/1 healthy
pools:   25 pools, 769 pgs
objects: 614.40k objects, 1.9 TiB
usage:   2.8 TiB used, 292 TiB / 295 TiB avail
pgs: 769 active+clean

root@fl31ca104ja0302:/# ceph mgr module enable mirroring
module 'mirroring' is already enabled
root@fl31ca104ja0302:/# ceph fs snapshot mirror peer_bootstrap import cephfs 
eyJmc2lkIjogImE2ZjUyNTk4LWU1Y2QtNGEwOC04NDIyLTdiNmZkYjFkNWRiZSIsICJmaWxlc3lzdGVtIjogImNlcGhmcyIsICJ1c2VyIjogImNsaWVudC5taXJyb3JfcmVtb3RlIiwgInNpdGVfbmFtZSI6ICJmbGV4Mi1zaXRlIiwgImtleSI6ICJBUUNmd01sa005MHBMQkFBd1h0dnBwOGowNEl2Qzh0cXBBRzliQT09IiwgIm1vbl9ob3N0IjogIlt2MjoxNzIuMTguNTUuNzE6MzMwMC8wLHYxOjE3Mi4xOC41NS43MTo2Nzg5LzBdIFt2MjoxNzIuMTguNTUuNzM6MzMwMC8wLHYxOjE3Mi4xOC41NS43Mzo2Nzg5LzBdIn0=


root@fl31ca104ja0302:/var/run/ceph# ceph --admin-daemon 
/var/run/ceph/ceph-client.cephfs-mirror.fl31ca104ja0302.sypagt.7.94083135960976.asok
 status
{
"metadata": {
"ceph_sha1": "d7ff0d10654d2280e08f1ab989c7cdf3064446a5",
"ceph_version": "ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)",
"entity_id": "cephfs-mirror.fl31ca104ja0302.sypagt",
"hostname": "fl31ca104ja0302",
"pid": "7",
"root": "/"
},
"dentry_count": 0,
"dentry_pinned_count": 0,
"id": 5194553,
"inst": {
"name": {
"type": "client",
"num": 5194553
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
}
},
"addr": {
"type": "v1",
"addr": "10.45.129.5:0",
"nonce": 2497002034
},
"inst_str": "client.5194553 10.45.129.5:0/2497002034",
"addr_str": "10.45.129.5:0/2497002034",
"inode_count": 1,
"mds_epoch": 118,
"osd_epoch": 6266,
"osd_epoch_barrier": 0,
"blocklisted": false,
"fs_name": "cephfs"
}

root@fl31ca104ja0302:/home/general# docker logs 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-cephfs-mirror-fl31ca104ja0302-sypagt 
--tail  10
debug 2023-08-03T05:24:27.413+ 7f8eb6fc0280  0 ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable), process 
cephfs-mirror, pid 7
debug 2023-08-03T05:24:27.413+ 7f8eb6fc0280  0 pidfile_write: ignore empty 
--pid-file
debug 2023-08-03T05:24:27.445+ 7f8eb6fc0280  1 mgrc service_daemon_register 
cephfs-mirror.5184622 metadata 
{arch=x86_64,ceph_release=quincy,ceph_version=ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy 
(stable),ceph_version_short=17.2.6,container_hostname=fl31ca104ja0302,container_image=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e,cpu=Intel(R)
 Xeon(R) Gold 6252 CPU @ 2.10GHz,distro=centos,distro_description=CentOS Stream 
8,distro_version=8,hostname=fl31ca104ja0302,id=fl31ca104ja0302.sypagt,instance_id=5184622,kernel_description=#82-Ubuntu
 SMP Tue Jun 6 23:10:23 UTC 
2023,kernel_version=5.15.0-75-generic,mem_swap_kb=8388604,mem_total_kb=527946928,os=Linux}
debug 2023-08-03T05:27:10.419+ 7f8ea1b2c700  0 client.5194553 
ms_handle_reset on v2:10.45.128.141:3300/0
debug 2023-08-03T05:50:10.917+ 

[ceph-users] mgr services frequently crash on nodes 2,3,4

2023-08-02 Thread Adiga, Anantha
Hi,

Mgr services crash frequently on nodes 2, 3, and 4 with the same condition after 
the 4th node was added.

root@zp3110b001a0104:/# ceph crash stat
19 crashes recorded
16 older than 1 days old:
2023-07-29T03:35:32.006309Z_7b622c2b-a2fc-425a-acb8-dc1673b4c189
2023-07-29T03:35:32.055174Z_a2ee1e23-5f41-4dbe-86ff-643fbf870dc9
2023-07-29T14:34:13.752432Z_39b6a0d9-1bc3-4481-9a14-c92fea6c2710
2023-07-30T03:02:57.510867Z_df595e04-0ac2-4e3d-93be-a7225348ea19
2023-07-30T06:20:09.322530Z_0c2485f8-281c-4440-8b08-89b08a669de4
2023-07-30T10:16:46.798405Z_79082f37-ee08-4a2b-84d1-d96c4026f321
2023-07-30T10:16:46.843441Z_788391d6-3278-48c4-a95b-1934ee3265c1
2023-07-31T02:26:55.903966Z_416a1e94-a8e1-4057-a683-a907faf400a1
2023-07-31T04:40:10.216044Z_bef9d811-4e92-45cd-bcd7-3282962c8dfe
2023-07-31T08:44:20.893344Z_037688ae-266f-4879-932c-2239f4679fd6
2023-07-31T09:22:12.527968Z_f136c93b-7156-4176-a734-66a5a62513a4
2023-07-31T15:22:08.417988Z_b80c6255-5eb3-41dd-b0b1-8bc5b070094f
2023-07-31T23:05:16.589501Z_20ed8ef9-a478-49de-a371-08ea7a9937e5
2023-08-01T01:26:01.911387Z_670f9e3c-7fbe-497f-9f0b-abeaefd8f2b3
2023-08-01T01:51:39.759874Z_ff8206e4-34aa-44fe-82ac-7339e6714bb7
2023-08-01T01:56:21.955706Z_98c86cdd-45ec-47dc-8f0c-2e5e09731db8
7 older than 3 days old:
2023-07-29T03:35:32.006309Z_7b622c2b-a2fc-425a-acb8-dc1673b4c189
2023-07-29T03:35:32.055174Z_a2ee1e23-5f41-4dbe-86ff-643fbf870dc9
2023-07-29T14:34:13.752432Z_39b6a0d9-1bc3-4481-9a14-c92fea6c2710
2023-07-30T03:02:57.510867Z_df595e04-0ac2-4e3d-93be-a7225348ea19
2023-07-30T06:20:09.322530Z_0c2485f8-281c-4440-8b08-89b08a669de4
2023-07-30T10:16:46.798405Z_79082f37-ee08-4a2b-84d1-d96c4026f321
2023-07-30T10:16:46.843441Z_788391d6-3278-48c4-a95b-1934ee3265c1

root@zp3110b001a0104:/var/lib/ceph/8dbfcd81-fee3-49d2-ac0c-e988c8be7178/crash/posted/2023-07-31T08:44:20.893344Z_037688ae-266f-4879-932c-2239f4679fd6#
 cat meta
{
"crash_id": 
"2023-07-31T08:44:20.893344Z_037688ae-266f-4879-932c-2239f4679fd6",
"timestamp": "2023-07-31T08:44:20.893344Z",
"process_name": "ceph-mgr",
"entity_name": "mgr.zp3110b001a0104.tmbkzq",
"ceph_version": "16.2.5",
"utsname_hostname": "zp3110b001a0104",
"utsname_sysname": "Linux",
"utsname_release": "5.4.0-153-generic",
"utsname_version": "#170-Ubuntu SMP Fri Jun 16 13:43:31 UTC 2023",
"utsname_machine": "x86_64",
"os_name": "CentOS Linux",
"os_id": "centos",
"os_version_id": "8",
"os_version": "8",
"assert_condition": "pending_service_map.epoch > service_map.epoch",
"assert_func": "DaemonServer::got_service_map()::",
"assert_file": 
"/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mgr/DaemonServer.cc",
"assert_line": 2932,
"assert_thread_name": "ms_dispatch",
"assert_msg": 
"/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mgr/DaemonServer.cc:
 In function 'DaemonServer::got_service_map()::' 
thread 7f127440a700 time 
2023-07-31T08:44:20.887150+\n/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mgr/DaemonServer.cc:
 2932: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch)\n",
"backtrace": [
"/lib64/libpthread.so.0(+0x12b20) [0x7f127c611b20]",
"gsignal()",
"abort()",
"(ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1a9) [0x7f127da26b75]",
"/usr/lib64/ceph/libceph-common.so.2(+0x276d3e) [0x7f127da26d3e]",
"(DaemonServer::got_service_map()+0xb2d) [0x5625aee23a4d]",
"(Mgr::handle_service_map(boost::intrusive_ptr)+0x1b6) 
[0x5625aee527c6]",
"(Mgr::ms_dispatch2(boost::intrusive_ptr const&)+0x894) 
[0x5625aee55424]",
"(MgrStandby::ms_dispatch2(boost::intrusive_ptr const&)+0xb0) 
[0x5625aee5ec10]",
"(DispatchQueue::entry()+0x126a) [0x7f127dc610ca]",
"(DispatchQueue::DispatchThread::entry()+0x11) [0x7f127dd11591]",
"/lib64/libpthread.so.0(+0x814a) [0x7f127c60714a]",
"clone()"
]
}
root@zp3110b001a0104:/var/lib/ceph/8dbfcd81-fee3-49d2-ac0c-e988c8be7178/crash/posted/2023-07-31T08:44:20.893344Z_037688ae-266f-4879-932c-2239f4679fd6#
 more log
--- begin dump of recent events ---
-> 2023-07-31T08:27:14.084+ 7f126fc01700 10 monclient: 
_send_mon_message to mon.zp3110b001a0104 at v2:XX.XXX.26.4:3300/0
-9998> 
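
For triaging entries like these, the crash module can dump the full backtrace
for a given ID and then archive entries once reviewed (a sketch; substitute any
of the crash IDs listed above):

ceph crash info 2023-07-31T08:44:20.893344Z_037688ae-266f-4879-932c-2239f4679fd6
ceph crash archive-all   # or: ceph crash archive <crash_id>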

[ceph-users] Re: warning: CEPHADM_APPLY_SPEC_FAIL

2023-06-29 Thread Adiga, Anantha
This was a simple step to delete the service:
/# ceph orch rm osd.iops_optimized

WARN goes away

Just FYI: ceph orch help does not list the rm option.
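
For anyone hitting the same CEPHADM_APPLY_SPEC_FAIL, the failing spec can be
inspected before it is removed; a short sketch using the service id from this
thread:

ceph orch ls osd --export       # dump the OSD service specs cephadm keeps applying
ceph orch rm osd.iops_optimized # drop the bad spec; the health warning then clears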

Thank you,
Anantha

From: Adiga, Anantha 
Sent: Thursday, June 29, 2023 4:38 PM
To: ceph-users@ceph.io
Subject: [ceph-users] warning: CEPHADM_APPLY_SPEC_FAIL

Hi,

I am not finding any reference to clear this warning AND stop the service. See 
below

After creating OSD with iops_optimized option, this WARN mesg appear.
Ceph 17.2.6

6/29/23 4:10:45 PM
[WRN]
Health check failed: Failed to apply 1 service(s): osd.iops_optimized 
(CEPHADM_APPLY_SPEC_FAIL)

6/29/23 4:10:45 PM
[ERR]
Failed to apply osd.iops_optimized spec 
DriveGroupSpec.from_json(yaml.safe_load('''service_type: osd service_id: 
iops_optimized service_name: osd.iops_optimized placement: host_pattern: '*' 
spec: data_devices: rotational: 0 filter_logic: AND objectstore: bluestore 
''')): cephadm exited with an error code: 1, stderr:Inferring config 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/mon.fl31ca104ja0203/config 
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host 
--stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume 
--privileged --group-add=disk --init -e 
CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 -e NODE_NAME=fl31ca104ja0203 -e

#:/var/log/ceph# grep optimi cephadm.log
cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=iops_optimized', '--image', 
'quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e',
 'ceph-volume', '--fsid', 'd0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e', 
'--config-json', '-', '--', 'lvm', 'batch', '--no-auto', '/dev/nvme10n1', 
'/dev/nvme11n1', '/dev/nvme12n1', '/dev/nvme13n1', '/dev/nvme1n1', 
'/dev/nvme2n1', '/dev/nvme3n1', '/dev/nvme4n1', '/dev/nvme5n1', '/dev/nvme6n1', 
'/dev/nvme7n1', '/dev/nvme8n1', '/dev/nvme9n1', '--yes', '--no-systemd']
2023-06-29 23:06:28,340 7fc2668a7740 INFO Non-zero exit code 1 from 
/usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host 
--entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e 
CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 -e NODE_NAME=fl31ca104ja0202 -e CEPH_USE_RANDOM_NONCE=1 -e 
CEPH_VOLUME_OSDSPEC_AFFINITY=iops_optimized -e CEPH_VOLUME_SKIP_RESTORECON=yes 
-e CEPH_VOLUME_DEBUG=1 -v 
/var/run/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e:/var/run/ceph:z -v 
/var/log/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e:/var/log/ceph:z -v 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/crash:/var/lib/ceph/crash:z 
-v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v 
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v 
/tmp/ceph-tmp1v09i0jx:/etc/ceph/ceph.conf:z -v 
/tmp/ceph-tmphy3pnh46:/var/lib/ceph/bootstrap-osd/ceph.keyring:z 
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 lvm batch --no-auto /dev/nvme10n1 /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 
/dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 
/dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 --yes --no-systemd
#:/var/log/ceph# grep optimi cephadm.log
cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=iops_optimized', '--image', 
'quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e',
 'ceph-volume', '--fsid', 'd0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e', 
'--config-json', '-', '--', 'lvm', 'batch', '--no-auto', '/dev/nvme10n1', 
'/dev/nvme11n1', '/dev/nvme12n1', '/dev/nvme13n1', '/dev/nvme1n1', 
'/dev/nvme2n1', '/dev/nvme3n1', '/dev/nvme4n1', '/dev/nvme5n1', '/dev/nvme6n1', 
'/dev/nvme7n1', '/dev/nvme8n1', '/dev/nvme9n1', '--yes', '--no-systemd']
2023-06-29 23:06:28,340 7fc2668a7740 INFO Non-zero exit code 1 from 
/usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host 
--entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e 
CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 -e NODE_NAME=fl31ca104ja0202 -e CEPH_USE_RANDOM_NONCE=1 -e 
CEPH_VOLUME_OSDSPEC_AFFINITY=iops_optimized -e CEPH_VOLUME_SKIP_RESTORECON=yes 
-e CEPH_VOLUME_DEBUG=1 -v 
/var/run/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e:/var/run/ceph:z -v 
/var/log/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e:/var/log/ceph:z -v 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/crash:/var/lib/ceph/crash:z 
-v /dev:/dev -v /run/ude

[ceph-users] warning: CEPHADM_APPLY_SPEC_FAIL

2023-06-29 Thread Adiga, Anantha
Hi,

I am not finding any reference to clear this warning AND stop the service. See 
below

After creating OSD with iops_optimized option, this WARN mesg appear.
Ceph 17.2.6

6/29/23 4:10:45 PM
[WRN]
Health check failed: Failed to apply 1 service(s): osd.iops_optimized 
(CEPHADM_APPLY_SPEC_FAIL)

6/29/23 4:10:45 PM
[ERR]
Failed to apply osd.iops_optimized spec 
DriveGroupSpec.from_json(yaml.safe_load('''service_type: osd service_id: 
iops_optimized service_name: osd.iops_optimized placement: host_pattern: '*' 
spec: data_devices: rotational: 0 filter_logic: AND objectstore: bluestore 
''')): cephadm exited with an error code: 1, stderr:Inferring config 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/mon.fl31ca104ja0203/config 
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host 
--stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume 
--privileged --group-add=disk --init -e 
CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 -e NODE_NAME=fl31ca104ja0203 -e

#:/var/log/ceph# grep optimi cephadm.log
cephadm ['--env', 'CEPH_VOLUME_OSDSPEC_AFFINITY=iops_optimized', '--image', 
'quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e',
 'ceph-volume', '--fsid', 'd0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e', 
'--config-json', '-', '--', 'lvm', 'batch', '--no-auto', '/dev/nvme10n1', 
'/dev/nvme11n1', '/dev/nvme12n1', '/dev/nvme13n1', '/dev/nvme1n1', 
'/dev/nvme2n1', '/dev/nvme3n1', '/dev/nvme4n1', '/dev/nvme5n1', '/dev/nvme6n1', 
'/dev/nvme7n1', '/dev/nvme8n1', '/dev/nvme9n1', '--yes', '--no-systemd']
2023-06-29 23:06:28,340 7fc2668a7740 INFO Non-zero exit code 1 from 
/usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host 
--entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e 
CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 -e NODE_NAME=fl31ca104ja0202 -e CEPH_USE_RANDOM_NONCE=1 -e 
CEPH_VOLUME_OSDSPEC_AFFINITY=iops_optimized -e CEPH_VOLUME_SKIP_RESTORECON=yes 
-e CEPH_VOLUME_DEBUG=1 -v 
/var/run/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e:/var/run/ceph:z -v 
/var/log/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e:/var/log/ceph:z -v 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/crash:/var/lib/ceph/crash:z 
-v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v 
/run/lock/lvm:/run/lock/lvm -v /:/rootfs -v 
/tmp/ceph-tmp1v09i0jx:/etc/ceph/ceph.conf:z -v 
/tmp/ceph-tmphy3pnh46:/var/lib/ceph/bootstrap-osd/ceph.keyring:z 
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
 lvm batch --no-auto /dev/nvme10n1 /dev/nvme11n1 /dev/nvme12n1 /dev/nvme13n1 
/dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 
/dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 --yes --no-systemd

The warning message clears on its own for a few 

[ceph-users] Re: ceph orch host label rm : does not update label removal

2023-06-27 Thread Adiga, Anantha
Hello,  

This issue is resolved. 

The syntax of providing the labels was not correct. 
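
For the record, the label string contained spaces, so the shell split it into
several arguments and "label rm" never matched the stored label. Roughly what
worked (commands from memory, so treat them as a sketch) was quoting the whole
string as one argument and then re-adding the intended labels one at a time:

ceph orch host label rm fl31ca104ja0302 "mgrs,ceph osd,rgws.ceph"
ceph orch host label add fl31ca104ja0302 mgrs
ceph orch host label add fl31ca104ja0302 osds
ceph orch host label add fl31ca104ja0302 rgws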

-Original Message-
From: Adiga, Anantha  
Sent: Thursday, June 22, 2023 1:08 PM
To: ceph-users@ceph.io
Subject: [ceph-users] ceph orch host label rm : does not update label removal

Hi ,

Not sure if the labels are really removed, or if the update is not working?
This was taken as a single label: mgrs,ceph osd,rgws.ceph  


root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster

root@fl31ca104ja0201:/# ceph orch host label rm fl31ca104ja0302 mgrs,ceph 
osd,rgws.ceph 
Removed label mgrs,ceph osd,rgws.ceph from host fl31ca104ja0302

root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin 
4 hosts in cluster

Thank you,
Anantha




root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster
root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/# ceph orch host label rm fl31ca104ja0302 rgws.ceph 
Removed label rgws.ceph from host fl31ca104ja0302 root@fl31ca104ja0201:/# ceph 
orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster
root@fl31ca104ja0201:/# ceph orch host label rm fl31ca104ja0302 rgws.ceph 
--force Removed label rgws.ceph from host fl31ca104ja0302 
root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster

Regards,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Grafana service fails to start due to bad directory name after Quincy upgrade

2023-06-23 Thread Adiga, Anantha
Hi Nizam,

Thanks much for the detail.


Regards,
Anantha



From: Nizamudeen A 
Sent: Friday, June 23, 2023 12:25 AM
To: Adiga, Anantha 
Cc: Eugen Block ; ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Grafana service fails to start due to bad 
directory name after Quincy upgrade

Hi,

You can upgrade the grafana version individually by setting the config_opt for 
grafana container image like:
ceph config set mgr mgr/cephadm/container_image_grafana quay.io/ceph/ceph-grafana:8.3.5

and then redeploy the grafana container again either via dashboard or cephadm.
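
For the cephadm route that would be something along these lines (assuming the
service shows up as "grafana" in "ceph orch ls"):

ceph orch redeploy grafana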

Regards,
Nizam



On Fri, Jun 23, 2023 at 12:05 AM Adiga, Anantha 
mailto:anantha.ad...@intel.com>> wrote:
Hi Eugen,

Thank you so much for the details.  Here is the update (comments in-line >>):

Regards,
Anantha
-Original Message-
From: Eugen Block mailto:ebl...@nde.ag>>
Sent: Monday, June 19, 2023 5:27 AM
To: ceph-users@ceph.io<mailto:ceph-users@ceph.io>
Subject: [ceph-users] Re: Grafana service fails to start due to bad directory 
name after Quincy upgrade

Hi,

so grafana is starting successfully now? What did you change?
>>  I stopped and removed the Grafana image and  started it from "Ceph 
>> Dashboard" service. The version is still 6.7.4. I also had to change the 
>> following.
I do not have a way to make this permanent; if the service is redeployed I
will lose the changes.
I did not save the file that cephadm generated. This was one reason why the
Grafana service would not start. I had to replace it with the one below to
resolve this issue.
[users]
  default_theme = light
[auth.anonymous]
  enabled = true
  org_name = 'Main Org.'
  org_role = 'Viewer'
[server]
  domain = 'bootstrap.storage.lab'
  protocol = https
  cert_file = /etc/grafana/certs/cert_file
  cert_key = /etc/grafana/certs/cert_key
  http_port = 3000
  http_addr =
[snapshots]
  external_enabled = false
[security]
  disable_initial_admin_creation = false
  cookie_secure = true
  cookie_samesite = none
  allow_embedding = true
  admin_password = paswd-value
  admin_user = user-name

Also this was the other change:
# This file is generated by cephadm.
apiVersion: 1   <--  This was the line added to 
var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana.fl31ca104ja0201/etc/grafana/provisioning/datasources/ceph-dashboard.yml
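
To keep the custom grafana.ini from being overwritten on the next redeploy, my
understanding from the cephadm monitoring docs is that the file can be stored
as a config-key so cephadm regenerates it from your copy instead of its
template. Roughly (the path to the ini file here is just an example):

ceph config-key set mgr/cephadm/services/grafana/grafana.ini -i /root/grafana.ini
ceph orch redeploy grafana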
>>
Regarding the container images, yes there are defaults in cephadm which can be 
overridden with ceph config. Can you share this output?

ceph config dump | grep container_image
>>
Here it is
root@fl31ca104ja0201:/# ceph config dump | grep container_image
global   basic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e<http://quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e>
  *
mgr  advanced  
mgr/cephadm/container_image_alertmanager   
docker.io/prom/alertmanager:v0.16.2<http://docker.io/prom/alertmanager:v0.16.2> 
   *
mgr  advanced  
mgr/cephadm/container_image_base   
quay.io/ceph/daemon<http://quay.io/ceph/daemon>
mgr  advanced  
mgr/cephadm/container_image_grafana
docker.io/grafana/grafana:6.7.4<http://docker.io/grafana/grafana:6.7.4> 
   *
mgr  advanced  
mgr/cephadm/container_image_node_exporter  
docker.io/prom/node-exporter:v0.17.0<http://docker.io/prom/node-exporter:v0.17.0>
   *
mgr  advanced  
mgr/cephadm/container_image_prometheus 
docker.io/prom/prometheus:v2.7.2<http://docker.io/prom/prometheus:v2.7.2>   
*
client.rgw.default.default.fl31ca104ja0201.ninovsbasic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e<http://quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e>
  *
client.rgw.default.default.fl31ca104ja0202.yhjkmbbasic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e<http://quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e>
  *
client.rgw.default.default.fl31ca104ja0203.fqnriqbasic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e<h

[ceph-users] ceph orch host label rm : does not update label removal

2023-06-22 Thread Adiga, Anantha
Hi ,

Not sure if the labels are really removed, or if the update is not working?



root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster
root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/#
root@fl31ca104ja0201:/# ceph orch host label rm fl31ca104ja0302 rgws.ceph
Removed label rgws.ceph from host fl31ca104ja0302
root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster
root@fl31ca104ja0201:/# ceph orch host label rm fl31ca104ja0302 rgws.ceph 
--force
Removed label rgws.ceph from host fl31ca104ja0302
root@fl31ca104ja0201:/# ceph orch host ls
HOST ADDR   LABELS  
  STATUS
fl31ca104ja0201  XX.XX.XXX.139  ceph clients mdss mgrs monitoring mons osds rgws
fl31ca104ja0202  XX.XX.XXX.140  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0203  XX.XX.XXX.141  ceph clients mdss mgrs mons osds rgws
fl31ca104ja0302  XX.XX.XXX.5_admin mgrs,ceph osd,rgws.ceph
4 hosts in cluster

Regards,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Grafana service fails to start due to bad directory name after Quincy upgrade

2023-06-22 Thread Adiga, Anantha
Hi Eugen,

Thank you so much for the details.  Here is the update (comments in-line >>):

Regards,
Anantha
-Original Message-
From: Eugen Block  
Sent: Monday, June 19, 2023 5:27 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Grafana service fails to start due to bad directory 
name after Quincy upgrade

Hi,

so grafana is starting successfully now? What did you change?  
>>  I stopped and removed the Grafana image and  started it from "Ceph 
>> Dashboard" service. The version is still 6.7.4. I also had to change the 
>> following. 
I do not have a way to make this permanent; if the service is redeployed I
will lose the changes.
I did not save the file that cephadm generated. This was one reason why the
Grafana service would not start. I had to replace it with the one below to
resolve this issue.
[users]
  default_theme = light
[auth.anonymous]
  enabled = true
  org_name = 'Main Org.'
  org_role = 'Viewer'
[server]
  domain = 'bootstrap.storage.lab'
  protocol = https
  cert_file = /etc/grafana/certs/cert_file
  cert_key = /etc/grafana/certs/cert_key
  http_port = 3000
  http_addr =
[snapshots]
  external_enabled = false
[security]
  disable_initial_admin_creation = false
  cookie_secure = true
  cookie_samesite = none
  allow_embedding = true
  admin_password = paswd-value
  admin_user = user-name

Also this was the other change: 
# This file is generated by cephadm.
apiVersion: 1   <--  This was the line added to 
var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana.fl31ca104ja0201/etc/grafana/provisioning/datasources/ceph-dashboard.yml
>>
Regarding the container images, yes there are defaults in cephadm which can be 
overridden with ceph config. Can you share this output?

ceph config dump | grep container_image
>>
Here it is
root@fl31ca104ja0201:/# ceph config dump | grep container_image
global   basic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
  *
mgr  advanced  
mgr/cephadm/container_image_alertmanager   docker.io/prom/alertmanager:v0.16.2  
  *
mgr  advanced  
mgr/cephadm/container_image_base   quay.io/ceph/daemon
mgr  advanced  
mgr/cephadm/container_image_grafanadocker.io/grafana/grafana:6.7.4  
  *
mgr  advanced  
mgr/cephadm/container_image_node_exporter  docker.io/prom/node-exporter:v0.17.0 
  *
mgr  advanced  
mgr/cephadm/container_image_prometheus docker.io/prom/prometheus:v2.7.2 
  *
client.rgw.default.default.fl31ca104ja0201.ninovsbasic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
  *
client.rgw.default.default.fl31ca104ja0202.yhjkmbbasic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
  *
client.rgw.default.default.fl31ca104ja0203.fqnriqbasic 
container_image
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
  *
>>
I tend to always use a specific image as described here [2]. I also haven't 
deployed grafana via dashboard yet so I can't really comment on that as well as 
on the warnings you report.


>>OK. The need for that is that in Quincy, when you enable Loki and Promtail,
>>the Ceph dashboard pulls in the Grafana dashboard to view the daemon logs. I
>>will let you know once that issue is resolved.

Regards,
Eugen

[2]
https://docs.ceph.com/en/latest/cephadm/services/monitoring/#using-custom-images
>> Thank you I am following the document now
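
In case it is useful to anyone following the thread, pinning the monitoring
stack to the versions the Quincy release notes mention would look roughly like
the following (the config options are the ones from the dump above; the image
tags are my best guess at the matching upstream tags, so double-check them
against the docs before applying):

ceph config set mgr mgr/cephadm/container_image_grafana quay.io/ceph/ceph-grafana:8.3.5
ceph config set mgr mgr/cephadm/container_image_prometheus quay.io/prometheus/prometheus:v2.33.4
ceph config set mgr mgr/cephadm/container_image_alertmanager quay.io/prometheus/alertmanager:v0.23.0
ceph config set mgr mgr/cephadm/container_image_node_exporter quay.io/prometheus/node-exporter:v1.3.1
ceph orch redeploy grafana
ceph orch redeploy prometheus
ceph orch redeploy alertmanager
ceph orch redeploy node-exporter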

Zitat von "Adiga, Anantha" :

> Hi Eugene,
>
> Thank you for your response, here is the update.
>
> The upgrade to Quincy was done  following the cephadm orch upgrade 
> procedure ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.6
>
> Upgrade completed without errors. After the upgrade, upon creating 
> the Grafana service from the Ceph dashboard, it deployed Grafana 6.7.4.
> The version is hardcoded in the code; should it not be 8.3.5 as listed 
> below in the Quincy documentation? See below
>
> [Grafana service started from Cephdashboard]
>
> Quincy documentation states: 
> https://docs.ceph.com/en/latest/releases/quincy/
> ……documentation snippet
> Monitoring and alerting:
> 43 new 

[ceph-users] Re: Error while adding host : Error EINVAL: Traceback (most recent call last): File /usr/share/ceph/mgr/mgr_module.py, line 1756, in _handle_command

2023-06-20 Thread Adiga, Anantha
Hi Adam,

Thank you for the details. I see that the cephadm on the Ceph cluster is 
different from the host that is being added. I will go thru the ticket and the 
logs. Also, the cluster is on Ubuntu Focal and the new host is on Ubuntu Jammy.
The utility:
cephadm   16.2.13-1focal   
amd64cephadm utility to bootstrap ceph daemons with systemd and 
containers
cephadm   17.2.5-0ubuntu0.22.04.3amd64  
  cephadm utility to bootstrap ceph daemons with systemd and containers

Thanks again,
Anantha
From: Adam King 
Sent: Tuesday, June 20, 2023 4:25 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Error while adding host : Error EINVAL: Traceback 
(most recent call last): File /usr/share/ceph/mgr/mgr_module.py, line 1756, in 
_handle_command

There was a cephadm bug that wasn't fixed by the time 17.2.6 came out (I'm 
assuming that's the version being used here, although it may have been present 
in some slightly earlier quincy versions) that caused this misleading error to 
be printed out when adding a host failed. There's a tracker for it here 
https://tracker.ceph.com/issues/59081 that has roughly the same traceback. The 
real issue is likely a connectivity or permission issue from the active mgr 
trying to ssh to the host. In the case I saw from the tracker, it was caused by 
the ssh pub key not being set up on the host. If you check the cephadm cluster 
logs ("ceph log last 50 debug cephadm") after trying to add the host I'm 
guessing you'll see some error like the second set of output in the tracker 
that will hopefully give some more info on why adding the host failed.
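
If it does turn out to be the pub key, something along these lines usually gets
it in place (the user is whatever cephadm is configured to ssh as; root is just
the common default):

ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub root@fl31ca104ja0302
ceph orch host add fl31ca104ja0302 10.45.219.5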

On Tue, Jun 20, 2023 at 6:38 PM Adiga, Anantha 
mailto:anantha.ad...@intel.com>> wrote:
Hi,

I am seeing this error after an offline host was deleted and while adding the
host again. Thereafter, I removed the /var/lib/ceph folder and removed the Ceph
Quincy image on the offline host. What is the cause of this issue, and what is
the solution?

root@fl31ca104ja0201:/home/general# cephadm shell
Inferring fsid d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
Using recent ceph image 
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
root@fl31ca104ja0201:/#

root@fl31ca104ja0201:/# ceph orch host rm fl31ca104ja0302 --offline --force

Removed offline host 'fl31ca104ja0302'

root@fl31ca104ja0201:/# ceph -s

  cluster:

id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e

health: HEALTH_OK



  services:

mon: 3 daemons, quorum fl31ca104ja0201,fl31ca104ja0202,fl31ca104ja0203 (age 
28h)

mgr: fl31ca104ja0203(active, since 6d), standbys: fl31ca104ja0202, 
fl31ca104ja0201

mds: 1/1 daemons up, 2 standby

osd: 33 osds: 33 up (since 28h), 33 in (since 28h)

rgw: 3 daemons active (3 hosts, 1 zones)



  data:

volumes: 1/1 healthy

pools:   24 pools, 737 pgs

objects: 613.56k objects, 1.9 TiB

usage:   2.9 TiB used, 228 TiB / 231 TiB avail

pgs: 737 active+clean



  io:

client:   161 MiB/s rd, 75 op/s rd, 0 op/s wr


root@fl31ca104ja0201:/# ceph orch host add fl31ca104ja0302 10.45.219.5
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1756, in _handle_command
return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in 
handle_command
return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in 
wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # 
noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 356, in _add_host
return self._apply_misc([s], False, Format.plain)
 File "/usr/share/ceph/mgr/orchestrator/module.py", line 1092, in _apply_misc
raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in 
raise_if_exception
e = pickle.loads(c.serialized_exception)
TypeError: __init__() missing 2 required positional arguments: 'hostname' and 
'addr'

Thank you,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io<mailto:ceph-users@ceph.io>
To unsubscribe send an email to 
ceph-users-le...@ceph.io<mailto:ceph-users-le...@ceph.io>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Error while adding host : Error EINVAL: Traceback (most recent call last): File /usr/share/ceph/mgr/mgr_module.py, line 1756, in _handle_command

2023-06-20 Thread Adiga, Anantha
Hi,

I am seeing this error after an offline host was deleted and while adding the
host again. Thereafter, I removed the /var/lib/ceph folder and removed the Ceph
Quincy image on the offline host. What is the cause of this issue, and what is
the solution?

root@fl31ca104ja0201:/home/general# cephadm shell
Inferring fsid d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
Using recent ceph image 
quay.io/ceph/ceph@sha256:af79fedafc42237b7612fe2d18a9c64ca62a0b38ab362e614ad671efa4a0547e
root@fl31ca104ja0201:/#

root@fl31ca104ja0201:/# ceph orch host rm fl31ca104ja0302 --offline --force

Removed offline host 'fl31ca104ja0302'

root@fl31ca104ja0201:/# ceph -s

  cluster:

id: d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e

health: HEALTH_OK



  services:

mon: 3 daemons, quorum fl31ca104ja0201,fl31ca104ja0202,fl31ca104ja0203 (age 
28h)

mgr: fl31ca104ja0203(active, since 6d), standbys: fl31ca104ja0202, 
fl31ca104ja0201

mds: 1/1 daemons up, 2 standby

osd: 33 osds: 33 up (since 28h), 33 in (since 28h)

rgw: 3 daemons active (3 hosts, 1 zones)



  data:

volumes: 1/1 healthy

pools:   24 pools, 737 pgs

objects: 613.56k objects, 1.9 TiB

usage:   2.9 TiB used, 228 TiB / 231 TiB avail

pgs: 737 active+clean



  io:

client:   161 MiB/s rd, 75 op/s rd, 0 op/s wr


root@fl31ca104ja0201:/# ceph orch host add fl31ca104ja0302 10.45.219.5
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1756, in _handle_command
return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in 
handle_command
return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in 
wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # 
noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 356, in _add_host
return self._apply_misc([s], False, Format.plain)
 File "/usr/share/ceph/mgr/orchestrator/module.py", line 1092, in _apply_misc
raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in 
raise_if_exception
e = pickle.loads(c.serialized_exception)
TypeError: __init__() missing 2 required positional arguments: 'hostname' and 
'addr'

Thank you,
Anantha
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Grafana service fails to start due to bad directory name after Quincy upgrade

2023-06-17 Thread Adiga, Anantha
Hi Eugene,

Thank you for your response, here is the update.

The upgrade to Quincy was done  following the cephadm orch upgrade procedure
ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.6

Upgrade completed without errors. After the upgrade, upon creating the Grafana
service from the Ceph dashboard, it deployed Grafana 6.7.4. The version is
hardcoded in the code; should it not be 8.3.5 as listed below in the Quincy
documentation? See below

[Grafana service started from Cephdashboard]

Quincy documentation states: https://docs.ceph.com/en/latest/releases/quincy/
……documentation snippet
Monitoring and alerting:
43 new alerts have been added (totalling 68) improving observability of events 
affecting: cluster health, monitors, storage devices, PGs and CephFS.
Alerts can now be sent externally as SNMP traps via the new SNMP gateway 
service (the MIB is provided).
Improved integrated full/nearfull event notifications.
Grafana Dashboards now use grafonnet format (though they’re still available in 
JSON format).
Stack update: images for monitoring containers have been updated. Grafana 
8.3.5, Prometheus 2.33.4, Alertmanager 0.23.0 and Node Exporter 1.3.1. This 
reduced exposure to several Grafana vulnerabilities (CVE-2021-43798, 
CVE-2021-39226, CVE-2021-43798, CVE-2020-29510, CVE-2020-29511).
……….

I notice that the versions of the remaining stack, that Ceph dashboard deploys, 
 are also older than what is documented.  Prometheus 2.7.2, Alertmanager 0.16.2 
and Node Exporter 0.17.0.

AND 6.7.4 Grafana service reports a few warnings: highlighted below

root@fl31ca104ja0201:/home/general# systemctl status 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana.fl31ca104ja0201.service
● ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana.fl31ca104ja0201.service - 
Ceph grafana.fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
 Loaded: loaded 
(/etc/systemd/system/ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@.service; 
enabled; vendor preset: enabled)
 Active: active (running) since Tue 2023-06-13 03:37:58 UTC; 11h ago
   Main PID: 391896 (bash)
  Tasks: 53 (limit: 618607)
 Memory: 17.9M
 CGroup: 
/system.slice/system-ceph\x2dd0a3b6e0\x2dd2c3\x2d11ed\x2dbe05\x2da7a3a1d7a87e.slice/ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana.fl31ca104j>
 ├─391896 /bin/bash 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana.fl31ca104ja0201/unit.run
 └─391969 /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM 
--net=host --init --name ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-grafana-fl>
-- Logs begin at Sun 2023-06-11 20:41:51 UTC, end at Tue 2023-06-13 15:35:12 
UTC. --
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="alter user_auth.auth_id 
to length 190"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="Add OAuth access token 
to user_auth"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="Add OAuth refresh token 
to user_auth"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="Add OAuth token type to 
user_auth"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="Add OAuth expiry to 
user_auth"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="Add index to user_id 
column in user_auth"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="create server_lock table"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="add index 
server_lock.operation_uid"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="create user auth token 
table"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="add unique index 
user_auth_token.auth_token"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="add unique index 
user_auth_token.prev_auth_token"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="create cache_data table"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Executing migration" logger=migrator id="add unique index 
cache_data.cache_key"
Jun 13 03:37:59 fl31ca104ja0201 bash[391969]: t=2023-06-13T03:37:59+ 
lvl=info msg="Created default organization" logger=sqlstore
Jun 13 03:37:59 fl31ca104ja0201 

[ceph-users] Re: Grafana service fails to start due to bad directory name after Quincy upgrade

2023-05-18 Thread Adiga, Anantha
Hi Ben,

After chown to 472, “systemctl daemon-reload” changes it back to 167.

I also notice that these are still from docker.io while the rest are from quay:
/home/general# docker  ps  --no-trunc | grep docker
93b8c3aa33580fb6f4951849a6ff9c2e66270eb913b8579aca58371ef41f2d6c   
docker.io/grafana/grafana:6.7.4 
"/run.sh"   


  10 days agoUp 10 days  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-grafana-fl31ca104ja0201
df6b7368a54d0af7d2cdd45c0c9bad0999d58c144cb99927a3f76683652b00f2   
docker.io/prom/alertmanager:v0.16.2 
"/bin/alertmanager --cluster.listen-address=:9094 
--web.listen-address=:9093 
--cluster.peer=fl31ca104ja0201.deacluster.intel.com:9094 
--config.file=/etc/alertmanager/alertmanager.yml"   
10 days agoUp 10 days  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-alertmanager-fl31ca104ja0201
aa2055733fe8d426312af5572c94558e89e7cf350e7baba2c22eb6a0e20682fc   
docker.io/prom/prometheus:v2.7.2
"/bin/prometheus --config.file=/etc/prometheus/prometheus.yml 
--storage.tsdb.path=/prometheus --web.listen-address=:9095 
--storage.tsdb.retention.time=15d --storage.tsdb.retention.size=0 
--web.external-url=http://fl31ca104ja0201:9095;10 days ago  
  Up 10 days  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-prometheus-fl31ca104ja0201
a9526f50dfacad47af298c0c1b2cf6cfd74b796b6df1945325529c79658d7356   
docker.io/prom/node-exporter:v0.17.0
"/bin/node_exporter --no-collector.timex --web.listen-address=:9100 
--path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/rootfs"  

  10 days agoUp 10 days  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-node-exporter-fl31ca104ja0201
440926ce479bdd114f43e3228cc8cbfe48b4e1a6c2c7fab58c4cd103bc0f3a0e   
docker.io/arcts/keepalived  
"./init.sh" 


  3 weeks agoUp 3 weeks  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-keepalived-rgw-default-default-fl31ca104ja0201-yiasjs
2813ca859a7ba0de7fcb6be74a00b9b11a23e79636c5f35fb2b6b4be31a29f89   
docker.io/library/haproxy:2.3   
"docker-entrypoint.sh haproxy -f /var/lib/haproxy/haproxy.cfg"  


  3 weeks agoUp 3 weeks  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-haproxy-rgw-default-default-fl31ca104ja0201-yvwsmz
d68e2f68c45f2ea9a10267c8d964c2aaf026b4291918f4f3fb306da20a532db9   
docker.io/arcts/keepalived  
"./init.sh" 


  3 weeks agoUp 3 weeks  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-keepalived-nfs-nfs-1-fl31ca104ja0201-dsynjg
40f3c0b7455f5540fdb4f428bef4e9032b0ff0f50d302352551abb208eff1f28   
docker.io/library/haproxy:2.3   
"docker-entrypoint.sh haproxy -f /var/lib/haproxy/haproxy.cfg"  


  3 weeks agoUp 3 weeks  
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e-haproxy-nfs-nfs-1-fl31ca104ja0201-zdbzvv


From: Ben 
Sent: Wednesday, May 17, 2023 6:32 PM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Grafana service fails to start due to bad directory 
name after Quincy upgrade


use this to get relevant long lines in log:

journalctl -u ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201 
| less -S

it is '--user 472'  by content of unit.run, not the default ceph user

[ceph-users] Re: Grafana service fails to start due to bad directory name after Quincy upgrade

2023-05-17 Thread Adiga, Anantha
Ben,

Thanks for the suggestion.
Changed the user and group to 167 for all files in the data and etc folders in
the grafana service folder that were not 167. Did a systemctl daemon-reload and
restarted the grafana service,

but still seeing the same error

-- Logs begin at Mon 2023-05-15 19:39:34 UTC, end at Wed 2023-05-17 17:08:02 
UTC. --
May 17 17:07:44 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: Main 
process exited, code=exite>
May 17 17:07:44 fl31ca104ja0201 bash[148899]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.poststop:
 No >
May 17 17:07:44 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with result 'exit-code'.
May 17 17:07:54 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Scheduled restart job, restart >
May 17 17:07:54 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 17 17:07:54 fl31ca104ja0201 systemd[1]: Started Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 17 17:07:54 fl31ca104ja0201 bash[149116]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.run:
 No such >
May 17 17:07:54 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: Main 
process exited, code=exite>
May 17 17:07:54 fl31ca104ja0201 bash[149118]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.poststop:
 No >
May 17 17:07:54 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with result 'exit-code'.
ESCOC
2 UTC. --
a3a1d7a87e@grafana-fl31ca104ja0201.service: Main process exited, code=exited, status=127/n/a
b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.poststop: No such file or directory
a3a1d7a87e@grafana-fl31ca104ja0201.service: Failed with result 'exit-code'.
a3a1d7a87e@grafana-fl31ca104ja0201.service: Scheduled restart job, restart counter is at 3.
a0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
a0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.run: No such file or directory
a3a1d7a87e@grafana-fl31ca104ja0201.service: Main process exited, code=exited, status=127/n/a
b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.poststop: No such file or directory
a3a1d7a87e@grafana-fl31ca104ja0201.service: Failed with result 'exit-code'.
~

Thank you,
Anantha

From: Ben 
Sent: Wednesday, May 17, 2023 2:29 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Grafana service fails to start due to bad directory 
name after Quincy upgrade

you could check owner of /var/lib/ceph on host with grafana container running. 
If its owner is root, change to 167:167 recursively.
Then systemctl daemon-reload and restart the service. Good luck.
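
Roughly, with the fsid and unit name taken from the logs earlier in the thread
(use whatever systemctl lists on your host):

chown -R 167:167 /var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e
systemctl daemon-reload
systemctl restart ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service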

Ben

Adiga, Anantha mailto:anantha.ad...@intel.com>> 
于2023年5月17日周三 03:57写道:
Hi

Upgraded from Pacific 16.2.5 to 17.2.6 on May 8th

However, Grafana fails to start due to bad folder path
:/tmp# journalctl -u 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201 -n 25
-- Logs begin at Sun 2023-05-14 20:05:52 UTC, end at Tue 2023-05-16 19:07:51 
UTC. --
May 16 19:05:00 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:00 fl31ca104ja0201 systemd[1]: Started Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:00 fl31ca104ja0201 bash[2575021]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:00 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: Main 
process ex>
May 16 19:05:00 fl31ca104ja0201 bash[2575030]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:00 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with res>
May 16 19:05:10 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Scheduled resta>
May 16 19:05:10 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:10 fl31ca104ja0201 systemd[1]: Started Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:10 fl31ca104ja0201 bash[257

[ceph-users] Grafana service fails to start due to bad directory name after Quincy upgrade

2023-05-16 Thread Adiga, Anantha
Hi

Upgraded from Pacific 16.2.5 to 17.2.6 on May 8th

However, Grafana fails to start due to bad folder path
:/tmp# journalctl -u 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201 -n 25
-- Logs begin at Sun 2023-05-14 20:05:52 UTC, end at Tue 2023-05-16 19:07:51 
UTC. --
May 16 19:05:00 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:00 fl31ca104ja0201 systemd[1]: Started Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:00 fl31ca104ja0201 bash[2575021]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:00 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: Main 
process ex>
May 16 19:05:00 fl31ca104ja0201 bash[2575030]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:00 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with res>
May 16 19:05:10 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Scheduled resta>
May 16 19:05:10 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:10 fl31ca104ja0201 systemd[1]: Started Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:10 fl31ca104ja0201 bash[2575273]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:10 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: Main 
process ex>
May 16 19:05:10 fl31ca104ja0201 bash[2575282]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:10 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with res>
May 16 19:05:20 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Scheduled resta>
May 16 19:05:20 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:20 fl31ca104ja0201 systemd[1]: Started Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:20 fl31ca104ja0201 bash[2575369]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:20 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: Main 
process ex>
May 16 19:05:20 fl31ca104ja0201 bash[2575370]: /bin/bash: 
/var/lib/ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/u>
May 16 19:05:20 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with res>
May 16 19:05:30 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Scheduled resta>
May 16 19:05:30 fl31ca104ja0201 systemd[1]: Stopped Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
May 16 19:05:30 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Start request r>
May 16 19:05:30 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service: 
Failed with res>
May 16 19:05:30 fl31ca104ja0201 systemd[1]: Failed to start Ceph 
grafana-fl31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
ESCOC
19:07:51 UTC. --
31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.run: No 
such file or directory
-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service:
 Main process exited, code=exited, status=127/n/a
ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.poststop:
 No such file or directory
-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service:
 Failed with result 'exit-code'.
-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service:
 Scheduled restart job, restart counter is at 3.
31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
31ca104ja0201 for d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e.
ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.run: No 
such file or directory
-be05-a7a3a1d7a87e@grafana-fl31ca104ja0201.service:
 Main process exited, code=exited, status=127/n/a
ceph/d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e/grafana-fl31ca104ja0201/unit.poststop:
 No such file or directory

[ceph-users] Re: rgw service fails to start with zone not found

2023-05-08 Thread Adiga, Anantha
Hi Eugene,

I had removed the zone before removing it from the zonegroup. I will check the 
objects and remove the appropriate ones. Thank you. 
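
For anyone hitting the same thing: if I remember the .rgw.root layout right,
the stale zone is left behind as a zone_names.<zone-name> entry plus a
zone_info.<zone-uuid> entry, so the cleanup would be along these lines (the
uuid is only a placeholder; verify against the ls output before removing
anything):

rados -p .rgw.root ls --all
rados -p .rgw.root rm zone_names.fl2site2
rados -p .rgw.root rm zone_info.<old-zone-uuid>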

As outlined in the thread, after setting the config for the rgw service, they
started OK.

Thank you,
Anantha


-Original Message-
From: Eugen Block  
Sent: Monday, May 8, 2023 10:55 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: rgw service fails to start with zone not found

Hi,
how exactly did you remove the configuration?
Check out the .rgw.root pool, there are different namespaces where the 
corresponding objects are stored.

rados -p .rgw.root ls --all

You should be able to remove those objects from the pool, but be careful to not 
delete anything you actually need.

Zitat von "Adiga, Anantha" :

> Hi,
>
> An existing multisite configuration was removed.  But the radosgw 
> services still see the old zone name and fail to start.
>
> journalctl -u
> ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca10
> 4ja0201.ninovs
> ...
> May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.897+ 7f2f47634740  0 deferred set uid:gid to
> 167:167 (ceph:ceph)
> May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.897+ 7f2f47634740  0 ceph version 17.2.6 
> (d7ff0d10654d2280e08f1ab989> May 08 16:10:48 fl31ca104ja0201 
> bash[3964341]: debug
> 2023-05-08T16:10:48.897+ 7f2f47634740  0 framework: beast May 08 
> 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.897+ 7f2f47634740  0 framework conf key:  
> port, val: 8080
> May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.897+ 7f2f47634740  1 radosgw_Main not setting 
> numa affinity May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.901+ 7f2f47634740  1 rgw_d3n:  
> rgw_d3n_l1_local_datacache_enabled=0
> May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.901+ 7f2f47634740  1 D3N datacache enabled: 0 
> May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug
> 2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: could 
> not find zone (fl2site2) May 08 16:10:48 fl31ca104ja0201 
> bash[3964341]: debug
> 2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: failed 
> to start notify service> May 08 16:10:48 fl31ca104ja0201 
> bash[3964341]: debug
> 2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: failed 
> to init services (ret=(> May 08 16:10:48 fl31ca104ja0201 
> bash[3964341]: debug
> 2023-05-08T16:10:48.921+ 7f2f47634740 -1 Couldn't init storage 
> provider (RADOS) May 08 16:10:49 fl31ca104ja0201 systemd[1]:
> ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca10
> 4ja0201.ninovs.service: Main
> pr>
> May 08 16:10:49 fl31ca104ja0201 systemd[1]:  
> ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs.service:
>   
> Failed
>
> Here is the current configuration
>
> root@fl31ca104ja0201:/# radosgw-admin period get {
> "id": "729f7cef-6340-4750-b3ae-9164177c0df3",
> "epoch": 1,
> "predecessor_uuid": "1d124ad5-57b0-41de-8def-823bd40f72aa",
> "sync_status": [],
> "period_map": {
> "id": "729f7cef-6340-4750-b3ae-9164177c0df3",
> "zonegroups": [
> {
> "id": "21b8306c-be43-4567-a0c5-74ab69937535",
> "name": "default",
> "api_name": "default",
> "is_master": "true",
> "endpoints": [],
> "hostnames": [],
> "hostnames_s3website": [],
> "master_zone": "ff468492-8a14-414b-a8cf-7c20ab699af3",
> "zones": [
> {
> "id": "ff468492-8a14-414b-a8cf-7c20ab699af3",
> "name": "default",
> "endpoints": [],
> "log_meta": "false",
> "log_data": "false",
> "bucket_index_max_shards": 11,
> "read_only": "false",
> "tier_type": "",
> "sync_from_all": "true",
> "sync_from": [],
> "redirect_zone": ""
> }
> 

[ceph-users] Re: rgw service fails to start with zone not found

2023-05-08 Thread Adiga, Anantha
Thank you so much.
Here it is. I set them to the current value and  now the  rgw services are up.

Should the configuration variables get set automatically for the gateway
services as part of multisite configuration updates, or should it be a manual
procedure?

ceph config dump | grep client.rgw.default.default
client.rgw.default.default   advanced  rgw_realm
  global
 *
client.rgw.default.default   advanced  rgw_zone 
  fl2site2  
 *
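
What I ran was essentially the following, pointing the overrides at the zone
and realm that actually exist now (reproduced from memory; the realm name is a
placeholder for the one in your period output):

ceph config set client.rgw.default.default rgw_zone default
ceph config set client.rgw.default.default rgw_realm <current-realm-name>
ceph orch restart rgw.default.default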

Thank you,
Anantha

From: Danny Webb 
Sent: Monday, May 8, 2023 10:54 AM
To: Adiga, Anantha ; ceph-users@ceph.io
Subject: Re: rgw service fails to start with zone not found

are the old multisite conf values still in ceph.conf (eg, rgw_zonegroup, 
rgw_zone, rgw_realm)?

From: Adiga, Anantha mailto:anantha.ad...@intel.com>>
Sent: 08 May 2023 18:27
To: ceph-users@ceph.io<mailto:ceph-users@ceph.io> 
mailto:ceph-users@ceph.io>>
Subject: [ceph-users] rgw service fails to start with zone not found

CAUTION: This email originates from outside THG

Hi,

An existing multisite configuration was removed.  But the radosgw services 
still see the old zone name and fail to start.

journalctl -u 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs
...
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 deferred set uid:gid to 167:167 
(ceph:ceph)
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989>
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 framework: beast
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 framework conf key: port, val: 8080
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  1 radosgw_Main not setting numa 
affinity
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.901+ 7f2f47634740  1 rgw_d3n: 
rgw_d3n_l1_local_datacache_enabled=0
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.901+ 7f2f47634740  1 D3N datacache enabled: 0
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: could not find 
zone (fl2site2)
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: failed to start 
notify service>
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: failed to init 
services (ret=(>
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.921+ 7f2f47634740 -1 Couldn't init storage provider 
(RADOS)
May 08 16:10:49 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs.service:
 Main pr>
May 08 16:10:49 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs.service:
 Failed

Here is the current configuration

root@fl31ca104ja0201:/# radosgw-admin period get
{
"id": "729f7cef-6340-4750-b3ae-9164177c0df3",
"epoch": 1,
"predecessor_uuid": "1d124ad5-57b0-41de-8def-823bd40f72aa",
"sync_status": [],
"period_map": {
"id": "729f7cef-6340-4750-b3ae-9164177c0df3",
"zonegroups": [
{
"id": "21b8306c-be43-4567-a0c5-74ab69937535",
"name": "default",
"api_name": "default",
"is_master": "true",
"endpoints": [],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "ff468492-8a14-414b-a8cf-7c20ab699af3",
"zones": [
{
"id": "ff468492-8a14-414b-a8cf-7c20ab699af3",
"name": "default",
"endpoints": [],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": ""

[ceph-users] rgw service fails to start with zone not found

2023-05-08 Thread Adiga, Anantha
Hi,

An existing multisite configuration was removed.  But the radosgw services 
still see the old zone name and fail to start.

journalctl -u 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs
...
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 deferred set uid:gid to 167:167 
(ceph:ceph)
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 ceph version 17.2.6 
(d7ff0d10654d2280e08f1ab989>
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 framework: beast
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  0 framework conf key: port, val: 8080
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.897+ 7f2f47634740  1 radosgw_Main not setting numa 
affinity
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.901+ 7f2f47634740  1 rgw_d3n: 
rgw_d3n_l1_local_datacache_enabled=0
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.901+ 7f2f47634740  1 D3N datacache enabled: 0
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: could not find 
zone (fl2site2)
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: failed to start 
notify service>
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.917+ 7f2f47634740  0 rgw main: ERROR: failed to init 
services (ret=(>
May 08 16:10:48 fl31ca104ja0201 bash[3964341]: debug 
2023-05-08T16:10:48.921+ 7f2f47634740 -1 Couldn't init storage provider 
(RADOS)
May 08 16:10:49 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs.service:
 Main pr>
May 08 16:10:49 fl31ca104ja0201 systemd[1]: 
ceph-d0a3b6e0-d2c3-11ed-be05-a7a3a1d7a87e@rgw.default.default.fl31ca104ja0201.ninovs.service:
 Failed

Here is the current configuration

root@fl31ca104ja0201:/# radosgw-admin period get
{
"id": "729f7cef-6340-4750-b3ae-9164177c0df3",
"epoch": 1,
"predecessor_uuid": "1d124ad5-57b0-41de-8def-823bd40f72aa",
"sync_status": [],
"period_map": {
"id": "729f7cef-6340-4750-b3ae-9164177c0df3",
"zonegroups": [
{
"id": "21b8306c-be43-4567-a0c5-74ab69937535",
"name": "default",
"api_name": "default",
"is_master": "true",
"endpoints": [],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "ff468492-8a14-414b-a8cf-7c20ab699af3",
"zones": [
{
"id": "ff468492-8a14-414b-a8cf-7c20ab699af3",
"name": "default",
"endpoints": [],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "8999e04c-17a4-4150-8845-cecd6672d312",
"sync_policy": {
"groups": []
}
}
],
"short_zone_ids": [
{
"key": "ff468492-8a14-414b-a8cf-7c20ab699af3",
"val": 3883136521
}
]
},
"master_zonegroup": "21b8306c-be43-4567-a0c5-74ab69937535",
"master_zone": "ff468492-8a14-414b-a8cf-7c20ab699af3",
"period_config": {
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_ratelimit": {
"max_read_ops": 0,
"max_write_ops": 0,
"max_read_bytes": 0,
"max_write_bytes": 0,
"enabled": false
},
"bucket_ratelimit": {
"max_read_ops": 0,
"max_write_ops": 0,
"max_read_bytes": 0,
"max_write_bytes": 0,

[ceph-users] Re: ceph orch ps mon, mgr, osd shows for version, image and container id

2023-03-31 Thread Adiga, Anantha
Thank you so much Adam. I will check into the older release being used and 
update the ticket.

Anantha


From: Adam King 
Sent: Friday, March 31, 2023 5:46 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] ceph orch ps mon, mgr, osd shows  for 
version, image and container id

I can see the json output for the osd you posted doesn't list any version

{
"style": "cephadm:v1",
"name": "osd.61",
"fsid": "8dbfcd81-fee3-49d2-ac0c-e988c8be7178",
"systemd_unit": 
ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@osd.61<mailto:ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@osd.61>,
"enabled": true,
"state": "running",
"memory_request": null,
"memory_limit": null,
"ports": null,
"container_id": 
"bb7d491335323689dfc6dcb8ae1b6c022f93b3721d69d46c6ed6036bbdd68255",
"container_image_name": 
"docker.io/ceph/daemon:latest-pacific<http://docker.io/ceph/daemon:latest-pacific>",
"container_image_id": 
"6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"container_image_digests": [

docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>
],
{

The way the version is gathered for OSDs (and ceph daemons in general) by 
cephadm is to exec into the container and run "ceph -v"). I'm not sure why that 
wouldn't be working for the OSD here, but is for the mon. The one other thing I 
noted is the use of the 
docker.io/ceph/daemon:latest-pacific<http://docker.io/ceph/daemon:latest-pacific>
 image. We haven't been putting ceph images on docker for quite some time, so 
that's actually a fairly old pacific version (over 2 years old when I checked). 
There are much more recent pacific images on quay. Any reason for using that 
particular image? It's hard to remember if there was maybe some bug with such 
an old version or something that could cause this.
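
If you want to check that step by hand, you can mimic what cephadm does; take
the exact container name from docker ps on the host since the naming can vary
a bit between versions:

docker ps --format '{{.Names}}' | grep 'osd.61'
docker exec <container-name-from-above> ceph -v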

On Thu, Mar 30, 2023 at 1:40 PM Adiga, Anantha 
mailto:anantha.ad...@intel.com>> wrote:
Hi Adam,


Cephadm ls lists all details:

NAME   HOST PORTS   
 STATUS REFRESHED  AGE  MEM USE  MEM LIM  VERSION IMAGE ID  
CONTAINER ID
osd.61 zp3110b001a0101  
 running   3m ago   8M-22.0G  

mon.zp3110b001a0101zp3110b001a0101  
 running   3m ago   8M-2048M  

{
"style": "cephadm:v1",
"name": "osd.61",
"fsid": "8dbfcd81-fee3-49d2-ac0c-e988c8be7178",
"systemd_unit": 
ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@osd.61<mailto:ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@osd.61>,
"enabled": true,
"state": "running",
"memory_request": null,
"memory_limit": null,
"ports": null,
"container_id": 
"bb7d491335323689dfc6dcb8ae1b6c022f93b3721d69d46c6ed6036bbdd68255",
"container_image_name": 
"docker.io/ceph/daemon:latest-pacific<http://docker.io/ceph/daemon:latest-pacific>",
"container_image_id": 
"6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"container_image_digests": [

docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586<mailto:docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586>
],
{
"style": "cephadm:v1",
"name": "mon.zp3110b001a0101",
"fsid": "8dbfcd81-fee3-49d2-ac0c-e988c8be7178",
"systemd_unit": 
ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@mon.zp3110b001a0101<mailto:ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@mon.zp3110b001a0101>,
"enabled": true,
"state": "running",
"memory_request": null,
"memory_limit": null,
"ports": null,
"container_id": 
"32ba68d042c3dd7e7cf81a12b6b753cf12dfd8ed1faa8ffc0ecf9f55f4f26fe4",
"container_image_name": 
"docker.io/ceph/daemon:latest-pacific<http://docker.io/ceph/daemon:latest-pacific>",
"container_image_id": 
"6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",

[ceph-users] Re: ceph orch ps mon, mgr, osd shows for version, image and container id

2023-03-30 Thread Adiga, Anantha
Hi Adam,


Cephadm ls lists all details:

NAME   HOST PORTS   
 STATUS REFRESHED  AGE  MEM USE  MEM LIM  VERSION IMAGE ID  
CONTAINER ID
osd.61 zp3110b001a0101  
 running   3m ago   8M-22.0G  

mon.zp3110b001a0101zp3110b001a0101  
 running   3m ago   8M-2048M  

{
"style": "cephadm:v1",
"name": "osd.61",
"fsid": "8dbfcd81-fee3-49d2-ac0c-e988c8be7178",
"systemd_unit": "ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@osd.61",
"enabled": true,
"state": "running",
"memory_request": null,
"memory_limit": null,
"ports": null,
"container_id": "bb7d491335323689dfc6dcb8ae1b6c022f93b3721d69d46c6ed6036bbdd68255",
"container_image_name": "docker.io/ceph/daemon:latest-pacific",
"container_image_id": "6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"container_image_digests": [
"docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
],
{
"style": "cephadm:v1",
"name": "mon.zp3110b001a0101",
"fsid": "8dbfcd81-fee3-49d2-ac0c-e988c8be7178",
"systemd_unit": "ceph-8dbfcd81-fee3-49d2-ac0c-e988c8be7178@mon.zp3110b001a0101",
"enabled": true,
"state": "running",
"memory_request": null,
"memory_limit": null,
"ports": null,
"container_id": "32ba68d042c3dd7e7cf81a12b6b753cf12dfd8ed1faa8ffc0ecf9f55f4f26fe4",
"container_image_name": "docker.io/ceph/daemon:latest-pacific",
"container_image_id": "6e73176320aaccf3b3fb660b9945d0514222bd7a83e28b96e8440c630ba6891f",
"container_image_digests": [
"docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586"
],
"memory_usage": 1104880336,
"version": "16.2.5",
"started": "2023-03-29T22:41:52.754971Z",
"created": "2022-07-13T16:31:48.766907Z",
"deployed": "2022-07-13T16:30:48.528809Z",
"configured": "2022-07-13T16:31:48.766907Z"
},


The unknown values appear only for the osd, mon and mgr services, and they do so across all nodes.
Some other columns are also missing for those daemons, such as PORTS, STATUS (time) and MEM USE (a refresh sketch follows the listing below):
NAME                                        HOST             PORTS   STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION  IMAGE ID      CONTAINER ID
rgw.default.default.zp3110b001a0103.ftizjg  zp3110b001a0103  *:8080  running (12h)  5m ago     8M   145M     -        16.2.5   6e73176320aa  bd6c4d4262b3
alertmanager.zp3110b001a0101                zp3110b001a0101          running        3m ago     8M   -        -
mds.cephfs.zp3110b001a0102.sihibe           zp3110b001a0102          stopped        9m ago     4M   -        -
mgr.zp3110b001a0101                         zp3110b001a0101          running        3m ago     8M   -        -
mgr.zp3110b001a0102                         zp3110b001a0102          running        9m ago     8M   -        -
mon.zp3110b001a0101                         zp3110b001a0101          running        3m ago     8M   -        2048M
mon.zp3110b001a0102                         zp3110b001a0102          running        9m ago     8M   -        2048M

Thank you,
Anantha

From: Adam King 
Sent: Thursday, March 30, 2023 8:08 AM
To: Adiga, Anantha 
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] ceph orch ps mon, mgr, osd shows <unknown> for version, image and container id

if you put a copy of the cephadm binary onto one of the

[ceph-users] ceph orch ps shows version, container and image id as unknown

2023-03-27 Thread Adiga, Anantha
Hi,

Has anybody noticed this?

ceph orch ps shows the version, container id and image id as unknown, but only for the mon, mgr and osd daemons. Ceph health is OK and all daemons are running fine.
cephadm ls, on the other hand, does show values for the version, container id and image id.

root@cr21meg16ba0101:~# cephadm shell ceph orch ps
Inferring fsid a6f52598-e5cd-4a08-8422-7b6fdb1d5dbe
Using recent ceph image ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586
NAME                                                   HOST             PORTS        STATUS         REFRESHED  AGE  MEM USE  MEM LIM  VERSION         IMAGE ID      CONTAINER ID
crash.cr21meg16ba0101                                  cr21meg16ba0101               running (4d)   2m ago     5w   7107k    -        16.2.5          6e73176320aa  1001776a7c02
crash.cr21meg16ba0102                                  cr21meg16ba0102               running (5w)   2m ago     5w   63.8M    -        16.2.5          6e73176320aa  ecfd19d15dbb
crash.cr21meg16ba0103                                  cr21meg16ba0103               running (5w)   2m ago     5w   8131k    -        16.2.5          6e73176320aa  c508ad3979b0
grafana.cr21meg16ba0101                                cr21meg16ba0101  *:3000       running (4d)   2m ago     4d   58.0M    -        6.7.4           be4c69a1aae8  ce2741a091c7
grafana.cr21meg16ba0102                                cr21meg16ba0102  *:3000       running (4d)   2m ago     7d   53.0M    -        6.7.4           be4c69a1aae8  c09f53b31999
grafana.cr21meg16ba0103                                cr21meg16ba0103  *:3000       running (4d)   2m ago     7d   54.5M    -        6.7.4           be4c69a1aae8  e58f6d9f44a2
haproxy.nfs.nfs-1.cr21meg16ba0101.cwsweq               cr21meg16ba0101  *:2049,9049  running (4d)   2m ago     5w   66.7M    -        2.3.21-3ce4ee0  7ecd3fda00f4  c5f4d94b5354
haproxy.nfs.nfs-1.cr21meg16ba0102.yodyxa               cr21meg16ba0102  *:2049,9049  running (5w)   2m ago     5w   75.9M    -        2.3.21-3ce4ee0  7ecd3fda00f4  0a6629e27463
haproxy.rgw.default.default.cr21meg16ba0101.ecpnxq     cr21meg16ba0101  *:80,9050    running (4d)   2m ago     5w   102M     -        2.3.21-3ce4ee0  7ecd3fda00f4  3c61d34b8b7d
haproxy.rgw.default.default.cr21meg16ba0102.nffdzb     cr21meg16ba0102  *:80,9050    running (5w)   2m ago     5w   114M     -        2.3.21-3ce4ee0  7ecd3fda00f4  406ee603a311
haproxy.rgw.default.default.cr21meg16ba0103.lvypmb     cr21meg16ba0103  *:80,9050    running (5w)   2m ago     5w   108M     -        2.3.21-3ce4ee0  7ecd3fda00f4  a514c26c0a8e
keepalived.nfs.nfs-1.cr21meg16ba0101.qpvesr            cr21meg16ba0101               running (4d)   2m ago     5w   26.9M    -        2.0.5           073e0c3cd1b9  c4003cb45da6
keepalived.nfs.nfs-1.cr21meg16ba0102.hedpuo            cr21meg16ba0102               running (5w)   2m ago     5w   38.0M    -        2.0.5           073e0c3cd1b9  b654e661493b
keepalived.rgw.default.default.cr21meg16ba0101.biaqvq  cr21meg16ba0101               running (4d)   2m ago     5w   39.2M    -        2.0.5           073e0c3cd1b9  020c7cb700c4
keepalived.rgw.default.default.cr21meg16ba0102.dufodx  cr21meg16ba0102               running (5w)   2m ago     5w   46.0M    -        2.0.5           073e0c3cd1b9  fe218ecaf398
keepalived.rgw.default.default.cr21meg16ba0103.utplxz  cr21meg16ba0103               running (5w)   2m ago     5w   45.1M    -        2.0.5           073e0c3cd1b9  18a99c36ef29
mds.cephfs.cr21meg16ba0101.tmfknc                      cr21meg16ba0101               running (4d)   2m ago     5w   29.5M    -        16.2.5          6e73176320aa  e753a2498ccf
mds.cephfs.cr21meg16ba0102.vdrcvi                      cr21meg16ba0102               running (5w)   2m ago     5w   212M     -        16.2.5          6e73176320aa  925f151da4de
mds.cephfs.cr21meg16ba0103.yacxeu                      cr21meg16ba0103               running (5w)   2m ago     5w   38.2M    -        16.2.5          6e73176320aa  79599f7ca3c8
mgr.cr21meg16ba0101                                    cr21meg16ba0101               running        2m ago     5w   -        -
mgr.cr21meg16ba0102                                    cr21meg16ba0102               running        2m ago     5w   -        -
mgr.cr21meg16ba0103                                    cr21meg16ba0103               running        2m ago     5w   -        -
mon.cr21meg16ba0101                                    cr21meg16ba0101               running        2m ago     5w   -        2048M
mon.cr21meg16ba0102                                    cr21meg16ba0102               running        2m ago     5w   -        2048M
mon.cr21meg16ba0103                                    cr21meg16ba0103               running        2m ago     5w   -        2048M
nfs.nfs-1.0.63.cr21meg16ba0102.kkxpfh                  cr21meg16ba0102  *:12049      running (5w)
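A rough way to list exactly which daemons the orchestrator has no metadata for, and to compare that with what the host agent itself reports (a sketch that assumes jq is installed and that the field names match this Pacific build), is:

# ceph orch ps --format json | jq -r '.[] | select(.version == null or .version == "") | [.daemon_type, .daemon_id, .hostname] | @tsv'
# cephadm ls | jq -r '.[] | [.name, .version, .container_image_name] | @tsv'

If cephadm ls on the host shows the version and image while ceph orch ps does not, the gap is in the mgr's cached inventory rather than on the host itself.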

[ceph-users] Re: Creating a role for quota management

2023-03-07 Thread Adiga, Anantha
Thank you Xiubo, I will try that option. It looks like this was done intentionally, to keep quota management at the client level.

Anantha

-Original Message-
From: Xiubo Li  
Sent: Tuesday, March 7, 2023 12:44 PM
To: Adiga, Anantha ; ceph-users@ceph.io
Subject: Re: [ceph-users] Creating a role for quota management

Hi

Maybe you can use the CEPHFS CLIENT CAPABILITIES and enable only the 'p' permission for those users, which will allow them to SET_VXATTR (quotas are applied through virtual extended attributes).

I didn't find a similar cap among the OSD CAPABILITIES.
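As a rough sketch of what that could look like (the client name, filesystem name and mount path below are placeholders, not anything from a real cluster), a client key created with the 'p' flag can then manage quotas through the ceph.quota.* virtual extended attributes:

# ceph fs authorize cephfs client.quotamgr / rwp
# setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/project1
# getfattr -n ceph.quota.max_bytes /mnt/cephfs/project1

Clients authorized with plain rw on the same path can still read and write but are not allowed to change the quota attributes.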

Thanks

On 07/03/2023 00:33, anantha.ad...@intel.com wrote:
> Hello,
>
> Can you provide details on how to create a role for allowing a set of users 
> to set quotas on CephFS pools?
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Planning: Ceph User Survey 2020

2020-11-27 Thread Adiga, Anantha
Hi Yuval,

Your questions have been added.

Thank you,
Anantha

From: Yuval Lifshitz 
Sent: Wednesday, November 25, 2020 6:30 AM
To: Mike Perez 
Cc: ceph-users ; Adiga, Anantha ; 
Paul Mezzanini ; Anthony D'Atri 
Subject: Re: [ceph-users] Planning: Ceph User Survey 2020

Hi Mike,
Could we add more questions on RGW use cases and functionality adoption?

For instance:

bucket notifications:
* do you use "bucket notifications"?
* if so, which endpoint do you use: kafka, amqp, http?
* which other endpoints would you like to see there?

sync modules:
* do you use the cloud sync module? if so, with which cloud provider?
* do you use an archive zone?
* do you use the elasticsearch module?

multisite:
* do you have more than one realm in your setup? if so, how many?
* do you have more than one zone group in your setup?
* do you have more than one zone in your setup? if so, how many in the largest 
zone group?
* is the syncing policy between zones global or per bucket?
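For whoever ends up answering the multisite and notification questions, the counts can usually be read straight off a gateway node, as a sketch assuming radosgw-admin can reach the cluster keyring:

# radosgw-admin realm list
# radosgw-admin zonegroup list
# radosgw-admin zone list
# radosgw-admin topic list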

On Tue, Nov 24, 2020 at 8:06 PM Mike Perez <mipe...@redhat.com> wrote:
Hi everyone,

The Ceph User Survey 2020 is being planned by our working group. Please
review the draft survey PDF, and let's discuss any changes. You may also
join us in the next meeting *on November 25th at 12pm PT*.

https://tracker.ceph.com/projects/ceph/wiki/User_Survey_Working_Group

https://tracker.ceph.com/attachments/download/5260/Ceph%20User%20Survey%202020.pdf

We're aiming to have something ready by mid-December.

--

Mike Perez

he/him

Ceph / Rook / RDO / Gluster Community Architect

Open-Source Program Office (OSPO)


M: +1-951-572-2633

494C 5D25 2968 D361 65FB 3829 94BC D781 ADA8 8AEA
@Thingee <https://twitter.com/thingee> | Thingee <https://www.linkedin.com/thingee>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io