[ceph-users] Re: Cannot recreate monitor in upgrade from pacific to quincy (leveldb -> rocksdb)

2024-02-02 Thread Mark Schouten

Hi,

Cool, thanks!

As for the global_id_reclaim settings:
root@proxmox01:~# ceph config get mon auth_allow_insecure_global_id_reclaim
false
root@proxmox01:~# ceph config get mon auth_expose_insecure_global_id_reclaim
true
root@proxmox01:~# ceph config get mon mon_warn_on_insecure_global_id_reclaim
true
root@proxmox01:~# ceph config get mon mon_warn_on_insecure_global_id_reclaim_allowed
true


—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / m...@tuxis.nl


-- Original Message --
From "Eugen Block" 
To ceph-users@ceph.io
Date 02/02/2024, 08:30:45
Subject [ceph-users] Re: Cannot recreate monitor in upgrade from pacific 
to quincy (leveldb -> rocksdb)



I might have a reproducer: the second rebuilt mon is not joining the cluster 
either. I'll look into it and let you know if I find anything.

Quoting Eugen Block:


Hi,


Can anyone confirm that ancient (2017) leveldb database mons should just 
accept ‘mon.$hostname’ names for mons, as well as ‘mon.$id’?


at some point you had or have to remove one of the mons to recreate it with a 
rocksdb backend, so the mismatch should not be an issue here. I can confirm 
that from when I tried to reproduce it in a small test cluster with leveldb. 
So now I have two leveldb MONs and one rocksdb MON:

jewel:~ # cat /var/lib/ceph/b08424fa-8530-4080-876d-2821c916d26c/mon.jewel/kv_backend
rocksdb
jewel2:~ # cat /var/lib/ceph/b08424fa-8530-4080-876d-2821c916d26c/mon.jewel2/kv_backend
leveldb
jewel3:~ # cat /var/lib/ceph/b08424fa-8530-4080-876d-2821c916d26c/mon.jewel3/kv_backend
leveldb

And the cluster is healthy, although it took a minute or two for the rebuilt 
MON to sync (in a real cluster with some load etc. it might take longer):

jewel:~ # ceph -s
  cluster:
id: b08424fa-8530-4080-876d-2821c916d26c
health: HEALTH_OK

  services:
mon: 3 daemons, quorum jewel2,jewel3,jewel (age 3m)
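For completeness, a minimal sketch of how a single MON can be rebuilt with a 
rocksdb backend on a classic package-based deployment; the MON name, paths and 
the manual steps below are assumptions for illustration (Proxmox normally wraps 
this in its own tooling), not a record of how the test cluster above was rebuilt:

# remove the old leveldb MON from the monmap and wipe its store
ceph mon remove jewel
rm -rf /var/lib/ceph/mon/ceph-jewel

# fetch the current monmap and the mon. keyring from the surviving quorum
ceph mon getmap -o /tmp/monmap
ceph auth get mon. -o /tmp/mon.keyring

# recreate the store; current ceph-mon versions create a rocksdb backend by default
ceph-mon --mkfs -i jewel --monmap /tmp/monmap --keyring /tmp/mon.keyring
cat /var/lib/ceph/mon/ceph-jewel/kv_backend    # should now read rocksdb

chown -R ceph:ceph /var/lib/ceph/mon/ceph-jewel
systemctl start ceph-mon@jewel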

I'm wondering if this could be related to the insecure_global_id settings. 
Can you send the output of:

ceph config get mon auth_allow_insecure_global_id_reclaim
ceph config get mon auth_expose_insecure_global_id_reclaim
ceph config get mon mon_warn_on_insecure_global_id_reclaim
ceph config get mon mon_warn_on_insecure_global_id_reclaim_allowed



Quoting Mark Schouten:


Hi,

I don’t have a fourth machine available, so that’s not an option unfortunately.

I did enable a lot of debugging earlier, but that shows no information about 
why things are not working as expected.

Proxmox just deploys the mons, nothing fancy there, no special cases.

Can anyone confirm that ancient (2017) leveldb database mons should just 
accept ‘mon.$hostname’ names for mons, as well as ‘mon.$id’?

—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / m...@tuxis.nl


-- Original Message --
From "Eugen Block" 
To ceph-users@ceph.io
Date 31/01/2024, 13:02:04
Subject [ceph-users] Re: Cannot recreate monitor in upgrade from  pacific to 
quincy (leveldb -> rocksdb)


Hi Mark,

as I'm not familiar with proxmox, I'm not sure what happens under the hood. 
There are a couple of things I would try, not necessarily in this order:

- Check the troubleshooting guide [1]; for example, a clock skew could be one 
reason. Have you verified ntp/chronyd functionality?
- Inspect debug log output, maybe first on the probing mon and, if those don't 
reveal the reason, enable debug logs for the other MONs as well:
ceph config set mon.proxmox03 debug_mon 20
ceph config set mon.proxmox03 debug_paxos 20

or for all MONs:
ceph config set mon debug_mon 20
ceph config set mon debug_paxos 20

- Try to deploy an additional MON on a different server (if you have more 
available) and see if that works.
- Does proxmox log anything?
- Maybe as a last resort, try to start a MON manually after adding it to the 
monmap with the monmaptool, but only if you know what you're doing. I wonder 
if the monmap doesn't get updated...

Regards,
Eugen

[1]  https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/

Quoting Mark Schouten:


Hi,

During an upgrade from pacific to quincy, we needed to recreate the mons 
because the mons were pretty old and still using leveldb.

So step one was to destroy one of the mons. After that we recreated the 
monitor, and although it starts, it remains in state ‘probing’, as you can 
see below.

No matter what I tried, it won’t come up. I’ve seen quite a few messages 
suggesting that the MTU might be an issue, but that seems to be ok:
root@proxmox03:/var/log/ceph# fping -b 1472 10.10.10.{1..3} -M
10.10.10.1 is alive
10.10.10.2 is alive
10.10.10.3 is alive


Does anyone have an idea how to fix this? I’ve tried destroying and 
recreating the mon a few times now. Could it be that the leveldb mons only 
support mon.$id notation for the monitors?

root@proxmox03:/var/log/ceph# ceph daemon mon.proxmox03 mon_status
{
  "name": “proxmox03”,
  "rank&q

[ceph-users] Re: Cannot recreate monitor in upgrade from pacific to quincy (leveldb -> rocksdb)

2024-01-31 Thread Mark Schouten

Hi,

I don’t have a fourth machine available, so that’s not an option unfortunately.


I did enable a lot of debugging earlier, but that shows no information about 
why things are not working as expected.


Proxmox just deploys the mons, nothing fancy there, no special cases.

Can anyone confirm that ancient (2017) leveldb database mons should just 
accept ‘mon.$hostname’ names for mons, as well as ‘mon.$id’?


—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / m...@tuxis.nl


-- Original Message --
From "Eugen Block" 
To ceph-users@ceph.io
Date 31/01/2024, 13:02:04
Subject [ceph-users] Re: Cannot recreate monitor in upgrade from pacific 
to quincy (leveldb -> rocksdb)



Hi Mark,

as I'm not familiar with proxmox, I'm not sure what happens under the hood. 
There are a couple of things I would try, not necessarily in this order:

- Check the troubleshooting guide [1]; for example, a clock skew could be one 
reason. Have you verified ntp/chronyd functionality?
- Inspect debug log output, maybe first on the probing mon and, if those don't 
reveal the reason, enable debug logs for the other MONs as well:
ceph config set mon.proxmox03 debug_mon 20
ceph config set mon.proxmox03 debug_paxos 20

or for all MONs:
ceph config set mon debug_mon 20
ceph config set mon debug_paxos 20

- Try to deploy an additional MON on a different server (if you have more 
available) and see if that works.
- Does proxmox log anything?
- Maybe as a last resort, try to start a MON manually after adding it to the 
monmap with the monmaptool (see the sketch below), but only if you know what 
you're doing. I wonder if the monmap doesn't get updated...
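A minimal sketch of that monmaptool route, purely for illustration; the MON 
name, addresses and paths are assumptions based on this thread, and the map 
should only be touched with the monitor stopped and with backups at hand:

# inspect the monmap the probing MON is actually using (MON must be stopped)
ceph-mon -i proxmox03 --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap

# if proxmox03 is missing or listed with wrong addresses, fix and re-inject it
monmaptool --rm proxmox03 /tmp/monmap
monmaptool --addv proxmox03 '[v2:10.10.10.3:3300,v1:10.10.10.3:6789]' /tmp/monmap
ceph-mon -i proxmox03 --inject-monmap /tmp/monmap
systemctl start ceph-mon@proxmox03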

Regards,
Eugen

[1] https://docs.ceph.com/en/latest/rados/troubleshooting/troubleshooting-mon/

Quoting Mark Schouten:


Hi,

During an upgrade from pacific to quincy, we needed to recreate the mons 
because the mons were pretty old and still using leveldb.

So step one was to destroy one of the mons. After that we recreated the 
monitor, and although it starts, it remains in state ‘probing’, as you can see 
below.

No matter what I tried, it won’t come up. I’ve seen quite a few messages 
suggesting that the MTU might be an issue, but that seems to be ok:
root@proxmox03:/var/log/ceph# fping -b 1472 10.10.10.{1..3} -M
10.10.10.1 is alive
10.10.10.2 is alive
10.10.10.3 is alive


Does anyone have an idea how to fix this? I’ve tried destroying and 
recreating the mon a few times now. Could it be that the leveldb mons only 
support mon.$id notation for the monitors?

root@proxmox03:/var/log/ceph# ceph daemon mon.proxmox03 mon_status
{
    "name": "proxmox03",
    "rank": 2,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "features": {
        "required_con": "2449958197560098820",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus",
            "octopus",
            "pacific",
            "elector-pinging"
        ],
        "quorum_con": "0",
        "quorum_mon": []
    },
    "outside_quorum": [
        "proxmox03"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "39b1e85c-7b47-4262-9f0a-47ae91042bac",
        "modified": "2024-01-23T21:02:12.631320Z",
        "created": "2017-03-15T14:54:55.743017Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "tiebreaker_mon": "",
        "removed_ranks: ": "2",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus",
                "octopus",
                "pacific",
                "elector-pinging"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "0",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.1:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.1:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": &q

[ceph-users] Cannot recreate monitor in upgrade from pacific to quincy (leveldb -> rocksdb)

2024-01-31 Thread Mark Schouten

Hi,

During an upgrade from pacific to quincy, we needed to recreate the mons 
because the mons were pretty old and still using leveldb.


So step one was to destroy one of the mons. After that we recreated the 
monitor, and although it starts, it remains in state ‘probing’, as you 
can see below.


No matter what I tried, it won’t come up. I’ve seen quite a few messages 
suggesting that the MTU might be an issue, but that seems to be ok:

root@proxmox03:/var/log/ceph# fping -b 1472 10.10.10.{1..3} -M
10.10.10.1 is alive
10.10.10.2 is alive
10.10.10.3 is alive


Does anyone have an idea how to fix this? I’ve tried destroying and 
recreating the mon a few times now. Could it be that the leveldb mons 
only support mon.$id notation for the monitors?


root@proxmox03:/var/log/ceph# ceph daemon mon.proxmox03 mon_status
{
    "name": "proxmox03",
    "rank": 2,
    "state": "probing",
    "election_epoch": 0,
    "quorum": [],
    "features": {
        "required_con": "2449958197560098820",
        "required_mon": [
            "kraken",
            "luminous",
            "mimic",
            "osdmap-prune",
            "nautilus",
            "octopus",
            "pacific",
            "elector-pinging"
        ],
        "quorum_con": "0",
        "quorum_mon": []
    },
    "outside_quorum": [
        "proxmox03"
    ],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 0,
        "fsid": "39b1e85c-7b47-4262-9f0a-47ae91042bac",
        "modified": "2024-01-23T21:02:12.631320Z",
        "created": "2017-03-15T14:54:55.743017Z",
        "min_mon_release": 16,
        "min_mon_release_name": "pacific",
        "election_strategy": 1,
        "disallowed_leaders: ": "",
        "stretch_mode": false,
        "tiebreaker_mon": "",
        "removed_ranks: ": "2",
        "features": {
            "persistent": [
                "kraken",
                "luminous",
                "mimic",
                "osdmap-prune",
                "nautilus",
                "octopus",
                "pacific",
                "elector-pinging"
            ],
            "optional": []
        },
        "mons": [
            {
                "rank": 0,
                "name": "0",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.1:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.1:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.10.10.1:6789/0",
                "public_addr": "10.10.10.1:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 1,
                "name": "1",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.2:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.2:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.10.10.2:6789/0",
                "public_addr": "10.10.10.2:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            },
            {
                "rank": 2,
                "name": "proxmox03",
                "public_addrs": {
                    "addrvec": [
                        {
                            "type": "v2",
                            "addr": "10.10.10.3:3300",
                            "nonce": 0
                        },
                        {
                            "type": "v1",
                            "addr": "10.10.10.3:6789",
                            "nonce": 0
                        }
                    ]
                },
                "addr": "10.10.10.3:6789/0",
                "public_addr": "10.10.10.3:6789/0",
                "priority": 0,
                "weight": 0,
                "crush_location": "{}"
            }
        ]
    },
    "feature_map": {
        "mon": [
            {
                "features": "0x3f01cfbdfffd",
                "release": "luminous",
                "num": 1
            }
        ]
    },
    "stretch_mode": false
}

—
Mark Schouten
CTO, Tuxis B.V.
+31 318 200208 / m...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

2023-02-28 Thread Mark Schouten

Hi,

I just destroyed the filestore osd and added it as a bluestore osd. 
Worked fine.
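For anyone finding this thread later, a minimal sketch of such a 
destroy-and-recreate; the OSD id and device are placeholders and this is not 
necessarily the exact sequence used here:

# assuming osd.0 on /dev/sdX; wait for recovery before doing the next one
ceph osd out 0
systemctl stop ceph-osd@0
ceph osd destroy 0 --yes-i-really-mean-it        # keeps the OSD id and auth key

# wipe the old FileStore device and create a BlueStore OSD reusing the same id
ceph-volume lvm zap /dev/sdX --destroy
ceph-volume lvm create --bluestore --data /dev/sdX --osd-id 0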


—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl / +31 318 200208


-- Original Message --
From "Jan Pekař - Imatic" 
To m...@tuxis.nl; ceph-users@ceph.io
Date 2/25/2023 4:14:54 PM
Subject Re: [ceph-users] OSD upgrade problem nautilus->octopus - 
snap_mapper upgrade stuck



Hi,

I tried upgrading to Pacific now. Same result: the OSD is not starting, stuck 
at 1500 keys.

JP

On 23/02/2023 00.16, Jan Pekař - Imatic wrote:

Hi,

I enabled debug and saw the same: 1500 keys is where it ends. I also enabled 
debug_filestore and ...

2023-02-23T00:02:34.876+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb297e7920 already registered
2023-02-23T00:02:34.876+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb297e7920 seq 
148188829 osr(meta) 4859 bytes   (queue has 49 ops and 238167 bytes)
2023-02-23T00:02:34.876+0100 7f8efc23ee00 10 snap_mapper.convert_legacy 
converted 1470 keys
2023-02-23T00:02:34.880+0100 7f8efc23ee00  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_transactions(2303): osr 
0x55fb27780540 osr(meta)
2023-02-23T00:02:34.880+0100 7f8efc23ee00  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_transactions(2345): (writeahead) 
148188845 [Transaction(0x55fb299870e0)]
2023-02-23T00:02:34.880+0100 7f8efc23ee00 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb29c52d20 #-1:c0371625:::snapmapper:0# (0x55fb29f9c9e0)
2023-02-23T00:02:34.880+0100 7f8efc23ee00 10 snap_mapper.convert_legacy 
converted 1500 keys
2023-02-23T00:02:34.880+0100 7f8efc23ee00  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_transactions(2303): osr 
0x55fb27780540 osr(meta)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb297e7aa0 seq 
148188830 osr(meta) [Transaction(0x55fb29986000)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb297e7aa0 already registered
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb297e7aa0 seq 
148188830 osr(meta) 4859 bytes   (queue has 50 ops and 243026 bytes)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb297e7c20 seq 
148188831 osr(meta) [Transaction(0x55fb29986120)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb297e7c20 already registered
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb297e7c20 seq 
148188831 osr(meta) 4859 bytes   (queue has 50 ops and 243026 bytes)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb297e7d40 seq 
148188832 osr(meta) [Transaction(0x55fb29986240)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb297e7d40 already registered
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb297e7d40 seq 
148188832 osr(meta) 4859 bytes   (queue has 50 ops and 243026 bytes)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb297e7ec0 seq 
148188833 osr(meta) [Transaction(0x55fb29986360)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb297e7ec0 already registered
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb297e7ec0 seq 
148188833 osr(meta) 4859 bytes   (queue has 50 ops and 243026 bytes)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb2921db60 seq 
148188834 osr(meta) [Transaction(0x55fb29986480)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb2921db60 already registered
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb2921db60 seq 
148188834 osr(meta) 4859 bytes   (queue has 50 ops and 243026 bytes)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb2921c8a0 seq 
148188835 osr(meta) [Transaction(0x55fb299865a0)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb2921c8a0 already registered
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) queue_op(2181): 0x55fb2921c8a0 seq 
148188835 osr(meta) 4859 bytes   (queue has 50 ops and 243026 bytes)
2023-02-23T00:02:34.888+0100 7f8ef26d1700  5 
filestore(/var/lib/ceph/osd/ceph-0) _journaled_ahead(2440): 0x55fb29c52000 seq 
148188836 osr(meta) [Transaction(0x55fb299866c0)]
2023-02-23T00:02:34.888+0100 7f8ef26d1700 20 filestore.osr(0x55fb27780540) 
_register_apply 0x55fb29c52000 already registered
2023-02-23T00:02:34.888+0100 7f

[ceph-users] Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

2023-02-07 Thread Mark Schouten

Hi,

Thanks. Someone told me that we could just destroy the FileStore OSD’s 
and recreate them as BlueStore, even though the cluster is partially 
upgraded. So I guess I’ll just do that. (Unless someone here tells me 
that that’s a terrible idea :))


—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl / +31 318 200208


-- Original Message --
From "Eugen Block" 
To ceph-users@ceph.io
Date 2/7/2023 4:58:11 PM
Subject [ceph-users] Re: OSD upgrade problem nautilus->octopus - 
snap_mapper upgrade stuck



Hi,

I don't really have an answer, but there was a bug with the snap mapper [1]; 
[2] is supposed to verify consistency. Octopus is EOL, so you might need to 
upgrade directly to Pacific. That's what we did on multiple clusters (N --> P) 
a few months back. I'm not sure if it would just work if you already have a 
couple of Octopus daemons; maybe you can try it on a test cluster.

Regards,
Eugen

[1] https://tracker.ceph.com/issues/56147
[2] https://github.com/ceph/ceph/pull/47388

Quoting Mark Schouten:


Hi,

I’m seeing the same thing …

With debug logging enabled I see this:
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy 
converted 1410 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy 
converted 1440 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy 
converted 1470 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10  snap_mapper.convert_legacy 
converted 1500 keys

It ends at 1500 keys. And nothing happens.

I’m now stuck with a cluster that has 4 OSD’s on Octopus, 10 on Nautilus, and 
one down .. A hint on how to work around this is welcome :)

—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl / +31 318 200208


-- Original Message --
From "Jan Pekař - Imatic" 
To ceph-users@ceph.io
Date 1/12/2023 5:53:02 PM
Subject [ceph-users] OSD upgrade problem nautilus->octopus -  snap_mapper 
upgrade stuck


Hi all,

I have a problem upgrading Nautilus to Octopus on my OSD.

Upgrading the mon and mgr was OK, but the first OSD is stuck on

2023-01-12T09:25:54.122+0100 7f49ff3eae00  1 osd.0 126556 init upgrade snap_mapper (first start as octopus)

and there was no activity after that for more than 48 hours. No disk activity.

I restarted the OSD many times and nothing changed.

It is an old FileStore OSD based on an XFS filesystem. Is the upgrade to snap 
mapper 2 reliable? What is the OSD waiting for? Can I start the OSD without 
the upgrade and get the cluster healthy with the old snap structure? Or should 
I skip the Octopus upgrade and go to Pacific directly (is some bug backport 
missing?).

Thank you for the help, I'm sending some logs below.

Log shows

2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 ceph version 15.2.17  
(694d03a6f6c6e9f814446223549caf9a9f60dba0) octopus (stable),  process ceph-osd, 
pid 2566563
2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 pidfile_write: ignore  empty 
--pid-file
2023-01-09T19:12:49.499+0100 7f41f60f1e00 -1 missing 'type' file,  inferring 
filestore from current/ dir
2023-01-09T19:12:49.531+0100 7f41f60f1e00  0 starting osd.0  osd_data 
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-09T19:12:49.531+0100 7f41f60f1e00 -1 Falling back to public  interface
2023-01-09T19:12:49.871+0100 7f41f60f1e00  0 load: jerasure load:  lrc load: isa
2023-01-09T19:12:49.875+0100 7f41f60f1e00  0  
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:0.OSDShard using  op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:1.OSDShard using  op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:2.OSDShard using  op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:3.OSDShard using  op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:4.OSDShard using  op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue,  cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0  
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0  
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  FIEMAP 
ioctl is disabled via 'filestore fiemap' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0  
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  
SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole'  config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0  
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features:  splice() is 
disabled via 'filestore splice' config option
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0  
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) de

[ceph-users] Re: OSD upgrade problem nautilus->octopus - snap_mapper upgrade stuck

2023-02-06 Thread Mark Schouten

Hi,

I’m seeing the same thing …

With debug logging enabled I see this:
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1410 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1440 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1470 keys
2023-02-07T00:35:51.853+0100 7fdab9930e00 10 snap_mapper.convert_legacy 
converted 1500 keys


It ends at 1500 keys. And nothing happens.

I’m now stuck with a cluster that has 4 OSD’s on Octopus, 10 on 
Nautilus, and one down .. A hint on how to work around this is welcome 
:)


—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl / +31 318 200208


-- Original Message --
From "Jan Pekař - Imatic" 
To ceph-users@ceph.io
Date 1/12/2023 5:53:02 PM
Subject [ceph-users] OSD upgrade problem nautilus->octopus - snap_mapper 
upgrade stuck



Hi all,

I have a problem upgrading Nautilus to Octopus on my OSD.

Upgrading the mon and mgr was OK, but the first OSD is stuck on

2023-01-12T09:25:54.122+0100 7f49ff3eae00  1 osd.0 126556 init upgrade snap_mapper (first start as octopus)

and there was no activity after that for more than 48 hours. No disk activity.

I restarted the OSD many times and nothing changed.

It is an old FileStore OSD based on an XFS filesystem. Is the upgrade to snap 
mapper 2 reliable? What is the OSD waiting for? Can I start the OSD without 
the upgrade and get the cluster healthy with the old snap structure? Or should 
I skip the Octopus upgrade and go to Pacific directly (is some bug backport 
missing?).

Thank you for the help, I'm sending some logs below.

Log shows

2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 ceph version 15.2.17 
(694d03a6f6c6e9f814446223549caf9a9f60dba0) octopus (stable), process ceph-osd, 
pid 2566563
2023-01-09T19:12:49.471+0100 7f41f60f1e00  0 pidfile_write: ignore empty 
--pid-file
2023-01-09T19:12:49.499+0100 7f41f60f1e00 -1 missing 'type' file, inferring 
filestore from current/ dir
2023-01-09T19:12:49.531+0100 7f41f60f1e00  0 starting osd.0 osd_data 
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
2023-01-09T19:12:49.531+0100 7f41f60f1e00 -1 Falling back to public interface
2023-01-09T19:12:49.871+0100 7f41f60f1e00  0 load: jerasure load: lrc load: isa
2023-01-09T19:12:49.875+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:0.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:1.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:2.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:3.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 osd.0:4.OSDShard using op 
scheduler ClassedOpQueueScheduler(queue=WeightedPriorityQueue, cutoff=196)
2023-01-09T19:12:49.883+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl 
is disabled via 'filestore fiemap' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: 
SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
2023-01-09T19:12:49.927+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice() is 
disabled via 'filestore splice' config option
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0 
genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) 
syscall fully supported (by glibc and kernel)
2023-01-09T19:12:49.983+0100 7f41f60f1e00  0 
xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is 
disabled by conf
2023-01-09T19:12:50.015+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) start omap initiation
2023-01-09T19:12:50.079+0100 7f41f60f1e00  1 leveldb: Recovering log #165531
2023-01-09T19:12:50.083+0100 7f41f60f1e00  1 leveldb: Level-0 table #165533: 
started
2023-01-09T19:12:50.235+0100 7f41f60f1e00  1 leveldb: Level-0 table #165533: 
1598 bytes OK
2023-01-09T19:12:50.583+0100 7f41f60f1e00  1 leveldb: Delete type=0 #165531

2023-01-09T19:12:50.615+0100 7f41f60f1e00  1 leveldb: Delete type=3 #165529

2023-01-09T19:12:51.339+0100 7f41f60f1e00  0 
filestore(/var/lib/ceph/osd/ceph-0) mount(1861): enabling WRITEAHEAD journal 
mode: checkpoint is not enabled
2023-01-09T19:12:51.379+0100 7f41f60f1e00  1 journal _open 
/var/lib/ceph/osd/ceph-0/journal fd 35: 2998927360 bytes, block size 4096 
bytes, directio = 1, aio = 1
2023-01-09T19:12:51.931+0100 7f41f60f1e00 -1 journal do

[ceph-users] Re: how to upgrade host os under ceph

2022-10-26 Thread Mark Schouten

Hi Simon,

You can just dist-upgrade the underlying OS. Assuming that you installed 
the packages from https://download.ceph.com/debian-octopus/, just change 
bionic to focal in all apt-sources, and dist-upgrade away.
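Roughly, and only as a sketch of what that looks like per node (the noout step 
and the exact sources file names are my additions, not from the message above):

ceph osd set noout            # avoid rebalancing while a node restarts

sed -i 's/bionic/focal/g' /etc/apt/sources.list /etc/apt/sources.list.d/*.list
apt update && apt dist-upgrade -y
reboot

# once the node's OSDs are back up and the cluster is healthy again
ceph osd unset noout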


—
Mark Schouten, CTO
Tuxis B.V.
m...@tuxis.nl


-- Original Message --
From "Simon Oosthoek" 
To "ceph-users@ceph.io" 
Date 26/10/2022 16:14:28
Subject [ceph-users] how to upgrade host os under ceph


Dear list,

I'm looking for some guide or pointers to how people upgrade the underlying 
host OS in a ceph cluster (if this is the right way to proceed, I don't even 
know...)

Our cluster is nearing 4.5 years of age and our Ubuntu 18.04 nodes are nearing 
their end-of-support date. We have a mixed cluster of u18 and u20 nodes, all 
running Octopus at the moment.

We would like to upgrade the OS on the nodes, without changing the ceph version 
for now (or per se).

Is it as easy as installing a new OS version, installing the ceph-osd package 
and a correct ceph.conf file and restoring the host key?

Or is more needed regarding the specifics of the OSD disks/WAL/journal?

Or is it necessary to drain a node of all data and re-add the OSDs as new 
units? (This would be too much work, so I doubt it ;-)

The problem with searching for information about this is that it seems 
undocumented in the Ceph documentation, and search results are flooded with 
Ceph version upgrade guides.

Cheers

/Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Cluster downtime due to unsynchronized clocks

2021-09-23 Thread Mark Schouten
Hi,

Last night we had downtime on a simple three-node cluster. Here’s 
what happened:
2021-09-23 00:18:48.331528 mon.node01 (mon.0) 834384 : cluster [WRN] 
message from mon.2 was stamped 8.401927s in the future, clocks not 
synchronized
2021-09-23 00:18:57.783437 mon.node01 (mon.0) 834386 : cluster [WRN] 1 
clock skew 8.40163s > max 0.05s
2021-09-23 00:18:57.783486 mon.node01 (mon.0) 834387 : cluster [WRN] 2 
clock skew 8.40146s > max 0.05s
2021-09-23 00:18:59.843444 mon.node01 (mon.0) 834388 : cluster [WRN] 
Health check failed: clock skew detected on mon.node02, mon.node03 
(MON_CLOCK_SKEW)

The cause of this time shift is the terrible way that systemd-timesyncd 
works: it depends on a single NTP server. If that one goes haywire, 
systemd-timesyncd does not check with others, but just sets the clock on 
your machine incorrectly. We will fix this with chrony.
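For reference, a minimal sketch of that switch to chrony; the pool entries 
below are distribution defaults used as an example, not our actual servers:

apt install chrony
systemctl disable --now systemd-timesyncd

# /etc/chrony/chrony.conf: several sources, so one bad server gets outvoted
#   pool 0.debian.pool.ntp.org iburst
#   pool 1.debian.pool.ntp.org iburst
#   pool 2.debian.pool.ntp.org iburst

systemctl restart chrony
chronyc sources -v            # verify that multiple sources are reachable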

However, what I don’t understand is why the cluster does not see 
the single monitor as incorrect, but instead flags the two correct machines. 
Is this because one of the three is master-ish?

Obviously we will fix the time issues, but I would like to understand 
why Ceph stops functioning because one monitor has an incorrect time.

Thanks!

--
Mark Schouten
CTO, Tuxis B.V. | https://www.tuxis.nl/
 <mailto:m...@tuxis.nl> | +31 318 200208
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-07-08 Thread Mark Schouten

Hi,

On 15-05-2021 at 22:17, Mark Schouten wrote:

Ok, so that helped for one of the MDS'es. Trying to deactivate another
mds, it started to release inos and dns'es, until it was almost done.
When it had about 50 left, a client started to complain and got
blacklisted until I restarted the deactivated MDS.

So no joy yet; I'm not down to a single active MDS. Any ideas on how to
achieve that are appreciated.


I've been able to deactivate the second MDS. Not sure why, but I had a 
lot of stray entries, which I've cleaned up by running a `find -ls` on 
the whole CephFS tree.


That already was a few weeks ago, but I decided to just try and 
deactivate the second MDS, which now worked. So now I can finally do the 
upgrade. :)


Thanks!

--
Mark Schouten
CTO, Tuxis B.V. | https://www.tuxis.nl/
 | +31 318 200208
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 12:38:07PM +0200, Mark Schouten wrote:
> On Thu, May 27, 2021 at 06:25:44AM +, Martin Rasmus Lundquist Hansen 
> wrote:
> > After scaling the number of MDS daemons down, we now have a daemon stuck in 
> > the
> > "up:stopping" state. The documentation says it can take several minutes to 
> > stop the
> > daemon, but it has been stuck in this state for almost a full day. 
> > According to
> > the "ceph fs status" output attached below, it still holds information 
> > about 2
> > inodes, which we assume is the reason why it cannot stop completely.
> > 
> > Does anyone know what we can do to finally stop it?
> 
> I have no clients, and it still does not want to stop rank1. Funny
> thing is, while trying to fix this by restarting mdses, I sometimes see
> a list of clients popping up in the dashboard, even though no clients
> are connected..

Configuring debug logging shows me the following:
https://p.6core.net/p/rlMaunS8IM1AY5E58uUB6oy4


I have quite a lot of hardlinks on this filesystem, which I've seen
cause 'No space left on device' issues. I have mds_bal_fragment_size_max
set to 20 to mitigate that.

The message 'waiting for strays to migrate' makes me feel like I should
push the MDS to migrate them somehow .. But how?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 06:25:44AM +, Martin Rasmus Lundquist Hansen wrote:
> After scaling the number of MDS daemons down, we now have a daemon stuck in 
> the
> "up:stopping" state. The documentation says it can take several minutes to 
> stop the
> daemon, but it has been stuck in this state for almost a full day. According 
> to
> the "ceph fs status" output attached below, it still holds information about 2
> inodes, which we assume is the reason why it cannot stop completely.
> 
> Does anyone know what we can do to finally stop it?

I have no clients, and it still does not want to stop rank1. Funny
thing is, while trying to fix this by restarting mdses, I sometimes see
a list of clients popping up in the dashboard, even though no clients
are connected..

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Spam] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 10:37:33AM +0200, Mark Schouten wrote:
> On Thu, May 27, 2021 at 07:02:16AM +, 胡 玮文 wrote:
> > You may hit https://tracker.ceph.com/issues/50112, for which we have not 
> > found the root cause yet. I resolved this by restarting rank 0. (I have 
> > only 2 active MDSs)
> 
> I have this exact issue while trying to upgrade from 12.2 (the upgrade is
> pending on this mds issue). I don't have any active clients, and restarting
> rank 0 does not help.

Since I have no active clients: can I just shut down all the mds'es,
upgrade them, and expect the upgrade to fix this magically? Or would
upgrading possibly break the CephFS filesystem?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Spam] Re: MDS stuck in up:stopping state

2021-05-27 Thread Mark Schouten
On Thu, May 27, 2021 at 07:02:16AM +, 胡 玮文 wrote:
> You may hit https://tracker.ceph.com/issues/50112, for which we have not found 
> the root cause yet. I resolved this by restarting rank 0. (I have only 2 active 
> MDSs)

I have this exact issue while trying to upgrade from 12.2 (the upgrade is
pending on this mds issue). I don't have any active clients, and restarting
rank 0 does not help.

+--+--+---+---+---+---+
| Rank |  State   |MDS |Activity  |  dns  |  inos |
+--+--+---+---+---+---+
|  0   |  active  | osdnode05 | Reqs:0 /s | 2760k | 2760k |
|  1   | stopping | osdnode06 |   |   10  |   11  |
+--+--+---+---+---+---+


-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Force processing of num_strays in mds

2021-05-18 Thread Mark Schouten
Hi,

I have a 12.2.13 cluster I want to upgrade. However, there is a whole
bunch of stray files/inodes(?) which I would want to have processed,
also because I get a lot of 'No space left on device' messages. I
started a 'find . -ls' in the root of the CephFS filesystem, but that
causes overload and takes a lot of time, while not necessarily fixing
the num_strays.

How do I force the mds'es to process those strays so that clients do not
get 'incorrect' errors?
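For anyone watching the same thing, a small sketch of how the stray counters 
can be inspected while such a scan runs; mds.<name> is a placeholder for the 
local MDS id:

# run on the host of the active MDS
ceph daemon mds.<name> perf dump | grep strays

# repeat while the find runs to see whether num_strays actually drops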

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-15 Thread Mark Schouten
On Fri, May 14, 2021 at 09:12:07PM +0200, Mark Schouten wrote:
> It seems (documentation was no longer available, so it took some
> searching) that I needed to run ceph mds deactivate $fs:$rank for every
> MDS I wanted to deactivate.

Ok, so that helped for one of the MDS'es. Trying to deactivate another
mds, it started to release inos and dns'es, until it was almost done.
When it had about 50 left, a client started to complain and got
blacklisted until I restarted the deactivated MDS.

So no joy yet; I'm not down to a single active MDS. Any ideas on how to
achieve that are appreciated.

Thanks!

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: "No space left on device" when deleting a file

2021-05-14 Thread Mark Schouten
On Tue, May 11, 2021 at 02:55:05PM +0200, Mark Schouten wrote:
> On Tue, May 11, 2021 at 09:53:10AM +0200, Mark Schouten wrote:
> > This helped me too. However, should I see num_strays decrease again?
> > I'm  running a `find -ls` over my CephFS tree..
> 
> This helps: the number of stray files is slowly decreasing. But given
> the number of files in the cluster, it'll take a while ...


Deactivating one of the MDS'es triggered a lot of work too.

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-14 Thread Mark Schouten
On Mon, May 10, 2021 at 10:46:45PM +0200, Mark Schouten wrote:
> I still have three active ranks. Do I simply restart two of the MDS'es
> and force max_mds to one daemon, or is there a nicer way to move two
> mds'es from active to standby?

It seems (documentation was no longer available, so it took some
searching) that I needed to run ceph mds deactivate $fs:$rank for every
MDS I wanted to deactivate.

That helped!

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: "No space left on device" when deleting a file

2021-05-11 Thread Mark Schouten
On Tue, May 11, 2021 at 09:53:10AM +0200, Mark Schouten wrote:
> This helped me too. However, should I see num_strays decrease again?
> I'm  running a `find -ls` over my CephFS tree..

This helps: the number of stray files is slowly decreasing. But given
the number of files in the cluster, it'll take a while ...

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-11 Thread Mark Schouten
On Tue, May 11, 2021 at 09:13:51AM +, Eugen Block wrote:
> You can check the remaining active daemons if they have pinned subtrees:
> 
> ceph daemon mds.daemon-a get subtrees | jq '.[] | [.dir.path, .auth_first]'

This gives me output, a whole lot of lines. However, none of the
directories listed are directories anyone has ever actively pinned...


-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-11 Thread Mark Schouten
On Tue, May 11, 2021 at 08:47:26AM +, Eugen Block wrote:
> I don't have a Luminous cluster at hand right now but setting max_mds to 1
> already should take care and stop MDS services. Do you have have pinning
> enabled (subdirectories pinned to a specific MDS)?

Not on this cluster, AFAIK. How can I check that?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: "No space left on device" when deleting a file

2021-05-11 Thread Mark Schouten
[Resent because of incorrect ceph-users@ address..]

On Tue, Mar 26, 2019 at 05:19:24PM +, Toby Darling wrote:
> Hi Dan
> 
> Thanks!
> 
>   ceph tell mds.ceph1 config set mds_bal_fragment_size_max 20
> 
> got us running again.

This helped me too. However, should I see num_strays decrease again?
I'm  running a `find -ls` over my CephFS tree..

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade tips from Luminous to Nautilus?

2021-05-10 Thread Mark Schouten
On Thu, Apr 29, 2021 at 10:58:15AM +0200, Mark Schouten wrote:
> We've done our fair share of Ceph cluster upgrades since Hammer, and 
> have not seen many problems with them. I'm now at the point that I have 
> to upgrade a rather large cluster running Luminous, and I would like to 
> hear from other users whether they have experience with issues I can expect, 
> so that I can anticipate them beforehand.


Thanks for the replies! 

Just one question though. Step one for me was to lower max_mds to one.
The documentation seems to suggest that the cluster automagically moves the
extra mds'es to a standby state. However, nothing really happens.

root@osdnode01:~# ceph fs get dadup_pmrb | grep max_mds
max_mds 1

I still have three active ranks. Do I simply restart two of the MDS'es
and force max_mds to one daemon, or is there a nicer way to move two
mds'es from active to standby?
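For reference, pieced together from the rest of this thread, the sequence that 
eventually worked on this Luminous cluster looks roughly like this (filesystem 
name taken from the output above):

ceph fs set dadup_pmrb max_mds 1

# on Luminous the extra ranks have to be stopped explicitly, highest rank first;
# newer releases do this automatically once max_mds is lowered
ceph mds deactivate dadup_pmrb:2
ceph mds deactivate dadup_pmrb:1

ceph fs status dadup_pmrb     # ranks 1 and 2 should move to stopping, then standby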

Thanks again!

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Upgrade tips from Luminous to Nautilus?

2021-04-29 Thread Mark Schouten
Hi,

We've done our fair share of Ceph cluster upgrades since Hammer, and
have not seen many problems with them. I'm now at the point that I have
to upgrade a rather large cluster running Luminous, and I would like to
hear from other users whether they have experience with issues I can expect,
so that I can anticipate them beforehand.

As said, the cluster is running Luminous (12.2.13) and has the following
services active:
  services:
mon: 3 daemons, quorum osdnode01,osdnode02,osdnode04
mgr: osdnode01(active), standbys: osdnode02, osdnode03
mds: pmrb-3/3/3 up 
{0=osdnode06=up:active,1=osdnode08=up:active,2=osdnode07=up:active}, 1 
up:standby
osd: 116 osds: 116 up, 116 in;
rgw: 3 daemons active


Of the OSD's, we have 11 SSD's and 105 HDD. The capacity of the cluster
is 1.01PiB.

We have 2 active crush rules on 18 pools. All pools have a size of 3, and there 
is a total of 5760 pgs.
{
    "rule_id": 1,
    "rule_name": "hdd-data",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -10,
            "item_name": "default~hdd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
},
{
    "rule_id": 2,
    "rule_name": "ssd-data",
    "ruleset": 2,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -21,
            "item_name": "default~ssd"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 0,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

rbd -> crush_rule: hdd-data
.rgw.root -> crush_rule: hdd-data
default.rgw.control -> crush_rule: hdd-data
default.rgw.data.root -> crush_rule: ssd-data
default.rgw.gc -> crush_rule: ssd-data
default.rgw.log -> crush_rule: ssd-data
default.rgw.users.uid -> crush_rule: hdd-data
default.rgw.usage -> crush_rule: ssd-data
default.rgw.users.email -> crush_rule: hdd-data
default.rgw.users.keys -> crush_rule: hdd-data
default.rgw.meta -> crush_rule: hdd-data
default.rgw.buckets.index -> crush_rule: ssd-data
default.rgw.buckets.data -> crush_rule: hdd-data
default.rgw.users.swift -> crush_rule: hdd-data
default.rgw.buckets.non-ec -> crush_rule: ssd-data
DB0475 -> crush_rule: hdd-data
cephfs_pmrb_data -> crush_rule: hdd-data
cephfs_pmrb_metadata -> crush_rule: ssd-data


All but four clients are running Luminous; those four are running Jewel
(and need upgrading before proceeding with this upgrade).

So, normally, I would 'just' upgrade all Ceph packages on the
monitor-nodes and restart mons and then mgrs.

After that, I would upgrade all Ceph packages on the OSD nodes and
restart all the OSD's. Then, after that, the MDSes and RGWs. Restarting
the OSD's will probably take a while.
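In shell terms, a rough sketch of that plan (package and target names as on 
Debian/Ubuntu; the msgr2 and require-osd-release steps come from the Nautilus 
release notes, not from experience yet):

# per monitor node
apt update && apt install -y ceph-mon ceph-mgr
systemctl restart ceph-mon.target ceph-mgr.target

# per OSD node, one at a time, waiting for HEALTH_OK in between
apt install -y ceph-osd && systemctl restart ceph-osd.target

# once everything runs Nautilus
ceph osd require-osd-release nautilus
ceph mon enable-msgr2

# MDSes (release notes want max_mds reduced to 1 first) and RGWs last
systemctl restart ceph-mds.target ceph-radosgw.target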

If anyone has a hint on what I should expect to cause some extra load or
waiting time, that would be great.

Obviously, we have read
https://ceph.com/releases/v14-2-0-nautilus-released/ , but I'm looking
for real world experiences.

Thanks!


-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to delete versioned bucket

2021-04-29 Thread Mark Schouten
On Sat, Apr 24, 2021 at 06:06:04PM +0200, Mark Schouten wrote:
> Using the following command:
> s3cmd setlifecycle lifecycle.xml s3://syslog_tuxis_net
> 
> That gave no error, and I see in s3browser that it's active.
> 
> The RGW does not seem to kick in yet, but I'll keep an eye on that.

Unfortunately, the delete markers are still there. Does anyone have a tip on how
to fix this?

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to delete versioned bucket

2021-04-24 Thread Mark Schouten
Hi,

Thanks!

On Fri, Apr 23, 2021 at 10:04:52PM +0530, Soumya Koduri wrote:
> This command is working for DeleteMarker at least on the latest master. I am
> not sure if there is any bug in the version you are using. One other way to
> delete them is to set lifecycle policy on the bucket with below tag -
> 
>     Enabled
>     
> true
>    

I've set the following XML:

  
Enabled
/

  true

  


Using the following command:
s3cmd setlifecycle lifecycle.xml s3://syslog_tuxis_net

That gave no error, and I see in s3browser that it's active.

The RGW does not seem to kick in yet, but I'll keep an eye on that.
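(The list archive strips the XML tags above; the rule that was set is 
presumably something along these lines, with element names taken from the 
standard S3 lifecycle schema rather than from the original mail:)

<LifecycleConfiguration>
  <Rule>
    <Status>Enabled</Status>
    <Prefix>/</Prefix>
    <Expiration>
      <ExpiredObjectDeleteMarker>true</ExpiredObjectDeleteMarker>
    </Expiration>
  </Rule>
</LifecycleConfiguration>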

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Unable to delete versioned bucket

2021-04-23 Thread Mark Schouten
Hi,

I have a bucket that has versioning enabled and I am trying to remove
it. This is not possible, because the bucket is not empty:

mark@tuxis:~$ aws --endpoint=https://nl.dadup.eu s3api delete-bucket
--bucket syslog_tuxis_net 

An error occurred (BucketNotEmpty) when calling the DeleteBucket
operation: Unknown

It took me a while to determine that `curator` had enabled bucket
versioning, but once I had, I removed all versions.

But, when trying to delete the bucket, it still doesn't work because
there are DeleteMarkers, eg:

{
"Owner": {
"DisplayName": "Syslog Backup",
"ID": "DB0220$syslog_backup"
},
"Key": "incompatible-snapshots",
"VersionId": "noeJtvBpV5HINGQTJeXEq5mzlzsWneg",
"IsLatest": true,
"LastModified": "2021-03-22T16:35:18.298Z"
},
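(That fragment presumably comes from listing the object versions; a sketch of 
that call, with the endpoint and bucket used elsewhere in this mail:)

aws --endpoint=https://nl.dadup.eu s3api list-object-versions --bucket syslog_tuxis_net
# the output contains "Versions" and "DeleteMarkers" arrays; the entry above is
# one of the DeleteMarkers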


I cannot download that object:
mark@tuxis:~$ aws --endpoint=https://nl.dadup.eu s3api get-object
--bucket syslog_tuxis_net --key incompatible-snapshots /tmp/foobar

An error occurred (NoSuchKey) when calling the GetObject operation:
Unknown

Nor can I delete the object:
mark@tuxis:~$ aws --endpoint=https://nl.dadup.eu s3api delete-object
--bucket syslog_tuxis_net --key incompatible-snapshots 


So, according to
https://ceph.io/planet/on-ceph-rgw-s3-object-versioning/#on-delete-marker
I should be able to delete that DeleteMarker with this command:
mark@tuxis:~$ aws --endpoint=https://nl.dadup.eu s3api delete-object
--bucket syslog_tuxis_net --key incompatible-snapshots --version-id
noeJtvBpV5HINGQTJeXEq5mzlzsWneg


But that command does not give any output, and it does not delete the
marker either.


So I'm stuck with that bucket which I would like to remove without
abusing radosgw-admin.

This cluster is running 12.2.13 with civetweb rgw's behind a haproxy
setup. All is working fine, except for this versioned bucket. Can
anyone point me in the right direction to remove this bucket as a
normal user?

Thanks!


-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] CephFS max_file_size

2020-12-11 Thread Mark Schouten

Hi,

There is a default limit of 1TiB for max_file_size in CephFS. I altered 
that to 2TiB, but I now have a request to store a file of up to 7TiB.

I'd expect the limit to be there for a reason, but what is the risk of setting 
that value to, say, 10TiB?
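For context, changing the limit itself is a one-liner (the filesystem name is a 
placeholder); the question is only about the consequences of doing so:

# 10 TiB expressed in bytes
ceph fs set <fsname> max_file_size 10995116277760
ceph fs get <fsname> | grep max_file_size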

--
Mark Schouten 

Tuxis, Ede, https://www.tuxis.nl

T: +31 318 200208 
 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD takes almost two hours to boot from Luminous -> Nautilus

2020-08-19 Thread Mark Schouten
On Wed, Aug 19, 2020 at 03:55:17PM +0300, Igor Fedotov wrote:
>most probably you're suffering from a per-pool omap statistics update,
>which is performed during the first start of an upgraded OSD. One can
>disable this behavior via the 'bluestore_fsck_quick_fix_on_mount' flag. But
>please expect an incomplete OMAP usage report if the stats are not updated.

Thanks! If there is no technical issue with it, I think I could just
configure bluestore_fsck_quick_fix_on_mount in ceph.conf, do the upgrade
the 'quick' way, and afterwards remove the config line and restart the OSDs
again at a more convenient time?
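A sketch of what that would look like, assuming the flag is simply set to false 
for the duration of the upgrade (per Igor's description above):

# /etc/ceph/ceph.conf, temporarily, on the OSD nodes being upgraded
[osd]
bluestore_fsck_quick_fix_on_mount = false

# later, at a quiet moment: remove the line again and restart the OSDs one by
# one so the omap statistics conversion can run, e.g. systemctl restart ceph-osd@<id>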

-- 
Mark Schouten | Tuxis B.V.
KvK: 74698818 | http://www.tuxis.nl/
T: +31 318 200208 | i...@tuxis.nl
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] OSD takes almost two hours to boot from Luminous -> Nautilus

2020-08-19 Thread Mark Schouten

Hi,

Last night I upgraded a Luminous cluster to Nautilus. All went well, but there 
was one sleep-depriving issue I would like to prevent from happening next week 
while upgrading another cluster. Maybe you people can help me figure out what 
actually happened.

So I upgraded the packages and restarted the mons and mgrs. Then I started 
restarting the OSD's on one of the nodes. Below are the start and 'start_boot' 
times; in between, the disks were reading at full speed, I think the whole disk.

2020-08-19 02:08:10.568 7fd742b09c80  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-08-19 02:09:33.591 7fd742b09c80  1 osd.8 2188 start_boot


2020-08-19 02:08:10.592 7fb453887c80  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-08-19 02:17:40.878 7fb453887c80  1 osd.5 2188 start_boot


2020-08-19 02:08:10.836 7f907bc0cc80  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-08-19 02:19:58.462 7f907bc0cc80  1 osd.3 2188 start_boot


2020-08-19 02:08:10.584 7f1ca892cc80  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-08-19 03:13:24.179 7f1ca892cc80  1 osd.11 2188 start_boot

2020-08-19 02:08:10.568 7f059f80dc80  0 set uid:gid to 64045:64045 (ceph:ceph)
2020-08-19 04:06:55.342 7f059f80dc80  1 osd.14 2188 start_boot

So, while this is not an issue which breaks anything technical, I would like to 
know how I can arrange for the OSD to do this 'maintenance' beforehand so I 
don't have to wait too long. :)

I do see a warning in the logs; is it related: "store not yet converted to 
per-pool stats"?

Thanks!
--
Mark Schouten 

Tuxis, Ede, https://www.tuxis.nl

T: +31 318 200208 
 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrade procedure on Ubuntu Bionic with stock packages

2019-08-28 Thread Mark Schouten

Cool, thanks!

--
Mark Schouten 

Tuxis, Ede, https://www.tuxis.nl

T: +31 318 200208 
 
-- Original Message --
From: James Page (james.p...@canonical.com)
Date: 28-08-2019 11:02
To: Mark Schouten (m...@tuxis.nl)
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Upgrade procedure on Ubuntu Bionic with stock 
packages

Hi Mark
On Wed, Aug 28, 2019 at 9:51 AM Mark Schouten  wrote:


Hi,

I have a cluster running on Ubuntu Bionic, with stock Ubuntu Ceph packages. 
When upgrading, I always try to follow the procedure as documented here: 
https://docs.ceph.com/docs/master/install/upgrading-ceph/

However, the Ubuntu packages restart all daemons upon upgrade, per node. So if 
I upgrade the first node, it will restart mon, osds, rgw, and mds'es on that 
node, even though the rest of the cluster is running the old version.

I tried upgrading a single package, to see how that goes, but due to 
dependencies in dpkg, all other packages are upgraded as well.


This is a known issue in the Ceph packages in Ubuntu:

https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/1840347

the behaviour of debhelper (which generates snippets for maintainer scripts) 
changed and it was missed - fix being worked on at the moment.

Cheers

James

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Upgrade procedure on Ubuntu Bionic with stock packages

2019-08-28 Thread Mark Schouten

Hi,

I have a cluster running on Ubuntu Bionic, with stock Ubuntu Ceph packages. 
When upgrading, I always try to follow the procedure as documented here: 
https://docs.ceph.com/docs/master/install/upgrading-ceph/

However, the Ubuntu packages restart all daemons upon upgrade, per node. So if 
I upgrade the first node, it will restart mon, osds, rgw, and mds'es on that 
node, even though the rest of the cluster is running the old version.

I tried upgrading a single package, to see how that goes, but due to 
dependencies in dpkg, all other packages are upgraded as well.

How should I proceed?


Thanks,

--
Mark Schouten 

Tuxis, Ede, https://www.tuxis.nl

T: +31 318 200208 
 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io