[ceph-users] OSDs not balanced

2024-03-04 Thread Ml Ml
Hello, i wonder why my autobalancer is not working here: root@ceph01:~# ceph -s cluster: id: 5436dd5d-83d4-4dc8-a93b-60ab5db145df health: HEALTH_ERR 1 backfillfull osd(s) 1 full osd(s) 1 nearfull osd(s) 4 pool(s) full => osd.17 was
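A minimal sketch (not from the thread) of checking whether the mgr balancer module is actually enabled and how the current distribution scores; note that an OSD marked full blocks writes and usually needs direct attention (reweighting or more capacity) beyond what the balancer does gradually:
  ceph balancer status      # module on/off and the active mode (crush-compat or upmap)
  ceph balancer eval        # score of the current PG distribution, lower is better
  ceph osd df tree          # compare %USE and PGS per OSD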

[ceph-users] Re: Planning cluster

2023-07-11 Thread Ml Ml
Never ever use osd pool default min size = 1; this will break your neck and does not really make sense. :-) On Mon, Jul 10, 2023 at 7:33 PM Dan van der Ster wrote: > > Hi Jan, > > On Sun, Jul 9, 2023 at 11:17 PM Jan Marek wrote: > > > Hello, > > > > I have a cluster, which has this
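For reference, a hedged example of what a sane replicated pool looks like instead (size 3, min_size 2); the pool name is a placeholder:
  ceph osd pool set <pool> size 3
  ceph osd pool set <pool> min_size 2
  ceph config set global osd_pool_default_min_size 2   # default for pools created later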

[ceph-users] cephadm ls / ceph orch ps => here does it get its information?

2022-12-29 Thread Ml Ml
Hello, it seems i have not fully removed an old osd. Now i have: root@ceph07:/tmp# ceph orch ps |grep -e error -e stopped |grep ceph07 _osd.33 ceph07 stopped 2h ago 2y quay.io/ceph/ceph:v15.2.17 mon.ceph01ceph07 error 2h ago 2y
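cephadm ls reads the daemon directories under /var/lib/ceph/<fsid>/ on that host, while ceph orch ps shows what the mgr last gathered from the hosts. A hedged sketch for clearing a leftover daemon record (names taken from the listing above; use the daemon name exactly as cephadm ls prints it):
  cephadm ls                                             # on ceph07: daemons found on disk
  cephadm rm-daemon --name osd.33 --fsid $(ceph fsid) --force
  ceph orch ps --refresh                                 # make the mgr re-inventory the hosts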

[ceph-users] ceph osd df tree information missing on one node

2022-12-28 Thread Ml Ml
Hello, after reinstalling one node (ceph06) from backup, the OSDs on that node do not show any disk information with "ceph osd df tree": https://pastebin.com/raw/7zeAx6EC Any hint on how i could fix this? Thanks, Mario ___ ceph-users mailing list --

[ceph-users] 1 pools have many more objects per pg than average

2021-08-17 Thread Ml Ml
Hello, i get: 1 pools have many more objects per pg than average detail: pool cephfs.backup.data objects per pg (203903) is more than 20.307 times cluster average (10041) I set pg_num and pgp_num from 32 to 128 but my autoscaler seems to set them back to 32 again :-/ For Details please see:
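A hedged sketch of pinning pg_num for that pool by turning the autoscaler off for it, or of giving the autoscaler a target so it picks a larger value by itself (the ratio is only an example):
  ceph osd pool set cephfs.backup.data pg_autoscale_mode off
  ceph osd pool set cephfs.backup.data pg_num 128
  ceph osd pool set cephfs.backup.data pgp_num 128
  # or keep autoscaling and hint at the expected share of the cluster:
  ceph osd pool set cephfs.backup.data target_size_ratio 0.2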

[ceph-users] PGs stuck after replacing OSDs

2021-08-17 Thread Ml Ml
Hello List, I am running Proxmox on top of ceph 14.2.20 on the nodes, replica 3, size 2. Last week I wanted to swap the HDDs for SSDs on one node. Since i have 3 Nodes with replica 3, size 2 i did the following: 1.) ceph osd set noout 2.) Stopped all OSDs on that one node 3.) i set the OSDs to
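For reference, a hedged outline of the usual one-node-at-a-time disk swap this post describes (OSD ids are placeholders; this is Nautilus without cephadm, so the classic systemd units apply):
  ceph osd set noout                     # keep CRUSH from rebalancing while the node is down
  systemctl stop ceph-osd@<id>           # for each OSD on that node
  # swap the disks, recreate the OSDs on the new SSDs, then:
  ceph osd unset noout
  ceph -s                                # wait for backfill to finish before the next node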

[ceph-users] Re: Can not mount rbd device anymore

2021-06-25 Thread Ml Ml
Btw: dd bs=1M count=2048 if=/dev/rbd6 of=/dev/null => gives me 50MB/sec. So reading the block device seems to work?! On Fri, Jun 25, 2021 at 12:39 PM Ml Ml wrote: > > I started the mount 15mins ago.: > mount -nv /dev/rbd6 /mnt/backup-cluster5 > > ps: > root 1143 0.2

[ceph-users] Re: Can not mount rbd device anymore

2021-06-25 Thread Ml Ml
--pid=1143: uses about 50KB/sec, so it might mount after a few hours i guess... :-( On Fri, Jun 25, 2021 at 11:39 AM Ilya Dryomov wrote: > > On Fri, Jun 25, 2021 at 11:25 AM Ml Ml wrote: > > > > The rbd Client is not on one of the OSD Nodes. > > > > I now add

[ceph-users] Re: Can not mount rbd device anymore

2021-06-25 Thread Ml Ml
n 23, 2021 at 11:25 AM Ilya Dryomov wrote: > > On Wed, Jun 23, 2021 at 9:59 AM Matthias Ferdinand > wrote: > > > > On Tue, Jun 22, 2021 at 02:36:00PM +0200, Ml Ml wrote: > > > Hello List, > > > > > > all of a sudden i can not mount a specific rbd device

[ceph-users] Re: Can not mount rbd device anymore

2021-06-22 Thread Ml Ml
t 8:36 AM Ml Ml wrote: >> >> Hello List, >> >> all of a sudden i can not mount a specific rbd device anymore: >> >> root@proxmox-backup:~# rbd map backup-proxmox/cluster5 -k >> /etc/ceph/ceph.client.admin.keyring >> /dev/rbd0 >> >> root@proxm

[ceph-users] OT: How to Build a poor man's storage with ceph

2021-06-08 Thread Ml Ml
Hello List, i used to build 3 Node Clusters with spinning rust and later with (Enterprise) SSDs. All i did was buy a 19" Server with 10/12 Slots, plug in the Disks, and i was done. The Requirements were just 10/15TB Disk usage (30-45TB Raw). Now i was asked if i could also build a cheap

[ceph-users] Re: Octopus - unbalanced OSDs

2021-04-19 Thread Ml Ml
Anyone an idea? :) On Fri, Apr 16, 2021 at 3:09 PM Ml Ml wrote: > > Hello List, > > any ideas why my OSDs are that unbalanced ? > > root@ceph01:~# ceph -s > cluster: > id: 5436dd5d-83d4-4dc8-a93b-60ab5db145df > health: HEALTH_WARN > 1 n

[ceph-users] Octopus - unbalanced OSDs

2021-04-16 Thread Ml Ml
Hello List, any ideas why my OSDs are so unbalanced? root@ceph01:~# ceph -s cluster: id: 5436dd5d-83d4-4dc8-a93b-60ab5db145df health: HEALTH_WARN 1 nearfull osd(s) 4 pool(s) nearfull services: mon: 3 daemons, quorum ceph03,ceph01,ceph02 (age 2w)

[ceph-users] HEALTH_WARN - Recovery Stuck?

2021-04-12 Thread Ml Ml
Hello, i kind of ran out of disk space, so i added another host with osd.37. But it does not seem to move much data onto it (85MB in 2h). Any idea why the recovery process seems to be stuck? Should i fix the 4 backfillfull osds first (by changing the weight)? root@ceph01:~# ceph -s cluster:
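One common reason recovery stalls like this is that backfill refuses to involve OSDs that are over the backfillfull threshold. A hedged sketch of inspecting and temporarily raising the ratios (the values are examples and must stay below the full ratio):
  ceph osd dump | grep ratio             # current nearfull / backfillfull / full ratios
  ceph osd set-backfillfull-ratio 0.92
  ceph osd set-nearfull-ratio 0.88
  ceph -s                                # backfill towards the new osd.37 should resume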

[ceph-users] Low Memory Nodes

2020-11-06 Thread Ml Ml
Hello List, i think 3 of 6 Nodes have too little memory. This has the effect that the nodes swap a lot and almost kill themselves. That makes OSDs go down, which triggers a rebalance which does not really help :D I already ordered more ram. Can i temporarily turn down the RAM usage
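A hedged stopgap until the RAM arrives: lower osd_memory_target so the BlueStore caches shrink (2 GiB here is only an example; the default is 4 GiB and going much lower costs performance):
  ceph config set osd osd_memory_target 2147483648     # 2 GiB per OSD daemon
  ceph config get osd.0 osd_memory_target              # verify on any one OSD (id is an example)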

[ceph-users] Re: How to reset Log Levels

2020-11-04 Thread Ml Ml
Is this still debug output or "normal"?: Nov 04 10:19:39 ceph01 bash[2648]: audit 2020-11-04T09:19:38.577088+ mon.ceph03 (mon.0) 7738 : audit [DBG] from='mgr.42824785 10.10.2.103:0/3293316818' entity='mgr.ceph03' cmd=[{"prefix": "mds metadata", "who": "cephfs.ceph04.hrcvab"}]: dispatch Nov 04

[ceph-users] Re: Restart Error: osd.47 already exists in network host

2020-11-02 Thread Ml Ml
e? cluster id is: 5436dd5d-83d4-4dc8-a93b-60ab5db145df On Mon, Nov 2, 2020 at 10:03 AM Eugen Block wrote: > > Hi, > > are you sure it's the right container ID you're using for the restart? > I noticed that 'cephadm ls' shows older containers after a daemon had > to be recrea

[ceph-users] Restart Error: osd.47 already exists in network host

2020-11-02 Thread Ml Ml
Hello List, sometimes some OSDs get taken out for some reason (i am still looking for the reason, and i guess it's due to some overload), however, when i try to restart them i get: Nov 02 08:05:26 ceph05 bash[9811]: Error: No such container: ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df-osd.47 Nov 02
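A hedged sketch for checking what cephadm actually deployed for that OSD and restarting it by its real systemd unit (run on ceph05; the unit name is derived from the fsid in the error message):
  cephadm ls | grep -A5 osd.47          # shows the daemon's name, fsid and systemd unit
  systemctl reset-failed ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@osd.47.service
  systemctl restart ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@osd.47.service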

[ceph-users] How to reset Log Levels

2020-10-29 Thread Ml Ml
Hello, i played around with some log level i can't remember, and my logs are now getting bigger than my DVD movie collection. E.g.: journalctl -b -u ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@mon.ceph03.service > out.file is 1.1GB big. I already tried: ceph tell mon.ceph03 config set debug_mon
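ceph tell only changes the running daemon; anything stored in the mon config database or ceph.conf comes back after a restart. A hedged sketch of finding and removing lingering debug overrides (the option names are examples of the usual debug_* settings):
  ceph config dump | grep debug                   # overrides stored centrally
  ceph config rm mon debug_mon                    # drop the override, back to the default
  ceph config rm global debug_ms
  ceph config show mon.ceph03 | grep debug_mon    # confirm what the daemon now uses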

[ceph-users] can not remove orch service

2020-08-26 Thread Ml Ml
Hello, root@ceph02:~# ceph orch ps NAME HOSTSTATUS REFRESHED AGE VERSIONIMAGE NAME IMAGE ID CONTAINER ID mgr.ceph01ceph01 running (18m) 6s ago 4w 15.2.4 docker.io/ceph/ceph:v15.2.4 54fa7e66fb03 7deebe09f6fd

[ceph-users] Stuck removing osd with orch

2020-07-29 Thread Ml Ml
Hello, yesterday i did: ceph osd purge 32 --yes-i-really-mean-it I also started to upgrade: ceph orch upgrade start --ceph-version 15.2.4 It seems it's really gone: ceph osd crush remove osd.32 => device 'osd.32' does not appear in the crush map ceph orch ps: osd.32

[ceph-users] 6 hosts fail cephadm check (15.2.4)

2020-07-28 Thread Ml Ml
Hello, i get: [WRN] CEPHADM_HOST_CHECK_FAILED: 6 hosts fail cephadm check host ceph01 failed check: Failed to connect to ceph01 (ceph01). Check that the host is reachable and accepts connections using the cephadm SSH key you may want to run: > ssh -F =(ceph cephadm get-ssh-config) -i =(ceph
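A hedged checklist: make sure the public key the orchestrator uses is in root's authorized_keys on every host, then let a fresh mgr re-run the host checks:
  ceph cephadm get-pub-key                 # compare against /root/.ssh/authorized_keys on each host
  ceph cephadm get-ssh-config > /tmp/cephadm_ssh_config
  ssh -F /tmp/cephadm_ssh_config -i /root/.ssh/id_rsa root@ceph01 true   # key path is an assumption
  ceph mgr fail                            # older releases need the active mgr's name as argument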

[ceph-users] Ceph stuck at: objects misplaced (0.064%)

2020-07-08 Thread Ml Ml
Hello, ceph has been stuck for 4 days at 0.064% misplaced and i dunno why. Can anyone help me get it fixed? I restarted some OSDs and reweighted them again to get some data moving, but that did not help. root@node01:~ # ceph -s cluster: id: 251c937e-0b55-48c1-8f34-96e84e4023d4 health:

[ceph-users] bad balacing (octopus)

2020-06-04 Thread Ml Ml
Hello, any idea why it's so badly balanced? e.g.: osd.52 (82%) vs osd.34 (29%) I did run "/usr/bin/ceph osd reweight-by-utilization" from cron for a while, since i was low on space for some time, and that helped a bit. What should i do next? Here is some info: root@ceph01:~# ceph -s cluster:
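A hedged sketch of doing this in a more controlled way: dry-run the reweight first, or hand the job to the upmap balancer, which normally gets OSD utilization within a few percent (it requires all clients to be luminous or newer):
  ceph osd test-reweight-by-utilization 120    # dry run only, shows which OSDs would change
  ceph osd reweight-by-utilization 120
  # or let the balancer do it continuously:
  ceph osd set-require-min-compat-client luminous
  ceph balancer mode upmap
  ceph balancer on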

[ceph-users] ERROR: osd init failed: (1) Operation not permitted

2020-05-29 Thread Ml Ml
Hello List, first of all: Yes - i made mistakes. Now i am trying to recover :-/ I had a healthy 3 node cluster which i wanted to convert to a single one. My goal was to reinstall a fresh 3 Node cluster and start with 2 nodes. I was able to cleanly turn it from a 3 Node Cluster to a 2 Node

[ceph-users] Re: how to restart daemons on 15.2 on Debian 10

2020-05-18 Thread Ml Ml
, May 15, 2020 at 12:27 PM Ml Ml wrote: > > Hello List, > > how do you restart daemons (mgr, mon, osd) on 15.2.1? > > It used to be something like: > systemctl stop ceph-osd@10 > > Or: > systemctl start ceph-mon@ceph03 > > however, those command do nothing on

[ceph-users] Re: how to restart daemons on 15.2 on Debian 10

2020-05-18 Thread Ml Ml
stration tool: > > # ceph orch ps > > and then restart it with the orchestration tool: > > # ceph orch restart {name from ceph orch ps} > > > Hope it helps. > > > Cheers, > > Simon > > > From: Ml Ml > Sent: Friday

[ceph-users] how to restart daemons on 15.2 on Debian 10

2020-05-15 Thread Ml Ml
Hello List, how do you restart daemons (mgr, mon, osd) on 15.2.1? It used to be something like: systemctl stop ceph-osd@10 Or: systemctl start ceph-mon@ceph03 however, those commands do nothing on my setup. Is this because i use cephadm and that docker stuff? The Logs also seem to be
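Under cephadm the daemons live in containers behind per-cluster systemd units, so the old ceph-osd@N / ceph-mon@host names no longer exist. A hedged sketch of both ways to restart one (daemon names are examples):
  ceph orch daemon restart osd.10        # some early Octopus releases use: ceph orch restart <name>
  ceph orch daemon restart mon.ceph03
  # or directly via systemd on the host that runs it:
  systemctl restart ceph-$(ceph fsid)@osd.10.service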

[ceph-users] ceph orch ps => osd (Octopus 15.2.1)

2020-05-14 Thread Ml Ml
Hello, any idea what's wrong with my osd.34+35? root@ceph01:~# ceph orch ps NAMEHOSTSTATUS REFRESHED AGE VERSIONIMAGE NAME IMAGE ID CONTAINER ID (...) osd.34 ceph04 running- - osd.35 ceph04

[ceph-users] Re: How to debug ssh: ceph orch host add ceph01 10.10.1.1

2020-04-23 Thread Ml Ml
Can anyone help me here? :-/ On Wed, Apr 22, 2020 at 10:36 PM Ml Ml wrote: > > Hello List, > > i did: > root@ceph01:~# ceph cephadm set-ssh-config -i /tmp/ssh_conf > > root@ceph01:~# cat /tmp/ssh_conf > Host * > User root > StrictHostKeyChecking no > UserKnownHo

[ceph-users] How to debug ssh: ceph orch host add ceph01 10.10.1.1

2020-04-22 Thread Ml Ml
Hello List, i did: root@ceph01:~# ceph cephadm set-ssh-config -i /tmp/ssh_conf root@ceph01:~# cat /tmp/ssh_conf Host * User root StrictHostKeyChecking no UserKnownHostsFile /dev/null root@ceph01:~# ceph config-key set mgr/cephadm/ssh_identity_key -i /root/.ssh/id_rsa set
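A hedged way to replay exactly what the orchestrator will attempt, using the config and key stored above, before retrying the host add:
  ceph cephadm get-ssh-config > /tmp/cephadm_ssh_config
  ceph config-key get mgr/cephadm/ssh_identity_key > /tmp/cephadm_id && chmod 600 /tmp/cephadm_id
  ssh -F /tmp/cephadm_ssh_config -i /tmp/cephadm_id root@ceph01 hostname
  ceph orch host add ceph01 10.10.1.1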

[ceph-users] How to remove a daemon from orch

2020-04-22 Thread Ml Ml
Hello list, i somehow have this "mgr.cph02 ceph02 stopped " line here. root@ceph01:~# ceph orch ps NAMEHOSTSTATUSREFRESHED AGE VERSIONIMAGE NAME IMAGE ID CONTAINER ID mgr.ceph02 ceph02 running (2w) 2w ago -15.2.0 docker.io/ceph/ceph:v15
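A hedged sketch of removing the stale record (the daemon name is taken from the listing; if the orchestrator refuses because nothing is actually deployed, cephadm rm-daemon on the host is the fallback):
  ceph orch daemon rm mgr.cph02 --force
  ceph orch ps --refresh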

[ceph-users] Re: octopus upgrade stuck: Assertion `map->require_osd_release >= ceph_release_t::mimic' failed.

2020-03-25 Thread Ml Ml
20-03-25T22:10:13.107+0100 7f0bd5320e00 0 osd.32 57223 done with init, starting boot process 2020-03-25T22:10:13.107+0100 7f0bd5320e00 1 osd.32 57223 start_boot does the line: check_osdmap_features require_osd_release unknown -> luminous mean it thinks the local osd itself is luminous? On Wed, Mar 25, 20

[ceph-users] octopus upgrade stuck: Assertion `map->require_osd_release >= ceph_release_t::mimic' failed.

2020-03-25 Thread Ml Ml
Hello List, i followed: https://ceph.io/releases/v15-2-0-octopus-released/ I came from a healthy nautilus and i am stuck at: 5.) Upgrade all OSDs by installing the new packages and restarting the ceph-osd daemons on all OSD hosts. When i try to start an osd like this, i get:
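That assertion usually means the OSD map's require_osd_release is still set to a pre-mimic release; the Octopus upgrade notes expect it to already be nautilus before the OSDs come up on the new code. A hedged check and fix, run while the mons are still up:
  ceph osd dump | grep require_osd_release
  ceph osd require-osd-release nautilus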

[ceph-users] Re: OSDs wont mount on Debian 10 (Buster) with Nautilus

2020-03-25 Thread Ml Ml
to bluestore or something? > > > On Wed, Mar 25, 2020 at 4:23 PM Marc Roos > wrote: > > > > Try this > > > > chown ceph.ceph /dev/sdc2 > > chown ceph.ceph /dev/sdd2 > > chown ceph.ceph /dev/sde2 > > chown ceph.ceph /dev/sdf2 > > ch

[ceph-users] Re: OSDs wont mount on Debian 10 (Buster) with Nautilus

2020-03-25 Thread Ml Ml
e2 > chown ceph.ceph /dev/sdf2 > chown ceph.ceph /dev/sdg2 > chown ceph.ceph /dev/sdh2 > > > > -Original Message- > From: Ml Ml [mailto:mliebher...@googlemail.com] > Sent: 25 March 2020 16:22 > To: Marc Roos > Subject: Re: [ceph-users] OSDs wont mount on De

[ceph-users] OSDs wont mount on Debian 10 (Buster) with Nautilus

2020-03-25 Thread Ml Ml
Hello list, i upgraded to Debian 10, and after that i upgraded from luminous to nautilus. I restarted the mons, then the OSDs. Everything was up and healthy. After rebooting a node, only 3 of 10 OSDs start up: -4 20.07686 host ceph03 4 hdd 2.67020 osd.4 down 1.0 1.0
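Those OSDs were most likely created with ceph-disk, which was removed in Nautilus, so nothing mounts their data partitions at boot anymore. A hedged sketch of adopting them with ceph-volume on the affected node:
  ceph-volume simple scan              # with no arguments it scans mounted ceph-disk OSDs; pass the data partition (e.g. /dev/sdc1) for ones that are not mounted
  ceph-volume simple activate --all    # mounts and starts them and enables systemd units for future boots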

[ceph-users] rbd-mirror -> how far behind_master am i time wise?

2020-03-24 Thread Ml Ml
Hello List, i use rbd-mirror and i asynchronously mirror to my backup cluster. My backup cluster only has "spinning rust" and won't always be able to perform like the live cluster. That is fine for me, as long as it's not more than 12h behind. vm-194-disk-1: global_id:
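There is no direct "seconds behind" figure for journal-based mirroring, but the image status on the backup cluster shows the replay state, its last_update timestamp and the entries_behind_master counter, which together give a feel for the lag. A hedged example (pool name borrowed from the other rbd-mirror threads):
  rbd --cluster backup mirror image status cluster5-rbd/vm-194-disk-1
  rbd --cluster backup mirror pool status cluster5-rbd --verbose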

[ceph-users] Re: How to recover/mount mirrored rbd image for file recovery

2020-03-23 Thread Ml Ml
re's a way to directly access the remote image > > since it's read-only. > > > > Regards, > > Eugen > > > > > > Zitat von Ml Ml : > > > > > Hello, > > > > > > my goal is to back up a proxmox cluster with rbd-mirror for de

[ceph-users] How to recover/mount mirrored rbd image for file recovery

2020-03-19 Thread Ml Ml
Hello, my goal is to back up a proxmox cluster with rbd-mirror for disaster recovery. Promoting/demoting, etc. works great. But how can i access a single file on the mirrored cluster? I tried: root@ceph01:~# rbd-nbd --read-only map cluster5-rbd/vm-114-disk-1 --cluster backup /dev/nbd1
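Continuing that approach, a hedged sketch of mounting the mapped device read-only; the nbd device and the ext4 filesystem are assumptions, and noload skips journal replay, which a read-only crash-consistent image cannot perform:
  rbd-nbd --read-only map cluster5-rbd/vm-114-disk-1 --cluster backup   # prints e.g. /dev/nbd1
  mount -o ro,noload /dev/nbd1 /mnt/restore     # for xfs use: -o ro,norecovery
  # copy the files out, then clean up:
  umount /mnt/restore
  rbd-nbd unmap /dev/nbd1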

[ceph-users] Point-in-Time Recovery

2020-03-13 Thread Ml Ml
Hello List, when reading: https://docs.ceph.com/docs/master/rbd/rbd-mirroring/ it says: (...)Journal-based: This mode uses the RBD journaling image feature to ensure point-in-time, crash-consistent replication between clusters(...) Does this mean that we have some kind of

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
. Looking at the modes, should i change from Journal-based to Snapshot-based mirroring? Thanks, Michael On Tue, Mar 10, 2020 at 3:43 PM Jason Dillaman wrote: > > On Tue, Mar 10, 2020 at 10:36 AM Ml Ml wrote: > > > > Hello Jason, > > > > thanks for that fast reply.

[ceph-users] Re: rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
node (where the rbd-mirror runs): 8.92 Gbits/sec Any other idea? Thanks, Michael On Tue, Mar 10, 2020 at 2:19 PM Jason Dillaman wrote: > > On Tue, Mar 10, 2020 at 6:47 AM Ml Ml wrote: > > > > Hello List, > > > > when i initially enable journal/mirror on an image

[ceph-users] rbd-mirror replay is very slow - but initial bootstrap is fast

2020-03-10 Thread Ml Ml
Hello List, when i initially enable journaling/mirroring on an image it gets bootstrapped to my site-b pretty quickly at 250MB/sec, which is about the IO write limit. Once it's up to date, the replay is very slow, about 15KB/sec, and entries_behind_master just keeps growing: root@ceph01:~# rbd
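A hedged way to watch whether the replay is making progress at all, plus the snapshot-based alternative raised later in the thread (Octopus or newer; pool and image names are placeholders, and switching modes means disabling journal mirroring on the image first, which drops the non-primary copy until it re-syncs):
  rbd mirror image status <pool>/<image>          # entries_behind_master should shrink, not grow
  # switching to snapshot mode (after disabling journal mirroring on the image):
  rbd mirror image enable <pool>/<image> snapshot
  rbd mirror snapshot schedule add --pool <pool> --image <image> 1h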

[ceph-users] ERROR: osd init failed: (1) Operation not permitted

2020-02-10 Thread Ml Ml
Hello List, first of all: Yes - i made mistakes. Now i am trying to recover :-/ I had a healthy 3 node cluster which i wanted to convert to a single one. My goal was to reinstall a fresh 3 Node cluster and start with 2 nodes. I was able to cleanly turn it from a 3 Node Cluster to a 2 Node