[ceph-users] mds crash loop

2019-11-05 Thread Karsten Nielsen
Hi, Last week I upgraded my ceph cluster from Luminous to Mimic 13.2.6. It was running fine for a while, but yesterday my MDS went into a crash loop. I have 1 active and 1 standby MDS for my CephFS, both of which are running the same crash loop. I am running ceph based on https://hub.docker.com/r/cep

[ceph-users] Re: mds crash loop

2019-11-05 Thread Karsten Nielsen
from ceph -w [root@k8s-node-01 /]# ceph -w cluster: id: 571d4bfe-2c5d-45ca-8da1-91dcaf69942c health: HEALTH_WARN 1 filesystem is degraded services: mon: 3 daemons, quorum k8s-node-00,k8s-node-01,k8s-node-02 mgr: k8s-node-01(active) mds: cephfs-1/1/1 up rec
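
A minimal diagnostic sketch for this situation (standard Ceph CLI commands, assuming admin access to the cluster) to see which MDS rank is stuck and in which state:

  ceph health detail     # names the degraded filesystem and the affected rank
  ceph fs status         # per-rank MDS state (replay, rejoin, active, ...)
  ceph mds stat          # compact MDS map summary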

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread vitalif
Use the `cache=writeback` QEMU option for HDD clusters; that should solve your issue. Hi, I recently upgraded my 3-node cluster to proxmox 6 / debian-10 and recreated my ceph cluster with a new release (14.2.4 bluestore) - basically hoping to gain some I/O speed. The installation went flawlessly, r
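
For reference, a minimal sketch of how this cache mode is passed straight to QEMU (Proxmox sets the same thing per disk in the VM's hardware options); the pool and image names here are placeholders:

  qemu-system-x86_64 ... \
      -drive file=rbd:rbd/vm-100-disk-0,format=raw,if=virtio,cache=writeback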

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Hermann Himmelbauer
Hi, Thank you for your quick reply. Proxmox offers me "writeback" (cache=writeback) and "writeback unsafe" (cache=unsafe); however, for my "dd" test this makes no difference at all. I still get write speeds of ~4.5 MB/s. Perhaps "dd" disables the write cache? Would it perhaps help to put the
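
dd itself does not disable the cache, but it only benefits from writeback when it is not asked to bypass or flush it. A rough sketch of two runs to compare inside the guest (device path, block size and count are arbitrary example values):

  dd if=/dev/zero of=/mnt/test.img bs=4M count=256 conv=fdatasync   # goes through the page cache, flushes once at the end
  dd if=/dev/zero of=/mnt/test.img bs=4M count=256 oflag=direct     # bypasses the guest page cache entirely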

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Виталий Филиппов
Yes, cache=unsafe has no effect with RBD. Hm, that's strange; you should get ~40*6 MB/s linear write with 6 HDDs and Bluestore. Try to create a test image and test it with 'fio -ioengine=rbd -name=test -direct=1 -rw=write -bs=4M -iodepth=16 -pool= -rbdname=' from outside a VM. If you still get
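
A filled-in version of that benchmark, with a throwaway image name purely as an example (adjust the pool to an existing one; the image must be created first and can be removed afterwards):

  rbd create rbd/fio-test --size 10G
  fio -ioengine=rbd -name=test -direct=1 -rw=write -bs=4M -iodepth=16 \
      -pool=rbd -rbdname=fio-test -size=10G
  rbd rm rbd/fio-test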

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Simon Ironside
Hi, My three-node lab cluster is similar to yours but with 3x bluestore OSDs per node (4TB SATA spinning disks) and 1x shared DB/WAL (240GB SATA SSD) device per node. I'm only using gigabit networking (one interface public, one interface cluster), also ceph 14.2.4 with 3x replicas. I would ha

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Marc Roos
Good that you are getting 124 MB/s via Gbit; I have only been able to get 110 MB/s. If you are interested, I also have 4TB SATA HDDs without DB/WAL on SSD, 4 nodes, but 10Gbit: [@]# dd if=/dev/zero of=zero.file bs=32M oflag=direct status=progress 3758096384 bytes (3.8 GB) copied, 36.364817 s, 103

[ceph-users] Re: Slow write speed on 3-node cluster with 6* SATA Harddisks (~ 3.5 MB/s)

2019-11-05 Thread Paul Emmerich
On Mon, Nov 4, 2019 at 11:44 PM Hermann Himmelbauer wrote: > > Hi, > I recently upgraded my 3-node cluster to proxmox 6 / debian-10 and > recreated my ceph cluster with a new release (14.2.4 bluestore) - > basically hoping to gain some I/O speed. > > The installation went flawlessly, reading is fa

[ceph-users] Re: multiple pgs down with all disks online

2019-11-05 Thread Kári Bertilsson
This turned out to be because another OSD (no. 90) went missing, and the PG wanted to query the missing OSD for an unfound object even though this OSD was not part of the PG. I found this out by running `ceph pg 41.3db query`. Marking OSD 90 as lost resolved the issue and all PGs became active. A tiny bit
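
For reference, a sketch of the two commands involved, using the PG and OSD ids from the message above (marking an OSD lost is destructive and only appropriate when the OSD and its data are definitely gone):

  ceph pg 41.3db query                      # inspect recovery_state / might_have_unfound for the blocking OSD
  ceph osd lost 90 --yes-i-really-mean-it   # declare OSD 90 permanently lost so peering can proceed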

[ceph-users] stretch repository only has ceph-deploy

2019-11-05 Thread Rodrigo Severo - Fábrica
Hi, I'm trying Ceph for the first time. I'm trying to use the repository below: deb https://download.ceph.com/debian-nautilus/ stretch main But it seems that this repository only has the ceph-deploy package, not the rest of Ceph. Why is that? How can I get all the updated Nautilus packages? Reg
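
For context, a minimal sketch of the repository setup; the key URL is the official Ceph release key, and using 'buster' instead of 'stretch' assumes full Nautilus builds are published for that suite (see the reply below):

  wget -q -O- https://download.ceph.com/keys/release.asc | apt-key add -
  echo "deb https://download.ceph.com/debian-nautilus/ buster main" > /etc/apt/sources.list.d/ceph.list
  apt update && apt install ceph ceph-common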

[ceph-users] Re: stretch repository only has ceph-deploy

2019-11-05 Thread Paul Emmerich
We maintain Nautilus packages for Buster; see https://croit.io/2019/07/07/2019-07-07-debian-mirror However, Stretch will probably never support Nautilus because Debian doesn't do backports of GCC. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io cro

[ceph-users] Re: Balancer configuration fails with Error EINVAL: unrecognized config option 'mgr/balancer/max_misplaced'

2019-11-05 Thread 王予智
This option has been replaced with "target_max_misplaced_ratio" since Nautilus (v14.2.0). See the description of "target_max_misplaced_ratio" in the changelog here: https://docs.ceph.com/docs/master/releases/nautilus/ Thomas Schneider <74cmo...@gmail.com> wrote on Mon, Nov 4, 2019 at 5:00 PM: > Hi, > > I want to adj
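
With the Nautilus centralized configuration, a minimal sketch of adjusting it (0.07 is just an example value; the default is 0.05):

  ceph config set mgr target_max_misplaced_ratio 0.07
  ceph config get mgr target_max_misplaced_ratio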

[ceph-users] Re: mds crash loop

2019-11-05 Thread Yan, Zheng
On Tue, Nov 5, 2019 at 5:29 PM Karsten Nielsen wrote: > > Hi, > > Last week I upgraded my ceph cluster from luminus to mimic 13.2.6 > It was running fine for a while but yesterday my mds went into a crash loop. > > I have 1 active and 1 standby mds for my cephfs both of which is running the > sam