[ceph-users] Re: Bucket deletion is very slow.

2021-08-11 Thread mhnx
Thanks for the answer. I already tried --max-concurrent-ios=2048 and 4096 but it didn't change much. Have you tried running the rm command on different gateways at the same time? I wonder if that is a bad idea? On Fri, 6 Aug 2021 at 10:55, Płaza Tomasz wrote: > Hi, > > We use: radosgw-admin bucket
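
For reference, the removal command discussed in this thread typically takes a form like the sketch below; the bucket name is a placeholder and the concurrency value is only an example:

    # remove a bucket together with all of its objects; a higher
    # --max-concurrent-ios issues more deletions in parallel
    radosgw-admin bucket rm --bucket=<bucket-name> --purge-objects --max-concurrent-ios=2048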

[ceph-users] Re: Announcing go-ceph v0.11.0

2021-08-11 Thread Wido den Hollander
On 10-08-2021 at 22:32, Xiaolong Jiang wrote: Thank you for your response. We are making a choice between Go and Java. Internally, our company has very good Java ecosystem support with Spring, which I don't want to lose. The Go binding is being actively developed/supported, so it's a better choice

[ceph-users] Re: The cluster expands the osd, but the storage pool space becomes smaller

2021-08-11 Thread David Yang
Hi, thank you very much, I will add this information. Currently 7 nodes are used to run the cluster: node1: mon+mgr+mds, node2: mon+mgr+mds, osd1: mon+mgr+mds+osd, osd2: osd, osd3: osd, osd4: osd, osd5: osd. Each osd node is configured with 12*10T HDD, 1*1.5T NVMe SSD, 1*150G SSD; Operating system: CentOS

[ceph-users] How to safely turn off a ceph cluster

2021-08-11 Thread huxia...@horebdata.cn
Dear Cephers, We are informed that the power supply of one 10-node Ceph cluster will be cut off for a period of maintenance. Is there a ceph command to put Ceph into a maintenance status, so that I can safely turn off each node of the Ceph cluster? Any suggestions or comments would be highly appreciated

[ceph-users] Re: Is it a bad Idea to build a Ceph Cluster over different Data Centers?

2021-08-11 Thread Janne Johansson
On Tue, 10 Aug 2021 at 21:59, Ralph Soika wrote: > communicate via public IPs. I did some tests and - not surprisingly - the > network performance goes down to 500Mbit/s between two nodes in two > different data centers. > My questions are: > > 1. Is 500Mbit/s too slow to run a ceph cluster? Th

[ceph-users] Re: How to safely turn off a ceph cluster

2021-08-11 Thread Robert Sander
Hi, On 11.08.21 10:02, huxia...@horebdata.cn wrote: > We are informed that the power supply of one 10-node ceph cluster will be cut > off for a period of maintenance time. Is there a ceph command to put Ceph in > a maintenance status, so that I can safely turn off each node of the Ceph cluster? cep
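
The reply above is cut off; as a general sketch (not necessarily the exact commands in the reply), a planned power-down usually means stopping client I/O and setting flags that prevent recovery while nodes are offline, then unsetting them after power-up:

    # before shutting the nodes down
    ceph osd set noout
    ceph osd set norecover
    ceph osd set norebalance
    ceph osd set nobackfill
    ceph osd set nodown
    ceph osd set pause

    # after all nodes are powered back on and the cluster is healthy
    ceph osd unset pause
    ceph osd unset nodown
    ceph osd unset nobackfill
    ceph osd unset norebalance
    ceph osd unset norecover
    ceph osd unset noout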

[ceph-users] Re: How to safely turn off a ceph cluster

2021-08-11 Thread Eugen Block
Hi, there's plenty of information available online, e.g. the Red Hat docs [1], mailing list threads [2]. [1] https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/4/html/administration_guide/understanding-process-management-for-ceph#powering-down-and-rebooting-a-red-hat-ceph-st

[ceph-users] Re: The cluster expands the osd, but the storage pool space becomes smaller

2021-08-11 Thread Eneko Lacunza
Hi David, You need to provide the details for each node: OSDs with their size, and the pool configuration. On 11/8/21 at 5:30, David Yang wrote: There is also a set of mon+mgr+mds running on one of the storage nodes. On Wed, 11 Aug 2021 at 11:24, David Yang wrote: hi I have a cluster of 5 storage nod
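
A sketch of commands that would collect the kind of detail being asked for here (per-OSD sizes and usage, plus pool configuration):

    ceph osd df tree         # per-OSD size, utilisation and CRUSH placement
    ceph osd pool ls detail  # replication / erasure-coding settings per pool
    ceph df                  # raw capacity versus per-pool available space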

[ceph-users] Re: The cluster expands the osd, but the storage pool space becomes smaller

2021-08-11 Thread Nico Schottelius
Hey David, that is a normal process due to rebalancing. After the rebalancing is done, you will have more space. Best regards, Nico David Yang writes: > There is also a set of mon+mgr+mds running on one of the storage nodes. > On Wed, 11 Aug 2021 at 11:24, David Yang wrote: > >> hi >> I have a cluster

[ceph-users] Very slow I/O during rebalance - options to tune?

2021-08-11 Thread Nico Schottelius
Good morning, after removing 3 OSDs which had been dead for some time, rebalancing started this morning and is making client I/O really slow (in the 10~30 MB/s range!). Rebalancing started at 1.2 ~ 1.6 Gb/s after issuing ceph tell 'osd.*' injectargs --osd-max-backfills=1 --osd-recovery-max-acti
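
The quoted command is truncated above; purely as an illustration, runtime throttling of recovery usually looks like the following (option values are examples, not recommendations):

    ceph tell 'osd.*' injectargs '--osd-max-backfills=1 --osd-recovery-max-active=1 --osd-recovery-sleep-hdd=0.1'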

[ceph-users] ceph osd continously fails

2021-08-11 Thread Amudhan P
Hi, I am using ceph version 15.2.7 in a 4-node cluster. My OSDs are continuously stopping, and even if I start them again they stop after some time. I couldn't find anything in the logs. I have set norecover and nobackfill; as soon as I unset norecover the OSDs start to fail. cluster: id: b6437922-3edf-
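
For reference, a sketch of how those cluster-wide flags are toggled while debugging:

    ceph osd set norecover
    ceph osd set nobackfill
    ceph osd unset norecover   # the step after which the OSDs reportedly start failing
    ceph -s                    # watch cluster state while the flag is unset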

[ceph-users] Re: Very slow I/O during rebalance - options to tune?

2021-08-11 Thread Peter Lieven
Have you tried setting osd op queue cut off to high? Peter > On 11.08.2021 at 15:24, Frank Schilder wrote: > > The recovery_sleep options are the next choice to look at. Increase them and > clients will get more I/O time slots. However, with your settings, I'm > surprised clients are impacted
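
A sketch of how that option could be applied, assuming a release that supports the centralized config database; the setting may only take effect after an OSD restart:

    ceph config set osd osd_op_queue_cut_off high
    ceph config get osd.0 osd_op_queue_cut_off   # verify the value on one OSD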

[ceph-users] Docker container snapshots accumulate until disk full failure?

2021-08-11 Thread Harry G. Coin
Does Ceph remove container subvolumes holding previous revisions of daemon images after upgrades? I have a couple of servers using btrfs to hold the containers. The number of docker-related subvolumes just keeps growing, way beyond the number of daemons running. If I ignore this, I'll get disk-full failures
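
If nothing prunes old images automatically, cleanup on a docker-based host is usually manual; a sketch (podman would be analogous):

    docker image ls                        # image revisions still kept on the host
    docker image prune -a                  # drop images not referenced by any container
    btrfs subvolume list /var/lib/docker   # inspect the subvolumes docker has created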

[ceph-users] Re: ceph osd continously fails

2021-08-11 Thread Amudhan P
Hi, below are the logs from one of the failed OSDs. Aug 11 16:55:48 bash[27152]: debug-20> 2021-08-11T11:25:47.433+ 7fbf3b819700 3 osd.12 6697 handle_osd_map epochs [6696,6697], i have 6697, src has [ Aug 11 16:55:48 bash[27152]: debug-19> 2021-08-11T11:25:47.433+ 7fbf32006700 5 os

[ceph-users] Re: Very slow I/O during rebalance - options to tune?

2021-08-11 Thread Frank Schilder
The recovery_sleep options are the next choice to look at. Increase them and clients will get more I/O time slots. However, with your settings, I'm surprised clients are impacted at all. I usually leave the op-priority at its default and use osd-max-backfill=2..4 for HDDs. With this, clients usually
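
As an illustration of the knobs mentioned here (values are examples only, persisted via the config database rather than injectargs):

    ceph config set osd osd_recovery_sleep_hdd 0.2   # a longer sleep gives clients more I/O slots
    ceph config set osd osd_max_backfills 2          # the 2..4 range suggested above for HDDs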

[ceph-users] Getting alarm emails every 600s after Ceph Pacific install

2021-08-11 Thread Stefan Schneebeli
Hi all, I just installed a new Ceph Pacific cluster with ceph-ansible, which worked very well. But now one of the mon servers sends an email every 600s with this subject: ALERT localhost/trap: trap timeout and this body: Summary output: trap timeout Group : localho