[ceph-users] Re: Unbalanced data distribution

2019-10-22 Thread Konstantin Shalygin
On 10/23/19 1:14 PM, Thomas Schneider wrote: My understanding is that Ceph's algorithm should be smart enough to determine which object should be placed where and ensure balanced utilisation. I agree that there would be a major impact if a node with 7.2TB disks goes down, though. Ceph doesn't care about

[ceph-users] Re: Unbalanced data distribution

2019-10-22 Thread Thomas Schneider
The number of PGs on the 7.2TB disks is 120 on average, and the number of PGs on the 1.6TB disks is 35 on average. This is a difference of a factor of 3-4. However, I don't understand why this should explain the unbalanced data distribution on the 1.6TB disks only (the 7.2TB disks are balanced). And all the

[ceph-users] Re: Unbalanced data distribution

2019-10-22 Thread Anthony D'Atri
I agree wrt making the node weights uniform. When mixing drive sizes, be careful that the larger ones don't run afoul of the PG max: they will receive more PGs than the smaller ones, and if you lose a node, that might be enough to send some over the max. Run 'ceph osd df' and look at the PG c
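Spelled out, that check looks roughly like this (on Mimic or newer; the exact column layout varies by release):

   # Per-OSD utilization and PG count; the PGS column shows how many
   # more PGs the large OSDs carry than the small ones.
   ceph osd df tree

   # The per-OSD PG ceiling being referred to (commonly 200-250,
   # depending on release); exceeding it can block PG activation.
   ceph config get mon mon_max_pg_per_osd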

[ceph-users] Re: Unbalanced data distribution

2019-10-22 Thread Konstantin Shalygin
On 10/22/19 7:52 PM, Thomas wrote: Node 1 48x 1.6TB Node 2 48x 1.6TB Node 3 48x 1.6TB Node 4 48x 1.6TB Node 5 48x 7.2TB Node 6 48x 7.2TB Node 7 48x 7.2TB I suggest balancing the disks across hosts, e.g. ~28x 1.6TB + 20x 7.2TB per host. Why is the data distribution on the 1.6TB disks unequal? How can
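Not part of this exchange, but the usual answer to uneven per-OSD utilization within one drive size is the mgr balancer in upmap mode; a minimal sketch, assuming Luminous-or-newer clients:

   ceph osd set-require-min-compat-client luminous   # pg-upmap needs Luminous+ clients
   ceph balancer mode upmap                           # balance by remapping individual PGs
   ceph balancer on
   ceph balancer status                               # watch the plan being applied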

[ceph-users] Re: mix ceph-disk and ceph-volume

2019-10-22 Thread David Turner
Yes, there is nothing wrong with this, and it has been a common scenario for people during their migration from FileStore to BlueStore. On Tue, Oct 22, 2019, 9:46 PM Frank R wrote: > Is it ok to create a new OSD using ceph-volume on a server where the other OSDs were created with ceph-disk? > > th
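For reference, a sketch of how the mixed setup typically looks in practice (the device path is a placeholder):

   # Create the new BlueStore OSD with ceph-volume on a free device.
   ceph-volume lvm create --bluestore --data /dev/sdx

   # Optionally, hand the existing ceph-disk OSDs over to ceph-volume
   # management later via its 'simple' mode.
   ceph-volume simple scan
   ceph-volume simple activate --all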

[ceph-users] Re: minimum osd size?

2019-10-22 Thread David Turner
I did a set of 30GB OSDs before, using extra disk space on my SSDs for the CephFS metadata pool, and my entire cluster locked up about 3 weeks later. Some metadata operation was happening, filled some of the 30GB disks to 100%, and all IO was blocked in the cluster. I did some trickery of deleting

[ceph-users] mix ceph-disk and ceph-volume

2019-10-22 Thread Frank R
Is it ok to create a new OSD using ceph-volume on a server where the other OSDs were created with ceph-disk? thx Frank

[ceph-users] minimum osd size?

2019-10-22 Thread Frank R
Hi all, I have 40 NVMe drives with about 20G free space each. Would creating a 10GB partition/LVM on each of the NVMes for an RGW index pool be a bad idea? RGW has about 5 million objects. I don't think space will be an issue, but I am worried about the 10G size; is it just too small for a

[ceph-users] Help unsubscribe please

2019-10-22 Thread Sumit Gaur
On Wed, 23 Oct 2019 at 3:12 am, wrote: > Send ceph-users mailing list submissions to > ceph-users@ceph.io > > To subscribe or unsubscribe via email, send a message with subject or > body 'help' to > ceph-users-requ...@ceph.io > > You can reach the person managing the list at >

[ceph-users] Re: Nautilus power outage - 2/3 mons and mgrs dead and no cephfs

2019-10-22 Thread Alex L
Hi Wido, It was one of the first things I checked, yes, and it was synced properly. I have the full logs, but since everything works now, I am unsure if I should upload them to the tracker? Thanks, A

[ceph-users] Re: rgw multisite failover

2019-10-22 Thread Ed Fisher
> On Oct 18, 2019, at 10:40 PM, Frank R wrote: > > I am looking to change an RGW multisite deployment so that the secondary will > become master. This is meant to be a permanent change. > > Per: > https://docs.ceph.com/docs/mimic/radosgw/multisite/ >
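For context, the promotion sequence in the linked multisite documentation is roughly the following, run against the secondary zone being promoted (the zone name is a placeholder):

   # Promote the secondary zone to master and make it the default.
   radosgw-admin zone modify --rgw-zone=us-east-2 --master --default

   # Commit the new period so the change propagates.
   radosgw-admin period update --commit

   # Restart the gateways in that zone so they pick up the new period.
   systemctl restart ceph-radosgw.target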

[ceph-users] Re: Fwd: large concurrent rbd operations block for over 15 mins!

2019-10-22 Thread Mark Nelson
Out of curiosity, when you chose EC over replication, how did you weigh IOPS vs space amplification in your decision-making process? I'm wondering if we should prioritize EC latency vs other tasks in future tuning efforts (it's always a tradeoff deciding what to focus on). Thanks, Mark On 1
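For anyone weighing the same choice, the space-amplification half of that tradeoff is plain arithmetic (raw/usable = (k+m)/k for EC, = replica count for replication); the IOPS/latency half is the part that needs measuring:

   3x replication:   raw/usable = 3.0   (1 TB of data occupies 3 TB raw)
   EC k=4, m=2:      raw/usable = (4+2)/4 = 1.5
   EC k=8, m=3:      raw/usable = (8+3)/8 ≈ 1.38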

[ceph-users] Re: Replace ceph osd in a container

2019-10-22 Thread Sasha Litvak
Frank, Thank you for your suggestion. It sounds very promising. I will definitely try it. Best, On Tue, Oct 22, 2019, 2:44 AM Frank Schilder wrote: > > I am suspecting that mon or mgr have no access to /dev or /var/lib while > osd containers do. > > Cluster configured originally by ceph-ansi

[ceph-users] Unbalanced data distribution

2019-10-22 Thread Thomas
Hi, in my 7-node OSD cluster I have the following disks: Node 1 48x 1.6TB Node 2 48x 1.6TB Node 3 48x 1.6TB Node 4 48x 1.6TB Node 5 48x 7.2TB Node 6 48x 7.2TB Node 7 48x 7.2TB The disk sizes are represented in the CRUSH map accordingly. For these disks only, I created a pool "hdb_backup" with size 3.
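Not stated in the preview, but pinning a pool to one set of disks like this is usually done with a device-class CRUSH rule; an illustrative sketch (the rule name and PG count are hypothetical, while the pool name and size 3 come from the thread):

   # Replicated rule that only picks OSDs of class "hdd".
   ceph osd crush rule create-replicated hdd-only default host hdd

   # Pool on that rule with 3 replicas.
   ceph osd pool create hdb_backup 4096 4096 replicated hdd-only
   ceph osd pool set hdb_backup size 3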

[ceph-users] Re: ceph mon failed to start

2019-10-22 Thread 徐蕴
I followed the instructions from https://docs.ceph.com/docs/jewel/rados/troubleshooting/troubleshooting-mon/#recovery-using-osds, but the mon still can't come up. The log shows: 2019-10-22 19:20:17.096346 7f1e0e9de680 0 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe), process ceph-mon
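For anyone landing here with the same problem, the linked recovery-using-osds procedure boils down to rebuilding the mon store from the OSDs; a compressed paraphrase (paths are placeholders, and the exact tooling differs between Jewel and later releases, so follow the version-specific page):

   ms=/root/mon-store; mkdir -p $ms
   # Pull cluster maps out of every (stopped) OSD on the host.
   for osd in /var/lib/ceph/osd/ceph-*; do
       ceph-objectstore-tool --data-path $osd --op update-mon-db --mon-store-path $ms
   done
   # Rebuild a monitor store from the collected maps (the keyring must
   # contain the mon. and client.admin keys), then back up the corrupted
   # store.db and move the rebuilt one into its place.
   ceph-monstore-tool $ms rebuild -- --keyring /etc/ceph/ceph.client.admin.keyring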

[ceph-users] Re: multiple nvme per osd

2019-10-22 Thread ceph
We did that too, using ceph-volume lvm batch --dmcrypt --osds-per-device 2 /dev/nvme0n1 (it was on a Mimic cluster). On 10/22/19 8:17 AM, Ingo Schmidt wrote: > Hi Frank > > We use such a setup on our Nautilus cluster. I manually partitioned the NVMe drive into 8 equally sized partitions with f
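If ceph-volume is new enough, --report gives a dry run of that exact batch command before anything is created:

   # Preview the planned split of the device into two encrypted OSDs.
   ceph-volume lvm batch --report --dmcrypt --osds-per-device 2 /dev/nvme0n1
   # Re-run without --report to actually create them.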

[ceph-users] Re: ceph mon failed to start

2019-10-22 Thread 徐蕴
It seems that ceph-kvstore-tool is not available in 10.2.10/Jewel. > On Oct 22, 2019 at 5:28 PM, huang jun wrote: > > Try this https://docs.ceph.com/docs/mimic/man/8/ceph-kvstore-tool/ and > use the 'repair' operation > > 徐蕴 wrote on Tue, Oct 22, 2019 at 3:51 PM: >> >> Hi, >> >> Our cluster got an unexpected power o

[ceph-users] Re: ceph mon failed to start

2019-10-22 Thread huang jun
Try this https://docs.ceph.com/docs/mimic/man/8/ceph-kvstore-tool/ and use the 'repair' operation. 徐蕴 wrote on Tue, Oct 22, 2019 at 3:51 PM: > > Hi, > > Our cluster got an unexpected power outage. Ceph mon cannot start after that. > The log shows: > > Running command: '/usr/bin/ceph-mon -f -i 10.10.198.11 --pu
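Spelled out, the invocation shape for that tool is 'ceph-kvstore-tool <leveldb|rocksdb> <store path> <command>', so the suggestion amounts to something like the following (the operation name is taken from the mail above; check the man page of your release, since the available commands have changed over time and, as the follow-up notes, the tool does not ship with Jewel):

   systemctl stop ceph-mon@<mon-id>    # always stop the mon before touching its store
   ceph-kvstore-tool rocksdb /var/lib/ceph/mon/ceph-<mon-id>/store.db repair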

[ceph-users] ceph mon failed to start

2019-10-22 Thread 徐蕴
Hi, our cluster got an unexpected power outage, and the Ceph mon cannot start after that. The log shows: Running command: '/usr/bin/ceph-mon -f -i 10.10.198.11 --public-addr 10.10.198.11:6789' Corruption: 15 missing files; e.g.: /var/lib/ceph/mon/ceph-10.10.198.11/store.db/2676107.sst Is there any wa

[ceph-users] Re: Replace ceph osd in a container

2019-10-22 Thread Frank Schilder
> I am suspecting that mon or mgr have no access to /dev or /var/lib while OSD containers do. > Cluster configured originally by ceph-ansible (Nautilus 14.2.2). They don't, because they don't need to. > The question is if I want to replace all disks on a single node, and I have 6 nodes with

[ceph-users] Re: mds log showing msg with HANGUP

2019-10-22 Thread Amudhan P
Ok, thanks. On Mon, Oct 21, 2019 at 8:28 AM Konstantin Shalygin wrote: > On 10/18/19 8:43 PM, Amudhan P wrote: > > I am getting the below error msg in a Ceph Nautilus cluster; do I need to worry about this? > > > > Oct 14 06:25:02 mon01 ceph-mds[35067]: 2019-10-14 06:25:02.209 > > 7f55a4c48700 -1

[ceph-users] Re: Fwd: large concurrent rbd operations block for over 15 mins!

2019-10-22 Thread Frank Schilder
Getting decent RBD performance is not a trivial exercise. While at first glance 61 SSDs for 245 clients sounds more or less OK, it does come down to a bit more than that. The first thing is how to get SSD performance out of SSDs with Ceph. This post will provide very good clues and might alr
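One of the usual first checks (not necessarily the one the referenced post makes) is whether the SSDs can sustain small synchronous writes, since that is what the OSD journal/WAL demands; a typical fio probe against a scratch file:

   fio --name=sync-write-test --filename=/tmp/fio-sync-test --size=1G \
       --rw=write --bs=4k --ioengine=libaio --direct=1 --sync=1 \
       --iodepth=1 --numjobs=1 --runtime=60 --time_based
   # Enterprise SSDs with power-loss protection typically sustain
   # thousands of IOPS here; many consumer SSDs drop to a few hundred.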