Re: [ceph-users] SSD-primary crush rule doesn't work as intended

2018-05-24 Thread Peter Linder
It will also only work reliably if you use a single-level tree structure with failure domain "host". If you want, say, separate data center failure domains, you need extra steps to make sure an SSD host and an HDD host do not get selected from the same DC. I have done such a layout so it is possi
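
For reference, a minimal sketch of the single-level, host-failure-domain form that does work out of the box (rule id, device-class names and replica counts are assumptions, not taken from the original post):

    rule ssd-primary {
        id 5
        type replicated
        min_size 1
        max_size 10
        # first replica (the primary) from an SSD-backed host
        step take default class ssd
        step chooseleaf firstn 1 type host
        step emit
        # remaining replicas from HDD-backed hosts
        step take default class hdd
        step chooseleaf firstn -1 type host
        step emit
    }

Note that nothing in this rule prevents the chosen SSD host and an HDD host from sitting in the same DC; that is exactly the extra constraint the multi-DC layout described above has to add.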

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Peter Linder
On 2018-03-29 at 14:26, David Rabel wrote: On 29.03.2018 13:50, Peter Linder wrote: On 2018-03-29 at 12:29, David Rabel wrote: On 29.03.2018 12:25, Janne Johansson wrote: 2018-03-29 11:50 GMT+02:00 David Rabel: You are right. But with my above example: If I have min_size 2 and size 4
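
For context, the settings in question can be inspected and changed per pool; a minimal sketch (the pool name "rbd" is only an example):

    ceph osd pool get rbd size        # replica count, e.g. 4
    ceph osd pool get rbd min_size    # I/O is blocked below this many replicas
    # with size 4, min_size 3 ensures a symmetric 2+2 network split cannot
    # leave both halves accepting writes at the same time
    ceph osd pool set rbd min_size 3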

Re: [ceph-users] [SOLVED] Replicated pool with an even size - has min_size to be bigger than half the size?

2018-03-29 Thread Peter Linder
On 2018-03-29 at 12:29, David Rabel wrote: On 29.03.2018 12:25, Janne Johansson wrote: 2018-03-29 11:50 GMT+02:00 David Rabel: You are right. But with my above example: If I have min_size 2 and size 4, and because of a network issue the 4 OSDs are split into 2 and 2, is it possible that I

Re: [ceph-users] PGs stuck activating after adding new OSDs

2018-03-27 Thread Peter Linder
1285M 9312G  0.01    0  0 101   hdd       0  1.0 9313G  1271M 9312G  0.01    0  0 On Tue, Mar 27, 2018 at 2:29 PM, Peter Linder <peter.lin...@fiberdirekt.se> wrote: I've had similar issues, but I think your problem might be something else. Could you send the output of

Re: [ceph-users] PGs stuck activating after adding new OSDs

2018-03-27 Thread Peter Linder
I've had similar issues, but I think your problem might be something else. Could you send the output of "ceph osd df"? Other people will probably be interested in what version you are using as well. On 2018-03-27 at 20:07, Jon Light wrote: Hi all, I'm adding a new OSD node with 36 OSDs t
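
The commands being asked for, as a quick reference (plain ceph CLI, nothing cluster-specific assumed):

    ceph osd df tree    # per-OSD utilization, weights and reweights, grouped by the CRUSH tree
    ceph versions       # which Ceph release each daemon set is running (Luminous and later)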

Re: [ceph-users] Weird issues related to (large/small) weights in mixed nvme/hdd pool

2018-01-31 Thread Peter Linder
Yes, this did turn out to be our main issue. We also had a smaller issue, but this was the one that caused parts of our pools to go offline for a short time. Or rather, the 'cause' was us adding some new NVMe drives that were much larger than the ones we already had, so too many PGs got mapped to them but

Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Peter Linder
em to solve. You can probably "fix" it by turning up the choose_retries value (or whatever it is) to a high enough level that trying to map a PG eventually actually grabs the small host. But I wouldn't be very confident in a solution like this; it seems very fragile and subje
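
The tunable alluded to is most likely choose_total_tries (or a per-rule set_choose_tries / set_chooseleaf_tries step); a hedged sketch of raising it by editing the decompiled CRUSH map:

    ceph osd getcrushmap -o crushmap.bin
    crushtool -d crushmap.bin -o crushmap.txt
    # in crushmap.txt, raise the global retry budget, e.g.:
    #   tunable choose_total_tries 100
    # or add inside the affected rule:
    #   step set_choose_tries 100
    crushtool -c crushmap.txt -o crushmap-new.bin
    ceph osd setcrushmap -i crushmap-new.bin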

Re: [ceph-users] [Best practise] Adding new data center

2018-01-29 Thread Peter Linder
But the OSDs themselves also introduce latency, even if they are NVMe; we find that it is in the same ballpark. Latency does reduce I/O, but at sub-millisecond levels it is still thousands of IOPS even for a single thread. For a use case with many concurrent writers/readers (VMs), aggregated throughput

Re: [ceph-users] [Best practise] Adding new data center

2018-01-29 Thread Peter Linder
Your data centers seem to be pretty close, some 13-14 km? If it is a more or less straight fiber run then latency should be 0.1-0.2 ms or so, clearly not a problem for synchronous replication. It should work rather well. With "only" 2 data centers, however, you need to manually decide if t

Re: [ceph-users] CRUSH straw2 can not handle big weight differences

2018-01-29 Thread Peter Linder
We kind of turned the crushmap inside out a little bit. Instead of the traditional "for 1 PG, select OSDs from 3 separate data centers" we did "force selection from only one datacenter (out of 3) and leave only enough options to make sure precisely 1 SSD and 2 HDDs are selected". We then orga
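
A minimal sketch of the general shape being described; the bucket type "hybridgroup" and all names here are assumptions, and the real layout in the thread is more elaborate:

    # each "hybridgroup" bucket is built so that it contains exactly one
    # NVMe/SSD host plus HDD hosts, with the DC separation baked into the
    # hierarchy rather than into the rule itself
    rule hybrid {
        id 7
        type replicated
        min_size 3
        max_size 3
        step take default
        step choose firstn 1 type hybridgroup
        step chooseleaf firstn 0 type host
        step emit
    }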

Re: [ceph-users] Weird issues related to (large/small) weights in mixed nvme/hdd pool

2018-01-26 Thread Peter Linder
OK, by randomly toggling settings, *MOST* of the PGs in the test cluster are online, but a few are not. No matter how much I change, a few of them seem to not activate. They are running BlueStore with version 12.2.2, I think created with ceph-volume. Here is the output from ceph pg X query of on
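
For anyone following along, the query commands referred to (the pgid 3.1f is just a placeholder):

    ceph pg dump_stuck inactive    # list PGs that are not active
    ceph pg 3.1f query             # full peering/activation state for one PG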

Re: [ceph-users] Weird issues related to (large/small) weights in mixed nvme/hdd pool

2018-01-26 Thread Peter Linder
r, great success :). Now I only have to learn how to fix it; any ideas, anyone? On 2018-01-26 at 12:59, Peter Linder wrote: Well, we do, but our problem is with our hybrid setup (1 NVMe and 2 HDDs). The other two (that we rarely use) are NVMe-only and HDD-only; as far as I can tell they

Re: [ceph-users] Weird issues related to (large/small) weights in mixed nvme/hdd pool

2018-01-26 Thread Peter Linder
* Do you have any pools that follow a CRUSH rule to only use OSDs that are backed by HDDs (i.e. not NVMes)? * Do these pools obey that rule? I.e. do they maybe have PGs that are on NVMes? Regards, Tom On Fri, Jan 26, 2018 at 11:48 AM, Peter Linder <peter.lin...@fiberdirekt.se>
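
A quick way to check both questions from the CLI (the pool name is a placeholder):

    ceph osd crush tree --show-shadow          # per-class shadow hierarchy, shows each OSD's device class
    ceph osd pool get <poolname> crush_rule    # which rule the pool actually uses
    ceph pg dump pgs_brief                     # up/acting OSD sets per PG, to spot PGs landing on NVMe OSDs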

Re: [ceph-users] Weird issues related to (large/small) weights in mixed nvme/hdd pool

2018-01-26 Thread Peter Linder
Hi Thomas, No, we haven't gotten any closer to resolving this; in fact we had another issue when we added a new NVMe drive to our NVMe servers (storage11, storage12 and storage13) that had weight 1.7 instead of the usual 0.728. This (see below) is what an NVMe and HDD server pair at

Re: [ceph-users] Stuck pgs (activating+remapped) and slow requests after adding OSD node via ceph-ansible

2018-01-22 Thread Peter Linder
Did you find out anything about this? We are also getting PGs stuck "activating+remapped". To fix the problem I have to manually alter bucket weights so that they are basically the same everywhere, even if the disks aren't the same size, but it is a real hassle every time we add a new node or disk.
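
For reference, the manual adjustment described boils down to something like this (the OSD id and weight are placeholders):

    ceph osd crush tree                 # inspect current CRUSH weights per host/OSD
    ceph osd crush reweight osd.12 1.0  # override a single OSD's CRUSH weight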

[ceph-users] Weird issues related to (large/small) weights in mixed nvme/hdd pool

2018-01-20 Thread peter . linder
Hi all, I'm getting such weird problems when we, for instance, re-add a server, add disks, etc.! Most of the time some PGs end up in the "active+clean+remapped" state, but today some of them got stuck "activating", which meant that some PGs were offline for a while. I'm able to fix things, but the fix

Re: [ceph-users] Needed help to setup a 3-way replication between 2 datacenters

2017-11-10 Thread Peter Linder
On 11/10/2017 7:17 AM, Sébastien VIGNERON wrote: > Hi everyone, > > Beginner with Ceph, I’m looking for a way to do a 3-way replication > between 2 datacenters as mentioned in the Ceph docs (but not described). > > My goal is to keep access to the data (at least read-only access) even > when the link betw
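
One common way to express 3 replicas over 2 datacenters in a CRUSH rule, as a hedged sketch (rule id and bucket names assumed):

    rule three-rep-two-dc {
        id 8
        type replicated
        min_size 2
        max_size 3
        step take default
        # pick both datacenters, then up to 2 hosts in each;
        # with size 3 this yields 2 replicas in one DC and 1 in the other
        step choose firstn 2 type datacenter
        step chooseleaf firstn 2 type host
        step emit
    }

Whether the surviving DC keeps (read-only) access when the link drops then comes down to monitor quorum and the pool's min_size, which is what the rest of the thread discusses.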

Re: [ceph-users] All replicas of pg 5.b got placed on the same host - how to correct?

2017-10-10 Thread Peter Linder
Probably chooseleaf also instead of choose. Konrad Riedel wrote (10 October 2017 17:05:52 CEST): >Hello Ceph-users, > >after switching to luminous I was excited about the great >crush-device-class feature - now we have 5 servers with 1x2TB NVMe >based OSDs, 3 of them additionally with 4 HDDs per

Re: [ceph-users] All replicas of pg 5.b got placed on the same host - how to correct?

2017-10-10 Thread Peter Linder
I think your failure domain within your rules is wrong. step choose firstn 0 type osd Should be: step choose firstn 0 type host On 10/10/2017 5:05 PM, Konrad Riedel wrote: > Hello Ceph-users, > > after switching to luminous I was excited about the great > crush-device-class feature - now we ha
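
In rule form, the suggested fix looks roughly like this (rule name and device class are assumptions; the point is the failure-domain step):

    rule replicated-hdd {
        id 6
        type replicated
        min_size 1
        max_size 10
        step take default class hdd
        # was: step choose firstn 0 type osd  (all replicas can land on one host)
        step chooseleaf firstn 0 type host
        step emit
    }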

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-09 Thread Peter Linder
d be good > as long as you have 1 mon per DC. You should test this to see how the > recovery goes, but there shouldn't be a problem. > > > On Sat, Oct 7, 2017, 6:10 PM Дробышевский, Владимир <v...@itgorod.ru> wrote: > > 2017-10-08 2:02 GMT+05:00 Peter Lin

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-08 Thread Peter Linder
in each datacenter. The mons control the maps and you should be good > as long as you have 1 mon per DC. You should test this to see how the > recovery goes, but there shouldn't be a problem. > > > On Sat, Oct 7, 2017, 6:10 PM Дробышевский, Владимир

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-07 Thread Peter Linder
ut that would leave me manually balancing load. And if one node went down, some RBDs would completely lose their SSD read capability instead of just 1/3 of it...  perhaps acceptable, but not optimal :) /Peter > > On Sat, Oct 7, 2017 at 3:36 PM Peter Linder > <peter.lin...@fiberdirek

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-07 Thread Peter Linder
s selecting buckets?). > On Sat, Oct 7, 2017, 1:48 PM Peter Linder <peter.lin...@fiberdirekt.se> wrote: > > On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote: >> Hello! >> >> 2017-10-07 19:12 GMT+05:00 Peter Linder >> ma

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-07 Thread Peter Linder
s. > > On 7 Oct. 2017 at 19:39, Peter Linder > <peter.lin...@fiberdirekt.se> wrote the > following: > >> On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote: >>> Hello! >>> >>> 2017-10-07 19:12 GMT+05:00 Peter Linder >> <m

Re: [ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-07 Thread Peter Linder
On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote: > Hello! > > 2017-10-07 19:12 GMT+05:00 Peter Linder <peter.lin...@fiberdirekt.se>: > > The idea is to select an NVMe OSD, and > then select the rest from HDD OSDs in different datacenters (see crush

[ceph-users] PGs get placed in the same datacenter (Trying to make a hybrid NVMe/HDD pool with 6 servers, 2 in each datacenter)

2017-10-07 Thread Peter Linder
Hello Ceph-users! OK, so I've got 3 separate datacenters (low-latency network in between) and I want to make a hybrid NVMe/HDD pool for performance and cost reasons. There are 3 servers with NVMe-based OSDs, and 2 servers with normal HDDs (yes, one is missing, will be 3 of course; it needs some m