It will also only work reliably if you use a single-level tree structure
with failure domain "host". If you want, say, separate data center
failure domains, you need extra steps to make sure an SSD host and an HDD
host do not get selected from the same DC.
I have done such a layout, so it is possible.
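As a rough illustration of the single-level, host-failure-domain variant
that does work reliably (a sketch only; "default", the class names and the
ids are not taken from the actual map in this thread):

rule hybrid_ssd_hdd {
    id 1
    type replicated
    min_size 1
    max_size 10
    step take default class ssd
    step chooseleaf firstn 1 type host
    step emit
    step take default class hdd
    step chooseleaf firstn -1 type host
    step emit
}

With pool size 3 this picks 1 SSD host and 2 HDD hosts, but note that
nothing in the rule itself keeps those hosts in different data centers,
which is exactly the caveat above.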
On 2018-03-29 at 14:26, David Rabel wrote:
On 29.03.2018 13:50, Peter Linder wrote:
On 2018-03-29 at 12:29, David Rabel wrote:
On 29.03.2018 12:25, Janne Johansson wrote:
2018-03-29 11:50 GMT+02:00 David Rabel:
You are right. But with my above example: If I have min_size 2 and size
4, and because of a network issue the 4 OSDs are split into 2 and 2, is
it possible that I
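For reference, both values being discussed are per-pool settings and can
be inspected or changed like this (the pool name "rbd" is only an example):

ceph osd pool get rbd size
ceph osd pool get rbd min_size
ceph osd pool set rbd min_size 2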
(fragment of "ceph osd df" output:)
1285M 9312G 0.01 0 0
101 hdd 0 1.0 9313G 1271M 9312G 0.01 0 0
On Tue, Mar 27, 2018 at 2:29 PM, Peter Linder
<peter.lin...@fiberdirekt.se> wrote:
I've had similar issues, but I think your problem might be something
else. Could you send the output of "ceph osd df"?
Other people will probably be interested in what version you are using
as well.
On 2018-03-27 at 20:07, Jon Light wrote:
Hi all,
I'm adding a new OSD node with 36 OSDs t
Yes, this did turn out to be our main issue. We also had a smaller
issue, but this was the one that caused parts of our pools to go offline
for a short time. Or rather, the "cause" was us adding some new NVMe
drives that were much larger than the ones we already had, so too many
PGs got mapped to them but
em to
solve. You can probably "fix" it by turning up the choose_retries
value (or whatever it is) to a high enough level that trying to map a
PG eventually actually grabs the small host. But I wouldn't be very
confident in a solution like this; it seems very fragile and subje
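For reference, the retry knob being alluded to is exposed as a rule step
called set_choose_tries (there is also a global choose_total_tries
tunable); a sketch of where it would go, with a purely illustrative value
and rule name:

rule replicated_host_tries {
    id 2
    type replicated
    min_size 1
    max_size 10
    step set_choose_tries 100
    step take default
    step chooseleaf firstn 0 type host
    step emit
}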
But the OSDs themselves introduce latency too, even if they are NVMe.
We find that it is in the same ballpark. Latency does reduce I/O, but
at sub-ms latencies you still get thousands of IOPS even for a single
thread. For a use case with many concurrent writers/readers (VMs),
aggregated throughput
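As a rough back-of-the-envelope figure (not a measurement from this
thread): with a single thread and one outstanding request, IOPS is
roughly 1 / per-operation latency, so e.g. 0.5 ms per write gives about
1 / 0.0005 s = 2000 IOPS from that one thread, and concurrency scales
the aggregate up from there.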
Your data centers seem to be pretty close, some 13-14km? If it is a more
or less straight fiber run then latency should be 0.1-0.2ms or
something, clearly not a problem for synchronous replication. It should
work rather well.
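As a quick sanity check on that estimate (back-of-the-envelope only):
light in fiber travels at roughly 200,000 km/s, i.e. about 5 microseconds
per km one way, so 14 km works out to around 70 microseconds one way or
roughly 0.14 ms round trip, right in that 0.1-0.2 ms range.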
With "only" 2 data centers however, you need to manually decide if t
We kind of turned the crushmap inside out a little bit.
Instead of the traditional "for 1 PG, select OSDs from 3 separate data
centers" we did "force selection from only one datacenter (out of 3) and
leave only enough options to make sure precisely 1 SSD and 2 HDDs are
selected".
We then orga
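A sketch of how a rule like that could look (names are hypothetical, and
in such an inside-out layout the buckets of type "datacenter" no longer
have to match the physical data centers): if each such bucket holds
exactly one SSD host and two HDD hosts, choosing one bucket and then
three hosts inside it can only ever return 1 SSD + 2 HDD:

rule hybrid_inside_out {
    id 5
    type replicated
    min_size 3
    max_size 3
    step take hybridroot
    step choose firstn 1 type datacenter
    step chooseleaf firstn 0 type host
    step emit
}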
Ok, by randomly toggling settings *MOST* of the PGs in the test cluster
are online, but a few are not. No matter how much I change, a few of them
seem to not activate. They are running bluestore with version 12.2.2, I
think created with ceph-volume.
Here is the output from ceph pg X query of on
r, great success :). Now I only have to learn how to fix it,
any ideas anyone?
On 2018-01-26 at 12:59, Peter Linder wrote:
Well, we do, but our problem is with our hybrid setup (1 nvme and 2
hdds). The other two (that we rarely use) are nvme only and hdd only,
as far as I can tell they
* Do you have any pools that follow a crush rule to only use osds
that are backed by hdds (i.e. not nvmes)?
* Do these pools obey that rule? I.e. do they maybe have pgs that are
on nvmes?
Regards,
Tom
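To check the second question for a given pool (the pool name here is just
an example), one way is to compare the acting OSDs of its PGs against the
OSDs that actually carry the hdd device class:

ceph osd crush class ls-osd hdd    # OSD ids that have device class "hdd"
ceph pg ls-by-pool mypool          # the pool's PGs with their up/acting sets
ceph osd crush rule dump           # to see which rule the pool's crush_rule id points at

Any acting OSD that is not in the first list means the rule is not being
obeyed.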
On Fri, Jan 26, 2018 at 11:48 AM, Peter Linder
<peter.lin...@fiberdirekt.se> wrote:
Hi Thomas,
No, we haven't gotten any closer to resolving this; in fact, we had
another issue again when we added a new nvme drive to our nvme servers
(storage11, storage12 and storage13) that had weight 1.7 instead of the
usual 0.728. This (see below) is what an nvme and hdd server pair at
Did you find out anything about this? We are also getting pgs stuck
"activating+remapped". To fix the problem I have to manually alter bucket
weights so that they are basically the same everywhere, even if the disks
aren't the same size, but it is a real hassle every time we add a new
node or disk.
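For anyone else hitting this, the manual workaround described here boils
down to something like the following (the osd id and weight are
illustrative, matching the "usual" 0.728 mentioned above):

ceph osd crush reweight osd.36 0.728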
Hi all,
I'm getting such weird problems when we, for instance, re-add a server,
add disks etc.! Most of the time some PGs end up in
"active+clean+remapped" mode, but today some of them got stuck
"activating", which meant that some PGs were offline for a while. I'm
able to fix things, but the fix
On 11/10/2017 7:17 AM, Sébastien VIGNERON wrote:
> Hi everyone,
>
> Beginner with Ceph, I'm looking for a way to do a 3-way replication
> between 2 datacenters as mentioned in the ceph docs (but not described).
>
> My goal is to keep access to the data (at least read-only access) even
> when the link betw
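For what it's worth, the rule pattern usually suggested for spreading
replicas over two datacenters looks roughly like this (a sketch with
stock names; with pool size 3 it ends up as 2 copies in one DC and 1 in
the other, with size 4 as 2+2):

rule replicated_2dc {
    id 3
    type replicated
    min_size 2
    max_size 4
    step take default
    step choose firstn 2 type datacenter
    step chooseleaf firstn 2 type host
    step emit
}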
Probably chooseleaf also instead of choose.
Konrad Riedel wrote (10 October 2017 17:05:52 CEST):
> Hello Ceph-users,
>
> after switching to luminous I was excited about the great
> crush-device-class feature - now we have 5 servers with 1x2TB NVMe
> based OSDs, 3 of them additionally with 4 HDDs per
I think your failure domain within your rules is wrong.
step choose firstn 0 type osd
Should be:
step choose firstn 0 type host
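Putting that together with the earlier "chooseleaf instead of choose"
remark, the rule would end up looking roughly like this (name and id are
illustrative):

rule replicated_host {
    id 0
    type replicated
    min_size 1
    max_size 10
    step take default
    step chooseleaf firstn 0 type host
    step emit
}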
> in each datacenter. The mons control the maps and you should be good
> as long as you have 1 mon per DC. You should test this to see how the
> recovery goes, but there shouldn't be a problem.
>
>
> On Sat, Oct 7, 2017, 6:10 PM Дробышевский, Владимир <v...@itgorod.ru> wrote:
>
> 2017-10-08 2:02 GMT+05:00 Peter Lin
ut that would leave me manually balancing load. And if one node went
down, some RBDs would completely lose their SSD read capability instead
of just 1/3 of it... perhaps acceptable, but not optimal :)
/Peter
>
> On Sat, Oct 7, 2017 at 3:36 PM Peter Linder
> <peter.lin...@fiberdirek
s selecting buckets?).
> On Sat, Oct 7, 2017, 1:48 PM Peter Linder <peter.lin...@fiberdirekt.se> wrote:
>
> On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote:
>> Hello!
>>
>> 2017-10-07 19:12 GMT+05:00 Peter Linder
s.
>
> On 7 Oct 2017 at 19:39, Peter Linder
> <peter.lin...@fiberdirekt.se> wrote the
> following:
>
>> On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote:
>>> Hello!
>>>
>>> 2017-10-07 19:12 GMT+05:00 Peter Linder <m
On 10/7/2017 7:36 PM, Дробышевский, Владимир wrote:
> Hello!
>
> 2017-10-07 19:12 GMT+05:00 Peter Linder <peter.lin...@fiberdirekt.se>:
>
> The idea is to select an nvme osd, and
> then select the rest from hdd osds in different datacenters (see crush
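The straightforward way to write "one nvme OSD, then the rest from hdd
OSDs in different datacenters" is something like the rule below (a sketch
with stock names; note that, as discussed at the top of this thread,
nothing here by itself stops one of the hdd copies from landing in the
same datacenter as the nvme copy):

rule hybrid_nvme_hdd {
    id 4
    type replicated
    min_size 3
    max_size 3
    step take default class nvme
    step chooseleaf firstn 1 type datacenter
    step emit
    step take default class hdd
    step chooseleaf firstn -1 type datacenter
    step emit
}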
Hello Ceph-users!
Ok, so I've got 3 separate datacenters (low latency network in between)
and I want to make a hybrid NVMe/HDD pool for performance and cost reasons.
There are 3 servers with NVMe-based OSDs, and 2 servers with normal HDDs
(Yes, one is missing, will be 3 of course. It needs some m