"snap_trim_queue_len": 0,
"num_snap_trimming": 0,
"op_queue_age_hist": {
    "histogram": [],
    "upper_bound": 1
},
"fs_perf_stat": {
    "commit_latency_ms": 0,
    "apply_latency_ms": 49
}
> I think you have a significant latency/iops issue: a 36 OSD all-SSD
> cluster should give much higher than 2.5K iops.
>
> Maged
>
>
> On 2017-12-07 23:57, Russell Glaue wrote:
>
> I want to provide an update to my interesting situation.
> (New storage nodes were purchased and a [...]
On Fri, Oct 27, 2017 at 4:21 PM, Russell Glaue wrote:
> Yes, several have recommended the fio test now.
> I cannot perform a fio test at this time, because the post referred to
> directs us to write the fio test data directly to the disk device, e.g.
> /dev/sdj. I'd have to take an OSD completely [...]
> If the device doesn't test well in fio/dd testing, then the drives
> are (as expected) not a great choice for journals and you might want to
> look at hardware/backplane/RAID configuration differences that are somehow
> allowing them to perform adequately.
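The destructive fio test referenced above writes O_DSYNC 4k blocks straight
to the raw device; a representative invocation might look like this (device
name illustrative, and it destroys any data on /dev/sdj):

    # sync-write latency test against the raw device (DESTRUCTIVE)
    fio --filename=/dev/sdj --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test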
>
> On Fri, Oct 27, 2017 [...]
But last I remember, this was a hardware
issue and could not be resolved with firmware.
Paging Kyle Bader...
On Fri, Oct 27, 2017 at 9:24 AM, Russell Glaue wrote:
> We have older Crucial M500 disks operating without such problems. So, I
> have to believe it is a hardware/firmware issue.
> [...] at times (based on unquantified anecdotal personal
> experience with other consumer model SSDs). I wouldn't touch these
> with a long stick for anything but small toy-test clusters.
>
> On Fri, Oct 27, 2017 at 3:44 AM, Russell Glaue wrote:
>
>
> On Wed, Oct 25, 2017 [...]
[...] are lower
than 80%. So, for whatever reason, shutting down the OSDs and starting them
back up allowed the performance of many (not all) of the OSDs on the
problem host to improve.
Maged
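On a jewel-era CentOS 7 node the OSDs are managed by systemd, so bouncing a
single OSD looks roughly like this (OSD id illustrative):

    # restart one OSD daemon, then watch the cluster resettle
    systemctl restart ceph-osd@12
    ceph -s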
>
> On 2017-10-25 23:44, Russell Glaue wrote:
>
> Thanks to all.
> I took the OSDs down in the problem host [...]
[...] latency(s): 0.00107473
Cleaning up (deleting benchmark objects)
Clean up completed and total clean up time :16.269393
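Output like the above is the tail of a rados bench write run; a
representative invocation (pool name, duration, and concurrency illustrative):

    # 60-second write benchmark with 32 concurrent ops; the benchmark
    # objects are deleted at the end, producing the cleanup lines above
    rados bench -p rbd 60 write -t 32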
On Fri, Oct 20, 2017 at 1:35 PM, Russell Glaue wrote:
> On the machine in question, the 2nd newest, we are using the LSI MegaRAID
> SAS-3 3008 [Fury], which allows us a "Non-RAID" option [...]
[...] Christian Balzer wrote:
>
> Hello,
>
> On Fri, 20 Oct 2017 13:35:55 -0500 Russell Glaue wrote:
>
> > On the machine in question, the 2nd newest, we are using the LSI MegaRAID
> > SAS-3 3008 [Fury], which allows us a "Non-RAID" option, and has no
> battery [...]
[...]
>
> On Thu, Oct 19, 2017, 8:15 PM Christian Balzer wrote:
>
>>
>> Hello,
>>
>> On Thu, 19 Oct 2017 17:14:17 -0500 Russell Glaue wrote:
>>
>> > That is a good idea.
>> > However, a previous rebalancing process has brought the performance of [...]
> [...] OSDs on the second questionable server, mark the OSDs on that server
> as out, let the cluster rebalance, and when all PGs are active+clean just
> replay the test.
>
> All IOs should then go only to the other 3 servers.
>
> JC
>
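A minimal sketch of that procedure, assuming the suspect server holds OSDs
24-29 (ids illustrative):

    # take the suspect server's OSDs out of data placement
    for id in 24 25 26 27 28 29; do ceph osd out $id; done
    # wait until all PGs report active+clean, then rerun the benchmark
    ceph -s
    # afterwards, put the OSDs back in
    for id in 24 25 26 27 28 29; do ceph osd in $id; done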
> On Oct 19, 2017, at 13:49, Russell Glaue wrote:
> [...] swap components between nodes and see if the
> problem follows the component.
>
> On Thu, Oct 19, 2017 at 4:49 PM Russell Glaue wrote:
>
>> No, I have not ruled out the disk controller and backplane making the
>> disks slower.
>> Is there a way I could test that theory, other than swapping [...]
> Have you ruled out the disk controller and backplane making the disks
> slower?
>
> On Thu, Oct 19, 2017 at 4:42 PM Russell Glaue wrote:
>
>> I ran the test on the Ceph pool, and ran atop on all 4 storage servers,
>> as suggested.
>>
>> Out of the 4 servers:
>> 3 of them performed with 17% to 30% disk %busy, and 11% CPU [...]
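Disk %busy figures like these come from atop's DSK lines; iostat's %util
column reports the same thing if atop isn't available (interval illustrative):

    # extended per-device statistics every 5 seconds; %util ~= disk %busy
    iostat -x 5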
> On Wed, Oct 18, 2017 at 1:33 PM, Maged Mokhtar wrote:
>
>> measuring resource load as outlined earlier will show if the drives are
>> performing well or not. Also, how many OSDs do you have?
>>
>> On 2017-10-18 19:26, Russell Glaue wrote:
>>
>> The SSD drives are Crucial M500 [...]
> Measuring resource load as outlined earlier will show if the drives are
> performing well or not. Also, how many OSDs do you have?
>
> On 2017-10-18 19:26, Russell Glaue wrote:
>
> The SSD drives are Crucial M500
> A Ceph user did some benchmarks and found it had good performance
> https://forum.proxmox.com/threads/ceph-
> [...] which I doubt is the case). If only 1 disk %busy is high,
> there may be something wrong with this disk and it should be removed.
>
> Maged
>
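Before pulling a disk that stands out, its SMART health can be checked
(device name illustrative):

    # SMART health summary, attributes, and error log for the suspect drive
    smartctl -a /dev/sdj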
> On 2017-10-18 18:13, Russell Glaue wrote:
>
> In my previous post, in one of my points I was wondering if the request
> size would increase if I [...]
> [...] larger requests. Depending on your kernel, the IO scheduler
> may be different for rbd (blk-mq) vs sdx (cfq), but again I would think the
> request size is a result, not a cause.
>
> Maged
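The scheduler in effect for a block device is visible through sysfs, which
makes the rbd-vs-sdx difference easy to confirm (device names illustrative):

    # the active scheduler is shown in brackets
    cat /sys/block/sdx/queue/scheduler
    # switch to deadline for comparison, if desired
    echo deadline > /sys/block/sdx/queue/scheduler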
>
> On 2017-10-17 23:12, Russell Glaue wrote:
>
> I am running ceph jewel on 5 nodes with SSD OSDs.
I am running ceph jewel on 5 nodes with SSD OSDs.
I have an LVM image on a local RAID of spinning disks.
I have an RBD image in a pool of SSD disks.
Both disks are used to run an almost identical CentOS 7 system.
Both systems were installed with the same kickstart, though the disk
partitioning is [...]
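To keep such a comparison apples-to-apples, the same fio job can be run
inside each system against a file on its respective disk (path and sizes
illustrative):

    # identical 4k random-write test, run once on the LVM system and
    # once on the RBD system
    fio --name=cmp --filename=/var/tmp/fio.test --size=1G --rw=randwrite \
        --bs=4k --direct=1 --ioengine=libaio --iodepth=16 \
        --runtime=60 --time_based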