[ceph-users] Re: HBA vs caching Raid controller

2021-04-21 Thread Marc
> > This is what I have when I query Prometheus; most HDDs are still SATA
> > 5400 rpm, and there are also some SSDs. I also did not optimize CPU
> > frequency settings. (Forget about instance=c03; that is just because
> > the data comes from mgr c03. These drives are on different hosts.)
> >
> > ceph_osd_apply_latency_ms
> >
> > ceph_osd_apply_latency_ms{ceph_daemon="osd.12", instance="c03", job="ceph"}   42
> > ...
> > ceph_osd_apply_latency_ms{ceph_daemon="osd.19", instance="c03", job="ceph"}   1
> 
> I assume this looks somewhat normal, with a bit of variance due to
> access.
> 
> > avg (ceph_osd_apply_latency_ms)
> > 9.336
> 
> I see something similar: around 9ms average latency for HDD-based OSDs,
> best-case average around 3ms.
> 
> > So I guess it should be possible for you to get lower values on the LSI HBA.
> 
> Can you let me know which exact model you have?

[~]# sas2flash -list
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved

 Adapter Selected is a LSI SAS: SAS2308_2(D1)

 Controller Number  : 0
 Controller : SAS2308_2(D1)
 PCI Address: 00:04:00:00
 SAS Address: 500605b-0-05a6-c49e
 NVDATA Version (Default)   : 14.01.00.06
 NVDATA Version (Persistent): 14.01.00.06
 Firmware Product ID: 0x2214 (IT)
 Firmware Version   : 20.00.07.00
 NVDATA Vendor  : LSI
 NVDATA Product ID  : SAS9207-8i
 BIOS Version   : 07.39.02.00
 UEFI BSD Version   : N/A
 FCODE Version  : N/A
 Board Name : SAS9207-8i
 Board Assembly : N/A
 Board Tracer Number: N/A

> 
> > Maybe you can tune read ahead on the LSI with something like this:
> > echo 8192 > /sys/block/$line/queue/read_ahead_kb
> > echo 1024 > /sys/block/$line/queue/nr_requests
> 
> I tried both of them, even going up to 16MB read ahead cache, but
> besides a short burst when changing the values, the average stays +/-
> the same on that host.
> 
> I also checked CPU speed (same as the rest) and the I/O scheduler (using
> "none" really drives the disks crazy). What I observed is that the avq
> value in atop is lower than on the other servers, which are around 15;
> this server is more in the range of 1-3.
> 
> > Also check for PCIe 3; those have higher bus speeds.
> 
> True, even though PCIe 2.0 x8 should be able to deliver 4 GB/s, if I am
> not mistaken.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Mark Lehrer
> - The pattern is mainly write centric, so write latency is
>   probably the real factor
> - The HDD OSDs behind the raid controllers can cache / reorder
>   writes and reduce seeks potentially

OK that makes sense.

Unfortunately, re-ordering HDD writes without a battery backup is kind
of dangerous -- writes need to happen in order or the filesystem will
punish you when you least expect it.  This is the whole point of the
battery backup - to make sure that out-of-order writes get written to
disk even if there is a power loss in the middle of writing the
controller-write-cache data in an HDD-optimized order.

Your use case is ideal for an SSD-based WAL -- though it may be
difficult to beat the cost of H800s these days.
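
For reference, a minimal sketch of what an SSD-backed WAL/DB could look like
with ceph-volume (device names here are placeholders, not from your setup;
double-check the flags against your Ceph release):

ceph-volume lvm create --data /dev/sdb \
    --block.db /dev/nvme0n1p1 --block.wal /dev/nvme0n1p2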


> In this context: is anyone here using HBAs with battery
> backed cache, and if yes, which controllers do you tend to use?

I almost always use MegaRAID-based controllers (such as the H800).


Good luck,
Mark


On Tue, Apr 20, 2021 at 2:28 PM Nico Schottelius
 wrote:
>
>
> Mark Lehrer  writes:
>
> >> One server has LSI SAS3008 [0] instead of the Perc H800,
> >> which comes with 512MB RAM + BBU. On most servers latencies are around
> >> 4-12ms (average 6ms), on the system with the LSI controller we see
> >> 20-60ms (average 30ms) latency.
> >
> > Are these reads, writes, or a mixed workload?  I would expect an
> > improvement in writes, but 512MB of cache isn't likely to help much on
> > reads with such a large data set.
>
> It's a mostly-write (~20 MB/s), little-read (1-5 MB/s) workload. This is
> probably due to many people using this storage for backups.
>
> > Just as a test, you could try removing the battery on one of the H800s to
> > disable the write cache -- or else disable write caching with megaraid
> > or equivalent.
>
> That is certainly an interesting idea - and rereading your message and
> my statement above might actually explain the behaviour:
>
> - The pattern is mainly write centric, so write latency is probably the
>   real factor
> - The HDD OSDs behind the raid controllers can cache / reorder writes
>   and reduce seeks potentially
>
> So while "a RAID controller" per se probably does not improve or reduce
> speed for Ceph, "a (disk/RAID) controller with a battery-backed cache"
> actually might.
>
> In this context: is anyone here using HBAs with battery backed cache,
> and if yes, which controllers do you tend to use?
>
> Nico
>
>
> --
> Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Anthony D'Atri

> It's not 100% clear to me, but is the pdcache the same as the disk's
> internal (non-battery-backed) cache?

Yes, AIUI.
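
If it helps, a quick way to check what the drive itself reports (a hedged
sketch; drives hidden behind a RAID controller may not be directly visible
this way):

sdparm --get=WCE /dev/sdX    # WCE = volatile write cache enabled
hdparm -W /dev/sdX           # same check for SATA drives
sdparm --clear=WCE /dev/sdX  # disable it, if desired
hdparm -W0 /dev/sdX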

> As we are located very near the hydropower plant, we actually connect
> each server individually to a UPS.

Lucky you. I’ve seen an entire DC go dark in a power outage thanks to a 
transfer switch not kicking in the generators.  Then again two weeks later.  It taught 
me to be paranoid about power loss.  And, with dual power supplies, to 
proactively check their status, so that I don’t find out only when it’s too late that 
a cord came unseated or a PSU died, resulting in a silent loss of redundancy.
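
A hedged example of the kind of proactive check I mean, assuming IPMI is
available on the hosts:

ipmitool sdr type "Power Supply"      # per-PSU sensor status
ipmitool sel elist | grep -i power    # recent power-related events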

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Nico Schottelius



Reed Dier  writes:

> I don't have any performance bits to offer, but I do have one experiential
> bit to offer.
>
> My initial ceph deployment was on existing servers that had LSI RAID
> controllers (3108 specifically).
> We created R0 VDs for each disk, and since we had BBUs we were using
> write-back caching.
> The big problem that arose was the pdcache value, which in my case
> defaults to on.
>
> We had a lightning strike that took out the datacenter, and we lost 21/24
> OSDs.
> Granted, this was back in XFS-on-filestore days, but it was a painful
> lesson learned.
> It was narrowed down to the pdcache, and not the RAID controller caching
> functions, by carrying out some power-loss scenarios after the incident.

It's not 100% clear to me, but is the pdcache the same as the disk's
internal (non-battery-backed) cache?

As we are located very near the hydropower plant, we actually connect
each server individually to a UPS. Our original motivation was mainly
to cut off overvoltage, but it has the nice side effect of adding
another battery buffer for the servers.


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Nico Schottelius


Mark Lehrer  writes:

>> One server has LSI SAS3008 [0] instead of the Perc H800,
>> which comes with 512MB RAM + BBU. On most servers latencies are around
>> 4-12ms (average 6ms), on the system with the LSI controller we see
>> 20-60ms (average 30ms) latency.
>
> Are these reads, writes, or a mixed workload?  I would expect an
> improvement in writes, but 512MB of cache isn't likely to help much on
> reads with such a large data set.

It's a mostly-write (~20 MB/s), little-read (1-5 MB/s) workload. This is
probably due to many people using this storage for backups.

> Just as a test, you could try removing the battery on one of the H800s to
> disable the write cache -- or else disable write caching with megaraid
> or equivalent.

That is certainly an interesting idea - and rereading your message and
my statement above might actually explain the behaviour:

- The pattern is mainly write centric, so write latency is probably the
  real factor
- The HDD OSDs behind the raid controllers can cache / reorder writes
  and reduce seeks potentially

So while "a RAID controller" per se probably does not improve or reduce
speed for Ceph, "a (disk/RAID) controller with a battery-backed cache"
actually might.

In this context: is anyone here using HBAs with battery backed cache,
and if yes, which controllers do you tend to use?

Nico


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Nico Schottelius



Marc  writes:

> This is what I have when I query Prometheus; most HDDs are still SATA
> 5400 rpm, and there are also some SSDs. I also did not optimize CPU frequency
> settings. (Forget about instance=c03; that is just because the data comes
> from mgr c03. These drives are on different hosts.)
>
> ceph_osd_apply_latency_ms
>
> ceph_osd_apply_latency_ms{ceph_daemon="osd.12", instance="c03", job="ceph"}   42
> ...
> ceph_osd_apply_latency_ms{ceph_daemon="osd.19", instance="c03", job="ceph"}   1

I assume this looks somewhat normal, with a bit of variance due to
access.

> avg (ceph_osd_apply_latency_ms)
> 9.336

I see something similar: around 9ms average latency for HDD-based OSDs,
best-case average around 3ms.

> So I guess it should be possible for you to get lower values on the LSI HBA.

Can you let me know which exact model you have?

> Maybe you can tune read ahead on the LSI with something like this:
> echo 8192 > /sys/block/$line/queue/read_ahead_kb
> echo 1024 > /sys/block/$line/queue/nr_requests

I tried both of them, even going up to 16MB read ahead cache, but
besides a short burst when changing the values, the average stays +/-
the same on that host.

I also checked CPU speed (same as the rest) and the I/O scheduler (using
"none" really drives the disks crazy). What I observed is that the avq
value in atop is lower than on the other servers, which are around 15;
this server is more in the range of 1-3.
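
A quick way to cross-check the queue depth and per-device latency outside of
atop (assuming sysstat is installed) is something like:

iostat -x 5    # watch the avgqu-sz/aqu-sz and w_await columns per device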

> Also check for PCIe 3; those have higher bus speeds.

True, even though PCIe 2.0 x8 should be able to deliver 4 GB/s, if I am
not mistaken.
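
(Rough numbers: PCIe 2.0 runs at 5 GT/s per lane with 8b/10b encoding, i.e.
about 500 MB/s usable per lane, so x8 is roughly 4 GB/s; PCIe 3.0 runs at
8 GT/s with 128b/130b encoding, about 985 MB/s per lane, so x8 comes to
roughly 7.9 GB/s.)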



--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Anthony D'Atri

I don’t have the firmware versions handy, but at one point around the 2014-2015 
timeframe I found that both LSI’s firmware and storcli claimed that the default 
setting was DiskDefault, i.e. leave whatever the drive has alone.  It turned 
out, though, that for the 9266 and 9271 at least, behind the scenes it was 
claiming DiskDefault but was actually turning on the drive’s volatile cache, 
which resulted in the power-loss behavior you describe.

There were also hardware and firmware issues that resulted in preserved / 
pinned cache not being properly restored - in one case, after a drive failed 
hard, I had to replace the HBA in order to boot.

I posted a list of RoC HBA thoughts including these to the list back around 
late-summer 2017.  

> 
> I don't have any performance bits to offer, but I do have one experiential 
> bit to offer.
> 
> My initial ceph deployment was on existing servers that had LSI RAID
> controllers (3108 specifically).
> We created R0 VDs for each disk, and since we had BBUs we were using
> write-back caching.
> The big problem that arose was the pdcache value, which in my case
> defaults to on.
> 
> We had a lightning strike that took out the datacenter, and we lost 21/24
> OSDs.
> Granted, this was back in XFS-on-filestore days, but it was a painful
> lesson learned.
> It was narrowed down to the pdcache, and not the RAID controller caching
> functions, by carrying out some power-loss scenarios after the incident.
> 
> So, make sure you turn your pdcache off in perccli.
> 
> Reed
> 
>> On Apr 19, 2021, at 1:20 PM, Nico Schottelius  
>> wrote:
>> 
>> 
>> Good evening,
>> 
>> I have to tackle an old, probably recurring topic: HBAs vs. RAID
>> controllers. While, generally speaking, many people in the Ceph field
>> recommend going with HBAs, it seems that in our infrastructure the only
>> server we phased in with an HBA instead of a RAID controller is actually
>> doing worse in terms of latency.
>> 
>> For the background: we have many Perc H800+MD1200 [1] systems running with
>> 10TB HDDs (raid0, read ahead, writeback cache).
>> One server has LSI SAS3008 [0] instead of the Perc H800,
>> which comes with 512MB RAM + BBU. On most servers latencies are around
>> 4-12ms (average 6ms), on the system with the LSI controller we see
>> 20-60ms (average 30ms) latency.
>> 
>> Now, my question is: are we doing something inherently wrong with the
>> SAS3008, or does the cache in fact help to reduce seek time?
>> 
>> We were considering moving more towards LSI HBAs to reduce maintenance
>> effort; however, if we have a factor of 5 in latency between the two
>> different systems, it might be better to stay on the H800 path for
>> disks.
>> 
>> Any input/experiences appreciated.
>> 
>> Best regards,
>> 
>> Nico
>> 
>> [0]
>> 05:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 
>> PCI-Express Fusion-MPT SAS-3 (rev 02)
>>  Subsystem: Dell 12Gbps HBA
>>  Kernel driver in use: mpt3sas
>>  Kernel modules: mpt3sas
>> 
>> [1]
>> 08:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 
>> [Liberator] (rev 05)
>>  Subsystem: Dell PERC H800 Adapter
>>  Kernel driver in use: megaraid_sas
>>  Kernel modules: megaraid_sas
>> 
>> --
>> Sustainable and modern Infrastructures by ungleich.ch
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Reed Dier
I don't have any performance bits to offer, but I do have one experiential bit 
to offer.

My initial ceph deployment was on existing servers that had LSI RAID 
controllers (3108 specifically).
We created R0 VDs for each disk, and since we had BBUs we were using write-back caching.
The big problem that arose was the pdcache value, which in my case defaults to 
on.

We had a lightning strike that took out the datacenter, and we lost 21/24 OSDs.
Granted, this was back in XFS-on-filestore days, but it was a painful lesson 
learned.
It was narrowed down to the pdcache, and not the RAID controller caching 
functions, by carrying out some power-loss scenarios after the incident.

So, make sure you turn your pdcache off in perccli.
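
For anyone searching for the exact syntax later, it is something along these
lines (a hedged sketch; the controller index and binary name are assumptions,
so verify against your perccli's own help output):

perccli64 /c0/vall show all          # check the current pdcache setting per VD
perccli64 /c0/vall set pdcache=off   # turn the physical drive cache off on all VDs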

Reed

> On Apr 19, 2021, at 1:20 PM, Nico Schottelius  
> wrote:
> 
> 
> Good evening,
> 
> I have to tackle an old, probably recurring topic: HBAs vs. RAID
> controllers. While, generally speaking, many people in the Ceph field
> recommend going with HBAs, it seems that in our infrastructure the only
> server we phased in with an HBA instead of a RAID controller is actually
> doing worse in terms of latency.
> 
> For the background: we have many Perc H800+MD1200 [1] systems running with
> 10TB HDDs (raid0, read ahead, writeback cache).
> One server has LSI SAS3008 [0] instead of the Perc H800,
> which comes with 512MB RAM + BBU. On most servers latencies are around
> 4-12ms (average 6ms), on the system with the LSI controller we see
> 20-60ms (average 30ms) latency.
> 
> Now, my question is: are we doing something inherently wrong with the
> SAS3008, or does the cache in fact help to reduce seek time?
> 
> We were considering moving more towards LSI HBAs to reduce maintenance
> effort; however, if we have a factor of 5 in latency between the two
> different systems, it might be better to stay on the H800 path for
> disks.
> 
> Any input/experiences appreciated.
> 
> Best regards,
> 
> Nico
> 
> [0]
> 05:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 
> PCI-Express Fusion-MPT SAS-3 (rev 02)
>   Subsystem: Dell 12Gbps HBA
>   Kernel driver in use: mpt3sas
>   Kernel modules: mpt3sas
> 
> [1]
> 08:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 
> [Liberator] (rev 05)
>   Subsystem: Dell PERC H800 Adapter
>   Kernel driver in use: megaraid_sas
>   Kernel modules: megaraid_sas
> 
> --
> Sustainable and modern Infrastructures by ungleich.ch
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-20 Thread Mark Lehrer
> One server has LSI SAS3008 [0] instead of the Perc H800,
> which comes with 512MB RAM + BBU. On most servers latencies are around
> 4-12ms (average 6ms), on the system with the LSI controller we see
> 20-60ms (average 30ms) latency.

Are these reads, writes, or a mixed workload?  I would expect an
improvement in writes, but 512MB of cache isn't likely to help much on
reads with such a large data set.

Just as a test, you could try removing the battery on one of the H800s to
disable the write cache -- or else disable write caching with megaraid
or equivalent.
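
If it helps, a hedged example of the latter (commands are from memory, so
double-check against your tooling; /c0 and the VD selection are assumptions):

storcli64 /c0/vall set wrcache=WT      # switch all VDs to write-through
MegaCli64 -LDSetProp WT -LAll -aAll    # older MegaCli equivalent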





On Mon, Apr 19, 2021 at 12:21 PM Nico Schottelius
 wrote:
>
>
> Good evening,
>
> I have to tackle an old, probably recurring topic: HBAs vs. RAID
> controllers. While, generally speaking, many people in the Ceph field
> recommend going with HBAs, it seems that in our infrastructure the only
> server we phased in with an HBA instead of a RAID controller is actually
> doing worse in terms of latency.
>
> For the background: we have many Perc H800+MD1200 [1] systems running with
> 10TB HDDs (raid0, read ahead, writeback cache).
> One server has LSI SAS3008 [0] instead of the Perc H800,
> which comes with 512MB RAM + BBU. On most servers latencies are around
> 4-12ms (average 6ms), on the system with the LSI controller we see
> 20-60ms (average 30ms) latency.
>
> Now, my question is: are we doing something inherently wrong with the
> SAS3008, or does the cache in fact help to reduce seek time?
>
> We were considering moving more towards LSI HBAs to reduce maintenance
> effort; however, if we have a factor of 5 in latency between the two
> different systems, it might be better to stay on the H800 path for
> disks.
>
> Any input/experiences appreciated.
>
> Best regards,
>
> Nico
>
> [0]
> 05:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS3008 
> PCI-Express Fusion-MPT SAS-3 (rev 02)
> Subsystem: Dell 12Gbps HBA
> Kernel driver in use: mpt3sas
> Kernel modules: mpt3sas
>
> [1]
> 08:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 
> [Liberator] (rev 05)
> Subsystem: Dell PERC H800 Adapter
> Kernel driver in use: megaraid_sas
> Kernel modules: megaraid_sas
>
> --
> Sustainable and modern Infrastructures by ungleich.ch
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-19 Thread Marc



This is what I have when I query Prometheus; most HDDs are still SATA 5400 rpm, 
and there are also some SSDs. I also did not optimize CPU frequency settings. 
(Forget about instance=c03; that is just because the data comes from mgr c03. 
These drives are on different hosts.)

ceph_osd_apply_latency_ms

ceph_osd_apply_latency_ms{ceph_daemon="osd.0", instance="c03", job="ceph"}  
11
ceph_osd_apply_latency_ms{ceph_daemon="osd.1", instance="c03", job="ceph"}  
5
ceph_osd_apply_latency_ms{ceph_daemon="osd.10", instance="c03", job="ceph"} 
8
ceph_osd_apply_latency_ms{ceph_daemon="osd.11", instance="c03", job="ceph"} 
33
ceph_osd_apply_latency_ms{ceph_daemon="osd.12", instance="c03", job="ceph"} 
42
ceph_osd_apply_latency_ms{ceph_daemon="osd.13", instance="c03", job="ceph"} 
17
ceph_osd_apply_latency_ms{ceph_daemon="osd.14", instance="c03", job="ceph"} 
27
ceph_osd_apply_latency_ms{ceph_daemon="osd.15", instance="c03", job="ceph"} 
15
ceph_osd_apply_latency_ms{ceph_daemon="osd.16", instance="c03", job="ceph"} 
14
ceph_osd_apply_latency_ms{ceph_daemon="osd.17", instance="c03", job="ceph"} 
4
ceph_osd_apply_latency_ms{ceph_daemon="osd.18", instance="c03", job="ceph"} 
18
ceph_osd_apply_latency_ms{ceph_daemon="osd.19", instance="c03", job="ceph"} 
1
ceph_osd_apply_latency_ms{ceph_daemon="osd.2", instance="c03", job="ceph"}  
14
ceph_osd_apply_latency_ms{ceph_daemon="osd.20", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.21", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.22", instance="c03", job="ceph"} 
9
ceph_osd_apply_latency_ms{ceph_daemon="osd.23", instance="c03", job="ceph"} 
2
ceph_osd_apply_latency_ms{ceph_daemon="osd.24", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.25", instance="c03", job="ceph"} 
15
ceph_osd_apply_latency_ms{ceph_daemon="osd.26", instance="c03", job="ceph"} 
18
ceph_osd_apply_latency_ms{ceph_daemon="osd.27", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.28", instance="c03", job="ceph"} 
4
ceph_osd_apply_latency_ms{ceph_daemon="osd.29", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.3", instance="c03", job="ceph"}  
10
ceph_osd_apply_latency_ms{ceph_daemon="osd.30", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.31", instance="c03", job="ceph"} 
2
ceph_osd_apply_latency_ms{ceph_daemon="osd.32", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.33", instance="c03", job="ceph"} 
1
ceph_osd_apply_latency_ms{ceph_daemon="osd.34", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.35", instance="c03", job="ceph"} 
2
ceph_osd_apply_latency_ms{ceph_daemon="osd.36", instance="c03", job="ceph"} 
2
ceph_osd_apply_latency_ms{ceph_daemon="osd.37", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.38", instance="c03", job="ceph"} 0
ceph_osd_apply_latency_ms{ceph_daemon="osd.39", instance="c03", job="ceph"} 
1
ceph_osd_apply_latency_ms{ceph_daemon="osd.4", instance="c03", job="ceph"}  
11
ceph_osd_apply_latency_ms{ceph_daemon="osd.40", instance="c03", job="ceph"} 
8
ceph_osd_apply_latency_ms{ceph_daemon="osd.41", instance="c03", job="ceph"} 
5
ceph_osd_apply_latency_ms{ceph_daemon="osd.5", instance="c03", job="ceph"}  
12
ceph_osd_apply_latency_ms{ceph_daemon="osd.6", instance="c03", job="ceph"}  
18
ceph_osd_apply_latency_ms{ceph_daemon="osd.7", instance="c03", job="ceph"}  
8
ceph_osd_apply_latency_ms{ceph_daemon="osd.8", instance="c03", job="ceph"}  
33
ceph_osd_apply_latency_ms{ceph_daemon="osd.9", instance="c03", job="ceph"}  
22

avg (ceph_osd_apply_latency_ms)
9.336


So I guess it should be possible for you to get lower values on the LSI HBA.

Maybe you can tune read ahead on the LSI with something like this:
echo 8192 > /sys/block/$line/queue/read_ahead_kb
echo 1024 > /sys/block/$line/queue/nr_requests
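
A hedged sketch of applying that to all sd* devices in one go (this would also
hit SSDs and the OS disk, so narrow the match to your OSD HDDs as needed):

for line in $(ls /sys/block/ | grep '^sd'); do
    echo 8192 > /sys/block/$line/queue/read_ahead_kb
    echo 1024 > /sys/block/$line/queue/nr_requests
done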

Also check for PCIe 3; those have higher bus speeds.



> -Original Message-
> Sent: 19 April 2021 20:57
> Subject: Re: [ceph-users] HBA vs caching Raid controller
> 
> 
> 
> >> For the background: we have many Perc H800+MD1200 [1] systems running
> >> with
> >> 10TB HDDs (raid0, read ahead, writeback cache).
> >> One server has LSI SAS3008 [0] instead of the Perc H800,
> >> which comes with 512MB RAM + BBU. On most servers latencies are
> around
> >> 4-12ms (average 6ms), on the system with the LSI controller we see
> >> 20-60ms (average 30ms) latency.
> >
> > How did you get these latencies? Then maybe I can show you what I have
> > with the SAS2308.
> 
> Via grafana->prometheus->ceph-mgr:
> 
> 
> 
> avg by (hostname) (ceph_osd_apply_latency_ms{dc="$place"} * on
> (ceph_daemon) group_left(hostname) 

[ceph-users] Re: HBA vs caching Raid controller

2021-04-19 Thread Nico Schottelius


Marc  writes:

>> For the background: we have many Perc H800+MD1200 [1] systems running
>> with
>> 10TB HDDs (raid0, read ahead, writeback cache).
>> One server has LSI SAS3008 [0] instead of the Perc H800,
>> which comes with 512MB RAM + BBU. On most servers latencies are around
>> 4-12ms (average 6ms), on the system with the LSI controller we see
>> 20-60ms (average 30ms) latency.
>
> How did you get these latencies? Then maybe I can show you what I have with
> the SAS2308.

Via grafana->prometheus->ceph-mgr:


avg by (hostname) (ceph_osd_apply_latency_ms{dc="$place"} * on
(ceph_daemon) group_left(hostname) ceph_osd_metadata{dc="$place"})


where $place = the data center name. I cross checked the numbers with
the OSDs using


ceph_osd_apply_latency_ms{dc="$place"}


which showed that all OSDs attached to that controller are in a similar
range, so the average above is not hiding "one bad osd".

Does that help?


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HBA vs caching Raid controller

2021-04-19 Thread Marc
> For the background: we have many Perc H800+MD1200 [1] systems running
> with
> 10TB HDDs (raid0, read ahead, writeback cache).
> One server has LSI SAS3008 [0] instead of the Perc H800,
> which comes with 512MB RAM + BBU. On most servers latencies are around
> 4-12ms (average 6ms), on the system with the LSI controller we see
> 20-60ms (average 30ms) latency.

How did you get these latencies? Then maybe I can show you what I have with the 
SAS2308.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io