Thank you again for reaching out.

Based on your feedback, we decided to try a few more benchmarks. We were
originally doing single-node testing with some internal applications and
this tool:
   - s3bench (https://github.com/igneous-systems/s3bench) to generate our
results.
It looks like the poor benchmarking methodology accounts for a large chunk
of the discrepancy we were seeing.

We have now set up COSBench with 200 workers. Performance is better, but
there is still a significant drop from the RADOS layer:
- With COSBench we are seeing ~2,700 IOP/s for 4K objects.
That is still roughly an 18x drop from the RADOS layer in terms of IOP/s.
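
For reference, below is a rough sketch of the kind of threaded 4K GET test
we can run directly against the RGW S3 endpoint as a sanity check on the
COSBench numbers. The endpoint, credentials, bucket and key names are
placeholders, and the bucket is assumed to be pre-filled with 4K objects:

    import time
    from concurrent.futures import ThreadPoolExecutor

    import boto3
    from botocore.client import Config

    THREADS = 64
    DURATION = 60        # seconds
    NUM_OBJECTS = 10000  # bucket assumed pre-filled with obj-0 .. obj-9999 (4K each)

    s3 = boto3.client(
        "s3",
        endpoint_url="http://rgw.example.internal:7480",  # placeholder RGW endpoint
        aws_access_key_id="ACCESS_KEY",                   # placeholder credentials
        aws_secret_access_key="SECRET_KEY",
        config=Config(max_pool_connections=THREADS),
    )

    def worker(tid):
        # Each thread walks its own slice of the key space until the deadline.
        ops = 0
        deadline = time.time() + DURATION
        while time.time() < deadline:
            key = "obj-%d" % ((tid + ops * THREADS) % NUM_OBJECTS)
            s3.get_object(Bucket="bench", Key=key)["Body"].read()
            ops += 1
        return ops

    with ThreadPoolExecutor(max_workers=THREADS) as pool:
        total = sum(pool.map(worker, range(THREADS)))

    print("4K GET IOP/s: %.0f" % (total / DURATION))

This measures the client side as much as RGW, but it gives us a second data
point that does not depend on the COSBench worker configuration.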

It would be good to understand which metrics we should look at in Ceph to
debug this, what to try next from a tuning perspective, and any other
benchmarks the community could suggest to help us narrow it down.
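
One thing we can already pull ourselves is the RGW perf counter dump from
its admin socket, roughly as sketched below (the socket path is a guess
based on a default deployment; the instance name will differ on our RGW
nodes). Pointers on which of those counters actually matter here would be
appreciated.

    import json
    import subprocess

    # Placeholder admin socket path; the actual rgw instance name will differ.
    # Run on the RGW node itself, as a user that can read the socket.
    SOCK = "/var/run/ceph/ceph-client.rgw.rgw1.asok"

    raw = subprocess.check_output(["ceph", "daemon", SOCK, "perf", "dump"])
    counters = json.loads(raw)

    # The "rgw" section has request counts plus get/put latency counters.
    for name, value in sorted(counters.get("rgw", {}).items()):
        print(name, value)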


Cheers,
Ravi



---

Ravi Patel, PhD
Machine Learning Systems Lead
Email: r...@kheironmed.com



On Thu, 18 Jul 2019 at 09:42, Paul Emmerich <paul.emmer...@croit.io> wrote:

>
>
> On Thu, Jul 18, 2019 at 3:44 AM Robert LeBlanc <rob...@leblancnet.us>
> wrote:
>
>> I'm pretty new to RGW, but I need to get max performance as well. Have
>> you tried moving your RGW metadata pools to NVMe? Carve out a bit of
>> NVMe space and then pin the pool to the SSD class in CRUSH, so that the
>> small metadata ops aren't on slow media.
>>
>
> no, don't do that:
>
> 1) a performance difference of 130 vs. 48k IOPS is not due to SSD vs.
> NVMe for metadata unless the SSD is absolute crap
> 2) the OSDs already have an NVMe DB device; it's much easier to use that
> directly than to partition the NVMes to create a separate partition as a
> normal OSD
>
>
> Assuming your NVMe disks are a reasonable size (30GB per OSD): put the
> metadata pools on the HDDs. It's better to have 48 OSDs with 4 NVMes behind
> them handling metadata than only 4 OSDs with SSDs.
>
> Running mons in VMs with a gigabit network is fine for small clusters and
> is not a performance problem.
>
>
> How are you benchmarking?
>
> Paul
>
>
>> ----------------
>> Robert LeBlanc
>> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>>
>>
>> On Wed, Jul 17, 2019 at 5:59 PM Ravi Patel <r...@kheironmed.com> wrote:
>>
>>> Hello,
>>>
>>> We have deployed a Ceph cluster and are trying to debug a massive drop
>>> in performance between the RADOS layer and the RGW layer.
>>>
>>> ## Cluster config
>>> 4 OSD nodes (12 drives each, NVMe journals, 1 SSD drive), 40GbE NIC
>>> 2 RGW nodes (DNS RR load balancing), 40GbE NIC
>>> 3 MON nodes, 1GbE NIC
>>>
>>> ## Pool configuration
>>> RGW data pool - replicated 3x, 4M stripe (HDD)
>>> RGW metadata pool - replicated 3x (SSD)
>>>
>>> ## Benchmarks
>>> 4K read performance using rados bench: ~48,000 IOP/s
>>> 4K read RGW performance via the S3 interface: ~130 IOP/s
>>>
>>> We are really trying to understand how to debug this issue. None of the
>>> nodes ever exceeds 15% CPU utilization and there is plenty of RAM. The one
>>> pathological issue in our cluster is that the MON nodes are currently VMs
>>> sitting behind a single 1GbE NIC. (We are in the process of moving them,
>>> but are unsure whether that will fix the issue.)
>>>
>>> What metrics should we be looking at to debug the RGW layer? Where do we
>>> need to look?
>>>
>>> ---
>>>
>>> Ravi Patel, PhD
>>> Machine Learning Systems Lead
>>> Email: r...@kheironmed.com
>>>
>>>
>>> *Kheiron Medical Technologies*
>>>
>>> kheironmed.com | supporting radiologists with deep learning
>>>
>>
>

-- 

*Kheiron Medical Technologies*

kheironmed.com | supporting radiologists with deep learning

Kheiron Medical Technologies Ltd. is a registered company in England and
Wales. This e-mail and its attachment(s) are intended for the above named
only and are confidential. If they have come to you in error then you must
take no action based upon them but contact us immediately. Any disclosure,
copying, distribution or any action taken or omitted to be taken in
reliance on it is prohibited and may be unlawful. Although this e-mail and
its attachments are believed to be free of any virus, it is the
responsibility of the recipient to ensure that they are virus free. If you
contact us by e-mail then we will store your name and address to facilitate
communications. Any statements contained herein are those of the individual
and not the organisation.

Registered number: 10184103. Registered office: RocketSpace, 40 Islington
High Street, London, N1 8EQ
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
