Hey Nick,

Thanks for taking the time to answer my questions. Some in-line comments below.
On Tue, May 3, 2016 at 10:51 AM, Nick Fisk <n...@fisk.me.uk> wrote:
> Hi Peter,
>
> > -----Original Message-----
> > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> > Peter Kerdisle
> > Sent: 02 May 2016 08:17
> > To: ceph-users@lists.ceph.com
> > Subject: [ceph-users] Erasure pool performance expectations
> >
> > Hi guys,
> >
> > I am currently testing the performance of RBD using a cache pool and a 4/2
> > erasure profile pool.
> >
> > I have two SSD cache servers (2 SSDs for journals, 7 SSDs for data) with
> > 2x10Gbit bonded each, and six OSD nodes with a 10Gbit public and 10Gbit
> > cluster network for the erasure pool (10x3TB without separate journals).
> > This is all on Jewel.
> >
> > What I would like to know is whether the performance I'm seeing is to be
> > expected, and if there is some way to test this in a more quantifiable way.
> >
> > Everything works as expected if the files are present on the cache pool;
> > however, when things need to be retrieved from the erasure pool I see
> > performance degradation. I'm trying to simulate real usage as much as
> > possible by retrieving files from the RBD volume over FTP from a client
> > server. What I'm seeing is that the FTP transfer will stall for seconds at
> > a time and then get some more data, which results in an average speed of
> > 200KB/s. From the cache this is closer to 10MB/s. Is this the expected
> > behaviour from an erasure coded tier with a cache in front?
>
> Unfortunately yes. The whole erasure/cache thing only really works well if
> the data in the EC tier is accessed only infrequently; otherwise the
> overhead of cache promotion/flushing quickly brings the cluster to its
> knees. However, it looks as though you are mainly doing reads, which means
> you can probably alter your cache settings to not promote so aggressively
> on reads, as reads can be proxied through to the EC tier instead of
> promoting. This should reduce the number of required cache promotions.

You are correct that reads have a lower priority for being cached; ideally they should only be promoted when they are accessed very frequently.

> Can you try setting min_read_recency_for_promote to something higher?

I looked into that setting before, but I must admit its exact purpose still eludes me. Would it be correct to simplify it as "min_read_recency_for_promote determines the number of times a piece of data would have to be read in a certain interval (set by hit_set_period) in order to promote it to the caching tier"?

> Also can you check what your hit_set_period and hit_set_count are currently
> set to.

hit_set_count is set to 1 and hit_set_period to 1800. What would increasing the hit_set_count do exactly? (I've put the commands I use to check and change these settings at the end of this mail.)

> > Right now I'm unsure how to scientifically test the performance of
> > retrieving files when there is a cache miss. If somebody could point me
> > towards a better way of doing that I would appreciate the help.
> >
> > Another thing is that I'm seeing a lot of messages popping up in dmesg on
> > my client server, on which the RBD volumes are mounted.
(IPs removed)

> > [685881.477383] libceph: osd50 :6800 socket closed (con state OPEN)
> > [685895.597733] libceph: osd54 :6808 socket closed (con state OPEN)
> > [685895.663971] libceph: osd54 :6808 socket closed (con state OPEN)
> > [685895.710424] libceph: osd54 :6808 socket closed (con state OPEN)
> > [685895.749417] libceph: osd54 :6808 socket closed (con state OPEN)
> > [685896.517778] libceph: osd54 :6808 socket closed (con state OPEN)
> > [685906.690445] libceph: osd74 :6824 socket closed (con state OPEN)
> >
> > Is this a symptom of something?
>
> These are just stale connections to the OSDs timing out after the idle
> period and are nothing to worry about.

Glad to hear that; I was afraid something might be wrong.

Thanks again.

Peter
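P.S. For anyone following this thread, this is roughly how I've been checking and changing the cache-tier settings discussed above. "cachepool" is just a placeholder for the name of my cache pool, and the numbers are examples rather than recommendations:

  # Show the current hit set and read-promotion settings on the cache tier
  ceph osd pool get cachepool hit_set_count
  ceph osd pool get cachepool hit_set_period
  ceph osd pool get cachepool min_read_recency_for_promote

  # Example: keep 4 hit sets of 10 minutes each, and only promote an object
  # on read once it appears in at least 2 of the recent hit sets
  ceph osd pool set cachepool hit_set_count 4
  ceph osd pool set cachepool hit_set_period 600
  ceph osd pool set cachepool min_read_recency_for_promote 2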