Ok,

So the good news is that RADOS appears to be doing well. I'd say the next step is to follow some of the recommendations here:

http://ceph.com/docs/master/radosgw/troubleshooting/

If you examine the objecter_requests and perf counters during your Cosbench write test, it might help explain where the requests are backing up. Another thing to look for (as noted at the above URL) is HTTP errors in the Apache logs (if relevant).
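
For example (a minimal sketch, assuming the default admin socket location and an RGW client name of client.radosgw.gateway; adjust both to match your setup):

ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok objecter_requests
ceph --admin-daemon /var/run/ceph/ceph-client.radosgw.gateway.asok perf dump

The first shows the requests currently in flight to the OSDs; the second dumps the gateway's perf counters.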

Other general thoughts: when you upgraded to Hammer, did you change the RGW configuration at all? Are you using civetweb now? Does the rgw.buckets pool have enough PGs?
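
A quick way to check the PG count, assuming the bucket data pool is named .rgw.buckets in your deployment (adjust the name if yours differs):

ceph osd pool get .rgw.buckets pg_num
ceph osd pool get .rgw.buckets pgp_num

With 50 OSDs, a data pool with only a handful of PGs would concentrate the write load on a few OSDs.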


Mark

On 07/21/2015 08:17 PM, Florent MONTHEL wrote:
Hi Mark

I'm getting something like 600 write IOPS on the EC pool and 800 write IOPS on the 3-replica pool with rados bench.

With radosgw I get 30-40 write IOPS with Cosbench (1 radosgw, and the same with 2) and the servers are nearly idle:
- 0.005 core for the radosgw process
- 0.01 core for the osd process

I don't know if we can have .rgw* pool locking or something like that with Hammer (or whether the situation is specific to me).

With a 100% read profile, the radosgw and Ceph servers work very well, with more than 6000 IOPS on one radosgw server:
- 7 cores for the radosgw process
- 1 core for each osd process
- 0.5 core for each Apache process

Thanks

Sent from my iPhone

On 14 Jul 2015, at 21:03, Mark Nelson <mnel...@redhat.com> wrote:

Hi Florent,

10x degradation is definitely unusual!  A couple of things to look at:

Are 8K rados bench writes to the rgw.buckets pool slow? You can check with something like:

rados -p rgw.buckets bench 30 write -t 256 -b 8192

You may also want to try targeting a specific RGW server to make sure the RR-DNS setup isn't interfering (at least while debugging). It may also be worth creating a new replicated pool and trying writes to that pool as well, to see whether there's much difference.
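
A rough sketch of that last test (the pool name and PG count here are just placeholders, pick values that fit your cluster):

ceph osd pool create rgwtest 128 128 replicated
rados -p rgwtest bench 30 write -t 256 -b 8192
ceph osd pool delete rgwtest rgwtest --yes-i-really-really-mean-it

That gives a direct comparison against the EC-backed rgw.buckets pool with the same 8K / 256-thread write pattern.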

Mark

On 07/14/2015 07:17 PM, Florent MONTHEL wrote:
Yes of course, thanks Mark.

Infrastructure: 5 servers with 10 SATA disks each (50 OSDs in total), 10Gb networking, EC 2+1 on the rgw.buckets pool, and 2 radosgw instances in an RR-DNS setup installed on 2 of the cluster servers.
No SSD drives used.

We're using Cosbench to run:
- 8k object size, 100% read with 256 workers: better results with Hammer
- 8k object size, 80% read / 20% write with 256 workers: real degradation between Firefly and Hammer (divided by something like 10)
- 8k object size, 100% write with 256 workers: real degradation between Firefly and Hammer (divided by something like 10)

Thanks

Sent from my iPhone

On 14 Jul 2015, at 19:57, Mark Nelson <mnel...@redhat.com> wrote:

On 07/14/2015 06:42 PM, Florent MONTHEL wrote:
Hi All,

I've just upgraded a Ceph cluster from Firefly 0.80.8 (Red Hat Ceph 1.2.3) to Hammer (Red Hat Ceph 1.3). Usage: radosgw with Apache 2.4.19 in MPM prefork mode.
I'm experiencing a huge write performance degradation just after the upgrade (Cosbench).

Have you already run performance tests comparing Hammer and Firefly?

No problem with read performance, which was amazing.

Hi Florent,

Can you talk a little bit about how your write tests are set up? How many concurrent IOs and what size? Also, do you see similar problems with rados bench?

We have done some testing and haven't seen significant performance degradation, except when switching to civetweb, which appears to perform deletes more slowly than what we saw with apache+fcgi.

Mark



Sent from my iPhone