[ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds
hi, everyone when I user rest bench testing RGW with cmd : rest-bench --access-key=ak --secret=sk --bucket=bucket --seconds=360 -t 200 -b 524288 --no-cleanup write I found when RGW call the method bucket_prepare_op is very slow. so I observed from 'dump_historic_ops',to see: { description: osd_op(client.4211.0:265984 .dir.default.4148.1 [call rgw.bucket_prepare_op] 3.b168f3d0 e37), received_at: 2014-07-03 11:07:02.465700, age: 308.315230, duration: 3.401743, type_data: [ commit sent; apply or cleanup, { client: client.4211, tid: 265984}, [ { time: 2014-07-03 11:07:02.465852, event: waiting_for_osdmap}, { time: 2014-07-03 11:07:02.465875, event: queue op_wq}, { time: 2014-07-03 11:07:03.729087, event: reached_pg}, { time: 2014-07-03 11:07:03.729120, event: started}, { time: 2014-07-03 11:07:03.729126, event: started}, { time: 2014-07-03 11:07:03.804366, event: waiting for subops from [19,9]}, { time: 2014-07-03 11:07:03.804431, event: commit_queued_for_journal_write}, { time: 2014-07-03 11:07:03.804509, event: write_thread_in_journal_buffer}, { time: 2014-07-03 11:07:03.934419, event: journaled_completion_queued}, { time: 2014-07-03 11:07:05.297282, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.297319, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.311217, event: op_applied}, { time: 2014-07-03 11:07:05.867384, event: op_commit finish lock}, { time: 2014-07-03 11:07:05.867385, event: op_commit}, { time: 2014-07-03 11:07:05.867424, event: commit_sent}, { time: 2014-07-03 11:07:05.867428, event: op_commit finish}, { time: 2014-07-03 11:07:05.867443, event: done}]]}]} so I find 2 performance degradation. one is from queue op_wq to reached_pg , anothor is from journaled_completion_queued to op_commit. and I must stess that there are so many ops write to one bucket object, so how to reduce Latency ? baijia...@126.com___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds
It looks like you're just putting in data faster than your cluster can handle (in terms of IOPS). The first big hole (queue_op_wq-reached_pg) is it sitting in a queue and waiting for processing. The second parallel blocks are 1) write_thread_in_journal_buffer-journaled_completion_queued, and that is again a queue while it's waiting to be written to disk, 2) waiting for subops from [19,9]-sub_op_commit_received(x2) is waiting for the replica OSDs to write the transaction to disk. You might be able to tune it a little, but right now bucket indices live in one object, so every write has to touch the same set of OSDs (twice! to mark an object as putting, and put). 2*3/360 = 166, which is probably past what those disks can do, and artificially increasing the latency. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Jul 2, 2014 at 11:24 PM, baijia...@126.com baijia...@126.com wrote: hi, everyone when I user rest bench testing RGW with cmd : rest-bench --access-key=ak --secret=sk --bucket=bucket --seconds=360 -t 200 -b 524288 --no-cleanup write I found when RGW call the method bucket_prepare_op is very slow. so I observed from 'dump_historic_ops',to see: { description: osd_op(client.4211.0:265984 .dir.default.4148.1 [call rgw.bucket_prepare_op] 3.b168f3d0 e37), received_at: 2014-07-03 11:07:02.465700, age: 308.315230, duration: 3.401743, type_data: [ commit sent; apply or cleanup, { client: client.4211, tid: 265984}, [ { time: 2014-07-03 11:07:02.465852, event: waiting_for_osdmap}, { time: 2014-07-03 11:07:02.465875, event: queue op_wq}, { time: 2014-07-03 11:07:03.729087, event: reached_pg}, { time: 2014-07-03 11:07:03.729120, event: started}, { time: 2014-07-03 11:07:03.729126, event: started}, { time: 2014-07-03 11:07:03.804366, event: waiting for subops from [19,9]}, { time: 2014-07-03 11:07:03.804431, event: commit_queued_for_journal_write}, { time: 2014-07-03 11:07:03.804509, event: write_thread_in_journal_buffer}, { time: 2014-07-03 11:07:03.934419, event: journaled_completion_queued}, { time: 2014-07-03 11:07:05.297282, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.297319, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.311217, event: op_applied}, { time: 2014-07-03 11:07:05.867384, event: op_commit finish lock}, { time: 2014-07-03 11:07:05.867385, event: op_commit}, { time: 2014-07-03 11:07:05.867424, event: commit_sent}, { time: 2014-07-03 11:07:05.867428, event: op_commit finish}, { time: 2014-07-03 11:07:05.867443, event: done}]]}]} so I find 2 performance degradation. one is from queue op_wq to reached_pg , anothor is from journaled_completion_queued to op_commit. and I must stess that there are so many ops write to one bucket object, so how to reduce Latency ? baijia...@126.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds
I find that the function of OSD::OpWQ::_process use pg-lock lock the whole function.so this mean that osd threads can't handle op which write for the same object. though add log to the ReplicatedPG::op_commit , I find pg lock cost long time sometimes. but I don't know where lock pg . where lock pg for a long time? thanks baijia...@126.com From: Gregory Farnum Date: 2014-07-04 01:02 To: baijia...@126.com CC: ceph-users Subject: Re: [ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds It looks like you're just putting in data faster than your cluster can handle (in terms of IOPS). The first big hole (queue_op_wq-reached_pg) is it sitting in a queue and waiting for processing. The second parallel blocks are 1) write_thread_in_journal_buffer-journaled_completion_queued, and that is again a queue while it's waiting to be written to disk, 2) waiting for subops from [19,9]-sub_op_commit_received(x2) is waiting for the replica OSDs to write the transaction to disk. You might be able to tune it a little, but right now bucket indices live in one object, so every write has to touch the same set of OSDs (twice! to mark an object as putting, and put). 2*3/360 = 166, which is probably past what those disks can do, and artificially increasing the latency. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Jul 2, 2014 at 11:24 PM, baijia...@126.com baijia...@126.com wrote: hi, everyone when I user rest bench testing RGW with cmd : rest-bench --access-key=ak --secret=sk --bucket=bucket --seconds=360 -t 200 -b 524288 --no-cleanup write I found when RGW call the method bucket_prepare_op is very slow. so I observed from 'dump_historic_ops',to see: { description: osd_op(client.4211.0:265984 .dir.default.4148.1 [call rgw.bucket_prepare_op] 3.b168f3d0 e37), received_at: 2014-07-03 11:07:02.465700, age: 308.315230, duration: 3.401743, type_data: [ commit sent; apply or cleanup, { client: client.4211, tid: 265984}, [ { time: 2014-07-03 11:07:02.465852, event: waiting_for_osdmap}, { time: 2014-07-03 11:07:02.465875, event: queue op_wq}, { time: 2014-07-03 11:07:03.729087, event: reached_pg}, { time: 2014-07-03 11:07:03.729120, event: started}, { time: 2014-07-03 11:07:03.729126, event: started}, { time: 2014-07-03 11:07:03.804366, event: waiting for subops from [19,9]}, { time: 2014-07-03 11:07:03.804431, event: commit_queued_for_journal_write}, { time: 2014-07-03 11:07:03.804509, event: write_thread_in_journal_buffer}, { time: 2014-07-03 11:07:03.934419, event: journaled_completion_queued}, { time: 2014-07-03 11:07:05.297282, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.297319, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.311217, event: op_applied}, { time: 2014-07-03 11:07:05.867384, event: op_commit finish lock}, { time: 2014-07-03 11:07:05.867385, event: op_commit}, { time: 2014-07-03 11:07:05.867424, event: commit_sent}, { time: 2014-07-03 11:07:05.867428, event: op_commit finish}, { time: 2014-07-03 11:07:05.867443, event: done}]]}]} so I find 2 performance degradation. one is from queue op_wq to reached_pg , anothor is from journaled_completion_queued to op_commit. and I must stess that there are so many ops write to one bucket object, so how to reduce Latency ? baijia...@126.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds
I put .rgw.buckets.index pool to SSD osd,bucket object must write to the SSD, and disk use ratio less than 50%. so I don't think disk is bottleneck baijia...@126.com From: baijia...@126.com Date: 2014-07-04 01:29 To: Gregory Farnum CC: ceph-users Subject: Re: Re: [ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds I find that the function of OSD::OpWQ::_process use pg-lock lock the whole function.so this mean that osd threads can't handle op which write for the same object. though add log to the ReplicatedPG::op_commit , I find pg lock cost long time sometimes. but I don't know where lock pg . where lock pg for a long time? thanks baijia...@126.com From: Gregory Farnum Date: 2014-07-04 01:02 To: baijia...@126.com CC: ceph-users Subject: Re: [ceph-users] RGW performance test , put 30 thousands objects to one bucket, average latency 3 seconds It looks like you're just putting in data faster than your cluster can handle (in terms of IOPS). The first big hole (queue_op_wq-reached_pg) is it sitting in a queue and waiting for processing. The second parallel blocks are 1) write_thread_in_journal_buffer-journaled_completion_queued, and that is again a queue while it's waiting to be written to disk, 2) waiting for subops from [19,9]-sub_op_commit_received(x2) is waiting for the replica OSDs to write the transaction to disk. You might be able to tune it a little, but right now bucket indices live in one object, so every write has to touch the same set of OSDs (twice! to mark an object as putting, and put). 2*3/360 = 166, which is probably past what those disks can do, and artificially increasing the latency. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Jul 2, 2014 at 11:24 PM, baijia...@126.com baijia...@126.com wrote: hi, everyone when I user rest bench testing RGW with cmd : rest-bench --access-key=ak --secret=sk --bucket=bucket --seconds=360 -t 200 -b 524288 --no-cleanup write I found when RGW call the method bucket_prepare_op is very slow. so I observed from 'dump_historic_ops',to see: { description: osd_op(client.4211.0:265984 .dir.default.4148.1 [call rgw.bucket_prepare_op] 3.b168f3d0 e37), received_at: 2014-07-03 11:07:02.465700, age: 308.315230, duration: 3.401743, type_data: [ commit sent; apply or cleanup, { client: client.4211, tid: 265984}, [ { time: 2014-07-03 11:07:02.465852, event: waiting_for_osdmap}, { time: 2014-07-03 11:07:02.465875, event: queue op_wq}, { time: 2014-07-03 11:07:03.729087, event: reached_pg}, { time: 2014-07-03 11:07:03.729120, event: started}, { time: 2014-07-03 11:07:03.729126, event: started}, { time: 2014-07-03 11:07:03.804366, event: waiting for subops from [19,9]}, { time: 2014-07-03 11:07:03.804431, event: commit_queued_for_journal_write}, { time: 2014-07-03 11:07:03.804509, event: write_thread_in_journal_buffer}, { time: 2014-07-03 11:07:03.934419, event: journaled_completion_queued}, { time: 2014-07-03 11:07:05.297282, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.297319, event: sub_op_commit_rec}, { time: 2014-07-03 11:07:05.311217, event: op_applied}, { time: 2014-07-03 11:07:05.867384, event: op_commit finish lock}, { time: 2014-07-03 11:07:05.867385, event: op_commit}, { time: 2014-07-03 11:07:05.867424, event: commit_sent}, { time: 2014-07-03 11:07:05.867428, event: op_commit finish}, { time: 2014-07-03 11:07:05.867443, event: done}]]}]} so I find 2 performance degradation. one is from queue op_wq to reached_pg , anothor is from journaled_completion_queued to op_commit. and I must stess that there are so many ops write to one bucket object, so how to reduce Latency ? baijia...@126.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com