
I'm experiencing poor RadosGW performance when operating on a bucket
with many objects. That's a known issue with LevelDB and can be
partially resolved using sharding, but I have one more idea. As I see in
the Ceph OSD logs, all slow requests occur while making a call to
rgw.bucket_list:

2016-02-17 03:17:56.846694 7f5396f63700  0 log_channel(cluster) log
[WRN] : slow request 30.272904 seconds old, received at 2016-02-17
03:17:26.573742: osd_op(client.12611484.0:15137332 .dir.default.4162.3
[call rgw.bucket_list] 9.2955279 ack+read+known_if_redirected e3252)
currently started

I don't know exactly how Ceph works internally, but maybe the data
required to return results for rgw.bucket_list could be cached for some
time. The cache TTL would be configurable, and caching could be disabled
to keep the current behaviour. There can be 3 cases when there's a call
to rgw.bucket_list:
1. no cached data
2. up-to-date cache
3. outdated cache

Ad 1. The first call starts generating the full list. All new requests
are put on hold. When the list is ready, it's saved to the cache.
Ad 2. All calls are served from the cache.
Ad 3. The first request starts generating the full list. All new
requests are served from the outdated cache until the new cached data is
ready.
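
To make the three cases concrete, here is a minimal Python sketch of the
idea. It is not Ceph code and says nothing about how RadosGW or the OSD
would implement it: ListingCache, loader, ttl and clock are hypothetical
names, and loader stands in for whatever expensive work produces the
full bucket listing.

```python
import threading
import time


class ListingCache:
    """Hypothetical TTL cache in front of an expensive listing call."""

    def __init__(self, loader, ttl, clock=time.monotonic):
        self._loader = loader        # stand-in for the expensive listing
        self._ttl = ttl              # seconds; 0 or less disables caching
        self._clock = clock
        self._lock = threading.Lock()
        self._ready = threading.Condition(self._lock)
        self._value = None
        self._stamp = None           # when the cached value was stored
        self._refreshing = False

    def get(self):
        if self._ttl <= 0:
            return self._loader()    # cache disabled: same as no cache
        with self._lock:
            while True:
                if self._stamp is not None:
                    fresh = self._clock() - self._stamp < self._ttl
                    if fresh or self._refreshing:
                        # Case 2, or case 3 while another request refreshes:
                        # serve what is cached, even if it is outdated.
                        return self._value
                    break            # case 3, first request: refresh
                if not self._refreshing:
                    break            # case 1, first request: generate
                self._ready.wait()   # case 1, later requests: put on hold
            self._refreshing = True
        try:
            value = self._loader()   # runs without holding the lock
        except BaseException:
            with self._lock:
                self._refreshing = False
                self._ready.notify_all()
            raise
        with self._lock:
            self._value = value
            self._stamp = self._clock()
            self._refreshing = False
            self._ready.notify_all()
        return value
```

With ttl set to 0 every call goes straight to loader, which matches the
"disabled" setting described above.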

This can be optimized further by periodically regenerating the cache
even before it expires, to reduce the cases where the cache is outdated.
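
The periodic variant could look roughly like the following Python
sketch, again purely illustrative and not based on any existing Ceph
mechanism. PeriodicRefresher, loader and interval are hypothetical
names; a background thread regenerates the value on a fixed interval so
that readers never trigger the expensive call themselves.

```python
import threading


class PeriodicRefresher:
    """Hypothetical helper that regenerates a cached value in the background."""

    def __init__(self, loader, interval):
        self._loader = loader        # stand-in for the expensive listing
        self._interval = interval    # seconds between refreshes
        self._value = loader()       # initial fill, done synchronously
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def _run(self):
        # Event.wait returns False on timeout and True once stop() is called.
        while not self._stop.wait(self._interval):
            self._value = self._loader()

    def get(self):
        return self._value           # always served from the cache

    def stop(self):
        self._stop.set()
        self._thread.join()
```

The trade-off is that the listing is regenerated even when nobody asks
for it, so the interval would have to be chosen with the cost of one
full listing in mind.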

Maybe this idea is stupid, maybe not, but if it's doable it would be
nice to have the choice.

Kind regards -
Krzysztof Księżyk

ceph-users mailing list
