Just to close the loop on this one in case someone reads this in the future. We were encountering this bug:

https://tracker.ceph.com/issues/44671

Updating to 14.2.20 solved it.

Cheers,
Dylan

On Thu, 2021-04-15 at 18:48 +0000, Dylan Griff wrote:
>
> Just some more info on this: it started happening after they added several
> thousand objects to their buckets. While the client side times out, the
> operation seems to proceed in ceph for a very long time, happily working
> away getting the stat info for their objects. It doesn't appear to be
> failing, just taking an extremely long time. This doesn't seem right to me,
> but can someone confirm that they can run an account-level stat with swift
> on a user with several thousand buckets/objects?
>
> Any info would be helpful!
>
> Cheers,
> Dylan
>
> On Tue, 2021-04-13 at 21:50 +0000, Dylan Griff wrote:
> > Hey folks!
> >
> > We have a user with ~1900 buckets in our RGW service, and running this
> > stat command results in a timeout for them:
> >
> > swift -A https://<URL>:443/auth/1.0 -U <UID> -K <KEY> stat
> >
> > Running the same command, but specifying one of their buckets, returns
> > promptly. Running the command for a different user with minimal buckets
> > returns promptly as well. Turning up debug logging to 20 for rgw resulted
> > in a great deal of logs showing:
> >
> > 20 reading from default.rgw.meta:root:.bucket.meta.<BUCKET-ID>
> > 20 get_system_obj_state: rctx=0x559b32a6b570 obj=default.rgw.meta:root:.bucket.meta.<BUCKET-ID> state=0x559b32c37e40 s->prefetch_data=0
> > 10 cache get: name=default.rgw.meta+root+.bucket.meta.<BUCKET-ID> : hit (requested=0x16, cached=0x17)
> > 20 get_system_obj_state: s->obj_tag was set empty
> > 10 cache get: name=default.rgw.meta+root+.bucket.meta.<BUCKET-ID> : hit (requested=0x11, cached=0x17)
> >
> > To me, this looks like it is iterating over all their stuff and getting
> > the state of each item. My question: is ~1900 an unreasonable number of
> > buckets, such that we should expect this full-account 'stat' command to
> > time out? Or should I expect it to still return promptly? Thanks!
> >
> > Cheers,
> > Dylan
> > --
> > Dylan Griff
> > Senior System Administrator
> > CLE D063
> > RCS - Systems - University of Victoria
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
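
For anyone trying to confirm the same pattern in their own logs, here is a minimal sketch that counts how many times each bucket metadata object is read in a debug-20 rgw log. It assumes the log lines look like the "reading from default.rgw.meta:root:.bucket.meta.<BUCKET-ID>" excerpt quoted in this thread; the function name and the sample bucket IDs are hypothetical, and the pattern may need adjusting for other pool names or Ceph versions.

```python
import re
from collections import Counter

# Pattern assumed from the log excerpt in this thread:
#   "20 reading from default.rgw.meta:root:.bucket.meta.<BUCKET-ID>"
# The pool name is matched loosely; the bucket ID is whatever follows
# ".bucket.meta." up to the next whitespace.
BUCKET_META_READ = re.compile(r"reading from \S+:root:\.bucket\.meta\.(\S+)")


def count_bucket_meta_reads(lines):
    """Return a Counter mapping bucket ID -> number of metadata reads seen."""
    counts = Counter()
    for line in lines:
        match = BUCKET_META_READ.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; real bucket IDs will differ.
    sample = [
        "20 reading from default.rgw.meta:root:.bucket.meta.bucket-a:id1",
        "20 get_system_obj_state: s->obj_tag was set empty",
        "20 reading from default.rgw.meta:root:.bucket.meta.bucket-b:id2",
        "20 reading from default.rgw.meta:root:.bucket.meta.bucket-a:id1",
    ]
    reads = count_bucket_meta_reads(sample)
    print(f"distinct buckets read: {len(reads)}")
    for bucket_id, n in reads.most_common():
        print(f"{bucket_id}: {n}")
```

If the number of distinct bucket IDs seen during a single account-level stat is close to the user's bucket count, that is consistent with the request walking every bucket's metadata, as described above.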