Public bug reported:

[ Impact ]

Improper locking in ObjectCache::get() allows for a race condition when
accessing the cache map. This can cause the RGW service to crash, see:
issue#51927 [0].

```
    "backtrace": [
        "(()+0x46210) [0x7f7c25b8c210]",
        "(ceph::buffer::v15_2_0::ptr::ptr(ceph::buffer::v15_2_0::ptr 
const&)+0x17) [0x7f7c25888bb7]",
        
"(ceph::buffer::v15_2_0::ptr_node::cloner::operator()(ceph::buffer::v15_2_0::ptr_node
 const&)+0x2d) [0x7f7c2588bf0d]",
        "(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >* std::_Rb_tree<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const, ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > 
>::_M_copy<std::_Rb_tree<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const, ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > 
>::_Reuse_or_alloc_node>(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char,
 std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > const*, std::_Rb_tree_node_base*, 
std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >::_Reuse_or_alloc_node&)+0x4c3) 
[0x7f7c2623b8c3]",
        "(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >* std::_Rb_tree<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const, ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > 
>::_M_copy<std::_Rb_tree<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const, ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > 
>::_Reuse_or_alloc_node>(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char,
 std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > const*, std::_Rb_tree_node_base*, 
std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >::_Reuse_or_alloc_node&)+0x182) 
[0x7f7c2623b582]",
        "(std::_Rb_tree<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const, ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > 
>::operator=(std::_Rb_tree<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const, ceph::buffer::v15_2_0::list>, 
std::_Select1st<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> >, std::less<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > > const&)+0x97) [0x7f7c2623bb57]",
        "(ObjectCache::get(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&, ObjectCacheInfo&, 
unsigned int, rgw_cache_entry_info*)+0x4fe) [0x7f7c2637085e]",
        "(RGWSI_SysObj_Cache::raw_stat(rgw_raw_obj const&, unsigned long*, 
std::chrono::time_point<ceph::time_detail::real_clock, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, unsigned 
long*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, ceph::buffer::v15_2_0::list, 
std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >*, ceph::buffer::v15_2_0::list*, 
RGWObjVersionTracker*, optional_yield)+0x23c) [0x7f7c267c9d1c]",
        "(RGWSI_SysObj_Core::get_system_obj_state_impl(RGWSysObjectCtxBase*, 
rgw_raw_obj const&, RGWSysObjState**, RGWObjVersionTracker*, 
optional_yield)+0x7f8) [0x7f7c262d9078]",
        "(RGWSI_SysObj_Core::get_system_obj_state(RGWSysObjectCtxBase*, 
rgw_raw_obj const&, RGWSysObjState**, RGWObjVersionTracker*, 
optional_yield)+0x4a) [0x7f7c262d9ada]",
        "(RGWSI_SysObj_Core::stat(RGWSysObjectCtxBase&, 
RGWSI_SysObj_Obj_GetObjState&, rgw_raw_obj const&, 
std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, ceph::buffer::v15_2_0::list, 
std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >*, bool, 
std::chrono::time_point<ceph::time_detail::real_clock, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, unsigned 
long*, RGWObjVersionTracker*, optional_yield)+0x6a) [0x7f7c262d9b6a]",
        "(RGWSI_SysObj::Obj::ROp::stat(optional_yield)+0x66) [0x7f7c262d0506]",
        "(rgw_get_system_obj(RGWSysObjectCtx&, rgw_pool const&, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
const&, ceph::buffer::v15_2_0::list&, RGWObjVersionTracker*, 
std::chrono::time_point<ceph::time_detail::real_clock, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, 
optional_yield, std::map<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, 
std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >*, rgw_cache_entry_info*, 
boost::optional<obj_version>)+0x1ef) [0x7f7c266e16ef]",
        "(RGWSI_MetaBackend_SObj::get_entry(RGWSI_MetaBackend::Context*, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
const&, RGWSI_MetaBackend::GetParams&, RGWObjVersionTracker*, 
optional_yield)+0xfb) [0x7f7c267bab1b]",
        
"(RGWSI_Bucket_SObj::do_read_bucket_instance_info(ptr_wrapper<RGWSI_MetaBackend::Context,
 4>&, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const&, RGWBucketInfo*, 
std::chrono::time_point<ceph::time_detail::real_clock, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, 
std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, ceph::buffer::v15_2_0::list, 
std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >*, rgw_cache_entry_info*, 
boost::optional<obj_version>, optional_yield)+0x1a8) [0x7f7c2679d718]",
        
"(RGWSI_Bucket_SObj::read_bucket_instance_info(ptr_wrapper<RGWSI_MetaBackend::Context,
 4>&, std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > const&, RGWBucketInfo*, 
std::chrono::time_point<ceph::time_detail::real_clock, 
std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> > >*, 
std::map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, ceph::buffer::v15_2_0::list, 
std::less<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
ceph::buffer::v15_2_0::list> > >*, optional_yield, rgw_cache_entry_info*, 
boost::optional<obj_version>)+0x637) [0x7f7c2679df17]",
        "(()+0x5f2035) [0x7f7c2632a035]",
        "(std::_Function_handler<int (RGWSI_MetaBackend_Handler::Op*), 
RGWBucketInstanceMetadataHandler::call(std::optional<std::variant<RGWSI_MetaBackend_CtxParams_SObj>
 >, std::function<int (ptr_wrapper<RGWSI_MetaBackend::Context, 
4>&)>)::{lambda(RGWSI_MetaBackend_Handler::Op*)#1}>::_M_invoke(std::_Any_data 
const&, RGWSI_MetaBackend_Handler::Op*&&)+0x36) [0x7f7c26349466]",
        "(()+0xa807de) [0x7f7c267b87de]",
        
"(RGWSI_MetaBackend_SObj::call(std::optional<std::variant<RGWSI_MetaBackend_CtxParams_SObj>
 >, std::function<int (RGWSI_MetaBackend::Context*)>)+0x9e) [0x7f7c267bb3fe]",
        
"(RGWSI_MetaBackend_Handler::call(std::optional<std::variant<RGWSI_MetaBackend_CtxParams_SObj>
 >, std::function<int (RGWSI_MetaBackend_Handler::Op*)>)+0x5f) 
[0x7f7c267b85bf]",
        "(RGWBucketCtl::read_bucket_info(rgw_bucket const&, RGWBucketInfo*, 
optional_yield, RGWBucketCtl::BucketInstance::GetParams const&, 
RGWObjVersionTracker*)+0x1fb) [0x7f7c2632888b]",
        "(rgw_build_bucket_policies(rgw::sal::RGWRadosStore*, 
req_state*)+0xcc0) [0x7f7c265ac0d0]",
        "(RGWHandler::do_init_permissions()+0x32) [0x7f7c265ad832]",
        "(RGWHandler_REST::init_permissions(RGWOp*)+0x178) [0x7f7c2665ff98]",
        "(rgw_process_authenticated(RGWHandler_REST*, RGWOp*&, RGWRequest*, 
req_state*, bool)+0x248) [0x7f7c261e1e28]",
        "(process_request(rgw::sal::RGWRadosStore*, RGWREST*, RGWRequest*, 
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > 
const&, rgw::auth::StrategyRegistry const&, RGWRestfulIO*, OpsLogSocket*, 
optional_yield, rgw::dmclock::Scheduler*, int*)+0x2d34) [0x7f7c261e78a4]",
        "(()+0x3dfb53) [0x7f7c26117b53]",
        "(()+0x3e1051) [0x7f7c26119051]",
        "(()+0x3e11ac) [0x7f7c261191ac]",
        "(make_fcontext()+0x2f) [0x7f7c268c439f]" 
    ],
```

The fix ensures that the unique_lock for writes remains held until the
function returns, preventing unsafe modifications.

This fix was not backported to Octopus because the release reached EOL
upstream and no further updates were accepted [1]. This fix was first
applied in v16.2.8 and has been stable in production deployments since
2022-06-21, with no reported regressions or issues in the field.

[ Test Plan ]

This issue is a race condition, making it rare and difficult to
reproduce reliably.

It is more likely to manifest under high concurrency, when multiple
clients are accessing and modifying the object cache simultaneously.

To increase the likelihood of triggering this issue, concurrent fio S3
benchmarking tests will be run [3].

[ Where problems could occur ]

This change modifies the locking mechanism within ObjectCache::get(),
with potential risks including deadlocks, performance regressions due to
increased contention, or unintended behavior from incorrect lock
promotion.

Regression symptoms could include increased request latency, unexpected
thread contention, or crashes if deadlocks occur. Testing will focus on
performance benchmarks and stress testing under high concurrency.

[ Other Info ]

The fix follows C++ best practices for locking (RAII with
std::defer_lock) and is consistent with locking patterns used elsewhere
in Ceph.

[0] https://tracker.ceph.com/issues/52800
[1] https://tracker.ceph.com/issues/53071
[2] https://github.com/axboe/fio/blob/master/examples/http-s3.fio

** Affects: ceph (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: ceph (Ubuntu Focal)
     Importance: Undecided
         Status: New

** Also affects: ceph (Ubuntu Focal)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2100567

Title:
  [SRU] rgw: fix lock scope in ObjectCache::get()

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ceph/+bug/2100567/+subscriptions


-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to