I did the minor point-release update from 10.0.2 to 10.0.4 and found that my
cinder volume services would go out to lunch during startup. They would send
their initial heartbeat, then get marked as dead, never sending another
heartbeat. The process was running and there were constant logs about Ceph
connections, but what was missing was the follow-up to "Initializing RPC
dependent components of volume driver RBDDriver (1.2.0)". It never finished
the RPC init with "Driver post RPC initialization completed successfully."
Digging in a little with my limited knowledge of the Python librbd bindings,
it seems that this commit landed in 10.0.4:
https://github.com/openstack/cinder/commit/e72dead5ce085a6ba66f7aad2ff58061842f43d2
Instead of looping over the volume size for every volume, it now loops over
all the volumes, calling diff_iterate from offset 0 to the end of each one.
Near as I can tell, this actually calls whatever you pass in as iterate_cb for
every used extent of the volume. So a handful of empty volumes is no problem,
but in production, by my count, I would have to call iterate_cb 12.6M times
just to add up the bytes used from each extent. I've filed a bug:
https://bugs.launchpad.net/cinder/+bug/1708507
Downgrading to 10.0.2 seems to be an OK workaround.
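
To show why the per-extent callback adds up, here is a minimal sketch of the
pattern as I understand it. It is not the actual cinder code. FakeImage is a
hypothetical stand-in for rbd.Image so the example runs without a Ceph
cluster, and the volume sizes and extent layout are made up. The callback
shape (offset, length, exists) is meant to mirror what the Python librbd
binding passes to iterate_cb.

```python
# Hypothetical stand-in for rbd.Image; not part of cinder or librbd.
class FakeImage:
    def __init__(self, size, extents):
        self._size = size
        self._extents = extents  # list of (offset, length) used extents

    def size(self):
        return self._size

    def diff_iterate(self, offset, length, from_snapshot, iterate_cb):
        # Call iterate_cb(offset, length, exists) once per extent that
        # lies inside the requested range.
        for ext_offset, ext_length in self._extents:
            if offset <= ext_offset and ext_offset + ext_length <= offset + length:
                iterate_cb(ext_offset, ext_length, True)


def used_bytes_via_diff_iterate(images):
    """Sum used bytes by walking every extent; returns (total, callback_calls)."""
    total = 0
    calls = 0

    def iterate_cb(offset, length, exists):
        nonlocal total, calls
        calls += 1
        if exists:
            total += length

    for image in images:
        image.diff_iterate(0, image.size(), None, iterate_cb)
    return total, calls


def provisioned_bytes(images):
    """Sum provisioned sizes: one cheap call per volume."""
    return sum(image.size() for image in images)


MiB = 1024 * 1024
# Three made-up 100 MiB volumes with 0, 10, and 25 used 4 MiB extents.
images = [
    FakeImage(100 * MiB, [(i * 4 * MiB, 4 * MiB) for i in range(n)])
    for n in (0, 10, 25)
]

total, calls = used_bytes_via_diff_iterate(images)
print(calls, total // MiB, provisioned_bytes(images) // MiB)
```

The number of callback invocations grows with the number of used extents
across the whole pool, while summing provisioned sizes costs one call per
volume no matter how full the volumes are.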

TL;DR: if you have Ceph, don't upgrade past 10.0.2 for the time being.
_______________________________________________
OpenStack-operators mailing list
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
