Francesco Romani has posted comments on this change. Change subject: vm: enable disk stats collection on recovery ......................................................................
Patch Set 1: (1 comment) http://gerrit.ovirt.org/#/c/32406/1/vdsm/virt/vm.py File vdsm/virt/vm.py: Line 2895: else: Line 2896: for drive in devices[DISK_DEVICES]: Line 2897: if drive['device'] == 'disk' and isVdsmImage(drive): Line 2898: self.sdIds.append(drive['domainID']) Line 2899: self.startDisksStatsCollection() > But note that in prepare path, we prepare the volumes, and startDiskStatsCo Good point. I was under the impression something deeper was lurking behind, and turns out is exactly like that. I was aware of the startDisksStatsCollection being a liar, and about the flag, but definitely not about the irs thing, thanks for the insight. The impact of the current behaviour is: * new VMs will enable disk stats independently, so this affects only recovered VMs. * succesfull live snapshot commands will eventually re-enable stats gathering, so to do a snapshot on a recovered VM should incidentally fix this issue * same for the disk replication (more specifically, diskReplicateFinish). * the disk stats reporting becomes broken, so engine is fed with stale data or no data. The effects depend on how the user consumes this data. * last, the big one: highWrite, thus the high water checks, becomes a NOP if disk stats are stopped. Not good. I think there is a reasonably clean and correct way forward, which is the following * since we don't have callbacks, we have to resort to polling. But stats already does polling; we want to make this smarter and more efficient, but we're not going to change that. * since as you said IRS reports when it's ready, it may be sufficent to replace this check def _highWrite(self): if not self._vm.isDisksStatsCollectionEnabled(): # Avoid queries from storage during recovery process return # ... with something which takes in account the IRS readyness flag. Something like (pseudocode) def _highWrite(self): disk_stats_enabled = False # local caching to avoid to access irs every time from different threads and avoid # contention, will be dropped if this is no issue if not disk_stats_enabled: disk_stats_enabled = self._vm.isDiskStatsCollectionEnabled() and self._vm.cif.irs.ready # ... (no other changes) This way, all the disk samplings will automatically reenable themselves as soon as possible, leveraging existing informations. I believe this could be a good fix not only for the short term. What do you think? Line 2900: Line 2901: for devType, devClass in self.DeviceMapping: Line 2902: for dev in devices[devType]: Line 2903: self._devices[devType].append(devClass(self.conf, self.log, -- To view, visit http://gerrit.ovirt.org/32406 To unsubscribe, visit http://gerrit.ovirt.org/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ibd7d32373f5a0a7dd4ef88c5a0ad5a9f3d52938c Gerrit-PatchSet: 1 Gerrit-Project: vdsm Gerrit-Branch: master Gerrit-Owner: Francesco Romani <[email protected]> Gerrit-Reviewer: Dan Kenigsberg <[email protected]> Gerrit-Reviewer: Francesco Romani <[email protected]> Gerrit-Reviewer: Michal Skrivanek <[email protected]> Gerrit-Reviewer: Nir Soffer <[email protected]> Gerrit-Reviewer: Vinzenz Feenstra <[email protected]> Gerrit-Reviewer: [email protected] Gerrit-Reviewer: oVirt Jenkins CI Server Gerrit-HasComments: Yes _______________________________________________ vdsm-patches mailing list [email protected] https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches
