Francesco Romani has posted comments on this change.

Change subject: virt: support per-vm stats age
......................................................................


Patch Set 32:

(9 comments)

https://gerrit.ovirt.org/#/c/38066/32/vdsm/virt/sampling.py
File vdsm/virt/sampling.py:

Line 362:             return None
Line 363:         _, last_sample = self._samples[-1]
Line 364:         return last_sample
Line 365: 
Line 366:     def values(self):
not needed anymore, will remove
Line 367:         """
Line 368:         Return the raw collected values.
Line 369:         """
Line 370:         if len(self._samples) < 2:


Line 424:     Provide facilities to retrieve per-vm samples,
Line 425:     and the glue code to deal with disappearing per-vm samples.
Line 426:     """
Line 427: 
Line 428:     EMPTY = StatsSample(None, None, None, None)  # 'null' sample
> Should be in the same namespace as StatsSample imho.
will move
Line 429: 
Line 430:     _log = logging.getLogger("sampling.StatsCache")
Line 431: 
Line 432:     def __init__(self, clock=utils.monotonic_time):


Line 435:         self._last_sample_time = 0
Line 436:         self._vm_last_timestamp = defaultdict(int)
Line 437:         self._lock = threading.Lock()
Line 438: 
Line 439:     def clock(self):
> Making it a @property would be nicer from pythonic standpoint.
probably yes, will play with it
Line 440:         """
Line 441:         Provide timestamp compatible with what put() expects
Line 442:         """
Line 443:         return self._clock()


Line 448:         """
Line 449:         with self._lock:
Line 450:             try:
Line 451:                 del self._vm_last_timestamp[vmid]
Line 452:             except KeyError:
do not swallow KeyError here
Line 453:                 pass
Line 454: 
Line 455:     def register(self, vmid):
Line 456:         """


Line 451:                 del self._vm_last_timestamp[vmid]
Line 452:             except KeyError:
Line 453:                 pass
Line 454: 
Line 455:     def register(self, vmid):
unneeded since we use defaultdict
Line 456:         """
Line 457:         Warm up the cache for `vmid'
Line 458:         """
Line 459:         with self._lock:


Line 464:         Return the available StatSample for the given VM.
Line 465:         """
Line 466:         with self._lock:
Line 467:             (first_batch, first_batch_timestamp,
Line 468:              last_batch, last_batch_timestamp) = 
self._samples.values()
we can use stats() method
Line 469: 
Line 470:             if first_batch is None:
Line 471:                 return self.EMPTY
Line 472: 


Line 475: 
Line 476:             if first_sample is None or last_sample is None:
Line 477:                 return self.EMPTY
Line 478: 
Line 479:             elapsed_time = last_batch_timestamp - 
first_batch_timestamp
it is 'interval' actually, and no gain in calculating it here
Line 480:             stats_age = self._clock() - self._vm_last_timestamp[vmid]
Line 481: 
Line 482:             return StatsSample(first_sample, last_sample,
Line 483:                                elapsed_time, stats_age)


Line 492:         returned by unblocked stuck calls, to avoid overwrite fresh 
data
Line 493:         with stale one.
Line 494:         """
Line 495:         with self._lock:
Line 496:             if monotonic_ts >= self._last_sample_time:
> Not exactly related to this patchset, but still something very close: is >=
it's just me being paranoid. I don't have a clear case to use '>=' instead of 
'>'
Line 497:                 self._samples.append(bulk_stats)
Line 498:                 self._last_sample_time = monotonic_ts
Line 499: 
Line 500:                 self._update_ts(bulk_stats, monotonic_ts)


Line 496:             if monotonic_ts >= self._last_sample_time:
Line 497:                 self._samples.append(bulk_stats)
Line 498:                 self._last_sample_time = monotonic_ts
Line 499: 
Line 500:                 self._update_ts(bulk_stats, monotonic_ts)
> Not sure which semantics did you target and I think it should be noted some
Not sure I understood your remark, but here the problem I'm addressing is a 
stuck call that eventually unblocks.
When it unblocks, it will try to add a sample to this StatsCache, but that 
sample can be stale, because another worker can have took over, and succesfully 
collected a new sample.

So, under the assumption that at stable state stats collection has a time cost 
negligible with respect the collection interval, I need to take the sample time 
BEFORE to start the possibly-blocking call. If I take the time after, I have no 
means to distinguish between a well behaving call and an unblocked stuck call.

If that is what you meant, I can add a comment/docstring explaining the API 
weirdness.
Line 501:             else:
Line 502:                 self._log.warning('dropped stale old sample')
Line 503: 
Line 504:     def _update_ts(self, bulk_stats, monotonic_ts):


-- 
To view, visit https://gerrit.ovirt.org/38066
To unsubscribe, visit https://gerrit.ovirt.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id8cd75631c328aa380c10f1f6dd1a8075b2fe0ed
Gerrit-PatchSet: 32
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Francesco Romani <[email protected]>
Gerrit-Reviewer: Dan Kenigsberg <[email protected]>
Gerrit-Reviewer: Francesco Romani <[email protected]>
Gerrit-Reviewer: Martin Polednik <[email protected]>
Gerrit-Reviewer: Vinzenz Feenstra <[email protected]>
Gerrit-Reviewer: [email protected]
Gerrit-Reviewer: oVirt Jenkins CI Server
Gerrit-HasComments: Yes
_______________________________________________
vdsm-patches mailing list
[email protected]
https://lists.fedorahosted.org/mailman/listinfo/vdsm-patches

Reply via email to