Zhou Zheng Sheng has posted comments on this change.
Change subject: AdvancedStatsThread: Throttle duplicated exception log
......................................................................
Patch Set 1: (2 inline comments)
Thanks Yaniv and Mark. From the discussion above, I think messages from stats
functions need a dedicated channel and filter.
....................................................
File vdsm/utils.py
Line 308:
Line 309: def needSuppress(self, statsFunction, exceptObj):
Line 310: if (statsFunction not in self._exceptions or
Line 311: not self._exceptionEqual(
Line 312: self._exceptions[statsFunction]['exception'], exceptObj)):
Thanks for noticing this problem. I thought of it too. This patch only
throttles the exception log. When a stats function raises an exception, it
stops executing at that point, so the underlying problem is unchanged and the
function will raise the same exception the next time it runs. I therefore
think the situation you mentioned should be very rare.
I think implementing a dedicated log channel to filter these messages is better.
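To make the filtering idea concrete, here is a rough sketch using the standard
logging module. DuplicateFilter is a hypothetical name, not existing vdsm
code, and it assumes that "duplicate" means the same formatted message and the
same exception type as the previous record from the same logger.

```python
import logging


class DuplicateFilter(logging.Filter):
    """Suppress a record that repeats the previous one from the same logger.

    Hypothetical helper for illustration, not part of vdsm.
    """

    def __init__(self, name=''):
        logging.Filter.__init__(self, name)
        # logger name -> (formatted message, exception type) of last record
        self._last = {}

    def filter(self, record):
        excType = record.exc_info[0] if record.exc_info else None
        key = (record.getMessage(), excType)
        if self._last.get(record.name) == key:
            return False  # same as the previous record: drop it
        self._last[record.name] = key
        return True
```

Attached to a handler with handler.addFilter(DuplicateFilter()), it would let
the first occurrence through and drop the repeats until a different message
arrives. It does not count the dropped records as dupCount does in this patch.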
Line 313:
Line 314: self._exceptions[statsFunction] = {'exception': exceptObj,
Line 315: 'dupCount': 0}
Line 316: return False
Line 436: statsFunction()
Line 437: except Exception, e:
Line 438: if not self.handleStatsException(e):
Line 439: if not self._logThrottle.needSuppress(
Line 440: statsFunction, e):
Even for serious messages like this one, it is pointless to report the same
content again and again, although we do want to be alerted when the bug is
triggered. So I think these messages can go to a dedicated channel, where the
receiving side discards duplicates and keeps only a limited number of the
latest messages of each kind. Alternatively, the messages can be sent to a
supervisor program.
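A rough sketch of such a receiving side, using only the standard logging and
collections modules. LatestPerKindHandler and the 'vds.stats' logger name are
hypothetical, and the sketch assumes the unformatted message template is enough
to identify the "kind" of a message.

```python
import collections
import logging


class LatestPerKindHandler(logging.Handler):
    """Keep only the newest `limit` messages of each kind (hypothetical)."""

    def __init__(self, limit=3):
        logging.Handler.__init__(self)
        self._limit = limit
        self.latest = {}  # kind -> deque of formatted messages

    def emit(self, record):
        # Assumption: logger name plus message template identifies the kind.
        kind = (record.name, record.msg)
        queue = self.latest.setdefault(
            kind, collections.deque(maxlen=self._limit))
        queue.append(self.format(record))


# Dedicated channel: a logger that does not propagate to the main log.
statsLog = logging.getLogger('vds.stats')  # hypothetical logger name
statsLog.propagate = False
statsLog.addHandler(LatestPerKindHandler(limit=3))
```

Stats code would then log through statsLog instead of self._log, and older
messages of the same kind fall off the end of the deque automatically.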
When we know exactly what the problem is, we can just grep the log and find
what we need. Sometimes we do not know what the problem is and just want to
watch the log line by line. If there are too many of these messages, the log
is hard to digest.
I think the log output of stats functions, both debug and error messages, can
be put in a separate log file. It is really annoying to see the "dd" debug
messages every second when running 'tail -f' on the vdsm log. Yes, we can
filter out the unwanted "dd" messages with grep, but it is still annoying to
add an extra pipe every time.
Line 441: self._log.error("Stats function failed: %s",
Line 442: statsFunction, exc_info=True)
Line 443:
Line 444: self._stopEvent.wait(waitInterval)
--
To view, visit http://gerrit.ovirt.org/8884
To unsubscribe, visit http://gerrit.ovirt.org/settings
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ee04d8d82e2a14b0a003627981c28e5e64a46ab
Gerrit-PatchSet: 1
Gerrit-Project: vdsm
Gerrit-Branch: master
Gerrit-Owner: Zhou Zheng Sheng <[email protected]>
Gerrit-Reviewer: Federico Simoncelli <[email protected]>
Gerrit-Reviewer: Gal Hammer <[email protected]>
Gerrit-Reviewer: Mark Wu <[email protected]>
Gerrit-Reviewer: Yaniv Bronhaim <[email protected]>
Gerrit-Reviewer: Zhou Zheng Sheng <[email protected]>