2016-02-16 12:27 GMT+01:00 singh.janmejay <[email protected]>:
> @Thomas: This is not about testing and quantifying loss during a test.
> It's about quantifying it during normal operation. I see it as a choice
> between:
>
> A. deploy the strongest protocol at every system boundary, test each one
>    and each change rigorously to identify or bound loss under test
>    conditions, and expect nothing unexpected to show up in production
> B. do the former and also measure loss in production, to identify that
>    something unexpected happened
> C. deploy efficient protocols at all system boundaries and measure loss
>    (as long as loss stays within an acceptable level, the deployment
>    benefits from all the efficiency gains)
>
> I am talking in the context of C.
>
> If/when loss is above the acceptable level, one needs to debug and fix
> the problem. Both B and C provide the data required to identify the
> situation(s) when such debugging needs to happen.
>
> The approach of stamping on one end and measuring on the other treats
> all intermediate hops as a black box. For instance, it can be used to
> quantify losses in the face of frequent machine failures or
> downtime-free maintenance etc.
>
> @David: As of now, I am thinking of end-of-the-day style measurement
> (basically report the number of messages lost at a good-enough
> granularity, say host x severity).
>
> I am thinking of this as something independent of the frequency of
> outages and unrelated to maintenance windows. I'm thinking of it as a
> report that captures the extent of loss, where one can pull down several
> months of this data and verify loss was never beyond an acceptable
> level, and compare it across days when the load profile was very
> different (the day when too many circuit-breakers engaged etc).

I just wanted to push in a link to an upcoming new feature:
https://github.com/rsyslog/rsyslog/pull/764

Rainer

> I haven't thought through this, but reset may not be required.
> Basically, let the counter count up and wrap around (as long as
> wrap-around is well-defined behavior which is accounted for during
> measurement).
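
Just to make the wrap-around accounting above concrete, a minimal sketch in
Python (illustrative only, not rsyslog code; the 32-bit counter width and
the function names are assumptions):

    # Illustrative sketch: delta between two successive readings of a
    # free-running counter that is never reset and may wrap around.
    # COUNTER_BITS is an assumption; use whatever width the counter has.
    COUNTER_BITS = 32
    COUNTER_MOD = 1 << COUNTER_BITS

    def delta(prev, curr):
        # Correct as long as the counter wraps at most once between samples.
        return (curr - prev) % COUNTER_MOD

    # Example: the counter wrapped between readings
    assert delta(4294967290, 5) == 11

The same pairwise delta also works for counters that are simply never
reset, as discussed further down in the thread.
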
> On Sat, Feb 13, 2016 at 5:13 AM, David Lang <[email protected]> wrote:
> > On Sat, 13 Feb 2016, singh.janmejay wrote:
> >
> >> The ideal solution would be one that identifies host, log-source and
> >> time of loss along with an accurate number of messages lost.
> >>
> >> pstats makes sense, but correlating data from stats across a large
> >> number of machines will be difficult (some machines may send stats
> >> slightly delayed, which may skew aggregation etc).
> >
> > if you don't reset the counters, they keep increasing, so over time
> > the error due to the slew becomes a very minor component.
> >
> >> One approach I can think of: slap a stream-identifier and
> >> sequence-number on each received message, then find gaps in the
> >> sequence numbers for a session-id on the other side (as a query over
> >> the log-store etc).
> >
> > I'll point out that generating/checking a monotonic sequence number
> > destroys parallelism, and so it can seriously hurt performance.
> >
> > Are you trying to detect problems 'on the fly' as they happen? Or at
> > the end of the hour/day, saying 'hey, there was a problem at some
> > point'?
> >
> > How frequent do you think problems are? I would suggest that you run
> > some stress tests on your equipment/network and push things until you
> > do have problems, so you can track when they happen. I expect that you
> > will find that they don't start happening until you have much higher
> > loads than you expect (at least after a bit of tuning), and this can
> > make it so that the most invasive solutions aren't needed.
> >
> > David Lang
> >
> >> Large issues such as a producer suddenly going silent can be detected
> >> using macro mechanisms (like pstats).
> >>
> >> On Sat, Feb 13, 2016 at 2:56 AM, David Lang <[email protected]> wrote:
> >>> On Sat, 13 Feb 2016, Andre wrote:
> >>>
> >>>> The easiest way I found to do that is to have a control system and
> >>>> send two streams of data to two or more different destinations.
> >>>>
> >>>> In the case of rsyslog processing a large message volume over UDP,
> >>>> the loss has always been noticeable.
> >>>
> >>> this depends on your setup. I was able to send UDP logs at gig-E
> >>> wire speed with no losses, but it required tuning the receiving
> >>> system to not do DNS lookups, have sufficient RAM for buffering, etc.
> >>>
> >>> I never was able to get my hands on 10G equipment to push up from
> >>> there.
> >>>
> >>> David Lang
>
> --
> Regards,
> Janmejay
> http://codehunk.wordpress.com
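
To make the "stream-identifier + sequence-number, find gaps" idea quoted
above a bit more concrete, a rough sketch (purely illustrative, not an
existing rsyslog feature; stream ids, names and the reporting granularity
are assumptions, and reordering and counter wrap-around are ignored):

    # Illustrative sketch of gap-based loss counting on the receiver side.
    # All names are made up for illustration; this is not an rsyslog API.
    from collections import defaultdict

    last_seq = {}            # stream-id -> last sequence number seen
    lost = defaultdict(int)  # stream-id -> messages presumed lost

    def on_message(stream_id, seq):
        prev = last_seq.get(stream_id)
        if prev is not None and seq > prev + 1:
            lost[stream_id] += seq - prev - 1   # gap => presumed loss
        last_seq[stream_id] = seq

    # End-of-day report; a stream id could encode host and severity to
    # match the 'host x severity' granularity discussed above.
    def report():
        return dict(lost)

    # Example: stream "hostA:sev6" delivers seqs 1, 2, 5 -> 2 presumed lost
    for s in (1, 2, 5):
        on_message("hostA:sev6", s)
    assert report() == {"hostA:sev6": 2}

Because each stream keeps its own counter, the senders never have to agree
on one global monotonic sequence (the parallelism concern David raises);
the price is that reordering and sender restarts have to be handled
explicitly, which this sketch does not do.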

