Re: [rsyslog] RFC: dynamic-stats support
I'll create 2 implementations for this. I will first build it with a hash-table (which will use an exclusive lock only when adding a new metric to the table; increments will happen under a shared lock). In the second cut, I'll change it to a trie (which is what I discussed in the proposal). After benchmarking both, we'll be in a better position to decide which one should be kept.

On Fri, Oct 9, 2015 at 3:20 AM, singh.janmejay wrote:
> On Thu, Oct 8, 2015 at 11:07 PM, David Lang wrote:
>> Atomic ops are actually rather expensive (almost as expensive as full
>> locks). If you want a lockless metrics capability, you should do a separate
>> set of variables per thread, gathering them at reporting time. And document
>> that there is going to be inconsistency between the different metrics
>
> Yes, the right thing to do is to maintain thread-local counters. It'll
> require a slightly wider set of changes for that, but the approach
> allows us to make those improvements later without changing the
> config-interface.
>
> Cost of atomic-ops is close to uncontended lock, but much lower than
> contended lock.
>
>> unless you lock everything, you are going to have inconsistencies across the
>> different metrics.
>
> Yes, but isn't that acceptable in most cases?
>
>> Also, not all platforms have the easy atomic ops you are thinking of, and
>> atomic ops can't operate on all sizes of data (for example, 32 bit systems
>> probably can't update a 64 bit value atomically)
>
> We should eventually move to thread-level accumulators for all metrics
> (static or dynamic).

--
Regards,
Janmejay
http://codehunk.wordpress.com
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
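The hash-table variant described at the top of this message could look roughly like the sketch below. It is only an illustration under stated assumptions: the class names are made up, Python's standard library has no shared/exclusive lock, so the lookup runs without the table lock and relies on CPython dict lookups being safe alongside inserts, and a per-counter lock stands in for an atomic increment. The one property it tries to show is that the table-wide lock is taken only when a new metric is added.

```python
import threading


class MetricCounter:
    """One metric. The per-counter lock stands in for an atomic increment."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, n=1):
        with self._lock:
            self._value += n

    def swap_zero(self):
        """Return the current value and reset it to 0 in one step."""
        with self._lock:
            old, self._value = self._value, 0
        return old


class DynStatsTable:
    """Hypothetical name: a hash table of counters keyed by metric name."""

    def __init__(self):
        self._table = {}
        self._create_lock = threading.Lock()  # plays the exclusive-lock role

    def inc(self, name, n=1):
        counter = self._table.get(name)  # common path: metric already exists
        if counter is None:
            with self._create_lock:  # rare path: adding a new metric
                counter = self._table.setdefault(name, MetricCounter())
        counter.add(n)

    def report_and_reset(self):
        return {name: c.swap_zero() for name, c in list(self._table.items())}
```

Once a metric exists, `inc` never touches `_create_lock`, so the cost of creating a metric is spread over all later increments of that metric.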
Re: [rsyslog] RFC: dynamic-stats support
--
Regards,
Janmejay

PS: Please blame the typos in this mail on my phone's uncivilized soft keyboard sporting it's not-so-smart-assist technology.

On Oct 7, 2015 11:25 PM, "David Lang" wrote:
>
> On Wed, 7 Oct 2015, singh.janmejay wrote:
>
>>> what relationships are there between the different metrics?
>>
>> The fact that they are read in one shot, reported and reset.
>
> I don't understand why you couldn't do $/stats!metric1, $/stats!metric2, etc. There is no reason why this couldn't have the same locking as your new structure.

If this is built over the variable backend, it has to be global variables (which is what you are suggesting). But consider the implementation of this table ($/stats). Any efficient indexing (hash or tree based data structures) will require locks for rebalance/re-hash operations. Even counter increment operations need to contend for the lock, because of the potential for concurrent operations that change the table. It can't have an assured uncontended path, except for an initial opportunistic wide shared lock which allows checking for the existence of the key. But the scope of that lock, and the contention due to it, is wider.

Also, this involves technical intricacies that not all users may be comfortable with. It's asking them to build the stats system as opposed to providing them a ready-made one. In this case it is without any loss in generality or any sacrifice in capability of the feature. I like it because it's easier to use.

>>> if you have multiple threads that may need to update the same metric at
>>> the same time, a tree doesn't eliminate the locking.
>>
>> The only situation involving a lock that is contended for is when a metric
>> is to be initialized. Consider this trie:
>>
>> A -> B -> C
>>        -> D
>>
>> Now for incrementing key ABC, no contention exists, because it involves
>> read-locks only. It just uses atomic-increment to bump the counter at node
>> C. Same for ABD.
>>
>> But ABE will require a write-lock at node B, because node E doesn't exist
>> yet. However, keys that do not share a parent can still be initialized
>> concurrently. The init operation can be amortized over a large set of
>> increment operations, making its cost negligible (this is the knob that the
>> reset interval exposes).
>
> you are misunderstanding me
>
> If you have two threads that both want to change the counter at node C, you have to have locking to keep them from having problems.

Assuming the nodes A, B and C already exist, here is the sequence of operations:

Shared-lock A (uncontended)
Shared-lock B (uncontended)
Shared-lock C (uncontended)
Atomic-increment counter of C
Shared-lock release C
Shared-lock release B
Shared-lock release A

This sequence can be run concurrently without any additional locks.

The other (much less frequent) situation is when the metric doesn't exist, in which case it'll do:

Shared-lock A (uncontended)
Exclusive-lock B (contended)
Create C
Increment counter of C
Exclusive-lock release B
Shared-lock release A

> Similarly, when
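The two operation sequences in this message can be sketched as a small per-character trie. This is a simplification, not the proposed implementation: all names are hypothetical, the shared locks are omitted because Python's standard library has none, and a per-node `count_lock` stands in for the atomic increment. What it keeps is the split between the common path (walk existing nodes, bump the counter) and the rare path (lock only the parent of the missing node).

```python
import threading


class TrieNode:
    def __init__(self):
        self.children = {}
        self.create_lock = threading.Lock()  # exclusive; held only to add a child
        self.count_lock = threading.Lock()   # stand-in for an atomic increment
        self.count = 0

    def child(self, ch):
        node = self.children.get(ch)  # existing path: no exclusive lock taken
        if node is None:
            with self.create_lock:  # new metric: lock only this parent
                node = self.children.setdefault(ch, TrieNode())
        return node


class StatsTrie:
    def __init__(self):
        self.root = TrieNode()

    def inc(self, key, n=1):
        node = self.root
        for ch in key:
            node = node.child(ch)
        with node.count_lock:
            node.count += n

    def get(self, key):
        """Return the counter for key, or None if the key was never seen."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.count
```

Incrementing "ABC" and "ABD" only walks existing nodes after their first use; the first increment of "ABE" takes the creation lock on node B and nothing else.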
Re: [rsyslog] RFC: dynamic-stats support
2015-10-08 8:30 GMT+02:00 singh.janmejay:
>> Similarly, when one thread goes to output the stats, you need to lock
>> them so that there isn't a lost increment between the time that you read
>> the stat and the time you zero it.
>
> No, this involves the same shared (uncontended) lock, except
> atomic-increment is replaced by atomic-swap with 0.

Just FYI: this is what the current stats system also does. It is also where some inaccuracy stems from. Reporting stats is not atomic without locks, so a stats counter may be read with value n, then m atomic increments happen to it on another thread, then value n is reported (but we are really at n+m), and then the stats counter is reset to 0 via an atomic swap. So m updates are lost. IMHO this is perfectly acceptable, because otherwise we would lose almost all concurrency.

Rainer
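The interleaving described in this message can be replayed deterministically in a single thread. The sketch below is a simulation, not rsyslog code: `SimCounter` and both function names are made up, and the "other thread" is just a call placed at the point where its increments would land. The second function shows the variant where the reporter uses the value returned by the swap itself.

```python
class SimCounter:
    def __init__(self, value=0):
        self.value = value

    def add(self, n):
        self.value += n

    def swap(self, new):
        """Stand-in for an atomic swap: store `new`, return the old value."""
        old, self.value = self.value, new
        return old


def report_read_then_swap(counter, m):
    reported = counter.value  # reporter reads n
    counter.add(m)            # m increments arrive from another thread
    counter.swap(0)           # reset; the n+m returned by the swap is ignored
    return reported


def report_swap_only(counter, m):
    reported = counter.swap(0)  # read and reset in one step
    counter.add(m)              # later increments land in the fresh counter
    return reported
```

Starting from n = 10 with m = 3, the first function reports 10 and leaves the counter at 0, so the 3 updates are gone. The second also reports 10 but leaves 3 in the counter for the next report. Had the increments arrived before the swap, it would have reported 13 instead.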
Re: [rsyslog] RFC: dynamic-stats support
Yep, makes sense. I second your opinion: absolute consistency between metrics is not that valuable.

On Thu, Oct 8, 2015 at 8:57 PM, Rainer Gerhards wrote:
> 2015-10-08 17:19 GMT+02:00 singh.janmejay:
>> Did you mean it's not atomic across different metrics? That I think should
>> be acceptable. A single metric however should get swapped losslessly with
>> atomic-swap. Either an increment of m should be applied before the swap
>> (making the reading n + m), or after it (leaving the accumulator at m and
>> the reading at n). But it should not be lost.
>>
>> Am I misunderstanding something?
>
> No, I haven't been precise enough. Usually, we have a set of counters
> which interdepend. And so you never get them really consistent.
>
> Rainer

--
Regards,
Janmejay
http://codehunk.wordpress.com
Re: [rsyslog] RFC: dynamic-stats support
Atomic ops are actually rather expensive (almost as expensive as full locks). If you want a lockless metrics capability, you should do a separate set of variables per thread, gathering them at reporting time. And document that there is going to be inconsistency between the different metrics.

Unless you lock everything, you are going to have inconsistencies across the different metrics.

Also, not all platforms have the easy atomic ops you are thinking of, and atomic ops can't operate on all sizes of data (for example, 32 bit systems probably can't update a 64 bit value atomically).

David Lang

On Thu, 8 Oct 2015, singh.janmejay wrote:

> Did you mean it's not atomic across different metrics? That I think should
> be acceptable. A single metric however should get swapped losslessly with
> atomic-swap. Either an increment of m should be applied before the swap
> (making the reading n + m), or after it (leaving the accumulator at m and
> the reading at n). But it should not be lost.
>
> Am I misunderstanding something?
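The per-thread approach suggested in this message might be sketched as follows. Everything here is an assumption for illustration (the class name, the use of `threading.local`, the registration list); the point is only that an increment touches data owned by the calling thread, and the totals are summed when a report is gathered.

```python
import threading
from collections import Counter


class PerThreadStats:
    def __init__(self):
        self._local = threading.local()
        self._all = []  # every thread's private counter set
        self._register_lock = threading.Lock()

    def _mine(self):
        counts = getattr(self._local, "counts", None)
        if counts is None:
            counts = Counter()
            self._local.counts = counts
            with self._register_lock:  # taken once per thread, not per increment
                self._all.append(counts)
        return counts

    def inc(self, name, n=1):
        self._mine()[name] += n  # touches only this thread's data

    def gather(self):
        # A real implementation must make this read safe against concurrent
        # writers; this sketch assumes gather() runs while workers are idle.
        total = Counter()
        with self._register_lock:
            snapshot = list(self._all)
        for counts in snapshot:
            total.update(counts)
        return total
```

The only lock is taken once per thread at registration. As the message notes, a report gathered while threads are still running would not be a consistent snapshot across metrics.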
Re: [rsyslog] RFC: dynamic-stats support
On Thu, Oct 8, 2015 at 11:07 PM, David Lang wrote:
> Atomic ops are actually rather expensive (almost as expensive as full
> locks). If you want a lockless metrics capability, you should do a separate
> set of variables per thread, gathering them at reporting time. And document
> that there is going to be inconsistency between the different metrics

Yes, the right thing to do is to maintain thread-local counters. It'll require a slightly wider set of changes for that, but the approach allows us to make those improvements later without changing the config-interface.

Cost of atomic-ops is close to uncontended lock, but much lower than contended lock.

> unless you lock everything, you are going to have inconsistencies across the
> different metrics.

Yes, but isn't that acceptable in most cases?

> Also, not all platforms have the easy atomic ops you are thinking of, and
> atomic ops can't operate on all sizes of data (for example, 32 bit systems
> probably can't update a 64 bit value atomically)

We should eventually move to thread-level accumulators for all metrics (static or dynamic).

> David Lang

--
Regards,
Janmejay
http://codehunk.wordpress.com
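The cross-metric inconsistency accepted in this message can be shown with a small single-threaded replay. The metric names `submitted` and `processed` are invented for the example, and the worker is simulated by a callback fired between the two per-metric swaps.

```python
def report(counters, between_swaps=None):
    """Swap each counter with 0, one metric at a time (no global lock)."""
    out = {}
    for i, name in enumerate(list(counters)):
        out[name], counters[name] = counters[name], 0
        if i == 0 and between_swaps is not None:
            between_swaps()  # simulate a worker running mid-report
    return out


counters = {"submitted": 5, "processed": 5}


def one_more_message():
    counters["submitted"] += 1
    counters["processed"] += 1


first = report(counters, one_more_message)  # {'submitted': 5, 'processed': 6}
second = report(counters)                   # {'submitted': 1, 'processed': 0}
```

The first report shows more messages processed than submitted, which is the inconsistency. Nothing is lost, though: over both reports each metric adds up to 6.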
Re: [rsyslog] RFC: dynamic-stats support
I hope there will be stats for metrics based on $programname, $severity, $fromhost-ip etc., extending the ruleset (impstats).

2015-10-07 16:19 GMT+08:00 singh.janmejay:
> The only situation involving a lock that is contended for is when a metric
> is to be initialized. Consider this trie:
>
> A -> B -> C
>        -> D
>
> Now for incrementing key ABC, no contention exists, because it involves
> read-locks only. It just uses atomic-increment to bump the counter at node
> C. Same for ABD.
>
> But ABE will require a write-lock at node B, because node E doesn't exist
> yet. However, keys that do not share a parent can still be initialized
> concurrently. The init operation can be amortized over a large set of
> increment operations, making its cost negligible (this is the knob that the
> reset interval exposes).
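The request at the top of this message, counting per $programname, $severity or $fromhost-ip, amounts to using a property value as the metric name. A minimal sketch, with messages modelled as plain dictionaries and made-up sample values (this is not rsyslog configuration syntax):

```python
from collections import Counter


def count_by(messages, prop):
    """Count messages per value of one property; each value becomes a metric name."""
    counts = Counter()
    for msg in messages:
        counts[msg[prop]] += 1
    return counts


# Illustrative sample data, not taken from any real log.
messages = [
    {"programname": "sshd", "severity": "info", "fromhost-ip": "10.0.0.1"},
    {"programname": "sshd", "severity": "err", "fromhost-ip": "10.0.0.2"},
    {"programname": "cron", "severity": "info", "fromhost-ip": "10.0.0.1"},
]
```

Calling `count_by(messages, "programname")` yields one counter per program name seen, without any of those names being configured in advance.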
Re: [rsyslog] RFC: dynamic-stats support
--
Regards,
Janmejay

PS: Please blame the typos in this mail on my phone's uncivilized soft keyboard sporting it's not-so-smart-assist technology.

On Oct 7, 2015 3:26 AM, "David Lang" wrote:
>
> On Wed, 7 Oct 2015, singh.janmejay wrote:
>
>> On Oct 6, 2015 10:32 PM, "David Lang" wrote:
>>>
>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>
>>>> It is possible to use global-variables (it'll require some
>>>> enhancements, table-support etc), but it'll be very inefficient
>>>> compared to this approach. For instance, choice of data-structure etc
>>>> allows making the solution a lot more efficient.
>>>
>>> As for the data structures, Rainer has been identifying inefficiencies in
>>> how json-c works and working to improve them
>>
>> That optimizes the variable system. But it still is a general purpose
>> variable system. It can't and shouldn't understand relationships between
>> variables.
>
> what relationships are there between the different metrics?

The fact that they are read in one shot, reported and reset.

>>>> Here it's possible to locklessly increment counters in most cases, so
>>>> its overhead is a lot lower than global-variables.
>>>
>>> how can you manage counters in multiple threads without locks? Especially
>>> when dealing with batches.
>>
>> Consider a trie based implementation. With bounded fanout-factor, it's O(1)
>> wrt metric-names cardinality. It also has very little lock contention
>> involved. Usually operations work with read-locks; only when a new metric is
>> initialized does it require a write lock on the parent node. If recycles are
>> few and far apart, lock contention would be negligible.
>
> if you have multiple threads that may need to update the same metric at the same time, a tree doesn't eliminate the locking.

The only situation involving a lock that is contended for is when a metric is to be initialized. Consider this trie:

A -> B -> C
       -> D

Now for incrementing key ABC, no contention exists, because it involves read-locks only. It just uses atomic-increment to bump the counter at node C. Same for ABD.

But ABE will require a write-lock at node B, because node E doesn't exist yet. However, keys that do not share a parent can still be initialized concurrently. The init operation can be amortized over a large set of increment operations, making its cost negligible (this is the knob that the reset interval exposes).

> The current json-c locking is being made intentionally over-broad right now because it appears that some json-c code is not thread-safe and we haven't identified it yet. Once that's tracked down and fixed (or json-c replaced), updating one item should not require locking any more than that item.

>>>> Recycle is precisely to allow this lockless mechanism to work. It's
>>>> basically saying, it'll track metric-names it has seen in the last 1 hour.
>>>> If we kill tracking of it as soon as we don't see an increment
>>>> (between 2 reporting runs of impstats), it'll lead to unnecessary
>>>> churn when low-values are common or load is not uniform in time.
>>>
>>> that depends on the cost of initializing a metric vs the cost of tracking
>>> the recycle mechanism.
>>
>> 0 value data-points can easily be filtered out. So they don't create any
>> processing overhead downstream. Cost of tracking for recycle is minimal
>> because it's a single counter being tracked; when it reaches zero it's
>> reset to the original starting value and the trie is killed after reporting
>> accumulated stats.
>
> actually, filtering out 0 data-points can be a very bad thing. Far too many monitoring tools produce straight-line graphs/estimates between reported data-points, so it's very important to report 0 value data-points

I agree.

>>>> Implementing it on top of global-variables not only has a very high
>>>> performance-penalty (it'll be prohibitive for high-throughput
>>>> scenarios), it also exposes too much complexity to the user (where
>>>> the user has to worry about reset etc).
>>>>
>>>> I don't plan to have a scheduler in this implementation.
>>>> GetAllStatsLines call will purge the tree instead of reset at that
>>>> interval. It's basically a balance between freeing up memory occupied
>>>> by stale metric-names vs. performance (lockless handling of
>>>> increment). So it will be governed by the impstats schedule. Maybe I should
>>>> change the name to a better name (equivalent of
>>>> purge_known_keys_after_they_have_been_reported_N_times).
>>>
>>> if this is just adding additional metrics to the impstats output that
>>> eliminates the scheduler/reset issue.
>>>
>>> I think we should have a metric configuration be fairly static, allow
>>> configuring custom metrics and add to them, but don't use data from the
>>> message as part of
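The recycle mechanism discussed in this message (a single countdown, decremented on each report, with the whole structure purged when it reaches zero) could be sketched like this. The class and parameter names are hypothetical, a plain dictionary replaces the trie, and locking is left out so that only the purge logic is visible.

```python
class RecyclingStats:
    def __init__(self, purge_after_reports):
        self._start = purge_after_reports
        self._remaining = purge_after_reports  # the single recycle counter
        self._counts = {}

    def inc(self, name, n=1):
        self._counts[name] = self._counts.get(name, 0) + n

    def report(self):
        snapshot = dict(self._counts)  # includes tracked metrics that are at 0
        self._remaining -= 1
        if self._remaining == 0:
            self._counts = {}              # forget every metric name
            self._remaining = self._start  # restart the countdown
        else:
            for name in self._counts:
                self._counts[name] = 0     # keep the names, reset the values
        return snapshot
```

With `purge_after_reports=3`, a metric incremented once keeps appearing (with value 0) in the second and third reports and disappears from the fourth. Zero-valued data points are therefore still reported while a name is tracked, which matches the concern raised about monitoring tools drawing straight lines between data points.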
Re: [rsyslog] RFC: dynamic-stats support
It'll support dynamic-key, so any property will work.

On Wed, Oct 7, 2015 at 2:38 PM, chenlin rao wrote:
> I hope there will be stats for metrics based on $programname, $severity, $fromhost-ip etc., extending the ruleset (impstats).
>
> 2015-10-07 16:19 GMT+08:00 singh.janmejay:
>> --
>> Regards,
>> Janmejay
>>
>> PS: Please blame the typos in this mail on my phone's uncivilized soft keyboard sporting its not-so-smart-assist technology.
>>
>> On Oct 7, 2015 3:26 AM, "David Lang" wrote:
>>> On Wed, 7 Oct 2015, singh.janmejay wrote:
>>>> On Oct 6, 2015 10:32 PM, "David Lang" wrote:
>>>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>>>> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>>>>>
>>>>> As for the data structures, Rainer has been identifying inefficiencies in how json-c works and working to improve them
>>>>
>>>> That optimizes the variable system. But it still is a general-purpose variable system. It can't and shouldn't understand relationships between variables.
>>>
>>> what relationships are there between the different metrics?
>>
>> The fact that they are read in one shot, reported and reset.
>>
>>>>>> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>>>>>
>>>>> how can you manage counters in multiple threads without locks? Especially when dealing with batches.
>>>>
>>>> Consider a trie based implementation. With bounded fanout-factor, it's O(1) wrt metric-names cardinality. It also has very little lock contention involved. Usually operations work with read-locks, only when a new metric is initialized does it require a write lock on the parent node. If recycles are few and far apart, lock contention would be negligible.
>>>
>>> if you have multiple threads that may need to update the same metric at the same time, a tree doesn't eliminate the locking.
>>
>> The only situation involving a lock that is contended for is when a metric is to be initialized. Consider this trie:
>>
>> A -> B -> C
>>        -> D
>>
>> Now for incrementing key ABC, no contention exists, because it involves read-locks only. It just uses atomic-increment to bump the counter at node C. Same for ABD.
>>
>> But ABE will require a write-lock at node B, because node E doesn't exist yet. However, keys that do not share a parent can again be initialized concurrently. Init operation can be amortized over a large set of increment operations, making its cost negligible (this is the knob that reset interval exposes).
>>
>>> The current json-c locking is being made intentionally over-broad right now because it appears that some json-c code is not thread-safe and we haven't identified it yet. Once that's tracked down and fixed (or json-c replaced), updating one item should not require locking any more than that item.
>>>
>>>>>> Recycle is precisely to allow this lockless mechanism to work. It's basically saying, it'll track metric-names it has seen in the last 1 hour. If we kill tracking of it as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low-values are common or load is not uniform in time.
>>>>>
>>>>> that depends on the cost of initializing a metric vs the cost of tracking the recycle mechanism.
>>>>
>>>> 0 value data-points can easily be filtered out. So they don't create any processing overhead downstream. Cost of tracking for recycle is minimal because it's a single counter being tracked, when it reaches zero it's reset to orig starting value and trie is killed after reporting accumulated stats.
>>>
>>> actually, filtering out 0 data-points can be a very bad thing. Far too many monitoring tools produce straight-line graphs/estimates between reported data-points, so it's very important to report 0 value data-points
>>
>> I agree.
>>
>>>>>> Implementing it on top of global-variables not only has a very high performance-penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where user has to worry about reset etc).
>>>>>>
>>>>>> I don't plan to have a scheduler in
Re: [rsyslog] RFC: dynamic-stats support
On Wed, 7 Oct 2015, singh.janmejay wrote:
> On Oct 7, 2015 3:26 AM, "David Lang" wrote:
>> On Wed, 7 Oct 2015, singh.janmejay wrote:
>>> On Oct 6, 2015 10:32 PM, "David Lang" wrote:
>>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>>> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>>>>
>>>> As for the data structures, Rainer has been identifying inefficiencies in how json-c works and working to improve them
>>>
>>> That optimizes the variable system. But it still is a general-purpose variable system. It can't and shouldn't understand relationships between variables.
>>
>> what relationships are there between the different metrics?
>
> The fact that they are read in one shot, reported and reset.

I don't understand why you couldn't do $\stats!metric1, $\stats!metric2, etc. There is no reason why this couldn't have the same locking as your new structure.

>>>>> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>>>>
>>>> how can you manage counters in multiple threads without locks? Especially when dealing with batches.
>>>
>>> Consider a trie based implementation. With bounded fanout-factor, it's O(1) wrt metric-names cardinality. It also has very little lock contention involved. Usually operations work with read-locks, only when a new metric is initialized does it require a write lock on the parent node. If recycles are few and far apart, lock contention would be negligible.
>>
>> if you have multiple threads that may need to update the same metric at the same time, a tree doesn't eliminate the locking.
>
> The only situation involving a lock that is contended for is when a metric is to be initialized. Consider this trie:
>
> A -> B -> C
>        -> D
>
> Now for incrementing key ABC, no contention exists, because it involves read-locks only. It just uses atomic-increment to bump the counter at node C. Same for ABD.
>
> But ABE will require a write-lock at node B, because node E doesn't exist yet. However, keys that do not share a parent can again be initialized concurrently. Init operation can be amortized over a large set of increment operations, making its cost negligible (this is the knob that reset interval exposes).

you are misunderstanding me

If you have two threads that both want to change the counter at node C, you have to have locking to keep them from having problems.

Similarly, when one thread goes to output the stats, you need to lock them so that there isn't a lost increment between the time that you read the stat and the time you zero it.

That's the locking that's expensive.

David Lang
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
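To make the two positions in this exchange concrete, here is a minimal Python sketch of the trie scheme together with a read-and-zero step that cannot lose an increment. It is an illustration only, not rsyslog code: the names `Node`, `DynStatsTrie`, `swap_all` and `run_demo` are hypothetical, a plain `threading.Lock` per node stands in for the parent's write lock, and a second per-node lock stands in for the atomic increment and atomic swap discussed in the thread (Python has neither an rwlock nor atomic integers in its standard library).

```python
import threading


class Node:
    def __init__(self):
        self.children = {}
        self.inserts = 0                     # children created under this node
        self.count = 0
        self.is_key = False                  # True once a key has ended here
        self.create_lock = threading.Lock()  # stand-in for the parent's write lock
        self.count_lock = threading.Lock()   # stand-in for atomic increment/swap


class DynStatsTrie:
    def __init__(self):
        self.root = Node()

    def _child(self, node, ch):
        child = node.children.get(ch)        # common path: node already exists
        if child is None:
            with node.create_lock:           # rare path: a new metric or prefix
                child = node.children.get(ch)   # re-check after taking the lock
                if child is None:
                    child = Node()
                    node.children[ch] = child
                    node.inserts += 1
        return child

    def increment(self, key):
        node = self.root
        for ch in key:
            node = self._child(node, ch)
        with node.count_lock:
            node.is_key = True
            node.count += 1

    def swap_all(self):
        """Read and zero every counter; each read+zero is one locked step."""
        out = {}
        stack = [("", self.root)]
        while stack:
            prefix, node = stack.pop()
            with node.count_lock:
                is_key, value, node.count = node.is_key, node.count, 0
            if is_key:
                out[prefix] = value          # zero values are reported too
            with node.create_lock:           # snapshot children safely
                kids = list(node.children.items())
            stack.extend((prefix + ch, child) for ch, child in kids)
        return out


def run_demo(workers=4, per_worker=2000):
    trie = DynStatsTrie()
    totals = {}
    done = threading.Event()
    keys = ["ABC", "ABD"]

    def work(key):
        for _ in range(per_worker):
            trie.increment(key)

    def drain():
        for key, value in trie.swap_all().items():
            totals[key] = totals.get(key, 0) + value

    def report():
        while not done.is_set():
            drain()

    threads = [threading.Thread(target=work, args=(keys[i % 2],))
               for i in range(workers)]
    reporter = threading.Thread(target=report)
    reporter.start()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    done.set()
    reporter.join()
    drain()                                  # final drain after the reporter stopped
    node_b = trie.root.children["A"].children["B"]
    return totals, node_b.inserts
```

Because the reporter reads and zeroes a counter in the same locked step that increments use, the sum of all reported values equals the number of increments, which is the property David is asking about. Whether that step is cheap enough is exactly the point under debate; the sketch does not settle it.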
Re: [rsyslog] RFC: dynamic-stats support
I can't fully dig into this, but I think we must *very carefully* evaluate the overall design. Some time ago we introduced arrays for the limited liblognorm use case, and it hurts us every now and then when folks want to use arrays for other use cases. It probably makes sense to re-think how the variable engine etc behaves before adding more functionality. And make sure that everything works smoothly in all use cases. While anything else may take care of some use cases, I fear we may get too fragmented. At least this is what I learned in the past months' discussions.

Anyone else?

Rainer

2015-10-06 17:10 GMT+02:00 singh.janmejay:
> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>
> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>
> Recycle is precisely to allow this lockless mechanism to work. It's basically saying, it'll track metric-names it has seen in the last 1 hour. If we kill tracking of it as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low-values are common or load is not uniform in time.
>
> Implementing it on top of global-variables not only has a very high performance-penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where user has to worry about reset etc).
>
> I don't plan to have a scheduler in this implementation. GetAllStatsLines call will purge the tree instead of reset at that interval. It's basically a balance between freeing-up memory occupied by stale-metric-names vs. performance (lockless handling of increment). So it will be governed by impstats schedule. Maybe I should change name to a better name (equivalent of purge_known_keys_after_they_have_been_reported_N_times).
>
> On Tue, Oct 6, 2015 at 4:30 PM, David Lang wrote:
>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>> Hi,
>>>
>>> I am working on support for stats with dynamic-name. This comes in handy in situations where the metric-name is dependent upon the value of a certain attribute of the message.
>>>
>>> Say, for a central log-aggregation service, it's valuable to know what the inbound message-count distribution is across application-clusters that send logs to it, or for a shared-server, it's valuable to know what the log-volume generation is across users etc.
>>>
>>> I'm thinking of using a function-like interface to support this. It may look similar to this:
>>>
>>> dyn_stats("user_msg_count")
>>>
>>> ...
>>>
>>> ruleset(...) {
>>>     ...
>>>     dyn_inc("user_msg_count", $.user)
>>>     ...
>>> }
>>>
>>> dyn_stats signature looks like:
>>> dyn_stats(, , default=10k>, )
>>>
>>> dyn_inc signature looks like:
>>> dyn_inc(, )
>>>
>>> Reporting would work similar to static-metric via impstats. Mapping:
>>> statsobj_s.name = name_space
>>> statsobj_s.origin = "dyn"
>>> ctr_s.name = "foo" (say $.user had value foo)
>>>
>>> Thoughts / suggestions?
>>
>> how is this different/better than global variables? (although we may need to implement some functions, atomic inc/dec copy+clear) If you have pstats output in json format, you can even piggyback on its schedule to output the data.
>>
>> things like stats can very easily end up being expensive in terms of locking (something global variables already have figured out), and it sounds like you are proposing adding a scheduler of some sort to output the data.
>>
>> variables should not need to be 'recycled', either they contain data or they don't. If they contain data, you need to keep the data until you do something with it, if they don't, you don't have to track them.
>>
>> I am actually doing this sort of thing external to rsyslog in SEC
>>
>> I have a template in rsyslog that contains hostname, fromhost-ip, programname and I output it via improg to SEC. SEC accumulates counters and has scheduled outputs to files.
>>
>> before I started using SEC for this, I used the same template to output to a file and then for reports, used cut + sort + uniq -c to extract the data I need. When the files only contain the significant data, this is actually not bad to do, even at higher volumes.
>>
>> David Lang
>
> --
> Regards,
Re: [rsyslog] RFC: dynamic-stats support
On Wed, 7 Oct 2015, singh.janmejay wrote:
> On Oct 6, 2015 10:32 PM, "David Lang" wrote:
>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>>
>> As for the data structures, Rainer has been identifying inefficiencies in how json-c works and working to improve them
>
> That optimizes the variable system. But it still is a general-purpose variable system. It can't and shouldn't understand relationships between variables.

what relationships are there between the different metrics?

>>> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>>
>> how can you manage counters in multiple threads without locks? Especially when dealing with batches.
>
> Consider a trie based implementation. With bounded fanout-factor, it's O(1) wrt metric-names cardinality. It also has very little lock contention involved. Usually operations work with read-locks, only when a new metric is initialized does it require a write lock on the parent node. If recycles are few and far apart, lock contention would be negligible.

if you have multiple threads that may need to update the same metric at the same time, a tree doesn't eliminate the locking.

The current json-c locking is being made intentionally over-broad right now because it appears that some json-c code is not thread-safe and we haven't identified it yet. Once that's tracked down and fixed (or json-c replaced), updating one item should not require locking any more than that item.

>>> Recycle is precisely to allow this lockless mechanism to work. It's basically saying, it'll track metric-names it has seen in the last 1 hour. If we kill tracking of it as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low-values are common or load is not uniform in time.
>>
>> that depends on the cost of initializing a metric vs the cost of tracking the recycle mechanism.
>
> 0 value data-points can easily be filtered out. So they don't create any processing overhead downstream. Cost of tracking for recycle is minimal because it's a single counter being tracked, when it reaches zero it's reset to orig starting value and trie is killed after reporting accumulated stats.

actually, filtering out 0 data-points can be a very bad thing. Far too many monitoring tools produce straight-line graphs/estimates between reported data-points, so it's very important to report 0 value data-points

>>> Implementing it on top of global-variables not only has a very high performance-penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where user has to worry about reset etc).
>>>
>>> I don't plan to have a scheduler in this implementation. GetAllStatsLines call will purge the tree instead of reset at that interval. It's basically a balance between freeing-up memory occupied by stale-metric-names vs. performance (lockless handling of increment). So it will be governed by impstats schedule. Maybe I should change name to a better name (equivalent of purge_known_keys_after_they_have_been_reported_N_times).
>>
>> if this is just adding additional metrics to the impstats output that eliminates the scheduler/reset issue.
>>
>> I think we should have a metric configuration be fairly static, allow configuring custom metrics and add to them, but don't use data from the message as part of the name of the metric, and continue reporting them forever, even if they are 0 (so no need to 'recycle' names)
>
> Dynamic metrics are a real usecase for any shared system (utilisation across several users, several hosts, several clusters, several subnets etc are easily reportable with this). The only way to report utilisation in many scenarios is to have dyn-metric names. The alternative is to pre-declare all keys, but that to me is a more indirect solution. It's not as flexible/adaptive. I think declarative static-key is a useful feature on its own, e.g. when classifying reportable metrics into buckets known in advance, but dyn-key and configurable-static-key are not interchangeable.

dynamic systems will have pathological failure conditions. Consider what happens when someone uses hostname in a dynafile template and then some system starts spewing malformed logs that put garbage data in that field. Creating hundreds or thousands of metric variables is much worse.

I agree that pre-declared keys are less flexible, but they are also going to be far safer and easier to deal with.

David Lang
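The failure mode David describes (garbage values in a message field creating an unbounded number of metric names) can be illustrated with a short Python sketch. One common way to contain it is to cap the number of distinct keys and count everything beyond the cap in a single overflow bucket. This is an assumption made for illustration, not something proposed in the thread; `BoundedDynStats`, `max_keys` and `overflow` are hypothetical names.

```python
class BoundedDynStats:
    """Counters keyed by a message-derived value, capped at max_keys distinct keys."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.counters = {}
        self.overflow = 0   # increments that arrived for keys beyond the cap

    def inc(self, key):
        if key in self.counters:
            self.counters[key] += 1
        elif len(self.counters) < self.max_keys:
            self.counters[key] = 1          # room left: start tracking this key
        else:
            self.overflow += 1              # table full: do not create a new name
```

With a cap of 2, a stream of `host1`, `host2`, then two garbage hostnames, then `host1` again leaves `counters` at `{"host1": 2, "host2": 1}` and `overflow` at 2. Memory stays bounded, at the price of losing per-key detail once the cap is hit.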
[rsyslog] RFC: dynamic-stats support
Hi,

I am working on support for stats with dynamic-name. This comes in handy in situations where the metric-name is dependent upon the value of a certain attribute of the message.

Say, for a central log-aggregation service, it's valuable to know what the inbound message-count distribution is across application-clusters that send logs to it, or for a shared-server, it's valuable to know what the log-volume generation is across users etc.

I'm thinking of using a function-like interface to support this. It may look similar to this:

dyn_stats("user_msg_count")

...

ruleset(...) {
    ...
    dyn_inc("user_msg_count", $.user)
    ...
}

dyn_stats signature looks like:
dyn_stats(, , , )

dyn_inc signature looks like:
dyn_inc(, )

Reporting would work similar to static-metric via impstats. Mapping:
statsobj_s.name = name_space
statsobj_s.origin = "dyn"
ctr_s.name = "foo" (say $.user had value foo)

Thoughts / suggestions?

--
Regards,
Janmejay
http://codehunk.wordpress.com
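As a rough model of the proposal above, the following Python sketch shows how a declared name-space, a per-message increment and the reporting mapping could fit together. It is only an illustration of the described behaviour, not the rsyslog implementation: `DynStatsRegistry` and `report` are hypothetical, and only the name-space argument of `dyn_stats` is modelled, since the remaining parameters are not recoverable from the message.

```python
class DynStatsRegistry:
    def __init__(self):
        self._namespaces = {}

    def dyn_stats(self, name_space):
        """Declare a name-space that dynamic counters can be created under."""
        self._namespaces.setdefault(name_space, {})

    def dyn_inc(self, name_space, key):
        """Bump the counter named by `key`, creating it on first use."""
        counters = self._namespaces[name_space]   # KeyError if never declared
        counters[key] = counters.get(key, 0) + 1

    def report(self):
        """One record per name-space, following the mapping in the mail:
        name = name_space, origin = "dyn", one counter per observed key."""
        return [
            {"name": ns, "origin": "dyn", "values": dict(counters)}
            for ns, counters in self._namespaces.items()
        ]
```

For example, after declaring `user_msg_count` and calling `dyn_inc("user_msg_count", "foo")` twice and `dyn_inc("user_msg_count", "bar")` once, `report()` yields a single record whose `values` are `{"foo": 2, "bar": 1}`.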
Re: [rsyslog] RFC: dynamic-stats support
On Tue, 6 Oct 2015, singh.janmejay wrote:
> Hi,
>
> I am working on support for stats with dynamic-name. This comes in handy in situations where the metric-name is dependent upon the value of a certain attribute of the message.
>
> Say, for a central log-aggregation service, it's valuable to know what the inbound message-count distribution is across application-clusters that send logs to it, or for a shared-server, it's valuable to know what the log-volume generation is across users etc.
>
> I'm thinking of using a function-like interface to support this. It may look similar to this:
>
> dyn_stats("user_msg_count")
>
> ...
>
> ruleset(...) {
>     ...
>     dyn_inc("user_msg_count", $.user)
>     ...
> }
>
> dyn_stats signature looks like:
> dyn_stats(, , , )
>
> dyn_inc signature looks like:
> dyn_inc(, )
>
> Reporting would work similar to static-metric via impstats. Mapping:
> statsobj_s.name = name_space
> statsobj_s.origin = "dyn"
> ctr_s.name = "foo" (say $.user had value foo)
>
> Thoughts / suggestions?

how is this different/better than global variables? (although we may need to implement some functions, atomic inc/dec copy+clear) If you have pstats output in json format, you can even piggyback on its schedule to output the data.

things like stats can very easily end up being expensive in terms of locking (something global variables already have figured out), and it sounds like you are proposing adding a scheduler of some sort to output the data.

variables should not need to be 'recycled', either they contain data or they don't. If they contain data, you need to keep the data until you do something with it, if they don't, you don't have to track them.

I am actually doing this sort of thing external to rsyslog in SEC

I have a template in rsyslog that contains hostname, fromhost-ip, programname and I output it via improg to SEC. SEC accumulates counters and has scheduled outputs to files.

before I started using SEC for this, I used the same template to output to a file and then for reports, used cut + sort + uniq -c to extract the data I need. When the files only contain the significant data, this is actually not bad to do, even at higher volumes.

David Lang
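The cut + sort + uniq -c report described above amounts to picking one field per line and counting distinct values. A small Python equivalent is sketched here; the comma-separated layout (hostname, fromhost-ip, programname) and the name `count_field` are assumptions for illustration, since the actual template is not shown in the mail.

```python
from collections import Counter


def count_field(lines, field, sep=","):
    """Count occurrences of one field, like `cut -d, -fN | sort | uniq -c`.

    `field` is a zero-based index; blank lines are skipped.
    Returns (value, count) pairs sorted by value.
    """
    counts = Counter(
        line.rstrip("\n").split(sep)[field]
        for line in lines
        if line.strip()
    )
    return sorted(counts.items())
```

Given the lines `web01,10.0.0.5,sshd`, `web01,10.0.0.5,cron` and `db01,10.0.0.9,sshd`, calling `count_field(lines, 2)` returns `[("cron", 1), ("sshd", 2)]`.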
Re: [rsyslog] RFC: dynamic-stats support
Rainer,

I see this as something completely outside the scope of variables. Building a stats collector over variables is possible, but then we are talking about a general-purpose language which allows building such complex things. This increases the scope of RainerScript, and with larger scope comes complexity. I feel this is in line with the other Lua discussion where you emphasized that RainerScript should not become a fully general-purpose language?

E.g. creating an atomic-increment function for variables requires that we educate users about what can and can't be done if the atomic-increment function is used anywhere on a variable, and what relationship they can expect it to have with other atomic-incrementing variables (which gets into memory model).

On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards wrote:
> I can't fully dig into this, but I think we must *very carefully* evaluate the overall design. Some time ago we introduced arrays for the limited liblognorm use case, and it hurts us every now and then when folks want to use arrays for other use cases. It probably makes sense to re-think how the variable engine etc behaves before adding more functionality. And make sure that everything works smoothly in all use cases. While anything else may take care of some use cases, I fear we may get too fragmented. At least this is what I learned in the past months' discussions.
>
> Anyone else?
>
> Rainer
>
> 2015-10-06 17:10 GMT+02:00 singh.janmejay:
>> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>>
>> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>>
>> Recycle is precisely to allow this lockless mechanism to work. It's basically saying, it'll track metric-names it has seen in the last 1 hour. If we kill tracking of it as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low-values are common or load is not uniform in time.
>>
>> Implementing it on top of global-variables not only has a very high performance-penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where user has to worry about reset etc).
>>
>> I don't plan to have a scheduler in this implementation. GetAllStatsLines call will purge the tree instead of reset at that interval. It's basically a balance between freeing-up memory occupied by stale-metric-names vs. performance (lockless handling of increment). So it will be governed by impstats schedule. Maybe I should change name to a better name (equivalent of purge_known_keys_after_they_have_been_reported_N_times).
>>
>> On Tue, Oct 6, 2015 at 4:30 PM, David Lang wrote:
>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>> Hi,
>>>>
>>>> I am working on support for stats with dynamic-name. This comes in handy in situations where the metric-name is dependent upon the value of a certain attribute of the message.
>>>>
>>>> Say, for a central log-aggregation service, it's valuable to know what the inbound message-count distribution is across application-clusters that send logs to it, or for a shared-server, it's valuable to know what the log-volume generation is across users etc.
>>>>
>>>> I'm thinking of using a function-like interface to support this. It may look similar to this:
>>>>
>>>> dyn_stats("user_msg_count")
>>>>
>>>> ...
>>>>
>>>> ruleset(...) {
>>>>     ...
>>>>     dyn_inc("user_msg_count", $.user)
>>>>     ...
>>>> }
>>>>
>>>> dyn_stats signature looks like:
>>>> dyn_stats(, , default=10k>, )
>>>>
>>>> dyn_inc signature looks like:
>>>> dyn_inc(, )
>>>>
>>>> Reporting would work similar to static-metric via impstats. Mapping:
>>>> statsobj_s.name = name_space
>>>> statsobj_s.origin = "dyn"
>>>> ctr_s.name = "foo" (say $.user had value foo)
>>>>
>>>> Thoughts / suggestions?
>>>
>>> how is this different/better than global variables? (although we may need to implement some functions, atomic inc/dec copy+clear) If you have pstats output in json format, you can even piggyback on its schedule to output the data.
>>>
>>> things like stats can very easily end up being expensive in terms of locking (something global variables already have figured out), and it sounds like you are proposing adding a scheduler of some sort to output the data.
>>>
>>> variables should not need to be 'recycled', either they contain data or they don't. If they contain data, you need to keep the data until you do something with it, if they don't, you don't have to track them.
>>>
>>> I am actually doing
Re: [rsyslog] RFC: dynamic-stats support
I personally would argue that stats around the actual content of syslog messages are outside of the domain that rsyslog should be responsible for. impstats makes sense to me as it provides statistics around rsyslog's operation itself. Once I start wanting stats and counters around message content, I would rather delegate that to a different system entirely.

On Tue, Oct 6, 2015 at 12:04 PM, singh.janmejay wrote:
> Rainer,
>
> I see this as something completely outside the scope of variables. Building a stats collector over variables is possible, but then we are talking about a general-purpose language which allows building such complex things. This increases the scope of RainerScript, and with larger scope comes complexity. I feel this is in line with the other Lua discussion where you emphasized that RainerScript should not become a fully general-purpose language?
>
> E.g. creating an atomic-increment function for variables requires that we educate users about what can and can't be done if the atomic-increment function is used anywhere on a variable, and what relationship they can expect it to have with other atomic-incrementing variables (which gets into memory model).
>
> On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards wrote:
>> I can't fully dig into this, but I think we must *very carefully* evaluate the overall design. Some time ago we introduced arrays for the limited liblognorm use case, and it hurts us every now and then when folks want to use arrays for other use cases. It probably makes sense to re-think how the variable engine etc behaves before adding more functionality. And make sure that everything works smoothly in all use cases. While anything else may take care of some use cases, I fear we may get too fragmented. At least this is what I learned in the past months' discussions.
>>
>> Anyone else?
>>
>> Rainer
>>
>> 2015-10-06 17:10 GMT+02:00 singh.janmejay:
>>> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>>>
>>> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>>>
>>> Recycle is precisely to allow this lockless mechanism to work. It's basically saying, it'll track metric-names it has seen in the last 1 hour. If we kill tracking of it as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low-values are common or load is not uniform in time.
>>>
>>> Implementing it on top of global-variables not only has a very high performance-penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where user has to worry about reset etc).
>>>
>>> I don't plan to have a scheduler in this implementation. GetAllStatsLines call will purge the tree instead of reset at that interval. It's basically a balance between freeing-up memory occupied by stale-metric-names vs. performance (lockless handling of increment). So it will be governed by impstats schedule. Maybe I should change name to a better name (equivalent of purge_known_keys_after_they_have_been_reported_N_times).
>>>
>>> On Tue, Oct 6, 2015 at 4:30 PM, David Lang wrote:
>>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>>> Hi,
>>>>>
>>>>> I am working on support for stats with dynamic-name. This comes in handy in situations where the metric-name is dependent upon the value of a certain attribute of the message.
>>>>>
>>>>> Say, for a central log-aggregation service, it's valuable to know what the inbound message-count distribution is across application-clusters that send logs to it, or for a shared-server, it's valuable to know what the log-volume generation is across users etc.
>>>>>
>>>>> I'm thinking of using a function-like interface to support this. It may look similar to this:
>>>>>
>>>>> dyn_stats("user_msg_count")
>>>>>
>>>>> ...
>>>>>
>>>>> ruleset(...) {
>>>>>     ...
>>>>>     dyn_inc("user_msg_count", $.user)
>>>>>     ...
>>>>> }
>>>>>
>>>>> dyn_stats signature looks like:
>>>>> dyn_stats(, , default=10k>, )
>>>>>
>>>>> dyn_inc signature looks like:
>>>>> dyn_inc(, )
>>>>>
>>>>> Reporting would work similar to static-metric via impstats. Mapping:
>>>>> statsobj_s.name = name_space
>>>>> statsobj_s.origin = "dyn"
>>>>> ctr_s.name = "foo" (say $.user had value foo)
>>>>>
>>>>> Thoughts / suggestions?
>>>>
>>>> how is this different/better than global variables? (although we may need to
Re: [rsyslog] RFC: dynamic-stats support
Sure, sounds good. The ability to gather stats with dynamic-key is important. I am even willing to help with a rewrite if at some point we feel it's best implemented in a different way in the light of variable-implementation changes.

On Tue, Oct 6, 2015 at 9:48 PM, Rainer Gerhards wrote:
> 2015-10-06 18:04 GMT+02:00 singh.janmejay:
>> Rainer,
>>
>> I see this as something completely outside the scope of variables. Building a stats collector over variables is possible, but then we are talking about a general-purpose language which allows building such complex things. This increases the scope of RainerScript, and with larger scope comes complexity. I feel this is in line with the other Lua discussion where you emphasized that RainerScript should not become a fully general-purpose language?
>>
>> E.g. creating an atomic-increment function for variables requires that we educate users about what can and can't be done if the atomic-increment function is used anywhere on a variable, and what relationship they can expect it to have with other atomic-incrementing variables (which gets into memory model).
>
> Maybe I just feel overwhelmed at the moment with keeping track of everything that is going on. How about this: we can merge it BUT flag it as experimental. If all works out well, I am free starting early next year to have a deep look at the overall design and sticking together all those loose edges. I suspect that I would like to change a couple of things in the interest of tying it all well together (like I currently do in liblognorm).
>
> But if I need to carry all this legacy, that's really a burden (e.g. liblognorm now contains the full v1 code as a copy, which means it also needs to be somewhat maintained). I want to avoid this. As long as we document this as an *interim* solution that is not necessarily here to stay and as such "use at your own risk and it will probably break next year" I am sufficiently happy with that. We just must be aware that things may really break and there is a big chance they actually will. And I don't want to hear about potential vuln or compatibility issues or whatever when this code is changed/removed.
>
> Also keep in mind that I probably need to totally revamp the variable system, as json-c has many problematic parts for our use (what I learned when digging deep with liblognorm). So I *know* that there are big changes coming up next year!
>
> And, full ack: I want to limit the scope of RainerScript. Arrays was a good sample of why it may be a bad idea to go too boldly forward without thinking about the big picture.
>
> Rainer
>
>> On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards wrote:
>>> I can't fully dig into this, but I think we must *very carefully* evaluate the overall design. Some time ago we introduced arrays for the limited liblognorm use case, and it hurts us every now and then when folks want to use arrays for other use cases. It probably makes sense to re-think how the variable engine etc behaves before adding more functionality. And make sure that everything works smoothly in all use cases. While anything else may take care of some use cases, I fear we may get too fragmented. At least this is what I learned in the past months' discussions.
>>>
>>> Anyone else?
>>>
>>> Rainer
>>>
>>> 2015-10-06 17:10 GMT+02:00 singh.janmejay:
>>>> It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.
>>>>
>>>> Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.
>>>>
>>>> Recycle is precisely to allow this lockless mechanism to work. It's basically saying, it'll track metric-names it has seen in the last 1 hour. If we kill tracking of it as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low-values are common or load is not uniform in time.
>>>>
>>>> Implementing it on top of global-variables not only has a very high performance-penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where user has to worry about reset etc).
>>>>
>>>> I don't plan to have a scheduler in this implementation. GetAllStatsLines call will purge the tree instead of reset at that interval. It's basically a balance between freeing-up memory occupied by stale-metric-names vs. performance (lockless handling of increment). So it will be governed by impstats schedule. Maybe I should change name to a better name (equivalent of
Re: [rsyslog] RFC: dynamic-stats support
2015-10-06 18:04 GMT+02:00 singh.janmejay:
> Rainer,
>
> I see this as something completely outside the scope of variables.
> Building stats collector over variables is possible, but then we are
> then talking about a general purpose language which allows building
> such complex things. This increases the scope of Rainerscript and with
> larger scope comes complexity. I feel this is in-line with the other
> Lua discussion where you emphasized that Rainerscript should not
> become a fully-general-purpose language?
>
> Eg. creating an atomic-increment function for variable requires that
> we educate users about what can and can't be done if atomic-increment
> function is used anywhere on a variable. What relationship they can
> expect it to have with other atomic-incrementing variables (which gets
> into memory model).

Maybe I just feel overwhelmed at the moment with keeping track of everything that is going on. How about this: we can merge it BUT flag it as experimental. If all works out well, I am free starting early next year to have a deep look at the overall design and tie together all those loose ends. I suspect that I would like to change a couple of things in the interest of tying it all well together (like I currently do in liblognorm).

But if I need to carry all this legacy, that's really a burden (e.g. liblognorm now contains the full v1 code as a copy, which means it also needs to be somewhat maintained). I want to avoid this. As long as we document this as an *interim* solution that is not necessarily here to stay and as such "use at your own risk and it will probably break next year" I am sufficiently happy with that. We just must be aware that things may really break and there is a big chance they actually will. And I don't want to hear about potential vuln or compatibility issues or whatever when this code is changed/removed. Also keep in mind that I probably need to totally revamp the variable system, as json-c has many problematic parts for our use (what I learned when digging deep with liblognorm). So I *know* that there are big changes coming up next year!

And, full ack: I want to limit the scope of RainerScript. Arrays were a good example of why it may be a bad idea to go too boldly forward without thinking about the big picture.

Rainer

> On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards wrote:
>> I can't fully dig into this, but I think we must *very carefully*
>> evaluate the overall design. Some time ago we introduced arrays for
>> the limited liblognorm use case, and it hurts us every now and then
>> when folks want to use arrays for other use cases. It may probably
>> make sense to re-think how the variable engine etc behaves before
>> adding more functionality. And make sure that everything works smooth
>> in all use cases. While anything else may take care for some use
>> cases, I fear we may get too fragmented. At least this is what I
>> learned in the past months discussions.
>>
>> Anyone else?
>>
>> Rainer
>>
>> 2015-10-06 17:10 GMT+02:00 singh.janmejay:
>>> It is possible to use global-variables (it'll require some
>>> enhancements, table-support etc), but it'll be very inefficient
>>> compared to this approach. For instance, choice of data-structure etc
>>> allows making the solution a lot more efficient.
>>>
>>> Here its possible to locklessly increment counters in most cases, so
>>> its overhead is a lot lesser than global-variables.
>>>
>>> Recycle is precisely to allow this lockless mechanism to work. Its
>>> basically saying, it'll track metric-names he has seen in last 1 hour.
>>> If we kill tracking of it as soon as we don't see an increment
>>> (between 2 reporting runs of impstats), it'll lead to unnecessary
>>> churn when low-values are common or load is not uniform in time.
Re: [rsyslog] RFC: dynamic-stats support
2015-10-06 18:36 GMT+02:00 singh.janmejay:
> Sure, sounds good.
>
> The ability to gather stats with dynamic-key is important. I am
> willing to even help with a rewrite if at some point we feel its best
> implemented in a different way in the light of variable-implementation
> changes.

Going a bit OT here: json-c is fine. But we have begun to abuse it a bit as a general variable system, which it really isn't. So I think we need to revamp the var system completely, and that should, among other things, address concurrency and performance.

Once we reach this point, it may (may!) make sense to change some of the inner workings of liblognorm again. I am currently modelling the changes in a way that makes it relatively easy to do so later. The idea is that we could later have a switch to either return json-c objects (which would be generated just before handing the object back to the user) ... or some better representation. I am *not* saying this will necessarily happen, but it is a potential route.

I focus on liblognorm at the moment. When that is fully completed, I'll look into what I can further improve in json-c without breaking what it really is. Once this is done, it's the right time to look at the actual need to replace the rsyslog variable system (I am 90% sure it's required, but that really needs to be seen...). Just to clarify.

To the current project: whatever you do, please make sure the doc prominently says "this is an interim solution and it may totally change". I hate to request that (and even more hate it if I need to break it later), but that's the only path I currently see moving forward in a situation where I see a big change is very probable but don't have the time to do it right now. I really hate potentially wasting your time.

Rainer

> On Tue, Oct 6, 2015 at 9:48 PM, Rainer Gerhards wrote:
>> 2015-10-06 18:04 GMT+02:00 singh.janmejay:
>>> Rainer,
>>>
>>> I see this as something completely outside the scope of variables.
Re: [rsyslog] RFC: dynamic-stats support
On Tue, 6 Oct 2015, singh.janmejay wrote:

> It is possible to use global-variables (it'll require some
> enhancements, table-support etc), but it'll be very inefficient
> compared to this approach. For instance, choice of data-structure etc
> allows making the solution a lot more efficient.

As for the data structures, Rainer has been identifying inefficiencies in how json-c works and working to improve them.

> Here its possible to locklessly increment counters in most cases, so
> its overhead is a lot lesser than global-variables.

How can you manage counters in multiple threads without locks? Especially when dealing with batches.

> Recycle is precisely to allow this lockless mechanism to work. Its
> basically saying, it'll track metric-names he has seen in last 1 hour.
> If we kill tracking of it as soon as we don't see an increment
> (between 2 reporting runs of impstats), it'll lead to unnecessary
> churn when low-values are common or load is not uniform in time.

That depends on the cost of initializing a metric vs the cost of tracking the recycle mechanism.

> Implementing it on top of global-variables is not only has very high
> performance-penalty(it'll be prohibitive for high-throughput
> scenarios), it also exposes too much complexity to the user (where
> user has to worry about reset etc).
>
> I don't plan to have a scheduler in this implementation.
> GetAllStatsLines call will purge the tree instead of reset at that
> interval. Its basically a balance between freeing-up memory occupied
> by stale-metric-names vs. performance (lockless handling of
> increment). So it will be governed by impstat schedule. May be I
> should change name to better name (equivalent of
> purge_known_keys_after_they_have_been_reported_N_times).

If this is just adding additional metrics to the impstats output, that eliminates the scheduler/reset issue.

I think we should have a metric configuration be fairly static: allow configuring custom metrics and adding to them, but don't use data from the message as part of the name of the metric, and continue reporting them forever, even if they are 0 (so no need to 'recycle' names).

David Lang
___
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
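To make the "purge known keys after they have been reported N times" idea from the quoted proposal concrete, here is a minimal single-threaded Python sketch. It is only an illustration under stated assumptions, not the rsyslog implementation: the class name `DynStats`, the `purge_after_reports` parameter, and the output line layout are all hypothetical, "reported N times" is read here as "N consecutive reports with no increment", and locking is deliberately left out.

```python
from typing import Dict, List


class DynStats:
    """Hypothetical namespace of dynamically named counters (illustration only)."""

    def __init__(self, name_space: str, purge_after_reports: int = 2) -> None:
        self.name_space = name_space
        # Assumption: a key is dropped after this many consecutive idle reports.
        self.purge_after_reports = purge_after_reports
        self.counts: Dict[str, int] = {}
        self.idle_reports: Dict[str, int] = {}

    def inc(self, key: str) -> None:
        """Increment the counter for `key`, creating it on first use."""
        self.counts[key] = self.counts.get(key, 0) + 1
        self.idle_reports[key] = 0  # the key is active again

    def report(self) -> List[str]:
        """Emit one line per known key, reset the counts, purge stale keys."""
        lines = []
        for key in sorted(self.counts):
            value = self.counts[key]
            # Made-up line layout, not the real impstats output format.
            lines.append(f"{self.name_space}: origin=dyn {key}={value}")
            if value == 0:
                self.idle_reports[key] += 1
            self.counts[key] = 0
        stale = [k for k, idle in self.idle_reports.items()
                 if idle >= self.purge_after_reports]
        for key in stale:
            del self.counts[key]
            del self.idle_reports[key]
        return lines


if __name__ == "__main__":
    stats = DynStats("user_msg_count", purge_after_reports=2)
    for user in ["foo", "foo", "bar"]:
        stats.inc(user)
    for _ in range(4):
        print(stats.report())
```

With a larger `purge_after_reports`, a key that goes quiet for one interval keeps its slot and is reported as 0, which is the churn-avoidance argument; with a value of 1 it is dropped after its first idle report and must be re-created on the next increment, which is the initialization cost David asks about.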
Re: [rsyslog] RFC: dynamic-stats support
It is possible to use global-variables (it'll require some enhancements, table-support etc), but it'll be very inefficient compared to this approach. For instance, choice of data-structure etc allows making the solution a lot more efficient.

Here it's possible to locklessly increment counters in most cases, so its overhead is a lot lower than global-variables.

Recycle is precisely to allow this lockless mechanism to work. It's basically saying it'll track metric-names it has seen in the last 1 hour. If we kill tracking of a name as soon as we don't see an increment (between 2 reporting runs of impstats), it'll lead to unnecessary churn when low values are common or load is not uniform in time.

Implementing it on top of global-variables not only has a very high performance penalty (it'll be prohibitive for high-throughput scenarios), it also exposes too much complexity to the user (where the user has to worry about reset etc).

I don't plan to have a scheduler in this implementation. The GetAllStatsLines call will purge the tree instead of resetting at that interval. It's basically a balance between freeing up memory occupied by stale metric-names vs. performance (lockless handling of increment). So it will be governed by the impstats schedule. Maybe I should change the name to a better one (equivalent of purge_known_keys_after_they_have_been_reported_N_times).

On Tue, Oct 6, 2015 at 4:30 PM, David Lang wrote:
> On Tue, 6 Oct 2015, singh.janmejay wrote:
>
>> Hi,
>>
>> I am working on support for stats with dynamic-name. This comes handy
>> in situations where metric-name is dependent upon value of a certain
>> attribute of the message.
>>
>> Say, for a central log-aggregation service, its valuable to know what
>> is inbound message-count distribution across application-clusters that
>> send logs to it, or for a shared-server, its valuable to know what is
>> the log-volume generation across users etc.
>>
>> Im thinking of using functions-like interface to support this. It may
>> look similar to this:
>>
>>
>> dyn_stats("user_msg_count")
>>
>> ...
>>
>> ruleset(...) {
>> ...
>> dyn_inc("user_msg_count", $.user)
>> ...
>> }
>>
>>
>> dyn_stats signature looks like:
>> dyn_stats(, , > default=10k>, )
>>
>> dyn_inc signature looks like:
>> dyn_inc(, )
>>
>>
>> Reporting would work similar to static-metric via impstats. Mapping:
>> statsobj_s.name = name_space
>> statsobj_s.origin = "dyn"
>> ctr_s.name = "foo" (say $.user had value foo)
>>
>>
>> Thoughts / suggestions?
>
>
> how is this different/better than global variables? (although we may need to
> implement soem functions, atomic inc/dec copy+clear) If you have pstats
> output in json format, you can even piggyback on it's schedule to output the
> data.
>
>
> things like stats can very easily end up being expensive in terms of locking
> (something global variables already have figured out), and it sounds like
> you are proposing adding a scheduler of some sort to output the data.
>
> variables should not need to be 'recycled', either they contain data or they
> don't. If they contain data, you need to keep the data until you do
> something with it, if they don't, you don't have to track them.
>
>
> I am actually doing this sort of thing external to rsyslog in SEC
>
> I have a template in rsyslog that contains hostname, fromhost-ip,
> programname and I output it via improg to SEC. SEC accumulates counters and
> has scheduled outputs to files.
>
> before I started using SEC for this, I used the same template to output to a
> file and then for reports, used cut + sort + uniq -c to extract the data I
> need. When the files only contain the significant data, this is actually not
> bad to do, even at higher volumes.
>
> David Lang

--
Regards,
Janmejay
http://codehunk.wordpress.com
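For comparison, the external approach David describes (write a small template to a file, then count with cut + sort + uniq -c) can be approximated in a few lines of Python. This is a rough sketch, not his actual setup: the function name `count_field` is hypothetical, and the space-separated field order (hostname, fromhost-ip, programname) is an assumption about how such a template might be laid out.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_field(lines: Iterable[str], field: int, sep: str = " ") -> List[Tuple[int, str]]:
    """Count occurrences of one separator-delimited field (0-based index).

    Roughly what `cut -d' ' -fN | sort | uniq -c` yields: (count, value)
    pairs ordered by value. Lines with too few fields are skipped.
    """
    counts: Counter = Counter()
    for line in lines:
        parts = line.rstrip("\n").split(sep)
        if field < len(parts):
            counts[parts[field]] += 1
    return [(n, value) for value, n in sorted(counts.items())]


if __name__ == "__main__":
    # Made-up sample records in the assumed "hostname fromhost-ip programname" layout.
    sample = [
        "web01 10.0.0.5 sshd",
        "web01 10.0.0.5 cron",
        "db01 10.0.0.9 sshd",
    ]
    for n, value in count_field(sample, 2):
        print(f"{n:7d} {value}")
```

The trade-off against in-process dynamic counters is that every message costs a file write and the counting happens after the fact, but nothing inside rsyslog has to track or recycle metric names.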