2015-10-06 18:36 GMT+02:00 singh.janmejay <singh.janme...@gmail.com>:
> Sure, sounds good.
>
> The ability to gather stats with dynamic-key is important. I am
> willing to even help with a rewrite if at some point we feel it's best
> implemented in a different way in the light of variable-implementation
> changes.

going a bit OT here: json-c is fine. But we have begun to abuse it a
bit as a general variable system, which it really isn't. So I think we
need to revamp the var system completely, and that should, among other
things, address concurrency and performance. Once we reach
this point, it may (may!) make sense to change some of the inner
workings of liblognorm again. I am currently modelling the changes in
a way that makes it relatively easy to do so later. The idea is that
we could later have a switch to either return json-c objects (which
would be generated just before handing the object back to the user)
... or some better representation. I am *not* saying this will
necessarily happen, but it is a potential route.

I focus on liblognorm at the moment. When that is fully completed,
I'll look into what I can further improve in json-c without breaking
what it really is. Once this is done, it's the right time to look at
the actual need to replace the rsyslog variable system (I am 90% sure it's
required, but that really needs to be seen...).

Just to clarify, regarding the current project: whatever you do, please
make sure the doc prominently says "this is an interim solution and it may
totally change". I hate to request that (and even more hate it if I
need to break it later), but that's the only path I currently see
moving forward in a situation where I see a big change is very
probable -- but don't have the time to do it right now. I really hate
potentially wasting your time.

Rainer
>
> On Tue, Oct 6, 2015 at 9:48 PM, Rainer Gerhards
> <rgerha...@hq.adiscon.com> wrote:
>> 2015-10-06 18:04 GMT+02:00 singh.janmejay <singh.janme...@gmail.com>:
>>> Rainer,
>>>
>>> I see this as something completely outside the scope of variables.
>>> Building a stats collector over variables is possible, but then we are
>>> talking about a general purpose language which allows building
>>> such complex things. This increases the scope of Rainerscript and with
>>> larger scope comes complexity. I feel this is in-line with the other
>>> Lua discussion where you emphasized that Rainerscript should not
>>> become a fully-general-purpose language?
>>>
>>> Eg. creating an atomic-increment function for variables requires that
>>> we educate users about what can and can't be done if the
>>> atomic-increment function is used anywhere on a variable, and what
>>> relationship they can expect it to have with other atomic-incrementing
>>> variables (which gets into the memory model).
>>
>> Maybe I just feel overwhelmed at the moment with keeping track of
>> everything that is going on. How about this: we can merge it BUT flag
>> it as experimental. If all works out well, I am free starting early
>> next year to have a deep look at the overall design and tie
>> together all those loose ends. I suspect that I would like to change
>> a couple of things in the interest of tying it all well together (like
>> I currently do in liblognorm).
>>
>> But if I need to carry all this legacy, that's really a burden (e.g.
>> liblognorm now contains the full v1 code as a copy, which means it
>> also needs to be somewhat maintained). I want to avoid this. As long
>> as we document this as an *interim* solution that is not necessarily
>> here to stay and as such "use at your own risk and it will probably
>> break next year" I am sufficiently happy with that. We just must be
>> aware that things may really break and there is a big chance they
>> actually will. And I don't want to hear about potential vuln or
>> compatibility issues or whatever when this code is changed/removed.
>> Also keep in mind that I probably need to totally revamp the
>> variable system, as json-c has many problematic parts for our use
>> (what I learned when digging deep with liblognorm). So I *know* that
>> there are big changes coming up next year!
>>
>> And, full ack: I want to limit the scope of RainerScript. Arrays were a
>> good example of why it may be a bad idea to go too boldly forward
>> without thinking about the big picture.
>>
>> Rainer
>>>
>>>
>>>
>>> On Tue, Oct 6, 2015 at 8:49 PM, Rainer Gerhards
>>> <rgerha...@hq.adiscon.com> wrote:
>>>> I can't fully dig into this, but I think we must *very carefully*
>>>> evaluate the overall design. Some time ago we introduced arrays for
>>>> the limited liblognorm use case, and it hurts us every now and then
>>>> when folks want to use arrays for other use cases. It probably
>>>> makes sense to re-think how the variable engine etc. behaves before
>>>> adding more functionality, and to make sure that everything works
>>>> smoothly in all use cases. While anything else may take care of some
>>>> use cases, I fear we may get too fragmented. At least this is what I
>>>> learned in the past months' discussions.
>>>>
>>>> Anyone else?
>>>>
>>>> Rainer
>>>>
>>>> 2015-10-06 17:10 GMT+02:00 singh.janmejay <singh.janme...@gmail.com>:
>>>>> It is possible to use global-variables (it'll require some
>>>>> enhancements, table-support etc), but it'll be very inefficient
>>>>> compared to this approach. For instance, choice of data-structure etc
>>>>> allows making the solution a lot more efficient.
>>>>>
>>>>> Here it's possible to locklessly increment counters in most cases, so
>>>>> the overhead is a lot lower than with global-variables.
>>>>>
>>>>> Recycle is precisely to allow this lockless mechanism to work. It's
>>>>> basically saying that it'll track metric-names it has seen in the
>>>>> last hour. If we kill tracking of a name as soon as we don't see an
>>>>> increment (between 2 reporting runs of impstats), it'll lead to
>>>>> unnecessary churn when low-values are common or load is not uniform
>>>>> in time.
>>>>>
>>>>> Implementing it on top of global-variables not only has a very high
>>>>> performance-penalty (it'll be prohibitive for high-throughput
>>>>> scenarios), it also exposes too much complexity to the user (where
>>>>> the user has to worry about reset etc).
>>>>>
>>>>> I don't plan to have a scheduler in this implementation.
>>>>> The GetAllStatsLines call will purge the tree instead of resetting it
>>>>> at that interval. It's basically a balance between freeing-up memory
>>>>> occupied by stale-metric-names vs. performance (lockless handling of
>>>>> increment). So it will be governed by the impstats schedule. Maybe I
>>>>> should change the name to a better one (equivalent of
>>>>> purge_known_keys_after_they_have_been_reported_N_times).
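
A minimal sketch of the purge-on-report behaviour described above, under stated assumptions: the class name, the `purge_after_reports` parameter and the method names are hypothetical, and this is a single-threaded Python model rather than rsyslog code. It does not model the lockless increment; it only illustrates why a key that goes quiet is kept for a few reporting runs before its name is dropped.

```python
class PurgingCounters:
    """Hypothetical model: counters that are purged after staying idle
    for a given number of consecutive reporting runs."""

    def __init__(self, purge_after_reports=2):
        self.purge_after_reports = purge_after_reports
        self.counts = {}        # metric name -> count in current interval
        self.idle_reports = {}  # metric name -> consecutive idle reports

    def inc(self, key):
        """Increment a counter; any activity clears its idle streak."""
        self.counts[key] = self.counts.get(key, 0) + 1
        self.idle_reports[key] = 0

    def get_all_stats_lines(self):
        """Report every tracked counter, reset it, and purge stale names."""
        lines = [f"{key}={value}" for key, value in sorted(self.counts.items())]
        for key in list(self.counts):
            if self.counts[key] == 0:
                self.idle_reports[key] += 1
            self.counts[key] = 0  # resettable counter semantics
            if self.idle_reports[key] >= self.purge_after_reports:
                # Name was idle for N reports in a row: free its memory.
                del self.counts[key]
                del self.idle_reports[key]
        return lines


pc = PurgingCounters(purge_after_reports=2)
for name in ["foo", "foo", "bar"]:
    pc.inc(name)
report1 = pc.get_all_stats_lines()
pc.inc("foo")
report2 = pc.get_all_stats_lines()
report3 = pc.get_all_stats_lines()
report4 = pc.get_all_stats_lines()
print(report1, report2, report3, report4)
```

In this model `bar` survives one quiet interval and is dropped after the second, so a key that is incremented only occasionally is not created and destroyed on every reporting run.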
>>>>>
>>>>>
>>>>> On Tue, Oct 6, 2015 at 4:30 PM, David Lang <da...@lang.hm> wrote:
>>>>>> On Tue, 6 Oct 2015, singh.janmejay wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am working on support for stats with dynamic-name. This comes in
>>>>>>> handy in situations where metric-name is dependent upon value of a
>>>>>>> certain attribute of the message.
>>>>>>>
>>>>>>> Say, for a central log-aggregation service, it's valuable to know
>>>>>>> the inbound message-count distribution across application-clusters
>>>>>>> that send logs to it, or for a shared-server, it's valuable to know
>>>>>>> the log-volume generation across users etc.
>>>>>>>
>>>>>>> I'm thinking of using a function-like interface to support this. It
>>>>>>> may look similar to this:
>>>>>>>
>>>>>>> ====================
>>>>>>> dyn_stats("user_msg_count")
>>>>>>>
>>>>>>> ...
>>>>>>>
>>>>>>> ruleset(...) {
>>>>>>> ...
>>>>>>> dyn_inc("user_msg_count", $.user)
>>>>>>> ...
>>>>>>> }
>>>>>>> ====================
>>>>>>>
>>>>>>> dyn_stats signature looks like:
>>>>>>> dyn_stats(<name_space>, <resettable: default=true>, <max_cardinality:
>>>>>>> default=10k>, <recycle_metric_names_after: default=1hr>)
>>>>>>>
>>>>>>> dyn_inc signature looks like:
>>>>>>> dyn_inc(<name_space>, <metric_name>)
>>>>>>>
>>>>>>>
>>>>>>> Reporting would work similar to static-metric via impstats. Mapping:
>>>>>>> statsobj_s.name = name_space
>>>>>>> statsobj_s.origin = "dyn"
>>>>>>> ctr_s.name = "foo" (say $.user had value foo)
>>>>>>>
>>>>>>>
>>>>>>> Thoughts / suggestions?
>>>>>>
>>>>>>
>>>>>> how is this different/better than global variables? (although we may
>>>>>> need to implement some functions, atomic inc/dec copy+clear) If you
>>>>>> have pstats output in json format, you can even piggyback on its
>>>>>> schedule to output the data.
>>>>>>
>>>>>>
>>>>>> things like stats can very easily end up being expensive in terms of
>>>>>> locking (something global variables already have figured out), and it
>>>>>> sounds like you are proposing adding a scheduler of some sort to
>>>>>> output the data.
>>>>>>
>>>>>> variables should not need to be 'recycled', either they contain data
>>>>>> or they don't. If they contain data, you need to keep the data until
>>>>>> you do something with it, if they don't, you don't have to track them.
>>>>>>
>>>>>>
>>>>>> I am actually doing this sort of thing external to rsyslog in SEC
>>>>>>
>>>>>> I have a template in rsyslog that contains hostname, fromhost-ip,
>>>>>> programname and I output it via improg to SEC. SEC accumulates
>>>>>> counters and has scheduled outputs to files.
>>>>>>
>>>>>> before I started using SEC for this, I used the same template to
>>>>>> output to a file and then for reports, used cut + sort + uniq -c to
>>>>>> extract the data I need. When the files only contain the significant
>>>>>> data, this is actually not bad to do, even at higher volumes.
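
For readers unfamiliar with the cut + sort + uniq -c idiom, the same per-field tally can be sketched in Python. The comma-separated record layout and the sample records below are assumptions made for this example; the thread does not say which separator the template uses.

```python
from collections import Counter


def count_field(lines, field, sep=","):
    """Tally one field across records, like `cut | sort | uniq -c`."""
    return Counter(
        line.rstrip("\n").split(sep)[field]
        for line in lines
        if line.strip()  # skip blank lines
    )


# Hypothetical records in the form hostname,fromhost-ip,programname
sample = [
    "web01,10.0.0.5,nginx",
    "web01,10.0.0.5,sshd",
    "db01,10.0.0.9,postgres",
    "web01,10.0.0.5,nginx",
]

by_host = count_field(sample, 0)
by_program = count_field(sample, 2)

for host, count in by_host.most_common():
    print(count, host)
```

Because the file holds only the fields of interest, a single pass over it yields the counts, which matches the observation that this stays workable at higher volumes.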
>>>>>>
>>>>>> David Lang
>>>>>> _______________________________________________
>>>>>> rsyslog mailing list
>>>>>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>>>>>> http://www.rsyslog.com/professional-services/
>>>>>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>>>>>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad 
>>>>>> of
>>>>>> sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T
>>>>>> LIKE THAT.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Janmejay
>>>>> http://codehunk.wordpress.com
>>>
>>>
>>>
>>> --
>>> Regards,
>>> Janmejay
>>> http://codehunk.wordpress.com
>
>
>
> --
> Regards,
> Janmejay
> http://codehunk.wordpress.com
