On Fri, 25 Oct 2013, Rainer Gerhards wrote:
On Fri, Oct 25, 2013 at 10:42 AM, David Lang <[email protected]> wrote:
On Fri, 25 Oct 2013, Rainer Gerhards wrote:
Hi all,
I thought out the details of what I have on my mind and think the solution
can work and support all known use cases. I've also managed to write it
down this morning:
http://blog.gerhards.net/2013/10/a-proposal-for-rsyslog-state-variables.html
I would appreciate if you could check it and see how the spec can be
technically broken or identify use cases which it will not be able to
handle.
One case I don't think it can handle is the current work that mmcount is
doing.
As I understand mmcount, it creates a counter for each appname that it
sees, and uses that to count how many times that appname has been seen.
the fact that you can use $/a!b lets you have the variables without having
the conflict with anything else,
but you end up with no way to know what variables exist
you can't output the entire set of counts without being able to use $/a
you can't even lookup the count for the current message's appname without
being able to use something like $a/{$appname}
you are right, that use case breaks. Thanks! I'll review mmcount in more
depth now and see how it is actually used.
The ability to support subtrees can be added, but will be relatively
expensive.
I don't think you need normal full subtree support, you could get away with a
function like export($/a) to return the tree, this could be with atomic locking
if needed.
I think the hardest thing to do (although possibly the most valuable in the long
run) is the concept of $[/!.]a!{$b} to allow you to use a variable as part of
the name of another variable.
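
To make the two ideas above concrete, here is a rough sketch in Python (not rsyslog code, and not RainerScript). VarTree, add() and export() are names I made up for illustration; the point is only that a variable's value (the appname) becomes part of the path, and that export() hands back a copy of a subtree taken under one lock.

```python
import copy
import threading


class VarTree:
    """Hypothetical stand-in for the $/ global variable tree."""

    def __init__(self):
        self._lock = threading.Lock()
        self._root = {}

    def add(self, path, step=1):
        """Add step to the counter at path, creating parents as needed.

        path is a list of name components, e.g. ["a", appname], which
        plays the role of $/a!{$appname}.
        """
        with self._lock:
            node = self._root
            for name in path[:-1]:
                node = node.setdefault(name, {})
            node[path[-1]] = node.get(path[-1], 0) + step
            return node[path[-1]]

    def export(self, path):
        """Return a deep copy of the subtree at path, taken under the lock."""
        with self._lock:
            node = self._root
            for name in path:
                node = node[name]
            return copy.deepcopy(node)


tree = VarTree()
for appname in ["sshd", "cron", "sshd"]:
    # The message's appname is used as part of the variable name.
    tree.add(["a", appname])

counts = tree.export(["a"])
print(counts)
```

Because export() returns a copy, the caller can format or output the whole set of counts without holding the lock any longer than the copy takes.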
depending on how large the table of global variables ends up, I wonder if
it would be easier (and because of the simpler design, possibly even
faster) to just make a shadow copy of all global variables at the start of
the message processing (I'm thinking the RCU (read-copy-update) mechanism
may be a good fit for this).
I don't think so, as with each of these "copies" I would need to update
work flags (like modifiable) for all variables. Also, I need to know which
one is actually shadowed, so that would be a new flag (currently, it is
just a failed lookup on the shadow dictionary).
my thought is that you essentially shadow everything. each variable would have a
flag that in the master copy is modifiable.
for each message, you would
grab a pointer to the 'current' master table.
if all you do is read variables, you read them from the copy you have a pointer
to.
when you do an update, you copy the entire table, modify your copy and it
becomes the new master copy (with a different pointer), any other thread has a
pointer to the old copy and keeps using it (making the old master become the
'shadow' version for all the other threads). once all threads have finished a
batch, old copies can be garbage collected (how to track this is a different
discussion that I will leave out for the moment for simplicity)
atomic operations would have to lock the pointer to the current master table,
check to see if it matches the pointer the thread is using, if so, continue
normally, if not, it would have to update the variable based on the current
master, not its 'shadow' copy.
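
As a rough illustration of the scheme described above, here is a Python sketch. MasterTable, snapshot(), update() and atomic_add() are hypothetical names, and it leans on Python's own garbage collector to drop old copies once no thread holds them, which sidesteps the tracking question left out above. Treat it as a sketch of the idea, not a proposal for the actual implementation.

```python
import threading


class MasterTable:
    """Copy-on-update table: readers keep whatever copy they grabbed."""

    def __init__(self, initial=None):
        self._write_lock = threading.Lock()
        self._current = dict(initial or {})

    def snapshot(self):
        # A reader grabs a reference to the current master once per message.
        # Published tables are never modified in place.
        return self._current

    def update(self, name, value):
        # Copy the entire table, modify the copy, publish it as new master.
        with self._write_lock:
            new_table = dict(self._current)
            new_table[name] = value
            self._current = new_table
            return new_table

    def atomic_add(self, seen, name, step):
        # Lock the master pointer and compute from the current master, so a
        # thread holding a stale 'shadow' copy still gets a correct result.
        # The second return value says whether `seen` was stale.
        with self._write_lock:
            base = self._current
            new_table = dict(base)
            new_table[name] = base.get(name, 0) + step
            self._current = new_table
            return new_table[name], base is not seen


master = MasterTable({"path": "/var/log", "n": 0})
view = master.snapshot()             # this thread's view for the message
master.update("path", "/srv/log")    # some other thread publishes a change
print(view["path"])                  # the old copy is untouched
print(master.snapshot()["path"])     # new readers see the new master
value, was_stale = master.atomic_add(view, "n", 1)
```

Every update copies the whole table, so this is cheap only while updates are rare relative to reads.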
My assumption (I should
have spelled this out) is also that we have a very low number of state
variable updates, and even a low number of read accesses to them.
I think this assumption is incorrect. I think there are four distinct use-cases
1. no global variables in use
no updates, no reads. (obviously :-)
2. global variables used for configuration/path type capabilities
few, if any updates, extremely frequent reads.
3a. counters for reporting
extremely frequent updates, infrequent reads.
3b. counters for load balancing or sequence numbers
extremely frequent updates and reads.
the RCU approach works extremely well for cases #1 and #2, but not for #3a and
#3b
I'm actually thinking that we should possibly split the two use cases.
State Variables ($/) would be intended for infrequent updates (case #2) and
could be handled very well with the RCU style mechanism that I'm talking about.
If you use them for counters, performance will be poor and you have no atomic
operations.
Counter Variables would only be accessed through function calls
atomic_set(counter='varname' value=number)
sets value, returns oldvalue
atomic_add(counter='varname' step=number)
modifies value, returns newvalue
atomic_report(counter='varname')
returns value
Counter variables would not be shadowed, they would be true global variables,
and can be modified in unexpected ways between accesses. if you want to use the
value of a counter for something, you save its value into a local/message
variable. Because these do not look or act like normal variables, there isn't
the expectation that you can do math and multiple actions on them.
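
A minimal sketch of how these three calls could behave, written in Python rather than RainerScript. The function names match the ones above, but the positional arguments, the single lock, and the choice that an unset counter reads as 0 are my assumptions for the sake of the example.

```python
import threading

_lock = threading.Lock()
_counters = {}   # true globals: not shadowed, shared by all threads


def atomic_set(counter, value):
    """Set the counter and return the old value (0 if unset: an assumption)."""
    with _lock:
        old = _counters.get(counter, 0)
        _counters[counter] = value
        return old


def atomic_add(counter, step):
    """Add step to the counter and return the new value."""
    with _lock:
        new = _counters.get(counter, 0) + step
        _counters[counter] = new
        return new


def atomic_report(counter):
    """Return the current value; it may change right after this returns."""
    with _lock:
        return _counters.get(counter, 0)


def worker():
    for _ in range(1000):
        atomic_add("msgs", 1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Save the value into a local variable before doing anything with it.
total = atomic_report("msgs")
print(total)
```

The value returned by atomic_add() is the only reliable way to know what the counter was at the moment of the update, which is what a sequence number or load-balancing decision would need.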
David Lang