Re: [pmacct-discussion] Flexible aggregation

2009-06-14 Thread Paolo Lucente
Hi Karl,

On Sat, Jun 13, 2009 at 04:30:07PM -0500, Karl O. Pinc wrote:

 A good database should not have problems with simultaneous updates,
 or is there another reason why synchronization is an issue?

No, not really - especially when it comes down to an INSERT-only
scenario. It is just an effort to help spread the load.

 a unique part of a key.  (Although good database design says
 that you don't put meaningful information into a key, which
 makes the key issue moot but would still require another
 database column to track the source (plugin id) of the
 data.)

The concept of a plugin ID is already there: the post_tag feature
implements it, and its value is stored in the 'agent_id' field. It
was originally conceived for exactly this purpose. An idea for the
future could be to allow OR'ing its value into the 'agent_id' field,
so that pre- and post-tagging can co-exist smoothly.

Cheers,
Paolo


___
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists


Re: [pmacct-discussion] Flexible aggregation

2009-06-14 Thread Chris Wilson
Hi Paolo and Karl,

On Sat, 13 Jun 2009, Paolo Lucente wrote:

 On Sat, Jun 13, 2009 at 03:07:01PM -0500, Karl O. Pinc wrote:
 
  We are only interested in a single table.
 
  Why can't two separate sql plugins write to the same table?
 
 What Karl is proposing here might really result in a simpler
 approach compared to the sub-aggregation scenario - which, with
 some care (i.e. sql_startup_delay to avoid event synchronization
 while retaining the same sql_history and sql_refresh_time settings),
 can not only achieve the same results but, best of all, is already
 there. Let us know your thoughts!

I don't think it can. For example, how would we write the configuration? 
Let's say we just want to zero (not aggregate on) the destination IP for 
flows less than 1000 bytes. We could try:

  plugins: mysql[with_dst], mysql[without_dst]
  aggregate[with_dst]: src_host, src_port, dst_host, dst_port, proto
  aggregate[without_dst]: src_host, src_port, dst_port, proto
  sql_preprocess[with_dst]: minb = 1000
  sql_preprocess[without_dst]: maxb = 1000

but the flow aggregates are not the same for both plugins, so we can't 
ensure that any flow ends up in one plugin or the other but not both or 
neither.
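
To make the "both or neither" problem concrete, here is a minimal
Python sketch that mimics the two plugins on a handful of made-up
flows. It assumes that minb keeps aggregates of at least that many
bytes and maxb keeps aggregates of at most that many; the flow data
and the helper names (key_for, stored_flows) are purely illustrative
and not part of pmacct.

```python
from collections import defaultdict

# Hypothetical flows: (src_host, src_port, dst_host, dst_port, proto, bytes)
FLOWS = [
    ("10.0.0.1", 4000, "192.0.2.1", 80, "tcp", 600),
    ("10.0.0.1", 4000, "192.0.2.2", 80, "tcp", 600),
    ("10.0.0.2", 5000, "192.0.2.3", 25, "tcp", 1000),
    ("10.0.0.3", 6000, "192.0.2.4", 53, "udp", 200),
]


def key_for(flow, with_dst_host):
    """Aggregation key of one flow, with or without dst_host."""
    src_host, src_port, dst_host, dst_port, proto, _ = flow
    return (src_host, src_port, dst_host if with_dst_host else None,
            dst_port, proto)


def stored_flows(flows, with_dst_host, keep):
    """Indexes of the flows whose aggregate passes the byte filter."""
    totals = defaultdict(int)
    for flow in flows:
        totals[key_for(flow, with_dst_host)] += flow[-1]
    return {i for i, flow in enumerate(flows)
            if keep(totals[key_for(flow, with_dst_host)])}


# Assumed semantics: minb = 1000 keeps >= 1000, maxb = 1000 keeps <= 1000.
in_with_dst = stored_flows(FLOWS, True, lambda b: b >= 1000)
in_without_dst = stored_flows(FLOWS, False, lambda b: b <= 1000)

both = in_with_dst & in_without_dst
neither = set(range(len(FLOWS))) - (in_with_dst | in_without_dst)
print("in both plugins:", sorted(both))
print("in neither plugin:", sorted(neither))
```

The first two flows are each under 1000 bytes per destination, so
with_dst drops them, yet together they exceed 1000 bytes once the
destination is ignored, so without_dst drops them too. The third
flow sits exactly on the threshold and passes both filters.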

How else could we do it with what we already have? We could write to 
different tables at different levels of aggregation, and let the user 
choose which one to use, and delete old data from each table to stop it 
becoming too large... but that gets more complicated for the user.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.



Re: [pmacct-discussion] Flexible aggregation

2009-06-14 Thread Paolo Lucente
Hi Chris,

On Sun, Jun 14, 2009 at 02:25:10PM +0300, Chris Wilson wrote:

 I don't think it can. For example, how would we write the configuration? 
 Let's say we just want to zero (not aggregate on) the destination IP for 
 flows less than 1000 bytes. We could try:
 
   plugins: mysql[with_dst], mysql[without_dst]
   aggregate[with_dst]: src_host, src_port, dst_host, dst_port, proto
   aggregate[without_dst]: src_host, src_port, dst_port, proto
   sql_preprocess[with_dst]: minb = 1000
   sql_preprocess[without_dst]: maxb = 1000
 
 but the flow aggregates are not the same for both plugins, so we can't 
 ensure that any flow ends up in one plugin or the other but not both or 
 neither.

Actually, you are right. The only alternative, though unscalable,
is the one mentioned previously: the same aggregation, two SQL tables,
complementary 'sql_preprocess' directives.

This implies having a script, called from the crontab, that does the
sub-aggregation job and then puts the new aggregates into the table
whose sql_preprocess is configured with minb = 1000. But this is how
the problem can be tackled with what we have.
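
A rough sketch of what such a cron-driven script might look like,
written in Python against an in-memory SQLite database so that it
runs stand-alone. The table names (acct_small, acct_large), the
simplified column set, and the choice of '0.0.0.0' as the zeroed
destination are all assumptions made for illustration; a real
deployment would use the actual pmacct schema and the MySQL client
library instead.

```python
import sqlite3

# Simplified, hypothetical schema shared by both tables.
SCHEMA = """
CREATE TABLE {name} (
    ip_src TEXT, src_port INTEGER, ip_dst TEXT, dst_port INTEGER,
    ip_proto TEXT, stamp_inserted TEXT, packets INTEGER, bytes INTEGER
)
"""


def sub_aggregate(conn, small="acct_small", large="acct_large"):
    """Fold the rows of `small` into `large` with the destination IP zeroed."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(f"""
            INSERT INTO {large}
            SELECT ip_src, src_port, '0.0.0.0', dst_port, ip_proto,
                   stamp_inserted, SUM(packets), SUM(bytes)
            FROM {small}
            GROUP BY ip_src, src_port, dst_port, ip_proto, stamp_inserted
        """)
        conn.execute(f"DELETE FROM {small}")


# Demo with two small aggregates that differ only in destination IP.
conn = sqlite3.connect(":memory:")
for name in ("acct_small", "acct_large"):
    conn.execute(SCHEMA.format(name=name))
conn.executemany(
    "INSERT INTO acct_small VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("10.0.0.1", 4000, "192.0.2.1", 80, "tcp", "2009-06-14 12:00:00", 3, 600),
        ("10.0.0.1", 4000, "192.0.2.2", 80, "tcp", "2009-06-14 12:00:00", 2, 400),
    ],
)
sub_aggregate(conn)

rows = conn.execute("SELECT ip_dst, packets, bytes FROM acct_large").fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM acct_small").fetchone()[0]
print(rows, remaining)
```

Doing the insert and the delete in one transaction is a deliberate
choice here: if the script dies half way, the small aggregates are
neither lost nor counted twice.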

The only nice way to do this would be to perform the sub-aggregation
at the sql_preprocess stage. Doing so for 0.12.0p2 should be highly
feasible.

Cheers,
Paolo


