Re: [pmacct-discussion] Flexible aggregation

Chris Wilson Sat, 13 Jun 2009 04:11:12 -0700

Hi Paolo,

On Sat, 13 Jun 2009, Paolo Lucente wrote:


> Good pointer. From a brief scan of the Aguri homepage (please feel free 
> to correct me if I'm wrong), I see many similarities between pmacct and 
> Aguri.

I guess so; I was thinking that Aguri seems to store its output in text 
files rather than a database, and perhaps provides more dynamic/automatic 
filtering, but seems to be a research project and not highly supported or 
maintained.

> Aguri is slightly more limited in that it has only a set of (4?) 
> traffic aggregation profiles whereas pmacct offers a wider range of 
> primitives. But I guess the point you wanted to make was the dynamic 
> variation of the sampling rate under increased traffic load (e.g. DDoS).

OK, I didn't realise that it was just the sample rate that was varied. I 
thought it was to do with the flexible aggregation, e.g. if we have 1000 
flows with the same source IP and source port, they might be aggregated 
together as a single, more highly summarised flow.

> pmacct actually does have such a feature, available only to the SQL
> plugins: it's part of the SQL preprocess infrastructure (look for 
> 'sql_preprocess' in the CONFIG-KEYS document or the wiki) and is
> called 'fsrc' (Flow Sampling under Resource Constraints). It is
> an implementation I did years ago, loosely based on a paper coming
> from AT&T Labs. It aims at offering the SQL database a sort of
> streamlined number of aggregates; aggregates are weighted, ranked
> and sampled based on probability (which gives the dynamic/adaptive
> part of the approach); the resource constraint is expressed via
> the number of flows you want to end up in the database (which is in
> turn seen as the constrained resource here).

We are using this feature to filter out small flows, but the problem is 
that they are not accounted for at all, so the database contents e.g. 
SUM(bytes) no longer reflect the interface totals.

What I would ideally like to see, though I realise that it's hard, is 
something like this:

Initial filter selects flows over a certain size and non-selected flows 
can either be discarded (as now) or reaggregated by zeroing a selected 
feature, e.g. the destination port, and combined into a new single record 
if there is more than one of them. These more highly aggregated records 
then continue down the preprocess chain, and if they fail to match a later 
condition then they can be aggregated again in a different way, e.g. by 
zeroing the destination IP address, and so on, until we end up with a 
single record where all the features were aggregated.

For example, sql_preprocess might look something like this:

minb = 10000, zero_dstip, minb = 10000, zero_dstport, minb = 10000, 
zero_srcport, minb = 10000, zero_srcip

Then any flows which together do not add up to enough bytes to pass the 
minb filters, even after aggregation, end up in a record where all the 
selector fields are zeroed out. Since there is no final minb condition, 
this row would always be added to the database, never rejected, so 
SUM(bytes) would again equal the interface counters for any given time 
range.
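
As a rough illustration of the cascade described above, here is a small Python sketch. It is only a model of the idea, not anything pmacct implements: the function name `cascade`, the tuple key layout (src_ip, src_port, dst_ip, dst_port) and the use of 0 / "0.0.0.0" as the "zeroed" values are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical key layout for this sketch only.
FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port")
# Assumed "zeroed" value for each field.
ZERO = {"src_ip": "0.0.0.0", "src_port": 0, "dst_ip": "0.0.0.0", "dst_port": 0}


def cascade(flows, minb, zero_order):
    """Keep records of at least minb bytes; re-aggregate the rest.

    flows: dict mapping (src_ip, src_port, dst_ip, dst_port) -> bytes
    zero_order: field names to zero, one per stage, for records below minb
    """
    kept = {}
    pending = dict(flows)
    for field in zero_order:
        idx = FIELDS.index(field)
        merged = defaultdict(int)
        for key, nbytes in pending.items():
            if nbytes >= minb:
                # Passes the minb check at this stage: goes to the database.
                kept[key] = kept.get(key, 0) + nbytes
            else:
                # Fails: zero one more field and merge with matching records.
                zeroed = key[:idx] + (ZERO[field],) + key[idx + 1:]
                merged[zeroed] += nbytes
        pending = merged
    # No final minb check: whatever is left is always kept.
    for key, nbytes in pending.items():
        kept[key] = kept.get(key, 0) + nbytes
    return kept


flows = {
    ("10.0.0.1", 1234, "192.168.0.1", 80): 50000,
    ("10.0.0.2", 1111, "192.168.0.2", 80): 6000,
    ("10.0.0.2", 1111, "192.168.0.3", 80): 6000,
    ("10.0.0.3", 2222, "192.168.0.4", 443): 100,
    ("10.0.0.4", 3333, "192.168.0.5", 53): 200,
}
# Same order as the sql_preprocess example above.
order = ("dst_ip", "dst_port", "src_port", "src_ip")
result = cascade(flows, 10000, order)

for key, nbytes in sorted(result.items()):
    print(key, nbytes)
print("total in:", sum(flows.values()), "total out:", sum(result.values()))
```

With this input the large flow is kept unchanged, the two 6000-byte flows are combined once the destination IP is zeroed, and the two tiny flows fall through to the all-zero record, so the byte total going in equals the byte total coming out.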

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.
