Hi Ross,

I'm Cc'ing this to [EMAIL PROTECTED] since that is the
flowscan-specific mailing list.

I've replied in context below:

On Thu, Feb 21, 2008 at 02:13:15PM +1100, Ross Tsolakidis wrote:
> Hi all,
> 
> I just needed some clarification on whether this is normal.
> We run flowscan with flow-tools then extract the data from rrd into a mysql 
> DB for usage figures.
> 
> One thing I'm noticing though is it's very CPU intensive.
<snip>
> Flowscan runs every 5 minutes, using the CUFlow class.
> 
> 2008/02/21 07:20:07 working on file 
> /var/netflow/ft-v05.2008-02-21.071500+1100...
> 2008/02/21 07:20:10 flowscan-1.020 CUFlow: Cflow::find took  3 wallclock secs 
> ( 3.36 usr +  0.00 sys =  3.36 CPU) for 585896 flow file bytes, flow hit 
> ratio: 32073/32970
> 2008/02/21 07:20:11 flowscan-1.020 CUFlow: report took  1 wallclock secs ( 
> 0.00 usr  0.00 sys +  0.11 cusr  0.04 csys =  0.15 CPU)

OK, looks good (more details below).
 
> The CUFlow.cf has approx 31 Class Cs in it.
> I am analysing every IP, eg, every IP in those Class Cs has it's own RRD.
> Not sure if this is too much for it to do.

Not at all.

> 2336 ?        Ss     0:37 /usr/bin/flow-capture -w /var/netflow/ft 0/0/2055 
> -S5 -V5 -E1G -n 287 -N 0 -R /usr/local/netflow/bin/linkme
> 3099 ?        S      4:20 /usr/bin/perl /usr/bin/flowscan
> 4941 ?        R      1:31 /usr/bin/perl /usr/bin/flowscan
> 
> As you can see it's running 2 sometimes 3 processes of flowscan.
> Is this normal ?

No, it's not normal to have two flowscan processes... probably a
mistake with your rc scripts, i.e. started it twice.
I'd kill off the older one.  (Of course look at the PPID first, to
verify they are unrelated.)
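
A rough sketch of that PPID check, in Python. It assumes a ps that
accepts "-eo pid,ppid,args" (procps on Linux does), and the substring
match on "flowscan" is deliberately loose, so it will also catch
something like an editor open on a flowscan config. The function names
are made up for illustration. If more than one PID comes back, the
processes are not parent and child, and one was probably started
separately.

```python
import subprocess

def parse_ps(text):
    """Parse 'PID PPID COMMAND' rows into dicts, skipping header/blank lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        rows.append({"pid": int(parts[0]), "ppid": int(parts[1]), "cmd": parts[2]})
    return rows

def unrelated_flowscans(rows):
    """Return PIDs of flowscan processes whose parent is not another flowscan."""
    scans = [r for r in rows if "flowscan" in r["cmd"]]
    scan_pids = {r["pid"] for r in scans}
    return sorted(r["pid"] for r in scans if r["ppid"] not in scan_pids)

def live_ps_output():
    """Fetch pid, ppid and command line for every process (assumes procps-style ps)."""
    result = subprocess.run(["ps", "-eo", "pid,ppid,args"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

On the machine itself, unrelated_flowscans(parse_ps(live_ps_output()))
gives the list to look at.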

> Am I doing this right ?
> 
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4941 root      25   0 21708  18m 1700 R  100  0.9   3:19.58 flowscan
>  2336 root      15   0  2896 1244  504 S    1  0.1   0:37.41 flow-capture
> 
> I missed the 2nd flowscan in top while writing this email,
> but it basically flatlines the cpu 24/7 doing this.

Doing what?  Your flowscan log shows that it only worked for 4 seconds
to process 5 minutes of flows.

If you haven't cleared it up after this email, I suggest telling
us the load average on the machine.  If there's only <5 seconds of
CPU-intensive flowscan work every five minutes, it should be golden.
If the load average is high, then perhaps there's something unrelated
wrong with your machine, in which case sar will get you far...
(Look at running sadc to collect performance information.)
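
For a quick look at the load average before setting up sadc, something
like the following Python sketch would do. The threshold of 1.0 per CPU
is only an assumed rule of thumb, not a hard limit, and the function
names are made up for illustration.

```python
import os

def load_per_cpu(load1, ncpus):
    """1-minute load average divided by the CPU count (treated as at least 1)."""
    return load1 / max(ncpus or 1, 1)

def looks_busy(load1, ncpus, threshold=1.0):
    """True if the per-CPU load exceeds the assumed threshold."""
    return load_per_cpu(load1, ncpus) > threshold

def report():
    """Print the current load averages (os.getloadavg is Unix-only)."""
    load1, load5, load15 = os.getloadavg()
    ncpus = os.cpu_count()
    print("load averages: %.2f %.2f %.2f on %s CPU(s)" % (load1, load5, load15, ncpus))
    print("busy" if looks_busy(load1, ncpus) else "mostly idle")
```

Calling report() a few times across a five-minute flowscan cycle shows
whether the load stays high between runs.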

> Can someone point out anything that I am doing wrong ?

From what you've shown in the log and ps output, it looks good to me ('cept
the two flowscan processes, but that should at most incorrectly double
the load).

To reiterate, it's only taking a few seconds for the find and
report phases (shown as 3 and 1 wallclock seconds, respectively, in
the flowscan logs above) for you to process 5 minutes of flow data.
So, as soon as everything is caught up (so that it's processing in
real time), flowscan will only be hitting the CPU for about 5 seconds
every five minutes...

As for what is using the CPU inside flowscan, it is the Cflow perl
module (which translates the flow files for your perl script) and the
report code (in your case CUFlow). That's because it's perl code:
lots of management of data structures and also a lot of converting
raw flow records (from the flow files) into perl variables.

Dave

-- 
[EMAIL PROTECTED]  http://net.doit.wisc.edu/~plonka/  Madison, WI
_______________________________________________
Flow-tools mailing list
[EMAIL PROTECTED]
http://mailman.splintered.net/mailman/listinfo/flow-tools