Just following on from this, I'm trying to narrow down where the issue is.

This is my understanding of how this all works.

==
Flow-capture collects Netflow raw data from the Routers.
Flowscan is launched every 5 mins to process this data from the tmp file 
created from the above process.

Here's an example of what I'm seeing in the logs.
2008/02/22 15:25:31 flowscan-1.020 CUFlow: Cflow::find took  4 wallclock secs ( 
3.69 usr +  0.00 sys =  3.69 CPU) for 642543 flow file bytes, flow hit ratio: 
34353/35700
2008/02/22 15:30:08 flowscan-1.020 CUFlow: Cflow::find took  7 wallclock secs ( 
6.55 usr +  0.00 sys =  6.55 CPU) for 700461 flow file bytes, flow hit ratio: 
37792/39240
2008/02/22 15:35:15 flowscan-1.020 CUFlow: Cflow::find took  6 wallclock secs ( 
6.07 usr +  0.01 sys =  6.08 CPU) for 638391 flow file bytes, flow hit ratio: 
33862/35460
2008/02/22 15:40:18 flowscan-1.020 CUFlow: Cflow::find took  3 wallclock secs ( 
3.19 usr +  0.00 sys =  3.19 CPU) for 568464 flow file bytes, flow hit ratio: 
29665/31170
2008/02/22 15:45:22 flowscan-1.020 CUFlow: Cflow::find took  3 wallclock secs ( 
3.23 usr +  0.00 sys =  3.23 CPU) for 580433 flow file bytes, flow hit ratio: 
29981/31620
2008/02/22 15:50:26 flowscan-1.020 CUFlow: Cflow::find took  4 wallclock secs ( 
3.41 usr +  0.00 sys =  3.41 CPU) for 611599 flow file bytes, flow hit ratio: 
31890/33360
2008/02/22 15:55:30 flowscan-1.020 CUFlow: Cflow::find took  4 wallclock secs ( 
3.76 usr +  0.01 sys =  3.77 CPU) for 661072 flow file bytes, flow hit ratio: 
34889/36630
2008/02/22 16:00:06 flowscan-1.020 CUFlow: Cflow::find took  6 wallclock secs ( 
6.01 usr +  0.00 sys =  6.01 CPU) for 640229 flow file bytes, flow hit ratio: 
33088/34980
2008/02/22 16:05:11 flowscan-1.020 CUFlow: Cflow::find took  5 wallclock secs ( 
4.66 usr +  0.01 sys =  4.67 CPU) for 668444 flow file bytes, flow hit ratio: 
41535/42930
2008/02/22 16:10:17 flowscan-1.020 CUFlow: Cflow::find took  5 wallclock secs ( 
5.50 usr +  0.01 sys =  5.51 CPU) for 586241 flow file bytes, flow hit ratio: 
30329/31800
2008/02/22 16:15:27 flowscan-1.020 CUFlow: Cflow::find took  9 wallclock secs ( 
5.31 usr +  0.01 sys =  5.32 CPU) for 575349 flow file bytes, flow hit ratio: 
29203/30780
2008/02/22 16:20:38 flowscan-1.020 CUFlow: Cflow::find took 11 wallclock secs ( 
5.82 usr +  0.00 sys =  5.82 CPU) for 626396 flow file bytes, flow hit ratio: 
32176/33780
2008/02/22 16:25:20 flowscan-1.020 CUFlow: Cflow::find took 11 wallclock secs ( 
6.08 usr +  0.02 sys =  6.10 CPU) for 645224 flow file bytes, flow hit ratio: 
33314/34920
2008/02/22 16:30:36 flowscan-1.020 CUFlow: Cflow::find took 15 wallclock secs ( 
6.12 usr +  0.01 sys =  6.13 CPU) for 644156 flow file bytes, flow hit ratio: 
33146/34710
2008/02/22 16:35:25 flowscan-1.020 CUFlow: Cflow::find took 19 wallclock secs ( 
7.62 usr +  0.01 sys =  7.63 CPU) for 769207 flow file bytes, flow hit ratio: 
40907/42600
2008/02/22 16:40:47 flowscan-1.020 CUFlow: Cflow::find took 21 wallclock secs ( 
8.37 usr +  0.00 sys =  8.37 CPU) for 841025 flow file bytes, flow hit ratio: 
44987/46800
2008/02/22 16:45:43 flowscan-1.020 CUFlow: Cflow::find took 25 wallclock secs ( 
7.70 usr +  0.00 sys =  7.70 CPU) for 795074 flow file bytes, flow hit ratio: 
42475/44610
2008/02/22 16:50:35 flowscan-1.020 CUFlow: Cflow::find took 20 wallclock secs ( 
6.97 usr +  0.01 sys =  6.98 CPU) for 744377 flow file bytes, flow hit ratio: 
38356/40560
2008/02/22 16:55:27 flowscan-1.020 CUFlow: Cflow::find took 21 wallclock secs ( 
7.60 usr +  0.01 sys =  7.61 CPU) for 802334 flow file bytes, flow hit ratio: 
41759/43890
2008/02/22 17:00:53 flowscan-1.020 CUFlow: Cflow::find took 25 wallclock secs ( 
7.68 usr +  0.01 sys =  7.69 CPU) for 797614 flow file bytes, flow hit ratio: 
42425/44550
2008/02/22 17:05:49 flowscan-1.020 CUFlow: Cflow::find took 25 wallclock secs ( 
7.84 usr +  0.00 sys =  7.84 CPU) for 793587 flow file bytes, flow hit ratio: 
42936/44970
2008/02/22 17:10:47 flowscan-1.020 CUFlow: Cflow::find took 27 wallclock secs ( 
8.21 usr +  0.00 sys =  8.21 CPU) for 814166 flow file bytes, flow hit ratio: 
44722/46740
2008/02/22 17:15:45 flowscan-1.020 CUFlow: Cflow::find took 27 wallclock secs ( 
7.57 usr +  0.00 sys =  7.57 CPU) for 749273 flow file bytes, flow hit ratio: 
41375/43410
2008/02/22 17:20:48 flowscan-1.020 CUFlow: Cflow::find took 31 wallclock secs ( 
8.68 usr +  0.01 sys =  8.69 CPU) for 863693 flow file bytes, flow hit ratio: 
47824/49920

As you can see it's taking longer and longer (wallclock secs).

It will eventually get to this:
008/02/23 17:31:18 flowscan-1.020 CUFlow: Cflow::find took 254 wallclock secs ( 
9.08 usr +  0.01 sys =  9.09 CPU) for 922132 flow file bytes, flow hit ratio: 
49702/51390
2008/02/23 17:35:39 flowscan-1.020 CUFlow: Cflow::find took 237 wallclock secs 
(11.15 usr +  0.00 sys = 11.15 CPU) for 1073584 flow file bytes, flow hit 
ratio: 60328/62400
2008/02/23 17:44:18 flowscan-1.020 CUFlow: Cflow::find took 506 wallclock secs 
(10.48 usr +  0.02 sys = 10.50 CPU) for 1049448 flow file bytes, flow hit 
ratio: 57060/59520
2008/02/23 17:48:38 flowscan-1.020 CUFlow: Cflow::find took 247 wallclock secs 
( 8.91 usr +  0.02 sys =  8.93 CPU) for 923017 flow file bytes, flow hit ratio: 
48785/50670
2008/02/23 17:52:06 flowscan-1.020 CUFlow: Cflow::find took 194 wallclock secs 
( 8.92 usr +  0.00 sys =  8.92 CPU) for 923040 flow file bytes, flow hit ratio: 
49157/51240
2008/02/23 17:55:14 flowscan-1.020 CUFlow: Cflow::find took 180 wallclock secs 
( 9.34 usr +  0.00 sys =  9.34 CPU) for 955834 flow file bytes, flow hit ratio: 
51547/53340
2008/02/23 17:58:02 flowscan-1.020 CUFlow: Cflow::find took 159 wallclock secs 
( 8.54 usr +  0.00 sys =  8.54 CPU) for 884217 flow file bytes, flow hit ratio: 
46810/49110
2008/02/23 18:11:51 flowscan-1.020 CUFlow: Cflow::find took 814 wallclock secs 
( 9.64 usr +  0.02 sys =  9.66 CPU) for 986885 flow file bytes, flow hit ratio: 
52130/54480
2008/02/23 18:30:48 flowscan-1.020 CUFlow: Cflow::find took 1091 wallclock secs 
( 9.30 usr +  0.04 sys =  9.34 CPU) for 950331 flow file bytes, flow hit ratio: 
50770/52500
2008/02/23 18:44:44 flowscan-1.020 CUFlow: Cflow::find took 745 wallclock secs 
( 9.13 usr +  0.02 sys =  9.15 CPU) for 938829 flow file bytes, flow hit ratio: 
49541/51810
2008/02/23 18:53:37 flowscan-1.020 CUFlow: Cflow::find took 386 wallclock secs 
( 8.53 usr +  0.00 sys =  8.53 CPU) for 883836 flow file bytes, flow hit ratio: 
46801/48750
2008/02/23 18:57:35 flowscan-1.020 CUFlow: Cflow::find took 230 wallclock secs 
( 7.88 usr +  0.00 sys =  7.88 CPU) for 815832 flow file bytes, flow hit ratio: 
42967/45120
2008/02/23 19:10:18 flowscan-1.020 CUFlow: Cflow::find took 708 wallclock secs 
( 8.34 usr +  0.01 sys =  8.35 CPU) for 843948 flow file bytes, flow hit ratio: 
45503/47550
2008/02/23 19:29:16 flowscan-1.020 CUFlow: Cflow::find took 1078 wallclock secs 
( 9.34 usr +  0.03 sys =  9.37 CPU) for 941108 flow file bytes, flow hit ratio: 
50325/52770
2008/02/23 19:41:23 flowscan-1.020 CUFlow: Cflow::find took 640 wallclock secs 
( 8.78 usr +  0.02 sys =  8.80 CPU) for 895410 flow file bytes, flow hit ratio: 
47862/49740
2008/02/23 19:54:02 flowscan-1.020 CUFlow: Cflow::find took 754 wallclock secs 
( 8.83 usr +  0.01 sys =  8.84 CPU) for 893960 flow file bytes, flow hit ratio: 
47974/49860
2008/02/23 20:08:45 flowscan-1.020 CUFlow: Cflow::find took 763 wallclock secs 
( 8.59 usr +  0.01 sys =  8.60 CPU) for 878969 flow file bytes, flow hit ratio: 
46564/48990
 
Very ugly.

>From my understanding, using this example.
2008/02/22 16:20:38 flowscan-1.020 CUFlow: Cflow::find took 11 wallclock secs ( 
5.82 usr +  0.00 sys =  5.82 CPU) for 626396 flow file bytes, flow hit ratio: 
32176/33780
The flowscan process should be finished and be terminated.
How does something taking ~10secs end up getting longer and longer ?



I did a test on the secondary machine up there.
Killed flow-capture and flowscan.
Deleted all files in the /ft folder.

d3-syd-equinix01:/var/netflow/ft# ls -al
total 792
drwxr-xr-x 2 root root 794624 2008-02-25 13:38 .
drwxr-xr-x 6 root root  12288 2008-02-25 13:37 ..
d3-syd-equinix01:/var/netflow/ft#

Then started flow-capture.

20241 ?        Ss     0:00 /usr/local/netflow/bin/flow-capture -w 
/var/netflow/ft 0/0/2055 -S5 -V5 -E1G -n 287 -N 0 -R 
/usr/local/netflow/bin/linkme

d3-syd-equinix01:/var/netflow/ft# ls -al
total 848
drwxr-xr-x 2 root root 794624 2008-02-25 13:40 .
drwxr-xr-x 6 root root  12288 2008-02-25 13:37 ..
-rw-r--r-- 1 root root  49244 2008-02-25 13:40 tmp-v05.2008-02-25.134041+1100
d3-syd-equinix01:/var/netflow/ft#

Now started flowscan.

20241 ?        Ss     0:00 /usr/local/netflow/bin/flow-capture -w 
/var/netflow/ft 0/0/2055 -S5 -V5 -E1G -n 287 -N 0 -R 
/usr/local/netflow/bin/linkme
20253 pts/2    S      0:00 /usr/bin/perl bin/flowscan

sleep 30...
sleep 30...

So far so good.

On the 5min mark:
2008/02/25 13:45:05 working on file 
/var/netflow/ft-v05.2008-02-25.134041+1100...
2008/02/25 13:45:08 flowscan-1.020 CUFlow: Cflow::find took  3 wallclock secs ( 
3.66 usr +  0.02 sys =  3.68 CPU) for 666906 flow file bytes, flow hit ratio: 
37098/39630
2008/02/25 13:45:08 flowscan-1.020 CUFlow: report took  0 wallclock secs ( 0.00 
usr  0.00 sys +  0.09 cusr  0.03 csys =  0.12 CPU)

20241 ?        Ss     0:00 /usr/local/netflow/bin/flow-capture -w 
/var/netflow/ft 0/0/2055 -S5 -V5 -E1G -n 287 -N 0 -R 
/usr/local/netflow/bin/linkme
20253 pts/2    S      0:04 /usr/bin/perl bin/flowscan
20265 pts/2    D      0:35 /usr/bin/perl bin/flowscan

top - 13:46:53 up  1:52,  3 users,  load average: 0.96, 0.70, 2.13
Tasks:  76 total,   2 running,  74 sleeping,   0 stopped,   0 zombie
Cpu0  : 88.1%us,  0.0%sy,  0.0%ni,  0.0%id, 11.9%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  0.0%us,  1.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1036428k total,  1022988k used,    13440k free,    15964k buffers
Swap:  5301368k total,       52k used,  5301316k free,   872916k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   PPID P SWAP   
TIME COMMAND
20265 root      18   0 21912  17m 1732 R   88  1.7   1:18.68     1 0 3900   
1:18 flowscan

Now here's the part I don't understand.
Why is flowscan still running ?
It finished didn't it ?

Now on the next run (10min mark):

2008/02/25 13:50:09 working on file 
/var/netflow/ft-v05.2008-02-25.134500+1100...
2008/02/25 13:50:15 flowscan-1.020 CUFlow: Cflow::find took  6 wallclock secs ( 
6.24 usr +  0.01 sys =  6.25 CPU) for 734307 flow file bytes, flow hit ratio: 
40698/43290
2008/02/25 13:50:15 flowscan-1.020 CUFlow: report took  0 wallclock secs ( 0.00 
usr  0.01 sys +  0.11 cusr  0.03 csys =  0.15 CPU)

20241 ?        Ss     0:01 /usr/local/netflow/bin/flow-capture -w 
/var/netflow/ft 0/0/2055 -S5 -V5 -E1G -n 287 -N 0 -R 
/usr/local/netflow/bin/linkme
20253 pts/2    S      0:10 /usr/bin/perl bin/flowscan
20265 pts/2    R      4:34 /usr/bin/perl bin/flowscan
20274 pts/2    R      0:35 /usr/bin/perl bin/flowscan

top - 13:50:40 up  1:56,  3 users,  load average: 1.34, 0.93, 1.90
Tasks:  77 total,   5 running,  72 sleeping,   0 stopped,   0 zombie
Cpu0  : 88.0%us,  1.0%sy,  0.0%ni,  0.0%id, 11.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 86.1%us,  1.0%sy,  0.0%ni,  2.0%id, 10.9%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1036428k total,  1022256k used,    14172k free,    30528k buffers
Swap:  5301368k total,       52k used,  5301316k free,   844556k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   PPID P SWAP   
TIME COMMAND
20274 root      18   0 22688  18m 1732 R   90  1.8   0:12.58     1 0 3900   
0:12 flowscan
20265 root      18   0 21912  17m 1732 R   85  1.7   4:12.54     1 1 3900   
4:12 flowscan

Can anyone shed some light on this ?


Regards,
Ross 

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Ross Tsolakidis
Sent: Thursday, 21 February 2008 4:06 PM
To: [email protected]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: [Flow-tools] cpu utilisation on flowscan processes

Hi Dave,

Thanks again for responding.
I had a look at the rc scripts, I did notice flow-capture had 2 startup 
scripts, which was wierd.
I removed that.
But not flowscan, it's only being started once.


Here's a top with the required info.

top - 16:04:06 up 1 day,  5:34,  1 user,  load average: 9.20, 8.68, 8.46
Tasks:  83 total,   7 running,  75 sleeping,   0 stopped,   1 zombie
Cpu0  : 96.0%us,  4.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  : 98.0%us,  2.0%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1036428k total,  1022864k used,    13564k free,    17768k buffers
Swap:  5301368k total,       52k used,  5301316k free,   758620k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   PPID P SWAP   
TIME COMMAND
 9013 root      18   0 30280  26m 1732 R   44  2.7   2:38.67     1 0 2776   
2:38 flowscan
16158 root      18   0 30284  26m 1732 R   44  2.7   0:51.91     1 1 2780   
0:51 flowscan
24721 root      18   0 30116  26m 1728 R   40  2.6   6:12.83     1 0 2656   
6:12 flowscan
 3869 root      18   0 30120  26m 1728 R   39  2.6   3:51.55     1 0 2656   
3:51 flowscan
 2384 mysql     15   0  128m  36m 5220 S   13  3.6 252:44.53  2347 0  91m 
252:44 mysqld
29815 root      23   0 30284  26m 1732 R    9  2.7   5:44.42     1 1 2780   
5:44 flowscan


Regards,
Ross 

DISCLAIMER: 
This e-mail and any files transmitted with it may be privileged and 
confidential, and are intended only for the use of the intended recipient. If 
you are not the intended recipient or responsible for delivering this e-mail to 
the intended recipient, any use, dissemination, forwarding, printing or copying 
of this e-mail and any attachments is strictly prohibited. 
If you have received this e-mail in error, please REPLY TO the SENDER to advise 
the error AND then DELETE the e-mail from your system.
Any views expressed in this e-mail and any files transmitted with it are those 
of the individual sender, except where the sender specifically states them to 
be the views of our organisation.
Our organisation does not represent or warrant that the attached files are free 
from computer viruses or other defects. 
25/2/2008
The user assumes all responsibility for any loss or damage resulting directly 
or indirectly from the use of the attached files. In any event, the liability 
to our organisation is limited to either the resupply of the attached files or 
the cost of having the attached files resupplied.

_______________________________________________
Flow-tools mailing list
[EMAIL PROTECTED]
http://mailman.splintered.net/mailman/listinfo/flow-tools

Reply via email to