The performance of all these tools is largely determined by how much
effort you put into tuning your hardware and learning the nuances of
the software. Explore the -m option on flow-cat (it disables mmap());
this will buy much better performance. Also, always keep in mind
everything that is going on; as you think it through, you will learn
where to optimize. Our system for flow analysis uses SAN space, a RAM
SAN and various other tricks. Yesterday, after I wrote this post, my
coworker parsed, sorted and flow-stat'd 61 GB of flows. The operation
took just over 13 minutes.
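
On the question below of running flow-stat over a couple of days at a
time and aggregating afterwards, here is a minimal Python sketch of the
merge step. It assumes (nothing in this thread confirms it) that each
flow-stat -f8 report consists of comment lines starting with '#'
followed by whitespace-separated rows of address, flows, octets and
packets, and that every per-chunk report is complete rather than cut
down to a top-N list. The function names are made up for illustration.

```python
def parse_report(text):
    """Parse one report into {ip: (flows, octets, packets)}.

    Assumed layout: '#' comment lines, then rows of
    'address flows octets packets' separated by whitespace.
    """
    totals = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and report headers
        ip, flows, octets, packets = line.split()[:4]
        totals[ip] = (int(flows), int(octets), int(packets))
    return totals


def merge_reports(reports):
    """Sum the counters per address across several report texts."""
    merged = {}
    for report in reports:
        for ip, counts in parse_report(report).items():
            previous = merged.get(ip, (0, 0, 0))
            merged[ip] = tuple(a + b for a, b in zip(previous, counts))
    return merged


def top_by_octets(merged, n=10):
    """Return the n (ip, counters) pairs with the most octets."""
    ranked = sorted(merged.items(), key=lambda item: item[1][1], reverse=True)
    return ranked[:n]
```

Feed merge_reports() the text of each per-chunk report, then call
top_by_octets() on the result. If the per-chunk reports were truncated
to a top-N list, the merged totals are only a lower bound for each
address.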


Karl

On Wed, 2007-01-10 at 15:30 -0800, jay alvarez wrote:
> 
> ----- Original Message ----
> From: Karl Tatgenhorst <[EMAIL PROTECTED]>
> To: Jonathan Glass <[EMAIL PROTECTED]>
> Cc: jay alvarez <[EMAIL PROTECTED]>; [email protected]
> Sent: Wednesday, January 10, 2007 10:53:00 PM
> Subject: Re: [Flow-tools] flow-cat "20gig of flows" |flow-stat -f8 -S2 takes 
> forever to complete...
> 
> 
>   Hi,
> 
> >   Not sure about your hardware specs but here are some tips.
> 
> # cat /proc/version
> Linux version 2.6.8-2-386 ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 
> 1:3.3.5-13)) #1 Tue Aug 16 12:46:35 UTC 2005
> 
> Intel Xeon 3.00Ghz with 1 gig ram and lots of hd space (200gb).
> 
> 
> >  First, the file size is unbelievably unwieldy. You are most likely
> >  looking at only certain types of traffic (and if not, perhaps you should
> >  consider breaking it out by traffic type), so why not rewrite the files
> >  that way? Let us say, for example, that ICMP is not important, nor are
> >  flows with fewer than 3 pkts (not a full TCP handshake). I bet filtering
> >  those out would cut a sizable percentage out of your files.
> 
> The goal is to have an output of top destination IPs (using flow-stat), then 
> parse it using custom scripts to aggregate all IPs belonging to a particular 
> country. Sort of finding out the top destination countries for the month so 
> that the network guys can do all their routing tricks...
> 
> 
> 
> >  Next, flow-stat needs flow-cat to finish entirely in memory before it
> >  can build the hashes. This means that you need 20 GB worth of memory
> >  used PRIOR to flow-stat building the hashes. 
> 
> I see... so I guess I'll have to limit my flow-cat to 1 or less Gb of flows 
> or make use of the extra 2Gb swap just for the flow-stat to function 
> smoothly..
> 
> > This is a difficult trick
> >  since Debian would need to be specially tuned to use 3GB (2GB is the
> >  usual max; 3GB is the high end). To accomplish this you would need 20GB
> >  minimum of swap space, and that would need to be physically on a drive
> >  other than the drive holding the flow files or you will just be I/O
> >  bound. Why not cat together single weeks of traffic (with the above
> >  mentioned edits) and then put them in Excel to create the monthly
> >  reports?
> 
> 1st day flow totals to 750Mb.. And adding the 2nd day equals 1.6 gb. I guess 
> this would still be tolerable considering I am planning to use the swap space.
> 
> So what now? I mean, is it ok if I just "flow-cat 2_days_of_flows" flow-cat 
> another 2 days and so on. Then I will flow-cat each output all together then 
> throw it to flow-stat? hmm... this is getting trickier... I guess it would 
> really be impossible to do a "flow-cat 20gb_flows |flow-stat -f8", even if I 
> remove sorting, right?
> 
> I haven't tried flow-cat'ing a week of flows yet.. and given 750mb per 1day 
> flow, it would roughly be around 5 to 6 gb.. Will it be ok if I flow-cat 
> |flow-stat -f8 -S2 this size considering my current hardware specs? Or should 
> I just show them the top destination countries for every two days, 
> then aggregate them as needed?
> 
> Or do you have any other suggestions?
> 
> I see, perhaps this is why Flowviewer is taking too long when showing flow 
> reports for a long span of time. I wonder if Flowviewer guys have already 
> considered this. No wonder why other admins here said they have left 
> flowviewer because it takes forever to complete a month of report.
> 
> 
> Thanks.
> -jay
> 
> >     The tip on lsof -p <pid> very cool, just thought I would mention
> >  that. Thanks.
> 
> Karl Tatgenhorst
> 
> On Wed, 2007-01-10 at 09:28 -0500, Jonathan Glass wrote:
> > jay alvarez wrote:
> > > Hi,
> > > 
> > > I have a directory of flow-captured flows for a whole month(Dec2006) and
> > > I'm trying to do a
> > > flow-cat "flows_dir" | flow-stat -f8 -S2 > topdestination
> > > 
> > > I left it in background and it's been running for 30 hours now.
> > > Doing a "top" shows flow-stat being on top of the list from time to time
> > > consuming around 60% of memory on a debian system. Noticeably, flow-cat
> > > doesn't appear in "top" (perhaps it's done with its job)
> > > 
> > > however ps shows them both.
> > > 
> > > #ps -aux |grep flow
> > > 
> > > root     22604  0.9  0.0  6448  284 ?        S    Jan09  16:31 flow-cat
> > > /var/netflow/ft/all/dec2006/
> > > root     22605  7.0 52.3 875204 474452 ?     D    Jan09 123:07 flow-stat
> > > -f8 -S2
> > > 
> > > 
> > > 
> > > Also lsof
> > > 
> > > # lsof |grep flow-cat
> > > 
> > > flow-cat  22604     root  cwd       DIR        8,3      224      36536
> > > flow-cat  22604     root  rtd       DIR        8,4      584          2 /
> > > flow-cat  22604     root  txt       REG        8,3    88716      25290
> > > /usr/bin/flow-cat
> > > flow-cat  22604     root  mem       REG        8,4    90248        110
> > > /lib/ld-2.3.2.so
> > > flow-cat  22604     root  mem       REG        8,4    73304       5891
> > > /lib/tls/libnsl-2.3.2.so
> > > flow-cat  22604     root  mem       REG        8,4    28880       6019
> > > /lib/libwrap.so.0.7.6
> > > flow-cat  22604     root  mem       REG        8,3    67468       5598
> > > /usr/lib/libz.so.1.2.2
> > > flow-cat  22604     root  mem       REG        8,4  1254660       5886
> > > /lib/tls/libc-2.3.2.so
> > > flow-cat  22604     root  mem       REG        8,1  3548008      48872
> > > /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > > flow-cat  22604     root    0u      CHR      136,0                   2
> > > /dev/pts/0 (deleted)
> > > flow-cat  22604     root    1w     FIFO        0,7            12005820 
> > > pipe
> > > flow-cat  22604     root    2u      CHR      136,0                   2
> > > /dev/pts/0 (deleted)
> > > flow-cat  22604     root    3r      REG        8,1  3548008      48872
> > > /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > > 
> > > Above shows flow-cat seems to have stopped processing at Dec 21, don't
> > > know why.
> > > 
> > > 
> > > # lsof |grep flow-stat
> > > 
> > > flow-stat 22605     root  cwd       DIR        8,3      224      36536
> > > /usr/local/home/jayson/topcountries
> > > flow-stat 22605     root  rtd       DIR        8,4      584          2 /
> > > flow-stat 22605     root  txt       REG        8,3   130208      25291
> > > /usr/bin/flow-stat
> > > flow-stat 22605     root  mem       REG        8,4    90248        110
> > > /lib/ld-2.3.2.so
> > > flow-stat 22605     root  mem       REG        8,4    73304       5891
> > > /lib/tls/libnsl-2.3.2.so
> > > flow-stat 22605     root  mem       REG        8,4    28880       6019
> > > /lib/libwrap.so.0.7.6
> > > flow-stat 22605     root  mem       REG        8,3    67468       5598
> > > /usr/lib/libz.so.1.2.2
> > > flow-stat 22605     root  mem       REG        8,4  1254660       5886
> > > /lib/tls/libc-2.3.2.so
> > > flow-stat 22605     root    0r     FIFO        0,7            12005820 
> > > pipe
> > > flow-stat 22605     root    1w      REG        8,3        0      36353
> > > /usr/local/home/jayson/topcountries/topdestinationip
> > > flow-stat 22605     root    2u      CHR      136,0                   2
> > > /dev/pts/0 (deleted)
> > > 
> > > As you can see above, I have redirected the output to "topdestinationip"
> > > But up to now, the file is still empty.
> > > 
> > > Do you know how I can find out the progress of what I'm doing?
> > > I'm just afraid that the program might have stopped running and I am
> > > waiting for nothing now.
> > > 
> > > Thanks
> > > - jay
> > > 
> > > 
> > > 
> > > 
> > > _______________________________________________
> > > Flow-tools mailing list
> > > [EMAIL PROTECTED]
> > > http://mailman.splintered.net/mailman/listinfo/flow-tools
> > 
> > Just as a personal preference, I like to start my flow-cat sessions in
> > the background, find their process id, and watch it. Literally:
> > 
> > flow-cat &
> > ps -aef|grep flow-cat
> > watch "lsof -p <flow-cat-pid>"
> > 
> > So I can see exactly what files flow-cat is processing, and watch for it
> > to die.
> > 
> 

