The performance of all these tools is largely determined by how much time you put into tuning your hardware and learning the nuances of the software. Explore the -m option on flow-cat (it disables mmap()); this can buy much better performance. Also, always keep in mind everything that is going on in the pipeline; as you think it through, you will learn where to optimize. Our system for flow analysis uses SAN space, a RAM SAN and various other tricks. Yesterday, after I wrote this post, my coworker parsed, sorted and flow-stat'd 61 GB of flows. The operation took just over 13 minutes.
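
On jay's question below about running flow-stat on a day or two of flows at a time and aggregating afterwards, here is a minimal sketch of the merge step in Python. It is only an illustration, not something taken from flow-tools. It assumes each per-chunk report is plain text with '#' comment lines and whitespace-separated columns of address, flows, octets and packets, so check that against your own flow-stat -f8 output first. The names merge_reports and top_by_octets are made up for this example.

```python
import sys
from collections import defaultdict


def merge_reports(reports):
    """Sum per-address counters across several report texts.

    Assumed line layout (verify against your own reports):
        <address> <flows> <octets> <packets>
    Lines starting with '#' and lines that do not parse are skipped.
    """
    totals = defaultdict(lambda: [0, 0, 0])  # flows, octets, packets
    for text in reports:
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            fields = line.split()
            if len(fields) < 4:
                continue
            try:
                counts = [int(value) for value in fields[1:4]]
            except ValueError:
                continue  # not a data row
            entry = totals[fields[0]]
            for i, count in enumerate(counts):
                entry[i] += count
    return dict(totals)


def top_by_octets(totals, n=10):
    """Return the n (address, [flows, octets, packets]) pairs with most octets."""
    return sorted(totals.items(), key=lambda item: item[1][1], reverse=True)[:n]


if __name__ == "__main__":
    # Each command-line argument is the path of one saved per-chunk report.
    texts = []
    for path in sys.argv[1:]:
        with open(path) as handle:
            texts.append(handle.read())
    for addr, (flows, octets, packets) in top_by_octets(merge_reports(texts)):
        print(addr, flows, octets, packets)
```

The idea is to save one report per day or two of flows and pass the saved files to the script, so no single flow-stat run has to hold the whole month. The merged totals are only exact if each report lists every address rather than a truncated top-N.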
Karl

On Wed, 2007-01-10 at 15:30 -0800, jay alvarez wrote:
>
> ----- Original Message ----
> From: Karl Tatgenhorst <[EMAIL PROTECTED]>
> To: Jonathan Glass <[EMAIL PROTECTED]>
> Cc: jay alvarez <[EMAIL PROTECTED]>; [email protected]
> Sent: Wednesday, January 10, 2007 10:53:00 PM
> Subject: Re: [Flow-tools] flow-cat "20gig of flows" |flow-stat -f8 -S2 takes forever to complete...
>
> > Hi,
> >
> > Not sure about your hardwares specs but here are some tips.
>
> # cat /proc/version
> Linux version 2.6.8-2-386 ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 Tue Aug 16 12:46:35 UTC 2005
>
> Intel Xeon 3.00Ghz with 1 gig ram and lots of hd space (200gb).
>
> >
> > First, the file size is unbelievably unwieldy. You are most likely
> > looking at only certain types of traffic (and if not perhaps you should
> > consider breaking it out by traffic type) why not rewrite the files in
> > that way. Let us say for example that ICMP is not important and flows
> > with less than 3 pkts (not a full tcp handshake). I bet this would cut a
> > sizable percentage out of your files.
>
> The goal is to have an output of Top destination ip (using flow-stat) then
> parse it using custom scripts to agreggate all IPs belonging to a particular
> country.. Sort of finding out the top destination countries for the month so
> that the network guys can do all their routing trick...
>
> >
> > Next, flow-stat needs flow-cat to finish entirely in memory before it
> > can build the hashes. This means that you need 20 GBS worth of memory
> > used PRIOR to flow-stat building the hashes.
>
> I see... so I guess I'll have to limit my flow-cat to 1 or less Gb of flows
> or make use of the extra 2Gb swap just for the flow-stat to function
> smoothly..
>
> > This is a difficult trick
> > since Debian would need to be specially tuned to use 3GB (2GB is the
> > usual max 3 is high end). To accomplish this you would need 20GBs
> > minimum of swap space and that would need to be physically on a drive
> > other than the drive holding the flow files or you will just be i/o
> > bound. Why not cat together single weeks of traffic (with the above
> > mentioned edits) and then put them in excel to create the monthly
> > reports?
>
> 1st day flow totals to 750Mb.. And adding the 2nd day equals 1.6 gb. I guess
> this would still be tolerable considering I am planning to use the swap space.
>
> So what now? I mean, is it ok if I just "flow-cat 2_days_of_flows" flow-cat
> another 2 days and so on. Then I will flow-cat each output all together then
> throw it to flow-stat? hmm... this is getting trickier... I guess it would
> really be impossible to do a "flow-cat 20gb_flows |flow-stat -f8", even if I
> remove sorting, right?
>
> I haven't tried flow-cat'ing a week of flows yet.. and given 750mb per 1day
> flow, it would roughly be around 5 to 6 gb.. Will it be ok if I flow-cat
> |flow-stat -f8 -S2 this size considering my current hardware specs? Or should
> I just show them the top destination countries for every two days,
> then aggregate them as needed?
>
> Or you have any other suggestion?
>
> I see, perhaps this is why Flowviewer is taking too long when showing flow
> reports for a long span of time. I wonder if Flowviewer guys have already
> considered this. No wonder why other admins here said they have left
> flowviewer because it takes forever to complete a month of report.
>
>
> Thanks.
> -jay
>
> > The tip on lsof -p <pid> very cool, just thought I would mention
> > that. Thanks.
> >
> > Karl Tatgenhorst
> >
> > On Wed, 2007-01-10 at 09:28 -0500, Jonathan Glass wrote:
> > jay alvarez wrote:
> > > Hi,
> > >
> > > I have a directory of flow-captured flows for a whole month(Dec2006) and
> > > I'm trying to do a
> > > flow-cat "flows_dir" | flowstat -f8 -S2 > topdestination
> > >
> > > I left it in background and it's been running for 30 hours now.
> > > Doing a "top" shows flow-stat being on top of the list from time to time
> > > consuming around 60% of memory on a debian system. Noticeably, flow-cat
> > > doesn't appear in "top" (perhaps it's done with its job)
> > >
> > > however ps shows them both.
> > >
> > > #ps -aux |grep flow
> > >
> > > root 22604 0.9 0.0 6448 284 ? S Jan09 16:31 flow-cat /var/netflow/ft/all/dec2006/
> > > root 22605 7.0 52.3 875204 474452 ? D Jan09 123:07 flow-stat -f8 -S2
> > >
> > >
> > > Also lsof
> > >
> > > # lsof |grep flow-cat
> > >
> > > flow-cat 22604 root cwd DIR 8,3 224 36536
> > > flow-cat 22604 root rtd DIR 8,4 584 2 /
> > > flow-cat 22604 root txt REG 8,3 88716 25290 /usr/bin/flow-cat
> > > flow-cat 22604 root mem REG 8,4 90248 110 /lib/ld-2.3.2.so
> > > flow-cat 22604 root mem REG 8,4 73304 5891 /lib/tls/libnsl-2.3.2.so
> > > flow-cat 22604 root mem REG 8,4 28880 6019 /lib/libwrap.so.0.7.6
> > > flow-cat 22604 root mem REG 8,3 67468 5598 /usr/lib/libz.so.1.2.2
> > > flow-cat 22604 root mem REG 8,4 1254660 5886 /lib/tls/libc-2.3.2.so
> > > flow-cat 22604 root mem REG 8,1 3548008 48872 /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > > flow-cat 22604 root 0u CHR 136,0 2 /dev/pts/0 (deleted)
> > > flow-cat 22604 root 1w FIFO 0,7 12005820 pipe
> > > flow-cat 22604 root 2u CHR 136,0 2 /dev/pts/0 (deleted)
> > > flow-cat 22604 root 3r REG 8,1 3548008 48872 /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > >
> > > Above shows flow-cat seems to have stopped processing at Dec 21, don't
> > > know why.
> > >
> > >
> > > # lsof |grep flow-stat
> > >
> > > flow-stat 22605 root cwd DIR 8,3 224 36536 /usr/local/home/jayson/topcountries
> > > flow-stat 22605 root rtd DIR 8,4 584 2 /
> > > flow-stat 22605 root txt REG 8,3 130208 25291 /usr/bin/flow-stat
> > > flow-stat 22605 root mem REG 8,4 90248 110 /lib/ld-2.3.2.so
> > > flow-stat 22605 root mem REG 8,4 73304 5891 /lib/tls/libnsl-2.3.2.so
> > > flow-stat 22605 root mem REG 8,4 28880 6019 /lib/libwrap.so.0.7.6
> > > flow-stat 22605 root mem REG 8,3 67468 5598 /usr/lib/libz.so.1.2.2
> > > flow-stat 22605 root mem REG 8,4 1254660 5886 /lib/tls/libc-2.3.2.so
> > > flow-stat 22605 root 0r FIFO 0,7 12005820 pipe
> > > flow-stat 22605 root 1w REG 8,3 0 36353 /usr/local/home/jayson/topcountries/topdestinationip
> > > flow-stat 22605 root 2u CHR 136,0 2 /dev/pts/0 (deleted)
> > >
> > > As you can see above, I have redirected the output to "topdestinatioip"
> > > But up to now, the file is still empty.
> > >
> > > Do you know am I going to find out the progress of what I'm doing?
> > > I'm just afraid that the program might have stopped running and I am
> > > waiting for nothing now.
> > >
> > > Thanks
> > > - jay
> > >
> > > _______________________________________________
> > > Flow-tools mailing list
> > > [EMAIL PROTECTED]
> > > http://mailman.splintered.net/mailman/listinfo/flow-tools
> >
> > Just as a personal preference, I like to start my flow-cat sessions in
> > the background, find their process id, and watch it. Literally:
> >
> > flow-cat &
> > ps -aef|grep flow-cat
> > watch "lsof -p <flow-cat-pid>"
> >
> > So I can see exactly what files flow-cat is processing, and watch for it
> > to die.
