----- Original Message ----
From: Karl Tatgenhorst <[EMAIL PROTECTED]>
To: Jonathan Glass <[EMAIL PROTECTED]>
Cc: jay alvarez <[EMAIL PROTECTED]>; [email protected]
Sent: Wednesday, January 10, 2007 10:53:00 PM
Subject: Re: [Flow-tools] flow-cat "20gig of flows" |flow-stat -f8 -S2 takes forever to complete...

Hi,

> Not sure about your hardwares specs but here are some tips.

# cat /proc/version
Linux version 2.6.8-2-386 ([EMAIL PROTECTED]) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 Tue Aug 16 12:46:35 UTC 2005

Intel Xeon 3.00 GHz with 1 GB of RAM and lots of HD space (200 GB).

> First, the file size is unbelievably unwieldy. You are most likely
> looking at only certain types of traffic (and if not perhaps you should
> consider breaking it out by traffic type) why not rewrite the files in
> that way. Let us say for example that ICMP is not important and flows
> with less than 3 pkts (not a full tcp handshake). I bet this would cut a
> sizable percentage out of your files.

The goal is to get the top destination IPs (using flow-stat), then
parse that output with custom scripts to aggregate all IPs belonging to
a particular country. Sort of finding out the top destination countries
for the month so that the network guys can do all their routing
tricks...

> Next, flow-stat needs flow-cat to finish entirely in memory before it
> can build the hashes. This means that you need 20 GBS worth of memory
> used PRIOR to flow-stat building the hashes.

I see... so I guess I'll have to limit my flow-cat to 1 GB of flows or
less, or make use of the extra 2 GB of swap just for flow-stat to
function smoothly..

> This is a difficult trick
> since Debian would need to be specially tuned to use 3GB (2GB is the
> usual max 3 is high end). To accomplish this you would need 20GBs
> minimum of swap space and that would need to be physically on a drive
> other than the drive holding the flow files or you will just be i/o
> bound. Why not cat together single weeks of traffic (with the above
> mentioned edits) and then put them in excel to create the monthly
> reports?

The first day of flows totals 750 MB, and adding the second day brings
it to 1.6 GB. I guess this would still be tolerable considering I am
planning to use the swap space. So what now?
I mean, is it OK if I just "flow-cat 2_days_of_flows", flow-cat another
2 days, and so on? Then I will flow-cat the outputs all together and
throw the result to flow-stat? Hmm... this is getting trickier...

I guess it would really be impossible to do a
"flow-cat 20gb_flows |flow-stat -f8", even if I remove sorting, right?

I haven't tried flow-cat'ing a week of flows yet, and given 750 MB per
day of flows, a week would be roughly 5 to 6 GB. Will it be OK to run
flow-cat |flow-stat -f8 -S2 on that size considering my current
hardware specs? Or should I just show them the top destination
countries for every two days, then aggregate them as needed? Or do you
have any other suggestion?

I see, perhaps this is why Flowviewer takes so long when showing flow
reports for a long span of time. I wonder if the Flowviewer guys have
already considered this. No wonder other admins here said they have
left Flowviewer because it takes forever to complete a month's report.

Thanks.
-jay

> The tip on lsof -p <pid> very cool, just thought I would mention
> that.

Thanks.

Karl Tatgenhorst

On Wed, 2007-01-10 at 09:28 -0500, Jonathan Glass wrote:
> jay alvarez wrote:
> > Hi,
> >
> > I have a directory of flow-captured flows for a whole month (Dec 2006)
> > and I'm trying to do a
> > flow-cat "flows_dir" | flow-stat -f8 -S2 > topdestination
> >
> > I left it in the background and it's been running for 30 hours now.
> > Doing a "top" shows flow-stat at the top of the list from time to time,
> > consuming around 60% of memory on a Debian system. Noticeably, flow-cat
> > doesn't appear in "top" (perhaps it's done with its job),
> >
> > however ps shows them both.
> >
> > #ps -aux |grep flow
> >
> > root     22604  0.9  0.0   6448    284 ?  S  Jan09  16:31 flow-cat /var/netflow/ft/all/dec2006/
> > root     22605  7.0 52.3 875204 474452 ?  D  Jan09 123:07 flow-stat -f8 -S2
> >
> > Also lsof
> >
> > # lsof |grep flow-cat
> >
> > flow-cat  22604 root cwd  DIR    8,3     224    36536
> > flow-cat  22604 root rtd  DIR    8,4     584        2 /
> > flow-cat  22604 root txt  REG    8,3   88716    25290 /usr/bin/flow-cat
> > flow-cat  22604 root mem  REG    8,4   90248      110 /lib/ld-2.3.2.so
> > flow-cat  22604 root mem  REG    8,4   73304     5891 /lib/tls/libnsl-2.3.2.so
> > flow-cat  22604 root mem  REG    8,4   28880     6019 /lib/libwrap.so.0.7.6
> > flow-cat  22604 root mem  REG    8,3   67468     5598 /usr/lib/libz.so.1.2.2
> > flow-cat  22604 root mem  REG    8,4 1254660     5886 /lib/tls/libc-2.3.2.so
> > flow-cat  22604 root mem  REG    8,1 3548008    48872 /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> > flow-cat  22604 root 0u   CHR  136,0                2 /dev/pts/0 (deleted)
> > flow-cat  22604 root 1w   FIFO   0,7         12005820 pipe
> > flow-cat  22604 root 2u   CHR  136,0                2 /dev/pts/0 (deleted)
> > flow-cat  22604 root 3r   REG    8,1 3548008    48872 /var/netflow/ft/all/dec2006/ft-v05.2006-12-21.133000+0800
> >
> > The above shows that flow-cat seems to have stopped processing at
> > Dec 21, I don't know why.
> >
> > # lsof |grep flow-stat
> >
> > flow-stat 22605 root cwd  DIR    8,3     224    36536 /usr/local/home/jayson/topcountries
> > flow-stat 22605 root rtd  DIR    8,4     584        2 /
> > flow-stat 22605 root txt  REG    8,3  130208    25291 /usr/bin/flow-stat
> > flow-stat 22605 root mem  REG    8,4   90248      110 /lib/ld-2.3.2.so
> > flow-stat 22605 root mem  REG    8,4   73304     5891 /lib/tls/libnsl-2.3.2.so
> > flow-stat 22605 root mem  REG    8,4   28880     6019 /lib/libwrap.so.0.7.6
> > flow-stat 22605 root mem  REG    8,3   67468     5598 /usr/lib/libz.so.1.2.2
> > flow-stat 22605 root mem  REG    8,4 1254660     5886 /lib/tls/libc-2.3.2.so
> > flow-stat 22605 root 0r   FIFO   0,7         12005820 pipe
> > flow-stat 22605 root 1w   REG    8,3       0    36353 /usr/local/home/jayson/topcountries/topdestinationip
> > flow-stat 22605 root 2u   CHR  136,0                2 /dev/pts/0 (deleted)
> >
> > As you can see above, I have redirected the output to "topdestinationip",
> > but up to now the file is still empty.
> >
> > Do you know how I can find out the progress of what I'm doing?
> > I'm just afraid that the program might have stopped running and I am
> > waiting for nothing now.
> >
> > Thanks
> > - jay
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Flow-tools mailing list
> > [EMAIL PROTECTED]
> > http://mailman.splintered.net/mailman/listinfo/flow-tools
>
> Just as a personal preference, I like to start my flow-cat sessions in
> the background, find their process id, and watch it. Literally:
>
> flow-cat &
> ps -aef|grep flow-cat
> watch "lsof -p <flow-cat-pid>"
>
> So I can see exactly what files flow-cat is processing, and watch for it
> to die.

_______________________________________________
Flow-tools mailing list
[EMAIL PROTECTED]
http://mailman.splintered.net/mailman/listinfo/flow-tools
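
On the question of producing a report every day or two and aggregating
afterwards, one possible approach is sketched below. It is only a sketch
under assumptions, not something taken from the flow-tools documentation:
it assumes each saved flow-stat -f8 report consists of '#' comment lines
followed by whitespace-separated "IPaddr flows octets packets" rows (worth
checking against real output first), and merge_reports and the report.*
file names are made up for illustration.

```shell
#!/bin/sh
# Sketch: merge several per-day flow-stat reports into one total per IP.
# ASSUMPTION: data rows look like "IPaddr flows octets packets" and all
# other lines start with '#'. Verify against your own flow-stat -f8 output.

# merge_reports FILE...
# Sums flows, octets and packets per destination IP across all files and
# prints one row per IP, sorted by total octets, largest first.
merge_reports() {
    awk '
        /^#/ || NF < 4 { next }      # skip comment lines and short rows
        {
            flows[$1]   += $2
            octets[$1]  += $3
            packets[$1] += $4
        }
        END {
            # %.0f avoids integer clamping of large counters in some awks
            for (ip in flows)
                printf "%s %.0f %.0f %.0f\n", ip, flows[ip], octets[ip], packets[ip]
        }
    ' "$@" | sort -k3,3nr
}

# Hypothetical usage: one small flow-stat run per day, then a single merge.
#   for d in 01 02 03; do
#       flow-cat /var/netflow/ft/all/dec2006/ft-v05.2006-12-$d.* |
#           flow-stat -f8 -S2 > "report.$d"
#   done
#   merge_reports report.* > topdestinationip
```

Each flow-stat run then only sees one day of flows, and the merge works on
the much smaller text reports, so the monthly totals never require the
whole month to pass through a single pipeline.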
