Re: [Bro-Dev] [JIRA] (BIT-1353) BroCtl status/top take excessive amount of time
Hi,

On Mon, Mar 23, 2015 at 03:33:13PM -0500, Daniel Thayer wrote:

> I'm glad to hear that you're testing broctl on FreeBSD (I always test on Linux). Here are my initial ideas:
>
> How many hosts are in your cluster? (you mentioned 28 physical nodes, does that mean 28 computers?!)

It is 28 computers, each running 3 Bro worker processes, with 2 more physical machines running the master and proxies.

> Are you running the git master version of broctl?

It is not quite master - it is currently running 5e2defe, so the state as of March 13th.

> Is every broctl command slow, or just status and top?

All the ones that I tried are slow. I can upgrade to master and test again - I just wanted to ask if there is some way to debug what is going on before restarting the cluster, since the problem took a few days to manifest itself. Hence I probably will not be able to reproduce it directly :)

> The broctl status command usually spends most of its time waiting for broccoli. I've added a new option that you can set in your etc/broctl.cfg file that will skip the broccoli code so that broctl status runs much faster. To enable this feature, make sure this line is in your broctl.cfg file:
>
> StatusCmdShowAll = 0
>
> (after you add this, broctl will say that you have to run either install or deploy, but you don't actually need to for this particular broctl option).

I added this (without running install / deploy) and it is now faster, but still takes a while. I examined spool/debug.log a bit, and it actually seems that a significant amount of time is spent getting the process status. The timeline currently looks like this:

  23 Mar 11:53:05 [broctl] status
  23 Mar 11:53:05 [broctl] Getting process status ...
  23 Mar 11:53:05 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/check-pid 2513
  [...]
  (many lines like this and many exit code lines)
  23 Mar 11:54:07 [execute] blade15: exit code 0
  23 Mar 11:54:07 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/cat-file /xa/bro/master/spool/worker-26-0/.startup
  [...]
  23 Mar 11:54:09 [execute] blade15: exit code 0
  23 Mar 11:54:09 [events] broccoli: Control::peer_status_request() to node worker-26-0
  [...]
  23 Mar 11:54:29 [events] broccoli: Control::peer_status_response(1427136868.812806 [...]
  - status output

Johanna
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev
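The debug.log timeline above can also be summarized mechanically. What follows is a minimal Python sketch, not part of broctl, that assumes every log line has the shape shown in the excerpt ("23 Mar 11:53:05 [tag] message"). It charges the gap between two consecutive entries to the tag of the earlier entry. The function names and the default year are illustrative assumptions, since the excerpt carries no year; elided lines such as "[...]" are skipped.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the excerpt: "23 Mar 11:53:05 [tag] message"
LINE_RE = re.compile(r"^(\d{1,2}) (\w{3}) (\d{2}):(\d{2}):(\d{2}) \[(\w+)\] (.*)$")
MONTHS = "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split()


def parse_line(line, year=2015):
    """Return (timestamp, tag, message), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None or m.group(2) not in MONTHS:
        return None
    day, mon, hh, mm, ss, tag, msg = m.groups()
    ts = datetime(year, MONTHS.index(mon) + 1, int(day), int(hh), int(mm), int(ss))
    return ts, tag, msg


def seconds_per_tag(lines, year=2015):
    """Charge the gap between consecutive entries to the tag of the earlier entry."""
    totals = {}
    prev = None
    for line in lines:
        entry = parse_line(line, year)
        if entry is None:
            continue  # skip elisions, blank lines, and anything unparseable
        if prev is not None:
            gap = (entry[0] - prev[0]).total_seconds()
            totals[prev[1]] = totals.get(prev[1], 0.0) + gap
        prev = entry
    return totals


# The excerpt quoted in the message, elisions left as they were posted.
SAMPLE = """\
23 Mar 11:53:05 [broctl] status
23 Mar 11:53:05 [broctl] Getting process status ...
23 Mar 11:53:05 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/check-pid 2513
[...]
23 Mar 11:54:07 [execute] blade15: exit code 0
23 Mar 11:54:07 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/cat-file /xa/bro/master/spool/worker-26-0/.startup
[...]
23 Mar 11:54:09 [execute] blade15: exit code 0
23 Mar 11:54:09 [events] broccoli: Control::peer_status_request() to node worker-26-0
[...]
23 Mar 11:54:29 [events] broccoli: Control::peer_status_response(1427136868.812806 [...]
""".splitlines()

print(seconds_per_tag(SAMPLE))
# {'broctl': 0.0, 'execute': 64.0, 'events': 20.0}
```

On this excerpt the helper commands run via [execute] account for 64 seconds and the broccoli exchange for 20 seconds. The sketch does not handle logs that cross midnight on December 31st, because the year is supplied by the caller.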
Re: [Bro-Dev] [JIRA] (BIT-1353) BroCtl status/top take excessive amount of time
On Mon, Mar 23, 2015 at 04:15:12PM -0500, Daniel Thayer wrote:

> When you do a broctl status, does it show a status line for every Bro node in your cluster?

Yes, it does. At least I think so, the number is quite large :)

> How are you running broctl status:
> 1) just by typing broctl status, or
> 2) by running broctl, then typing the status command at the BroControl prompt?

I run broctl first and then type status.

> When you run broctl status, it must establish an ssh session to every remote machine, which could take a while when there are 28 machines. However, when you run just broctl, then type status at the BroControl prompt, it keeps the ssh sessions open, so the 2nd time you type status should be faster than the 1st time (because the 2nd time it doesn't need to do the ssh connections).

There does not seem to be a big speed difference between the first and the second time status is run.

Johanna
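To check whether ssh connection setup is a factor at all, the per-host cost of a fresh ssh session can be measured outside of broctl. The following Python sketch is only an illustration under stated assumptions: it does not use any broctl code, the function names are hypothetical, and it assumes key-based ssh login to each host works without a prompt. It runs the no-op command "true" on each host in turn and lists the hosts slowest first.

```python
import subprocess
import time


def ssh_true(host):
    """Run a no-op over a fresh ssh session and return ssh's exit code.

    BatchMode=yes makes ssh fail instead of waiting at a password prompt.
    """
    cmd = ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10", host, "true"]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def time_hosts(hosts, run=ssh_true, clock=time.monotonic):
    """Return [(host, seconds, exit_code)] sorted slowest first.

    Hosts are contacted one after another, so the sum of the timings is the
    cost of setting up all sessions sequentially.
    """
    results = []
    for host in hosts:
        start = clock()
        code = run(host)
        results.append((host, clock() - start, code))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Example call (host names taken from the log excerpt earlier in the thread):
#   for host, secs, code in time_hosts(["blade15", "blade26"]):
#       print(f"{host}: {secs:.2f}s (exit {code})")
```

If every host answers quickly here, the delay is more likely in the helper scripts that broctl runs on the nodes than in the ssh connections themselves.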