Re: [Bro-Dev] [JIRA] (BIT-1353) BroCtl status/top take excessive amount of time

2015-03-23 Thread Johanna Amann
Hi,

On Mon, Mar 23, 2015 at 03:33:13PM -0500, Daniel Thayer wrote:
 I'm glad to hear that you're testing broctl on FreeBSD (I always
 test on Linux).  Here are my initial ideas:

 How many hosts are in your cluster?  (you mentioned 28 physical nodes,
 does that mean 28 computers?!)

It is 28 computers, each running 3 bro worker processes, plus 2 more
physical machines running the master and proxies.

 Are you running the git master version of broctl?

It is not quite master; it is currently running 5e2defe, i.e. the state as
of March 13th.

 Is every broctl command slow, or just status and top?

All the ones that I tried are slow. I can upgrade to master and test again
- I just wanted to ask if there is some way to debug what is going on
before restarting the cluster, since the problem took a few days to
manifest itself. Hence I probably will not be able to directly reproduce
it :)

 The broctl status command usually spends most of its time
 waiting for broccoli.  I've added a new option that you
 can set in your etc/broctl.cfg file that will skip
 the broccoli code so that broctl status runs much faster.
 To enable this feature, make sure this line is in your
 broctl.cfg file:
 StatusCmdShowAll = 0
 (after you add this, broctl will say that you have to run
 either install or deploy, but you don't actually
 need to for this particular broctl option).

I added this (without running install / deploy) and it is now faster,
but it still takes a while. I examined spool/debug.log a bit, and it
actually seems that a significant amount of time is spent getting the
process status. The timeline currently looks like this:

23 Mar 11:53:05 [broctl] status
23 Mar 11:53:05 [broctl] Getting process status ...
23 Mar 11:53:05 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/check-pid 2513
[...] (many lines like this and many exit code lines)
23 Mar 11:54:07 [execute] blade15: exit code 0
23 Mar 11:54:07 [execute] blade26: /xa/bro/master/share/broctl/scripts/helpers/cat-file /xa/bro/master/spool/worker-26-0/.startup
[...]
23 Mar 11:54:09 [execute] blade15: exit code 0
23 Mar 11:54:09 [events] broccoli: Control::peer_status_request() to node worker-26-0
[...]
23 Mar 11:54:29 [events] broccoli: Control::peer_status_response(1427136868.812806 [...]
- status output
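
For anyone who wants to locate the slow phase in a full debug.log without
reading it by eye, here is a rough sketch that parses the timestamped lines
and lists the largest gaps between consecutive entries. It assumes every
entry starts with "DD Mon HH:MM:SS [tag]" as in the excerpt above, that
month names are in English, and that the year (which the log does not
record) can be passed in by hand. The function names parse_events and
largest_gaps are made up for this illustration and are not part of broctl.

```python
import re
from datetime import datetime

# Matches lines such as "23 Mar 11:53:05 [broctl] status".
# The layout is inferred from the excerpt above, not from broctl's source.
LINE_RE = re.compile(r"^(\d{1,2} \w{3} \d{2}:\d{2}:\d{2}) \[(\w+)\] ?(.*)$")


def parse_events(lines, year=2015):
    """Return (timestamp, tag, message) tuples for lines that match.

    Lines without a leading timestamp (wrapped paths, elided sections)
    are skipped. The year is an assumption supplied by the caller.
    """
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m is None:
            continue
        stamp = datetime.strptime(f"{year} {m.group(1)}", "%Y %d %b %H:%M:%S")
        events.append((stamp, m.group(2), m.group(3)))
    return events


def largest_gaps(events, top=3):
    """Return the `top` largest gaps as (seconds, tag, message).

    Each gap is attributed to the entry that precedes it, i.e. the
    step that was in progress while nothing else was logged.
    """
    gaps = []
    for (t0, tag, msg), (t1, _, _) in zip(events, events[1:]):
        gaps.append(((t1 - t0).total_seconds(), tag, msg))
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]
```

To use it, pass the lines of spool/debug.log to parse_events() and print
the result of largest_gaps(). Note that on an excerpt with elided sections
the gaps span the elided lines, so only the full log gives per-command
numbers.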

Johanna
___
bro-dev mailing list
bro-dev@bro.org
http://mailman.icsi.berkeley.edu/mailman/listinfo/bro-dev


Re: [Bro-Dev] [JIRA] (BIT-1353) BroCtl status/top take excessive amount of time

2015-03-23 Thread Johanna Amann
On Mon, Mar 23, 2015 at 04:15:12PM -0500, Daniel Thayer wrote:
 When you do a broctl status, does it show a status line for every Bro
 node in your cluster?

Yes, it does. At least I think so, the number is quite large :)

 How are you running broctl status:
 1) just by typing broctl status, or
 2) by running broctl, then type the status command at the BroControl
 prompt.

I run broctl first and then type status.

 When you run broctl status, it must establish an ssh session to
 every remote machine, which could take a while when there are 28
 machines.  However, when you run just broctl, then type status
 at the BroControl prompt, it keeps the ssh sessions open, so the 2nd
 time you type status should be faster than the 1st time (because
 the 2nd time it doesn't need to do the ssh connections).

There does not seem to be a big speed difference between the first time
and the second time status is run.

Johanna