Burton M. Strauss III wrote:
(see the full report at the end)
As a really rough guess, that sounds like some kind of conflict with one of the mutexes holding things up. I wonder, if you enable some of the debugging messages, what shows up when it finally begins to respond? You could even try the #define PARM_SHOW_NTOP_HEARTBEAT 1 option - it should then begin to show messages for each of the thread loops, etc.
but you are STRONGLY advised to `sysctl -w kern.maxproc=1024`, as otherwise ntop exhausts the proc table with zombies (something like a self-DoS; I tried it) ...
I've always thought there should be a pthread_kill() call too. But we only kill threads at the end of the run, so the OS reap should fix this.
Try running with -K (debug mode) - it skips the fork() calls (although this means that the http creation is done in the same thread, so response time suffers - don't create the BIG pages showing 1000s of hosts...). See if -K makes ntop stop creating zombies. If so, that gives us a place to look.
Bingo, no zombies with -K. It slows down with pcap 0.5; there is little speed difference with 0.8.3.
* Somehow, I find it strange that ntop takes 90% CPU of a PII/350 for home-server traffic (even if the traffic is high for this category).
Userland threads - so the compile converts many of the interrupt driven calls to polling calls - this is what we saw in FreeBSD. That's why the pcap set nonblocking option exists - it converts the key call to a poll()/nanosleep() cycle.
Give it a try - now that you have libpcap 0.8.x, the pcap_setnonblock() call (behind ntop's --set-pcap-nonblocking option) should be available to you.
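To illustrate the general idea of replacing a blocking call with a try/sleep cycle, here is a small sketch on a plain file descriptor. It is only an analogy and an assumption about the pattern: it does not use libpcap and is not ntop's capture loop, and the names set_nonblocking() and read_with_sleep() are hypothetical. The pause between attempts is what keeps such a loop from spinning at full CPU.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to read up to len bytes, sleeping sleep_ms
 * milliseconds between attempts instead of blocking inside read().
 * Gives up after max_tries attempts that found no data.
 * Returns the byte count, 0 on EOF or when no data arrived, -1 on error. */
ssize_t read_with_sleep(int fd, void *buf, size_t len,
                        int max_tries, long sleep_ms)
{
    struct timespec pause;
    int i;

    pause.tv_sec = sleep_ms / 1000;
    pause.tv_nsec = (sleep_ms % 1000) * 1000000L;

    for (i = 0; i < max_tries; i++) {
        ssize_t n = read(fd, buf, len);

        if (n >= 0)
            return n;               /* data (or EOF) available */
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
            return -1;              /* real error */
        nanosleep(&pause, NULL);    /* nothing yet: yield the CPU briefly */
    }
    return 0;
}
```

With a very short or zero pause, a loop like this degenerates into busy polling, which is one way a process can end up near 100% CPU even on light traffic.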
I am discussing it on misc@ and will report back afterwards about the why and maybe the when.
Tests:
* -K
# ntop -i xl0 -t 5 -M -m 192.168.2.0/24 --ipv4 --skip-version-check -u _ntop -w 3000 -W 0 -p /etc/ntop/protocol.list -K
The slowness seems to last longer and to persist (but as said, debug plays ...); telnet gets an answer, but there are no zombies. Otherwise, debug shows only, as far as I can see, some "CMPFCTN_DEBUG: setResolvedName(0x3cd40000)" messages. Note also that CPU stays low:
16181 _ntop 4 0 15M 25M sleep bpf 1:00 0.05% ntop
Tested for 1h30.
* pcap_setnonblock (with pcap 0.8.3)
# ntop -i xl0 -t 5 -M -m 192.168.2.0/24 --ipv4 --skip-version-check -u _ntop -w 3000 -W 0 -p /etc/ntop/protocol.list --set-pcap-nonblocking
Fast, for real (responds within the first 10 sec; weird, previously there was no change in speed ...), but zombies within two minutes:
$ sudo lsof|grep ntop|grep IPv
ntop 17612 _ntop 11u IPv4 0xd109ce38 0t0 TCP *:3000 (LISTEN)
$ ps ax|grep [n]top
13966 ?? ZW 0:00.00 (ntop)
23426 ?? ZW 0:00.00 (ntop)
14575 ?? ZW 0:00.00 (ntop)
1182 ?? ZW 0:00.00 (ntop)
6982 ?? ZW 0:00.00 (ntop)
27065 ?? ZW 0:00.00 (ntop)
23495 ?? ZW 0:00.00 (ntop)
13042 ?? ZW 0:00.00 (ntop)
1652 ?? ZW 0:00.00 (ntop)
10075 ?? ZW 0:00.00 (ntop)
17612 p4 I+ 0:06.39 ntop -i xl0 -t 5 -M -m 192.168.2.0/24 --ipv4 --skip-version-check -u _ntop -w 3000 -W 0 -p /e
# ntop -i xl0 -t 5 -M -m 192.168.2.0/24 --ipv4 --skip-version-check -u _ntop -w 3000 -W 0 -p /etc/ntop/protocol.list --set-pcap-nonblocking -K
Speed OK, too; no zombies; CPU OK:
17345 _ntop 4 0 14M 25M sleep bpf 0:08 3.61% ntop
Mutexes:
Mutex gdbmMutex, is unlocked.
locked: 161 times, last was at Fri Apr 16 21:24:52 2004
util.c:4084(17345)
unlocked: 161 times, last was util.c:4091(17345)
longest: 0 sec from util.c:4091
Mutex packetProcessMutex, is unlocked.
locked: 8518 times, last was at Fri Apr 16 21:24:55 2004
pbuf.c:2089(17345)
unlocked: 8518 times, last was pbuf.c:2109(17345)
longest: 0 sec from pbuf.c:2109
Mutex purgeMutex, is locked. <= this one seems to be always locked, but it's http code, so maybe that's normal?
locked: 54 times, last was at Fri Apr 16 21:24:55 2004
http.c:3056(17345)
unlocked: 53 times, last was http.c:3092(17345)
longest: 1 sec from http.c:3092
Mutex hostsHashMutex, is unlocked.
locked: 24903 times, last was at Fri Apr 16 21:24:55 2004
pbuf.c:2508(17345)
unlocked: 24903 times, last was pbuf.c:3315(17345)
longest: 0 sec from pbuf.c:3315
Mutex tcpSessionsMutex, is unlocked.
locked: 69593 times, last was at Fri Apr 16 21:24:55 2004
sessions.c:634(17345)
unlocked: 69593 times, last was sessions.c:2033(17345)
longest: 1 sec from sessions.c:551
Mutex purgePortsMutex, is unlocked.
locked: 4144 times, last was at Fri Apr 16 21:24:55 2004
pbuf.c:698(17345)
unlocked: 4144 times, last was pbuf.c:719(17345)
longest: 0 sec from pbuf.c:719
Mutex securityItemsMutex, is unlocked.
locked: 54 times, last was at Fri Apr 16 21:24:55 2004
http.c:2631(17345)
unlocked: 54 times, last was http.c:2644(17345)
longest: 0 sec from http.c:2626
Report created on Sat Apr 17 08:39:20 2004 [ntop uptime: 11:16:26]
Generated by ntop v.3.0 SourceForge .tgz MT (SSL) [i386-unknown-openbsd3.5]
* PARM_SHOW_NTOP_HEARTBEAT 1 with pcap 0.8.3
$ telnet localhost 3000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.0
HTTP/1.0 408 Request Time-out
Date: Sat, 17 Apr 2004 06:46:33 GMT
Cache-Control: no-cache
Expires: 0
Connection: close
Server: ntop/3.0 SourceForge .tgz (i386-unknown-openbsd3.5)
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Error 408</TITLE>
<META HTTP-EQUIV=Pragma CONTENT=no-cache>
<META HTTP-EQUIV=Cache-Control CONTENT=no-cache>
<LINK REL=stylesheet HREF="/style.css" type="text/css">
<SCRIPT SRC="/functions.js" TYPE="text/javascript" LANGUAGE="javascript"></SCRIPT>
</HEAD>
<BODY BACKGROUND="/white_bg.gif" BGCOLOR="#FFFFFF" LINK=blue VLINK=blue>
<H1>Error 408</H1>
The request was timed-out.
<P>Received request:<BR><BLOCKQUOTE><TT>"GET / HTTP/1.0"</TT></BLOCKQUOTE>
Connection closed by foreign host.
$
Sat Apr 17 08:49:33 2004 [MSGID9360773] INITWEB: Initializing web server
Sat Apr 17 08:49:33 2004 [MSGID8922501] INITWEB: Initializing tcp/ip socket connections for web server
Sat Apr 17 08:49:33 2004 [MSGID0349927] Initializing socket, port 3000, address (any)
Sat Apr 17 08:49:33 2004 [MSGID0218735] INITWEB: Created a new socket (11)
Sat Apr 17 08:49:33 2004 [MSGID0349927] INITWEB: Initialized socket, port 3000, address (any)
Sat Apr 17 08:49:33 2004 [MSGID0818081] INITWEB: Waiting for HTTP connections on port 3000
Sat Apr 17 08:49:33 2004 [MSGID0841093] INITWEB: Starting web server
Sat Apr 17 08:49:33 2004 [MSGID8791429] THREADMGMT: Started thread (1009242112) for web server
Sat Apr 17 08:49:33 2004 [MSGID8437197] INITWEB: Server started... continuing with initialization
Sat Apr 17 08:49:33 2004 [MSGID0037760] THREADMGMT: Started thread (1009243136) for network packet sniffing on xl0
Sat Apr 17 08:49:33 2004 [MSGID8940166] HEARTBEAT(000000003)[main.c:1235]: main(), sleep()...
Sat Apr 17 08:49:33 2004 [MSGID0548089] THREADMGMT: web connections thread (22223) started...
Sat Apr 17 08:49:33 2004 [MSGID0316203] Note: SIGPIPE handler set (ignore)
Sat Apr 17 08:49:33 2004 [MSGID0986275] WEB: ntop's web server is now processing requests
Sat Apr 17 08:49:33 2004 [MSGID0393976] THREADMGMT: pcap dispatch thread running...
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000005262)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8807438] SECURITY: Loading items table
Sat Apr 17 08:50:32 2004 [MSGID8778652] NOTE: -L | --use-syslog=facility not specified, child processes will log to the default (24).
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000005263)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000005264)[main.c:1235]: main(), sleep()...
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000008685)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000008686)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000014888)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000014889)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000018789)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000030100)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000032983)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:32 2004 [MSGID8940166] HEARTBEAT(000046906)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:33 2004 [MSGID8940166] HEARTBEAT(000085467)[ntop.c:625]: scanIdleLoop(), sleep(60)...woke
Sat Apr 17 08:50:33 2004 [MSGID0825709] IDLE_PURGE: FINISHED selection, 0 [out of 37] hosts selected
Sat Apr 17 08:50:36 2004 [MSGID9399277] IDLE_PURGE: Device 0: no hosts deleted
Sat Apr 17 08:50:36 2004 [MSGID8940166] HEARTBEAT(000253488)[ntop.c:621]: scanIdleLoop(), sleep(60)...
Sat Apr 17 08:50:42 2004 [MSGID8940166] HEARTBEAT(000904085)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 08:50:42 2004 [MSGID8940166] HEARTBEAT(000904086)[main.c:1235]: main(), sleep()...
Sat Apr 17 08:50:45 2004 [MSGID8940166] HEARTBEAT(001225987)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:46 2004 [MSGID8940166] HEARTBEAT(001335741)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:46 2004 [MSGID8940166] HEARTBEAT(001351714)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:46 2004 [MSGID8940166] HEARTBEAT(001394158)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:46 2004 [MSGID8940166] HEARTBEAT(001401031)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:49 2004 [MSGID8940166] HEARTBEAT(001665417)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:49 2004 [MSGID8940166] HEARTBEAT(001676122)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:49 2004 [MSGID8940166] HEARTBEAT(001690447)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:49 2004 [MSGID8940166] HEARTBEAT(001693646)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:51 2004 [MSGID8940166] HEARTBEAT(001885737)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:51 2004 [MSGID8940166] HEARTBEAT(001921166)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:51 2004 [MSGID8940166] HEARTBEAT(001935612)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:51 2004 [MSGID8940166] HEARTBEAT(001941311)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:51 2004 [MSGID8940166] HEARTBEAT(001941312)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:52 2004 [MSGID8940166] HEARTBEAT(001942702)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:52 2004 [MSGID8940166] HEARTBEAT(001942703)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:52 2004 [MSGID8940166] HEARTBEAT(001942704)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:52 2004 [MSGID8940166] HEARTBEAT(001957113)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 08:50:52 2004 [MSGID8940166] HEARTBEAT(001970567)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 08:50:52 2004 [MSGID8940166] HEARTBEAT(001970568)[main.c:1235]: main(), sleep()...
# ps ax|grep ntop|grep ZW| wc -l
22
PID USERNAME PRI NICE SIZE RES STATE WAIT TIME CPU COMMAND
22223 _ntop 63 0 12M 22M run - 1:16 90.43% ntop
normal response time, speed correct
* PARM_SHOW_NTOP_HEARTBEAT 1 with pcap 0.5
# ntop -i xl0 -t 5 -M -m 192.168.2.0/24 --ipv4 --skip-version-check -u _ntop -w 3000 -W 0 -p /etc/ntop/protocol.list
Sat Apr 17 09:06:31 2004 [MSGID9360773] INITWEB: Initializing web server
Sat Apr 17 09:06:31 2004 [MSGID8922501] INITWEB: Initializing tcp/ip socket connections for web server
Sat Apr 17 09:06:31 2004 [MSGID0349927] Initializing socket, port 3000, address (any)
Sat Apr 17 09:06:31 2004 [MSGID0218735] INITWEB: Created a new socket (11)
Sat Apr 17 09:06:31 2004 [MSGID0349927] INITWEB: Initialized socket, port 3000, address (any)
Sat Apr 17 09:06:31 2004 [MSGID0818081] INITWEB: Waiting for HTTP connections on port 3000
Sat Apr 17 09:06:31 2004 [MSGID0841093] INITWEB: Starting web server
Sat Apr 17 09:06:31 2004 [MSGID8791429] THREADMGMT: Started thread (1009180672) for web server
Sat Apr 17 09:06:31 2004 [MSGID8437197] INITWEB: Server started... continuing with initialization
Sat Apr 17 09:06:31 2004 [MSGID0037760] THREADMGMT: Started thread (1009181696) for network packet sniffing on xl0
Sat Apr 17 09:06:31 2004 [MSGID8940166] HEARTBEAT(000000003)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:06:31 2004 [MSGID0548089] THREADMGMT: web connections thread (27988) started...
Sat Apr 17 09:06:31 2004 [MSGID0316203] Note: SIGPIPE handler set (ignore)
Sat Apr 17 09:06:31 2004 [MSGID0986275] WEB: ntop's web server is now processing requests
Sat Apr 17 09:06:31 2004 [MSGID0393976] THREADMGMT: pcap dispatch thread running...
=> waiting for response
Sat Apr 17 09:07:27 2004 [MSGID8940166] HEARTBEAT(000005086)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:07:27 2004 [MSGID8807438] SECURITY: Loading items table
Sat Apr 17 09:07:27 2004 [MSGID8940166] HEARTBEAT(000005087)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:07:27 2004 [MSGID8940166] HEARTBEAT(000005088)[main.c:1235]: main(), sleep()...
=> always waiting (get title, little after first menu bar), cpu low, no zombie
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010232)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010233)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010234)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010235)[ntop.c:625]: scanIdleLoop(), sleep(60)...woke
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010236)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010237)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:08:38 2004 [MSGID8940166] HEARTBEAT(000010269)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:10:16 2004 [MSGID8940166] HEARTBEAT(000017240)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:10:16 2004 [MSGID8940166] HEARTBEAT(000017241)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:10:16 2004 [MSGID8940166] HEARTBEAT(000017242)[ntop.c:673]: scanFingerprintLoop(), sleep()...woke
Sat Apr 17 09:10:16 2004 [MSGID8757584] OSFP: scanFingerprintLoop() checked 6, resolved 6
Sat Apr 17 09:10:16 2004 [MSGID8940166] HEARTBEAT(000017243)[ntop.c:669]: scanFingerprintLoop(), sleep()...
Sat Apr 17 09:10:35 2004 [MSGID8940166] HEARTBEAT(000019061)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:10:35 2004 [MSGID8940166] HEARTBEAT(000019062)[main.c:1235]: main(), sleep()...
=> same
Sat Apr 17 09:10:45 2004 [MSGID8940166] HEARTBEAT(000019834)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:10:45 2004 [MSGID8940166] HEARTBEAT(000019835)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:11:09 2004 [MSGID8940166] HEARTBEAT(000022174)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:11:09 2004 [MSGID8940166] HEARTBEAT(000022175)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:11:57 2004 [MSGID0825709] IDLE_PURGE: FINISHED selection, 0 [out of 35] hosts selected
Sat Apr 17 09:12:38 2004 [MSGID8940166] HEARTBEAT(000029083)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:12:38 2004 [MSGID8778652] NOTE: -L | --use-syslog=facility not specified, child processes will log to the default (24).
Sat Apr 17 09:12:38 2004 [MSGID8940166] HEARTBEAT(000029084)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:12:38 2004 [MSGID8940166] HEARTBEAT(000029085)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:12:38 2004 [MSGID8940166] HEARTBEAT(000029086)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:12:39 2004 [MSGID8940166] HEARTBEAT(000029174)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:12:39 2004 [MSGID8940166] HEARTBEAT(000032557)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:12:39 2004 [MSGID8940166] HEARTBEAT(000032731)[webInterface.c:8154]: handleWebConnections()
Sat Apr 17 09:12:40 2004 [MSGID9399277] IDLE_PURGE: Device 0: no hosts deleted
Sat Apr 17 09:12:41 2004 [MSGID8940166] HEARTBEAT(000150940)[ntop.c:621]: scanIdleLoop(), sleep(60)...
Sat Apr 17 09:12:46 2004 [MSGID8940166] HEARTBEAT(000772530)[ntop.c:673]: scanFingerprintLoop(), sleep()...woke
Sat Apr 17 09:12:46 2004 [MSGID8940166] HEARTBEAT(000775980)[ntop.c:669]: scanFingerprintLoop(), sleep()...
=> have all
Sat Apr 17 09:12:48 2004 [MSGID8940166] HEARTBEAT(001094595)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:12:48 2004 [MSGID8940166] HEARTBEAT(001094596)[main.c:1235]: main(), sleep()...
Sat Apr 17 09:12:58 2004 [MSGID8940166] HEARTBEAT(002259902)[main.c:1243]: main(), sleep()...woke
Sat Apr 17 09:12:58 2004 [MSGID8940166] HEARTBEAT(002259903)[main.c:1235]: main(), sleep()...
=> ask for configuration, normal speed, cpu high, first zombies
Report created on Sat Apr 17 09:13:13 2004 [ntop uptime: 6:42]
Generated by ntop v.3.0 SourceForge .tgz MT (SSL) [i386-unknown-openbsd3.5]
It seems that set nonblocking really helps speed only for the first ~five minutes, but it helps a lot for CPU.
Report created on Sat Apr 17 10:02:25 2004 [ntop uptime: 55:54]
Generated by ntop v.3.0 SourceForge .tgz MT (SSL) [i386-unknown-openbsd3.5]
speed ok, 155 zombies, cpu high
Regards
Julien
