Hello all,

I've been working with Mr. Strauss to resolve a problem that's been plaguing me 
for several years: ntop segfaults at random intervals.  Sometimes after an hour, 
sometimes after a week.  The crash I see most often is related to resolving IPs 
to host names.  This process does not appear to be thread safe.

For several days now I've been running a single thread for this process and 
have experienced zero segfaults, whereas typically I would have several per day.  
During testing, increasing the number of threads from 3 to 9 resulted in much 
more frequent abends.

If you experience segfaults please try the following tests if you can:

1.) Add "-n" (or "--numeric-ip-addresses"), or "-n 0" on versions that take a 
mode, where:

    [-n <mode> | --numeric-ip-addresses <mode>]   Numeric IP addresses DNS resolution mode:
        0 - No DNS resolution at all
        1 - DNS resolution for local hosts only
        2 - DNS resolution for remote hosts only


Which options are available to you depends on your version of ntop.  I THINK 
4.0.3 and later support "-n <mode>", whereas earlier versions support just 
"-n".  You can check by running "./ntop --help" in the directory where your ntop 
binary is stored.  It will print all (most) of the startup options.

Run ntop in this mode for some period of time - whatever you feel is 
satisfactory to prove it's stable.  E.g., if it usually abends once a day, 
perhaps let it run for a week.

2.) If step one is successful, the next step is to recompile ntop with the 
following change to "globals-defines.h":
            - #define MAX_NUM_DEQUEUE_ADDRESS_THREADS     1

The default is 3 threads, and under certain conditions they stomp on each 
other.  Limiting to a single thread renders this specific fault condition 
impossible.

Obviously, running only a single thread means the "to be resolved" queue can 
get quite large and possibly overflow, meaning anything above the max will be 
discarded.  This is just a test, so some/mostly resolved IPs are better than 
none - right?  If you wish, you COULD increase the queue size by editing the 
following, again in globals-defines.h:
            - #define MAX_NUM_QUEUED_ADDRESSES          16384

16K entries is quite large.  If your DNS servers are fast you should be able to 
keep the queue serviced before it overflows.  However, if you wish to change 
this, I ask that you do so some time after reducing the thread count to 1 - to 
minimize the variables and keep the test as controlled as possible.
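
To make the overflow behaviour described above concrete, here is a minimal sketch of a fixed-size queue that discards new addresses once it is full and counts what it dropped. It is not ntop's queue: the struct, the function names and the tiny DEMO_QUEUE_SIZE are all invented for illustration, and locking is left out to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Tiny capacity so the overflow is easy to see. ntop's own limit is
 * MAX_NUM_QUEUED_ADDRESSES in globals-defines.h. */
#define DEMO_QUEUE_SIZE 4

/* Hypothetical structure for illustration only, NOT ntop's queue. */
struct addr_queue {
    uint32_t slots[DEMO_QUEUE_SIZE]; /* pending IPv4 addresses */
    size_t head;                     /* index of the oldest entry */
    size_t count;                    /* entries currently queued */
    unsigned long dropped;           /* addresses discarded while full */
};

/* Queue one address. Returns 1 if stored, 0 if discarded. */
int addr_queue_push(struct addr_queue *q, uint32_t addr)
{
    assert(q != NULL);
    if (q->count == DEMO_QUEUE_SIZE) {
        q->dropped++;                /* full: the new address is lost */
        return 0;
    }
    q->slots[(q->head + q->count) % DEMO_QUEUE_SIZE] = addr;
    q->count++;
    return 1;
}

/* Take the oldest address. Returns 1 and fills *addr, or 0 if empty. */
int addr_queue_pop(struct addr_queue *q, uint32_t *addr)
{
    assert(q != NULL && addr != NULL);
    if (q->count == 0)
        return 0;
    *addr = q->slots[q->head];
    q->head = (q->head + 1) % DEMO_QUEUE_SIZE;
    q->count--;
    return 1;
}
```

With a single consumer calling addr_queue_pop(), the dropped counter only grows when addresses arrive faster than they are resolved, which is the trade-off of the one-thread test. A larger queue delays that point but does not remove it.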

It seems most people reporting this issue are running netflow, myself 
included.  All interface types use the same process/function to resolve IPs, 
so I'm not sure why it seems more prevalent with netflow users.

Anyway, hope this helps someone.  If anyone is willing to help test patches, 
that would be great.  I'll "alpha" test them beforehand, but it would help to 
have at least a couple of others test beta patches.

TIA!

Gary

_______________________________________________
Ntop mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop
