On Tue, Feb 23, 2010 at 4:21 PM, Robert Story <[email protected]> wrote:

> On Tue, 23 Feb 2010 09:11:40 +0100 Bart wrote:
> BVA> Handling alarms via SIGALRM was already broken by r16831
> BVA> (http://net-snmp.svn.sourceforge.net/viewvc/net-snmp?view=rev&revision=16831).
> BVA> In that revision a helper thread was added in
> BVA> agent/mibgroup/if-mib/data_access/interface_linux.c. Because no provisions
> BVA> are taken in snmpd to restrict delivery of SIGALRM to a certain thread,
> BVA> SIGALRM can be delivered to either the main agent thread or the helper
> BVA> thread. If SIGALRM is delivered to the prefix listener helper thread, the
> BVA> invocation of run_alarms() from the SIGALRM handler will trigger several
> BVA> data races and may even cause a crash.
>
> Ok, is this only true in the non-default case where
> NETSNMP_DS_LIB_ALARM_DONT_USE_SIG is set, or can it happen in the default
> case too?
>
> Most likely we should not be creating a thread, but instead be forking off
> a child process to do the listening...
>

The above is indeed only the case if NETSNMP_DS_LIB_ALARM_DONT_USE_SIG has
been set to zero (the default value is one), so that Net-SNMP alarms are
driven by SIGALRM.

Forking off a child process might introduce more issues than it solves. For
example, what should happen to the child process when snmpd receives
SIGHUP?

It is not clear to me why a thread has been created to listen for
RTM_NEWADDR, RTM_DELADDR and RTM_NEWPREFIX messages instead of processing
the netlink socket data via snmp_sess_select().
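
To illustrate what handling the socket from a select()-based loop could look
like, here is a rough sketch that uses plain select() instead of the
Net-SNMP session API. The function names open_addr_netlink() and
fd_readable() are hypothetical, and the multicast groups
(RTMGRP_IPV6_IFADDR and RTMGRP_IPV6_PREFIX) are my assumption about what
the listener needs in order to see the three message types above.

```c
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Open a NETLINK_ROUTE socket subscribed to IPv6 address and prefix
 * notifications. Returns the descriptor, or -1 on failure. */
int open_addr_netlink(void)
{
    struct sockaddr_nl sa;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.nl_family = AF_NETLINK;
    sa.nl_groups = RTMGRP_IPV6_IFADDR | RTMGRP_IPV6_PREFIX; /* assumed */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int fd_readable(int fd, long timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &rfds)) ? 1 : 0;
}
```

In the agent, the descriptor would be added to the fd set that the main loop
already passes to select(), so the netlink data is processed on the main
thread and no extra thread is needed.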

Another remark: the code that handles the netlink messages is not correct.
The queue used by the Linux kernel to pass data from the kernel to user
space has a bounded size (configurable via SO_RCVBUF) and hence can overflow.
When such an overflow happens, some of the RTM_* messages are lost. The
process that is listening for RTM_* messages must detect overflows and, once
an overflow has happened, poll the current state so that the in-kernel state
and the state in the user process are in sync again. According to netlink(7),
the kernel reports an overflow through the ENOBUFS error returned by
recvmsg(); a message of type NLMSG_ERROR carries the error code of a failed
request.
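
A sketch of the detection logic, under the assumption (based on my reading
of netlink(7)) that a receive-queue overflow shows up as ENOBUFS from
recv()/recvmsg(). To be on the safe side it also treats NLMSG_ERROR messages
with a nonzero error code and NLMSG_OVERRUN messages as a reason to
resynchronize. The names nl_classify_recv() and nl_scan() are hypothetical;
the actual resync (re-reading the current addresses and prefixes) is left
out.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

enum nl_action { NL_PROCESS, NL_RESYNC, NL_RETRY, NL_FAIL };

/* Decide what to do with the result of recv() on the netlink socket.
 * n is the return value of recv(), err the value of errno if n < 0. */
enum nl_action nl_classify_recv(ssize_t n, int err)
{
    if (n >= 0)
        return NL_PROCESS;          /* got data: parse it */
    if (err == ENOBUFS)
        return NL_RESYNC;           /* messages were dropped */
    if (err == EINTR || err == EAGAIN)
        return NL_RETRY;
    return NL_FAIL;
}

/* Walk the messages in buf. Returns the number of RTM_NEWADDR,
 * RTM_DELADDR and RTM_NEWPREFIX messages seen and sets *resync to 1 if a
 * message indicates that the local state can no longer be trusted. */
int nl_scan(void *buf, int len, int *resync)
{
    struct nlmsghdr *h = (struct nlmsghdr *)buf;
    int events = 0;

    *resync = 0;
    for (; NLMSG_OK(h, len); h = NLMSG_NEXT(h, len)) {
        if (h->nlmsg_type == NLMSG_DONE)
            break;
        if (h->nlmsg_type == NLMSG_OVERRUN) {
            *resync = 1;
            continue;
        }
        if (h->nlmsg_type == NLMSG_ERROR) {
            struct nlmsgerr *e = (struct nlmsgerr *)NLMSG_DATA(h);

            /* A truncated error message, or one with a nonzero error
             * code, is treated as "state may be stale". */
            if (h->nlmsg_len < NLMSG_LENGTH(sizeof(*e)) || e->error != 0)
                *resync = 1;
            continue;
        }
        if (h->nlmsg_type == RTM_NEWADDR || h->nlmsg_type == RTM_DELADDR ||
            h->nlmsg_type == RTM_NEWPREFIX)
            events++;
    }
    return events;
}
```

The receive loop would call nl_classify_recv() first and only hand the
buffer to nl_scan() when it returns NL_PROCESS; in both functions a resync
request means the cached state has to be rebuilt from a fresh dump rather
than patched incrementally.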

See also:
http://www.linuxfoundation.org/collaborate/workgroups/networking/generic_netlink_howto
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=include/linux/netlink.h;hb=HEAD

Bart.
_______________________________________________
Net-snmp-coders mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
