This was a nasty one to find; my colleague helped me solve it. The problem
occurs only when the snmp agent is running on a switch (a device with
little memory).

Here's the scoop. The problem happens very early, at init time, before
going into the receive loop. The main problem is in real_init_master():
we free agentx_sockets, which is netsnmp_ds_strings[1][1], but we never
reset netsnmp_ds_strings[1][1] to 0. Later on, we calloc an slp which gets
the same ptr as netsnmp_ds_strings[1][1] and add the slp to the linked
list. Since netsnmp_ds_strings[1][1] is not 0, we tamper with it, which
corrupts the linked list. This problem is nasty and difficult to find
because by the time we catch the corrupted linked list, it's already way
too late.

Here's the best way for you to see
the problem with gdb:

  main
    init_snmp
      read_configs
        read_config_with_type
          read_config
            run_config_handler
              agentx_parse_agentx_socket
                netsnmp_ds_set_string
                  netsnmp_ds_strings[1][1] = strdup() = 0x100ab208 "localhost:705"
    init_master_agent
      real_init_master
        SNMP_FREE(agentx_sockets);   <= agentx_sockets = netsnmp_ds_strings[1][1]

Here's the problem: we free but never reset netsnmp_ds_strings[1][1] to 0.

    ... (we are still in init_master_agent)
      netsnmp_register_agent_nsap
        snmp_add
          snmp_sess_add_ex
            snmp_sess_copy
              _sess_copy
                slp = calloc() = 0x100ab208   <= now both slp & netsnmp_ds_strings[1][1] have the same ptr

Note that from this point on, netsnmp_ds_get_string(1, 1) will return the
same pointer as slp, and we are putting junk into the slp.
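
To make the aliasing easier to picture, here is a minimal sketch in C. It is
not net-snmp code: pool_alloc(), pool_free(), ds_slot and demo_stale_alias()
are hypothetical names, and the one-block pool is an assumption that forces
the "next allocation reuses the freed address" behaviour deterministically
(a real heap may or may not do this). It also assumes, as the trace above
suggests, that SNMP_FREE clears only the local variable it is given.

```c
#include <string.h>

/* Toy one-block allocator (hypothetical): once the block is released,
 * the very next allocation returns the same address again. */
static char pool_block[64];
static int  pool_in_use = 0;

static void *pool_alloc(void)
{
    if (pool_in_use)
        return NULL;                 /* only one block available */
    pool_in_use = 1;
    memset(pool_block, 0, sizeof pool_block);   /* calloc-like zeroing */
    return pool_block;
}

static void pool_free(void *p)
{
    if (p == pool_block)
        pool_in_use = 0;
}

/* Stand-in for netsnmp_ds_strings[1][1]. */
static char *ds_slot = NULL;

/* Returns 1 if the stale store entry ends up aliasing the new "session". */
int demo_stale_alias(void)
{
    char *agentx_sockets;
    char *slp;

    /* Config parsing: the store keeps its own copy of the value. */
    ds_slot = pool_alloc();
    strcpy(ds_slot, "localhost:705");

    /* real_init_master(): free through a local alias; store is not reset. */
    agentx_sockets = ds_slot;
    pool_free(agentx_sockets);
    agentx_sockets = NULL;           /* only the local copy is cleared */

    /* _sess_copy(): the new session gets the block that was just freed. */
    slp = pool_alloc();
    memcpy(slp, "SESSION", 8);       /* session data, including the NUL */

    /* The store still points at the block and now sees session data. */
    return ds_slot == slp && strcmp(ds_slot, "SESSION") == 0;
}
```

Calling demo_stale_alias() from a small main() returns 1 in this toy model:
the store entry was never reset, so it and the session share one block.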

Now when you do something that runs the config handler again, the problem
gets worse and it will crash:

  run_config_handler(token="agentxSocket", cptr="localhost:705")
    agentx_parse_agentx_socket(token="agentxSocket", cptr="localhost:705")
      netsnmp_ds_set_string(storeid=1, which=1, cptr="localhost:705")
        if (netsnmp_ds_strings[storeid][which] != NULL) {
            free(netsnmp_ds_strings[storeid][which]);   <= PROBLEM; freeing the slp to compound the problem [X]
            netsnmp_ds_strings[storeid][which] = NULL;
        }
        netsnmp_ds_strings[storeid][which] = strdup(value);

The fix for this is as follows: define a function in default_store.c which
sets netsnmp_ds_strings[storeid][which] to NULL, and call it from
real_init_master() after SNMP_FREE(agentx_sockets).
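
A rough sketch of what such a reset function could look like, written against
a toy store rather than the real default_store.c. Everything here is an
assumption made for illustration: the array sizes, the names ds_strings,
ds_set_string, ds_get_string, ds_forget_string and demo_fixed_sequence, and
the copy_string helper that stands in for strdup(). Only the shape of the
set function follows the excerpt quoted above.

```c
#include <stdlib.h>
#include <string.h>

#define DS_MAX_IDS    3     /* sizes are assumptions for this sketch */
#define DS_MAX_SUBIDS 32

static char *ds_strings[DS_MAX_IDS][DS_MAX_SUBIDS];

static int ds_in_range(int storeid, int which)
{
    return storeid >= 0 && storeid < DS_MAX_IDS &&
           which   >= 0 && which   < DS_MAX_SUBIDS;
}

/* Stand-in for strdup(): heap copy of a NUL-terminated string. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char  *p = malloc(n);
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Frees the old value (if any) and stores a copy of the new one. */
int ds_set_string(int storeid, int which, const char *value)
{
    if (!ds_in_range(storeid, which))
        return -1;
    if (ds_strings[storeid][which] != NULL) {
        free(ds_strings[storeid][which]);
        ds_strings[storeid][which] = NULL;
    }
    if (value != NULL) {
        ds_strings[storeid][which] = copy_string(value);
        if (ds_strings[storeid][which] == NULL)
            return -1;
    }
    return 0;
}

char *ds_get_string(int storeid, int which)
{
    return ds_in_range(storeid, which) ? ds_strings[storeid][which] : NULL;
}

/* Hypothetical reset helper: forget the pointer WITHOUT freeing it,
 * for callers that have already freed the string themselves. */
int ds_forget_string(int storeid, int which)
{
    if (!ds_in_range(storeid, which))
        return -1;
    ds_strings[storeid][which] = NULL;
    return 0;
}

/* Models the fixed init sequence followed by a config re-read.
 * Returns 1 if the store ends up holding a fresh, valid string. */
int demo_fixed_sequence(void)
{
    char *agentx_sockets;

    ds_set_string(1, 1, "localhost:705");    /* config parsing */

    agentx_sockets = ds_get_string(1, 1);
    free(agentx_sockets);                    /* SNMP_FREE(agentx_sockets) */
    agentx_sockets = NULL;
    ds_forget_string(1, 1);                  /* the proposed reset */

    if (ds_get_string(1, 1) != NULL)
        return 0;

    /* Config re-read: the slot is NULL, so nothing stale gets freed. */
    if (ds_set_string(1, 1, "localhost:705") != 0)
        return 0;
    return strcmp(ds_get_string(1, 1), "localhost:705") == 0;
}
```

With the reset in place, the later set call finds a NULL slot and skips the
free() entirely, so it can no longer release memory that something else
(here, the slp) has since been given.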
Freeing a pointer twice was causing the core dump.

From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Guddenahalli Naganna, Jayaprakasha

Net-snmp version: 5.2.1

Platform details: the SNMP master agent is running on a switch which has
the “MontaVista 6.0-8.0.7.0300532 2003-12-24” operating system.

Steps to reproduce the crash:

1. The configuration file should contain the following entries
   (in this order):

     master agentx
     agentxSocket localhost:705
     agentaddress 161
     trap2sink 192.168.33.2 public

2. Start the master agent.

3. Remove the trap2sink entry from the configuration file.
   The modified configuration file looks like this:

     master agentx
     agentxSocket localhost:705
     agentaddress 161

4. Send SIGHUP to the master agent.

Now the snmp daemon will crash. The crash is consistent only with the
above procedure.

Here is the gdb trace:

  [EMAIL PROTECTED]:/nh/bin# gdb -d .. master_agent
  GNU gdb 6.0 (MontaVista 6.0-8.0.7.0300532 2003-12-24)
  Copyright 2003 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain conditions.
  Type "show copying" to see the conditions.
  There is absolutely no warranty for GDB.  Type "show warranty" for details.
  This GDB was configured as "powerpc-hardhat-linux"...at
  (gdb) att 351
  Attaching to program: /nh/bin/master_agent, process 351
  Loaded symbols for /lib/libm.so.6
  Loaded symbols for /lib/libresolv.so.2
  Loaded symbols for /lib/libcrypt.so.1
  Loaded symbols for /usr/lib/libelf.so.0
  Loaded symbols for /lib/librt.so.1
  Loaded symbols for /lib/libc.so.6
  Loaded symbols for /lib/libpthread.so.0
  Loaded symbols for /lib/ld.so.1
  Loaded symbols for /lib/libnss_files.so.2
  0x0fe02fac in select () from /lib/libc.so.6
  (gdb) c
  Continuing.

  Program received signal SIGHUP, Hangup.
  0x0fe02fac in select () from /lib/libc.so.6
  (gdb) c
  Continuing.

  Program received signal SIGSEGV, Segmentation fault.
  snmp_sess_select_info (sessp=0x10083490, numfds=0x7ffffc40,
      fdset=0x7ffffab8, timeout=0x7ffffc38, block=0x7ffffc44)
      at gated/src/snmp/libs/snmplib/snmp_api.c:5714
  5714        if (slp->transport->sock == -1) {
  (gdb)
_______________________________________________ Net-snmp-coders mailing list [email protected] https://lists.sourceforge.net/lists/listinfo/net-snmp-coders
