Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's
Hi Andras,

On 20.7.2011 21:35, Tóth András wrote:
> Hi Jiri,
>
> When you mention logs are useless, do you mean you did not find anything in the logs after logging on to the switch which freed up some memory?

Yes, there were no signs of anything unusual in the log. Logging severity is set to notifications.

> Any chance to collect the following command from the switch which freed up some memory during the night?
>
> sh mem allocating-process totals

DC.Cisco.138#sh mem allocating-process totals
             Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor    21585348    19547768     2037580      133081     1374036

PC             Total   Count  Name
0x015D73F4   2202188     277  Process Stack
0x0032C018   1213820    1050  *Packet Header*
0x005B1364    743256      74  Flashfs Sector
0x00F81528    712840       8  Init
0x00E7B38C    523328      85  Init
0x01546F8C    496176      36  TW Buckets
0x0048A008    439340       1  Init
0x01443754    393480       6  STP Port Control Block Chunk
0x01011B34    292956    3149  IPC Zone
0x0032F68C    262720       6  pak subblock chunk
0x00A6BA2C    262232       2  CEF: hash table
0x00489FD8    256300       1  Init
0x0079E27C    250672       2  PM port_data
0x0158BD78    207900     275  Process
0x00339870    203148      57  *Hardware IDB*
0x01011BDC    196740       3  IPC Message Hea
0x0016CDD0    196740       3  Mat Addr Tbl Ch
0x004EE5A8    196652       1  HRM: destination array
0x015F68A8    191876       3  EEM ED ND
0x00E5C79C    184320       2  event_trace_tbs
0x0032C06C    164640       4  *Packet Data*
0x00809DC8    163884       1  Init
0x00949AF4    145484     399  MLDSN L2MCM
0x004F6FA8    135652      29  HULC_MAD_SD_MGR
0x01030A50    133468     383  Virtual Exec
0x013F2930    132728       7  VLAN Manager
0xE8BC        132132      11  DTP Protocol
0x00AD52E0    131976       4  VRFS: MTRIE n08
0x00336804    131116       1  *Init*
0x014271B0    130376      12  SNMP SMALL CHUN
0x007910A8    129948      51  PM port sub-block
0x016F4304    125244    1820  Init
0x009561E4    110676     399  MLDSN L2MCM
0x0048A020    109868       1  Init

Unfortunately, I'm not familiar with the usual values these processes should allocate.

> This might sound stupid, but can you confirm by looking at the uptime that the switch did not crash? If it did, please collect the crashinfo files and send them so I can take a look.
The switch did not crash; its uptime is over 6 weeks now.

> While monitoring the memory usage, if you see a regular increase, collect the following commands several times so you can compare them later to see which process allocates the most memory.
>
> sh proc mem sorted
> sh mem allocating-process totals

Memory graphing is being implemented now. As soon as I have relevant graphs, I will gather the info given by these commands.

Thank you,
Jiri

> Best regards,
> Andras
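The advice above (collect 'sh mem allocating-process totals' several times and compare) can be sketched in a few lines of Python. This is only an illustration, not anything posted in the thread: the function names parse_totals and growth are made up here, and the script assumes each snapshot was saved as plain text with one table row per line.

```python
import re
from collections import defaultdict

# One data row of the table: PC (hex), total bytes, allocation count, name.
# Header and pool-summary lines do not start with "0x", so they are skipped.
ROW = re.compile(r"^\s*(0x[0-9A-Fa-f]+)\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def parse_totals(text):
    """Return {(pc, name): total_bytes} from saved command output."""
    totals = defaultdict(int)
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            pc, total, _count, name = match.groups()
            totals[(pc, name)] += int(total)
    return dict(totals)


def growth(before, after):
    """Return [(delta_bytes, pc, name)], largest increase first.

    Entries whose total did not change are omitted.
    """
    keys = set(before) | set(after)
    deltas = [(after.get(k, 0) - before.get(k, 0), k[0], k[1]) for k in keys]
    return sorted((d for d in deltas if d[0] != 0), reverse=True)
```

Feeding it two snapshots taken a day or more apart and printing the first few entries of growth(before, after) should point at the allocator whose total keeps climbing.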
Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's
Hi Andras,

All I was able to get from the switch was '%% Low on memory; try again later', so I had no chance to get any useful info.

None of them really crashed; even now (a few days after the issue arose) all are forwarding everything without any interruption. The only (doh) problem is that they are refusing any remote/local management.

We have approximately 40 2960s in our network, all upgraded to 12.2(58)SE1 on the same night 42 days ago. To this day four of them have shown this error (the first one a week ago, the rest during the last 7 days).

I will definitely implement graphing of memory usage and monitor this. Logs are useless, as there is absolutely no info regarding this behaviour.

Update: Wow, one of the 'crashed' switches surprisingly managed to free some memory overnight, and there is no problem with remote login now!

DC.Cisco.138#show mem
                Head    Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor    27A819C    21585348    19502124     2083224     1330816     1396804
      I/O        2C0     4194304     2385892     1808412     1647292     1803000
Driver te        1A0     1048576          44     1048532     1048532     1048532

DC.Cisco.138#show proc mem sorted
Processor Pool Total:   21585348 Used:   19506548 Free:    2078800
      I/O Pool Total:    4194304 Used:    2385788 Free:    1808516
Driver te Pool Total:    1048576 Used:         40 Free:    1048536

 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
   0   0   20966064    3684020   13930872          0          0 *Init*
   0   0  349880992  303545656    1758488    4520010     421352 *Dead*
   0   0          0          0     722384          0          0 *MallocLite*
  67   0     531728      17248     463548          0          0 Stack Mgr Notifi
  81   0     488448        232     332392          0          0 HLFM address lea
 104   0    6002260    6886956     234548          0          0 HACL Acl Manager
 151   0    1161020     437668     214108          0          0 DTP Protocol
  59   0     198956   34501644     208516          0          0 EEM ED ND
 163   0     196740          0     203900          0          0 VMATM Callback
 219   0     775680   39872788     186548          0          0 MLDSN L2MCM
  16   0     312148     762860     145736          0     104780 Entity MIB API

Thank you,
Jiri

___
cisco-nsp mailing list cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
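For the memory graphing mentioned in this message, the three pool summary lines at the top of 'show proc mem sorted' are probably the easiest thing to turn into data points. The following Python sketch is an assumption about how that could be done, not the poster's actual setup; parse_pools is a made-up name, and the sample text is the pool summary quoted in this message.

```python
import re

# Matches e.g. "Processor Pool Total:   21585348 Used:   19506548 Free:    2078800"
POOL = re.compile(
    r"^\s*(.+?) Pool Total:\s*(\d+)\s+Used:\s*(\d+)\s+Free:\s*(\d+)"
)


def parse_pools(text):
    """Return {pool_name: {"total", "used", "free", "used_pct"}}."""
    pools = {}
    for line in text.splitlines():
        match = POOL.match(line)
        if not match:
            continue
        name = match.group(1)
        total, used, free = (int(v) for v in match.group(2, 3, 4))
        pools[name] = {
            "total": total,
            "used": used,
            "free": free,
            "used_pct": round(100.0 * used / total, 1),
        }
    return pools


# Pool summary as posted in this message.
SAMPLE = """\
Processor Pool Total:   21585348 Used:   19506548 Free:    2078800
      I/O Pool Total:    4194304 Used:    2385788 Free:    1808516
Driver te Pool Total:    1048576 Used:         40 Free:    1048536
"""

if __name__ == "__main__":
    for pool, stats in parse_pools(SAMPLE).items():
        print(pool, stats["used_pct"], "% used,", stats["free"], "bytes free")
```

On the figures above the Processor pool comes out at roughly 90% used even after the switch recovered, so the free-bytes value of that pool is likely the one worth graphing.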
Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's
It's kind of sad that 2960 switches don't have the 'memory reserve console' command, which would allow you to run some diagnostic commands in this situation.

Sometimes when a memory leak happens, tracebacks are actually logged, so I suggest enabling syslog logging on the switches. That way you can see everything that was logged before the switch runs out of memory and you can no longer run any command.
Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's
Hi Jiri,

When you mention logs are useless, do you mean you did not find anything in the logs after logging on to the switch which freed up some memory?

Any chance to collect the following command from the switch which freed up some memory during the night?

sh mem allocating-process totals

This might sound stupid, but can you confirm by looking at the uptime that the switch did not crash? If it did, please collect the crashinfo files and send them so I can take a look.

While monitoring the memory usage, if you see a regular increase, collect the following commands several times so you can compare them later to see which process allocates the most memory.

sh proc mem sorted
sh mem allocating-process totals

Best regards,
Andras
[c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's
Hi,

a month ago I upgraded a few dozen of our access-layer 2960s to the latest version of IOS (12.2(58)SE1), and during the last few days three of these upgraded switches have suddenly stopped responding to SSH and telnet access. Traffic to and from the ports is still forwarded normally. Connecting over the serial port only gives me '%% Low on memory; try again later'.

The only solution I have found is to reload the switch.

Does anybody else have a similar problem with this version of IOS?

As far as I know, we don't use any special configuration. One feature is nearly hitting the limit (127 STP instances), but we haven't had any problems with this so far.

Thank you for your thoughts.

--
Kind regards,
Jiri Prochazka
Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's
Hi Jiri,

Did you have a chance to collect the output of 'sh log' after logging in via console? If yes, please send it over.

Did you observe a crash of the switch or only the error message? How many times have you seen this so far? How often is it happening? How many 2960 switches running 12.2(58)SE1 do you have in total, and on how many did you see this?

If the switch is working fine now, I would recommend monitoring the memory usage and the rate of increase. Check the logs around that time to see if you find anything related, such as dot1x errors, etc.

Also, consider collecting the following commands when the error message is seen again, and open a Cisco TAC case if possible.

sh log
sh proc mem sorted
sh mem summary
sh mem allocating-process totals
sh tech

Best regards,
Andras
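The "rate of increase" suggested above can be estimated from as few as two free-memory readings. The Python sketch below assumes a simple linear leak, which a real leak may not follow; leak_rate and days_until_exhausted are hypothetical helper names, and the readings are expected to be (timestamp, free bytes) pairs taken from 'sh proc mem sorted' or from an SNMP poller.

```python
from datetime import datetime


def leak_rate(samples):
    """Average bytes of free memory lost per day.

    samples: list of (datetime, free_bytes) tuples ordered by time.
    Only the first and last sample are used (straight-line estimate).
    """
    (t0, free0), (t1, free1) = samples[0], samples[-1]
    days = (t1 - t0).total_seconds() / 86400.0
    if days <= 0:
        raise ValueError("need two samples taken at different times")
    return (free0 - free1) / days


def days_until_exhausted(free_now, rate_per_day):
    """Days left at the given loss rate, or None if memory is not shrinking."""
    if rate_per_day <= 0:
        return None
    return free_now / rate_per_day


if __name__ == "__main__":
    # Made-up readings for illustration only, not values from the thread.
    readings = [
        (datetime(2011, 1, 1), 3000000),
        (datetime(2011, 1, 11), 2000000),
    ]
    rate = leak_rate(readings)
    print("losing", rate, "bytes/day;",
          days_until_exhausted(readings[-1][1], rate), "days left")
```

A steadily positive rate across several switches would be useful evidence to attach to a TAC case, together with the command outputs listed above.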