Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's

2011-07-25 Thread Jiri Prochazka

Hi Andras,

On 20.7.2011 21:35, Tóth András wrote:

Hi Jiri,

When you mention logs are useless, do you mean you did not find
anything in the logs after logging on to the switch which freed up
some memory?



Yup, there were no signs of anything unusual in the log. Logging
severity is set to notifications.




Any chance to collect the following command from the switch which
freed up some memory during the night?
sh mem allocating-process totals


DC.Cisco.138#sh mem allocating-process totals
               Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor      21585348    19547768     2037580      133081     1374036


PC          Total    Count  Name
0x015D73F4  2202188  277    Process Stack
0x0032C018  1213820  1050   *Packet Header*
0x005B1364  743256   74     Flashfs Sector
0x00F81528  712840   8      Init
0x00E7B38C  523328   85     Init
0x01546F8C  496176   36     TW Buckets
0x0048A008  439340   1      Init
0x01443754  393480   6      STP Port Control Block Chunk
0x01011B34  292956   3149   IPC Zone
0x0032F68C  262720   6      pak subblock chunk
0x00A6BA2C  262232   2      CEF: hash table
0x00489FD8  256300   1      Init
0x0079E27C  250672   2      PM port_data
0x0158BD78  207900   275    Process
0x00339870  203148   57     *Hardware IDB*
0x01011BDC  196740   3      IPC Message Hea
0x0016CDD0  196740   3      Mat Addr Tbl Ch
0x004EE5A8  196652   1      HRM: destination array
0x015F68A8  191876   3      EEM ED ND
0x00E5C79C  184320   2      event_trace_tbs
0x0032C06C  164640   4      *Packet Data*
0x00809DC8  163884   1      Init
0x00949AF4  145484   399    MLDSN L2MCM
0x004F6FA8  135652   29     HULC_MAD_SD_MGR
0x01030A50  133468   383    Virtual Exec
0x013F2930  132728   7      VLAN Manager
0xE8BC      132132   11     DTP Protocol
0x00AD52E0  131976   4      VRFS: MTRIE n08
0x00336804  131116   1      *Init*
0x014271B0  130376   12     SNMP SMALL CHUN
0x007910A8  129948   51     PM port sub-block
0x016F4304  125244   1820   Init
0x009561E4  110676   399    MLDSN L2MCM
0x0048A020  109868   1      Init


Unfortunately, I'm not familiar with the usual values these processes
should allocate.
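
One way to make a listing like this easier to read, without knowing the usual values, is to add up the Total column per allocator name so that repeated entries (for example the several "Init" rows) show up as one figure. The following is only a minimal Python sketch: the function name totals_by_name and the regular expression are assumptions for illustration, and the sample rows are a handful copied from the listing in this message, with columns separated by whitespace.

```python
import re
from collections import defaultdict

# One data row: PC address, total bytes, allocation count, allocator name.
# Assumes the columns are separated by whitespace.
ROW = re.compile(r"^(0x[0-9A-Fa-f]+)\s+(\d+)\s+(\d+)\s+(.+?)\s*$")


def totals_by_name(text):
    """Sum the Total column per allocator name, largest sum first."""
    sums = defaultdict(int)
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            sums[m.group(4)] += int(m.group(2))
    return sorted(sums.items(), key=lambda kv: kv[1], reverse=True)


# A few rows taken from the output in this message (not the full table).
SAMPLE = """\
0x015D73F4  2202188  277   Process Stack
0x0032C018  1213820  1050  *Packet Header*
0x00F81528  712840   8     Init
0x00E7B38C  523328   85    Init
0x00949AF4  145484   399   MLDSN L2MCM
0x009561E4  110676   399   MLDSN L2MCM
"""

for name, total in totals_by_name(SAMPLE):
    print(f"{total:>10}  {name}")
```

Grouping by name hides which PC did the allocation, so it is best used as a first overview before looking at individual rows.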





This might sound stupid but can you confirm by looking at the uptime
that the switch did not crash? If it did, please collect the crashinfo
files and send them so I can take a look.


The switch did not crash; its uptime is over 6 weeks now.



While monitoring the memory usage, if you see a regular increase,
collect the following commands several times so you can compare them
later to see which process allocates the most memory.
sh proc mem sorted
sh mem allocating-process totals



Memory graphing is being implemented now. As soon as I have relevant 
graphs, I will gather info given by these commands.





Thank you,


Jiri




Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's

2011-07-20 Thread Jiri Prochazka

Hi Andras,

All I was able to get from the switch was '%% Low on memory; try again
later', so I had no chance to get any useful info.


None of them really crashed; even now (a few days after the issue
arose) all are forwarding everything without any interruption. The only
(doh) problem is that they are refusing any remote/local management.


We have approximately 40 2960's in our network, all of which were
upgraded to 12.2(58)SE1 on the same night 42 days ago. To date, four of
them have shown this error (the first one a week ago, the rest during the
last 7 days).


I will definitely implement graphing of memory usage and monitor this.
The logs are useless, as they contain absolutely no information about
this behaviour.



update: Wow, one of the 'crashed' switches surprisingly managed to free
some memory overnight and there is no problem with remote login now!


DC.Cisco.138#show mem
                Head    Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor    27A819C    21585348    19502124     2083224     1330816     1396804
      I/O        2C0     4194304     2385892     1808412     1647292     1803000
Driver te        1A0     1048576          44     1048532     1048532     1048532




DC.Cisco.138#show proc mem sorted
Processor Pool Total:   21585348 Used:   19506548 Free:    2078800
      I/O Pool Total:    4194304 Used:    2385788 Free:    1808516
Driver te Pool Total:    1048576 Used:         40 Free:    1048536

 PID TTY  Allocated      Freed    Holding    Getbufs    Retbufs Process
   0   0   20966064    3684020   13930872          0          0 *Init*
   0   0  349880992  303545656    1758488    4520010     421352 *Dead*
   0   0          0          0     722384          0          0 *MallocLite*
  67   0     531728      17248     463548          0          0 Stack Mgr Notifi
  81   0     488448        232     332392          0          0 HLFM address lea
 104   0    6002260    6886956     234548          0          0 HACL Acl Manager
 151   0    1161020     437668     214108          0          0 DTP Protocol
  59   0     198956   34501644     208516          0          0 EEM ED ND
 163   0     196740          0     203900          0          0 VMATM Callback
 219   0     775680   39872788     186548          0          0 MLDSN L2MCM
  16   0     312148     762860     145736          0     104780 Entity MIB API
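
For graphing, the three "Pool Total" lines are probably the easiest thing to track over time. The following is a minimal Python sketch, assuming the line layout shown in this message; the function name pool_usage and the regular expression are illustrative assumptions, not part of any Cisco tool. It extracts total, used and free bytes per pool and derives the percentage in use.

```python
import re

# Matches e.g. "Processor Pool Total:   21585348 Used:   19506548 Free:    2078800"
POOL = re.compile(r"^\s*(.+?) Pool Total:\s*(\d+)\s+Used:\s*(\d+)\s+Free:\s*(\d+)")


def pool_usage(text):
    """Return {pool name: (total, used, free, percent used)}."""
    pools = {}
    for line in text.splitlines():
        m = POOL.match(line)
        if m:
            total, used, free = (int(m.group(i)) for i in (2, 3, 4))
            pools[m.group(1).strip()] = (total, used, free, 100.0 * used / total)
    return pools


# Pool summary lines as posted in this message.
SAMPLE = """\
Processor Pool Total:   21585348 Used:   19506548 Free:    2078800
      I/O Pool Total:    4194304 Used:    2385788 Free:    1808516
Driver te Pool Total:    1048576 Used:         40 Free:    1048536
"""

for name, (total, used, free, pct) in pool_usage(SAMPLE).items():
    print(f"{name:<10} {pct:5.1f}% used, {free} bytes free")
```

On these figures the processor pool is roughly 90% used even after the switch freed some memory, which is the number worth plotting.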




Thank you,


Jiri







--
---

Kind regards,


Jiri Prochazka

___
cisco-nsp mailing list  cisco-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/


Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's

2011-07-20 Thread Nikolay Shopik
It's kind of sad that 2960 switches don't have the 'memory reserve console'
command, which would allow you to run some diagnostic commands. Sometimes
when a memory leak happens, tracebacks are actually logged, so I
suggest enabling syslog logging on the switches so that you can see all
the log messages before a switch runs out of memory and you can't run
any commands.




Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's

2011-07-20 Thread Tóth András
Hi Jiri,

When you mention logs are useless, do you mean you did not find
anything in the logs after logging on to the switch which freed up
some memory?

Any chance to collect the following command from the switch which
freed up some memory during the night?
sh mem allocating-process totals

This might sound stupid but can you confirm by looking at the uptime
that the switch did not crash? If it did, please collect the crashinfo
files and send them so I can take a look.

While monitoring the memory usage, if you see a regular increase,
collect the following commands several times so you can compare them
later to see which process allocates the most memory.
sh proc mem sorted
sh mem allocating-process totals
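
The comparison of repeated captures can be sketched as follows. This is a minimal Python example, not a Cisco tool: it assumes the 'sh proc mem sorted' row layout (PID, TTY, Allocated, Freed, Holding, Getbufs, Retbufs, Process), and the names holding and growth are hypothetical. The BEFORE rows are taken from this thread; the AFTER rows are invented purely to show the mechanics.

```python
import re

# Seven numeric columns followed by the process name.
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S.*?)\s*$"
)


def holding(text):
    """Map (PID, process name) -> Holding bytes for one snapshot."""
    result = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            result[(int(m.group(1)), m.group(8))] = int(m.group(5))
    return result


def growth(before, after):
    """Per-process change in Holding between two snapshots, largest first."""
    old, new = holding(before), holding(after)
    deltas = {key: new[key] - old.get(key, 0) for key in new}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)


# Rows from the output posted in this thread.
BEFORE = """\
   0   0   20966064    3684020   13930872          0          0 *Init*
  67   0     531728      17248     463548          0          0 Stack Mgr Notifi
 151   0    1161020     437668     214108          0          0 DTP Protocol
"""

# HYPOTHETICAL later snapshot, made up for illustration only.
AFTER = """\
   0   0   20966064    3684020   13930872          0          0 *Init*
  67   0     631728      17248     563548          0          0 Stack Mgr Notifi
 151   0    1161020     437668     214108          0          0 DTP Protocol
"""

for (pid, name), delta in growth(BEFORE, AFTER):
    print(f"{delta:>10}  PID {pid:<4} {name}")
```

The key is (PID, name) rather than PID alone because several rows share PID 0. A process whose Holding figure keeps rising across captures is the one to mention in a TAC case.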


Best regards,
Andras



[c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's

2011-07-19 Thread Jiri Prochazka

Hi,

A month ago I upgraded a few dozen of our access-layer 2960's to
the latest version of IOS (12.2(58)SE1), and during the last few days
three of these upgraded switches have suddenly stopped responding to
SSH and telnet access. Traffic to and from the ports is still forwarded
normally.


Connecting over the serial port gives me '%% Low on memory; try again
later' in the log. The only solution I have found is to reload the switch.



Does anybody else have a similar problem with this version of IOS?


As far as I know, we don't use any special configuration. One feature is 
nearly hitting the limit (127 STP instances), but we didn't have any 
problems with this so far.




Thank you for your thoughts.



--
---

Kind regards,


Jiri Prochazka



Re: [c-nsp] memory leaking in IOS 12.2(58)SE1 on 2960's

2011-07-19 Thread Tóth András
Hi Jiri,

Did you have a chance to collect the output of 'sh log' after logging
in via console? If yes, please send it over.
Did you observe a crash of the switch or only the error message?
How many times did you see this so far? How often is it happening?
How many 2960 switches running 12.2(58)SE1 do you have in total and on
how many did you see this?

If the switch is working fine now, I would recommend monitoring the
memory usage and the rate of increase. Check the logs around that time
to see if you find anything related, such as dot1x errors, etc.
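
The rate of increase can be estimated from a handful of readings. The following is a minimal Python sketch under stated assumptions: the leak is treated as roughly linear, the function names leak_rate and hours_until_full are hypothetical, and the readings in SAMPLES are made-up values for illustration. Only the pool size is the processor pool total reported in this thread.

```python
def leak_rate(samples):
    """Least-squares slope of used bytes over time, in bytes per hour.

    samples: list of (hours_since_first_sample, used_bytes) pairs.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    num = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("need at least two samples taken at different times")
    return num / den


def hours_until_full(samples, total_bytes):
    """Extrapolate linearly from the last sample; None if usage is not growing."""
    rate = leak_rate(samples)
    if rate <= 0:
        return None
    return (total_bytes - samples[-1][1]) / rate


# HYPOTHETICAL readings of Used(b), one per day; not measured data.
SAMPLES = [(0, 19_000_000), (24, 19_100_000), (48, 19_200_000)]
POOL_TOTAL = 21_585_348  # processor pool size reported in this thread

rate = leak_rate(SAMPLES)
print(f"growth: {rate:.0f} bytes/hour")
print(f"hours until the pool is exhausted: {hours_until_full(SAMPLES, POOL_TOTAL):.0f}")
```

In practice management access may fail well before the pool is completely full, so the estimate is an upper bound at best.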

Also, consider collecting the following commands when the error
message is seen again and open a Cisco TAC case if possible.
sh log
sh proc mem sorted
sh mem summary
sh mem allocating-process totals
sh tech

Best regards,
Andras

