RE: Cisco IOS Failure due to Virus

2003-09-15 Thread Mark Segal

Gotta love NANOG..

A nice man from Cisco called me. It looked like a lot of packets on my
router were being process switched (sh ip cache displayed A LOT of
entries).  Anyway, it turns out some of my ATM sub-ints inherited a "no ip
route-cache cef" from a parent int, and, well, you can see what happens when
the packet volume increases.

Richard, I would check that..
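
A rough sketch of how one might check for the condition described above, assuming an IOS 12.x image with CEF enabled globally. The interface name ATM1/0.100 is a placeholder and does not come from the thread, and the exact output wording varies by release.

```
! Exec mode: see whether CEF is actually enabled on the sub-interface
show cef interface ATM1/0.100
show ip interface ATM1/0.100 | include CEF

! Exec mode: a large, fast-growing route cache suggests traffic is
! not being CEF switched
show ip cache

! Config mode: re-enable CEF switching on the sub-interface
configure terminal
 interface ATM1/0.100
  ip route-cache cef
 end
```

Because the setting can be inherited from the parent interface, it is probably worth checking each sub-interface individually rather than trusting the global "ip cef" line.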

So now to lift the rate-limit and see what happens..

Regards,
Mark


--
Mark Segal 
Director, Network Planning
FCI Broadband 
Tel: 905-284-4070 
Fax: 416-987-4701 
http://www.fcibroadband.com

Futureway Communications Inc. is now FCI Broadband



RE: Cisco IOS Failure due to Virus

2003-09-15 Thread Mark Segal


We are seeing the same problem on all of the 6400 NRP aggregation boxes we
have in the network.  Here is the IOS bug ID: CSCec12495. Actually, since we
started rate limiting ICMP on our network, the problems have stopped or
slowed down a lot.
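
For readers wondering what rate limiting ICMP can look like, here is a minimal sketch using committed access rate (CAR). The ACL number, interface name and rate values are illustrative assumptions, not the settings used on the network described above, and CAR availability depends on platform and IOS release.

```
! Match ICMP echo and echo-reply (ACL number 150 is arbitrary)
access-list 150 permit icmp any any echo
access-list 150 permit icmp any any echo-reply
!
interface ATM1/0.100
 ! Allow roughly 256 kbps of matching ICMP inbound and drop the excess
 ! (values are: bps, normal burst in bytes, maximum burst in bytes)
 rate-limit input access-group 150 256000 8000 8000 conform-action transmit exceed-action drop
```

Unlike an outright filter, this keeps ping and traceroute working at low volume while capping the damage a scanning worm can do.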

Sorry for the delay.. Was out of the country for a while..
Mark




Re: Cisco IOS Failure due to Virus

2003-09-12 Thread Stephen J. Wilcox


On Fri, 12 Sep 2003, Petri Helenius wrote:

> 
> Stephen J. Wilcox wrote:
> 
> > Hi,
> > we've seen this. You need to make sure you filter the Nachi worm's 92-byte
> > ICMP echoes on your interfaces and it will be fine. The problem seems to be
> > input buffers which use up all the memory for some reason.
> >
> This sounds vaguely similar to the recent IOS buffers stuck issue.

No, it's quite different.

1:
With the vuln., the buffer filled up and could not be emptied without a reboot.

With Nachi, the buffer doesn't seem to fill, and an ACL or shutting the interface
will solve the problem whilst the router stays up.

2:
With the vuln., the outcome was that the particular interface stopped forwarding
traffic.

With Nachi, the router runs out of main memory and starts dropping processes
because of malloc failures.


FYI, I have only encountered the Nachi problem on a few PE routers which were old
and had little memory anyway, e.g. Cisco 2500s. Presumably the buffer filling isn't
a memory leak, and providing there is enough spare memory the router won't be
affected in this way.

Steve



Re: Cisco IOS Failure due to Virus

2003-09-12 Thread Petri Helenius

Stephen J. Wilcox wrote:

> Hi,
> we've seen this. You need to make sure you filter the Nachi worm's 92-byte
> ICMP echoes on your interfaces and it will be fine. The problem seems to be
> input buffers which use up all the memory for some reason.

This sounds vaguely similar to the recent IOS buffers stuck issue.

Pete




Re: Cisco IOS Failure due to Virus

2003-09-11 Thread Stephen J. Wilcox

Hi,
we've seen this. You need to make sure you filter the Nachi worm's 92-byte
ICMP echoes on your interfaces and it will be fine. The problem seems to be
input buffers which use up all the memory for some reason.
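
A filter of this kind is commonly built with policy routing rather than a plain ACL, since ACLs in IOS releases of this era cannot match on packet length. The sketch below follows the general approach in Cisco's Nachi mitigation guidance; the ACL number, route-map name and interface are placeholders. It also drops legitimate 92-byte echoes, which some traceroute implementations send, so it has a side effect on customer troubleshooting.

```
! Match ICMP echo and echo-reply
access-list 199 permit icmp any any echo
access-list 199 permit icmp any any echo-reply
!
! Send matching packets that are exactly 92 bytes long to Null0
route-map nachi-worm permit 10
 match ip address 199
 match length 92 92
 set interface Null0
!
! Apply the policy on each interface where the worm traffic arrives
interface FastEthernet0/0
 ip policy route-map nachi-worm
```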

Steve



RE: Cisco IOS Failure due to Virus

2003-09-10 Thread Niaz, Wajahat

Try using only CEF. I have seen similar problems in the past using dCEF.


Re: Cisco IOS Failure due to Virus

2003-09-10 Thread Richard J . Sears

Hi Robert,

Thanks for the info. We are running dCEF...routers show about 4% CPU
load and the following memory:


BR02#sh mem
                Head    Total(b)     Used(b)     Free(b)   Lowest(b)  Largest(b)
Processor   613AE340   247798976   106515996   141282980   140653360   134546752
     Fast   6138E340      131080       37240       93840       93840       93788


Also, we are not blocking 92-byte ICMP because of the traceroute problems
that causes on customers' networks...

Thanks



**
Richard J. Sears
Vice President 
American Digital Network  

[EMAIL PROTECTED]
http://www.adnc.com

858.576.4272 - Phone
858.427.2401 - Fax


I fly because it releases my mind 
from the tyranny of petty things . . 


"Work like you don't need the money, love like you've
never been hurt and dance like you do when nobody's
watching."



Re: Cisco IOS Failure due to Virus

2003-09-10 Thread Robert Blayzor

On 9/10/03 10:58 PM, "Richard J.Sears" <[EMAIL PROTECTED]> wrote:

> %SYS-2-MALLOCFAIL: Memory allocation of 704 bytes failed from
> 0x60329F00, alignment 0
> Pool: Processor  Free: 92744  Cause: Memory fragmentation
> Alternate Pool: None  Free: 0  Cause: No Alternate pool
> -Process= "Pool Manager", ipl= 0, pid= 6
> -Traceback= 6038049C 60382200 60329F08 6038DEDC
> 
> %TCP-6-NOBUFF: TTY0, no buffer available
> -Process= "BGP Router", ipl= 0, pid= 132
> 
> %% Low on memory; try again later

Did you enable CEF?
Are you dropping 92 byte ICMP packets where needed?

--
Robert Blayzor, BOFH
INOC, LLC
[EMAIL PROTECTED]
PGP: http://www.inoc.net/~dev/
Key fingerprint = A445 7D1E 3D4F A4EF 6875  21BB 1BAA 10FE 5748 CFE9

"I don't need parents. All I need is a recording that says, 'Go play
outside!" - Calvin and Hobbes




Cisco IOS Failure due to Virus

2003-09-10 Thread Richard J . Sears

Hey Everyone - 

We have two 7507 routers configured with dual RSP4s w/256MB RAM,
VIP2-50s with 128/8MB RAM, Gig, POSIP OC3 and Fast Ethernet
interfaces.

These routers have run flawlessly for over two years now. But about
two weeks ago, all of a sudden we started having serious crashing
problems with these two routers. The routers will lose BGP
connectivity (one at a time) to our upstreams (configured on each
router). First, we would see a "keepalive not sent" message, then a BGP
hold timer expiry, then the BGP peering session would go down. OSPF
would start crashing, then we would see the memory error messages,
then all interfaces would blink offline. (Note: we are running the
max memory we can on both the RSPs and the VIPs.)

Within 1 minute, the exact same thing would happen to the other
router. Oftentimes we would have to reboot the router to get it to
come back online. We would see the following errors and have to reboot
multiple times to get the router back:


%SYS-2-MALLOCFAIL: Memory allocation of 704 bytes failed from
0x60329F00, alignment 0
Pool: Processor  Free: 92744  Cause: Memory fragmentation
Alternate Pool: None  Free: 0  Cause: No Alternate pool
-Process= "Pool Manager", ipl= 0, pid= 6
-Traceback= 6038049C 60382200 60329F08 6038DEDC

%TCP-6-NOBUFF: TTY0, no buffer available
-Process= "BGP Router", ipl= 0, pid= 132

%% Low on memory; try again later

GigabitEthernet1/1/0: keepalive not sent



We are running the latest S train IOS patched for the IPv4 issue.
However, downgrading to the code we had run for the previous year did
not solve the problem, nor did replacing the RSPs, VIPs and interfaces
with new cards. In addition, we have complied with the Cisco
recommendations for mitigating the effects of the Nachi worm.

http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_tech_note09186a00801b143a.shtml

We also shut down one of the routers totally and the other router
still experienced the same issue.

None of these updates or fixes have solved the problem.

I am thinking it may have something to do with all the virus stuff
running around (the same thing was crashing my Lucent TNTs), but I cannot
seem to get an answer from Cisco, nor can I find anyone seeing the same
issues.

Hopefully someone here can shed some light on this problem.

Thanks in Advance


Richard

I fly because it releases my mind 
from the tyranny of petty things . .