Re: OSPF design [7:40269]

[EMAIL PROTECTED] Thu, 04 Apr 2002 22:10:25 -0800

Comments below...

Thanks,
JMcL
----- Forwarded by Jenny Mcleod/NSO/CSDA on 05/04/2002 03:25 pm -----

"Howard C. Berkowitz" 
Sent by: [EMAIL PROTECTED]
05/04/2002 02:09 pm
Please respond to "Howard C. Berkowitz"

        To:     [EMAIL PROTECTED]
        cc: 
        Subject:        Re: OSPF design [7:40269]

Jenny,

First, I apologize for not giving more of a response earlier, but 
it's been a crazy few days...three people in my office, including 
myself, have had close relatives/friends in surgery and there have 
been a lot of distractions.
JMcL: Err.. yes, I can see how that would be distracting.  Thanks for 
taking the time for this. /JMcL

I'm going to post and elaborate a bit on some observations I sent to 
you earlier, but I'm interested in why and how you have so much core 
trouble.  Could you give us an idea of the number of routes and of 
routers, and the stability of both, in the non-backbone areas?  Are 
the ABRs and any pure backbone routers doing any other 
processor-intensive tasks?
JMcL: The non-backbone areas (about twenty of them) vary quite a bit in 
size as they map (or did once) to geographic/administrative regions.  As 
they consist of multiple geographically-dispersed small offices with two 
routers each (for redundancy), they are pretty router-rich - the smallest 
area has 20 routers and 21 networks, the largest (I think) has 52/49 in 
area x.1.0.0 and 29/27 in x.2.0.0. 
While they aren't too bad for stability, the sheer number of sites means 
that something is usually playing up somewhere :-(
The ABRs mentioned in the problem below aren't doing anything very 
exciting, but some of the core routers have a fair load.  There are 
currently 50 routers in the backbone area - the backbone area is spread 
across two data centres and the ABRs mentioned (which are in sites around 
the country - they have WAN connections to the data centres, not LAN). 
Core routers in the data centres also support CIP cards, may be ABRs for 
other areas (we're not very good at "pure" backbone routers ;-), and until 
recently terminated stacks of DLSw circuits. 
We also have adjusted the OSPF timers throughout the network to make them 
more sensitive - this because we had SNA traffic (first via RSRB, then 
DLSw) and we wanted fast failover.  This worked, but does make OSPF a bit 
more inclined to hysteria when there are links flapping.  The SNA traffic 
is now being phased out as we have moved to TN3270, but the timers haven't 
all been changed back yet.
We possibly could go back to advertising each site separately now, since 
we've reduced the load in the core by various other methods, but I 
wouldn't want to battle the layer 8 issues to do it.
/JMcL

There can be creative solutions if you think outside the traditional 
OSPF box. Hypothetically, if your address plan split geographically, 
it might even be an idea to have an eastern and western OSPF domain 
(i.e., an area 0.0.0.0 and a set of nonzero areas), linked by 
redundant static routes or possibly BGP.  The latter is especially 
useful if you have multiple ISP connections.  Remember also that a 
router can have multiple OSPF processes, so the same router could 
participate in different domains. I assume your user population 
stretches across at least three time zones, so this sort of redesign 
might localize some core thrashing.
JMcL: I don't think we have too much time-based thrashing - even though 
most of our population is in the same time zone (especially in winter). 
Splitting our core is something that has frequently been considered, and 
in fact it was originally split, with very ugly redistribution using IGRP, 
which caused more problems than it solved.  As I mentioned, we may be 
doing a major redesign in the medium term and this is something we can 
consider again.
/JMcL 

>Peter's summarisation of the problem (pardon the pun) is a very good one
>- and very useful, as I hadn't really considered the broader case of
>overlapping summarisation in general.

I have a question for Peter a little later -- it can be interesting 
to contrast how different routing software deals with an implementer 
choice, and I don't know how JunOS deals with a particular situation.

>The chance of a major redesign
>simply to fix this problem is approximately the same chance as me winning
>an Olympic medal, but we may well be doing a major redesign/readdressing
>"soon" anyway, so that is something I can add to the list of
>considerations - then it may be quite feasible to put all the sites on
>Rtr1 into area 2.1.0.0 and all the sites on Rtr2 into area 2.2.0.0.  Any
>thoughts on where the local ethernets should go if we did that?  I guess
>whatever area they go in would have to be defined on both routers, and
>that might bring up issues of where we summarise again.  Hmm.  I'll have
>to think about that.
>
>One thing I'm not clear on, though, is why the problem (reportedly)
>happened before we upgraded to IOS 12.1 - so before a route to null0 was
>used for the summarised networks (we didn't add one manually).  Any ideas?
>I can understand why it's happening now, so this is more for my curiosity
>and understanding.
>
>Peter, when you say that the solution could involve "less specific
>summaries" - do you really mean more specific summaries?  Summarising less
>drastically (e.g. summarising each site separately) isn't a good solution
>in this particular case because it creates too much load in the core -
>that's how we used to do it but it created other problems.

One of the interesting things about the OSPF specification is that it 
leaves a lot of room to the implementer on handling summarization 
when some of the more-specific routes become unreachable from one ABR 
in a multiple-ABR area.  I know of at least two ways this has been 
implemented, and I wish both were selectable -- there's a place for 
each.

With IOS, if you have two ABRs in the same area, announcing the same 
summary, and the area becomes partitioned, the summaries continue to 
be announced. The rationale here is that the greater stability is 
worth some loss in connectivity. In other words, it's a static 
process of generating the summary.

In Bay RS, in the same situation, you tell the router what 
more-specifics belong to a summary.  If any of them become 
unreachable, the ABR stops announcing the summary and announces the 
remaining more-specifics into the core. The different rationale here 
is that accuracy is more important than increased route thrashing.

As you should be able to see, each of these can be valid assumptions 
depending on your network objectives.  Peter, how does JunOS deal 
with this situation?
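
To make the contrast concrete, here is a rough Python sketch of the two policies as described above. It is a toy model, not vendor code: the function name, the policy labels "static" and "strict", and the rule that the summary is dropped only when nothing in the range is reachable are all assumptions made for illustration. The prefixes are the ones quoted later in this thread.

```python
from ipaddress import ip_network


def announced(summary, components, reachable, policy):
    """Prefixes one ABR would announce into area 0 under a given policy.

    policy "static": keep the summary while anything in it is reachable.
    policy "strict": withdraw the summary as soon as any listed
    more-specific is lost, and announce the survivors instead.
    """
    summary_net = ip_network(summary)
    listed = [ip_network(c) for c in components]
    reachable_set = {ip_network(r) for r in reachable}
    up = [c for c in listed if c in reachable_set]
    if not up:
        return []  # assumption: an empty range announces nothing
    if policy == "static":
        return [summary_net]
    if policy == "strict":
        return [summary_net] if len(up) == len(listed) else up
    raise ValueError(f"unknown policy: {policy}")


# Area 2.1.0.0 WAN prefixes from the thread; after a partition Rtr1
# only reaches its own two sites.
parts = ["2.101.0.0/16", "2.109.0.0/16", "2.104.0.0/16", "2.120.0.0/16"]
rtr1_sees = ["2.101.0.0/16", "2.109.0.0/16"]
print(announced("2.0.0.0/9", parts, rtr1_sees, "static"))
print(announced("2.0.0.0/9", parts, rtr1_sees, "strict"))
```

Under "static" the core keeps a single stable route but may send traffic to an ABR that cannot deliver it; under "strict" the core sees more churn but only accurate routes.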

What would be really nice is if Cisco extended BGP conditional 
advertisement to IGPs, and introduced a knob to have the default 
behavior overridden by conditional.

JMcL: I did find an undocumented command (well, not in the command 
reference), which sounds like it turns off the generation of the discard 
route, but I guess that's not nearly so powerful as a conditional 
advertisement (I've learned something else new - I'm not very familiar 
with BGP and had to look up conditional advertisements)
/JMcL
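
For anyone following along, the theory in the original post (the Null0 discard route on Rtr1 swallowing traffic for sites behind Rtr2) can be sketched as a longest-prefix-match lookup. This is only a toy model of Rtr1's table after the partition: the next-hop labels are made up, and leaving Rtr2's copy of the 2.0.0.0 summary out of the table reflects the theory in the post, not verified IOS behaviour. The two ranges come from the area range statements quoted below.

```python
from ipaddress import ip_address, ip_network


def lookup(table, destination):
    """Longest-prefix match; returns the next hop, or None if no route."""
    dest = ip_address(destination)
    matches = [(net, hop) for net, hop in table if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]


# Rtr1 after the area 2.1.0.0 adjacency to Rtr2 is lost.
rtr1 = [
    (ip_network("2.101.0.0/16"), "wan-site-a"),  # hypothetical label
    (ip_network("2.109.0.0/16"), "wan-site-b"),  # hypothetical label
    # area 2.1.0.0 range 2.0.0.0 255.128.0.0 -> local discard route
    (ip_network("2.0.0.0/255.128.0.0"), "Null0"),
    # area 2.2.0.0 range 2.128.0.0 255.224.0.0 -> learned from Rtr2
    (ip_network("2.128.0.0/255.224.0.0"), "Rtr2 via area 0"),
]

for dest in ("2.101.5.5", "2.120.1.1", "2.130.1.1"):
    print(dest, "->", lookup(rtr1, dest))
```

In this model a site in 2.120.0.0/16 (behind Rtr2, but inside the area 2.1.0.0 range) falls through to Null0, while 2.130.0.0/16 in area 2.2.0.0 still follows Rtr2's summary, which matches the symptoms reported.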

>I think in this case I'll be going for the "protect against partitioning"
>solution and bung in another cable.

Sounds good.  A general rule -- always have at least two paths 
between pairs of ABRs in an area.
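
As a quick sanity check of that rule, the sketch below tests whether two ABRs stay connected inside an area after any single link failure. It is a simplified illustration: the node names and the single "lan-switch" standing in for the local ethernets are hypothetical, and only link failures are modelled, not node failures.

```python
from collections import deque


def connected(links, a, b):
    """Breadth-first search over undirected intra-area links."""
    neighbours = {}
    for x, y in links:
        neighbours.setdefault(x, set()).add(y)
        neighbours.setdefault(y, set()).add(x)
    seen, queue = {a}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def survives_any_single_failure(links, a, b):
    """True if a and b are connected now and after removing any one link."""
    return connected(links, a, b) and all(
        connected(links[:i] + links[i + 1:], a, b) for i in range(len(links))
    )


# Area 2.1.0.0 as a chain through one switch, then with a direct cable added.
before = [("Rtr1", "lan-switch"), ("lan-switch", "Rtr2")]
after = before + [("Rtr1", "Rtr2")]
print(survives_any_single_failure(before, "Rtr1", "Rtr2"))
print(survives_any_single_failure(after, "Rtr1", "Rtr2"))
```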

>
>Thanks for comments - very useful.
>
>JMcL
>
>----- Forwarded by Jenny Mcleod/NSO/CSDA on 05/04/2002 08:47 am -----
>
>
>"Priscilla Oppenheimer"
>Sent by: [EMAIL PROTECTED]
>05/04/2002 04:39 am
>Please respond to "Priscilla Oppenheimer"
>
>
>         To:     [EMAIL PROTECTED]
>         cc:
>         Subject:        Re: OSPF design [7:40269]
>
>
>At 11:59 AM 4/4/02, Chuck wrote:
>>that was going to be my guess as well. I've done a number of lab experiments
>>with similar themes, and have in my own mind at least, confirmed what is
>>stated in the RFC - that the only serious routing issue with partitioned
>>non-backbone areas results from overlapping
>
>She does seem to have overlapping summarization, if that makes sense. She
>said:
>
>The area range statements on Rtr2 are...
>[various area 0 range statements snipped]
>   area 2.1.0.0 range 2.0.0.0 255.128.0.0
>   area 2.2.0.0 range 2.128.0.0 255.224.0.0
>
>On Rtr1 the statements are...
>[same area 0 range statements snipped]
>   area 2.1.0.0 range 2.0.0.0 255.128.0.0
>
>If you look at her ASCII art e-mail, you'll see that the WAN links were not
>assigned contiguously unless I'm missing something. Rtr1 has 2.101.0.0/16
>and 2.109.0.0/16. Rtr2 has 2.120.0.0/16, 2.104.0.0/16, and 2.130.0.0/16.
>
>It's probably too late now, but perhaps if all the WAN links connected to
>Rtr1 had been summarizable into a group that was distinct from the WAN
>links connected to Rtr2, she wouldn't have the problem?? (Of course, she
>has that area 2.2.0.0 to deal with too, but perhaps it could be something
>different entirely....)
>
>But I don't think she's looking for a redesign. She's looking for a quick
>fix for now. What did you guys think of the idea of adding another direct
>connection between the two switches and putting it in area 2.1.0.0?
>
>Priscilla
>
>
>>Chuck
>>
>>"Peter van Oene" wrote in message
>>news:[EMAIL PROTECTED]...
>>  > Hi Jenny,
>>  >
>>  > Is it safe to say that your problem is that when your non-backbone
>>  > area becomes partitioned, you lose reachability to one side of the
>>  > partition?  When you use large summaries to describe entire areas and
>>  > have multiple entry points into those areas themselves, this is a
>>  > normal occurrence.  If this is the problem, the solution likely
>>  > involves the use of less specific summaries per ABR, and/or greater
>>  > L2 resiliency to protect against partitions.  If that's not the
>>  > problem, can you indicate where I've misread the problem description?
>>  >
>>  > Thanks
>>  >
>>  > Pete
>>  >
>>  >
>>  >
>>  > At 09:05 PM 4/2/2002 -0500, [EMAIL PROTECTED] wrote:
>>  > >Hi all,
>>  > >
>>  > >This is actually a real-life scenario, but I think it throws up some
>>  > >interesting points about OSPF that some people may not have come
>>  > >across.  And it has a couple of bits that I don't understand.  Please
>>  > >excuse the verbosity.
>>  > >
>>  > >Currently, (part of) this particular network is as described below.
>>  > >It normally works fine, but during certain types of failures,
>>  > >connectivity breaks although there is still a physical path.  I am
>>  > >contemplating what the best way to fix it would be, and would be
>>  > >interested in comments.
>>  > >
>>  > >Set-up - I don't think my ascii art is up to this but I'll give it a
>>  > >go if the description isn't clear enough:
>>  > >
>>  > >Two ABRs (Rtr1 and Rtr2), running IOS 12.1, connected to each other
>>  > >by a direct ethernet cable in area 0, and also by several local
>>  > >ethernet networks in area 2.1.0.0.  The details of the local
>>  > >ethernets can probably remain a fluffy cloud, but note that failure
>>  > >of a single component can potentially cause all area 2.1.0.0
>>  > >neighbour connectivity between Rtr1 and Rtr2 to be lost, although
>>  > >the local ethernets may remain up on one or both routers.
>>  > >
>>  > >Both routers have a connection back to the core of the network (on
>>  > >Rtr2 it is dialup, so not usually active), which is in area 0.  Both
>>  > >routers have WAN links to several sites (not dual-homed - each site
>>  > >has a link to only one ABR), in area 2.1.0.0.  Rtr2 may also have
>>  > >WAN links to several sites in area 2.2.0.0, but that's probably not
>>  > >too relevant.
>>  > >
>>  > >Both ABRs summarise the networks in area 2.1.0.0 to a single summary
>>  > >network (Rtr2 summarises the networks in 2.2.0.0, if any, to another
>>  > >summary network).
>>  > >
>>  > >This usually works fine - traffic from the core to sites connected
>>  > >to Rtr2 (in area 2.1.0.0) travels from Rtr1 to Rtr2 across the local
>>  > >ethernets (area 2.1.0.0), and in reverse from Rtr2 to Rtr1 across
>>  > >the Area 0 ethernet.  This, while perhaps not ideal, is as expected,
>>  > >and works well under normal circumstances.  (If you're not sure why
>>  > >this is expected, read up on hot potato routing policy - Howard gave
>>  > >a good description in the context of stub areas in
>>  > >http://www.groupstudy.com/archives/cisco/200001/msg01579.html)
>>  > >
>>  > >The problem happens if the area 2.1.0.0 neighbour connections
>>  > >between Rtr1 and Rtr2 are lost.  Even though there is still an area
>>  > >0 link between them, area 2.1.0.0 sites connected to Rtr2 lose
>>  > >connectivity to the core.  Area 2.2.0.0 sites are OK (this is good -
>>  > >I'd be really confused if they lost it too).
>>  > >Despite Doyle claiming that partitioned non-backbone areas are not a
>>  > >problem (he does, on page 462 of Routing TCP/IP Vol 1), it seems
>>  > >they can be.  As far as I can see, it's because when summarising the
>>  > >2.1.0.0 networks, Rtr1 also installs a route to null0 for the
>>  > >summary route - which overrides the summary route that Rtr2
>>  > >generates (and which would otherwise cover the 'lost' sites).
>>  > >
>>  > >I can see a couple of possibilities for fixing this...
>>  > >1) Install a second direct ethernet cable between Rtr1 and Rtr2, in
>>  > >area 2.1.0.0.  This may not be particularly elegant, but it should
>>  > >be comparatively easy to do and effective (there are plenty of spare
>>  > >ethernet ports).  It also has the useful side-effect of getting the
>>  > >through traffic off the local ethernets.
>>  > >
>>  > >2) Use the "no discard-route internal" command - this doesn't appear
>>  > >to be documented but is mentioned at
>>  > >http://www.cisco.com/warp/public/104/3.html#12.0
>>  > >I haven't tested it, but I think it should prevent the null0 route
>>  > >from being installed by Rtr1, so my theory is that then the summary
>>  > >generated by Rtr2 should come into play.  This, of course, goes
>>  > >against all Cisco recommendations, which say that having the null0
>>  > >route is A Good Thing to prevent routing loops.
>>  > >
>>  > >3) Muck about with the arrangement of switches within the internal
>>  > >networks.  I think this will cause more trouble than it's worth,
>>  > >since any rearrangement has to be duplicated at twenty sites.  In
>>  > >theory at least, the whole network may be redesigned from scratch
>>  > >over the next year or so, so a quick and dirty fix isn't necessarily
>>  > >a problem.
>>  > >
>>  > >BUT... I am also not positive that my understanding of what is
>>  > >happening and why is correct, because the support guys have told me
>>  > >that this problem has been around since we were running IOS 11.2 on
>>  > >the ABRs (not that long ago, believe it or not), and I'm pretty sure
>>  > >that no route to null0 was being generated then (summarisation was
>>  > >the same).
>>  > >So can anyone explain to me why connectivity would fail even if no
>>  > >null0 route was being generated?  What am I missing?
>>  > >And does anyone feel like commenting on the options for fixing it?
>>  > >
>>  > >JMcL
>________________________
>
>Priscilla Oppenheimer
>http://www.priscilla.com

Message Posted at:
http://www.groupstudy.com/form/read.php?f=7&i=40564&t=40269
