Re: [j-nsp] BGP output queue priorities between RIBs/NLRIs
> > Can you do the EVPN routes on a separate session (different loopback on
> > both ends, dedicated to EVPN-afi-only BGP)?

Separate sessions would help if the TCP socket were the real issue, but here it clearly is not.

> Or separate RRs?

Sure, that may help. In fact, even a separate RPD daemon on the same box may help :) But what seems weird is the last statement: "This has problems with blackholing traffic for long periods in several cases,..."

We as an industry solved this problem many years ago, by clearly decoupling the connectivity-restoration term from the protocol-convergence term. IMO protocols can take as long as they like to "converge" after a bad or good network event, yet connectivity restoration upon any network event within a domain (RRs were brought up as an example) should be at most 100s of ms. Clearly sub-second. How:

- The RIB tracks next hops, and when they go down (known via fast IGP flooding) or their metric changes, paths with such a next hop are either removed or best-path selection is re-run
- The data plane has precomputed backup paths, and switchover happens in the PIC fashion, in parallel to and independent of any control-plane stress

I think this would be the recommended direction, rather than mangling BGP code to optimize here while at the same time causing new, maybe more severe, issues somewhere else.

Sure, per-SAFI refresh should be the norm, but I don't think this is the main issue here.

Thx,
R.

___
juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
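The two bullets above can be illustrated with a toy model (purely a sketch - class and method names like `nexthop_down` are invented for illustration, not any vendor's API): the failover decision is keyed on the next hop, not on re-running BGP over every prefix.

```python
# Illustrative sketch of PIC-style restoration decoupled from BGP
# convergence: backup paths are precomputed at install time, and a
# next-hop-down event (learned via fast IGP flooding) flips traffic to
# the backup in O(next-hops), not O(BGP prefixes).

class Fib:
    def __init__(self):
        self.routes = {}   # prefix -> (primary_nh, backup_nh)
        self.nh_up = {}    # next-hop -> liveness

    def install(self, prefix, primary_nh, backup_nh):
        # Backup is precomputed here, long before any failure.
        self.routes[prefix] = (primary_nh, backup_nh)
        self.nh_up.setdefault(primary_nh, True)
        self.nh_up.setdefault(backup_nh, True)

    def nexthop_down(self, nh):
        # Triggered by IGP flooding; no BGP work needed for restoration.
        self.nh_up[nh] = False

    def lookup(self, prefix):
        primary, backup = self.routes[prefix]
        return primary if self.nh_up[primary] else backup

fib = Fib()
fib.install("203.0.113.0/24", primary_nh="PE1", backup_nh="PE2")
assert fib.lookup("203.0.113.0/24") == "PE1"
fib.nexthop_down("PE1")
assert fib.lookup("203.0.113.0/24") == "PE2"
```

BGP can then take as long as it likes to reconverge; traffic already flows via the backup.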
Re: [j-nsp] Slow RE path 20 x faster then PFE path
Yes, NAT is configured there, as I indicated via the presence of the si- phantom load ... Having NAT there is not my idea though :). But sorry, I cannot share the config.

If you could shed some more light on your comment about how to properly configure it and what to avoid, I think it may be very useful for many folks on this list.

Many thx,
R.

On Tue, Mar 24, 2020 at 5:00 AM Alexander Arseniev wrote:
> Hello,
>
> > Another interesting observation is that show command indicated services
> > inline input traffic over 33 Mpps zero output while total coming to the box
> > was at that time 1 Mpps
>
> Do You have inline NAT configured on this box? Is it possible to share the
> config please?
> It is quite easy to loop traffic with NAT (inline or not) and while looped
> inside same box, TTL does not get decremented so You end up with eternal
> PFE saturation.
>
> Thanks
> Alex
Re: [j-nsp] Slow RE path 20 x faster then PFE path
Not really: if I ping .209 from the Internet side without IP options (this is the ISP edge) I get 30 ms RTT across Europe. Besides, this is not the only MX104 which is slow in the fast path. If and when Juniper finds an answer and makes it public, I will share it with the list.

Thx,
R.

On Mon, Mar 23, 2020 at 9:32 PM Timur Maryin wrote:
>
> On 23-Mar-20 14:03, Robert Raszuk wrote:
> > Hi,
> >
> > Would anyone have any idea why IP packets with options are forwarded via
> > MX104 20x faster than regular IP packets ?
> >
> > "fast" PFE path - 24-35 ms
> > "slow" RE path - 1-4 ms
>
> 24 ms is ages in terms of PFE.
> I can hardly imagine that is possible.
>
> Is it possible that .209 answers faster to packets with options?
Re: [j-nsp] Slow RE path 20 x faster then PFE path
PFE bugs ... but I will happily let the vendor handle it these days :)

On Mon, Mar 23, 2020 at 6:30 PM Mark Tinka wrote:
>
> On 23/Mar/20 19:25, Robert Raszuk wrote:
>
> > Hi Mark,
> >
> > Oh yes ... exact same config and same setup and even same hw (mx104)
> > works fine in other location giving consistent 1 ms without IP options
> > and mostly 1-3 ms with IP options - but that is all fine. Going to RE
> > is always non deterministic :)
>
> That is quite curious.
>
> Given the same kit in another location and the results, what immediately
> stands out to you?
>
> Mark.
Re: [j-nsp] Slow RE path 20 x faster then PFE path
Hi Mark,

Oh yes ... exact same config, same setup, and even the same hw (MX104) works fine in another location, giving a consistent 1 ms without IP options and mostly 1-3 ms with IP options - but that is all fine. Going to the RE is always non-deterministic :)

Another interesting observation is that a show command indicated services inline input traffic of over 33 Mpps with zero output, while the total traffic coming into the box at that time was 1 Mpps.

Thx,
R.

On Mon, Mar 23, 2020 at 5:47 PM Mark Tinka wrote:
>
> On 23/Mar/20 17:37, Saku Ytti wrote:
> > I'm not sure what you mean by 'PFE loops', this is a single-NPU,
> > fabricless platform, all ports are local. The only way to delay a packet
> > that long is to send it to off-chip DRAM (delay buffer).
> >
> > But please update the list once you figure it out.
>
> Just for giggles, Robert, are you able to test this with another Juniper
> platform (even if it's the MX80)?
>
> Mark.
Re: [j-nsp] Slow RE path 20 x faster then PFE path
That is the actual topology, and during testing no other return path existed. It seems that there are PFE loops, which would explain why packets punted to the RE are forwarded so fast. JTAC is debugging :)

Thx,
R.

On Mon, Mar 23, 2020 at 4:17 PM Saku Ytti wrote:
> Hey,
>
> > This is very simple setup:
> >
> > linux (.206) LAN mx104(.210) p2p isp (.209)
> >
> > Pretty much 1.4 is what is expected. The only place the delay occurs is
> > on ingress from the ISP to MX104. If I ping MX104 outbound (.210) int I get
> > 0.5 ms.
> >
> > No worries anyway ... just thought anyone run into this before.
>
> Is this a simplified topology or the actual one? Is it not possible that the
> return path is different? It is entirely possible for SW- and HW-forwarded
> packets to experience different path selection. So perhaps the ISP is
> returning HW-forwarded packets via one path, but SW-forwarded packets via
> another.
>
> It's very difficult to imagine a failure mode where packets would wait a
> consistent ~23 ms on the MX104.
>
> --
> ++ytti
Re: [j-nsp] Slow RE path 20 x faster then PFE path
Hi Saku, This is very simple setup: linux (.206) LAN mx104(.210) p2p isp (.209) Pretty much 1.4 is what is expected. The only place the delay occurs is on ingress from the ISP to MX104. If I ping MX104 outbound (.210) int I get 0.5 ms. No worries anyway ... just thought anyone run into this before. Cheers, R, On Mon, Mar 23, 2020 at 2:14 PM Saku Ytti wrote: > There isn't enough information to answer your question. > > But one possible reason is that you're choosing a different path in SW > and HW. Or that the answers are not even coming from host you think > (tshark might add information). > > a) is 1.4ms possible in terms of speed-of-light? > b) where are 24ms packets sitting, do you also have longer path > available or are you heavily congested causing massive queueing delay? > > > > On Mon, 23 Mar 2020 at 15:09, Robert Raszuk wrote: > > > > Hi, > > > > Would anyone have any idea why IP packets with options are forwarded via > > MX104 20x faster then regular IP packets ? > > > > "fast" PFE path - 24-35 ms > > "slow" RE path - 1-4 ms > > > > Example (I used record route to force IP Options punt to slow path): > > > > rraszuk@cto-lon2:~$ ping 62.189.71.209 -R -v > > PING 62.189.71.209 (62.189.71.209) 56(124) bytes of data. 
> > 64 bytes from 62.189.71.209: icmp_seq=1 ttl=63 time=1.44 ms > > RR: 69.191.176.206 > > 10.249.23.7 > > 62.189.71.209 > > 10.249.23.7 > > 69.191.176.206 > > > > 64 bytes from 62.189.71.209: icmp_seq=2 ttl=63 time=1.38 ms > > 64 bytes from 62.189.71.209: icmp_seq=3 ttl=63 time=1.46 ms > > 64 bytes from 62.189.71.209: icmp_seq=4 ttl=63 time=1.41 ms > > 64 bytes from 62.189.71.209: icmp_seq=5 ttl=63 time=1.49 ms > > 64 bytes from 62.189.71.209: icmp_seq=6 ttl=63 time=1.46 ms > > 64 bytes from 62.189.71.209: icmp_seq=8 ttl=63 time=1.52 ms > > 64 bytes from 62.189.71.209: icmp_seq=9 ttl=63 time=2.84 ms > > 64 bytes from 62.189.71.209: icmp_seq=10 ttl=63 time=1.77 ms > > ^C > > --- 62.189.71.209 ping statistics --- > > 10 packets transmitted, 10 received, 0% packet loss, time 9014ms > > rtt min/avg/max/mdev = 1.386/1.892/4.117/0.849 ms > > > > Now I use normal ping running between 62.189.71.209 & 69.191.176.206 in > > "fast" path: > > > > rraszuk@cto-lon2:~$ ping 62.189.71.209 -v > > PING 62.189.71.209 (62.189.71.209) 56(84) bytes of data. > > 64 bytes from 62.189.71.209: icmp_seq=1 ttl=63 time=24.1 ms > > 64 bytes from 62.189.71.209: icmp_seq=2 ttl=63 time=31.5 ms > > 64 bytes from 62.189.71.209: icmp_seq=3 ttl=63 time=24.1 ms > > 64 bytes from 62.189.71.209: icmp_seq=4 ttl=63 time=24.1 ms > > 64 bytes from 62.189.71.209: icmp_seq=5 ttl=63 time=24.0 ms > > 64 bytes from 62.189.71.209: icmp_seq=6 ttl=63 time=24.1 ms > > ^C > > --- 62.189.71.209 ping statistics --- > > 6 packets transmitted, 6 received, 0% packet loss, time 5006ms > > rtt min/avg/max/mdev = 24.097/25.369/31.563/2.774 ms > > rraszuk@cto-lon2:~$ > > > > Best, > > R. > > ___ > > juniper-nsp mailing list juniper-nsp@puck.nether.net > > https://puck.nether.net/mailman/listinfo/juniper-nsp > > > > -- > ++ytti > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
[j-nsp] Slow RE path 20 x faster then PFE path
Hi,

Would anyone have any idea why IP packets with options are forwarded via an MX104 20x faster than regular IP packets ?

"fast" PFE path - 24-35 ms
"slow" RE path - 1-4 ms

Example (I used record route to force the IP-options punt to the slow path):

rraszuk@cto-lon2:~$ ping 62.189.71.209 -R -v
PING 62.189.71.209 (62.189.71.209) 56(124) bytes of data.
64 bytes from 62.189.71.209: icmp_seq=1 ttl=63 time=1.44 ms
RR: 69.191.176.206
10.249.23.7
62.189.71.209
10.249.23.7
69.191.176.206

64 bytes from 62.189.71.209: icmp_seq=2 ttl=63 time=1.38 ms
64 bytes from 62.189.71.209: icmp_seq=3 ttl=63 time=1.46 ms
64 bytes from 62.189.71.209: icmp_seq=4 ttl=63 time=1.41 ms
64 bytes from 62.189.71.209: icmp_seq=5 ttl=63 time=1.49 ms
64 bytes from 62.189.71.209: icmp_seq=6 ttl=63 time=1.46 ms
64 bytes from 62.189.71.209: icmp_seq=8 ttl=63 time=1.52 ms
64 bytes from 62.189.71.209: icmp_seq=9 ttl=63 time=2.84 ms
64 bytes from 62.189.71.209: icmp_seq=10 ttl=63 time=1.77 ms
^C
--- 62.189.71.209 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9014ms
rtt min/avg/max/mdev = 1.386/1.892/4.117/0.849 ms

Now I use a normal ping running between 62.189.71.209 & 69.191.176.206 in the "fast" path:

rraszuk@cto-lon2:~$ ping 62.189.71.209 -v
PING 62.189.71.209 (62.189.71.209) 56(84) bytes of data.
64 bytes from 62.189.71.209: icmp_seq=1 ttl=63 time=24.1 ms
64 bytes from 62.189.71.209: icmp_seq=2 ttl=63 time=31.5 ms
64 bytes from 62.189.71.209: icmp_seq=3 ttl=63 time=24.1 ms
64 bytes from 62.189.71.209: icmp_seq=4 ttl=63 time=24.1 ms
64 bytes from 62.189.71.209: icmp_seq=5 ttl=63 time=24.0 ms
64 bytes from 62.189.71.209: icmp_seq=6 ttl=63 time=24.1 ms
^C
--- 62.189.71.209 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5006ms
rtt min/avg/max/mdev = 24.097/25.369/31.563/2.774 ms
rraszuk@cto-lon2:~$

Best,
R.
Re: [j-nsp] Internet monitoring in case of general issues
Hey Mark,

Just to clarify:

> My thing is you probably have much better insight and control for
> on-net. That leaves off-net as the main issue, as your direct upstream
> and peering may be fine, but beyond that is anyone's guess.

Ahh no. See, with a decent TCP analyzer I am monitoring end-to-end network behaviour regardless of the point where I set up the TAP between src and dst. Clearly the easiest is to insert the TAP in my ISP peering links. So no guessing - real data only :).

> If you are monitoring a specific off-net target for some reason, you can
> easily control for that. But if you are looking at a generalized
> situation, that's a lot harder.

Ahh, that is a fundamental misunderstanding of what I was apparently not describing well. I am not monitoring any targets. I am monitoring and performing real-time TCP analytics (using said analyzer) of all my in-out data. And here I have a few options: I can focus on optimizing the most active sessions by volume, I can focus on optimizing sessions per src/dst port, I can focus on optimizing sessions experiencing the most retransmissions, or I could just try to improve RTT, jitter, etc ...

> Doubly worse for operators who connect to only one or two upstream
> providers/peering points.

I understand that without the right tools the problem is hard to solve. But that is like everything else :) Try cutting a piece of wood with even the best Japanese kitchen knife.

> I'm not saying we should resign ourselves to, "Ah well, I can't fix what
> I can't touch"; but time and resources are limited, so given your
> circumstances, spend some time finding out how best to deploy them as
> you also seek elegant ways to solve this particular issue itself.

Fair. And the only point of my note is to share a bit different perspective. I am very well aware that we as the networking industry are really in the stone age as far as various aspects of performance routing are concerned.
Or for that matter dual disjoint path routing without any static configuration or building two topologies. All networking today Internet, intradomain, DC cares about reachability. Quality of that reachability is pushed to the application layer. Well sure this is ok if you have good app which can build multiple connections to different destinations like say torrent. But not all apps are like that and some would like network to be a little bit more smart :) Best, R. On Sun, Mar 15, 2020 at 12:31 PM Mark Tinka wrote: > > > On 15/Mar/20 12:56, Robert Raszuk wrote: > > All, > > > > It seems that most answers and in fact the question itself assumes that > all > > we can do here is to be reactive. In my books that is an indication that > we > > have already failed. > > > > I do think that any one who has more then one internet upstream ISPs > (full > > table or even defaults out) can do performance routing in real time by > > evaluating quality of TCP sessions across 2 (or more). Based on that data > > it can intelligently shift the exit traffic on a per prefix basis. > > > > Folks like Google or Facebook are using such home grown tools for a long > > time (Espresso, Edge Fabric). Cisco pfr had at least originally single > > sided Internet edge OER. Of course with a bit of automation skills any > > one can build your own tool too - the only real requirement is tapped > > traffic such that you can passively measure the TCP quality to your user > > destinations. > > > > For TCP analysis for few years now I am using https://palermotec.net/ > analyzer. > > TCP analytics it offers is simply fantastic. GUI and user interface still > > needs improvement - if someone is to rely on that. Just fyi ... I am also > > working with that team to build (Smart Edge Routing) SER controller - > they > > already have alpha version, but hopefully in the coming months there will > > be more progress making it beta and eft. 
The assumption is of course that > > all interaction with routers (any vendor) is over standard protocols (BGP > > or static). > > > > As we all know each decent SD-WAN or Cisco iWAN has ability to monitor > > performance over the mesh of endpoints and choose more optimal paths. But > > that is slightly different as it relies on both ends ownership. Here I > > assume we are talking about just single sided exit routing where we have > > zero control over dst. > > > > All above is about exit. To do analogy inbound is also to some extent > > possible if you are advertising prefixes for your services out. But here > > the issue is much more difficult from the perspective of aggregating > > different services - so at most one could just average which uplinks are > > best for a given prefix for *most* of th
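The single-sided exit selection discussed in this thread can be sketched very simply (this is only an illustration of the idea - the measurement format and scoring weights are made up, and have nothing to do with the actual SER controller logic):

```python
# Sketch: pick the exit upstream per destination prefix from passively
# measured TCP quality. Weights are arbitrary and for illustration only.

def best_exit(measurements):
    """measurements: {upstream: {"rtt_ms": float, "retrans_pct": float}}"""
    def score(m):
        # Penalize retransmissions much harder than raw RTT.
        return m["rtt_ms"] + 50.0 * m["retrans_pct"]
    return min(measurements, key=lambda u: score(measurements[u]))

per_prefix = {
    "198.51.100.0/24": {
        "upstream-A": {"rtt_ms": 40.0, "retrans_pct": 0.1},
        "upstream-B": {"rtt_ms": 35.0, "retrans_pct": 2.0},
    },
}
for prefix, m in per_prefix.items():
    # upstream-A wins despite the higher RTT, because it loses less.
    print(prefix, "->", best_exit(m))
```

The controller would then push the chosen exit per prefix to the routers over standard protocols (BGP or static), as described above.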
Re: [j-nsp] Internet monitoring in case of general issues
All,

It seems that most answers, and in fact the question itself, assume that all we can do here is be reactive. In my books that is an indication that we have already failed.

I do think that anyone who has more than one Internet upstream ISP (full table or even defaults out) can do performance routing in real time by evaluating the quality of TCP sessions across the 2 (or more). Based on that data it can intelligently shift the exit traffic on a per-prefix basis.

Folks like Google or Facebook have been using such home-grown tools for a long time (Espresso, Edge Fabric). Cisco PfR had, at least originally, single-sided Internet edge OER. Of course, with a bit of automation skill anyone can build their own tool too - the only real requirement is tapped traffic such that you can passively measure the TCP quality to your user destinations.

For TCP analysis, for a few years now I have been using the https://palermotec.net/ analyzer. The TCP analytics it offers is simply fantastic. The GUI and user interface still need improvement - if someone is to rely on that. Just FYI ... I am also working with that team to build a Smart Edge Routing (SER) controller - they already have an alpha version, but hopefully in the coming months there will be more progress making it beta and EFT. The assumption is of course that all interaction with routers (any vendor) is over standard protocols (BGP or static).

As we all know, each decent SD-WAN or Cisco iWAN has the ability to monitor performance over the mesh of endpoints and choose more optimal paths. But that is slightly different, as it relies on ownership of both ends. Here I assume we are talking about just single-sided exit routing where we have zero control over the dst.

All the above is about exit. By analogy, inbound is also to some extent possible if you are advertising prefixes for your services out.
But here the issue is much more difficult from the perspective of aggregating different services - so at most one could just average which uplinks are best for a given prefix for *most* of the users.

Kind regards,
R.

On Sun, Mar 15, 2020 at 9:14 AM Marcel Bößendörfer wrote:
> RIPE Atlas is also quite useful: https://atlas.ripe.net - beside NLNOG
> RING.
>
> On Sun., 15 March 2020 at 08:58, Tore Anderson wrote:
>
> > * james list
> >
> > > The question: once you notice issues on internet and your upstreams are
> > > fine, what instrument or service or commands or web site do you use to
> > > try to find out where is the problem and who is experiencing the problem
> > > (ie a tier1 carrier)?
> >
> > We find that being an NLNOG RING (https://ring.nlnog.net/) participant is
> > very useful in diagnosing these kind of issues. We can start pings or
> > traceroutes towards our own network from 500+ locations all over
> > the globe with a single command, for example. Furthermore, there is a tool
> > (ring-sqa) that does pretty much this continuously and alerts us if a
> > partial outage is detected.
> >
> > Tore
>
> --
> *Marcel Bößendörfer*
> Geschäftsführer / CEO
>
> *marbis GmbH*
> Griesbachstr. 10
> 76185 Karlsruhe, Germany
Re: [j-nsp] Automation - The Skinny (Was: Re: ACX5448 & ACX710)
> And I think almost no one is collecting data in such a manner
> that it's actually capitalisable, because we can keep running the
> network with how we did it in the 90s, IF-MIB and netflow, in separate
> systems, with no enrichment at all.

Spot on!

Btw Saku - you keep suggesting measuring the delta of input/output ... well, doing it well is, I am afraid, not trivial. So at t0+N I record how many packets entered my system. (We are already at a loss here, as the RE can generate packets, unless you also count RE outbound packets.) Then at t0+N+uS (uS being the delta of switching via fabric) you record the number of packets which left the box. What is your N and uS? Do you subtract BFD packets which enter and leave on the same line card, both ingress and egress?

Monitoring drops is much easier, if we are dealing with platforms which are honest in recording them.

> You don't need ML/AI to find problems in your network, using algorithm
> 'this counter which increments at rate X stopped incrementing or
> started to increment 100 times slower'

Well, the way I read Adam's note was that learning this rate X is what he (IMHO correctly) calls ML :)

Cheers,
R.

On Tue, Jan 28, 2020 at 8:45 AM Saku Ytti wrote:
> On Mon, 27 Jan 2020 at 22:30, wrote:
>
> > Then nowadays there's also the possibility to enable tons upon tons of
> > streaming telemetry - where I could see it all landing in a common data
> > lake where some form of deep convolutional neural networks could be used
> > for unsupervised pattern/feature learning - the reason being I'd like the
> > system to tell me: look, if this counter is high and that one too and
> > this one is low, then this usually happens. But I'd rather wait to see
> > what the industry offers in this area than develop such solutions
> > internally. For now I'm glad I have automation projects going; when I
> > asked whether we should have AI in the network strategy for 2020 I got
> > awkward silence in response.
>
> We should learn to crawl before we take a rocket to Proxima Centauri.
>
> You don't need ML/AI to find problems in your network: using the algorithms
> 'this counter which increments at rate X stopped incrementing or
> started to increment 100 times slower' and 'this counter which does
> not increment started to increment', you'll find a lot of
> problems in your network. But do you care about every problem in your
> network, or only problems that customers care about?
>
> Juniper once in an EBC had some really smart academics explaining to us
> their ML/AI project which predicts resource needs on a given system.
> They quoted how close they got to the real numbers; then I asked how it
> performs against a naive system, after explaining that by naive system I
> mean something like 'my box has 1M FIB entries, so a FIB entry uses
> RLDRAM/1M' to extrapolate FIB usage in an arbitrary config. They hadn't
> tried this and couldn't tell how well the ML/AI performs against it.
>
> Can you really train today's ML/AI to determine what actually matters? I
> don't think you can, because what actually matters is something that
> impacted a customer, and you simply cannot put enough learning data in;
> you don't have nearly enough customer trouble tickets to be able to
> correlate them to the network data you're collecting and start predicting
> which complex counter combinations predict a customer ticket later.
>
> But are you at least monitoring how many packets are lost inside your
> network? Delta of input/output? That is fairly trivial and covers _all
> reasons for packet loss_; of course latency/jitter are not covered,
> but still, it covers a lot of ground fast. Do you have a single system
> where you collect all data? Have you enriched the data with labels like
> NPU, linecard, city, country, region? Almost no one is doing even the very
> basic stuff, so I think ML/AI isn't going to be the low-hanging fruit
> any time soon. If you have a single system with a lot of labels for
> every counter, you can do a lot with very naive analytics.
> If you
> don't have the data, you can't do anything with the smartest possible
> system. And I think almost no one is collecting data in such a manner
> that it's actually capitalisable, because we can keep running the
> network with how we did it in the 90s, IF-MIB and netflow, in separate
> systems, with no enrichment at all.
>
> --
> ++ytti
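The naive detector Saku describes needs no ML at all - a sketch (thresholds and counter names are illustrative):

```python
# Flag counters whose rate collapses (stopped, or ~100x slower) and
# previously-flat counters that start incrementing.

def anomalies(prev_rates, curr_rates, collapse=100.0):
    out = []
    for name, prev in prev_rates.items():
        curr = curr_rates.get(name, 0.0)
        if prev > 0 and (curr == 0 or prev / max(curr, 1e-9) >= collapse):
            out.append((name, "stalled or collapsed"))
        if prev == 0 and curr > 0:
            out.append((name, "started incrementing"))
    return out

prev = {"ifInUcastPkts": 1_000_000.0, "crcErrors": 0.0}
curr = {"ifInUcastPkts": 5_000.0, "crcErrors": 12.0}
print(anomalies(prev, curr))  # flags both counters
```

The hard part, per the thread, is not this logic but having all counters in one system with enough labels (NPU, linecard, city, region) to make the alerts actionable.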
Re: [j-nsp] Automation - The Skinny (Was: Re: ACX5448 & ACX710)
Hi Adam,

I would almost entirely agree with you, except that there are two completely different reasons for automation. One, as you described, is related to service provisioning - here we have full agreement. The other one is actually about keeping your network running.

Imagine a router maintaining its entire control plane perfectly fine, imagine BFD working fine to the box from peers, but the box dropping from 20% to 80% of traffic between line cards via the fabric. Unfortunately this is not a theory but the real world :(

Without proper automation in place, going way beyond basic IGP, BGP, LDP, BFD, etc., you need a bit of clever automation to detect it and either alarm the NOC or, if they are really smart, take such a router out of the SPF network-wide. If not, you sit and wait till pissed customers call - which is already a failure.

Sure, not everyone needs to be a great coder ... but having network engineers with skills sufficient to understand code, the ability to debug it, or at minimum to design the functional blocks of the automation routines, is really a must-have today. And I am not even mentioning all of the new OEM platforms with OSes coming from a completely different part of the world :) That's when the real fun starts and the rubber hits the road, when a network engineer cannot run gdb on a daily basis.

Cheers,
Robert.
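The fabric-drop scenario above can be caught with a simple conservation check - a sketch under stated assumptions (the function, its counter names, and the 1% threshold are all invented for illustration; real boxes need per-NPU counters and care around locally generated/terminated traffic, as discussed elsewhere in this thread):

```python
# Toy conservation-of-packets check: what enters a box, minus what it
# terminates locally, plus what it generates, should leave the box.
# A large shortfall suggests silent drops (e.g. via the fabric) even
# while IGP/BGP/BFD all stay up.

def fabric_loss_pct(pkts_in, pkts_out,
                    locally_terminated=0, locally_generated=0):
    expected_out = pkts_in - locally_terminated + locally_generated
    if expected_out <= 0:
        return 0.0
    return 100.0 * (expected_out - pkts_out) / expected_out

# A box receiving 10M packets but emitting only 8M: ~20% silent loss,
# far above any sane alarm threshold.
loss = fabric_loss_pct(10_000_000, 8_000_000,
                       locally_terminated=50_000, locally_generated=40_000)
print(f"{loss:.1f}% loss")
```

A NOC alarm (or automatic IGP overload/max-metric) would fire whenever this exceeds a small threshold for several intervals in a row.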
Re: [j-nsp] ACX5448 & ACX710
A CPE whose datasheet does not even mention IPsec/DTLS hardware support ... LOL, what year is it?

On Wed, Jan 22, 2020 at 3:01 PM wrote:
> > Giuliano C. Medalha
> > Sent: Tuesday, January 21, 2020 8:24 PM
> >
> > Hello
> >
> > We did some initial lab tests using the 5448 for a client and we have
> > checked with JUNIPER.
> >
> > The major problems we found for our client environment:
> >
> > - No support for FAT (no roadmap);
> > - No support for Entropy Label (no roadmap);
> > - No support for Output Policer or HQoS for VPLS / L2Circuit (no roadmap);
> > - ACX does not support load balancing parsing the payload on LAG
> >   interfaces (no roadmap);
> > - Some problems with ARP flooding to the main CPU (initial JUNOS
> >   releases, but I think they have solved it);
> > - IRB on VPLS is not supported;
> > - Not possible to monitor real-time traffic on sub-interfaces using the
> >   CLI (only with SNMP)
> >
> > It is good to check with them to see if those functions will work in some
> > new release (some day ...).
>
> And no "TE / TE++ and auto-bandwidth"?
>
> - seems like the ACX5448 is targeted as a CPE box or a L2 switch,
>
> ...unsubscribe
>
> adam
Re: [j-nsp] FlowSpec and RTBH
I see there are two questions here that Marcin is asking:

> I was wondering is there a way to export family flow routes (from
> inetflow.0) to non flowspec BGP speaker?

Q1 - Can I advertise Flowspec NLRIs to non-Flowspec speakers?

The answer is clearly "No".

> For example tag Flowspec route with community and advertise this route with
> different community to blackhole on upstream network (selective RTBH).

Q2 - Can Flowspec routes be tagged with blackhole communities indicating the actions, yet still use the match criteria to apply those selectively?

The answer is "Yes"; the original RFC 5575 clearly allows this:

   A given flow may be associated with a set of attributes, depending on
   the particular application; such attributes may or may not include
   reachability information (i.e., NEXT_HOP). *Well-known or AS-specific
   community attributes can be used to encode a set of predetermined
   actions.*

Thx,
R.

On Wed, Oct 16, 2019 at 8:44 PM Jeff Haas via juniper-nsp <juniper-nsp@puck.nether.net> wrote:
> Marcin,
>
> > On Oct 9, 2019, at 07:26, Marcin Głuc wrote:
> > I was wondering is there a way to export family flow routes (from
> > inetflow.0) to non flowspec BGP speaker?
> > For example tag Flowspec route with community and advertise this route
> > with different community to blackhole on upstream network (selective
> > RTBH).
>
> I'm having difficulty following your use case.
>
> Flowspec is its own address family with its own AFI/SAFI and a rather
> nasty format.
>
> Are you asking that some internal component of a flowspec filter, like
> destination, is leaked into another address family?
> -- Jeff
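For Q2, the shape of such a configuration might look roughly like the following Junos-style flow route sketch. This is a hypothetical example: the prefix, names, community value, and exact statement placement are made up for illustration, so verify the supported flow-route `then` actions against your Junos release before relying on it.

```
routing-options {
    flow {
        route DROP-DNS-AMP {
            match {
                destination 192.0.2.1/32;
                protocol udp;
                source-port 53;
            }
            then {
                /* Tag with a community encoding the predetermined
                   (selective RTBH) action, per RFC 5575. */
                community SELECTIVE-RTBH;
            }
        }
    }
}
policy-options {
    community SELECTIVE-RTBH members 65000:666;
}
```

The match criteria still select exactly which flows are affected, while the community tells the upstream which blackhole action to apply.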
Re: [j-nsp] Suggestions for Edge/Peering Router..
> > Ideally I'd like to see an equivalent of Cisco's dynamic update
> > peer-groups in Junos.
>
> They are dynamic, but once you make an export change which affects a
> subset of members in a peer-group, that member gets reset while being
> placed into a new update-group.

And that is how dynamic update groups work by design. They automatically group peers based on their export policy configuration.

Whether or not to reset the peers really depends on the nature of the policy change. In some cases a clear soft out or in will do just fine without bringing down the session. Of course, if you enable soft-reconfiguration inbound and the policy change is just a new import filter, no reset of the peer is required; if an OS still resets it, that can easily be fixed.

R.
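The grouping idea itself is simple - peers are keyed by their effective outbound policy, so one UPDATE stream is built per group instead of per peer. A toy sketch (data shapes and names are invented for illustration):

```python
# Sketch of dynamic update-group membership: peers sharing the same
# export policy land in the same group and can share generated UPDATEs.

def group_peers(peers):
    """peers: {peer_ip: export_policy (hashable tuple)} ->
       {export_policy: [peer_ip, ...]}"""
    groups = {}
    for peer, policy in peers.items():
        groups.setdefault(policy, []).append(peer)
    return groups

peers = {
    "10.0.0.1": ("FULL-TABLE", "MED-0"),
    "10.0.0.2": ("FULL-TABLE", "MED-0"),
    "10.0.0.3": ("DEFAULT-ONLY",),
}
groups = group_peers(peers)
# Two groups: .1/.2 share one export policy, .3 has its own.
```

Changing .3's policy to match .1/.2 simply moves it between groups; the point debated in the thread is whether that move should require a session reset, or just a soft refresh of the affected peer.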
Re: [j-nsp] LAG/ECMP hash performance
Hi Eldon,

You are very correct. I was very surprised to read Saku mentioning the use of CRC for hashing, but then a quick Google search revealed this link:

https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/hash-parameters-edit-forwarding-options.html

It looks like ECMP and LAG hashing may seriously spread your flows, as the CRC clearly includes the payload, and the payload is likely to be different with every packet. Good that this is only for QFX though :-)

For MX, I recall that the hash is not computed over the entire packet. Specific packet fields are taken as input (per configuration) and CRC functions are used to mangle them - which is very different from saying that the packet's CRC is used as input. But I admit, looking at a few production MXes, the load across LAG members is far from well balanced.

Thx,
R.

On Thu, Aug 29, 2019 at 4:56 PM Eldon Koyle <ekoyle+puck.nether@gmail.com> wrote:
> On Thu, Aug 29, 2019 at 2:52 AM James Bensley wrote:
>
> > Different parameters may or may not change the diffusion density, but
> > they may increase the range of results, i.e. perfect diffusion over
> > 2^2 outcomes vs. perfect diffusion over 2^6 outcomes.
> >
> > Also, ASR9Ks use a CRC32 on Typhoon cards but not of the whole frame:
> > "Post IOS-XR 4.2.0, Typhoon NPUs use a CRC based calculation of the
> > L3/L4 info and compute a 32 bit hash value." So actually, your results
> > below should have good diffusion in theory if this was an ASR9K
> > (although I'm sure that's not the case in reality). Is the Juniper
> > taking (1) the whole frame into the CRC function, (2) all the headers
> > but no payload, or (3) just the specific header fields (S/D
> > MAC/IP/Port/Intf)?
>
> I think 802.3ad and ECMP both require a given connection to hash to
> the same link to prevent out-of-order delivery.
> > Taking full frames or even full headers into your hashing algorithm > would likely break the expectation of in-order delivery (unless your > have the same vendor on both sides with something proprietary). > Ignoring that requirement, you could ditch hashing altogether and go > for round-robin. Standards-compliant hashing implementations can only > look at header fields that don't change for a flow, namely src/dest > mac, ip, protocol, and port for TCP/UDP (maybe adding in certain MPLS, > VLAN, etc. fields or interface ids or other proprietary information > available to the chip that satisfies that requirement). > > -- > Eldon > ___ > juniper-nsp mailing list juniper-nsp@puck.nether.net > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
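To make the in-order-delivery point above concrete, here is a small sketch (not any vendor's actual hash) of a standards-friendly member-link selection: only header fields that are constant for a flow - the 5-tuple - feed a CRC, so every packet of a flow picks the same link, while mixing in the payload would spray one flow across links:

```python
# Sketch of a flow-stable LAG/ECMP hash: CRC32 over the 5-tuple only.
# Including the payload (the commented-out line) would give a different
# hash per packet and break in-order delivery for a flow.
import zlib

def member_link(src_ip, dst_ip, proto, sport, dport, n_links, payload=b""):
    key = f"{src_ip}|{dst_ip}|{proto}|{sport}|{dport}".encode()
    # key += payload  # <- hashing payload too would reorder the flow
    return zlib.crc32(key) % n_links

a = member_link("192.0.2.1", "198.51.100.7", 6, 40000, 443, 4)
b = member_link("192.0.2.1", "198.51.100.7", 6, 40000, 443, 4)
assert a == b  # same flow, same member link, regardless of payload
```

Note this is exactly the distinction made above: the CRC is used as a mixing function over selected fields, not computed over the whole packet.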
Re: [j-nsp] ARP resolution algorithm? Storage of MX transit packets?
Spot on, Gert! And also including static routes. That's why, as some of you surely remember, static routes to multiaccess interfaces - say a /8 without an explicit next hop - are very dangerous ;) On Thu, Jan 31, 2019, 09:57 Gert Doering Hi, > > On Thu, Jan 31, 2019 at 10:51:01AM +0200, Saku Ytti wrote: > > On Thu, 31 Jan 2019 at 10:34, Robert Raszuk wrote: > > > > > As mentioned on the other thread decent routers should resolve peer's > IP to mac when creating FIB adj and building rewrite entries. > > > There is no "first packet" notion nor any ARPing driven by packet > reception. This should apply to p2p adj as well as p2mp - classic LANs. > > > > > Are you guys saying that say MXes don't do that ? > > > > I'm not sure what you are saying. I must misunderstand, but are you > > saying once I configure /8 LAN, router ARPs all of them periodically > > until the end of time, retaining unresolved, resolved cache for each > > of /8? Which router does this? > > I think Robert is talking about router-to-router LANs, where you have > "prior knowledge" in your FIB. > > Like, OSPF neighbours, or BGP next-hops pointing to LAN adjacencies - so > the router could go out and start the ARP process the moment it learns > "I have a next-hop in BGP pointing to :". > > (I think it would be a great thing to have, especially including a > feedback mechanism "ARP / ND failed, this next-hop is invalid!" to BGP - > solve a number of blackhole problems with indirect BGP routes) > > gert > -- > "If was one thing all people took for granted, was conviction that if you > feed honest figures into a computer, honest figures come out. Never > doubted > it myself till I met a computer with a sense of humor." > Robert A. Heinlein, The Moon is a Harsh > Mistress > > Gert Doering - Munich, Germany > g...@greenie.muc.de > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] ARP resolution algorithm? Storage of MX transit packets?
We are talking about transit - right? So regardless of the subnet mask you know your next-hop IP from the control plane. Then you create the adjacency in FIB/CEF without waiting for any packet to arrive. End hosts on directly connected LANs are different, but my impression was that we are discussing the case of transit. Thx On Thu, Jan 31, 2019, 09:51 Saku Ytti On Thu, 31 Jan 2019 at 10:34, Robert Raszuk wrote: > > > As mentioned on the other thread decent routers should resolve peer's IP > to mac when creating FIB adj and building rewrite entries. > > There is no "first packet" notion nor any ARPing driven by packet > reception. This should apply to p2p adj as well as p2mp - classic LANs. > > > Are you guys saying that say MXes don't do that ? > > I'm not sure what you are saying. I must misunderstand, but are you > saying once I configure /8 LAN, router ARPs all of them periodically > until the end of time, retaining unresolved, resolved cache for each > of /8? Which router does this? > > At least routers I've worked with punt traffic destined to unresolved > addresses and then build HW adjacency. No traffic to given /32, not > adjacency, no knowledge of DMAC. > > > -- > ++ytti > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] ARP resolution algorithm? Storage of MX transit packets?
As mentioned on the other thread decent routers should resolve peer's IP to mac when creating FIB adj and building rewrite entries. There is no "first packet" notion nor any ARPing driven by packet reception. This should apply to p2p adj as well as p2mp - classic LANs. Are you guys saying that say MXes don't do that ? Thx, R. On Thu, Jan 31, 2019, 09:26 Gert Doering Hi, > > On Thu, Jan 31, 2019 at 10:10:32AM +0200, Saku Ytti wrote: > > I wish some vendor would implement static DIP=>DADDR resolution, there > > Can you do static ARP entries on JunOS? You can do that on Cisco - while > not exactly what you might have had in mind, it would be theoretically > possible to have management system turn off ARP resolution for certain > VLANs and put static ARP entries into the config. > > (I had to use it in the past due to ARP and ND bugs at peering routers, > so I know "it works for a small number of entries" - no idea if it would > scale, or whether Cisco properly programs static ARP into HW right > away, or just uses it for lookups when punting) > > gert > > -- > "If was one thing all people took for granted, was conviction that if you > feed honest figures into a computer, honest figures come out. Never > doubted > it myself till I met a computer with a sense of humor." > Robert A. Heinlein, The Moon is a Harsh > Mistress > > Gert Doering - Munich, Germany > g...@greenie.muc.de > ___ > juniper-nsp mailing list juniper-nsp@puck.nether.net > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
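The distinction being argued in this thread can be shown with a toy model (purely illustrative classes, nothing vendor-specific): next hops known from the control plane (BGP/IGP neighbours) get their MAC resolved and a FIB rewrite installed up front, so no "first packet" ever triggers ARP for them; random LAN hosts are the only ones left to an on-demand punt path:

```python
# Toy model of control-plane-driven adjacency building. Known next hops
# are preinstalled; resolution failure could even be fed back to BGP
# ("this next hop is invalid"), as Gert suggests.
class Fib:
    def __init__(self, arp_table):
        self.arp = arp_table          # ip -> mac, as resolved via ARP
        self.rewrites = {}            # ip -> installed MAC rewrite

    def install_next_hop(self, nh_ip):
        mac = self.arp.get(nh_ip)     # a real box would ARP here if missing
        if mac is None:
            return False              # ARP failed: next hop unusable
        self.rewrites[nh_ip] = mac
        return True

fib = Fib({"10.0.0.2": "00:11:22:33:44:55"})
assert fib.install_next_hop("10.0.0.2")      # known next hop: preinstalled
assert not fib.install_next_hop("10.0.0.99") # unknown host: punt/on-demand
```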
Re: [j-nsp] Junos Arp Expiration Timer Behavior & Active Flows
Hi Alex, > as opposed to normal ARP behaviour where ARP is only > resolved where there is a packet going to 203.0.113.1. In correctly constructed ISP-grade routers the FIB data plane is built regardless of whether packets are going through the router or not. So it is actually control-plane driven to build MAC rewrites at the FIB adjacency level, regardless of whether your hardware supports a flat or hierarchical FIB. /* Of course this may not be the case for flow-based routers like SRX or some cheap x86 hacks, but those should be just a minor exception. */ Thx, R. PS. Did you ever wonder why it is recommended to type in the next-hop address when adding a static route pointing to, say, a class C LAN? Well, if you did not give the next hop and instead just specified the outbound interface, the poor router, when building the MAC rewrites, would have to try to resolve IP to MAC for all 254 possible hosts on it :) On Sat, Jan 12, 2019 at 6:00 PM Alexander Arseniev via juniper-nsp < juniper-nsp@puck.nether.net> wrote: > Hello, > > Few more ARP tidbits for You: > > 1/ JUNOS learns ARP not only from responses but from requests as well - > this is according to RFC 826 "Packet reception" chapter (ARP opcode is > examined AFTER the xlation table is updated). Therefore, You may see > that ARP entry for the remote node is regularly refreshed on local node > without any ARP requests being sent out from that local node. This could > happen if the ARP randomized aging timers or clocks are different - and > they normally are if only by a small amount. > > 2/ changing ARP aging-time does not take effect immediately, You need to > wait until current entry ages out or clear it with CLI command. 
> > 3/ if You configure a static /32 route to with destination == nexthop - > like set routing-options static route 203.0.113.1/32 next-hop > 203.0.113.1, which is a valid route in JUNOS and 203.0.113.1 must be > directly connected - then the ARP entry for 203.0.113.1 is maintained by > JUNOS in accordance with configured (or default) ARP aging timers > without any traffic going to 203.0.113.1 as opposed to normal ARP > behaviour where ARP is only resolved where there is a packet going to > 203.0.113.1. > > HTH > > Thx > Alex > > On 11/01/2019 16:50, Clarke Morledge wrote: > > According to KB19396, "the Address Resolution Protocol (ARP) > > expiration timer does not refresh even if there is an active traffic > > flow in the router. This is the default behavior of all routers > > running Junos OS." The default timer is 20 minutes. I have confirmed > > this behavior on the MX platform. > > > > This does not seem very intuitive, as it suggests that a Junos device > > at L3 would stop in the middle of an active flow, to send an ARP > > request to try to refresh its ARP cache, potentially causing some > > unnecessary queuing of traffic, while the Junos device waits for ARP > > resolution. For an active flow, the ARP response should come back > > quick, but still it seems unnecessary. > > > > I would have thought that the ARP cache would only start to decrement > > the expiration timer, when the device was not seeing any traffic > > to/from ARP entry host. > > > > KB19396 goes onto say, "When the ARP timer reaches 20 (+/- 25%) > > minutes, the router will initiate an ARP request for that entry to > > check that the host is still alive." I can see that when the ARP timer > > is started initially, that it starts the expiration countdown, at this > > (+/- 25%) value, and not exactly at, say, 20 minutes, which is the > > default timer value. > > > > A couple of questions: > > > > (a) Is this default behavior across all Junos platforms, including MX, > > SRX, and EX? 
> > > > (b) Is there any other caveat as to when the Junos device will send > > out the ARP request, at the end of expiration period? > > > > Clarke Morledge > > College of William and Mary > > Information Technology - Network Engineering > > Jones Hall (Room 18) > > 200 Ukrop Way > > Williamsburg VA 23187 > > ___ > > juniper-nsp mailing list juniper-nsp@puck.nether.net > > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ > juniper-nsp mailing list juniper-nsp@puck.nether.net > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
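The "20 (+/- 25%) minutes" behaviour from KB19396 is easy to model; a minimal sketch (the function name and shape of the jitter are assumptions for illustration, not Junos internals):

```python
# Sketch of randomized ARP aging: each entry expires at the configured
# age +/- 25%, so neighbouring routers rarely refresh in lockstep -
# which is also why one side can keep re-learning the entry from the
# other side's requests (tidbit 1/ above).
import random

def jittered_age(base_seconds=1200, jitter=0.25, rng=random.random):
    # rng() in [0,1): maps to base * (1 - jitter) .. base * (1 + jitter)
    return base_seconds * (1 - jitter + 2 * jitter * rng())

t = jittered_age()
assert 900 <= t <= 1500  # 20 minutes +/- 25%
```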
[j-nsp] show ospf lsdb - topology drawing
Hi, Would anyone be able to recommend an open or closed source tool which can draw a nice topology of a single OSPFv2 area 0 based on a show ospf lsdb output capture? I saw https://blog.webernetz.net/ospf-visualizer/ but I am looking for more tools like this, proven in the battlefield, especially those compatible as-is with Junos output. Many thx, Robert.
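Failing a ready-made tool, rolling a quick one is not much work: parse router-LSA adjacencies out of the capture and emit Graphviz DOT. The sketch below assumes an invented, simplified input format - real `show ospf database` output would need a more careful parser:

```python
# Rough sketch: simplified LSDB lines -> undirected Graphviz DOT graph.
# The "Router ..." / "Neighbor ..." line format is made up for the demo.
def lsdb_to_dot(lines):
    edges = set()
    router = None
    for line in lines:
        line = line.strip()
        if line.startswith("Router "):        # e.g. "Router 10.0.0.1"
            router = line.split()[1]
        elif line.startswith("Neighbor ") and router:
            nbr = line.split()[1]
            edges.add(tuple(sorted((router, nbr))))  # dedupe both directions
    body = "\n".join(f'  "{a}" -- "{b}";' for a, b in sorted(edges))
    return "graph area0 {\n" + body + "\n}"

demo = ["Router 10.0.0.1", "Neighbor 10.0.0.2",
        "Router 10.0.0.2", "Neighbor 10.0.0.1"]
print(lsdb_to_dot(demo))  # feed the result to "dot -Tpng" for a drawing
```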
Re: [j-nsp] L3VPN/RR/PE on Same router
> > It's about increasing the odds of it to fall on the right side, > Exactly ! > But comparing say XR and Junos, judging from the rest of the inner workings I could experience empirically, I'd say they are sufficiently different > implementations. > True. In fact even the XE & XR BGP code cores are quite different, in spite of a number of failed attempts to at least make new BGP features share code. The bottom line is that if you get a badly malformed update, a broken attribute, illegal NLRIs, a mistaken blast of BGP-LS, etc., it is much less likely that all of those BGP implementations crash at once than that your single chosen one crashes - even if you run two instances of it.
Re: [j-nsp] L3VPN/RR/PE on Same router
> And I have seen the opposite, ie networks running multiple vendor RRs, > ending up with crashs because of buggy BGP implementations. Hmmm, since iBGP RRs usually do not talk to each other (leaving RR hierarchies aside), what you are essentially endorsing is single-vendor networks, right? If I have Cisco PEs and Juniper RRs, by your description it may crash ... hence better to avoid it - right? Good thing this is a thread about iBGP, not eBGP :):) Thx R. On Fri, Aug 17, 2018 at 4:43 PM, Youssef Bengelloun-Zahr wrote: > Hi, > > > > Le 17 août 2018 à 16:28, Robert Raszuk a écrit : > > >> and that thing would then crash BGP on RRs, can't afford that happening. > > > > Then best thing is to run two or three RRs in parallel each using > different > > BGP code base - even for the same AFI/SAFI pair > > > > I am seeing number of networks running single vendor RRs and when things > > melt they run around and claim that the problem was was really so rear > and > > unexpected :) Well usually bugs are of unexpected nature > > And I have seen the opposite, ie networks running multiple vendor RRs, > ending up with crashs because of buggy BGP implementations. > > At the end of the day, it is a question of tossing a coin and hopping it > will fall on the right side. > > > > > Thx, > > R. > > > > > > On Fri, Aug 17, 2018 at 4:05 PM, wrote: > > > >>> From: Saku Ytti [mailto:s...@ytti.fi] > >>> Sent: Friday, August 17, 2018 2:38 PM > >>> To: Mark Tinka > >>> Cc: adamv0...@netconsultings.com; tim tiriche; Juniper List > >>> Subject: Re: [j-nsp] L3VPN/RR/PE on Same router > >>> > >>> Hey Mark, > >>> > >>>>> Yes a good practice is to separate internet routes from > >>>>> internal/services l3vpn routes onto separate BGP control planes > >>>>> (different sessions at least) so that malformed bgp msg will affect > >>>>> just one part of your overall BGP infrastructure. > >>>> > >>>> I see you've been giving this advice for quite some time now. > >>> > >>> I'm siding with Adam here. 
His disaster scenario actually happed to me > in > >>> 3292. We ran for years VXR VPN route-reflectors, after we changed them > to > >>> MX240 we added lot more RR's, with some hard justifications to > >>> management why we need more when we've had no trouble with the count > >>> we had. > >>> After about 3 months of running MX240 reflectors, we got bad BGP UPDATE > >>> and crashed each reflector, which was unprecedented outage in the > history > >>> of the network. And tough to explain to management, considering we just > >>> had made the reflection more redundant with some significant > investment. > >>> I'm sure they believed we just had cocked it up, as people don't really > >>> believe in chance/randomness, evident how people justify that things > >> can't > >>> be broken, by explaining how in previous moment in time it wasn't > broken, > >>> implying that transitioning from non-broken to broken is impossible. > >>> > >>> Note, this is not to trash on Juniper, all vendors have bad BGP > >>> implementations and I'm sure one can fuzz any of them to find crash > bugs. > >>> > >> Oh yeah for sure, the XR RRs too were crashing upon reception of > malformed > >> BGP updates in the past. > >> > >> Currently XR BGP is *somewhat protected by the "BGP Attribute Filter and > >> Enhanced Attribute Error > >> Handling" (now RFC 7606) which already proved itself to me (just got a > log > >> msg informing me the malformed attribute was deleted instead of an > >> important transit session reset). > >> Unfortunately can't enable it on junos as the code we run would instead > of > >> session reset crashed the rpd due to a bug if the RFC 7606 feature > would be > >> enabled. > >> > >> *But still I'd be haunted by what could happen if RFC 7606 would have > >> missed something and that thing would then crash BGP on RRs, can't > afford > >> that happening. 
> >> > >> > >>> Not only is it CAPEX irrelevant to have separate RR for IPv4 and IPv6, > >> but you > >>> also get faster convergence, as more CPU cycles, fewer BGP neighbours, > >> less > >>> routes. I view it as cheap insurance as well as very simple horizontal > >> scaling. > >>> > >> And going virtual this really is a marginal spend in the grand scheme of > >> things. > >> > >> adam > >> > >> netconsultings.com > >> ::carrier-class solutions for the telecommunications industry:: > >> > >> > >> ___ > >> juniper-nsp mailing list juniper-nsp@puck.nether.net > >> https://puck.nether.net/mailman/listinfo/juniper-nsp > >> > > ___ > > juniper-nsp mailing list juniper-nsp@puck.nether.net > > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] L3VPN/RR/PE on Same router
> and that thing would then crash BGP on RRs, can't afford that happening. Then the best thing is to run two or three RRs in parallel, each using a different BGP code base - even for the same AFI/SAFI pair. I have seen a number of networks running single-vendor RRs, and when things melt they run around and claim that the problem really was so rare and unexpected :) Well, usually bugs are of an unexpected nature. Thx, R. On Fri, Aug 17, 2018 at 4:05 PM, wrote: > > From: Saku Ytti [mailto:s...@ytti.fi] > > Sent: Friday, August 17, 2018 2:38 PM > > To: Mark Tinka > > Cc: adamv0...@netconsultings.com; tim tiriche; Juniper List > > Subject: Re: [j-nsp] L3VPN/RR/PE on Same router > > > > Hey Mark, > > > > > > Yes a good practice is to separate internet routes from > > > > internal/services l3vpn routes onto separate BGP control planes > > > > (different sessions at least) so that malformed bgp msg will affect > > > > just one part of your overall BGP infrastructure. > > > > > > I see you've been giving this advice for quite some time now. > > > > I'm siding with Adam here. His disaster scenario actually happed to me in > > 3292. We ran for years VXR VPN route-reflectors, after we changed them to > > MX240 we added lot more RR's, with some hard justifications to > > management why we need more when we've had no trouble with the count > > we had. > > After about 3 months of running MX240 reflectors, we got bad BGP UPDATE > > and crashed each reflector, which was unprecedented outage in the history > > of the network. And tough to explain to management, considering we just > > had made the reflection more redundant with some significant investment. > > I'm sure they believed we just had cocked it up, as people don't really > > believe in chance/randomness, evident how people justify that things > can't > > be broken, by explaining how in previous moment in time it wasn't broken, > > implying that transitioning from non-broken to broken is impossible. 
> > > > Note, this is not to trash on Juniper, all vendors have bad BGP > > implementations and I'm sure one can fuzz any of them to find crash bugs. > > > Oh yeah for sure, the XR RRs too were crashing upon reception of malformed > BGP updates in the past. > > Currently XR BGP is *somewhat protected by the "BGP Attribute Filter and > Enhanced Attribute Error > Handling" (now RFC 7606) which already proved itself to me (just got a log > msg informing me the malformed attribute was deleted instead of an > important transit session reset). > Unfortunately can't enable it on junos as the code we run would instead of > session reset crashed the rpd due to a bug if the RFC 7606 feature would be > enabled. > > *But still I'd be haunted by what could happen if RFC 7606 would have > missed something and that thing would then crash BGP on RRs, can't afford > that happening. > > > > Not only is it CAPEX irrelevant to have separate RR for IPv4 and IPv6, > but you > > also get faster convergence, as more CPU cycles, fewer BGP neighbours, > less > > routes. I view it as cheap insurance as well as very simple horizontal > scaling. > > > And going virtual this really is a marginal spend in the grand scheme of > things. > > adam > > netconsultings.com > ::carrier-class solutions for the telecommunications industry:: > > > ___ > juniper-nsp mailing list juniper-nsp@puck.nether.net > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
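The RFC 7606 behaviour Adam describes - a malformed attribute logged and dropped instead of resetting (or crashing) the session - boils down to "treat-as-withdraw". A minimal sketch of the decision, with attribute classification collapsed to simple flags for illustration:

```python
# Sketch of RFC 7606-style error handling: a malformed, non-critical
# path attribute demotes the UPDATE to a withdraw of its NLRI instead
# of tearing the session down. Real classification per attribute type
# is far more involved; the flags here are illustrative only.
def handle_update(nlri, attrs):
    for attr in attrs:
        if attr.get("malformed"):
            if attr.get("critical"):
                return ("session-reset", nlri)    # pre-7606 style fallback
            return ("treat-as-withdraw", nlri)    # RFC 7606 behaviour
    return ("install", nlri)

assert handle_update("203.0.113.0/24", [{"malformed": True}]) == \
       ("treat-as-withdraw", "203.0.113.0/24")
assert handle_update("203.0.113.0/24", [{}]) == ("install", "203.0.113.0/24")
```

As noted in the thread, this only helps if the implementation of the error handling itself does not crash - hence the argument for code-base diversity.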
Re: [j-nsp] L3VPN/RR/PE on Same router
Just to clarify ... I was not really worried about how to follow various lists - mail client does a good job to combine them into one folder, filter duplicates etc ... But when writing general reply/question to Mark today about BGP sessions I noticed it only had j-nsp - but oh the question is general so where do I post ? I added c-nsp ... that was the trigger for the above comment. On a similar note I would love to hear comments from all of the members on what linux tools they use to test pps on the routers ... which list should I post it to ? Cheers R. On Fri, Aug 17, 2018 at 12:06 PM, wrote: > > PS. Have not been reading -nsp aliases for a while, but now I see that I > > missed a lot ! Btw do we really need per vendor aliases here ? Wouldn't > it > > be much easier to just have single nsp list ? After all we all most > likely > > have all of the vendors in our networks (including Nokia !) and we are > all > > likely reading all the lists :) Or maybe there is one already ? > > Disagree. I follow several of the -nsp lists, but some of them much > more closely than others. Having them all mixed up would definitely > make this more difficult. > > Steinar Haug, Nethelp consulting, sth...@nethelp.no > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] L3VPN/RR/PE on Same router
Hey Mark, It has been a while > We've been running all address families on the same RR's (different > sessions, obviously, but same hardware) Out of pure curiosity how are you setting up different BGP sessions to the same RR ? I think what Adam is proposing is real TCP session isolation, what you may be doing is just same single TCP session, but different SAFIs which is not the same. Sure you can configure parallel iBGP sessions on the TCP level say between different loopback addresses to the same RR, but what would that really buy you ? You could even be more brave and use BGP multisession code path (if happens to be even supported by your vendor) which in most implementations I have seen is full of holes like swiss cheese but is this what you are doing ? Cheers, R,. PS. Have not been reading -nsp aliases for a while, but now I see that I missed a lot ! Btw do we really need per vendor aliases here ? Wouldn't it be much easier to just have single nsp list ? After all we all most likely have all of the vendors in our networks (including Nokia !) and we are all likely reading all the lists :) Or maybe there is one already ? On Fri, Aug 17, 2018 at 7:06 AM, Mark Tinka wrote: > > > On 16/Aug/18 17:15, adamv0...@netconsultings.com wrote: > > > Yes a good practice is to separate internet routes from internal/services > > l3vpn routes onto separate BGP control planes (different sessions at > least) > > so that malformed bgp msg will affect just one part of your overall BGP > > infrastructure. > > I see you've been giving this advice for quite some time now. > > We've been running all address families on the same RR's (different > sessions, obviously, but same hardware) for almost 5 years. The only > reason sessions have gone down is due to hardware problems. It didn't > disrupt services because there are always 2 RR's, but we haven't seen an > outage due to protocol problems in one address family spilling over into > other address families. 
> > Of course, I see your concern, but from our own experience over several > years, I've not seen this issue. > > I mention this because introducing this kind of separation is onerous. > > Mark. > ___ > juniper-nsp mailing list juniper-nsp@puck.nether.net > https://puck.nether.net/mailman/listinfo/juniper-nsp > ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
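For what it's worth, a sketch of what real TCP-level separation to the same RR could look like in Junos set-style configuration: two iBGP groups, each sourced from its own loopback address and carrying a single family, so a problem on one session cannot reset the other. All addresses and group names here are invented for illustration:

```
set interfaces lo0 unit 0 family inet address 192.0.2.1/32
set interfaces lo0 unit 0 family inet address 192.0.2.101/32
set protocols bgp group RR-INET type internal
set protocols bgp group RR-INET local-address 192.0.2.1
set protocols bgp group RR-INET family inet unicast
set protocols bgp group RR-INET neighbor 192.0.2.254
set protocols bgp group RR-VPN type internal
set protocols bgp group RR-VPN local-address 192.0.2.101
set protocols bgp group RR-VPN family inet-vpn unicast
set protocols bgp group RR-VPN neighbor 192.0.2.253
```

This buys separate TCP sessions (and separate resets), which is the distinction Robert draws against merely running multiple SAFIs over one session.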
Re: [j-nsp] Summarize Global Table
Chris, Have you read draft-ietf-grow-simple-va-04 ? There is nothing in the draft nor in the implementation regarding 'route to the left'. It is all about taking the same exit door as your big boss; and when a new smaller boss appears in the middle who exits via different doors, your net gets reinstalled so that it still takes the correct set of doors out. As described to Shane, semantically this is identical in default behaviour to installing all prefixes into RIB and FIB. However I would argue that if you do it within the POP you can achieve much better savings than the default behavior. But this is perhaps out of scope of this thread ;-) Cheers, R. On 10/25/2011 10:09 PM, Mark Tinka wrote: On Wednesday, October 26, 2011 05:12:09 AM Richard A Steenbergen wrote: c) Vendors would much rather sell you new cards with more FIB capacity than find a way to implement a free solution in software (big shocker, I know). :) I've been chatting with a major vendor about their interest in implementing S-VA: http://tools.ietf.org/html/draft-ietf-grow-simple-va-04 'route to the left' ... you can do this today, VA only wraps a 'protocol' and (maybe) 'operational modality' around 'route to the left'. There may be hope yet. sure, 'route to the left' (tm: schil...@uu.net)
Re: [j-nsp] Summarize Global Table
Hi Pavel, Robert, is there any non-production implementation of simple-va, which we could play with? The only implementation I am aware of is cisco ios component code. Yes there are non-production EFT images you could perhaps get from cisco to play around with it. Since I have switched recently and I am no longer a vendor, but an operator, you will need to ask your cisco colleagues to get you some test images. The main concern here is, of course, whether a router will be infinitely installing/withdrawing tons of FIB entries because of 'natural' prefix flaps, how much black-hole/loop this will create in practice, and how to deal with it. Did anyone perform some research of this? Any reports exist? The router with simple-va functionality enabled will not install nor withdraw even a single additional route on top of what it would do with the simple-va feature disabled. So it is guaranteed to be no worse than today. We simply added logic to the table-map filter code which matches on the next hop of a less specific route and suppresses covered more specifics with an identical next hop. The only real additional price (there is no free lunch) is extra CPU when you need to walk back the table in the event that some prefix with a mask length between the less specific and the more specific gets into the RIB/FIB with a different next hop. But if I were to deploy this functionality it would be to save the edge routers' FIB and do it within the POP, while POP boxes would still keep the full table. That way the extra walks and extra CPU do not need to be used. Btw as a side note I spoke to some Juniper colleagues and they really liked the idea too :) Btw ... it works equally well for the Internet as it does for VPN vrfs. Best, R.
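The suppression logic Robert describes - drop a more specific from RIB/FIB when its closest covering less specific already points at the same next hop - can be sketched in a few lines. This is a toy model of the idea, not the cisco table-map code:

```python
# Simple-VA-style suppression sketch: install a prefix only if it has no
# covering prefix, or its closest cover resolves to a different next hop.
# Forwarding behaviour is unchanged, since suppressed prefixes would
# have followed the covering route anyway.
import ipaddress

def va_filter(routes):
    """routes: {prefix_str: next_hop}. Return the subset to install."""
    nets = {ipaddress.ip_network(p): nh for p, nh in routes.items()}
    install = {}
    for net, nh in nets.items():
        covers = [c for c in nets if c != net and net.subnet_of(c)]
        if covers:
            best = max(covers, key=lambda c: c.prefixlen)  # closest cover
            if nets[best] == nh:
                continue            # suppressed: same exit as the "boss"
        install[str(net)] = nh
    return install

routes = {"10.0.0.0/8": "A",
          "10.1.0.0/16": "A",      # follows its cover: suppressed
          "10.2.0.0/16": "B"}      # different exit: must be installed
print(va_filter(routes))
```

The "walk back" cost Robert mentions corresponds to re-running this check for covered prefixes whenever an intermediate-length prefix with a different next hop appears.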
Re: [j-nsp] Summarize Global Table
Hi Shane, First I would like to actually thank all who provided some points regarding simple-va during the discussion since yesterday. By all means I agree with some of the concerns expressed; the additional CPU or RIB/FIB installation time for possibly batches of more specifics in a relatively short time window caused by the trigger would definitely be worth measuring. However, it very much depends on how simple-va is deployed, what the network is, what the device is, etc ... therefore I do not have any measurement to share. However, this has been tested in a production network and the reported gain in memory saving has been very significant. Now back to the specific questions ... SMALTA if I understand it operates at the FIB level. Er, I'm not sure that's correct. A SMALTA router would still receive a full RIB and calculate an overall Loc-RIB just like it does today; however, it's only during the final step of computing a FIB from either a BGP Loc-RIB or the implementation-specific/dependent view of the RIB that accounts for all source of routing information on the box, (e.g.: BGP, OSPF/IS-IS, etc.). In a number of routers the RIB is IPC-ed and kept on the linecards. So you are either proposing to redesign those platforms or to run SMALTA on the line cards. Simple-VA is a pure control-plane intelligent suppression between BGP and RIB. I wonder how many vendors will want to do any code modifications at the FIB level if exactly the same savings can be done at the control plane level …. I don't buy this argument. An implementation of a SMALTA-type router is, most likely, going to calculate a optimal FIB on it's RE/RP and *then* download it to the actual TCAM/SRAM/fwd'ing HW (line cards). Sorry, but this is not always the router's architecture, especially if you are talking about hardware-based forwarding. In software-based routers I could perhaps agree that the difference in the point of processing becomes not that different. I don't see that as any more challenging to implement vs. 
Simple-VA. See above. It is much, much easier and simpler to do it in BGP than to do it in any lower component. Especially since all you are after is compressing the BGP routes. If anything, both proposals are nearly identical in this regard -- both are (likely) either implemented at the point after either: a) BGP calculates a Loc-RIB and hands it off to a local Route Table Manager (RTM); Yes this is exactly what simple-va does. or, b) inside an RTM as it's computing a FIB that will get downloaded from the RE/RP to the FIB on the LC's. IMO, it's the size of the FIB in HW that will dictate at which point one chooses to put any change. IOW, since IGP size (in well-design networks) is typically very tiny, putting this code in the RTM is probably not worth it, in the grand scheme of things, in which case doing it at the BGP Loc-RIB - implementation-dependent RIB is more appropriate. So we agree, provided SMALTA does not require the full RIB as input - not size-wise but algorithm-wise. That's already much more than needed. With simple-va suppression you do not even need to send routes to the RIB and FIB at all if they are suppression-eligible. So not only is the FIB small, but also the RIB (both CPU- and memory-wise). So, the above raises a question, which I found confusing in the Simple-VA draft. Take the following from the 5th paragraph of Section 1 of the Simple-VA draft: ---snip--- Core routers in the ISP maintain the full DFRT in the FIB and RIB. Edge routers maintain the full DFRT in the BGP protocol RIB, but suppress certain routes from being installed in RIB and FIB tables. Edge routers install a default route to core routers, to ABRs which are installed on the POP to core boundary or to the ASBR routers. ---snip--- There sounds like there's a contradiction in the above. Specifically: 1) Edge routers maintain the full DFRT -- presumably, full DFRT equates to a full DFZ, correct? Yes. Sorry just reused some terms from main VA spec. 
2) but [Edge routers] suppress certain routes from being installed in the **RIB** and FIB tables; 3) If these are Edge routers (specifically, FSR's in the context of Simple-VA), how can one suppress routes in the RIB, yet be expected to pass a full DFZ view, in eBGP, to downstream CE's that /demand/ a full DFZ, (because those customers are multi-homed to several SP's simultaneously)? If I were to hazard a guess, you presumably mean that the BGP Loc-RIB still maintains a full-DFZ; however, the router is suppressing routes during the step of pushing routes from the BGP Loc-RIB to an implementation-specific/dependent RIB view (i.e.: the RIB containing routes from not only BGP, but also other routing sources as well, e.g.: OSPF, IS-IS, etc.) That is exactly it. BGP still has the full table. Only the RIB and FIB hold the necessary subset. Hrm, sounds an awful lot like SMALTA … see above :-) Is it a good or bad thing? I think this is a good thing :) Completely not.
Re: [j-nsp] Force IP traffic not to use the LSP path when enabling ISIS Traffic Engineering with Shortcuts
Peter, One way to accomplish what I think you are asking for is to use different BGP next hops for destinations you want to have natively IP-switched and those which you want to transport over an MPLS LSP (TE or LDP). Then you only construct MPLS LSPs to those next hops you would like to see label switched to. R.

> We are in the process of enabling traffic engineering with shortcuts for ISIS on an IP/MPLS based network. As a result of enabling ISIS traffic engineering with shortcuts, IP traffic will utilize the LSP paths (inet.3) for the forwarding decision. Is there a configuration feature so the IP traffic will continue to use inet.0 as we enable the ISIS traffic engineering feature? We observed a disruption in our IP traffic when we enabled the traffic engineering feature, possibly due to the change of the forwarding path to the LSP. Regards, Peter

___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] [c-nsp] general question on VRFs and FIBs...
Hi Derick,

> I previously blogged that a (totally hypothetical) multi-tenant network built entirely with PBR or FBF would not pass audit because of a lack of separate RIB and separate FIB structures for each tenant in the network. Why wouldn't this pass audit? OpenFlow is similar.

Well, I would like to observe that there may be an easy way to pass the audit both in the FBF Junos case as well as with OpenFlow. - For FBF you may easily configure on shipping boxes multiple VPLS instances which are as separate as VRFs. Then FBF can be controlled on a per-instance basis (including even identical filters with different actions). - For OpenFlow it is the same thing. An OpenFlow capable switch can support multiple OpenFlow instances. In fact each such instance can belong to a different administrative domain and can be controlled by quite different sets of OpenFlow controllers. IMHO it is again no worse than the VRF-like separation analogy. Best, R.

> All: I actually received quite a few responses off-list to this question. We have to deal with many different audit/compliance agencies, each with their own guidelines. One of their guidelines is that security zones should reside on physically separate switches. However, in an MPLS based environment they allow for VRF/VSI separation on the same physical device. The reason is that each instance has its own RIB and its own FIB structures. At least, this is what I've heard now from multiple auditors over the last 6 or 7 years while working for different companies. I'm questioning this in general because we are looking at OpenFlow. In particular, the question came up: are separate structures really necessary? What if the FIB lookup was entirely hash-based (source-port included) and each entry in the hash table had a mask-structure associated with it (for src/dst MAC and IPs)?
> I previously blogged that a (totally hypothetical) multi-tenant network built entirely with PBR or FBF would not pass audit because of a lack of separate RIB and separate FIB structures for each tenant in the network. Why wouldn't this pass audit? OpenFlow is similar. In this potential OpenFlow design there would still be separate VRFs on the controllers, but ultimately the forwarding would be compiled into this single hash table structure. So I'm questioning a basic assumption here: are separate FIB structures for each VPN required? What I am hearing is mainly ASIC/NPU/FPGA design/performance concerns. Robert expressed some concerns over one VPN potentially impacting other VPNs with something like route instability or table corruption of some kind... crashing was the word he used :-). I did spray a few lists with this question, but they are lists where the right people generally lurk... Derick Winkworth CCIE #15672 (RS, SP), JNCIE-M #721 http://packetpushers.net/author/dwinkworth

From: Robert Raszuk rob...@raszuk.net To: Gert Doering g...@greenie.muc.de Cc: Derick Winkworth dwinkwo...@att.net; juniper-nsp@puck.nether.net; cisco-...@puck.nether.net Sent: Tuesday, September 27, 2011 3:58 AM Subject: Re: [c-nsp] general question on VRFs and FIBs...

Hi Gert,

> address first, VRF second.

Well, no one sane would do that ;) I believe what Derick was asking was why not have an incoming_interface/table_id + prefix lookup. And while in software each VRF has separate RIB and FIB data structures, for reasons already discussed on the L3VPN IETF mailing list, in actual hardware on a given line card this may no longer be the case. Also a side note: most vendors have still not implemented per-interface/per-vrf MPLS labels (even in the control plane), so all labels are looked up in a global table with just additional, essentially control-plane-driven tweaks to protect from malicious attacks in the case of CSC/Inter-AS. Cheers, R.
Hi,

On Mon, Sep 26, 2011 at 01:18:05PM -0700, Derick Winkworth wrote:

> I'm trying to find an archived discussion or presentation discussing why exactly the industry generally settled on having a separate FIB table for each VRF vs having one FIB table with a column that identifies the VRF instance? I'm not finding it, but I'm guessing its because of performance issues?

Lookup would fail for overlapping address space if you look up address first, VRF second. How do you find the right entry if you have 10.0.0.0/8 vrf red, 10.0.0.0/16 vrf green, 10.0.1.0/24 vrf blue and try to look up 10.0.0.1 in vrf red? You'll find the /16 entry, which is tagged vrf green. Alternatively, you'd need to explode the /8 entry for vrf red if *another* VRF adds a more specific inside that /8. gert

___ cisco-nsp mailing list cisco-...@puck.nether.net https://puck.nether.net/mailman/listinfo/cisco-nsp archive at http://puck.nether.net/pipermail/cisco-nsp/ ___ juniper-nsp mailing list juniper-nsp@puck.nether.net
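Gert's overlap example can be reproduced in a few lines. This is only an illustration (prefixes and VRF names taken from his mail): a single merged table with a VRF column gives the wrong answer under plain longest-prefix match, while a VRF-first lookup does not.

```python
import ipaddress

# One merged table: longest-prefix match ignores the VRF column.
merged = [
    (ipaddress.ip_network("10.0.0.0/8"), "red"),
    (ipaddress.ip_network("10.0.0.0/16"), "green"),
    (ipaddress.ip_network("10.0.1.0/24"), "blue"),
]

def lookup_merged(addr):
    addr = ipaddress.ip_address(addr)
    matches = [(net, vrf) for net, vrf in merged if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)  # longest match wins

# A packet arriving on a vrf-red interface, destined to 10.0.0.1:
net, vrf = lookup_merged("10.0.0.1")
print(net, vrf)  # 10.0.0.0/16 green -- wrong VRF: green's /16 shadows red's /8

# Per-VRF tables: select the table first, then do the prefix lookup.
tables = {
    "red": [ipaddress.ip_network("10.0.0.0/8")],
    "green": [ipaddress.ip_network("10.0.0.0/16")],
    "blue": [ipaddress.ip_network("10.0.1.0/24")],
}

def lookup_vrf_first(vrf, addr):
    addr = ipaddress.ip_address(addr)
    matches = [net for net in tables[vrf] if addr in net]
    return max(matches, key=lambda n: n.prefixlen)

print(lookup_vrf_first("red", "10.0.0.1"))  # 10.0.0.0/8 -- correct
```

The merged lookup returns green's /16 for a packet that belongs to vrf red, which is exactly why table selection has to come before the prefix lookup.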
Re: [j-nsp] [c-nsp] general question on VRFs and FIBs...
Hi Keegan,

> over another. However, if the vrf's all have separate tables in the real world then that should require the table lookup to come before the prefix lookup. If not there would be no way to figure out which fib to search.

For packets coming from a customer (CE) there is no need for any additional lookup, as the switching vectors of the interfaces (logical/physical) are already locked to a given VRF. /* One exception to the above is Policy Based VRF selection, where you are choosing the VRF dynamically based on a preconfigured policy or even a remote RADIUS lookup. In this configuration interfaces are not bound to any VRF. */ For packets coming from the core to a PE, the VPN label directly points to the right VRF (per-vrf label/aggregate label case). For per-CE or per-prefix labels, no IP lookup is necessary in the VRF at all for packets going to the CE. Thx, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
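The core-to-PE case described above can be sketched as a plain table index (all names and label values below are made up for illustration): for the aggregate/per-vrf label case the VPN label alone selects the VRF, and only then does an IP lookup happen in that VRF's FIB.

```python
# Hypothetical PE state: each VRF advertised one aggregate VPN label.
label_to_vrf = {100001: "CUST-A", 100002: "CUST-B"}

vrf_fib = {
    "CUST-A": {"10.1.1.1": "ge-0/0/1.10"},
    "CUST-B": {"10.1.1.1": "ge-0/0/2.20"},  # overlapping address, no conflict
}

def handle_from_core(vpn_label, dst_ip, vrf_fib):
    """Packet from the core: the VPN label selects the VRF directly;
    the IP lookup then happens inside that VRF's FIB only."""
    vrf = label_to_vrf[vpn_label]  # direct index, no IP lookup needed yet
    return vrf, vrf_fib[vrf][dst_ip]

print(handle_from_core(100001, "10.1.1.1", vrf_fib))  # ('CUST-A', 'ge-0/0/1.10')
```

With per-CE or per-prefix labels the label would map straight to an outgoing interface/next hop, and even the second step (the per-VRF IP lookup) disappears.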
Re: [j-nsp] full table?
Hi Keegan,

> Is it always necessary to take in a full table? Why or why not? In light of the Saudi Telekom fiasco I'm curious what others think. This question is understandably subjective. We have datacenters with no more than three upstreams. We would obviously have to have a few copies of the table for customers that want to receive it from us, but I'm curious if it is still necessary to have a full table advertised from every peering. Several ISP's will allow you to filter everything longer than say /20 and then receive a default. Just curious what others think and if anyone is doing this.

I am just curious if your question is driven by the issue of handling a full table in the control plane, or if the issue is with the RIB and FIB. If the issue is with a full-table control plane on the edge, then one could explore eBGP multihop advertisements from some box/reflector behind the edge. Just FYI, a full table of 450K nets and 800K paths (2:1 ratio) should consume around 250 MB of RAM (including the BGP idle footprint). If the issue is with boxes actually melting in the RIB and FIB, then I may have an easy, automated and operationally friendly solution for that. In my ex-cisco life I invented simple-va which, with a single knob, installs from BGP into the RIB and FIB only what is necessary. 100% correct selection of what is necessary is assured. Details can be found here: draft-ietf-grow-simple-va-04. So far quite a few operators have liked it! Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
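For illustration only (this is a simplification, not the exact algorithm of draft-ietf-grow-simple-va-04), the suppression idea can be sketched as follows: with a default route pointing at the core, any BGP best path whose next hop matches the default's next hop forwards identically whether installed or not, so it need not be pushed into the RIB/FIB. All prefixes and next-hop names below are made up.

```python
# Sketch of simple-va style FIB suppression (hypothetical data): routes
# that would forward the same way as the default route are not installed.
DEFAULT_NEXTHOP = "core-rtr-1"

bgp_best_paths = {
    "203.0.113.0/24": "core-rtr-1",     # reached via the default anyway
    "198.51.100.0/24": "core-rtr-1",    # likewise suppressible
    "192.0.2.0/24": "customer-ce-7",    # local exit, must stay installed
}

fib = {"0.0.0.0/0": DEFAULT_NEXTHOP}
for prefix, nh in bgp_best_paths.items():
    if nh != DEFAULT_NEXTHOP:           # suppression-eligibility check
        fib[prefix] = nh

print(len(fib))  # 2 entries instead of 4: the default plus the local exit
```

BGP still holds all three paths, so eBGP customers downstream can receive a full view; only the RIB/FIB installation step is filtered.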
Re: [j-nsp] 32-Bit JunOS on the 64-Bit Routing Engines
Hi Keegan,

> They are saying that the new 16G RE's can handle 250M routes. How is this possible if none of the daemons are 64bit?

The only real practical scaling use case I am aware of in the range of 5M routes today is for vpnv4 route reflectors. Another possible scaling point would perhaps be poorly implemented IX route server functionality, where you would need to copy the entire table to each RS peer in order to execute per-peer policy. So indeed, if you have 500K v4 routes and 500 peers, it would result in 250M prefixes to be handled. Other than that, just from a routing point of view, I am not sure what the practical use of such an RE or 250M routes on a real router would be. I think the control plane can scale, and the 64-bit routing stack migration is already in progress (or even completed and shipping for a few years on some other platforms), but forwarding I am afraid is far from that range. So you are left there using either a simple-va like approach, going back to good old caching, or worse, process switching on the RE ;-) Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
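The route-server arithmetic above is easy to check: with a naive per-peer-policy implementation, each peer session gets its own Adj-RIB-Out copy of the whole table.

```python
v4_routes = 500_000
rs_peers = 500

# Naive per-peer policy: one full Adj-RIB-Out copy of the table per peer.
total_paths = v4_routes * rs_peers
print(total_paths)  # 250000000, i.e. the 250M figure the vendor quotes
```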
Re: [j-nsp] 32-Bit JunOS on the 64-Bit Routing Engines
Hi Thomas,

I agree with your reasoning - it is very practical :) However I am not sure what exactly you mean by software. AFAIK the BSD kernel has been 64-bit enabled and capable for years, so from that point of view it is mature. Various processes are another matter: RPD, for example, last time I checked was still 32-bit on most platforms other than JCS. I am not sure what the status is of porting the other processes to be 64-bit enabled, and honestly I am not sure it is needed across the board. So I would recommend checking with your SE whether the 64-bit Junos you are getting to run on the 64-bit RE really has all processes ported first. Maybe you will find out that it does not. Just looking below, it seems that jinstall is failing because it detects a CPU mismatch: i386 vs {amd64}. To fix that I think the best way would be to talk to friends in Juniper to recompile/publish Junos for your platform/RE. Cheers, R.

> Yeah, that is clear - my original point is: I do not trust the 64bit software - I have more faith in the 32bit software. As per now, it has equal cost to order an MX960 with 32b-4G-RE or 64b-16G-RE. So of course I would order the bigger RE, but only if I can use the matured software... Tom

> On 24.08.2011 14:19, Keegan Holley wrote: Interestingly enough my SE told us this is possible, at least on our MX480 and MX960 boxes. Our lab boxes are otherwise engaged at the moment so we haven't tested. One note regarding general computing though: the processor can only address 4G (3.8 or so actually) of RAM with a 32-bit word size. So even if you get the REs running the 32-bit code they will only register 4G of the precious 16G. Sent from my iPhone

> On Aug 24, 2011, at 3:12 AM, Thomas Eichhorn t...@te3networks.de wrote: Hi all, I just discussed the following with my SE: I wanted to get new 64Bit REs with some new gear, but run the 32-Bit JunOS on them - he denied that this is possible.
I tried to research that, but have not yet found anything in the docs - does anybody here have some clue about that? As the REs are 'only' standard PCs, I do not see any reason for them not to be capable of running 'legacy' 32-bit JunOS. I would be really glad if someone has some clue about that and could unearth the truth. Thanks, Tom

___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] load balancing in Route reflector scenario
Hi Harry,

> default, differences in route preference cause a JUNI to prefer an IGP route while ios prefer the bgp routes over IGP.

Let's make a clear distinction between preferring an eBGP route versus an iBGP route. Talking CSCO here, the eBGP admin distance is, as you say, 20, while for iBGP, as even the URL you provided says, it is 200. So, keeping in mind that hot potato routing is usually the desired behaviour, preferring the EBGP-learned path for a given prefix is highly recommended. If you say that JUNI is to prefer an IGP route over a BGP one, I am sure you must be referring to IBGP and not EBGP, but this is exactly the same in both vendors.

> W/o this knob replacing a cisco with a juniper can result in previously advertised bgp routes no longer being advertised.

I can assure you that this was not the main intention of this knob :) Cheers, R.

> I always thought that advertise-inactive was to make a juniper act like a cisco with regard to BGP route announcements, when, by default, differences in route preference cause a JUNI to prefer an IGP route while ios prefer the bgp routes over IGP. In junos, only the active route is readvertised/subject to export policy. With advertise-inactive you can make a juniper router, whose active route is an IGP route, advertise into BGP the best bgp path, which here is inactive due to the igp route being preferred. W/o this knob replacing a cisco with a juniper can result in previously advertised bgp routes no longer being advertised. From: http://www.cisco.com/en/US/tech/tk365/technologies_tech_note09186a0080094823.shtml eBGP 20 . . . OSPF 110 From: http://www.juniper.net/techpubs/software/junos/junos64/swconfig64-routing/html/protocols-overview4.html OSPF internal route 10 IS-IS Level 1 internal route 15 . . . BGP 170 HTHS.
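Putting the defaults from the two vendor documents cited in the thread side by side makes the distinction concrete; in both schemes the lowest value wins when the same prefix is offered by several protocols.

```python
# Default administrative distances / route preferences (lower wins).
# Values as quoted from the Cisco and Juniper docs in this thread.
ios = {"ebgp": 20, "ospf": 110, "ibgp": 200}
junos = {"ospf_internal": 10, "isis_l1_internal": 15, "bgp": 170}

def active(candidates, table):
    """Pick the active route among candidate protocols for one prefix."""
    return min(candidates, key=lambda proto: table[proto])

print(active(["ebgp", "ospf"], ios))            # ebgp -- IOS prefers eBGP over OSPF
print(active(["ibgp", "ospf"], ios))            # ospf -- but iBGP loses to OSPF
print(active(["bgp", "ospf_internal"], junos))  # ospf_internal -- Junos IGP beats any BGP
```

This shows Robert's point: for iBGP vs IGP both vendors agree; the visible difference is only in the eBGP case, and advertise-inactive exists for the situation where the IGP route, not the BGP one, ends up active.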
-Original Message- From: juniper-nsp-boun...@puck.nether.net [mailto:juniper-nsp-boun...@puck.nether.net] On Behalf Of Keegan Holley Sent: Wednesday, August 10, 2011 2:48 PM To: rob...@raszuk.net Cc: juniper-nsp@puck.nether.net Subject: Re: [j-nsp] load balancing in Route reflector scenario

2011/8/10 Robert Raszuk rob...@raszuk.net

Hi Keegan,

> I think the advertise inactive knob turns that off, but I don't know for sure because I've never tried it. I know it's not supported on cisco routers. The reason for it is the size of the BGP table. So if the table is 400k routes and you have 5 different ISP's and you advertise every route that would be 2M routes in the table. Since BGP doesn't allow multiple versions of the same route in the routing table (separate from the BGP table where incoming routes are stored) you would still only use the original 400K; the other 1.8M routes would just go unused unless you manipulated them somehow.

Advertise-inactive is not about what gets advertised - it is about whether the best path is advertised or not. And it is decided based on a check of whether the BGP path to be advertised is installed in the RIB/FIB or not.

Oh I see. I have never used that command, so thanks. Most of the above example was what would happen if BGP advertised everything it learned instead of just the best path or the path in the routing table, btw.

By default Junos and IOS-XR advertise only those best paths in BGP which actually are installed into forwarding. The advertise-inactive knob will override it.

Wouldn't this lead to traffic being blackholed? If all the routes for a given destination are inactive, would this still cause BGP to advertise a route for them?

IOS classic/XE (for historical reasons) advertises all best paths from the BGP table, and to enforce it not to advertise what has not been installed into the RIB/FIB there is a knob called suppress-inactive. Cheers, R.
___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] load balancing in Route reflector scenario
Hi Keegan,

Nope ... there can be other producers of the same route (OSPF, ISIS, STATIC) which will be in the RIB. If not, there is always the next step - a less specific route to be used.

> I suppose there's a use for this or the feature wouldn't exist, but why would you have a route in the IGP that's not in BGP

no no ... this entire discussion is about the case where the identical prefix is in both producers ... for example in OSPF and in eBGP. If it is only in one, none of what we are discussing here applies.

> but still needs to receive traffic from routers running an IGP and BGP but not learning the route from the IGP.

It is, as said, the other way around.

> Why not just import the route(s) into BGP. It just seems like this command may cause unexpected behavior to add features that can be configured in a more graceful manner.

Very simple example: some destination is reachable over EBGP ... the same route is advertised into the AS via IBGP. All good. Now for some reason an operator is ordered to redirect all traffic going to dst X via some screening box. So on said ASBR, which normally would just switch the packets out, the NOC guy inserts a static route into the RIB saying that everything with dst X goes to this box. Then effectively this would cause BGP to stop advertising it, as the RIB active route is now from static and not BGP. That's just one use case of traffic redirection by a control plane/routing tweak - yet not impacting the BGP operation. Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] load balancing in Route reflector scenario
Hi Keegan,

> I thought advertise inactive just configured the routers to advertise the entire BGP RIB instead of only advertising the routes in the routing-table.

Nope. By default BGP advertises a single best path. Any subsequent advertisement will be an implicit withdraw.

Hi Humair, Per-RR different local policy is a valid workaround in the case as reported by biwa. But care must be taken that the network either supports end to end encapsulation (example: MPLS) or that all routers on the way will get the same paths.

Hi Biwa,

1. The easiest option is to get rid of the RR ... just do a full IBGP mesh. I know large networks doing it today :)

2. The other option is to put the RR in the data path and enable multipath on it. The end effect will be the same as enabling it on PE1.

3. To signal both paths to PE1 from the RRs you need either add-paths or diverse-path. Add-paths will require support on PE1 while diverse-path will not. And depending on the choice of RR, diverse-path is available today in some implementations :-)

4. Another way is to do a ghost loopback (aka anycast next hop self) on PE2 and PE3 and let the IGP load-balance across both PEs. Works well if you have symmetry of the IGP and routes on both PE2 and PE3.

Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] load balancing in Route reflector scenario
Hi Keegan,

> I think the advertise inactive knob turns that off, but I don't know for sure because I've never tried it. I know it's not supported on cisco routers. The reason for it is the size of the BGP table. So if the table is 400k routes and you have 5 different ISP's and you advertise every route that would be 2M routes in the table. Since BGP doesn't allow multiple versions of the same route in the routing table (separate from the BGP table where incoming routes are stored) you would still only use the original 400K; the other 1.8M routes would just go unused unless you manipulated them somehow.

Advertise-inactive is not about what gets advertised - it is about whether the best path is advertised or not. And it is decided based on a check of whether the BGP path to be advertised is installed in the RIB/FIB or not. By default Junos and IOS-XR advertise only those best paths in BGP which actually are installed into forwarding. The advertise-inactive knob will override it. IOS classic/XE (for historical reasons) advertises all best paths from the BGP table, and to enforce it not to advertise what has not been installed into the RIB/FIB there is a knob called suppress-inactive. Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
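The default Junos/IOS-XR behaviour described above boils down to a single condition. Here is a hedged sketch (function and argument names are mine, not any vendor API):

```python
def should_advertise(bgp_best, rib_active_proto, advertise_inactive=False):
    """Sketch of the default Junos/IOS-XR rule: readvertise the BGP best
    path only if it is also the active RIB route; the advertise-inactive
    knob overrides that check."""
    if bgp_best is None:
        return False  # no best path selected at all, nothing to advertise
    return advertise_inactive or rib_active_proto == "bgp"

# The BGP best path exists, but a static route won in the RIB:
print(should_advertise("198.51.100.0/24", "static"))        # False
print(should_advertise("198.51.100.0/24", "static", True))  # True
print(should_advertise("198.51.100.0/24", "bgp"))           # True
```

IOS classic/XE inverts the default: it advertises regardless, and suppress-inactive adds the `rib_active_proto == "bgp"` check back in.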
Re: [j-nsp] load balancing in Route reflector scenario
Hi Keegan,

> > By default Junos and IOS-XR advertise only those best paths in BGP which actually are installed into forwarding. The advertise-inactive knob will override it.

> Wouldn't this lead to traffic being blackholed? If all the routes for a given destination are inactive, would this still cause BGP to advertise a route for them?

Nope ... there can be other producers of the same route (OSPF, ISIS, STATIC) which will be in the RIB. If not, there is always the next step - a less specific route to be used. So there are some valid cases where you may want to attract all traffic via BGP, but switch it according to your own policy and not the BGP decision. Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] DOS attack?
Hi Matthias,

> I wonder now, which is the event that triggered this behaviour? The number of ssh-logins at that time or this unexpected EOF?

I would with a good deal of assurance conclude that the cause was the ssh-login attack, which apparently starved the poor box to its memory limits. When even your kernel spins a panic message on low memory due to such an attack, the control plane can exhibit quite unexpected behaviour. In my opinion the unexpected BGP EOF message is just a consequence of this. The advice would be to: * open a case with JTAC to find out why subsequent ssh-logins cause a memory leak * tighten rate-limiting for the ssh logins as far as possible Cheers, R.

> Hi! Last night we had a mysterious behaviour on our router. On a BGP connection with Cogent we received an unexpected EOF. There were also a great number of SSH logins (we do not have FW rules in place, but we have a rate limit). Shortly after, the router complained about low memory and a few BGP sessions dropped (obviously the ones which are memory exhausting). I wonder now, which is the event that triggered this behaviour? The number of ssh-logins at that time or this unexpected EOF? The log of that time: ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
Re: [j-nsp] MPLS Route Distinguisher Question
Hi William,

> The Route distinguisher is the VPN instance identifier correct?

Incorrect. The only function of the RD is to make the routes unique. It has no meaning beyond that. The reason is that between VPNs you may have overlapping address space, and if so vpnv4 routes would get mixed up in the Service Provider core.

> So it is unique per VPN in the network.

It can be unique per VPN (the same RD for a given VPN on each PE), but it can also be unique per VPN per VRF, resulting in a given VPN carrying as many RDs as it has VRFs with sites attached.

> And the route target/vrf target is a value that you can assign to prefixes when advertised from the local PE router to limit which remote PE routers in the VPN will accept the prefixes?

It does limit not only which remote PE routers get such prefixes (only in the case where you use rt-constrain), but above all it is crucial for deciding which remote virtual routing and forwarding instances will import such a prefix. RTs build VPNs, not RDs. (Because PEs also do automatic inbound filtering based on their own RTs, regardless of rt-constrain being used or not, it may seem, just as you have observed, that RTs limit which routers take which vpnv4 prefixes. But this is just an optimization :). Cheers, R. ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp
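The split of roles can be sketched in a few lines (RD and RT values below are made up for illustration): the RD only disambiguates otherwise-identical customer prefixes in the vpnv4 table, while the RT communities drive import into VRFs.

```python
# Sketch: the RD keeps overlapping customer prefixes unique in the core;
# RTs decide which VRFs import a route. All values are hypothetical.
vpnv4_table = {
    ("65000:1", "10.0.0.0/24"): {"rts": {"target:65000:100"}},
    ("65000:2", "10.0.0.0/24"): {"rts": {"target:65000:200"}},  # same prefix, distinct key
}

vrf_import = {
    "CUST-A": {"target:65000:100"},
    "CUST-B": {"target:65000:200"},
}

def imported_into(vrf):
    """Import into a VRF happens on RT intersection; the RD plays no part."""
    want = vrf_import[vrf]
    return [pfx for (rd, pfx), attrs in vpnv4_table.items() if attrs["rts"] & want]

print(len(vpnv4_table))          # 2 -- the RDs kept both 10.0.0.0/24 routes distinct
print(imported_into("CUST-A"))   # ['10.0.0.0/24'] -- selected purely by RT match
```

If both routes carried the same RD, the second would overwrite the first in the table, which is exactly the mixing-up in the core that the RD prevents.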
Re: [j-nsp] EX4200 9.4R2.9 process crash on previously valid config
Hi Richard,

> Honestly, I think you should just give up now and accept the fact that Juniper is the new Cisco.

Excuse me? Cisco, in the vast majority of its new products, is much better now. Yes, historically there was an issue with IOS, but AFAIK that has also been fixed now. * Look at the highly dedicated XR team with excellent and really technical leadership which knows exactly what they are doing (Hint: just compare Prefix Independent Convergence capability); * Look at the top-class-in-the-industry ASR family of routers (Hint: ask your Juniper rep about the number of BGP routes they can deal with on their top-of-the-line router and then compare with the basic ASR capability); * Look at their procket OS based storage product line ... Or, in the end, look at which company offers you no technology religion about which (if any) encapsulation you as the customer want to choose for your services. One company may offer you a wide choice of GRE, L2TPv3, IPinIP or MPLS encapsulations, all at line rate, where all services you want to offer can run equally well on any of them, while perhaps the other one may just lock you into the only four-letter encap type it is capable of offering.

> It isn't 1999 any more, and Juniper isn't the same company. Stop expecting superior products, that isn't the business model any more. :(

You couldn't have said it any better! Cheers, R. PS: On the topic of the future of the EX-series I will at this point refrain from commenting on the public list :).

On Thu, Apr 02, 2009 at 01:48:26PM -0400, Jeff S Wheeler wrote:

> Dearest Juniper, please pay more attention to validating configs in newer JUNOS vs configs that are allowed on older EX-series software.

Honestly, I think you should just give up now and accept the fact that Juniper is the new Cisco. If you want to use EX you should probably just have a lab box that you test your configuration on before deploying, because they certainly aren't making any effort to QA test their code before shipping it.
Besides, you should consider this progress. In older EX code you could configure MTUs that didn't actually route in hardware (but that routing protocols didn't know about, thus causing you to drop LSAs), or uRPF, which wasn't officially supported and caused the box to immediately crash in an endless loop until you disabled it on console. It isn't 1999 any more, and Juniper isn't the same company. Stop expecting superior products, that isn't the business model any more. :( ___ juniper-nsp mailing list juniper-nsp@puck.nether.net https://puck.nether.net/mailman/listinfo/juniper-nsp