Re: "Tactical" /24 announcements

2021-08-14 Thread Jeff Tantsura
Every major vendor has at some point implemented RIB->FIB (really 
BGP->RIB->FIB) filtering; on Redback/Ericsson routers we did it around 
2013/2014 (@Jakob Heitz ;-)).
Route compression is a more complex topic: aggregating is not difficult, 
disaggregating efficiently when routes change is.
MS Research published a white paper in the early 2010s, and in the late 2010s 
Volta implemented route aggregation quite effectively on top of the FRR BGP 
stack (a full BGP table in Trident2-class silicon). To my memory, Spotify did a 
similar implementation with Arista.
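
A minimal sketch of the easy half (aggregation), assuming a toy FIB held as a 
dict of prefix -> next hop. The function name and the table layout are 
hypothetical, not any vendor's or FRR's implementation. It only merges sibling 
prefixes that share a next hop and does nothing about the hard part, 
disaggregating incrementally when a route changes.

```python
import ipaddress

def aggregate_fib(fib):
    """Merge sibling prefixes that share a next hop (toy FIB: prefix -> next hop)."""
    table = {ipaddress.ip_network(p): nh for p, nh in fib.items()}
    changed = True
    while changed:
        changed = False
        # Longest prefixes first, so that merges can cascade upwards.
        for net in sorted(table, key=lambda n: n.prefixlen, reverse=True):
            if net not in table or net.prefixlen == 0:
                continue
            parent = net.supernet()
            left, right = parent.subnets()
            # Merge only when both halves exist, agree on the next hop,
            # and the parent is not already present as its own route.
            if (left in table and right in table
                    and table[left] == table[right]
                    and parent not in table):
                next_hop = table.pop(left)
                table.pop(right)
                table[parent] = next_hop
                changed = True
    return {str(n): nh for n, nh in table.items()}
```

Longest-prefix match is preserved because a merged parent points to the same 
next hop as both halves, and any more-specific routes below it stay in the table.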

Most importantly - these days vendors (at least Cisco and Juniper) expose 
service-layer APIs that let you run best-path computation off box and reinject 
the results back into the RIB, where the computed routes are still subject to 
best-route selection and hence reasonably safe wrt loops.
So if you feel adventurous - write your own compression code, the toolset is there.

Cheers,
Jeff

> On Aug 14, 2021, at 06:21, Masataka Ohta wrote:
> 
> Baldur Norddahl wrote:
> 
>> For all the stub networks out there we should be able to aggressively
>> filter routes without much harm.
> 
> Stub networks, which, by definition, do not have transit traffic
> over them, can not filter routes for transit traffic at all.
> 
> But, if both of two stub networks with address ranges of
> 131.112.32.0/24 and 131.112.33.0/24 advertise 131.112.32.0/23,
> the result will be disastrous for the networks.
> 
> As such, even stub networks should advertise exact address
> ranges of them.
> 
>Masataka Ohta
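
The covering-prefix problem in the quoted message can be checked with Python's 
standard ipaddress module. This is only an illustration using the two /24s and 
the /23 from the message; the "stub A"/"stub B" labels and the helper name are 
made up for the example.

```python
import ipaddress

aggregate = ipaddress.ip_network("131.112.32.0/23")
stubs = {
    "stub A": ipaddress.ip_network("131.112.32.0/24"),
    "stub B": ipaddress.ip_network("131.112.33.0/24"),
}

def overclaimed(announced, owned):
    """Address space covered by `announced` that the announcer does not own."""
    if announced == owned:
        return []
    return sorted(announced.address_exclude(owned))

for name, owned in stubs.items():
    extra = ", ".join(str(n) for n in overclaimed(aggregate, owned))
    print(f"{name} owns {owned}, but announcing {aggregate} also claims {extra}")
```

Each stub announcing the /23 claims the other stub's /24, so traffic for one 
network can be delivered to the other, which has no way to forward it.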


Re: Facebook post-mortems...

2021-10-04 Thread Jeff Tantsura
129.134.30.0/23, 129.134.30.0/24, 129.134.31.0/24. The specific routes covering 
all 4 nameservers (a-d) were withdrawn from all FB peering at approximately 
15:40 UTC.

Cheers,
Jeff

> On Oct 4, 2021, at 22:45, William Herrin  wrote:
> 
> On Mon, Oct 4, 2021 at 6:15 PM Michael Thomas  wrote:
>> They have a monkey patch subsystem. Lol.
> 
> Yes, actually, they do. They use Chef extensively to configure
> operating systems. Chef is written in Ruby. Ruby has something called
> Monkey Patches. This is where at an arbitrary location in the code you
> re-open an object defined elsewhere and change its methods.
> 
> Chef doesn't always do the right thing. You tell Chef to remove an RPM
> and it does. Even if it has to remove half the operating system to
> satisfy the dependencies. If you want it to do something reasonable,
> say throw an error because you didn't actually tell it to remove half
> the operating system, you have a choice: spin up a fork of chef with a
> couple patches to the chef-rpm interaction or just monkey-patch it in
> one of your chef recipes.
> 
> Regards,
> Bill Herrin
> 
> -- 
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/


Re: Validating multi-path in production?

2021-11-12 Thread Jeff Tantsura
LAG - Micro BFD (RFC7130) provides per-constituent liveness. MLAG is much 
more complicated (there’s a proposal in IETF but it is not progressing), so LACP 
is pretty much the only option.
ECMP could use good old single-hop BFD per pair.
Practically - if you introduce enough flows with one of the hash keys 
monotonically changing, eventually you’d exercise every available path. 
On its own that would not help for end2end testing, so it is usually integrated 
with a form of sFlow/NetFlow to provide “proof of transit”.
Inband telemetry (choose your poison) does provide the basic IDs of the devices 
a packet has traversed, as well as POT in some cases.
Finally - there are public Microsoft presentations on how we use IPinIP encap to 
traverse a particular path on wide-radix ECMP fabrics.
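
A rough sketch of the "monotonically changing hash key" idea, assuming the 
source port is the key being stepped. The hash here is SHA-256 purely as a 
stand-in: real ASICs use vendor-specific functions and seeds, so the function 
names, addresses and ports below are all hypothetical. In a real test the path 
taken by each probe would have to be learned from sFlow/NetFlow or telemetry, 
not computed locally.

```python
import hashlib

def ecmp_pick(src, dst, proto, sport, dport, n_paths):
    """Toy ECMP hash: SHA-256 of the 5-tuple, modulo the number of paths."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % n_paths

def probe_ports(n_paths, first_sport=49152, limit=16384):
    """Step the source port by one until every path index has been selected.

    Returns {path index: first source port that mapped to it}.
    """
    first_hit = {}
    for sport in range(first_sport, first_sport + limit):
        path = ecmp_pick("192.0.2.1", "198.51.100.1", 17, sport, 33434, n_paths)
        first_hit.setdefault(path, sport)
        if len(first_hit) == n_paths:
            return first_hit
    raise RuntimeError("not every path was exercised; try another hash key")

if __name__ == "__main__":
    for path, sport in sorted(probe_ports(8).items()):
        print(f"path {path}: first reached with source port {sport}")
```

The resulting port list can then be reused as a fixed probe set, one flow per 
path, for before/after comparisons around a change.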

Cheers,
Jeff

> On Nov 12, 2021, at 07:55, Adam Thompson  wrote:
> 
> 
> Hello all.
> Over time, we've run into occurrences of both bugs and human error, both in 
> our own gear and in our partner networks' gear, specifically affecting 
> multi-path forwarding, at pretty much all layers: Multi-chassis LAG, ECMP, 
> and BGP MP.  (Yes, I am a corner-case magnet.  Lucky me.)
> 
> Some of these issues were fairly obvious when they happened, but some were 
> really hard to pin down.
> 
> We've found that typical network monitoring tools (Observium & Smokeping, not 
> to mention plain old ping and traceroute) can't really detect a 
> hashing-related or multi-path-related problem: either the packets get through 
> or they don't.
> 
> Can anyone recommend either tools or techniques to validate that multi-path 
> forwarding either is, or isn't, working correctly in a production network?  
> I'm looking to add something to our test suite for when we make changes to 
> critical network gear.  Almost all the scenarios I want to test only involve 
> two paths, if that helps.
> 
> The best I've come up with so far is to have two test systems (typically VMs) 
> that use adjacent IP addresses and adjacent MAC addresses, and test both 
> inbound and outbound to/from those, blindly trusting/hoping that hashing 
> algorithms will probably exercise both paths.
> 
> Some of the problems we've seen show that merely looking at interface 
> counters is insufficient, so I'm trying to find an explicit proof, not 
> implicit.
> 
> Any suggestions?  Surely other vendors and/or admins have screwed this up in 
> subtle ways enough times that this knowledge exists?  (My Google-fu is 
> usually pretty good, but I'm striking out - maybe I'm using the wrong terms.)
> 
> -Adam
> 
> Adam Thompson
> Consultant, Infrastructure Services
> 
> 100 - 135 Innovation Drive
> Winnipeg, MB, R3T 6A8
> (204) 977-6824 or 1-800-430-6404 (MB only)
> athomp...@merlin.mb.ca
> www.merlin.mb.ca


Re: Incrementally deployable secure Internet routing: operator survey

2021-12-17 Thread Jeff Tantsura
Adrian,

//Speaking as RTGWG co-chair

As communicated to SCION proponents before, a detailed presentation at IETF 
RTGWG would be a good starting point.
Please consider doing so at the upcoming IETF 113.
The best way is to subscribe to the rtgwg mailing list and respond to the chairs’ 
email request for presentations. You may also want to respond to 
comments/questions after the presentation; being subscribed would facilitate 
that.
Usually we’d prefer a draft before allowing a presentation; however, for the 
intro (unless you would actually go ahead and write an architecture draft), we’d 
be ok with just a presentation.

Please let me know if you have got any questions.


Cheers,
Jeff

> On Dec 17, 2021, at 12:27, Matt Harris  wrote:
> 
> 
>> On Fri, Dec 17, 2021 at 12:51 PM Adrian Perrig  wrote:
> 
>> Dear Nanog, 
>> 
>> Knowing how challenging it is to apply new technologies to current networks, 
>> in a collaboration between ETH, Princeton University, and University of 
>> Virginia, we constructed a system that provides security benefits for 
>> current Internet users while requiring minimal changes to networks. Our 
>> design can be built on top of the existing Internet to prevent routing 
>> attacks that can compromise availability and cause detrimental impacts on 
>> critical infrastructure – even given a low adoption rate. This provides 
>> benefits over other proposed approaches such as RPKI that only protects a 
>> route’s origin first AS, or BGPsec that requires widespread adoption and 
>> significant infrastructure upgrades.
>> 
>> Our architecture, called Secure Backbone AS (SBAS), allows clients to 
>> benefit from emerging secure routing deployments like SCION by tunneling 
>> into a secure infrastructure. SBAS provides substantial routing security 
>> improvements when retrofitted to the current Internet. It also provides 
>> benefits even to non-participating networks and endpoints when communicating 
>> with an SBAS-protected entity.
>> 
>> Our ultimate aim is to develop and deploy SBAS beyond an experimental scope. 
>> We have designed a survey to capture the impressions of the network operator 
>> community on the feasibility and viability of our design. The survey is 
>> anonymous and takes about 10 minutes to complete, including watching a brief 
>> 3-minute introductory video. 
>> 
>> https://docs.google.com/forms/d/e/1FAIpQLSc4VCkqd7i88y0CbJ31B7tVXyxBlhEy_zsYZByx6tsKAE7ROg/viewform?usp=pp_url&entry.549791324=NANOG+mailing+list
>> 
>> We thank you for helping inform our further work on this project. We will be 
>> happy to share the results with the community.
>> 
>> With kind regards
>>   Prateek Mittal, Adrian Perrig, Yixin Sun
> 
> Adrian,
> After viewing the video you included, I'm still not sure what SCION is or how 
> it works (as best I can tell, a bunch of folks get together, share an AS 
> border, and just do private AS peering with one another inside, then the 
> shared AS border does the internet advertising of whatever public networks 
> they wish?), but it sounds like your proposed monolithic new exercise 
> wouldn't offer much beyond what could be done by allowing folks to get a 
> default route VPN to a provider that does strict AS border RPKI ROV already. 
> Can you describe how this would be better or stronger protection from any 
> given attack than that, in a meaningful enough way as to make it worth 
> potentially creating massive bureaucracies and new technical systems which 
> seems to rely on massive networks of VPNs overlaid over the existing public 
> internet? 
> 
> - mdh
> 


Re: SRv6 Capable NOS and Devices

2022-01-15 Thread Jeff Tantsura
+1 Mark

All modern silicon supports MPLS (and is capable of imposing at least 3 
labels). There’s no additional cost for vendors to enable MPLS on their 
devices. The rest is subject to vendors’ licensing and is completely 
artificial. 
SR-MPLS uses the MPLS data plane and requires no changes to silicon. Since the 
head-end might be required to push more labels (TE, BSIDs, services), one needs 
to pay attention to label-stack depth: RFC8491 (IS-IS) and RFC8476 (OSPF) allow 
signaling of MSD (Maximum SID Depth), which matters if a centralized 
controller/PCE is used for path computation.
LDP, after all the years of bug fixing, is still a crappy protocol; moving to 
SR-MPLS makes perfect sense.
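
To make the MSD point concrete, here is a small sketch of what a controller/PCE 
could do with the signaled value: drop candidate SID lists that the head-end 
cannot impose. Function names and label values are invented for illustration; 
this is not any PCE's actual logic.

```python
def stack_depth(transport_sids, service_labels=()):
    """Total number of labels the head-end has to impose for one path."""
    return len(transport_sids) + len(service_labels)

def pick_path(candidates, head_end_msd, service_labels=()):
    """Return the first candidate SID list that fits within the head-end MSD.

    `candidates` is ordered by preference (best first); returns None if
    no candidate fits.
    """
    for sids in candidates:
        if stack_depth(sids, service_labels) <= head_end_msd:
            return sids
    return None

if __name__ == "__main__":
    candidates = [
        [16001, 16002, 16003, 16004, 16005],  # strict path, 5 transport SIDs
        [16010, 16005],                       # looser path, 2 transport SIDs
    ]
    # One service label on top of the transport SIDs, head-end MSD of 5.
    print(pick_path(candidates, head_end_msd=5, service_labels=[24001]))
```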


Cheers,
Jeff

> On Jan 15, 2022, at 11:50, Mark Tinka  wrote:
> 
> 
> 
>> On 1/15/22 19:22, Colton Conor wrote:
>> 
>> True, but in general MPLS is more costly. It's available on limited
>> devices, from limited vendors. In fact, many of these vendors, like
>> Extreme, charge you if you want to enable MPLS features on a box.
> 
> Well, I don't entirely agree.
> 
> Pretty much all chips shipping now, either custom or merchant silicon, will 
> support MPLS. Whether the vendor chooses to implement it in code or not is a 
> whole other matter.
> 
> If you need MPLS, chances are you can afford it. If you don't, then you don't 
> have to worry about it.
> 
> For Extreme, are you referring to before or after they picked up Brocade?
> 
> There is MPLS available in a number of cheap software suites. Even Mikrotik 
> provides MPLS support. Whether it works or not, I can't tell you.
> 
> VyOS supports it too. Whether it works or not, I can't tell you.
> 
> But I think we are long past the days of "MPLS is expensive".
> 
> Mark.


Re: SRv6 Capable NOS and Devices

2022-01-16 Thread Jeff Tantsura
Plain IP underlay works really well; I’m yet to see tangible proof of the need 
for TE in DC (outside of niche HPC/IB cases).
SR in DC - with the overlay starting on the host, SR-MPLSoUDP (RFC8663) is a 
perfect example of a technology that works in an IP environment and also 
allows end2end programming for MPLS WAN/DCI.
Here’s an example of end2end architecture that works really well - 
https://datatracker.ietf.org/doc/draft-bookham-rtgwg-nfix-arch/

Geneve (there are some quirks as you get into implementing it) is another 
example of a well designed overlay encap.


Cheers,
Jeff

> On Jan 15, 2022, at 23:54, Saku Ytti  wrote:
> 
> On Sat, 15 Jan 2022 at 19:22, Colton Conor  wrote:
> 
>> True, but in general MPLS is more costly. It's available on limited
>> devices, from limited vendors. In fact, many of these vendors, like
>> Extreme, charge you if you want to enable MPLS features on a box
> 
> Marketing, not fundamentals. DC people are driving demand for VXLAN
> and SRv6, because they assume MPLS is something scary and complex. So
> vendors implement something scary and complex to appease DC people.
> I'm sure in some years to the future, DC people will re-invent MPLS to
> simplify their stack.
> 
> -- 
>  ++ytti


Re: SRv6 Capable NOS and Devices

2022-01-16 Thread Jeff Tantsura
Hey Sabri,

Eventually they have implemented everything ;-)
Arista was a really special case, routing stack they acquired (NextHop) had no 
mpls (quite some time ago), 90% of their revenue was coming from IP only 
networks.

Life is good, MS is treating me well :).
Kids are growing, Marina’s business is doing ok.
How’s life on your side?

Would love to meet, lunch or so?

Cheers,
Jeff

> On Jan 16, 2022, at 13:19, Jeff Tantsura  wrote:
> 
> 
> Plain IP underlay works really well; I’m yet to see tangible proof of the need 
> for TE in DC (outside of niche HPC/IB cases).
> SR in DC - with the overlay starting on the host, SR-MPLSoUDP (RFC8663) is a 
> perfect example of a technology that works in an IP environment and also 
> allows end2end programming for MPLS WAN/DCI.
> Here’s an example of end2end architecture that works really well - 
> https://datatracker.ietf.org/doc/draft-bookham-rtgwg-nfix-arch/
> 
> Geneve (there are some quirks as you get into implementing it) is another 
> example of a well designed overlay encap.
> 
> 
> Cheers,
> Jeff
> 
>>> On Jan 15, 2022, at 23:54, Saku Ytti  wrote:
>>> 
>> On Sat, 15 Jan 2022 at 19:22, Colton Conor  wrote:
>> 
>>> True, but in general MPLS is more costly. It's available on limited
>>> devices, from limited vendors. In fact, many of these vendors, like
>>> Extreme, charge you if you want to enable MPLS features on a box
>> 
>> Marketing, not fundamentals. DC people are driving demand for VXLAN
>> and SRv6, because they assume MPLS is something scary and complex. So
>> vendors implement something scary and complex to appease DC people.
>> I'm sure in some years to the future, DC people will re-invent MPLS to
>> simplify their stack.
>> 
>> -- 
>>  ++ytti


Re: Regarding BGP offloading

2022-03-16 Thread Jeff Tantsura
IOS-XR and Junos (don’t know about others) expose service-layer APIs that 
allow off-box best path selection and subsequent injection of the results back 
into the RIB.
FRR is in the process of implementing customized best path selection using Lua 
scripts.

Cheers,
Jeff

> On Mar 16, 2022, at 16:15, Anurag Bhatia  wrote:
> 
> 
> Hello NANOG!
> 
> 
> I have seen limited talks about offloading of BGP as a whole into 
> containers/VMs etc. Take e.g this old Google blog post from 2017. Quoting 
> from that: 
> 
>> Second, we separate the logic and control of traffic management from the 
>> confines of individual router “boxes.” Rather than relying on thousands of 
>> individual routers to manage and learn from packet streams, we push the 
>> functionality to a distributed system that extracts the aggregate 
>> information. We leverage our large-scale computing infrastructure and 
>> signals from the application itself to learn how individual flows are 
>> performing, as determined by the end user’s perception of quality.
> 
> 
> If I am reading this correctly, it gives an impression of just BGP signalling 
> offload (to VMs/containers...). Is that understanding correct? Speaking from 
> network topology wise anyone here has an idea or could point to a resource on 
> how it is actually achieved? If the frontend device simply starts passing TCP 
> 179 requests to some backend server running say bird, frr etc, how will that 
> information be passed back to the forwarding plane? Are there more public 
> deployments of this sort of setup where BGP as a whole (that is sessions, 
> route calculation, policies, filtering etc) is offloaded to some x86 device 
> in the backend? 
> 
> Or am I just reading it wrong and it's actually smaller VM/containers will 
> full router functionality and BGP alone is not being offloaded? So the 
> logical L3 endpoint here is VMs? What sort of config the device sitting in 
> frontend would have at the interface level to achieve that? 
> 
> 
> 
> Appreciate your responses! 
> 
> Thanks. 
> 
> -- 
> Anurag Bhatia
> anuragbhatia.com


Re: Opinions on Arista for BGP?

2022-04-01 Thread Jeff Tantsura
Important note - Arista has 2 BGP implementations in the routing stack: the old 
one (NH/ribd), which has been there since day 1, and a newly written one (I 
believe mostly driven by EVPN development). When comparing with other vendors, 
make sure to compare against the new (modern code, highly multithreaded, cache 
optimized) implementation.

Cheers,
Jeff

> On Apr 1, 2022, at 11:10, Adam Thompson  wrote:
> 
> 
> TL;DR: Yes, go ahead, they’re good, we like them.
>  
> I won’t say they’re perfect, but we’re using them at the edge (two of them in 
> a hybrid core/edge model right now, even!) and I would happily endorse them 
> for edge routers.  Their BGP stack hasn’t put up any major roadblocks for us 
> so far (at least, that weren’t, ahem, self-inflicted).  We’ve had 1 incident 
> in the last ~2 years where a stuck route on one router needed a full reboot 
> to clear out, following a partial outage - that’s the worst thing I can 
> remember right now.
>  
> Don’t know if you know this already or not, so making it clear:  the one 
> thing to beware of IMO, compared to e.g. a high-end Juniper MX960-style 
> system where you can turn every single feature on without caring, is that the 
> Aristas can do almost anything you can dream of… but not necessarily all at 
> the same time on the same box, no matter which model you’re looking at.
> So if you use it as an edge router?  Fine.  As a VXLAN gateway?  Fine.  As a 
> core router or switch with every kind of accounting turned on?  Fine.  All of 
> those things simultaneously?  Maybe.  It’ll be decision time for which 
> specific, individual sub-features you can live without.  But you’re paying 
> 1/10th (probably less!) of what you would for an MX960, so there you go.
>  
> If this helps, they’re similar to the Cisco Nexus platform in this regard, 
> e.g. if you enable and use every single “Feature” on the fixed-configuration 
> Nexuses you’ll start running out of hardware configuration resources to 
> enable them long before you can finish configuring or using all those 
> features.
>  
> This is something your Arista SE can go through with you in excruciating 
> detail (keyword: “TCAM Profile”), if you think you might be veering into that 
> territory.  After lots of iterations, and a new software release or two, our 
> all-in-one boxes (7280SR2K) do more or less everything we want them to.  
> (Apparently we aren’t typical Arista customers.  Go figure.)  If you want to 
> do BGP and MLAG at the same time on the same box, get your SE involved from 
> the start.
>  
> For anyone not trying to overload the platform or do too much “weird” stuff, 
> it should be a quick and easy deployment producing much happiness.
>  
> -Adam
>  
>  
> Adam Thompson
> Consultant, Infrastructure Services
> 
> 100 - 135 Innovation Drive
> Winnipeg, MB, R3T 6A8
> (204) 977-6824 or 1-800-430-6404 (MB only)
> athomp...@merlin.mb.ca
> www.merlin.mb.ca
>  
> From: NANOG On Behalf Of David Hubbard
> Sent: Thursday, March 31, 2022 8:10 AM
> To: nanog@nanog.org
> Subject: Opinions on Arista for BGP?
>  
> Hi all, would love to get any current opinions (on or off list) on the 
> stability of Arista’s BGP implementation these days.  Been many years since I 
> last looked into it and wasn’t ready for a change yet.  Past many years have 
> been IOS XR on NCS5500 platform and Arista everywhere but the edge.  I’ve 
> been really happy with them in the other roles, so am thinking about edge 
> now.  I do like and use XR’s RPL, and prefix/as/community/object sets, but we 
> can live without via our own config management if there aren’t easy 
> equivalents.  No fancy needs at all, just small web server networks, so just 
> need reliable eBGP and internal OSPF/OSPFv3.
>  
> Thanks,
>  
> David


Re: sr - spring - what's the deal with 2 names

2020-09-10 Thread Jeff Tantsura
SR could be instantiated with 2 data planes, MPLS and IPv6: SR-MPLS and SRv6 
respectively.
The MPLS data plane could be instantiated over either IPv4 or IPv6 (similarly to 
LDPv6). MPLSoUDP->SRoUDP allows transport of SR-MPLS over IP/UDP (RFC8663) and 
could be used to build innovative end2end architectures, e.g. 
draft-bookham-rtgwg-nfix-arch.
There is also SFC-related work, draft-ietf-spring-nsh-sr.

And there’s the whole SRv6 thingy...

Let me know if I can help in any way.

Cheers,
Jeff

> On Sep 10, 2020, at 08:10, aar...@gvtc.com wrote:
> 
> Interesting... I've never heard of SPRINGv4
> 
> https://www.juniper.net/us/en/products-services/routing/ptx-series/datasheets/1000538.page
> 
> I found it in the bottom section
> 
> I wonder if SPRINGv4 is like SRv6, meaning, SPRING(SR) over IPv4 dataplane?
> Or, am I reading way too much into that SPRINGv4 acronym?
> 
> -Aaron
> 
> 


Re: sr - spring - what's the deal with 2 names

2020-09-10 Thread Jeff Tantsura
I have described what SR is, not what vendors (for a variety of reasons) do with 
it, hence “could” ;-)
As a side note - SRoUDP works seamlessly over either v4 or v6.

Regards,
Jeff

> On Sep 10, 2020, at 12:35, Mark Tinka  wrote:
> 
> 
> 
>> On 10/Sep/20 10:40, Jeff Tantsura wrote:
>> 
>> MPLS data  plane could be instantiated over either IPv4 or IPv6
>> (similarly to LDP6),
> 
> Be mindful of sketchy (or non-existent) IPv6 support in SR for IS-IS and
> OSPF across all vendors.
> 
> Mark.


Re: SRv6

2020-09-15 Thread Jeff Tantsura
GRE, VXLAN or any other tunneling encap of the day,
as long as the next hop can be resolved behind the remote end.

Regards,
Jeff

> On Sep 15, 2020, at 11:14, Randy Bush  wrote:
> 
> 
>> 
>> I'm still learning, but, It does seem interesting that the IP layer
>> (v6) can now support vpn's without mpls.
> 
> as the packet payload is nekkid cleartext, where is the P in vpn?


Re: SRv6

2020-09-15 Thread Jeff Tantsura
Randy,

That was meant as a reply to Aaron’s comment about VPNs over a non-MPLS 
underlay, not about encrypting them (which is orthogonal).

Cheers,
Jeff
On Sep 15, 2020, 12:59 PM -0700, Randy Bush wrote:
> > GRE, VXLAN or any other tunneling encap of the day.
> > As long as next-hop could be resolved behind remote end
>
> i was not aware that GRE, VXLAN (without CN103618596A), and other tunnel
> encaps encrypted the payload. learn something new every day. thanks!
>
> > > > I'm still learning, but, It does seem interesting that the IP layer
> > > > (v6) can now support vpn's without mpls.
> > > as the packet payload is nekkid cleartext, where is the P in vpn?


Re: SRm6 (was:SRv6)

2020-09-17 Thread Jeff Tantsura
MPLSoUDP is not the technology you should be looking at, SRoUDP (RFC8663) is.
draft-bookham-rtgwg-nfix-arch describes an architecture that makes use of it to 
provide an end2end SR path.

Cheers,
Jeff
On Sep 17, 2020, 9:32 AM -0700, James Bensley wrote:
>
>
> On 17 September 2020 11:05:24 CEST, Saku Ytti  wrote:
> > On Thu, 17 Sep 2020 at 11:03, James Bensley 
> > wrote:
> >
> > > MPLSoUDP lacks transport engineering features like explicit paths,
> > FRR LFA and FRR rLFA, assuming only a single IP header is used for the
> > transport abstraction [1]. If you want stuff like TI-LFA (I assume this
> > is supported in SRm6 and SRv6, but I'm not familiar with these, sorry
> > if that is a false assumption) you need additional transport headers or
> > a stack of MPLS labels encapped in the UDP header and then you're back
> > to square one.
> >
> > One of us has confusion about what MPLSoUDP is. I don't run it, so
> > might be me.
> >
> > SPORT == Entropy (so non-cooperating transit can balance)
> > DPORT == 6635 (NOT label)
> > Payload = MPLS label(s)
> >
> > Whatever MPLS can do MPLSoUDP can, by definition, do. It is just
> > another MPLS point-to-point adjacency after the MPLSoUDP
> > abstraction/tunnel.
>
> Nope, we have the same understanding. But the email I was responding to was 
> talking about using MPLSoUDP for service label encapsulation *only*, not 
> transport & services labels:
>
>
> > > If you want an IPv6 underlay for a network offering VPN services
> >
> > And what's wrong again with MPLS over UDP to accomplish the very same with 
> > simplicity ?
> >
> > MPLS - just a demux label to a VRF/CE
> > UDP with IPv6 header plain and simple
> >
> > + minor benefit: you get all of this with zero change to shipping hardware 
> > and software ... Why do we need to go via decks of SRm6 slides and new wave 
> > of protocols extensions ???
>
>
> Cheers,
> James.
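
The MPLS-over-UDP layout listed in the quoted message (source port = entropy, 
destination port = 6635, payload = MPLS label stack) can be sketched with the 
struct module. Assumptions, all of them mine: the entropy value is derived from 
a CRC32 of a flow key and mapped into 49152-65535, the UDP checksum is left at 
zero for brevity, and the helper names are hypothetical. No IP header is built.

```python
import struct
import zlib

MPLS_IN_UDP_PORT = 6635  # destination port for MPLS-in-UDP (RFC 7510)

def mpls_lse(label, tc=0, bottom=False, ttl=64):
    """Pack one 32-bit label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    word = (label << 12) | ((tc & 0x7) << 9) | (int(bottom) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)

def entropy_sport(flow_key):
    """Map a flow key (bytes) to a source port in 49152-65535."""
    return 49152 + (zlib.crc32(flow_key) % 16384)

def mpls_over_udp(labels, inner, flow_key):
    """Return UDP header + label stack + inner packet (checksum left as 0)."""
    last = len(labels) - 1
    stack = b"".join(mpls_lse(l, bottom=(i == last)) for i, l in enumerate(labels))
    payload = stack + inner
    header = struct.pack("!HHHH", entropy_sport(flow_key), MPLS_IN_UDP_PORT,
                         8 + len(payload), 0)
    return header + payload
```

With a single service label this is the "demux label only" case from the 
thread; with transport SIDs on top of it, it becomes the SRoUDP case.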


Re: A study on community-triggered updates in BGP

2020-10-21 Thread Jeff Tantsura
Hi Thomas,

We had a similar discussion on FRR slack, there are some duplicates indeed.
Are you planning to test FRR at some point in time?

Cheers,
Jeff
On Oct 21, 2020, 3:58 PM -0700, Jakob Heitz (jheitz) via NANOG wrote:
> Thomas,
>
> I confirmed your case and took a look at the code.
> The outbound duplicate suppression function tries to detect
> duplicates without actually storing or recreating the
> previously sent update, so it misses some cases.
>
> Your use case is a good one. We will check to see if we can
> detect it without compromising significantly on resource usage.
> Thank you for raising the issue.
>
> Regards,
> Jakob.
>
> -Original Message-
> Date: Tue, 20 Oct 2020 04:48:37 -0700
> From: Thomas Krenc 
>
> Hi Jakob.
>
> The simple configuration below allows communities to be forwarded
> (send-community-ebgp), but are cleaned at egress (using route-policy and
> community-set).
>
> In the experiment, the router receives announcements with altering
> community attributes only, from the internal peer. After the filter is
> applied, the router sends duplicates to the external peer.
>
> Also, In a slightly different setup, the router sends duplicates due to
> changes in the next-hop only.
>
> best regards
> Thomas
>
> ---
>
> RP/0/0/CPU0:ios(config)#show running-config
> Tue Oct 20 02:56:24.230 UTC
> Building configuration...
> !! IOS XR Configuration 6.0.1
> !! Last configuration change at Tue Oct 20 02:56:02 2020 by cisco
> !
> interface MgmtEth0/0/CPU0/0
>  shutdown
> !
> interface GigabitEthernet0/0/0/0
>  ipv4 address 10.12.0.2 255.255.255.252
> !
> interface GigabitEthernet0/0/0/1
>  ipv4 address 10.20.0.1 255.255.255.252
> !
> community-set all
>   *:*
> end-set
> !
> route-policy nofilter
>   pass
> end-policy
> !
> route-policy egressfilter
>   delete community in all
>   pass
> end-policy
> !
> router bgp 65002
>  bgp router-id 10.12.0.2
>  address-family ipv4 unicast
> !
>  neighbor 10.12.0.1
>   remote-as 65001
>   address-family ipv4 unicast
>    send-community-ebgp
>    route-policy egressfilter out
> !
>  neighbor 10.20.0.2
>   remote-as 65002
>   address-family ipv4 unicast
> !
> end
>
> On 10/17/20 3:59 PM, Jakob Heitz (jheitz) via NANOG wrote:
> > IOS-XR has duplicate update suppression logic for EBGP sessions,
> > not for IBGP sessions.
> >
> > If you are using EBGP and seeing a fault in the duplicate update
> > suppression logic in IOS-XR, please let me know configs and details
> > of the experiment.
> >
> > Regards,
> > Jakob.
> >
> > -Original Message-
> > Date: Thu, 15 Oct 2020 18:35:58 -0700
> > From: Thomas Krenc 
> >
> > Dear NANOG,
> >
> > As a team of researchers from NPS and TU Berlin, we are investigating
> > the impact of BGP community attributes on the update behavior between ASes.
> >
> > We find that when a route is associated with multiple distinct community
> > attributes it does not only lead to multiple announcements at the tagging
> > AS, but also at neighboring ASes, if communities are not filtered
> > properly. This behavior is wide-spread.
> >
> > In order to better understand our observations, we have performed a
> > series of laboratory experiments using Cisco IOS, Junos OS, as well as
> > the BIRD daemon.
> >
> > We find that - by default - all tested routers generate announcements
> > with changing community attributes, even when other attributes do not
> > change. In addition, when communities are filtered at egress, Cisco and
> > BIRD send duplicate announcements (Juniper does not).
> >
> > Since our findings are limited to observations in public data as well as
> > few router implementations, we would like to share our research and
> > kindly ask you to have a look at:
> >
> >     https://www.cmand.org/communityexploration/
> >
> > There, we provide some resources documenting our research, as well as
> > open questions. We greatly appreciate any feedback and insights you can
> > offer. Also, please don't hesitate to contact us directly:
> >
> >     communityexploration AT cmand DOT org
> >
> > best regards
> >
> > Thomas Krenc
> > Postdoctoral Researcher
> > Naval Postgraduate School
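
For illustration, one way to catch the duplicates discussed in this thread is 
to remember a digest of what was last advertised per prefix, after the egress 
policy has run. This is explicitly NOT how IOS-XR does it (per Jakob's note, 
it avoids storing the previously sent update, which is exactly the memory cost 
this sketch pays). Class, function and attribute names are hypothetical.

```python
import hashlib

def egress_filter(attrs):
    """Egress policy like the quoted config: delete all communities, pass the rest."""
    out = dict(attrs)
    out.pop("communities", None)
    return out

class Peer:
    def __init__(self, policy):
        self.policy = policy
        self.last_sent = {}  # prefix -> digest of the attributes last advertised

    def advertise(self, prefix, attrs):
        """Return the attributes to send, or None if the update is a duplicate."""
        out = self.policy(attrs)
        digest = hashlib.sha256(repr(sorted(out.items())).encode()).hexdigest()
        if self.last_sent.get(prefix) == digest:
            return None
        self.last_sent[prefix] = digest
        return out

if __name__ == "__main__":
    peer = Peer(egress_filter)
    base = {"as_path": (65002,), "next_hop": "10.12.0.2"}
    print(peer.advertise("203.0.113.0/24", {**base, "communities": ("65002:1",)}))
    # Only the community changed; after the filter nothing differs.
    print(peer.advertise("203.0.113.0/24", {**base, "communities": ("65002:2",)}))
```

The second call returns None because the only change was in an attribute that 
the egress policy strips.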


Re: BGP Peers Data modeling schema

2020-11-05 Thread Jeff Tantsura
YANG is the right direction.
OpenConfig BGP and policy models are supported by every vendor on earth.
We are finalizing the IETF BGP and policy models:
draft-ietf-rtgwg-policy-model is about to be last-called;
draft-ietf-idr-bgp-model is pretty much ready.
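
As a rough idea of what a model-driven source of truth could look like, the 
sketch below renders a flat peer list into a structure loosely shaped like the 
OpenConfig BGP neighbor list (neighbor-address, peer-as, description). Module 
prefixes and most leaves are omitted, and the input format and function name 
are assumptions for the example; check the published models before relying on 
any path.

```python
import json

def to_openconfig_like(peers):
    """Render a flat peer list into a neighbor list keyed by neighbor-address."""
    neighbors = []
    for p in sorted(peers, key=lambda p: p["address"]):
        neighbors.append({
            "neighbor-address": p["address"],
            "config": {
                "neighbor-address": p["address"],
                "peer-as": p["asn"],
                "description": p.get("description", ""),
            },
        })
    return {"bgp": {"neighbors": {"neighbor": neighbors}}}

if __name__ == "__main__":
    peers = [
        {"address": "192.0.2.2", "asn": 64497},
        {"address": "192.0.2.1", "asn": 64496, "description": "transit A"},
    ]
    print(json.dumps(to_openconfig_like(peers), indent=2))
```

The same peer list can then feed the router configs, the BMP collector and the 
monitoring tool, which is the point of modeling it once.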

Cheers,
Jeff
On Nov 5, 2020, 4:57 AM -0800, Douglas Fischer wrote:
> I'm designing a tool for provisioning configurations for an ITP and his Peers.
> The idea is that based on that, all the configs to all the involved 
> components configurations to be deployed based on that source of data. I'm 
> Talking about Routers, BMP, SNMP tool(Ex.: Zabbix), etc...
>
> But, once again, I'm feeling that I'm reinventing the wheel.
> I'm pretty sure that someone else has already suffered from that.
>
> I search for a bit, and I didn't find anything...
> But with this gray area between developers and network operators, I'm not 
> sure if I'm looking at the right place.
>
> I even tried to look at http://schema.org but didn't find anything related to 
> networks and BGP there yet.
>
>
> So, anyone could point me in the right direction?
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação


Re: Career Opportunities In Network Engineering — Watch it on NANOG TV 👉

2020-11-21 Thread Jeff Tantsura
For DNS fundamentals - I’d definitely recommend the DNS deep dive we recorded 
at IETF 108.
https://www.youtube.com/watch?v=DV0q9s94RL8

Cheers,
Jeff
On Nov 21, 2020, 8:49 AM -0800, NANOG News wrote:
> Career Opportunities In Network Engineering
> “Be curious and passionate; this is a lifelong journey.”
> — Selva Srinivasan, Microsoft
>
> Big thanks to all who attended our first NANOG U Webinar last week! And of 
> course, to each of our guest speakers — it was a great conversation, all 
> around.
>
> Didn't get to attend? No worries! Watch the full webinar this weekend, on 
> NANOG TV.
>
>
> Save the date — 2021 Webinars
> January 22, 2021 — DNS Fundamentals
> 11am - 1pm PST / 2pm - 4pm EST
>
> Speaker:
> Eddy Winstead, Internet Systems Consortium (ISC)
>
> February 26, 2021 — BGP Fundamentals
> 11am - 1pm PST / 2pm - 4pm EST
>
> Speaker:
> Aaron Atac, Akamai Technologies
>
>
> Partner with us to empower and inspire
> A NANOG Outreach partnership provides your organization a way to directly 
> support the communities who need access to our tools and resources most, and 
> the opportunity to inspire and connect with the next generation of networking 
> professionals. In turn, we’ll publicly recognize your contributions as a 
> partner in meaningful ways that resonate with the communities we serve.
>
> Interested in collaborating with us? Connect with Shawn Winstead, NANOG's 
> Business Development Specialist, to start a conversation + learn more.


Re: Trident3 vs Jericho2

2021-04-09 Thread Jeff Tantsura
Buffer size has nothing to do with feature richness.
Assuming you are asking about DC: in a wide-radix, low-oversubscription 
network, shallow buffers do just fine. Some applications (think MapReduce/ML 
model training) have many-to-one traffic patterns and suffer from incast as a 
result; deep buffers might be helpful there. DCI/DC-GW is another case where 
deep buffers could be justified.

Regards,
Jeff

> On Apr 9, 2021, at 05:59, Dmitry Sherman  wrote:
> 
> Once again, which is better: shared-buffer, feature-rich switches or 
> fat-buffer switches? When is it better to use a big-buffer switch? When is 
> it better to drop and retransmit instead of queueing?
> 
> Thanks.
> Dmitry


Re: 400G forwarding - how does it work?

2022-07-26 Thread Jeff Tantsura
As Lincoln said - all of us directly working with BCM/other silicon vendors 
have signed numerous NDAs.
However if you ask a well crafted question - there’s always a way to talk about 
it ;-)

In general, if we look at the whole spectrum, on one side there are massively 
parallelized “many core” RTC (run-to-completion) ASICs, such as Trio, 
Lightspeed, and similar (as the last gasp of the Redback/Ericsson venture, we 
built Spider, an ASIC with 1400 HW threads).
On the other side of the spectrum are fixed-pipeline ASICs, from BCM Tomahawk 
at its extreme (max speed/radix, min features), moving on to BCM Trident, 
Innovium, Barefoot (quite a different animal wrt programmability), etc., 
usually with a shallow on-chip buffer only (100-200M).

In between we have so-called programmable pipeline silicon; BCM DNX and 
Juniper Express are in this category. These usually have a combo of OCB + 
off-chip memory (most often HBM, 2-6G) and line-rate/high-scale 
security and overlay encap/decap capabilities. They usually have highly 
optimized RTC blocks within the pipeline (RTC within macro). The way and speed 
of accessing DBs and memories is evolving with each generation, and the 
number/speed of non-networking cores (usually ARM) keeps growing; OAM, INT, 
and local optimizations are the primary users of them.

Cheers,
Jeff

> On Jul 25, 2022, at 15:59, Lincoln Dale  wrote:
> 
> 
>> On Mon, Jul 25, 2022 at 11:58 AM James Bensley  
>> wrote:
> 
>> On Mon, 25 Jul 2022 at 15:34, Lawrence Wobker  wrote:
>> > This is the parallelism part.  I can take multiple instances of these 
>> > memory/logic pipelines, and run them in parallel to increase the 
>> > throughput.
>> ...
>> > I work on/with a chip that can forwarding about 10B packets per second… so 
>> > if we go back to the order-of-magnitude number that I’m doing about “tens” 
>> > of memory lookups for every one of those packets, we’re talking about 
>> > something like a hundred BILLION total memory lookups… and since memory 
>> > does NOT give me answers in 1 picoseconds… we get back to pipelining and 
>> > parallelism.
>> 
>> What level of parallelism is required to forward 10Bpps? Or 2Bpps like
>> my J2 example :)
> 
> I suspect many folks know the exact answer for J2, but it's likely under NDA 
> to talk about said specific answer for a given thing.
> 
> Without being platform or device-specific, the core clock rate of many 
> network devices is often in a "goldilocks" zone of (today) 1 to 1.5GHz with a 
> goal of 1 packet forwarded 'per-clock'. As LJ described the pipeline that 
> doesn't mean a latency of 1 clock ingress-to-egress but rather that every 
> clock there is a forwarding decision from one 'pipeline', and the MPPS/BPPS 
> packet rate is achieved by having enough pipelines in parallel to achieve 
> that.
> The number here is often "1" or "0.5" so you can work the number backwards. 
> (e.g. it emits a packet every clock, or every 2nd clock).
> 
> It's possible to build an ASIC/NPU to run a faster clock rate, but gets back 
> to what I'm hand-waving describing as "goldilocks". Look up power vs 
> frequency and you'll see its non-linear.
> Just as CPUs can scale by adding more cores (vs increasing frequency), ~same 
> holds true on network silicon, and you can go wider, multiple pipelines. But 
> its not 10K parallel slices, there's some parallel parts, but there are 
> multiple 'stages' on each doing different things.
> 
> Using your CPU comparison, there are some analogies here that do work:
>  - you have multiple cpu cores that can do things in parallel -- analogous to 
> pipelines
>  - they often share some common I/O (e.g. CPUs have PCIe, maybe sharing some 
> DRAM or LLC)  -- maybe some lookup engines, or centralized buffer/memory
>  - most modern CPUs are out-of-order execution, where under-the-covers, a 
> cache-miss or DRAM fetch has a disproportionate hit on performance, so its 
> hidden away from you as much as possible by speculative execution out-of-order
> -- no direct analogy to this one - it's unlikely most forwarding 
> pipelines do speculative execution like a general purpose CPU does - but they 
> definitely do 'other work' while waiting for a lookup to happen
> 
> A common-garden x86 is unlikely to achieve such a rate for a few different 
> reasons:
>  - packets-in or packets-out go via DRAM then you need sufficient DRAM (page 
> opens/sec, DRAM bandwidth) to sustain at least one write and one read per 
> packet. Look closer at DRAM and see its speed, Pay attention to page 
> opens/sec, and what that consumes.
>  - one 'trick' is to not DMA packets to DRAM but instead have it go into SRAM 
> of some form - e.g. Intel DDIO, ARM Cache Stashing, which at least 
> potentially saves you that DRAM write+read per packet
>   - ... but then do e.g. a LPM lookup, and best case that is back to a memory 
> access/packet. Maybe it's in L1/L2/L3 cache, but likely at large table sizes 
> it isn't.
>  - ... do more things to the packet (urpf lookups, counters) and it's yet 
> more lookups.
> 
> Software can achiev

Re: 400G forwarding - how does it work?

2022-07-27 Thread Jeff Tantsura
FYI

https://community.juniper.net/blogs/nicolas-fevrier/2022/07/27/voq-and-dnx-pipeline

Cheers,
Jeff

> On Jul 25, 2022, at 15:59, Lincoln Dale  wrote:
> 
> 
>> On Mon, Jul 25, 2022 at 11:58 AM James Bensley  
>> wrote:
> 
>> On Mon, 25 Jul 2022 at 15:34, Lawrence Wobker  wrote:
>> > This is the parallelism part.  I can take multiple instances of these 
>> > memory/logic pipelines, and run them in parallel to increase the 
>> > throughput.
>> ...
>> > I work on/with a chip that can forwarding about 10B packets per second… so 
>> > if we go back to the order-of-magnitude number that I’m doing about “tens” 
>> > of memory lookups for every one of those packets, we’re talking about 
>> > something like a hundred BILLION total memory lookups… and since memory 
>> > does NOT give me answers in 1 picoseconds… we get back to pipelining and 
>> > parallelism.
>> 
>> What level of parallelism is required to forward 10Bpps? Or 2Bpps like
>> my J2 example :)
> 
> I suspect many folks know the exact answer for J2, but it's likely under NDA 
> to talk about said specific answer for a given thing.
> 
> Without being platform or device-specific, the core clock rate of many 
> network devices is often in a "goldilocks" zone of (today) 1 to 1.5GHz with a 
> goal of 1 packet forwarded 'per-clock'. As LJ described the pipeline that 
> doesn't mean a latency of 1 clock ingress-to-egress but rather that every 
> clock there is a forwarding decision from one 'pipeline', and the MPPS/BPPS 
> packet rate is achieved by having enough pipelines in parallel to achieve 
> that.
> The number here is often "1" or "0.5" so you can work the number backwards. 
> (e.g. it emits a packet every clock, or every 2nd clock).
> 
> It's possible to build an ASIC/NPU to run a faster clock rate, but gets back 
> to what I'm hand-waving describing as "goldilocks". Look up power vs 
> frequency and you'll see its non-linear.
> Just as CPUs can scale by adding more cores (vs increasing frequency), ~same 
> holds true on network silicon, and you can go wider, multiple pipelines. But 
> its not 10K parallel slices, there's some parallel parts, but there are 
> multiple 'stages' on each doing different things.
> 
> Using your CPU comparison, there are some analogies here that do work:
>  - you have multiple cpu cores that can do things in parallel -- analogous to 
> pipelines
>  - they often share some common I/O (e.g. CPUs have PCIe, maybe sharing some 
> DRAM or LLC)  -- maybe some lookup engines, or centralized buffer/memory
>  - most modern CPUs are out-of-order execution, where under-the-covers, a 
> cache-miss or DRAM fetch has a disproportionate hit on performance, so its 
> hidden away from you as much as possible by speculative execution out-of-order
> -- no direct analogy to this one - it's unlikely most forwarding 
> pipelines do speculative execution like a general purpose CPU does - but they 
> definitely do 'other work' while waiting for a lookup to happen
> 
> A common-garden x86 is unlikely to achieve such a rate for a few different 
> reasons:
>  - packets-in or packets-out go via DRAM then you need sufficient DRAM (page 
> opens/sec, DRAM bandwidth) to sustain at least one write and one read per 
> packet. Look closer at DRAM and see its speed, Pay attention to page 
> opens/sec, and what that consumes.
>  - one 'trick' is to not DMA packets to DRAM but instead have it go into SRAM 
> of some form - e.g. Intel DDIO, ARM Cache Stashing, which at least 
> potentially saves you that DRAM write+read per packet
>   - ... but then do e.g. a LPM lookup, and best case that is back to a memory 
> access/packet. Maybe it's in L1/L2/L3 cache, but likely at large table sizes 
> it isn't.
>  - ... do more things to the packet (urpf lookups, counters) and it's yet 
> more lookups.
> 
> Software can achieve high rates, but note that a typical ASIC/NPU does on the 
> order of >100 separate lookups per packet, and 100 counter updates per packet.
> Just as forwarding in a ASIC or NPU is a series of tradeoffs, forwarding in 
> software on generic CPUs is also a series of tradeoffs.
> 
> 
> cheers,
> 
> lincoln.
> 
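The lookup arithmetic quoted above can be sketched roughly as follows. This is only a back-of-the-envelope illustration: the 10 Bpps rate and "tens" of lookups per packet come from the thread, while the 2 ns memory latency is a purely hypothetical number picked for the example, and `lookup_budget` is an illustrative name.

```python
def lookup_budget(pps, lookups_per_packet, memory_latency_ns):
    """Return (lookups/sec, serial time budget per lookup in ps, lookups in flight)."""
    lookups_per_sec = pps * lookups_per_packet
    # If every lookup had to be done strictly one after another,
    # this is how long each one could take.
    serial_budget_ps = 1e12 / lookups_per_sec
    # Little's law: concurrency = arrival rate * latency.
    in_flight = lookups_per_sec * memory_latency_ns / 1e9
    return lookups_per_sec, serial_budget_ps, in_flight


# 10 Bpps and ~10 lookups per packet are from the thread;
# the 2 ns memory latency is an assumption for illustration only.
rate, budget_ps, in_flight = lookup_budget(10e9, 10, 2.0)
print(f"{rate:.0e} lookups/s, {budget_ps:.0f} ps each if serial, "
      f"{in_flight:.0f} lookups in flight")
```

With those assumed numbers, a serial design would have about 10 ps per lookup, which no memory delivers, so on the order of a couple of hundred lookups have to be outstanding at any moment. That concurrency is what pipelining and parallel slices provide.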


RE: 400G forwarding - how does it work?

2022-08-03 Thread Jeff Tantsura
Hey,

This is not an advertisement but an attempt to help folks to better understand 
networking HW.

Some of you might know (and love 😊) “between 0x2 nerds” podcast Jeff Doyle and 
I have been hosting for a couple of years.

Following up the discussion we have decided to dedicate a number of upcoming 
podcasts to networking HW, the topic where more information and better 
education is very much needed (no, you won’t have to sign NDA before joining 
😊), we have lined up a number of great guests, people who design and build 
ASICs and can talk firsthand about evolution of networking HW, complexity of 
the process, differences between fixed and programmable pipelines, memories 
and databases. This Thursday (08/04) at 11:00PST we are joined by one and only 
Sharada Yeluri - Sr. Director ASIC at Juniper. Other vendors will be joining 
in the later episodes, usual rules apply – no marketing, no BS.

More to come, stay tuned.

Live feed: https://lnkd.in/gk2x2ezZ

Between 0x2 nerds playlist, videos will be published to: 
https://www.youtube.com/playlist?list=PLMYH1xDLIabuZCr1Yeoo39enogPA2yJB7

Cheers,
Jeff

From: James Bensley
Sent: Wednesday, July 27, 2022 12:53 PM
To: Lawrence Wobker; NANOG
Subject: Re: 400G forwarding - how does it work?

On Tue, 26 Jul 2022 at 21:39, Lawrence Wobker  wrote:
> So if this pipeline can do 1.25 billion PPS and I want to be able to 
> forward 10BPPS, I can build a chip that has 8 of these pipelines and get my 
> performance target that way.  I could also build a "pipeline" that 
> processes multiple packets per clock, if I have one that does 2 
> packets/clock then I only need 4 of said pipelines... and so on and so 
> forth.

Thanks for the response Lawrence.

The Broadcom BCM16K KBP has a clock speed of 1.2Ghz, so I expect the
J2 to have something similar (as someone already mentioned, most chips
I've seen are in the 1-1.5Ghz range), so in this case "only" 2
pipelines would be needed to maintain the headline 2Bpps rate of the
J2, or even just 1 if they have managed to squeeze out two packets per
cycle through parallelisation within the pipeline.

Cheers,
James.


Re: 400G forwarding - how does it work?

2022-08-04 Thread Jeff Tantsura
Apologies for garbage/HTMLed email, not sure what happened (thanks
Brian F for letting me know).
Anyway, the podcast with Juniper (mostly around Trio/Express) has been
broadcasted today and is available at
https://www.youtube.com/watch?v=1he8GjDBq9g
Next in the pipeline are:
Cisco SiliconOne
Broadcom DNX (Jericho/Qumran/Ramon)
For both - the guests are main architects of the silicon

Enjoy


On Wed, Aug 3, 2022 at 5:06 PM Jeff Tantsura  wrote:
>
> Hey,
>
>
>
> This is not an advertisement but an attempt to help folks to better 
> understand networking HW.
>
>
>
> Some of you might know (and love 😊) “between 0x2 nerds” podcast Jeff Doyle 
> and I have been hosting for a couple of years.
>
>
>
> Following up the discussion we have decided to dedicate a number of upcoming 
> podcasts to networking HW, the topic where more information and better 
> education is very much needed (no, you won’t have to sign NDA before joining 
> 😊), we have lined up a number of great guests, people who design and build 
> ASICs and can talk firsthand about evolution of networking HW, complexity of 
> the process, differences between fixed and programmable pipelines, memories 
> and databases. This Thursday (08/04) at 11:00PST we are joined by one and 
> only Sharada Yeluri - Sr. Director ASIC at Juniper. Other vendors will be 
> joining in the later episodes, usual rules apply – no marketing, no BS.
>
> More to come, stay tuned.
>
> Live feed: https://lnkd.in/gk2x2ezZ
>
> Between 0x2 nerds playlist, videos will be published to: 
> https://www.youtube.com/playlist?list=PLMYH1xDLIabuZCr1Yeoo39enogPA2yJB7
>
>
>
> Cheers,
>
> Jeff
>
>
>
> From: James Bensley
> Sent: Wednesday, July 27, 2022 12:53 PM
> To: Lawrence Wobker; NANOG
> Subject: Re: 400G forwarding - how does it work?
>
>
>
> On Tue, 26 Jul 2022 at 21:39, Lawrence Wobker  wrote:
>
> > So if this pipeline can do 1.25 billion PPS and I want to be able to 
> > forward 10BPPS, I can build a chip that has 8 of these pipelines and get my 
> > performance target that way.  I could also build a "pipeline" that 
> > processes multiple packets per clock, if I have one that does 2 
> > packets/clock then I only need 4 of said pipelines... and so on and so 
> > forth.
>
>
>
> Thanks for the response Lawrence.
>
>
>
> The Broadcom BCM16K KBP has a clock speed of 1.2Ghz, so I expect the
> J2 to have something similar (as someone already mentioned, most chips
> I've seen are in the 1-1.5Ghz range), so in this case "only" 2
> pipelines would be needed to maintain the headline 2Bpps rate of the
> J2, or even just 1 if they have managed to squeeze out two packets per
> cycle through parallelisation within the pipeline.
>
>
>
> Cheers,
>
> James.
>
>
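The pipeline-count reasoning in the quoted exchange boils down to one division. A small sketch, using only the figures quoted in the thread; the function name `pipelines_needed` is made up for illustration, and real chips have many other constraints this ignores:

```python
import math


def pipelines_needed(target_pps, clock_hz, packets_per_clock=1.0):
    """Minimum number of parallel pipelines needed to hit target_pps."""
    per_pipeline_pps = clock_hz * packets_per_clock
    return math.ceil(target_pps / per_pipeline_pps)


# Lawrence's example: 1.25 Gpps per pipeline, 10 Bpps target.
print(pipelines_needed(10e9, 1.25e9))       # 8
print(pipelines_needed(10e9, 1.25e9, 2.0))  # 4 with two packets per clock

# James's guess for J2: roughly 1.2 GHz clock, 2 Bpps headline rate.
print(pipelines_needed(2e9, 1.2e9))         # 2
print(pipelines_needed(2e9, 1.2e9, 2.0))    # 1 with two packets per clock
```

Setting `packets_per_clock` to 0.5 covers the "every 2nd clock" case Lincoln mentioned and simply doubles the pipeline count.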


RE: 400G forwarding - how does it work?

2022-08-09 Thread Jeff Tantsura
Saku,

I have forwarded your questions to Sharada.

All,

For this week – at 11:00am PST, Thursday 08/11, we will be joined by Guy 
Caspary (co-founder of Leaba Semiconductor (acquired by Cisco -> SiliconOne)
https://m.youtube.com/watch?v=GDthnCj31_Y

For the next week, I’m planning to get one of main architects of Broadcom DNX 
(Jericho/Qumran/Ramon).

Cheers,
Jeff

From: Saku Ytti
Sent: Friday, August 5, 2022 12:15 AM
To: Jeff Tantsura
Cc: NANOG; Jeff Doyle
Subject: Re: 400G forwarding - how does it work?

Thank you for this.

I wish there would have been a deeper dive to the lookup side. My open 
questions

a) Trio model of packet stays in single PPE until done vs. FP model of
line-of-PPE (identical cores). I don't understand the advantages of
the FP model, the Trio model advantages are clear to me. Obviously the
FP model has to have some advantages, what are they?

b) What exactly are the gains of putting two trios on-package in
Trio6, there is no local-switching between WANs of trios in-package,
they are, as far as I can tell, ships in the night, packets between
trios go via fabric, as they would with separate Trios. I can
understand the benefit of putting trio and HBM2 on the same package,
to reduce distance so wattage goes down or frequency goes up.

c) What evolution they are thinking for the shallow ingress buffers
for Trio6. The collateral damage potential is significant, because WAN
which asks most, gets most, instead each having their fair share, thus
potentially arbitrarily low rate WAN ingress might not get access to
ingress buffer causing drop. Would it be practical in terms of
wattage/area to add some sort of preQoS towards the shallow ingress
buffer, so each WAN ingress has a fair guaranteed-rate to shallow
buffers?

On Fri, 5 Aug 2022 at 02:18, Jeff Tantsura  wrote:
> 
> Apologies for garbage/HTMLed email, not sure what happened (thanks
> Brian F for letting me know).
> Anyway, the podcast with Juniper (mostly around Trio/Express) has been
> broadcasted today and is available at
> https://www.youtube.com/watch?v=1he8GjDBq9g
> Next in the pipeline are:
> Cisco SiliconOne
> Broadcom DNX (Jericho/Qumran/Ramon)
> For both - the guests are main architects of the silicon
> 
> Enjoy
> 
> 
> On Wed, Aug 3, 2022 at 5:06 PM Jeff Tantsura  wrote:
> >
> > Hey,
> >
> > This is not an advertisement but an attempt to help folks to better 
> > understand networking HW.
> >
> > Some of you might know (and love 😊) “between 0x2 nerds” podcast Jeff Doyle 
> > and I have been hosting for a couple of years.
> >
> > Following up the discussion we have decided to dedicate a number of 
> > upcoming podcasts to networking HW, the topic where more information and 
> > better education is very much needed (no, you won’t have to sign NDA 
> > before joining 😊), we have lined up a number of great guests, people who 
> > design and build ASICs and can talk firsthand about evolution of 
> > networking HW, complexity of the process, differences between fixed and 
> > programmable pipelines, memories and databases. This Thursday (08/04) at 
> > 11:00PST we are joined by one and only Sharada Yeluri - Sr. Director ASIC 
> > at Juniper. Other vendors will be joining in the later episodes, usual 
> > rules apply – no marketing, no BS.
> >
> > More to come, stay tuned.
> >
> > Live feed: https://lnkd.in/gk2x2ezZ
> >
> > Between 0x2 nerds playlist, videos will be published to: 
> > https://www.youtube.com/playlist?list=PLMYH1xDLIabuZCr1Yeoo39enogPA2yJB7
> >
> > Cheers,
> >
> > Jeff
> >
> > From: James Bensley
> > Sent: Wednesday, July 27, 2022 12:53 PM
> > To: Lawrence Wobker; NANOG
> > Subject: Re: 400G forwarding - how does it work?
> >
> > On Tue, 26 Jul 2022 at 21:39, Lawrence Wobker  wrote:
> >
> > > So if this pipeline can do 1.25 billion PPS and I want to be able to 
> > > forward 10BPPS, I can build a chip that has 8 of these pipelines and get 
> > > my performance target that way.  I could also build a "pipeline" that 
> > > processes multiple packets per clock, if I have one that does 2 
> > > packets/clock then I only need 4 of said pipelines... and so on and so 
> > > forth.
> >
> > Thanks for the response Lawrence.
> >
> > The Broadcom BCM16K KBP has a clock speed of 1.2Ghz, so I expect the
> > J2 to have something similar (as someone already mentioned, most chips
> > I've seen are in the 1-1.5Ghz range), so in this case "only" 2
> > pipelines would be needed to maintain the headline 2Bpps rate of the
> > J2, or even just 1 if they have managed to squeeze out two packets per
> > cycle through parallelisation within the pipeline.
> >
> > Cheers,
> >
> > James.
> >

-- 
  ++ytti


Re: 400G forwarding - how does it work?

2022-08-10 Thread Jeff Tantsura
Sharada’s answers:

a) Yes, the run-to-completion model of Trio is superior to FP5/Nokia model when 
it comes to flexible processing engines. In Trio, the same engines can do 
either ingress or egress processing. Traditionally, there is more processing on 
ingress than on egress. When that happens, by design, fewer processing 
engines get used for egress, and more engines are available for ingress 
processing. Trio gives full flexibility. Unless Nokia is optimizing the engines 
(not all engines are identical, and some are area optimized for specific 
processing) to save overall area, I do not see any other advantage.  

b) Trio provides on-chip shallow buffering on ingress for fabric queues. We 
share this buffer between the slices on the same die. This gives us the 
flexibility to go easy on the size of SRAM we want to support for buffering. 

c) I didn't completely follow the question. Shallow ingress buffers are for 
fabric-facing queues, and we do not expect sustained fabric congestion. This, 
combined with the fact that we have some speed up over fabric, ensures that all 
WAN packets do reach the egress PFE buffer. On ingress, if packet processing is 
oversubscribed, we have line rate pre-classifiers do proper drops based on WAN 
queue priority.

Cheers,
Jeff

> On Aug 9, 2022, at 16:34, Jeff Tantsura  wrote:
> 
> 
> Saku,
>  
> I have forwarded your questions to Sharada.
>  
> All,
>  
> For this week – at 11:00am PST, Thursday 08/11, we will be joined by Guy 
> Caspary (co-founder of Leaba Semiconductor (acquired by Cisco -> SiliconOne)
> https://m.youtube.com/watch?v=GDthnCj31_Y
>  
> For the next week, I’m planning to get one of main architects of Broadcom DNX 
>  (Jericho/Qumran/Ramon).
>  
> Cheers,
> Jeff
>  
> From: Saku Ytti
> Sent: Friday, August 5, 2022 12:15 AM
> To: Jeff Tantsura
> Cc: NANOG; Jeff Doyle
> Subject: Re: 400G forwarding - how does it work?
>  
> Thank you for this.
>  
> I wish there would have been a deeper dive to the lookup side. My open 
> questions
>  
> a) Trio model of packet stays in single PPE until done vs. FP model of
> line-of-PPE (identical cores). I don't understand the advantages of
> the FP model, the Trio model advantages are clear to me. Obviously the
> FP model has to have some advantages, what are they?
>  
> b) What exactly are the gains of putting two trios on-package in
> Trio6, there is no local-switching between WANs of trios in-package,
> they are, as far as I can tell, ships in the night, packets between
> trios go via fabric, as they would with separate Trios. I can
> understand the benefit of putting trio and HBM2 on the same package,
> to reduce distance so wattage goes down or frequency goes up.
>  
> c) What evolution they are thinking for the shallow ingress buffers
> for Trio6. The collateral damage potential is significant, because WAN
> which asks most, gets most, instead each having their fair share, thus
> potentially arbitrarily low rate WAN ingress might not get access to
> ingress buffer causing drop. Would it be practical in terms of
> wattage/area to add some sort of preQoS towards the shallow ingress
> buffer, so each WAN ingress has a fair guaranteed-rate to shallow
> buffers?
>  
> On Fri, 5 Aug 2022 at 02:18, Jeff Tantsura  wrote:
> > 
> > Apologies for garbage/HTMLed email, not sure what happened (thanks
> > Brian F for letting me know).
> > Anyway, the podcast with Juniper (mostly around Trio/Express) has been
> > broadcasted today and is available at
> > https://www.youtube.com/watch?v=1he8GjDBq9g
> > Next in the pipeline are:
> > Cisco SiliconOne
> > Broadcom DNX (Jericho/Qumran/Ramon)
> > For both - the guests are main architects of the silicon
> > 
> > Enjoy
> > 
> > 
> > On Wed, Aug 3, 2022 at 5:06 PM Jeff Tantsura  
> > wrote:
> > >
> > > Hey,
> > >
> > >
> > >
> > > This is not an advertisement but an attempt to help folks to better 
> > > understand networking HW.
> > >
> > >
> > >
> > > Some of you might know (and love 😊) “between 0x2 nerds” podcast Jeff 
> > > Doyle and I have been hosting for a couple of years.
> > >
> > >
> > >
> > > Following up the discussion we have decided to dedicate a number of 
> > > upcoming podcasts to networking HW, the topic where more information and 
> > > better education is very much needed (no, you won’t have to sign NDA 
> > > before joining 😊), we have lined up a number of great guests, people who 
> > > design and build ASICs and can talk firsthand about evolution of 
> > > networking HW, complexity of the proces

Re: Longest prepend( 255 times) as path found

2022-08-25 Thread Jeff Tantsura
https://datatracker.ietf.org/doc/html/draft-ietf-grow-as-path-prepending

Cheers,
Jeff

> On Aug 25, 2022, at 11:00, Tom Beecher  wrote:
> 
> 
> 
>> Usually, what should we do? Should we filter it?
> 
> 
> As with many things, the answer depends on your situation. 
> 
> If I was running an edge device with a limited FIB, perhaps I might drop it 
> to save memory. If I had beefier devices, perhaps I would just depref it.  
> Maybe it's a prefix/source ASN I have to care about in some way so I had to 
> take some other action. 
> 
> There is no one size fits all answer. 
> 
> Is prepending an announcement to oblivion generally useless? Yes. But people 
> will do it. So you just have to decide if you need to care, and then what to 
> do , or not do. 
> 
>> On Thu, Aug 25, 2022 at 10:25 AM anonymous  wrote:
>> Hey everyone,
>> 
>> Too many hops found as below. 
>> Usually, what should we do? Should we filter it?
>> 
>> 91.246.12.0/24 
>> 
>> 
>>   AS path: 4788 9002 41313 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 I
>> 
>>   AS path: 9930 9002 41313 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 
>> 51196 51196 51196 51196 51196 51196 51196 51196 51196 51196 I
>> 
>> /noname
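For anyone who wants to flag paths like the one quoted above before deciding whether to drop or depref them, a rough sketch of the check follows. The 50-hop limit is a made-up threshold for illustration, not a recommendation, and `longest_prepend` and `classify` are hypothetical helper names; on a real router this would be an AS-path-length policy in the vendor's own syntax.

```python
from itertools import groupby


def longest_prepend(as_path):
    """Return (asn, run_length) for the longest run of one repeated ASN."""
    if not as_path:
        return None
    runs = [(asn, len(list(group))) for asn, group in groupby(as_path)]
    return max(runs, key=lambda run: run[1])


def classify(as_path, max_len=50):
    """Toy policy: reject paths longer than max_len (hypothetical limit)."""
    return "reject" if len(as_path) > max_len else "accept"


# Shape of the path from the thread: three transit ASNs, then the origin
# prepended 255 times (the count is taken from the subject line).
path = [4788, 9002, 41313] + [51196] * 255
print(longest_prepend(path), len(path), classify(path))
```

Whether the "reject" branch should really drop the route or just lower its preference is the situational call described in the reply above.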


Re: Router ID on IPv6-Only

2022-09-12 Thread Jeff Tantsura
Indeed, someone was recently complaining that FRR is unhappy with a peer with 
router-id from class E range…

Cheers,
Jeff

> On Sep 9, 2022, at 09:30, Saku Ytti  wrote:
> 
> On Fri, 9 Sept 2022 at 09:31, Crist Clark  wrote:
> 
>> As I said in the original email, I realize router IDs just need to be
>> unique in
>> an AS. We could have done random ones with IPv4, but using a well chosen
> 
> In some far future this will be true. We meet eBGP speakers across the
> world, and not everyone supports route refresh, _TODAY_, I suspect
> mostly because internally developed eBGP implementations and
> developers were not very familiar with how real life BGP works.
> RFC6286 is not supported by all common implementations, much less
> uncommon. And even for common implementations it requires a very new
> image (20.4 for Junos, many are even in 17.4 still).
> 
> So while we can consider BGP router-id to be only locally significant
> when RFC6286 is implemented, in practice you want to be defensive in
> your router-id strategy, i.e. avoid at least scheme of 1,2,3,4,5,6...
> on thesis that will be common scheme and liable to increase support
> costs down the line due to collision probability being higher. While
> it might also add commercial advantage for transit providers, to have
> low router-id to win billable traffic.
> 
>> And to get even a little more specific about our particular use case and
>> the
>> suggestion here to build the device location into the ID, we're
>> generally not
> 
> I would strongly advise against any information-to-ID mapping schemes.
> This adds complexity and reduces flexibility and requires you to know
> the complete problem ahead of time, which is difficult, only have
> rules you absolutely must have. I am sure most people here have
> experience having too cutesy addressing schemes some time in their
> past, where forming an IP address had unnecessary rules in them, which
> just created complexity and cost in future.
> If you can add an arbitrary 32b ID to your database, this problem
> becomes very easy. If not, it's tricky.
> 
> -- 
>  ++ytti


Re: Router ID on IPv6-Only

2022-09-13 Thread Jeff Tantsura
Looking at the fix, Donald has only removed the 
IPV4_CLASS_DE(a) ((((uint32_t)(a)) & 0xe0000000) == 0xe0000000) 
validation but kept INADDR_ANY. 
I’ll bring up RFC6286 to him

Cheers,
Jeff

> On Sep 12, 2022, at 13:41, Bjørn Mork  wrote:
> Jeff Tantsura  writes:
> 
>> Indeed, someone was recently complaining that FRR is unhappy with a
>> peer with router-id from class E range…
> 
> This made me curious enough to dig up the fix.  If anyone else is interested:
> https://github.com/FRRouting/frr/commit/b5c2113e47f846d0c48fb4ef63e29bf96bd2fbe2
> 
> 
> Bjørn
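To make the two checks discussed in this thread concrete, here is a small sketch. It assumes the class D/E test is the usual "top three bits set" mask, and it reads RFC 6286 as requiring only a non-zero 4-octet value; both function names are illustrative and this is not FRR's code.

```python
import ipaddress


def is_class_de(router_id):
    """True if the dotted-quad router-id falls in 224.0.0.0 - 255.255.255.255."""
    value = int(ipaddress.IPv4Address(router_id))
    return (value & 0xE0000000) == 0xE0000000


def rfc6286_valid(router_id):
    """RFC 6286 view: any non-zero 4-octet unsigned integer is acceptable."""
    return int(ipaddress.IPv4Address(router_id)) != 0


for rid in ("192.0.2.1", "240.0.0.1", "0.0.0.0"):
    print(rid, is_class_de(rid), rfc6286_valid(rid))
```

A class E router-id such as 240.0.0.1 fails the first check but passes the second, which is the mismatch the complaint was about.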


Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-10 Thread Jeff Tantsura
There have been a number of efforts to implement FIB (actually BGP RIB) 
compression. There’s a white paper from MS Research; I recall Spotify talking 
about running off-box BGP compression SW and re-injecting a summarized BGP RIB; 
Volta Networks had an implementation that compressed the full BGP table to 
about 370K routes with no connectivity loss and reasonably fast reaction to 
topology changes when disaggregation was needed (it even won some Intel prize 
for innovation). I'm not sure what happened to it (Volta was acquired by IBM 
some time ago). To my memory, IOS-XR allows off-box custom best path logic and 
re-injection of routes into the BGP RIB.

Cheers,
Jeff

> On Oct 10, 2022, at 09:26, William Herrin  wrote:
> 
> On Mon, Oct 10, 2022 at 8:37 AM Mike Hammett  wrote:
>> Feasibility of adding some middleware that culls unneeded routes (existing 
>> more specific and aggregate routes pointing to the same next hop), when that 
>> table starts to fill?
> 
> This is called "FIB aggregation." It exists and works but is not widely 
> adopted.
> 
> Regards,
> Bill Herrin
> 
> -- 
> For hire. https://bill.herrin.us/resume/


Re: any dangers of filtering every /24 on full internet table to preserve FIB space ?

2022-10-10 Thread Jeff Tantsura
Link to the Arista article about their Spotify deployment (2016); it has all the relevant links, and the approach can be implemented on a variety of vendors:
https://aristanetworks.force.com/AristaCommunity/s/article/spotifys-sdn-internet-router

Cheers,
Jeff

On Oct 10, 2022, at 15:57, Ryan Rawdon  wrote:

On Oct 10, 2022, at 6:37 PM, Matthew Petach  wrote:

On Mon, Oct 10, 2022 at 8:44 AM Mark Tinka  wrote:
On 10/10/22 16:58, Edvinas Kairys wrote:

> Hello,
>
> We're considering to buy some Cisco boxes - NCS-55A1-24H. That box has 
> 24x100G, but only 2.2mln route (FIB) memory entries. In a near future 
> it will be not enough - so we're thinking to deny all /24s to save the 
> memory. What do you think about that approach - I know it could 
> provide some misbehavior. But theoretically every filtered /24 could 
> be routed via smaller prefix /23 /22 /21 or etc. But of course it 
> could be a situation when denied /24 will not be covered by any 
> smaller prefix.

I wouldn't bank on that.

I am confident I have seen /24's with no covering route, more so for PI 
space from RIR's that may only be able to allocate a /24 and nothing 
shorter.

It would be one heck of an experiment, though :-).

Mark.

I may or may not have done something like this at $PREVIOUS_DAY_JOB.

We (might have) discovered some interesting brokenness on the Internet in doing so; in one case, a peer was sending a /20 across exchange peering sessions with us, along with some more specific /24s.  After filtering out the /24s, traffic rightly flowed to the covering /20.  Peer reached out in an outraged huff; the /24s were being advertised from non-backbone-connected remote sites in their network, that suddenly couldn't fetch content from us anymore.  Traceroutes from our side followed the /20 back to their "core", and then died.  They explained the /24s were being advertised from remote sites without backbone connections to the site advertising the /20, and we needed to stop sending traffic to the /20, and send it directly to the /24 instead.

We demurred, and let them know we were correctly following the information in the routing table.

We encountered similar behavior, but not from a network deaggregating their own address space like this.  Rather, it was a network (actually a network services vendor) who had a PA /24 from a colo provider that they were no longer interconnected with.  We had to filter /24s on transit (our network does not resell transit) due to issues with spanslogic inefficiency on Nexus 7k.  When trying to turn up a demo with this vendor, connections were not establishing.  We found that they had an older PA /24 in the FIB but we were following a /20 or some such route to their old upstream/colo.  We ended up doing a bunch of work to find other such “possibly disconnected /24s” based mainly on origin ASN, and put in exceptions to our filtering until we could complete some hardware upgrades.

In situations like this, we of course did have functioning default routes from our upstream - but that doesn’t help since the /20 from a peer was attracting and blackholing the traffic.

As IPv4 continues to deaggregate and get resold and otherwise optimized, I imagine this will become more common.  Not a problem for a multi-homed stub network with multiple default routes coming from upstream, unless they have peering and don’t micromanage it with this in mind.

Ryan

They became even more huffy, insisting that we were breaking the internet by not following the correct routing for the more-specific /24s which were no longer present in our tables.  No amount of trying to explain to them that they should not advertise an aggregate route if no connectivity to the more specific constituents existed seemed to get the point across.  In their eyes, advertising the /24s meant that everyone should follow the more specific route to the final destination directly.

So, even seeing a 'covering route' in the table is no guarantee that you won't create subtle and not-so-subtle breakage when filtering out more specifics to save table space.   ^_^;

+1

Having (possibly) done this once in the past, I'd strongly recommend looking for a different solution--or at least be willing to arm your front-end response team with suitable "No, *you* broke the Internet" asbestos suits before running a git commit to push your changes out to all the affected devices in your network.   ;)

Matt
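
As a rough aid for anyone weighing the /24 filter, the sketch below lists the /24s in a table that have no less-specific covering prefix at all, which is the first failure mode raised in this thread. It assumes the table is available as a plain list of prefix strings; the function name and sample prefixes are illustrative. As the stories above show, the presence of a covering route does not prove that traffic sent along it actually arrives.

```python
import ipaddress


def uncovered_24s(prefixes):
    """Return the IPv4 /24s with no covering /23../1 in the same table."""
    nets = {ipaddress.ip_network(p) for p in prefixes}
    orphans = []
    for net in nets:
        if net.version != 4 or net.prefixlen != 24:
            continue
        # Check /23, /22, ... /1; a default route is deliberately not counted.
        covered = any(net.supernet(new_prefix=length) in nets
                      for length in range(23, 0, -1))
        if not covered:
            orphans.append(str(net))
    return sorted(orphans)


table = [
    "198.51.96.0/20",
    "198.51.100.0/24",   # covered by the /20 above
    "192.0.2.0/23",
    "192.0.2.0/24",      # covered by the /23 above
    "203.0.113.0/24",    # nothing shorter covers this one
]
print(uncovered_24s(table))
```

Prefixes returned by this check would become unreachable, short of a default route, once all /24s are denied.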


Re: Experiences with commercial NOS vendors in white box space

2022-12-02 Thread Jeff Tantsura
Hey,

BCM DNX ASICs don’t make a device a white box; many commercial vendors 
program them while completely or at least partially avoiding the BCM SDK, 
using the DBs in different ways, etc.
Looking at the choice of modern NOSes: Arrcus is a high-performance, 
YANG-programmable, vertically integrated NOS built specifically (own HAL) on DNX.
OcNOS: I haven’t touched it for a couple of years and never liked it (for the 
aforementioned reasons and more).
RtBrick: another modern, highly scalable NOS using DNX that also provides a 
fully fledged BRAS.
I’ll CC the CTOs of both companies so you can contact them directly.

Hope this is helpful.

Cheers,
Jeff

> On Nov 30, 2022, at 07:51, Graham Johnston  wrote:
> 
> Good day.
> 
> I'm curious to hear from those with direct, hopefully in-production,
> experience in using a commercial network operating system vendor along
> with white box switches. I'm specifically looking for operators in the
> service provider space, rather than data center or enterprise. I'm
> largely focused on Jericho2/Qumran2 based devices, with what would
> likely be modest feature requirements. We currently use MPLS with RSVP
> to build automatic paths, but don't do anything specific for traffic
> engineering. Segment routing isn't a requirement for today. Currently
> use more traditional forms of MPLS services for customer L2 and L3
> VPNs, but are investigating a transition to EVPN. Automation is very
> much important to us, as are routing security features. Based on
> research, and use of vertically integrated Jericho-based switches, we
> aren't concerned about QoS as our needs aren't super complex. I guess
> I'm largely saying that we don't expect the ASIC to be the weak point
> in our use case, but rather the NOS or the nature of support from the
> vendor.
> 
> I'm aware of IPInfusion and their OcNOS product, but the CLI and
> config syntax feels dated. I feel like I've been ruined by Cisco RPL,
> Juniper policy-statements, and Arista RCF and expect I would find
> wanting more than what route-map syntax has to offer. Can I accomplish
> the same complex routing policies via route-maps that I can with more
> modern solutions, and I'm just assuming it's limiting? Is it fair to
> say that even if I can achieve the same functionality, that route-maps
> are the poorer choice when it comes to the human interaction aspect?
> 
> I know of Arrcus, but don't know much more than I can see on their
> website. Edge-core has an interesting reference in its Open Networking
> Solution Guide on their website in which they position Arrcus for core
> applications and IPInfusion for access and aggregation. All of which
> could be meaningless based on the varied definitions and expectations
> of what a core network is and does. Is it feature rich or just a set
> of fast LSR P-routers? The Edge-core guide also identifies Exaware and
> Capgemini, both of whom I know little about. Are there viable SP
> focused NOS vendors that I haven't touched on?
> 
> Thanks in advance for any reply, be it on-list or off-list.
> 
> Regards,
> Graham


Re: SDN Internet Router (sir)

2023-01-06 Thread Jeff Tantsura
Freertr folks have just published (I didn’t look into the details of their implementation though):

“rare/freertr just got fib compression.. in our nren, the v4 table can be compressed from 900k to 260k, the v6 table from 160k to 52k... the tofino2 asic with our dataplane code ( https://lnkd.in/dJrHVZqE ) can accomodate 520k v4 and 130k v6”

On an unrelated note, they are also implementing RIFT; we plan to test against the Junos and Python open-source implementations during the IETF 116 hackathon.

Cheers,
Jeff

On Jan 6, 2023, at 17:13, Matthew Walster via NANOG  wrote:

On Fri, 6 Jan 2023, 18:38 Mike Hammett,  wrote:

I suspect it always will have value, whether it's peering routers, POP routers, multi-homed customer routers, etc.

Indeed. It's not "clean" but it is an acceptable tradeoff if you know what you're doing, and how traffic sloshes around etc.

I wrote a tool once that took a number of BGP feeds and aggregated the prefixes based on the next-hop values, which was *amazingly* good at reducing FIB sizes, but consumed so much CPU and memory, not to mention the latency of updates during any sizeable churn event, that it proved less useful than just precomputing based on historical traffic flows and updating the lists semi-frequently.

The idea of Juniper's EPE etc is very attractive, and largely matches what I had done back then, but does it with a lot more finesse. Ultimately, it's a tradeoff between CapEx of the high FIB router and the OpEx of the engineers who have to maintain the often hacky solution ;)

M
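
The "precompute from historical traffic flows" approach mentioned above can be sketched roughly as follows, assuming per-prefix byte counts have already been exported from a flow collector. The function name, the 85% threshold and the sample numbers are assumptions for illustration only.

```python
def top_talkers(bytes_by_prefix, share=0.85):
    """Pick the fewest prefixes whose traffic adds up to `share` of the total."""
    total = sum(bytes_by_prefix.values())
    chosen, running = [], 0
    # Largest contributors first, so the installed list stays short.
    for prefix, count in sorted(bytes_by_prefix.items(),
                                key=lambda item: item[1], reverse=True):
        if total == 0 or running >= share * total:
            break
        chosen.append(prefix)
        running += count
    return chosen


flows = {
    "198.51.100.0/24": 700,   # e.g. a large content network
    "203.0.113.0/24": 200,
    "192.0.2.0/24": 60,
    "198.51.96.0/20": 40,
}
print(top_talkers(flows))
```

Only the selected prefixes would be installed as specific routes; everything else follows the default.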


Re: SDN Internet Router (sir)

2023-01-06 Thread Jeff Tantsura
You might want to search for “policy based add-path”: same idea (BGP listener + flow collector), different issue (60M+ entries BGP RIB). All clouds use some version of that; not sure about open sourcing it though.

Cheers,
Jeff

On Jan 6, 2023, at 17:00, Mike Hammett  wrote:

Right.

Only I'm not the guy to build that solution.

What I originally linked to (and another link or two contributed since then) seem to be people that already built that solution.

-
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

From: "Tom Beecher" 
To: "Mike Hammett" 
Cc: "Mel Beckman" , "NANOG" 
Sent: Friday, January 6, 2023 9:51:43 AM
Subject: Re: SDN Internet Router (sir)

Gotcha. Set up a Quagga/Bird box. Do your top talker analysis, use that box to inject the routes you deem important with communities.  On your routers, create policy structure to only take a default plus those communities. Obviously lots of devils in the details of the implementation, but something like that is all you need to do.

On Fri, Jan 6, 2023 at 10:29 Mike Hammett  wrote:

Maybe?

I don't need any additional performance tests, though. Just watching which prefixes are the top talkers and leaving the rest to default.

I'm not looking at this to do what a BGP optimizer would do and find the best tested path to the top talkers and then massage BGP to get it routed that way. Determine the top talkers, then let BGP do its thing for those top talkers.

I don't want to manually say X traffic from Y POP manually goes here, but I don't want to just leave it to default routing either. Something in the middle.

-
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

From: "Tom Beecher" 
To: "Mike Hammett" 
Cc: "Mel Beckman" , "NANOG" 
Sent: Friday, January 6, 2023 9:16:19 AM
Subject: Re: SDN Internet Router (sir)

Thanks for this example. It sounds like you are describing egress peer engineering, but kinda in reverse.

In 'traditional' EPE, the routers have all the routes, and you are using the external controller to perform the performance tests that matter to you, and signal the network where to take the traffic based on those tests. It seems like you want to do the same thing, but instead of having the controller signal the network where to carry bits, you want the controller to signal the networks what routes are present, and direct the bits that way. Do I have this right?

On Thu, Jan 5, 2023 at 4:12 PM Mike Hammett  wrote:

I hesitated to get too specific in examples because someone is going to drag the conversation into the weeds.

Let's take the Dallas - New Orleans - Atlanta example where I have a connection from New Orleans to Dallas and a connection from New Orleans to Atlanta.

Let's say I peer with Netflix in both markets. Netflix chooses to serve me out of Atlanta, for whatever reason. Say my default route sends my traffic to Dallas. That's not where Netflix wanted it, so now I have to go from Dallas to Atlanta, whether that's my circuit or across the public Internet. Potentially, it's on MPLS and it rides back through the New Orleans router to get back to Atlanta. That's a long trip when I already had a better path, the less-than-full-fib router just didn't know about it. Given that Netflix is a sizable amount of traffic in an eyeball ISP, that's a lot of traffic to be going the wrong way. If the website for Viktor's Arctic Plunge in Siberia was hosted in Atlanta, I wouldn't give two craps that the traffic went the wrong way because A), I'll probably never go there and B) when someone does, it won't be meaningfully enough traffic to accommodate.

Someone's going to tell me to put a full-table router in New Orleans. Maybe I should. Okay, so maybe I have a POP in Ashford, Alabama. It has transport to New Orleans and Atlanta. There aren't enough grains of sugar in Ashford, Alabama to justify a current-generation, full table router.

Now I'm even closer to Atlanta, but default may point to New Orleans.

-
Mike Hammett
Intelligent Computing Solutions
Midwest Internet Exchange
The Brothers WISP

From: "Mel Beckman" 
To: "Mike Hammett" 
Cc: "Joe Maimon" , "NANOG" 
Sent: Thursday, January 5, 2023 2:54:27 PM
Subject: Re: SDN Internet Router (sir)




Mike,


I’m not sure I understand what you mean by “suboptimal“ routing. Even though the Internet uses AS path length for routing,  many of those path lengths are bogus, and don’t really represent any kind of path performance value. For example, a single AS might
 hide many hops in an MPLS network as a single hop, obscuring asymmetric routing and other uglies. Prepending also occurs when destinations are trying to enforce their own engineering  policies, which often conflict with yours or mine.


So what do you mean by “suboptimal“? Are you thinking that the “best” path in BGP tables actually meant you

Re: BGP Engines with support to "RTFilter address-family"

2023-02-27 Thread Jeff Tantsura
FRR hasn’t implemented the RTFilter address family (Constrained Route 
Distribution, RFC 4684), nor is it planning to, to my knowledge (unless 
someone comes and codes it ;-))
I believe Arrcus has implemented it (Keyur to confirm).

Cheers,
Jeff

> On Feb 26, 2023, at 22:58, Paul Rolland  wrote:
> 
> Hello,
> 
>> On Sun, 26 Feb 2023 17:46:42 -0300
>> Douglas Fischer  wrote:
>> 
>> But I'm looking for an open-source engine that supports it.
>> 
>> The official FRR documentation does not mention anything about RFC 4364,
>> or RTFilter address family.
>> So, I think FRR does not support RTFilter Constrained Route Distribution.
>> 
>> Do any of the colleagues have any suggestions on this?
> 
> ExaBGP ?
> 
> https://github.com/Exa-Networks/exabgp/wiki/RFC-Information
> 
> Best,
> Paul
> 
> -- 
> Paul Rolland                      E-Mail : rol(at)witbe.net
> CTO - Witbe.net SA                Tel. +33 (0)1 47 67 77 77
> 18 Rue d'Arras, Bat. A11          Fax. +33 (0)1 47 67 77 99
> F-92000 Nanterre                  RIPE : PR12-RIPE
> 
> Please no HTML, I'm not a browser - Pas d'HTML, je ne suis pas un
> navigateur "Some people dream of success... while others wake up and work
> hard at it" 
> 
> "I worry about my child and the Internet all the time, even though she's
> too young to have logged on yet. Here's what I worry about. I worry that 10
> or 15 years from now, she will come to me and say 'Daddy, where were you
> when they took freedom of the press away from the Internet?'"
> --Mike Godwin, Electronic Frontier Foundation 


Re: BGP Books

2023-04-28 Thread Jeff Tantsura
If you are looking for BGP in DC (either unicast and/or VPN) we (Jeff Doyle and I) have published a significant number of podcasts on “between 0x2 nerds” (from basic BGP to EVPN to BGP security to HW) - https://youtube.com/playlist?list=PLMYH1xDLIabuZCr1Yeoo39enogPA2yJB7

Cheers,
Jeff

On Apr 27, 2023, at 15:37, Warren Kumari  wrote:

On Tue, Apr 25, 2023 at 7:20 PM, Steven G. Huter  wrote:

On 4/25/23 3:55 PM, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote:

It has been a couple of decades since I've done any BGP in anger,
but it looks like I will be jumping into the deep end again, soon,
and I desperately need to get up to speed again.

There seem to be a lot of good guides out there from Cisco, Juniper,
and the like, but naturally they are very product oriented.  What
I'm looking for is more like the Stevens networking bibles (i.e.
"BGP Illustrated Vol I and II"). Something that covers more than
just the raw protocols, and includes things like RPKI.  (The world
sure has changed since the last time I was doing this!)

Any/all suggestions welcome.

https://learn.nsrc.org/bgp

Yes, this. Much of it (all of it?) is presented by Philip Smith, and he's a sufficiently entertaining speaker that it's worth watching even if you are already a bgp "expert".

As for books - I used to buy a copy of "BGP4: Inter-Domain Routing in the Internet" by John W Stewart for all of my new hires - https://amzn.to/3VdqdfK . It's really short and sweet, and covers just the stuff that you need to know. It is old at this point (1998!), but still well worth the read.

W

Steve


Re: Best Linux (or BSD) hosted BGP?

2023-05-08 Thread Jeff Tantsura
All fixed (thanks Donald)

CVE-2022-40302 and CVE-2022-40318: https://github.com/FRRouting/frr/pull/12043
CVE-2022-43681: https://github.com/FRRouting/frr/pull/12247

Cheers,
Jeff

> On May 3, 2023, at 2:52 AM, Hank Nussbacher  wrote:
> 
> On 02/05/2023 17:56, Warren Kumari wrote:
> 
> For those that like FRR:
> https://thehackernews.com/2023/05/researchers-uncover-new-bgp-flaws-in.html
> 
> Regards,
> Hank
> 
>> +lots.
>> 
>> I've used a number of Linux routing thingies (BIRD, Quagga, VyOS/Ubiquiti, 
>> OpenBGPd, ExaBGP), and FRR is (for me at least) by far the friendliest. It's 
>> trivial to spin this up on a cloud VM and start announcing a prefix.
>> 
>> For doing something like Anycast though (where you are mostly just 
>> announcing a route on demand), ExaBGP is great.
>> 
>> W
>> 
>> 
>> On Mon, May 01, 2023 at 2:03 PM, Jean Franco  wrote:
>> 
>>https://frrouting.org/ 
>> 
> 



Re: Best Linux (or BSD) hosted BGP?

2023-05-08 Thread Jeff Tantsura


Saying that IS-IS in FRR is broken is incorrect. That it is in many ways weird 
(especially if you have worked with commercial code bases; no offense to the 
folks who coded it :)), that it is naive and doesn’t scale, that it is missing 
features: for sure.
FRR today runs some of the biggest DCs in the world and is reasonably stable 
within the feature set used there (RFC 7938). There’s interest in further IS-IS 
development, and you can see some minor bug fixes/features coming from a 
particular company; if you are interested, join them and work together.

FRR is an open source project; whining about missing features is not helpful, 
coding (or at least testing) and contributing is.

Cheers,
Jeff

> On May 8, 2023, at 9:51 AM, Mark Tinka  wrote:
> 
> 
> 
> On 5/8/23 18:45, Mark Tinka wrote:
> 
>> Broken when talking to Cisco IOS XE. Catalogued here:
>> 
>> https://lists.frrouting.org/pipermail/frog/2023-March/001265.html
>> 
>> I have no doubt FRR can talk IS-IS to other instances of FRR, but that is 
>> not a realistic scenario in a large scale network with multiple vendors.
> 
> And to add, IS-IS in Quagga has been broken from the get-go. There is no work 
> happening there to develop it. All hopes for a working IS-IS in 
> non-commercial vendor code rest on FRR.
> 
> I have not considered IS-IS in VyOS, as my use-case is for Anycast 
> running on FreeBSD.
> 
> Mark.



Re: JunOS/FRR/Nokia et al BGP critical issue

2023-08-31 Thread Jeff Tantsura
The FRR fix went into 9.0 and has been backported to 8.5 and 8.4; Cumulus 5.6 
will include the fix.

Cheers,
Jeff

> On Aug 30, 2023, at 5:32 AM, Mark Prosser  wrote:
> 
> Thanks for sharing this, Mike. I saw it on lobste.rs yesterday and figured 
> everyone would be ahead.
> 
> I'm running VyOS in a volunteer WISP but not with BGP peering... I'm thinking 
> to test it now as we'll likely swap in VyOS for it soon.
> 
> I saw this PR as a reply on Mastodon:
> 
> https://github.com/FRRouting/frr/pull/14290
> 
> Warm regards,
> 
> Mark
> 



RE: Multiple ISP Load Balancing

2011-12-14 Thread Jeff Tantsura
Hi David,

You might want to take a look at work happening in ALTO 
(http://tools.ietf.org/wg/alto/)

Regards,
Jeff
-Original Message-
From: Holmes,David A [mailto:dhol...@mwdh2o.com] 
Sent: Wednesday, December 14, 2011 11:07 AM
To: nanog@nanog.org
Subject: Multiple ISP Load Balancing

From time to time some have posted questions asking if BGP load balancers such 
as the old Routescience Pathcontrol device are still around, and if not what 
have others found to replace that function. I have used the Routescience 
device with much success 10 years ago when it first came on the market, but 
since then a full BGP feed has become much larger, Routescience has been 
bought by Avaya, then discontinued, and other competitors such as Sockeye, 
Netvmg have been acquired by other companies.

Doing some research on how load balancing can be accomplished in 2011, I have 
come across Cisco's performance routing feature, and features from load 
balancing companies such as F5's Link Controller. I have always found BGP to be 
easy to work with, and an elegant, simple solution to load balancing using a 
route-reflector configuration in which one BGP client (Routescience Pathcontrol 
in my background) learns the best route to destination networks, and then 
announces that best route to BGP border routers using common and widely 
understood BGP concepts such as communities and local pref, and found this to 
lead to a deterministic Internet routing architecture. This required a 
knowledge only of IETF standards (common BGP concepts and configurations), 
required no specialized scripting, or any other knowledge lying outside IETF 
boundaries, and it seemed reasonable to expect that network engineers should 
eagerly and enthusiastically want to master this technology, just as any other 
technology must be mastered to run high availability networks.

So I am wondering if anyone has experience with implementing load balancing 
across multiple ISP links in 2011, and if there have been any comparisons 
between IETF standards-based methods using BGP, and other proprietary methods 
which may use a particular vendor's approach to solving the same problem, but 
involves some complexity with more variables to be plugged in to the 
architecture.

David



  



RE: bgp update destroying transit on redback routers ?

2011-12-22 Thread Jeff Tantsura
Olivier,

Thanks!
We've done our best to provide the fix ASAP.

Regards,
Jeff
-Original Message-
From: Olivier Benghozi [mailto:olivier.bengh...@wifirst.fr] 
Sent: Thursday, December 22, 2011 5:20 AM
To: nanog@nanog.org
Cc: Alexandre Snarskii; Jeff Tantsura
Subject: Re: bgp update destroying transit on redback routers ?

Aha, it looks like our Quebecer friends from Hostlogistic (AS46609) have again 
been advertising their now famous funny aggregate with their mad Brocade 
router, since yesterday 10pm UTC (that is 5pm in Quebec)...
Same route to 206.125.164.0/22, same AGGREGATOR attribute full of 0.

At least I can say that the patched Ericsson bgpd stopped resetting the 
sessions.


regards,
Olivier


Le 2 déc. 2011 à 23:14, Jeff Tantsura a écrit :

> Hi Alexandre,
> 
> You are right, the behavior is exactly as per RFC 4271 section 6:
> "When any of the conditions described here are detected, a 
> NOTIFICATION message, with the indicated Error Code, Error Subcode, and Data 
> fields, is sent, and the BGP connection is closed."
> So because ASN 0 in AGGREGATOR is seen as a malformed UPDATE, we send 3/9 and 
> close the connection.
> 
> Ideally it should be treated as "treat-as-withdraw" as per 
> draft-chen-ebgp-error-handling, however please note - this is still a draft, 
> not a normative document and with all my support it takes time to implement.
> 
> Once again, we understand the implications for our customers and hence are 
> going to disable the ASN 0 check.
> 
> P.S. We have strong evidence that the update in question was caused by 
> a bug on a freshly updated router (I'm not going to disclose the 
> vendor)
> 
> Regards,
> Jeff
> 
> 
> -Original Message-----
> From: Alexandre Snarskii [mailto:s...@snar.spb.ru]
> Sent: Friday, December 02, 2011 6:36 AM
> To: Jeff Tantsura
> Cc: nanog@nanog.org
> Subject: Re: bgp update destroying transit on redback routers ?
> 
> On Thu, Dec 01, 2011 at 04:56:43PM -0500, Jeff Tantsura wrote:
>> Hi,
>> 
>> Let me take it over from now on, I'm the IP Routing/MPLS Product 
>> Manager at Ericsson responsible for all routing protocols.
>> There's nothing wrong in checking ASN in AGGREGATOR, we don't really 
>> want to see ASN 0 anywhere, that's how draft-wkumari-idr-as0
>> (draft-ietf-idr-as0-00) came into the world.
> 
> This draft says that
> 
> If a BGP speaker receives a route which has an AS number of zero in the 
> AS_PATH (or AS4_PATH) attribute, it SHOULD be logged and treated as a 
> WITHDRAW. This same behavior applies to routes containing zero as the 
> Aggregator or AS4 Aggregator.
> 
> but observed behaviour was more like following: 
> 
> If a BGP speaker receives [bad route] it MUST close session immediately with 
> NOTIFICATION Error Code 'Update Message Error' and subcode 'Error with 
> optional attribute'.
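
To make the two behaviours in this exchange concrete, here is a small sketch that decodes an AGGREGATOR attribute value and picks an action when the AS number is zero. It assumes the value is the usual 2-octet or 4-octet AS followed by an IPv4 address. The function names and the returned strings are illustrative and do not come from any vendor's bgpd.

```python
import ipaddress
import struct


def parse_aggregator(value):
    """Decode an AGGREGATOR attribute value into (asn, aggregator address)."""
    if len(value) == 6:        # 2-octet AS + IPv4 address
        asn, addr = struct.unpack("!H4s", value)
    elif len(value) == 8:      # 4-octet AS + IPv4 address
        asn, addr = struct.unpack("!I4s", value)
    else:
        raise ValueError("unexpected AGGREGATOR length")
    return asn, str(ipaddress.IPv4Address(addr))


def action_for(value, treat_as_withdraw):
    """Choose between the two behaviours discussed in the thread for ASN 0."""
    asn, _ = parse_aggregator(value)
    if asn != 0:
        return "accept"
    if treat_as_withdraw:
        return "log and treat as withdraw"
    # Strict RFC 4271 section 6 handling: NOTIFICATION 3/9, then close.
    return "send NOTIFICATION 3/9 and close session"


all_zero = bytes(6)   # "AGGREGATOR attribute full of 0"
print(parse_aggregator(all_zero))
print(action_for(all_zero, treat_as_withdraw=False))
print(action_for(all_zero, treat_as_withdraw=True))
```

The difference matters operationally: the strict path drops every route learned over the session, while treat-as-withdraw loses only the one offending route.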




Re: IPTV and ASM

2011-12-28 Thread Jeff Tantsura
Mike,

To my knowledge, in most of today's networks, even if legacy equipment doesn't 
support IGMPv3, most likely the first-hop router does static translation and 
runs SSM upstream.
The usual reason not to migrate to SSM is that ASM is already there and works 
just fine :)
The cost of supporting RP infrastructure is usually the main non-technical 
argument against using ASM.
Would be interested to hear from the SPs on the list.

Regards,
Jeff

On Dec 28, 2011, at 2:19 PM, "Mike McBride"  wrote:

> Marshall,
> 
> On Wed, Dec 28, 2011 at 1:50 PM, Marshall Eubanks
>  wrote:
>> Dear Mike;
>> 
>> On Wed, Dec 28, 2011 at 4:48 PM, Mike McBride  wrote:
>>> Anyone using ASM (versus SSM) for IPTV? If so why?
>>> 
>> 
>> From what I understand, the answer is likely to be "yes" and the
>> reason is likely to be "deployed equipment only
>> supports IGMP v2."
> 
> Agreed. I'm seeking confirmation, from IPTV implementers, that non
> igmpv3 support is the reason for using ASM with IPTV. Versus other
> reasons such as reducing state. Or is this a non issue and everyone is
> using SSM with IPTV?
> 
> thanks,
> mike
> 
>> Regards
>> Marshall
>> 
>>> thanks,
>>> mike
>>> 
> 



Re: is sbcglobal throttling Cuban traffic?

2012-03-24 Thread Jeff Tantsura
81.169.144 belongs to a German company based in Berlin :) 

Regards,
Jeff

On Mar 24, 2012, at 13:39, "Randy Bush"  wrote:

> 81.169.145.156



Re: mulcast assignments

2012-05-03 Thread Jeff Tantsura
Hi,

All modern routers support mapping from IGMPv2 to PIM SSM: all of them 
statically, some also through DNS, etc.

Regards,
Jeff

On May 3, 2012, at 12:34 PM, "Nick Hilliard"  wrote:

> On 03/05/2012 21:00, Greg Shepherd wrote:
>> Sure, but GLOP predated SSM, and was really only an interim fix for
>> the presumed need of mcast address assignments. GLOP only gives you a
>> /24 for each ASN where SSM gives you a /8 for every unique unicast
>> address you have along with vastly superior security and network
>> simplicity.
> 
> SSM is indeed a lot simpler and better than GLOP in every conceivable way -
> except vendor support.  It needs igmpv3 on all intermediate devices and SSM
> support on the client device.  All major desktop operating systems now have
> SSM support (OS/X since 10.7/Lion), but there is still lots of older
> hardware which either doesn't support igmpv3 or else only supports it in a
> very primitive fashion.  This can lead to Unexpected Behaviour in naive
> roll-outs.
> 
> Nick
> 



Re: mulcast assignments

2012-05-04 Thread Jeff Tantsura
Marshall,

That's exactly what the feature does: when it receives an IGMPv1/2 join, it adds 
a preconfigured S and sends (S,G) (INCLUDE) upstream.
Google for IGMP mapping.
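
A toy illustration of the static variant of that mapping, assuming a hand-maintained table from group ranges to sources; the table contents and the `map_join` name are made up for the example and do not reflect any vendor's configuration syntax.

```python
import ipaddress

# Hypothetical static map: group range -> sources to join on its behalf.
SSM_MAP = {
    ipaddress.ip_network("232.1.1.0/24"): ["192.0.2.10"],
    ipaddress.ip_network("232.1.2.0/24"): ["192.0.2.20", "192.0.2.21"],
}


def map_join(group):
    """Turn an IGMPv1/v2 (*,G) report into the (S,G) joins to send upstream."""
    g = ipaddress.ip_address(group)
    for prefix, sources in SSM_MAP.items():
        if g in prefix:
            return [(s, group) for s in sources]
    return []   # no mapping configured: nothing is joined upstream


print(map_join("232.1.2.5"))
print(map_join("239.1.1.1"))
```

The receiver keeps speaking IGMPv2; only the first-hop router needs to know the sources.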


Regards,
Jeff

On May 4, 2012, at 1:45 PM, "Marshall Eubanks"  
wrote:

> On Fri, May 4, 2012 at 2:53 AM, Jeff Tantsura
>  wrote:
>> Hi,
>> 
>> All modern routers support mapping from IGMPv2 to PIM SSM, all static, some 
>> others thru DNS, etc
> 
> I am not sure what you mean here. To support SSM, you need IGMPv3. Most
> routers do support IGMPv3, but there is still a fair amount of legacy
> gear at various
> edges which doesn't.
> 
> Regards
> Marshall
> 
>> 
>> Regards,
>> Jeff
>> 
>> On May 3, 2012, at 12:34 PM, "Nick Hilliard"  wrote:
>> 
>>> On 03/05/2012 21:00, Greg Shepherd wrote:
>>>> Sure, but GLOP predated SSM, and was really only an interim fix for
>>>> the presumed need of mcast address assignments. GLOP only gives you a
>>>> /24 for each ASN where SSM gives you a /8 for every unique unicast
>>>> address you have along with vastly superior security and network
>>>> simplicity.
>>> 
>>> SSM is indeed a lot simpler and better than GLOP in every conceivable way -
>>> except vendor support.  It needs igmpv3 on all intermediate devices and SSM
>>> support on the client device.  All major desktop operating systems now have
>>> SSM support (OS/X since 10.7/Lion), but there is still lots of older
>>> hardware which either doesn't support igmpv3 or else only supports it in a
>>> very primitive fashion.  This can lead to Unexpected Behaviour in naive
>>> roll-outs.
>>> 
>>> Nick
>>> 
>> 



Re: need help about bgd and ospf

2012-05-18 Thread Jeff Tantsura
Nope, run iBGP, have only next-hops in OSPF.
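
A small sketch of why that works, assuming iBGP routes whose next hops are border-router loopbacks and an OSPF table that carries only those loopbacks. All prefixes, addresses and interface names are invented for the example.

```python
import ipaddress


def longest_match(table, addr):
    """Return the value of the most specific prefix in `table` covering addr."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, value in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, value)
    return None if best is None else best[1]


# iBGP carries the Internet prefixes; the next hop is a border router loopback.
BGP = {"0.0.0.0/0": "10.255.0.1", "203.0.113.0/24": "10.255.0.2"}
# OSPF only needs the loopbacks (and links), not the 400,000 BGP routes.
OSPF = {"10.255.0.1/32": "ge-0/0/0", "10.255.0.2/32": "ge-0/0/1"}


def resolve(dst):
    """Recursive lookup: BGP gives the next hop, OSPF gives the way to reach it."""
    next_hop = longest_match(BGP, dst)
    return None if next_hop is None else longest_match(OSPF, next_hop)


print(resolve("203.0.113.7"))
print(resolve("198.51.100.1"))
```

The OSPF database stays at a handful of entries regardless of how large the BGP table grows.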

Regards,
Jeff

On May 18, 2012, at 19:14, "Deric Kwok"  wrote:

> Hi all
> 
> Can I have questions about bgp and ospf
> 
> 1/ Do I have to redistribute bgp into ospf so that ospf knows which
> upstream bgp router to go out of?
> 
> 2/ If yes, how many routes can the ospf database handle, as one full bgp
> table is about 400,000 routes?
> 
> 3/ When we have 8 ospf routers running "redistribute bgp", is it 8 x
> 400,000 routes in the ospf database?
> 
> 4/ If bgp is not redistributed, how does ospf know which upstream to go out of?
> Thank you for your help
> 



IETF RTGWG interim meeting - Existing problems for routing in the large Data Centers and potential solutions

2017-01-10 Thread Jeff Tantsura
/RTGWG chair hat on

 

Dear NANOG,

 

For those, who might be interested, on January 25 we (IETF Routing Area RTGWG) 
will be having an interim online meeting, dedicated to Existing problems for 
routing in the large Data Centers and potential solutions. Presenters will be 
describing (15 minutes per presentation) problems in the space, followed by 
potential solutions

 

Please join us.

 

Presenters:

Problem statements:

Petr Lapukhov - FB

Dmitry Afanasiev - YANDEX

Russ White/ Shawn Zandi - LinkedIn

 

Solutions proposed:

Keyur Patel, Arrcus  - Shortest Path Routing Extensions for BGP Protocol, 
draft-keyupate-idr-bgp-spf

Naiming Shen , Cisco  - IS-IS Routing for Spine-Leaf Topology, 
draft-shen-isis-spine-leaf-ext

Tony Przygienda, Juniper - RIFT: Routing in Fat Trees, draft-przygienda-rift  
(draft to be published later this week)

Bhumip Khasnabish, ZTE - Generic Fault-avoidance Routing Protocol for Data 
Center Networks, draft-sl-rtgwg-far-dcn

 

Cheers,

Jeff

 

 

From: Routing Area Working Group 
Reply-To: rtgwg-chairs 
Date: Tuesday, December 20, 2016 at 11:47
To: 
Subject: WebEx meeting invitation: Existing problems for routing in the large 
Data Centers and potential solutions
Resent-From: 
Resent-To: , Chris Bowers 
Resent-Date: Tue, 20 Dec 2016 11:47:19 -0800 (PST)

 

Hello,
Routing Area Working Group invites you to join this WebEx meeting.

Existing problems for routing in the large Data Centers and potential solutions
Wednesday, January 25, 2017
9:00 am  |  Pacific Standard Time (San Francisco, GMT-08:00)  |  2 hrs 30 mins

Meeting number (access code): 649 161 321

Meeting password: px3Gk22M

Join by phone
1-877-668-4493 Call-in toll free number (US/Canada)
1-650-479-3208 Call-in toll number (US/Canada)

IMPORTANT NOTICE: Please note that this WebEx service allows audio and other 
information sent during the session to be recorded, which may be discoverable 
in a legal matter. By joining this session, you automatically consent to such 
recordings. If you do not consent to being recorded, discuss your concerns with 
the host or do not join the session.

 





Re: Best practice for BGP session/ full routes for customer

2014-07-14 Thread Jeff Tantsura
Mark,

BGP to RIB filtering (in any vendor implementation) targets an RR that
is not in the forwarding path, so there's no forwarding towards any
destination filtered out of the RIB.
Using it selectively on a forwarding node is error prone and, in case of
incorrect configuration, would result in blackholing.
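
The blackholing risk can be sketched in a few lines. This is a simplified model, not any vendor's implementation: the full table stays in BGP (and is still advertised), while a filter decides what is downloaded to the RIB/FIB. Prefixes, next-hop labels and function names are hypothetical.

```python
import ipaddress

net = ipaddress.ip_network

# Hypothetical BGP table held in control-plane memory: prefix -> next hop.
bgp_table = {
    net("0.0.0.0/0"): "upstream",
    net("198.51.100.0/24"): "peer-a",
    net("203.0.113.0/24"): "peer-b",
}


def install_filter(table, allowed):
    """Return only the BGP routes that the filter lets into the RIB/FIB."""
    return {prefix: nh for prefix, nh in table.items() if prefix in allowed}


def forward(fib, dst):
    """Longest-prefix-match lookup; 'drop' means the packet is blackholed."""
    dst = ipaddress.ip_address(dst)
    matches = [prefix for prefix in fib if dst in prefix]
    if not matches:
        return "drop"
    return fib[max(matches, key=lambda prefix: prefix.prefixlen)]


# Filter that keeps a covering default: filtered destinations still forward.
good_fib = install_filter(bgp_table, {net("0.0.0.0/0")})
# Misconfigured filter on a forwarding node: no covering route is installed,
# yet 203.0.113.0/24 is still in BGP and may still be advertised onwards.
bad_fib = install_filter(bgp_table, {net("198.51.100.0/24")})

print(forward(good_fib, "203.0.113.5"))
print(forward(bad_fib, "203.0.113.5"))
```

On an RR that is out of the forwarding path neither lookup ever happens, which is why the filter is harmless there.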

Cheers,
Jeff




-Original Message-
From: Mark Tinka 
Organization: SEACOM
Reply-To: 
Date: Tuesday, July 8, 2014 at 1:56 PM
To: "nanog@nanog.org" 
Subject: Re: Best practice for BGP session/ full routes for customer

>On Monday, July 07, 2014 08:33:12 PM Anurag Bhatia wrote:
> 
>> In this scenario what is best practice for giving full
>> table to downstream?
>
>In our case, we have three types of edge routers; Juniper
>MX480 + Cisco ASR1006, and the Cisco ME3600X.
>
>For the MX480 and ASR1006 have no problems supporting a full
>table. So customers peer natively.
>
>The ME3600X is a small switch, that supports only up to
>24,000 IPv4 and 5,000 IPv6 FIB entries. However, Cisco have
>a feature called BGP Selective Download:
>
>   http://tinyurl.com/nodnmct
>
>Using BGP-SD, we can send a full BGP table from our route
>reflectors to our ME3600X switches, without worrying about
>them entering the FIB, i.e., they are held only in memory.
>The beauty - you can advertise these routes to customers
>natively, without clunky eBGP Multi-Hop sessions running
>rampant.
>
>Of course, with BGP-SD, you still need a 0/0 + ::/0 route in
>the FIB for traffic to flow from your customers upstream,
>but that is fine as it's only two entries :-).
>
>If your system supports a BGP-SD-type implementation, I'd
>recommend it, provided you have sufficient control plane
>memory.
>
>Cheers,
>
>Mark.



Re: Multicast Internet Route table.

2014-09-02 Thread Jeff Tantsura
It is not the network devices per se, it is the additional configuration,
security, MSDP peering, etc., i.e. OPEX.

The business justification for such effort is not obvious (most of the multicast
deployments I have done in my previous life were because I loved the
technology, not because of business needs :))

Cheers,
Jeff




-Original Message-
From: Octavio Alvarez 
Date: Tuesday, September 2, 2014 at 8:43 AM
To: "nanog@nanog.org" 
Subject: Re: Multicast Internet Route table.

>On 09/02/2014 05:46 AM, John Kristoff wrote:
>> On Tue, 2 Sep 2014 04:47:37 +
>> "S, Somasundaram (Somasundaram)" 
>> wrote:
>> 
>>> 1: Does all the ISP's provide Multicast Routing by
>>> default?
>> 
>> No not all and even those that do often do not do so on the same gear,
>> links and peers as their unicast forwarding.
>
>Why would that be, are network devices not able to support multicast?
>
>I have never used interdomain multicast but I imagine the global
>m-routing table would quickly become large.



Re: MPLS VPN design - RR in forwarding path?

2014-12-31 Thread Jeff Tantsura
Hi,

Right, one is when, besides forwarding packets, a router is also functioning as an 
RR; another is when the RR sets NH to itself and hence forces all the traffic to 
pass through the router in the fast path.
Keep in mind that some architectures, such as seamless MPLS, would require an RR to 
be in the fast path.
There are some other cases where it could be a requirement.
I'd advise looking into the vRR space - price/performance looks quite good.
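
The difference between the two cases can be shown with a toy model, assuming traffic is tunnelled from the ingress PE straight to the BGP next hop (P routers along the LSP are left out). Router names and the helper functions are hypothetical.

```python
from dataclasses import dataclass, replace

RR = "rr1"  # hypothetical route reflector


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str


def reflect(route: Route, next_hop_self: bool) -> Route:
    """Reflect a route, optionally rewriting the next hop to the RR itself."""
    return replace(route, next_hop=RR) if next_hop_self else route


def traffic_path(ingress: str, reflected: Route, originator: str) -> list:
    """BGP-level path: traffic goes to the next hop first, then to the originator."""
    if reflected.next_hop == originator:
        return [ingress, originator]
    return [ingress, reflected.next_hop, originator]


original = Route("10.1.0.0/16", next_hop="pe2")

# Next hop unchanged: the RR only carries control-plane state.
print(traffic_path("pe1", reflect(original, next_hop_self=False), "pe2"))
# Next-hop-self: every packet for the prefix is pulled through the RR.
print(traffic_path("pe1", reflect(original, next_hop_self=True), "pe2"))
```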

Wrt open source implementations - if you are looking at a relatively basic 
feature set (v4/v6 unicast/vpn), reliability is not the main concern and, of 
course, there are hands and brains to support it - it could be a viable approach.
Should you be looking at a more complex feature set (EVPN, BGP-LS, FS,
enhanced route refresh, etc.) or highly optimized code wrt update rate/number of 
peers supported, you'd most probably end up with a commercial implementation.

Hope this helps

Regards,
Jeff

> On Dec 31, 2014, at 9:08 AM, Chuck Anderson  wrote:
> 
>> On Wed, Dec 31, 2014 at 01:08:15PM +0100, Marcin Kurek wrote:
>> Hi everyone,
>> 
>> I'm reading Randy's Zhang BGP Design and Implementation and I found
>> following guidelines about designing RR-based MPLS VPN architecture:
>> - Partition RRs
>> - Move RRs out of the forwarding path
>> - Use a high-end processor with maximum memory
>> - Use peer groups
>> - Tune RR routers for improved performance.
>> 
>> Since the book is a bit outdated (2004) I'm curious if these rules
>> still apply to modern SP networks.
>> What would be the reasoning behind keeping RRs out of the forwarding
>> path? Is it only a matter of performance and stability?
> 
> When they say "move RRs out of the forwarding path", they could mean
> "don't force all traffic through the RRs".  These are two different
> things.  Naive configurations could end up causing all VPN traffic to
> go through the RRs (e.g. setting next-hop-self on all reflected
> routes) whereas more correct configurations don't do that--but there
> may be some traffic that natrually flows through the same routers that
> are the RRs, via an MPLS LSP for example.  That latter is fine in many
> cases, the former is not.  E.g. I would argue that a P-router can be
> an RR if desired.


Re: MPLS VPN design - RR in forwarding path?

2015-01-01 Thread Jeff Tantsura
You don't need LDP on the RR as long as clients support the "not on lsp" flag 
(different implementations have different names for it).
There are more and more reasons to run an RR on non-router HW; there are many 
reasons to still run a commercial code base, mostly feature set and resilience.

Regards,
Jeff

> On Jan 1, 2015, at 2:11 PM, Nick Hilliard  wrote:
> 
>> On 01/01/2015 21:37, Baldur Norddahl wrote:
>> Are anyone using Bird, Quagga etc. for this?
> 
> there are patches for both code-bases and some preliminary support for
> vpnv4 in quagga, but other than that neither currently supports either ldp
> or the vpnv4/vpnv6 address families in the main-line code.
> 
> Nick
> 
> 


Re: MPLS VPN design - RR in forwarding path?

2015-01-02 Thread Jeff Tantsura
+100

Regards,
Jeff

> On Jan 2, 2015, at 5:29 AM, Rob Shakir  wrote:
> 
> 
>> On 2 Jan 2015, at 01:54, Jeff Tantsura  wrote:
>> 
>> You don't need LDP on RR as long as clients support "not on lsp" flag 
>> (different implementation have different names for it)
>> There are more and more reasons to run RR on a non router HW, there are many 
>> reasons to still run commercial code base, mostly feature set and resilience.
> 
> And test coverage. As Saku alluded to earlier in the thread, rr<->rr-client 
> outages are painful. I’ve certainly seen a number of them caused by inter-op 
> issues between implementations. Running at least one RR which matches the 
> code-base of the client means that at least you’re likely to have fallen 
> within the test-cases of that vendor’s implementation.
> 
> r.


Re: Level3 worldwide emergency upgrade?

2013-02-07 Thread Jeff Tantsura
Good times indeed...

Regards,
Jeff

On Feb 7, 2013, at 2:09, "Brett Watson"  wrote:

> Hell, we used to not have to bother notifying customers of anything, we just 
> fixed the problem. Reminds me of a story I've probably shared in the past. 
> 
> 1995, IETF in Dallas. The "big ISP" I worked for at the time got tripped up 
> on a 24-day IS-IS timer bug (maybe all of them at the time did, I don't 
> recall)  where all adjacencies reset at once. That's like, entire network 
> down. Working with our engineering team in the *terminal* lab mind you, and 
> Ravi Chandra (then at Cisco) we reloaded the entire network of routers with 
> new code from Cisco once they'd fixed the bug. I seem to remember this being 
> my first exposure to Tony Li's infamous line, "... Confidence Level: boots in 
> the lab."
> 
> Good times.
> 
> -b
> 
> 
> On Feb 6, 2013, at 5:41 PM, Brandt, Ralph wrote:
> 
>> David. I am on an evening shift and am just now reading this thread.   
>> 
>> I was almost tempted to write an explanation that would have had
>> identical content with yours based simply on Level3 doing something and
>> keeping the information close.  
>> 
>> Responsible Vendors do not try to hide what is being done unless it is
>> an Op Sec issue and I have never seen Level3 act with less than
>> responsibility so it had to be Op Sec. 
>> 
>> When it is that, it is best if the remainder of us sit quietly on the
>> sidelines.
>> 
>> Ralph Brandt
>> 
>> 
>> -Original Message-
>> From: Siegel, David [mailto:david.sie...@level3.com] 
>> Sent: Wednesday, February 06, 2013 12:01 PM
>> To: 'Ray Wong'; nanog@nanog.org
>> Subject: RE: Level3 worldwide emergency upgrade?
>> 
>> Hi Ray,
>> 
>> This topic reminds me of yesterday's discussion in the conference around
>> getting some BCOP's drafted.  it would be useful to confirm my own view
>> of the BCOP around communicating security issues.  My understanding for
>> the best practice is to limit knowledge distribution of security related
>> problems both before and after the patches are deployed.  You limit
>> knowledge before the patch is deployed to prevent yourself from being
>> exploited, but you also limit knowledge afterwards in order to limit
>> potential damage to others (customers, competitors...the Internet at
>> large).  You also do not want to announce that you will be deploying a
>> security patch until you have a fix in hand and know when you will
>> deploy it (typically, next available maintenance window unless the cat
>> is out of the bag and danger is real and imminent).
>> 
>> As a service provider, you should stay on top of security alerts from
>> your vendors so that you can make your own decision about what action is
>> required.  I would not recommend relying on service provider maintenance
>> bulletins or public operations mailing lists for obtaining this type of
>> information.  There is some information that can cause more harm than
>> good if it is distributed in the wrong way and information relating to
>> security vulnerabilities definitely falls into that category.
>> 
>> Dave
>> 
>> -Original Message-
>> From: Ray Wong [mailto:r...@rayw.net] 
>> Sent: Wednesday, February 06, 2013 9:16 AM
>> To: nanog@nanog.org
>> Subject: Re: Level3 worldwide emergency upgrade?
>> 
>> 
>> OK, having had that first cup of coffee, I can say perhaps the main
>> reason I was wondering is I've gotten used to Level3 always being on top
>> of things (and admittedly, rarely communicating). They've reached the
>> top by often being a black box of reliability, so it's (perhaps
>> unrealistically) surprising to see them caught by surprise. Anything
>> that pushes them into scramble mode causes me to lose a little sleep
>> anyway. The alternative to what they did seems likely for at least a few
>> providers who'll NOT manage to fix things in time, so I may well be
>> looking at longer outages from other providers, and need to issue
>> guidance to others on what to do if/when other links go down for periods
>> long enough that all the cost-bounding monitoring alarms start to scream
>> even louder.
>> 
>> I was also grumpy at myself for having not noticed advance
>> communication, which I still don't seem to have, though since I
>> outsourced my email to bigG, I've noticed I'm more likely to miss
>> things. Perhaps giving up maintaining that massive set of procmail rules
>> has cost me a bit more edge.
>> 
>> Related, of course, just because you design/run your network to tolerate
>> some issues doesn't mean you can also budget to be in support contract
>> as well. :) Knowing more about the exploit/fix might mean trying to find
>> a way to get free upgrades to some kit to prevent more localized attacks
>> to other types of gear, as well, though in this case it's all about
>> Juniper PR839412 then, so vendor specific, it seems?
>> 
>> There are probably more reasons to wish for more info, too. There's
>> still more of them (exploiters/attackers) than there are those of 

Re: OSPF Vulnerability - Owning the Routing Table

2013-08-03 Thread Jeff Tantsura
Hi,

As for Ericsson (Redback) products:
We found the issue quite some time ago and fixed it immediately.
Smart Edge code base (SEOS) has been fixed back to release 6.3.
SSR code base (IPOS) - not affected.

Please let me know if you have any questions.

Regards,
Jeff

On Aug 3, 2013, at 10:25, "excel...@gmx.com"  wrote:

> So, only Cisco and Juniper are hit by this one? What about "the rest"?
> Michael
> 
> 
> Am 02.08.2013 21:34, schrieb John Stuppi (jstuppi):
>> Yes, these advisories (from both Cisco and Juniper), covering CVE-2013-0149, 
>> are both related to the announcement yesterday (1-Aug) at BlackHat regarding 
>> the OSPF LSA Manipulation vulnerability. 
>> 
>> Thanks,
>> John
>> 
>> “Optimism is the faith that leads to achievement. Nothing can be done 
>> without hope and confidence”.
>> 
>> 
>> 
>> 
>> 
>> John Stuppi, CISSP
>> Technical Leader
>> Strategic Security Research
>> jstu...@cisco.com
>> Phone: +1 732 516 5994
>> Mobile: 732 319 3886
>> 
>> CCIE, Security - 11154
>> Cisco Systems
>> Mail Stop INJ01/2/ 
>> 111 Wood Avenue South 
>> Iselin, New Jersey 08830
>> United States
>> Cisco.com
> 
> 



Re: OSPF Vulnerability - Owning the Routing Table

2013-08-04 Thread Jeff Tantsura
Agree, that's why using p2p has been mentioned as BCP in networking "howto's" 
for at least the last 10 years.

Regards,
Jeff

On Aug 4, 2013, at 3:14 AM, "Saku Ytti"  wrote:

> On (2013-08-04 05:01 -0500), Jimmy Hess wrote:
> 
>> I would say the risk score of the advisory is overstated.   And if you
>> think "ospf is secure" against LAN activity after any patch,  that
>> would be wishful thinking. Someone just rediscovered one of the
>> countless innumerable holes in the back of the cardboard box and tried
>> covering it with duck tape...
> 
> I tend to agree. OTOH I'm not 100% sure if it's unexploitable outside LAN
> via unicast OSPF packets.
> But like you say MD5 offers some level of protection. I wish there would be
> some KDF for IGP KARP so that each LSA would actually have unique
> not-to-be-repeated password, so even if someone gets copy of one LSA and
> calculates out the MD5 it won't be relevant anymore.
> 
> L2 is very dangerous in any platform I've tried, access to L2 and you can
> usually DoS the neighbouring router, even when optimally configured
> CoPP/Lo0 filter.
> 
> -- 
>  ++ytti
> 



Re: Cisco announces it will no longer publishing The Internet Protocol Journal

2013-11-17 Thread Jeff Tantsura
Ole is not with Cisco anymore.

Regards,
Jeff

> On Nov 17, 2013, at 10:11, "Courtney Smith"  wrote:
> 
> Another one bites the dust.   Received the below this AM.  Hopefully, 
> something similar finds its way into an electronic format.
> 
> 
> TO OUR READERS
> 
> At this time, Cisco Systems, Inc. has decided not to continue publishing The 
> Internet Protocol Journal (IPJ) effective immediately.
> 
> Cisco wishes to thank Ole Jacobsen, the Editor and Publisher of IPJ for his 
> tireless and professional efforts to inform the community of the Internet, 
> its varied protocols, and its impact upon the world through this publication. 
> Cisco also wishes to thank the authors of the published articles, and all 
> those who submitted articles. A special note of thanks goes to the IPJ 
> Editorial Advisory Board and the article reviewers who have helped to 
> maintain the very high standards of journalistic and technical quality of IPJ.
> 
> The online version of the Internet Protocol Journal will remain available for 
> reference, including all back issues in PDF and HTML format as well as the 
> index files. The IPJ website remains at:http://www.cisco.com/ipj
> 
> 
> 
> Courtney Smith
> courtneysm...@comcast.net
> 
> ()  ascii ribbon campaign - against html e-mail 
> /\  www.asciiribbon.org   - against proprietary attachments
> 
> 
> 
> 



Re: L2TPv3 - and layer 2 PDU's

2014-01-24 Thread Jeff Tantsura
Yes, 10 years ago on 10720, CDP and LACP worked like a charm

Regards,
Jeff

> On Jan 24, 2014, at 7:55 AM, "Philip Lavine"  wrote:
> 
> To all,
> 
> Has anyone successfully tunneled L2 PDU's (STP, CDP, LLDP) over a L2TPv3 
> pseudowire tunnel, i.e. should I be able to see CDP neighbors across the 
> tunnel? For some reason if I encapsulated dot1q on a router sub-interface and 
> try and pass traffic across the trunk the downstream switch port will go into 
> err-disable.
> 
> Thx
> 
> Philip
> 



Re: OSPF Costs Formula that include delay.

2014-01-24 Thread Jeff Tantsura
Erik,

Issues:

1. OSPF (SPF) can only produce an SPT based on cost (metric).
Anything else would require CSPF rather than SPF.


2. Delay is not distributed as part of an IGP update.
Typical constraints distributed are: bandwidth, color, some others.

In the IETF we are working to also be able to distribute those kinds of
metrics (draft-ietf-ospf/isis-te-metric-extensions).
draft-ietf-mpls-te-express-path defines how to use these metrics for
RSVP-TE (the computation result is an ERO); however, theoretically nothing
precludes an implementation from using them for a more comprehensive
computation, i.e. delay could be taken into consideration as long as the
path is loop free. So it would look like this: compute all loop-free paths to
a destination and then choose the one with the smallest cumulative delay.
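
The selection rule described above (enumerate loop-free paths, then pick the lowest cumulative delay) can be sketched on the New York to Los Angeles example from the quoted question. The per-link delays are an assumed split of the 75 ms and 65 ms path totals, and the brute-force enumeration is only for illustration; it is not how a CSPF implementation would scale.

```python
from collections import defaultdict

# Per-link delay in ms (assumed values that add up to the 75 ms / 65 ms totals).
LINK_DELAY_MS = {
    ("NewYork", "Dallas"): 35,
    ("Dallas", "LosAngeles"): 40,
    ("NewYork", "Chicago"): 20,
    ("Chicago", "Denver"): 20,
    ("Denver", "LosAngeles"): 25,
}


def build_graph(link_delay):
    """Build a bidirectional adjacency map: node -> {neighbor: delay}."""
    graph = defaultdict(dict)
    for (a, b), delay in link_delay.items():
        graph[a][b] = delay
        graph[b][a] = delay
    return graph


def loop_free_paths(graph, src, dst, path=None):
    """Yield every simple (loop-free) path from src to dst."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph[src]:
        if nxt not in path:  # never revisit a node, so no loops
            yield from loop_free_paths(graph, nxt, dst, path)


def total_delay(graph, path):
    """Sum the link delays along a path."""
    return sum(graph[a][b] for a, b in zip(path, path[1:]))


def lowest_delay_path(graph, src, dst):
    """Among all loop-free paths, return the one with the smallest delay."""
    return min(loop_free_paths(graph, src, dst),
               key=lambda p: total_delay(graph, p))


graph = build_graph(LINK_DELAY_MS)
best = lowest_delay_path(graph, "NewYork", "LosAngeles")
print(best, total_delay(graph, best))
```

With every link at OSPF cost 1, plain SPF would pick the two-hop Dallas path; the delay-aware selection picks the three-hop path via Chicago and Denver.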

BTW - segment routing will give you this functionality day one :)


Cheers,
Jeff


-Original Message-
From: Erik Sundberg 
Date: Friday, January 24, 2014 12:26 PM
To: Randy , "nanog@nanog.org" 
Subject: RE: OSPF Costs Formula that include delay.

>I understand OSPF default calculation for cost doesn't include delay. I
>am looking for a formula that I can use to manually set the OSPF costs
>that factors in delay.
>
>When using OSPF's default costs, the shortest path is not always the
>optimal path.
>
>
>Example
>
>New York to Los Angeles. Assuming all links are the same bandwidth and
>have a ospf cost of 1.
>
>Path 1 (75ms) - OSPF Cost 2 - New York > Dallas > Los Angeles
>
>Path 2 (65ms) - OSPF Cost 3 - New York > Chicago > Denver > Los Angeles
>
>If I left the default cost's alone then path 1 would win because it has a
>lower ospf cost, however it take traffic 10ms longer to get there.
>
>However I would like traffic to take Path 2 by adjusting the OSPF cost.
>
>
>I am looking for a formula that other people are using .p
>
>Thanks
>
>Erik
>
>
>-Original Message-
>From: Randy [mailto:randy_94...@yahoo.com]
>Sent: Thursday, January 23, 2014 9:03 PM
>To: Erik Sundberg; nanog@nanog.org
>Subject: Re: OSPF Costs Formula that include delay.
>
>
>
>- Original Message -
>> From: Erik Sundberg 
>> To: "nanog@nanog.org" 
>> Cc:
>> Sent: Thursday, January 23, 2014 4:47 PM
>> Subject: OSPF Costs Formula that include delay.
>>
>> What is everyone using for an OSPF cost formula that factors in a
>> circuits delay and bandwidth (10M-100G)???
>>
>> Thanks in advance
>
>
>
>umm..are you sure your question is not about EIGRP?
>OSPF has no concept of interface-delays.
>
>The default reference bandwidth for OSPF is 100M
>
>In your case if you set your reference bandwidth to 100G your 100G
>links would have a link cost of 1, 10G - 10, 1G-100, 100M-1000 and
>10M-10000
>
>A vendor specific list would be a better place to ask.
>
>
>./Randy
>
>
>
>CONFIDENTIALITY NOTICE: This e-mail transmission, and any documents,
>files or previous e-mail messages attached to it may contain confidential
>information that is legally privileged. If you are not the intended
>recipient, or a person responsible for delivering it to the intended
>recipient, you are hereby notified that any disclosure, copying,
>distribution or use of any of the information contained in or attached to
>this transmission is STRICTLY PROHIBITED. If you have received this
>transmission in error please notify the sender immediately by replying to
>this e-mail. You must destroy the original transmission and its
>attachments without reading or saving in any manner. Thank you.
>




Re: OSPF Costs Formula that include delay.

2014-01-25 Thread Jeff Tantsura
A path to a destination must be loop free regardless.
So it is not a combination of multiple metrics but rather a list of loop-free 
paths to a destination, where any other metrics are used as tie-breakers.
Another story is how you get all that state distributed, inter-area cases, 
how you make it actually useful (LSDB vs TED) and, not to forget, FEC 
definition. 


Regards,
Jeff

> On Jan 24, 2014, at 10:13 PM, "Graham Beneke"  wrote:
> 
> The auto-cost capability in some vendors devices seems to have left many
> people ignoring the link metrics within their IGP. From what I recall in
> the standards - bandwidth is one possible link metric but certainly not
> the only one. Network designers are free (and I would encourage to) pick
> whatever metric is relevant to them.
> 
>> On 24/01/2014 22:26, Erik Sundberg wrote:
>> I am looking for a formula that other people are using .p
> 
> I've started to use a combination of 3 metrics to determine my costing:
> 
> * The traditional auto-cost calculation based on a 100Gbps reference
> which gives far more useful values than the old 100Mbps reference.
> 
> * An average or nominal link latency multiplied by a factor of 200.
> Sometimes adjusted if I want two geographically diverse paths between
> the same endpoints to have equivalent costs.
> 
> * Path length in km multiplied by 2. This accounts for situations when
> the nominal latency is too small to accurately determine and assumes 1
> ms per 100 km.
> 
> I then pick the largest of the above 3 metrics as my OSPF cost.
> 
> -- 
> Graham Beneke
> 
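
The three-metric rule quoted above is easy to put into a helper. This is a minimal sketch under stated assumptions: bandwidth is expressed in Mbps, the reference is 100 Gbps, and the result is clamped to 1..65535 because OSPF interface cost is a 16-bit value. The clamping and the function names are additions for the example, not part of the quoted description.

```python
OSPF_MAX_COST = 65535  # interface cost is a 16-bit field


def auto_cost(bandwidth_mbps, reference_mbps=100_000):
    """Traditional auto-cost: reference bandwidth divided by link bandwidth."""
    return max(1, int(reference_mbps // bandwidth_mbps))


def link_cost(bandwidth_mbps, latency_ms, length_km):
    """Pick the largest of the bandwidth, latency and distance metrics."""
    candidates = (
        auto_cost(bandwidth_mbps),
        round(latency_ms * 200),  # nominal latency x 200
        round(length_km * 2),     # path length x 2, i.e. 1 ms per 100 km
    )
    return min(OSPF_MAX_COST, max(1, max(candidates)))


# 10G link, 5 ms nominal latency, 400 km of fibre: latency dominates.
print(link_cost(10_000, 5, 400))
# 100G metro link, latency too small to measure, 30 km: distance dominates.
print(link_cost(100_000, 0, 30))
```

Note that the latency and distance terms agree with each other: 100 km at 1 ms per 100 km gives 200 either way.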



Re: Recommended L2 switches for a new IXP

2015-01-13 Thread Jeff Tantsura
What does it mean -  to be SDN ready?

Cheers,
Jeff




-Original Message-
From: Eduardo Schoedler 
Date: Tuesday, January 13, 2015 at 3:25 AM
To: "nanog@nanog.org" 
Subject: Re: Recommended L2 switches for a new IXP

>QFX5100 is SDN ready.
>
>--
>Eduardo Schoedler
>
>
>2015-01-13 6:29 GMT-02:00 Stepan Kucherenko :
>
>> Is there any particular reason you prefer EX4600 over QFX5100 ? Not
>> counting obvious differences like ports and upgrade options.
>>
>> It's the same chipset after all, and with all upgrades they have the
>> same 10G density (with breakouts). Is that because you can have more 40G
>> ports with EX4600 ?
>>
>> I'm still trying to find out if there are any noticeable software or
>> feature differences.
>>
>> On 13.01.2015 09:01, Mark Tinka wrote:
>> > On Monday, January 12, 2015 11:41:20 PM Tony Wicks wrote:
>> >
>> >> People seem to be avoiding recommending actual devices,
>> >> well I would recommend the Juniper EX4600 -
>> >>
>> >> http://www.juniper.net/us/en/products-services/switching/
>> >> ex-series/ex4600/
>> >>
>> >> They are affordable, highly scalable, stackable and run
>> >> JunOS.
>> >
>> > We've been quite happy with the EX4550, but the EX4600 is
>> > good too, particularly if you're coming from its younger
>> > brother.
>> >
>> > Mark.
>> >
>>
>
>
>
>-- 
>Eduardo Schoedler



Re: Recommended L2 switches for a new IXP

2015-01-13 Thread Jeff Tantsura
Ahhh... vertically integrated horizontal API's

Cheers,
Jeff




-Original Message-
From: Nick Hilliard 
Date: Tuesday, January 13, 2015 at 2:23 PM
To: Jeff Tantsura , Eduardo Schoedler
, "nanog@nanog.org" 
Subject: Re: Recommended L2 switches for a new IXP

>On 13/01/2015 22:10, Jeff Tantsura wrote:
>> What does it mean -  to be SDN ready?
>
>it means "fully buzzword compliant".
>
>Nick
>
>



Re: Recommended L2 switches for a new IXP

2015-01-13 Thread Jeff Tantsura
Got you - artificially disabling 90% of the features otherwise supported
by the OS and using half baked HAL makes product SDN ready!
Sorry for the sarcasm, couldn't resist :)





Cheers,
Jeff



-Original Message-
From: Eduardo Schoedler 
Date: Tuesday, January 13, 2015 at 2:28 PM
To: "nanog@nanog.org" 
Subject: Re: Recommended L2 switches for a new IXP

>My mistake, it's the OCX1100.
>http://www.networkworld.com/article/2855056/sdn/juniper-unbundles-switch-hardware-software.html
>
>2015-01-13 20:10 GMT-02:00 Jeff Tantsura :
>
>> What does it mean -  to be SDN ready?
>>
>> Cheers,
>> Jeff
>>
>>
>>
>>
>> -Original Message-
>> From: Eduardo Schoedler 
>> Date: Tuesday, January 13, 2015 at 3:25 AM
>> To: "nanog@nanog.org" 
>> Subject: Re: Recommended L2 switches for a new IXP
>>
>> >QFX5100 is SDN ready.
>> >
>> >--
>> >Eduardo Schoedler
>> >
>> >
>> >2015-01-13 6:29 GMT-02:00 Stepan Kucherenko :
>> >
>> >> Is there any particular reason you prefer EX4600 over QFX5100 ? Not
>> >> counting obvious differences like ports and upgrade options.
>> >>
>> >> It's the same chipset after all, and with all upgrades they have the
>> >> same 10G density (with breakouts). Is that because you can have more
>> >> 40G ports with EX4600 ?
>> >>
>> >> I'm still trying to find out if there are any noticeable software or
>> >> feature differences.
>> >>
>> >> On 13.01.2015 09:01, Mark Tinka wrote:
>> >> > On Monday, January 12, 2015 11:41:20 PM Tony Wicks wrote:
>> >> >
>> >> >> People seem to be avoiding recommending actual devices,
>> >> >> well I would recommend the Juniper EX4600 -
>> >> >>
>> >> >> http://www.juniper.net/us/en/products-services/switching/
>> >> >> ex-series/ex4600/
>> >> >>
>> >> >> They are affordable, highly scalable, stackable and run
>> >> >> JunOS.
>> >> >
>> >> > We've been quite happy with the EX4550, but the EX4600 is
>> >> > good too, particularly if you're coming from its younger
>> >> > brother.
>> >> >
>> >> > Mark.
>> >> >
>> >>
>> >
>> >
>> >
>> >--
>> >Eduardo Schoedler
>>
>>
>
>
>-- 
>Eduardo Schoedler



Re: draft-ietf-mpls-ldp-ipv6-16

2015-02-20 Thread Jeff Tantsura
From a market perspective, v6 SR is definitely lower priority. Comcast and a few 
more are looking into native rather than v6 with MPLS encap.
Wrt v4 - 2 weeks ago at EANTC in Berlin we tested 3 implementations of 
ISIS SR v4 MPLS with L3VPN and 6VPE over SR tunnels. Worked very well, very few 
issues.
So there's production quality code and interoperability - given the timeframe, 
we have done a really good job in the IETF :)


Regards,
Jeff

> On Feb 20, 2015, at 2:09 PM, Mark Tinka  wrote:
> 
> 
> 
>> On 20/Feb/15 13:39, Saku Ytti wrote:
>> 
>> Is there 4PE implementation to drive IPv4 edges, shouldn't be hard to accept
>> IPv6 next-hop in BGP LU, but probably does not work out-of-the-box?
>> Isn't Segment Routing implementation day1 IPV4+IPV6 in XR?
> 
> The last time I checked, MPLS support in SR for IPv6 is not a high
> priority, compared to TE for IPv4 MPLS.
> 
> My thoughts that SR would automatically mean native label signaling in
> IS-IS and OSPFv3 were otherwise ambitious.
> 
> Mark.


Re: draft-ietf-mpls-ldp-ipv6-16

2015-02-20 Thread Jeff Tantsura
For L2VPN, if you can make it work, go with EVPN from day 1; it solves most of the 
issues present in both LDP and BGP VPLS implementations.
Be aware of the differences between PBB and plain EVPN, and of platform support. 
EVPN, specifically multihoming, split horizon and some other features, as well as 
the presence of L3 (type 2/5) and the complications of the above, brings lots of 
complexity into fast path processing, and not every platform/NPU can do the full spec.



Regards,
Jeff

> On Feb 20, 2015, at 1:19 PM, Saku Ytti  wrote:
> 
> On (2015-02-20 09:00 -0500), Tim Durack wrote:
> 
> Hey Tim,
> 
>> I also need some flavor of L2VPN (eVPN) and L3VPN (VPNv4/VPNv6) working
>> over IPv6.
> 
> L3VPN uses BGP exclusively for VPN label signalling, no need for LDP.
> 
> For L2VPN only Martini uses LDP, but if you have choice, why wouldn't you
> choose Kompella, the scaling factor is superior, as you only have 2 signalling
> connection, instead of n*(n-1)/2 signalling sessions. And you get to
> capitalize on instantly available backup path.
> Of course I know that in CSCO land Kompella isn't available on every platform
> where Martini is, so you indeed may need LDP for some time until old platforms
> are phased out. 'Luckily' these older platforms have dubious IPv6 anyhow, so
> you might opt them out from IPv6 deployment anyhow.
> 
>> IPv6 control plane this decade may yet be optimistic.
> 
> For greenfield it's doable today (only Kompella pseudowires), but IPv6-only
> would require 4PE, I don't know if that works/exists.
> 
> -- 
>  ++ytti
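
The scaling argument in the quoted text is simple arithmetic. The sketch below assumes that "2 signalling connections" means each PE peers with two route reflectors; sessions between the reflectors themselves are not counted, and the function names are made up.

```python
def full_mesh_sessions(n: int) -> int:
    """Signalling sessions when every PE peers with every other PE."""
    return n * (n - 1) // 2


def reflected_sessions(n: int, reflectors: int = 2) -> int:
    """Signalling sessions when every PE only peers with the route reflectors."""
    return n * reflectors


for pe_count in (10, 100, 1000):
    print(pe_count, full_mesh_sessions(pe_count), reflected_sessions(pe_count))
```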


Re: Low Cost 10G Router

2015-05-19 Thread Jeff Tantsura
ASR1K (XE) has a great BGP implementation; go for it if you are OK with the 
density/throughput.

Regards,
Jeff

> On May 19, 2015, at 11:35 PM, Mark Tees  wrote:
> 
> For the lists benefit, there is a 6 X 10GBE option for the ASR1000
> series it seems. No idea on pricing though.
> 
> http://www.cisco.com/c/en/us/products/collateral/application-networking-services/wide-area-application-services-waas-software/data-sheet-c78-729778.pdf
> 
> Cheers,
> 
> Mark
> 
> 
>> On Wed, May 20, 2015 at 3:59 PM, Mark Tinka  wrote:
>> 
>> 
>>> On 19/May/15 20:46, Ray Soucy wrote:
>>> 
>>> An ASR1K might do the trick, but more likely than not you're looking at an
>>> ASR9K if you want full tables; I don't have any experience with the 1K
>>> personally so I can't speak to that.  The ASR 9K is a really great platform
>>> and is what we use for BGP here, but it's pretty much the opposite of cheap.
>> 
>> The ASR1000 is a very good box, but I tend to prefer them for low-speed
>> services, which are generally non-Ethernet in nature, e.g., downstream
>> customers coming in via SDH.
>> 
>> They do support 10Gbps ports, but that is a 1-port SPA; and the most you
>> can have in today's SIP's (carrier cards) would be 4x 1-port SPA's. So
>> not very dense.
>> 
>> Their forwarding planes start at 2.5Gbps (fixed) all the way to 200Gbps
>> (13-slot chassis). But you're more likely to run out of high-speed ports
>> before you stress a 200Gbps forwarding plane on that chassis.
>> 
>> So if the applications are purely Ethernet, I'd not consider the
>> ASR1000. But if there is a mix-and-match for Ethernet and non-Ethernet
>> ports, it's the perfect box. That and the MX104.
>> 
>> Mark.
> 
> 
> 
> -- 
> Regards,
> 
> Mark L. Tees


Re: Layer2 over Layer3

2012-09-12 Thread Jeff Tantsura
l2tpv3

Regards,
Jeff

On Sep 12, 2012, at 19:23, "Philip Lavine"  wrote:

> To all,
>  
> I am trying to extend a layer2 connection over Layer 3 so I can have 
> redundant Layer connectivity between my HQ and colo site. The reason I need 
> this is so I can give the "appearance" that there is one gateway and that 
> both data centers can share the same Layer3 subnet (which I am announcing via 
> BGP to 2 different vendors).
>  
> I have 2 ASR's. Will EoMPLS work or is there another option?
>  
> Philip



RE: MP-BGP next hop tracking delay 0

2012-10-23 Thread Jeff Tantsura
Hi Adam,

Works just fine on any relatively modern router.

Cheers,
Jeff

-Original Message-
From: Adam Vitkovsky [mailto:adam.vitkov...@swan.sk] 
Sent: Tuesday, October 23, 2012 12:31 AM
To: nanog@nanog.org
Subject: MP-BGP next hop tracking delay 0

I was wondering whether you have some experience with setting the next hop 
tracking delay value for BGP to 0 for critical changes, please? There's gonna be 
only a few prefixes registered with BGP so far, around 150+

adam





Re: Yet Another BGP (Border Gateway Protocol) Python Implementation

2015-08-06 Thread Jeff Tantsura
Hi Peng,

Good stuff!

Any plans for multicast, RTC and EVPN AF's?

Regards,
Jeff

> On Aug 6, 2015, at 7:43 PM, Peng Xiao (penxiao)  wrote:
> 
> Hi guys,
> 
> Ipv6 and other address families are under development. We have already 
> designed the data structures for them as you can see from the documentation, 
> but it's just some testing code and not stable and we are not doing them 
> after careful consideration.
> But at last, they will come out.
> 
> 
> -Original Message-
> From: valdis.kletni...@vt.edu [mailto:valdis.kletni...@vt.edu] 
> Sent: August 7, 2015 0:16
> To: Peng Xiao (penxiao)
> Cc: Jahangir Hossain; xx...@ledeuns.net; nanog@nanog.org
> Subject: Re: Yet Another BGP (Border Gateway Protocol) Python Implementation
> 
> On Thu, 06 Aug 2015 14:25:55 -, "Peng Xiao (penxiao)" said:
>> Currently, yabgp does not support IPv6 address family. We only support IPv4 
>> now.
> 
> http://tnx.nl/legacy-ip-only.svg
> 
> Seriously guys.  It's 2015.  We really don't care what you hack for your own 
> use - but publicly announcing stuff that doesn't have IPv6 support is getting 
> kind of embarrassing...
> 
> *especially* when it's a major vendor open-sourcing their code. (I'm still 
> willing to cut a *little* slack for "two guys and a stack of empty pizza 
> boxes" software)
> 
> 


Re: BGP advertise-best-external on RR

2015-08-25 Thread Jeff Tantsura
Hi,

In your case I'd recommend to use diverse path, due to its simplicity and
non disruptive deployment characteristics.
As you know - diverse path requires additional BGP session per additional
(second, next, etc) path, in most cases not a problem, however mileage
might vary.

To my memory, in Cisco land - it has only been implemented in IOS, not XR,
please check.   

Cheers,
Jeff




-Original Message-
From: Diptanshu Singh 
Date: Monday, August 24, 2015 at 10:53 PM
To: Mohamed Kamal 
Cc: "nanog@nanog.org" 
Subject: Re: BGP advertise-best-external on RR

>Yes . In the case of diverse path , shadow route reflector will be the
>one wherever  you enable commands to trigger diverse path computation.
>
>Good thing with diverse path is that the RR-Clients don't have to have
>any support but bad thing is that it can only reflect One additional
>best-path( second best path ) .
>
>Sent from my iPhone
>
>> On Aug 24, 2015, at 2:31 PM, Mohamed Kamal  wrote:
>> 
>> It's only supported on the 15.2(4)S and later not the SRE train. I
>>might consider an upgrade.
>> 
>> One more question regarding this, can you configure the RR to be the
>>main and shadow RR?
>> 
>> Mohamed Kamal
>> Core Network Sr. Engineer
>> 
>>> On 8/24/2015 9:16 PM, Diptanshu Singh wrote:
>>> BGP Add-Path might be your friend . You can look at diverse-path as
>>>well .
>> 



Re: Segment Routing for L2VPN?

2015-09-21 Thread Jeff Tantsura
Hi,

In most well designed IP routing stacks the way to get to a labeled
(tunneled) next hop is decoupled from a service, so if a service requires
such a next hop it is up to (usually) the RIB to return one (the best; multiple
might exist) which would be used for forwarding. If it is a Segment Routed one,
it will then be used.
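
A minimal sketch of that decoupling, for illustration only. The class and function names, the preference values and the label numbers are all hypothetical; this is not any vendor's RIB API. The service only asks for "a labeled next hop" and pushes its own label underneath whatever transport label the RIB hands back:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Tunnel:
    kind: str        # "sr", "ldp", ... (illustrative tags only)
    label: int       # transport (outer) label
    preference: int  # lower wins; a made-up tie-breaker for this sketch


class Rib:
    """Toy RIB that knows several labeled paths per next hop."""

    def __init__(self) -> None:
        self._tunnels: Dict[str, List[Tunnel]] = {}

    def add_tunnel(self, next_hop: str, tunnel: Tunnel) -> None:
        self._tunnels.setdefault(next_hop, []).append(tunnel)

    def resolve(self, next_hop: str) -> Tunnel:
        # The service never says which tunnel type it wants;
        # the RIB simply returns the best one it has.
        return min(self._tunnels[next_hop], key=lambda t: t.preference)


def label_stack(rib: Rib, next_hop: str, service_label: int) -> List[int]:
    """Outer label comes from the RIB, inner label from the service (e.g. L2VPN)."""
    return [rib.resolve(next_hop).label, service_label]


rib = Rib()
rib.add_tunnel("192.0.2.1", Tunnel("ldp", 24001, preference=9))
rib.add_tunnel("192.0.2.1", Tunnel("sr", 16005, preference=8))
print(label_stack(rib, "192.0.2.1", service_label=300))
```

Nothing in label_stack() is specific to L3VPN or L2VPN, which is the point: the same lookup serves any service that needs a tunneled next hop.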

Cheers,
Jeff

-Original Message-
From: Mohan Nanduri 
Date: Sunday, September 20, 2015 at 12:59 PM
To: Jason Lixfeld 
Cc: "nanog@nanog.org" 
Subject: Re: Segment Routing for L2VPN?

>No, it works with L2VPNs also. Outer label is going to be SR label and
>inner label is your L2VPN label.
>
>Cheers,
>-Mohan
>
>
>On Sun, Sep 20, 2015 at 3:23 PM, Jason Lixfeld  wrote:
>> Hello!
>>
>> I've been doing some reading recently on Segment Routing.  By all
>>accounts, it seems that the (only?) implementation for SR supports
>>L3VPN.  Am I dumb and just missing the L2VPN bits, or is L3VPN simply
>>the extent of the first generation?
>>
>> Sent from my iPhone



Re: EoMPLS vlan rewrite between brands; possibly new bug in Cisco IOS 15

2015-11-14 Thread Jeff Tantsura
Been forever since i looked at cisco, however sounds like vc type mismatch. 
They used to have it as a platform capability, perhaps SW upgrade changed the 
default.

to my memory "show mpls l2 transport" should provide enough details.

Hope this helps

Regards,
Jeff

> On Nov 14, 2015, at 4:50 AM, Jonas Bjork  wrote:
> 
> Hi, I am using a couple of AToM/EoMPLS tunnels in order to carry customer 
> voice and data traffic across our IP/MPLS core, and it is currently working 
> just fine. The first side consists of a Cisco 7600 router (rsp) and the other 
> one is an HP A5500-HI routing switch with full LER/E-LSR capability. At the 
> HP site, the tunnels are facing the access ports towards our premium 
> end-customers; and on the Cisco PE I terminate the tunnels on one of the 
> 2x10GE portchannel backbone links. There is vlan X on the HP side and vlan Y 
> on the Cisco side - vlan rewrite is working perfectly - as long as I use IOS 
> 12.
> 
> After upgrading the Cisco router software to IOS 15 the tunnels won't come 
> up. sh mpls l2 vc Y d says:
> ...
> Last error: Imposition VLAN rewrite capability mismatch with peer
> ...
> 
> I use almost exactly the same Cisco configuration before and after the 
> upgrade (only minor changes and nothing related to this) and I haven't touched 
> the HP. Apparently they don't talk the same L2PW language. I wonder though, 
> why now? We use service instances on the HP switchport as endpoint, we 
> initiate the targeted LDP session in addition to the pseudowire handshake 
> from a sub interface MPLS crossconnect. There is no MTU mismatch; not here - 
> not anywhere.
> 
> Anyone heard of this issue or experienced it?
> 
> Best regards,
> 
> Jonas Björk
> SNE, Europe/Sweden (hope you guys will help me anyway:)


Re: EoMPLS vlan rewrite between brands; possibly new bug in Cisco IOS 15

2015-11-14 Thread Jeff Tantsura
Jonas,

As expected - the problem is related to vc type negotiation.

You have hit CSCuq28998 :)
talk to your cisco rep  

Workaround:
- configure VC type 5 between the routers (configured on HP side)
- configure no-control-word 


The bug has been reported in 15.2(4)S4a, perhaps there’s an image with the 
problem fixed.

Cheers,
Jeff




On 11/14/15, 17:31, "Jonas Bjork"  wrote:

>Dear Mr. Jeff,
>
>Thank you for your reply. Below is the complete output in question (l2 is 
>short for l2transport).
>You are mentioning platform capabilities and that the default might have 
>changed. How do I alter this?
>
>pe#sh mpls l2 vc 42 d
>Local interface: Po190.42 up, line protocol up, Eth VLAN 42 up
>  Destination address: X.X.1.89, VC ID: 42, VC status: down
>Last error: Imposition VLAN rewrite capability mismatch with peer
>Output interface: none, imposed label stack {}
>Preferred path: not configured
>Default path: no route
>No adjacency
>  Create time: 00:00:59, last status change time: 00:31:40
>Last label FSM state change time: 00:00:18
>Last peer autosense occurred at: 00:00:18
>  Signaling protocol: LDP, peer X.X.1.89:0 up
>Targeted Hello: X.X.0.2(LDP Id) -> X.X.1.89, LDP is UP
>Graceful restart: not configured and not enabled
>Non stop routing: not configured and not enabled
>Status TLV support (local/remote)   : enabled/not supported
>  LDP route watch   : enabled
>  Label/status state machine: remote invalid, LruRnd
>  Last local dataplane   status rcvd: No fault
>  Last BFD dataplane status rcvd: Not sent
>  Last BFD peer monitor  status rcvd: No fault
>  Last local AC  circuit status rcvd: No fault
>  Last local AC  circuit status sent: DOWN PW(rx/tx faults)
>  Last local PW i/f circ status rcvd: No fault
>  Last local LDP TLV status sent: No fault
>  Last remote LDP TLVstatus rcvd: Not sent
>  Last remote LDP ADJstatus rcvd: No fault
>MPLS VC labels: local 242, remote 1199
>Group ID: local 0, remote 0
>MTU: local 9216, remote 9216
>Remote interface description:
>Remote VLAN id: 42
>  Sequencing: receive disabled, send disabled
>  Control Word: Off (configured: autosense)
>  SSO Descriptor: X.X.1.89/42, local label: 242
>  Dataplane:
>SSM segment/switch IDs: 0/0 (used), PWID: 142
>  VC statistics:
>transit packet totals: receive 0, send 0
>transit byte totals:   receive 0, send 0
>transit packet drops:  receive 0, seq error 0, send 0
>pe#
>
>Anyone else: feel free to join in. Maybe we have any L2VC/PW ninjas watching.
>
>Best regards,
>Jonas Bjork
>
>
>> On 15 Nov 2015, at 1:26, Jeff Tantsura  wrote:
>> 
>> Been forever since i looked at cisco, however sounds like vc type mismatch. 
>> They used to have it as a platform capability, perhaps SW upgrade changed 
>> the default.
>> 
>> to my memory "show mpls l2 transport" should provide enough details.
>> 
>> Hope this helps
>> 
>> Regards,
>> Jeff
>> 
>>> On Nov 14, 2015, at 4:50 AM, Jonas Bjork  wrote:
>>> 
>>> Hi, I am using a couple of AToM/EoMPLS tunnels in order to carry customer 
>>> voice and data traffic across our IP/MPLS core, and it is currently working 
>>> just fine. The first side consists of a Cisco 7600 router (rsp) and the 
>>> other one is an HP A5500-HI routing switch with full LER/E-LSR capability. 
>>> At the HP site, the tunnels are facing the access ports towards our premium 
>>> end-customers; and on the Cisco PE I terminate the tunnels on one of the 
>>> 2x10GE portchannel backbone links. There is vlan X on the HP side and vlan 
>>> Y on the Cisco side - vlan rewrite is working perfectly - as long as I use 
>>> IOS 12.
>>> 
>>> After upgrading the Cisco router software to IOS 15 the tunnels won't come 
>>> up. sh mpls l2 vc Y d says:
>>> ...
>>> Last error: Imposition VLAN rewrite capability mismatch with peer
>>> ...
>>> 
>>> I use almost exactly the same Cisco configuration before and after the 
>>> upgrade (only minor changes and nothing related to this) and I haven't 
>>> touched the HP. Apparently they don't talk the same L2PW language. I wonder 
>>> though, why now? We use service instances on the HP switchport as endpoint, 
>>> we initiate the targeted LDP session in addition to the pseudowire 
>>> handshake from a sub interface MPLS crossconnect. There is no MTU mismatch; 
>>> not here - not anywhere.
>>> 
>>> Anyone heard of this issue or experienced it?
>>> 
>>> Best regards,
>>> 
>>> Jonas Björk
>>> SNE, Europe/Sweden (hope you guys will help me anyway:)
>


Re: route converge time

2015-11-28 Thread Jeff Tantsura
In that case multihop BFD (if supported on both sides) would really help.
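
To put rough numbers on it, here is a back-of-the-envelope sketch. The timer values (180 s hold time, 300 ms BFD interval, multiplier 3) are illustrative assumptions, not anything taken from the setup described in this thread, and the function names are made up for the example:

```python
# Rough failure-detection arithmetic; all timer values are assumptions.

def bgp_worst_case_detection_ms(hold_time_s: int) -> int:
    """Without link-state or BFD, a dead multihop peer is only noticed
    when the BGP hold timer expires."""
    return hold_time_s * 1000


def bfd_detection_ms(interval_ms: int, multiplier: int) -> int:
    """BFD declares the session down after roughly `multiplier`
    consecutive intervals pass without a received control packet."""
    return interval_ms * multiplier


hold = bgp_worst_case_detection_ms(180)
bfd = bfd_detection_ms(300, 3)
print(f"hold timer: {hold} ms, BFD: {bfd} ms, ratio: {hold // bfd}x")
```

With these assumed values the multihop session is torn down in about a second instead of minutes; what happens after detection (route withdrawal, FIB update) still takes its own time.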

Regards,
Jeff

> On Nov 28, 2015, at 11:37 AM, Matthew Petach  wrote:
> 
> One thing I notice you don't mention is whether your
> BGP sessions to your upstream providers are direct
> or multi-hop eBGP.  I know for a while some of the
> more bargain-basement providers were doing eBGP
> multi-hop feeds for full tables, which will definitely
> slow down convergence if the routers have to wait
> for hold timers to expire to flush routes, rather than
> being able to direct detect link state transitions.
> 
> Matt
> 
> 
> On Sat, Nov 21, 2015 at 5:44 AM, Baldur Norddahl
>  wrote:
>> Hi
>> 
>> I got a network with two routers and two IP transit providers, each with
>> the full BGP table. Router A is connected to provider A and router B to
>> provider B. We use MPLS with a L3VPN with a VRF called "internet".
>> Everything happens inside that VRF.
>> 
>> Now if I interrupt one of the IP transit circuits, the routers will take
>> several minutes to remove the now bad routes and move everything to the
>> remaining transit provider. This is very noticeable to the customers. I am
>> looking into ways to improve that.
>> 
>> I added a default static route 0.0.0.0 to provider A on router A and did
>> the same to provider B on router B. This is supposed to be a trick that
>> allows the network to move packets before everything is fully converged.
>> Traffic might not leave the most optimal link, but it will be delivered.
>> 
>> Say I take down the provider A link on router A. As I understand it, the
>> hardware will notice this right away and stop using the routes to provider
>> A. Router A might know about the default route on router B and send the
>> traffic to router B. However this is not much help, because on router B
>> there is no link that is down, so the hardware is unaware until the BGP
>> process is done updating the hardware tables. Which apparently can take
>> several minutes.
>> 
>> My routers also have multipath support, but I am unsure if that is going to
>> be of any help.
>> 
>> Anyone got any tricks or pointers to what can be done to optimize the
>> downtime in case of a IP transit link failure? Or the related case of one
>> my routers going down or the link between them going down (the traffic
>> would go a non-direct way instead if the direct link is down).
>> 
>> Thanks,
>> 
>> Baldur
>> 


Re: VPLS Providers

2016-01-01 Thread Jeff Tantsura
In 2016 we will start seeing the first massive EVPN deployments.
If you really need L2 with multihoming and BGP FRR speeds in service recovery - 
look for EVPN, otherwise, as mentioned below - L3 is your friend.

Regards,
Jeff

> On Jan 1, 2016, at 7:21 AM, Nick Hilliard  wrote:
> 
> Chris Burwell wrote:
>> I've had enough trouble with broadcast storms and other issues in N.A.
> 
> And you still want vpls?  Wow.
> 
> If you're talking a requirement for connecting geographically separated
> locations, there are sound technical reasons for avoiding vpls like the
> plague.  Unless there are overriding technical reasons why it wouldn't
> work, l3vpn will almost always provide a far better quality service.
> 
> Nick
> 


Re: New Switches with Broadcom StrataDNX

2016-01-19 Thread Jeff Tantsura
Hi,

Some points:
1. The DNX SDK is significantly different from the XGS one adopted by Cumulus and 
such; that port is yet to be done, and it is not a negligible amount of work.
2. If you are not interested in capacity but in scale, there’re other BCM chips, 
perhaps more suitable.
3. You don’t have to have all the forwarding entries populated in silicon; as an 
example, take a look at http://sdn-internet-router-sir.readthedocs.org, code 
at https://github.com/dbarrosop/sir. One could also leverage the approach we have 
taken in EVPN: decoupling RIB from FIB completely.
4. NG silicon will do 1M+ LPMs.
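
To illustrate point 3, a minimal sketch of selective FIB programming. This is not the SIR code and not any vendor's API; the function name, the traffic counters and the capacity figure are hypothetical. The idea is simply to keep the full table in the RIB and push only the busiest prefixes, plus a default route, into silicon:

```python
from typing import Dict, Iterable, List

DEFAULT_ROUTE = "0.0.0.0/0"


def select_fib_prefixes(rib: Iterable[str],
                        traffic_bytes: Dict[str, int],
                        fib_capacity: int) -> List[str]:
    """Pick the prefixes to install in hardware.

    The default route always goes in, so traffic to prefixes left out of
    the FIB is still forwarded (possibly via a less optimal exit).
    The remaining slots go to the prefixes carrying the most traffic.
    """
    if fib_capacity < 1:
        raise ValueError("need room for at least the default route")
    candidates = [p for p in rib if p != DEFAULT_ROUTE]
    # Sort by traffic (descending), then by prefix string for a stable result.
    ranked = sorted(candidates, key=lambda p: (-traffic_bytes.get(p, 0), p))
    return [DEFAULT_ROUTE] + ranked[:fib_capacity - 1]


rib = ["0.0.0.0/0", "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
traffic = {"10.0.0.0/24": 5, "10.0.1.0/24": 500, "10.0.2.0/24": 50}
print(select_fib_prefixes(rib, traffic, fib_capacity=3))
```

A real implementation also has to handle churn (which is the hard part, as with aggregation) and make sure a covering route always exists before removing a more specific.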

Cheers,
Jeff







On 1/19/16, 06:29, "NANOG on behalf of Colton Conor"  wrote:

>I was hoping this new Broadcom chip would be able to support enough routes
>to hold a full BGP table, and be used for something like cumulus linux. I
>have no need for 100G, but 10G and 40G on a platform with deeper buffers
>sounds nice.
>
>On Tue, Jan 19, 2016 at 1:01 AM, Phil Bedard  wrote:
>
>> The BCM88670 (Jericho) is what powers the new Cisco NCS55XX devices. The
>> processor is linerate above around 100 bytes per packet without external
>> TCAM, supports 256K IPv4/64K IPv6 FIB entries (or mixed amounts).  These
>> chips are being used for high scale 100G, the initial NCS5508 linecard is a
>> 36x100G QSFP28 one.
>>
>> Juniper has chosen to use their own silicon for most of their dense 100G
>> platforms, but you’ll see these chips used by pretty much everyone else I
>> imagine at some point in the next year.
>>
>>
>>
>> Phil
>>
>> -Original Message-
>> From: NANOG  on behalf of Colton Conor <
>> colton.co...@gmail.com>
>> Date: Sunday, January 17, 2016 at 18:15
>> To: NANOG 
>> Subject: New Switches with Broadcom StrataDNX
>>
>> >Does anyone know when the switching and router vendors will release their
>> >new models with the Broadcom BCM88370 and BCM88670 chips? It looks like
>> >these chips could be used as a carrier grade router and/or metro E device.
>> >
>> >More information here:
>> http://www.broadcom.com/press/release.php?id=s902223
>> >
>> >and here:
>> >
>> http://www.nextplatform.com/2015/03/19/new-dune-chips-enable-heftier-switches/
>>


Re: New Switches with Broadcom StrataDNX

2016-01-20 Thread Jeff Tantsura
That's right, logic is in programming chips, not their property. You just need 
to know what to program ;-)

Regards,
Jeff

> On Jan 19, 2016, at 10:10 PM, Mark Tinka  wrote:
> 
> 
> 
>> On 20/Jan/16 00:17, Phil Bedard wrote:
>> 
>> Good point, there are many people looking at what I call FIB optimization 
>> right now.  The key is having the programmability on the device to make it 
>> happen.  Juniper/Cisco support it using policies to filter RIB->FIB and I 
>> believe both also do per-NPU/PFE localized FIBs now. I am not sure if that’s 
>> something supported on this new Broadcom chipset.  Depends on your network 
>> of course and where you are looking to position the router.
> 
> I don't think the FIB needs to have specific support for selective
> programming.
> 
> I think that comes in the code to instruct the control plane what it
> should download to the FIB.
> 
> Cisco's and Juniper's support of this is on FIB that has been in
> production long before the feature became available. It was just added
> to code.
> 
> Mark.


Re: New Switches with Broadcom StrataDNX

2016-04-18 Thread Jeff Tantsura
Lincoln,

Why wouldn’t they?
What is it Arista did others didn’t?

Cheers,
Jeff

From: lincoln dale <l...@interlink.com.au>
Date: Monday, April 18, 2016 at 11:42 AM
To: Colton Conor <colton.co...@gmail.com>
Cc: Jeff Tantsura <jeff.tants...@ericsson.com>, "nanog@nanog.org" <nanog@nanog.org>
Subject: Re: New Switches with Broadcom StrataDNX

Yes. We also have 1M+ FIB support day one too - hence the letter 'R' denoting 
the evolution with 3rd generation of its evolution to internet edge/router use 
cases.

Not sure what other vendors are doing but I doubt others are yet shipping large 
table support.
(there's more to it than just the underlying native silicon)


cheers,

lincoln. (l...@arista.com)


On Mon, Apr 18, 2016 at 11:01 AM, Colton Conor <colton.co...@gmail.com> wrote:
As a follow up to this post, it looks like the Arista 7500R series has this
new chip inside of it.

On Wed, Jan 20, 2016 at 9:34 AM, Jeff Tantsura <jeff.tants...@ericsson.com>
wrote:

> That's right, logic is in programming chips, not their property. You just
> need to know what to program ;-)
>
> Regards,
> Jeff
>
> > On Jan 19, 2016, at 10:10 PM, Mark Tinka <mark.ti...@seacom.mu> wrote:
> >
> >
> >
> >> On 20/Jan/16 00:17, Phil Bedard wrote:
> >>
> >> Good point, there are many people looking at what I call FIB
> optimization right now.  The key is having the programmability on the
> device to make it happen.  Juniper/Cisco support it using policies to
> filter RIB->FIB and I believe both also do per-NPU/PFE localized FIBs now.
> I am not sure if that’s something supported on this new Broadcom chipset.
> Depends on your network of course and where you are looking to position the
> router.
> >
> > I don't think the FIB needs to have specific support for selective
> > programming.
> >
> > I think that comes in the code to instruct the control plane what it
> > should download to the FIB.
> >
> > Cisco's and Juniper's support of this is on FIB that has been in
> > production long before the feature became available. It was just added
> > to code.
> >
> > Mark.
>



Re: New Switches with Broadcom StrataDNX

2016-04-18 Thread Jeff Tantsura
It depends…

there’s a phenomenon called “next-hop flattening” which has to do with lookup 
recursiveness within the silicon.
Unless this is done (and this is a big piece of work), not everything supported on 
Trio or EZchip can be supported.

In general – Jericho (and its followers) is a great piece of silicon made by 
clueful folks… watch this space closely

Jeff

From: Colton Conor
Date: Monday, April 18, 2016 at 11:44 AM
To: lincoln dale
Cc: Jeff Tantsura, "nanog@nanog.org"
Subject: Re: New Switches with Broadcom StrataDNX

So can this compete routing wise against something like a Juniper MX104 or 
Cisco ASR 9001? 

On Mon, Apr 18, 2016 at 1:42 PM, lincoln dale  wrote:
Yes. We also have 1M+ FIB support day one too - hence the letter 'R' denoting 
the evolution with 3rd generation of its evolution to internet edge/router use 
cases.

Not sure what other vendors are doing but I doubt others are yet shipping large 
table support.
(there's more to it than just the underlying native silicon)


cheers,

lincoln. (l...@arista.com)


On Mon, Apr 18, 2016 at 11:01 AM, Colton Conor  wrote:
As a follow up to this post, it looks like the Arista 7500R series has this
new chip inside of it.

On Wed, Jan 20, 2016 at 9:34 AM, Jeff Tantsura 
wrote:

> That's right, logic is in programming chips, not their property. You just
> need to know what to program ;-)
>
> Regards,
> Jeff
>
> > On Jan 19, 2016, at 10:10 PM, Mark Tinka  wrote:
> >
> >
> >
> >> On 20/Jan/16 00:17, Phil Bedard wrote:
> >>
> >> Good point, there are many people looking at what I call FIB
> optimization right now.  The key is having the programmability on the
> device to make it happen.  Juniper/Cisco support it using policies to
> filter RIB->FIB and I believe both also do per-NPU/PFE localized FIBs now.
> I am not sure if that’s something supported on this new Broadcom chipset.
> Depends on your network of course and where you are looking to position the
> router.
> >
> > I don't think the FIB needs to have specific support for selective
> > programming.
> >
> > I think that comes in the code to instruct the control plane what it
> > should download to the FIB.
> >
> > Cisco's and Juniper's support of this is on FIB that has been in
> > production long before the feature became available. It was just added
> > to code.
> >
> > Mark.
>





Re: Arista Routing Solutions

2016-04-23 Thread Jeff Tantsura
Saku,

Jericho is in no sense a low end chip; while there are some scale limitations 
(what can be done with SuperFEC, some bridging related stuff), from a 
functionality perspective it is a very capable silicon.

One has to:
- Understand how to program it properly (recursiveness, ECMPs, etc.)
- Know how to enhance the SDK
- Have a rather rich control plane, which can be translated into rich forwarding 
functionality :-)

I’m not familiar with Arista’s feature set;
NCS with XR would be a good proof.

Watch for Jericho updates from DNX

Cheers,
Jeff 



On 4/23/16, 11:20 AM, "NANOG on behalf of Saku Ytti"  wrote:

>On 23 April 2016 at 10:52, Tom Hill  wrote:
>> In broad strokes: for your money you're either getting port density, or
>> more features per port. The only difference here is that there's
>> suddenly more TCAM on the device, and I still don't see the above
>> changing too drastically.
>
>Yeah OP is comparing high touch chip (MX104) to low touch chip
>(Jericho) that is not fair comparison. And cost is what customer is
>willing to pay, regardless of sticker on the box. No one will pay
>significant mark-up for another sticker, I've never seen in RFP
>significant differences in comparable products.
>
>Fairer comparison would be QFX10k, instead of MX104. QFX10k is AFAIK
>only product in this segment which is not using Jericho. If this is
>competitive advantage or risk, jury is still out, I lean towards
>competitive advantage, mainly due to its memory design.
>
>-- 
>  ++ytti



RE: bgp update destroying transit on redback routers ?

2011-12-01 Thread Jeff Tantsura
Hi,

Let me take it over from now on, I'm the IP Routing/MPLS Product Manager at 
Ericsson responsible for all routing protocols.
There's nothing wrong in checking ASN in AGGREGATOR, we don't really want to see 
ASN 0 anywhere, that's how draft-wkumari-idr-as0 (draft-ietf-idr-as0-00) came 
into the world.

To my knowledge - the only vendor which allows changing ASN in AGGREGATOR is 
Juniper, see "no-aggregator-id", in the past I've tried to talk to Yakov about 
it, without any results though. 
So for those who have it configured - please rethink whether you really need it.

As for SEOS - understanding that this badly affects our customers and not 
having draft-ietf-idr-error-handling fully implemented yet, we will temporarily 
disable this check in our code.
Patch will be made available.

Please contact me for any further clarifications.

Regards,
Jeff

P.S. Warren has recently  included AGGREGATOR in the draft, please see

 2. Behavior
   This document specifies that a BGP speaker MUST NOT originate or
   propagate a route with an AS number of zero.  If a BGP speaker
   receives a route which has an AS number of zero in the AS_PATH (or
   AS4_PATH) attribute, it SHOULD be logged and treated as a WITHDRAW.
   This same behavior applies to routes containing zero as the
   Aggregator or AS4 Aggregator.




RE: bgp update destroying transit on redback routers ?

2011-12-01 Thread Jeff Tantsura
Thanks Warren!
I have already brought this to the list.

Regards,
Jeff


-Original Message-
From: Warren Kumari [mailto:war...@kumari.net] 
Sent: Thursday, December 01, 2011 3:05 PM
To: Christopher Morrow
Cc: nanog@nanog.org
Subject: Re: bgp update destroying transit on redback routers ?


On Dec 1, 2011, at 3:36 PM, Christopher Morrow wrote:

> On Thu, Dec 1, 2011 at 3:23 PM, Igor Ybema  wrote:
>>> 
>>> 
>>> one of the reasons the above was written...
>> 
>> That does not include when ASN=0 is used in the aggregator attribute.
>> Could you add that?
> 
> that's a warren question...

http://tools.ietf.org/html/draft-wkumari-idr-as0-01 has been replaced with 
http://tools.ietf.org/html/draft-ietf-idr-as0-00 -- which does include it.

Thanks all,
W





RE: bgp update destroying transit on redback routers ?

2011-12-02 Thread Jeff Tantsura
Hi Alexandre,

You are right, the behavior is exactly as per RFC4271 section 6:
"When any of the conditions described here are detected, a
NOTIFICATION message, with the indicated Error Code, Error Subcode, and Data 
fields, is sent, and the BGP connection is closed."
So because ASN 0 in AGGREGATOR is seen as a malformed UPDATE, we send 3/9 and 
close the connection.

Ideally it should be treated as "treat-as-withdraw" as per 
draft-chen-ebgp-error-handling, however please note - this is still a draft, 
not a normative document and with all my support it takes time to implement.
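
For illustration, a minimal sketch of the check the AS0 draft asks for. The function and constant names are hypothetical and this is not SEOS code; it only classifies an already-parsed UPDATE instead of tearing the session down:

```python
from typing import Optional, Sequence

ACCEPT = "accept"
TREAT_AS_WITHDRAW = "treat-as-withdraw"


def classify_update(as_path: Sequence[int],
                    aggregator_asn: Optional[int] = None) -> str:
    """Return how to handle a received route.

    Per the draft text quoted in this thread, a route with AS 0 in the
    AS_PATH (or AS4_PATH), or with AS 0 as the Aggregator, should be
    logged and treated as a withdraw. The session itself stays up.
    """
    if 0 in as_path:
        return TREAT_AS_WITHDRAW
    if aggregator_asn == 0:
        return TREAT_AS_WITHDRAW
    return ACCEPT


print(classify_update([64500, 64501], aggregator_asn=0))
```

The difference from the RFC4271 behavior is only in the blast radius: one prefix is dropped rather than every route learned over that session.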

Once again, we understand the implications for our customers and hence are going to 
disable the ASN 0 check.

P.S. We have strong evidence that the update in question was caused by a bug on 
a freshly updated router (I'm not going to disclose the vendor) 

Regards,
Jeff


-Original Message-
From: Alexandre Snarskii [mailto:s...@snar.spb.ru] 
Sent: Friday, December 02, 2011 6:36 AM
To: Jeff Tantsura
Cc: nanog@nanog.org
Subject: Re: bgp update destroying transit on redback routers ?

On Thu, Dec 01, 2011 at 04:56:43PM -0500, Jeff Tantsura wrote:
> Hi,
> 
> Let me take it over from now on, I'm the IP Routing/MPLS Product 
> Manager at Ericsson responsible for all routing protocols.
> There's nothing wrong in checking ASN in AGGREGATOR, we don't really 
> want see ASN 0 anywhere, that's how draft-wkumari-idr-as0 
> (draft-ietf-idr-as0-00) came into the worlds.

This draft says that

If a BGP speaker receives a route which has an AS number of zero in the AS_PATH 
(or AS4_PATH) attribute, it SHOULD be logged and treated as a WITHDRAW. This 
same behavior applies to routes containing zero as the Aggregator or AS4 
Aggregator.

but observed behaviour was more like following: 

If a BGP speaker receives [bad route] it MUST close session immediately with 
NOTIFICATION Error Code 'Update Message Error' and subcode 'Error with optional 
attribute'.

--
In theory, there is no difference between theory and practice. 
But, in practice, there is. 




RE: draft-ietf-idr-as0-00 (bgp update destroying transit on redback routers ?)

2011-12-03 Thread Jeff Tantsura
Hi Daniel,

I do understand the use of it however have my doubts about usability as such, 
I'd really like to see anyone using it for the reason below.
All of updates with ASN 0 I have seen in the past few years were there due to 
software bugs, not explicit configuration - same as this one.

Warren/ idr -  I do support addition of AGGREGATOR in the draft

Regards,
Jeff

P.S. Jeffrey/John -  this draft makes use of "no-aggregator-id"  de facto 
illegal, are you (your customers) OK with it? 
Thanks!

-Original Message-
From: Daniel Ginsburg [mailto:d...@net-geek.org] 
Sent: Friday, December 02, 2011 5:13 AM
To: Jeff Tantsura; Warren Kumari
Cc: nanog@nanog.org; i...@ietf.org
Subject: draft-ietf-idr-as0-00 (bgp update destroying transit on redback 
routers ?)

Hi,

This is true that "no-aggregator-id" knob zeroes out the AGGREGATOR attribute.

The knob, as far as I was able to find out, dates back to gated and there's a 
reason why it was introduced - it helps to avoid unnecessary updates. Assume 
that an aggregate route is generated by two (or more) speakers in the network. 
These two aggregates differ only in AGGREGATOR attribute. One of the aggregates 
is preferred within the network (due to IGP metric, for instance, or any other 
reasons) and is announced out. Now if something changes within the network and 
the other instance of the aggregate becomes preferred, the network has to issue 
an outward update different from the previous only in AGGREGATOR attribute, 
which is completely superfluous.

If the network employs the "no-aggregator-id" knob to zero out the AGGREGATOR 
attribute, both instances of the aggregate route are completely equivalent, and 
no redundant outward updates have to be sent if one instance becomes better 
than another due to some internal event, which nobody in the Internet cares 
about.
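
A small sketch of the effect described above. It assumes, as this message states, that the knob zeroes out the whole AGGREGATOR attribute; the Route type, the ASNs and the router IDs are made up for the example:

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: Tuple[int, ...]
    aggregator: Tuple[int, str]  # (ASN, router-id of the aggregating speaker)


# Assumption for this sketch: the knob replaces AGGREGATOR with all zeroes.
ZEROED = (0, "0.0.0.0")


def zero_aggregator(route: Route) -> Route:
    return replace(route, aggregator=ZEROED)


def needs_outward_update(previous_best: Route, new_best: Route) -> bool:
    """An UPDATE goes out only if some attribute actually changed."""
    return previous_best != new_best


# The same aggregate generated by two speakers inside one network.
from_r1 = Route("192.0.2.0/24", (64500,), (64500, "10.0.0.1"))
from_r2 = Route("192.0.2.0/24", (64500,), (64500, "10.0.0.2"))

print(needs_outward_update(from_r1, from_r2))
print(needs_outward_update(zero_aggregator(from_r1), zero_aggregator(from_r2)))
```

Without the knob, an internal best-path change leaks out as an UPDATE that differs only in AGGREGATOR; with it, the two instances compare equal and nothing is sent.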

In other words, the "no-aggregator-id" knob has valid operational reasons to be 
used. And, IMHO, the draft-ietf-idr-as0-00 should not prohibit AS0 in 
AGGREGATOR attribute.

On 02.12.2011, at 1:56, Jeff Tantsura wrote:

> Hi,
> 
> Let me take it over from now on, I'm the IP Routing/MPLS Product Manager at 
> Ericsson responsible for all routing protocols.
> There's nothing wrong in checking ASN in AGGREGATOR, we don't really want see 
> ASN 0 anywhere, that's how draft-wkumari-idr-as0 (draft-ietf-idr-as0-00) came 
> into the worlds.
> 
> To my knowledge - the only vendor which allows changing ASN in AGGREGATOR is 
> Juniper, see "no-aggregator-id", in the past I've tried to talk to Yakov 
> about it, without any results though. 
> So for those who have it configured - please rethink whether you really need 
> it.
> 
> As for SEOS - understanding that this badly affects our customers and not 
> having draft-ietf-idr-error-handling fully implemented yet, we will 
> temporarily disable this check in our code.
> Patch will be made available.
> 
> Please contact me for any further clarifications.
> 
> Regards,
> Jeff
> 
> P.S. Warren has recently  included AGGREGATOR in the draft, please see
> 
> 2. Behavior
>   This document specifies that a BGP speaker MUST NOT originate or
>   propagate a route with an AS number of zero.  If a BGP speaker
>   receives a route which has an AS number of zero in the AS_PATH (or
>   AS4_PATH) attribute, it SHOULD be logged and treated as a WITHDRAW.
>   This same behavior applies to routes containing zero as the
>   Aggregator or AS4 Aggregator.
> 




Re: sr - spring - what's the deal with 2 names

2020-09-06 Thread Jeff Tantsura via NANOG
Aaron,

Out of curiosity - if you are interested in SR, where are you getting your 
information from if not IETF (SPRING)?
As for history - we (at Redback) published the first draft describing the 
SR-MPLS data plane in 2003 (LDP control plane).

Regards,
Jeff

> On Sep 6, 2020, at 09:53, Saku Ytti via NANOG  wrote:
> 
> Hey,
> 
> 
>> On Sun, 6 Sep 2020 at 04:26, Aaron Gould via NANOG  wrote:
>> 
>> Does anyone know the scope on why we have 2 names for this ?  Seriously, was 
>> it one of those things where a vendor started doing it first (pre-standard) 
>> as sr, and then ietf started standardizing it as spring ? …or was it always 
>> being standardized pre-vendor implementation and there was a disagreement 
>> within ietf or elsewhere ?  or… was there a conscious decision amongst the 
>> inventors to actually call it both sr and spring ?  or is there actually 
>> something different about each one and I’m wrong in thinking they are 2 
>> names for the same technology.
> 
> If you don't like the names, I have others.
> 
> SPRING is the IETF working group name - Source Packet Routing in Networking
> Segment Routing is under SPRING
> 
> -- 
>  ++ytti


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Jeff Tantsura via NANOG
De facto standards are only as good as the people implementing them; to 
enforce unambiguous implementations, a standard has to be de jure (e.g. a 
standards-track RFC).
While I’m sympathetic to the idea, I’m quite skeptical about its viability.
A well-written BCP would be much more valuable, and perhaps once we get to a 
critical mass, codification would follow naturally, rather than artificially 
forcing people to do things they don’t see value (ROI) in. The discussion here 
perfectly reflects the state of the art.

Cheers,
Jeff

> On Sep 8, 2020, at 17:57, Douglas Fischer via NANOG  wrote:
> 
> 
> Most of us have already used some BGP community policy to no-export some 
> routes to some where.
> 
> On the majority of IXPs, and most of the Transit Providers, the very common 
> community tell to route-servers and routers "Please do no-export these routes 
> to that ASN" is:
> 
>  -> 0:
> 
> So we could say that this is a de-facto standard.
> 
> 
> But the Policy equivalent to "Please, export these routes only to that ASN" 
> is very varied on all the IXPs or Transit Providers.
> 
> 
> With that said, now comes some questions:
> 
> 1 - Beyond being a de-facto standard, there is any RFC, Public Policy, or 
> something like that, that would define 0: as "no-export-to" 
> standard?
> 
> 2 - What about reserving some 16-bits ASN to use : as 
> "export-only-to" standard?
> 2.1 - Is important to be 16 bits, because with (RT) extended communities, any 
> ASN on the planet could be the target of that policy.
> 2.2 - Would be interesting some mnemonic number like 1000 / 1 or so.
> 
> -- 
> Douglas Fernando Fischer
> Engº de Controle e Automação


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Jeff Tantsura via NANOG
I don’t think anyone has proposed using “reserved ASNs” as a BCP; the “ab”use 
of ASN 0 is a de facto artifact (an unfortunate one).
My goal would be to provide a viable source of information for someone who is 
setting up a new ISP and has very little clue where to start: do’s and don’ts 
for inter-domain community use.

I really appreciated the difference RFC7938 (Use of BGP for Routing in 
Large-Scale Data Centers) made; literally hundreds of companies have used it to 
educate themselves and implement their DC networking.

Cheers,
Jeff

>> On Sep 9, 2020, at 10:04, adam via NANOG  wrote:
> 
> I don’t agree with the use of reserved ASNs, let alone making it BCP, cause 
> it defeats the whole purpose of the community structure.
> Community is basically sending a message to an AS. If I want your specific AS 
> to interpret the message I set it in format YOUR_ASN:, your 
> AS in the first part of the community means that your rules of how to 
> interpret the community value apply.
> Turning AS#0 or any other reserved AS# into a “broadcast-AS#” in terms of 
> communities (or any other attribute for that matter) just doesn’t sit right 
> with me (what’s next? multicast-ASNs that we can subscribe to?).
> All the examples in Robert’s draft or wide community RFC, all of them use an 
> example AS# the community is addressed to (not some special reserved AS#).
>  
> Also should something like this become standard it needs to be properly 
> standardized and implemented as a well-known community by most vendors (like 
> RFCs defining the wide communities or addition to standard communities like 
> no_export/no_advertise/…). This would also eliminate the adoption friction 
> from operators rightly claiming “my AS my rules”.   
>  
> adam
>  
>  
> From: NANOG  On Behalf 
> Of Douglas Fischer via NANOG
> Sent: Tuesday, September 8, 2020 4:56 PM
> To: NANOG 
> Subject: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN 
> reserved to "export-only-to"?'
>  
> Most of us have already used some BGP community policy to no-export some 
> routes to some where.
> 
> On the majority of IXPs, and most of the Transit Providers, the very common 
> community tell to route-servers and routers "Please do no-export these routes 
> to that ASN" is:
> 
>  -> 0:
>  
> So we could say that this is a de-facto standard.
>  
>  
> But the Policy equivalent to "Please, export these routes only to that ASN" 
> is very varied on all the IXPs or Transit Providers.
>  
>  
> With that said, now comes some questions:
> 
> 1 - Beyond being a de-facto standard, there is any RFC, Public Policy, or 
> something like that, that would define 0: as "no-export-to" 
> standard?
>  
> 2 - What about reserving some 16-bits ASN to use : as 
> "export-only-to" standard?
> 2.1 - Is important to be 16 bits, because with (RT) extended communities, any 
> ASN on the planet could be the target of that policy.
> 2.2 - Would be interesting some mnemonic number like 1000 / 1 or so.
>  
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Jeff Tantsura via NANOG
Robert,

This is not about whether you should do it but, should you decide to, how to 
do it in the best possible way, without repeating mistakes someone else has 
already made and learnt from.

Regards,
Jeff

> On Sep 9, 2020, at 11:40, Robert Raszuk  wrote:
> 
> 
> And use of BGP without IGP left and right when even today bunch of DCs can do 
> just fine with current IGPs scaling wise is IMO not a good thing.
> 
> Thx
> R.
> 
>> On Wed, Sep 9, 2020, 10:55 Jeff Tantsura via NANOG  wrote:
>> I don’t think, anyone has proposed to use ‘’reserved ASNs” as a BCP, example 
>> of “ab”use of ASN0 is a de-facto artifact (unfortunate one).
>> My goal would be to provide a viable source of information to someone who is 
>> setting up a new ISP and has a very little clue as where to start. Do’s and 
>> don’t’s wrt inter-domain communities use. 
>> 
>> I really enjoyed the difference RFC7938 (Use of BGP for Routing in 
>> Large-Scale Data Centers) made, literally 100s of companies have used it to 
>> educate themselves/ implemented their DC networking.
>> 
>> Cheers,
>> Jeff
>> 
>>>> On Sep 9, 2020, at 10:04, adam via NANOG  wrote:
>>>> 
>>> 
>>> I don’t agree with the use of reserved ASNs, let alone making it BCP, cause 
>>> it defeats the whole purpose of the community structure.
>>> 
>>> Community is basically sending a message to an AS. If I want your specific 
>>> AS to interpret the message I set it in format YOUR_ASN:, 
>>> your AS in the first part of the community means that your rules of how to 
>>> interpret the community value apply.
>>> 
>>> Turning AS#0 or any other reserved AS# into a “broadcast-AS#” in terms of 
>>> communities (or any other attribute for that matter) just doesn’t sit right 
>>> with me (what’s next? multicast-ASNs that we can subscribe to?).
>>> 
>>> All the examples in Robert’s draft or wide community RFC, all of them use 
>>> an example AS# the community is addressed to (not some special reserved 
>>> AS#).
>>> 
>>>  
>>> 
>>> Also should something like this become standard it needs to be properly 
>>> standardized and implemented as a well-known community by most vendors 
>>> (like RFCs defining the wide communities or addition to standard 
>>> communities like no_export/no_advertise/…). This would also eliminate the 
>>> adoption friction from operators rightly claiming “my AS my rules”.   
>>> 
>>>  
>>> 
>>> adam
>>> 
>>>  
>>> 
>>>  
>>> 
>>> From: NANOG  On 
>>> Behalf Of Douglas Fischer via NANOG
>>> Sent: Tuesday, September 8, 2020 4:56 PM
>>> To: NANOG 
>>> Subject: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN 
>>> reserved to "export-only-to"?'
>>> 
>>>  
>>> 
>>> Most of us have already used some BGP community policy to no-export some 
>>> routes to some where.
>>> 
>>> On the majority of IXPs, and most of the Transit Providers, the very common 
>>> community tell to route-servers and routers "Please do no-export these 
>>> routes to that ASN" is:
>>> 
>>>  -> 0:
>>> 
>>>  
>>> 
>>> So we could say that this is a de-facto standard.
>>> 
>>>  
>>> 
>>>  
>>> 
>>> But the Policy equivalent to "Please, export these routes only to that ASN" 
>>> is very varied on all the IXPs or Transit Providers.
>>> 
>>>  
>>> 
>>>  
>>> 
>>> With that said, now comes some questions:
>>> 
>>> 1 - Beyond being a de-facto standard, there is any RFC, Public Policy, or 
>>> something like that, that would define 0: as "no-export-to" 
>>> standard?
>>> 
>>>  
>>> 
>>> 2 - What about reserving some 16-bits ASN to use : as 
>>> "export-only-to" standard?
>>> 
>>> 2.1 - Is important to be 16 bits, because with (RT) extended communities, 
>>> any ASN on the planet could be the target of that policy.
>>> 
>>> 2.2 - Would be interesting some mnemonic number like 1000 / 1 or so.
>>> 
>>>  
>>> 
>>> --
>>> 
>>> Douglas Fernando Fischer
>>> Engº de Controle e Automação


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Jeff Tantsura via NANOG
Great excuse ;-)

Regards,
Jeff

> On Sep 9, 2020, at 15:16, Mike Hammett via NANOG  wrote:
> 
> 
> If history has taught us anything, everything we do will be ignored by those 
> that most need it.  :-)
> 
> 
> 
> -
> Mike Hammett
> Intelligent Computing Solutions
> http://www.ics-il.com
> 
> Midwest-IX
> http://www.midwest-ix.com
> 
> From: "Mark Tinka" 
> To: "Mike Hammett" 
> Cc: nanog@nanog.org
> Sent: Wednesday, September 9, 2020 7:59:55 AM
> Subject: Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN 
> reserved to "export-only-to"?'
> 
> Well, the proposed de facto standard is only useful for what we need to 
> signal outside of the AS.
> 
> Since an operator will still need to design for communities used internal to 
> the AS (which will have nothing to do with the outside world, and be of a 
> higher number), they can accomplish both tasks in one sitting; in lieu of 
> first designing for internal use, and then trying to design again for the 
> external standard.
> 
> At any rate, as Nick said yesterday, if it's taken us over 2 decades to agree 
> on the well-known communities we have today, perhaps the industry should go 
> ahead and standardize this proposal anyway, and then see what happens. If 
> history has taught us anything, folk will do what they want for 23 or so 
> years, and even then, it might not turn out the way we hoped.
> 
> If it were me, I'd spend my time on other things. I can design internal 
> operator-specific communities that also do the right thing externally, if 
> needed. Heck, it's what I've done already. My customers are happy and I have 
> little incentive to fix that. 
> 
> But that's just me :-).
> 
> Mark.
> 
> On 9/Sep/20 14:47, Mike Hammett wrote:
> Exactly. There are far more pressing things when launching a new network than 
> coming up with a BGP community scheme from scratch, learning everyone else's 
> BGP community scheme, etc. If networks used a standard, then there is a very 
> minimal ramp-up.
> 
> 
> 
> -
> Mike Hammett
> Intelligent Computing Solutions
> http://www.ics-il.com
> 
> Midwest-IX
> http://www.midwest-ix.com
> 
> From: "Mark Tinka" 
> To: "Mike Hammett" 
> Cc: nanog@nanog.org
> Sent: Wednesday, September 9, 2020 6:47:13 AM
> Subject: Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN 
> reserved to "export-only-to"?'
> 
> 
> 
> On 9/Sep/20 13:41, Mike Hammett wrote:
> How is that any different than any other network with minimal connectivity 
> (say a non-ISP such as a school, medium business, local government, etc.)?
> 
> Because the existing flexibility of dis-aggregated BGP community design can 
> be done without any need to be in concert with the rest of the world, and 
> your network won't blow up. There are far more pressing things to consider 
> when launching a new network.
> 
> 
> 
> Also, it would likely help that new ISP in Myanmar learn their limited 
> upstream's communities if there were a standard.
> 
> There used to be a very large global transit network that did not support BGP 
> communities for their customers or peers. I'm not sure if that is still their 
> position in 2020, but back then, it did not stop them from growing quite well.
> 
> Mark.
> 
> 
> 


Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'

2020-09-09 Thread Jeff Tantsura via NANOG
BCP38 is an RFC, 2827.
It is grand advice, if you can:
-find someone who is actually well versed
-afford that someone.

Personally - when in the early 2000s I had to write a complete community 
tagging design for a multi-country network, I wished I had a “how to”.

Regards,
Jeff

> On Sep 9, 2020, at 15:35, adamv0...@netconsultings.com wrote:
> 
> 
> My advice to “someone who is setting up a new ISP and has a very little clue 
> as where to start” would be just don’t and instead hire someone who’s well 
> versed in this topic.
> But I see what you mean, RFC7938 was a good food for thought. But at the same 
> time I’m sceptical, for instance would it help if BCP38 was an RFC?
> Would be nice for instance if the community could put together a checklist of 
> things to consider for ISPs (could be in no particular order) (and actually 
> there are such lists albeit concentrated around security)   
>  
> adam
>  
> From: Jeff Tantsura  
> Sent: Wednesday, September 9, 2020 9:52 AM
> To: adamv0...@netconsultings.com
> Cc: nanog@nanog.org
> Subject: Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN 
> reserved to "export-only-to"?'
>  
> I don’t think, anyone has proposed to use ‘’reserved ASNs” as a BCP, example 
> of “ab”use of ASN0 is a de-facto artifact (unfortunate one).
> My goal would be to provide a viable source of information to someone who is 
> setting up a new ISP and has a very little clue as where to start. Do’s and 
> don’t’s wrt inter-domain communities use. 
>  
> I really enjoyed the difference RFC7938 (Use of BGP for Routing in 
> Large-Scale Data Centers) made, literally 100s of companies have used it to 
> educate themselves/ implemented their DC networking.
>  
> Cheers,
> Jeff
> 
> 
> On Sep 9, 2020, at 10:04, adam via NANOG  wrote:
> 
> 
> I don’t agree with the use of reserved ASNs, let alone making it BCP, cause 
> it defeats the whole purpose of the community structure.
> Community is basically sending a message to an AS. If I want your specific AS 
> to interpret the message I set it in format YOUR_ASN:, your 
> AS in the first part of the community means that your rules of how to 
> interpret the community value apply.
> Turning AS#0 or any other reserved AS# into a “broadcast-AS#” in terms of 
> communities (or any other attribute for that matter) just doesn’t sit right 
> with me (what’s next? multicast-ASNs that we can subscribe to?).
> All the examples in Robert’s draft or wide community RFC, all of them use an 
> example AS# the community is addressed to (not some special reserved AS#).
>  
> Also should something like this become standard it needs to be properly 
> standardized and implemented as a well-known community by most vendors (like 
> RFCs defining the wide communities or addition to standard communities like 
> no_export/no_advertise/…). This would also eliminate the adoption friction 
> from operators rightly claiming “my AS my rules”.   
>  
> adam
>  
>  
> From: NANOG  On Behalf 
> Of Douglas Fischer via NANOG
> Sent: Tuesday, September 8, 2020 4:56 PM
> To: NANOG 
> Subject: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN 
> reserved to "export-only-to"?'
>  
> Most of us have already used some BGP community policy to no-export some 
> routes to some where.
> 
> On the majority of IXPs, and most of the Transit Providers, the very common 
> community tell to route-servers and routers "Please do no-export these routes 
> to that ASN" is:
> 
>  -> 0:
>  
> So we could say that this is a de-facto standard.
>  
>  
> But the Policy equivalent to "Please, export these routes only to that ASN" 
> is very varied on all the IXPs or Transit Providers.
>  
>  
> With that said, now comes some questions:
> 
> 1 - Beyond being a de-facto standard, there is any RFC, Public Policy, or 
> something like that, that would define 0: as "no-export-to" 
> standard?
>  
> 2 - What about reserving some 16-bits ASN to use : as 
> "export-only-to" standard?
> 2.1 - Is important to be 16 bits, because with (RT) extended communities, any 
> ASN on the planet could be the target of that policy.
> 2.2 - Would be interesting some mnemonic number like 1000 / 1 or so.
>  
> --
> Douglas Fernando Fischer
> Engº de Controle e Automação