Re: Networks ignoring prepends?

2024-01-24 Thread Chris Adams
The basic disconnect here is that you seem to think that BGP is to be
used to dictate policy to other networks on how to reach your network.
That is not and has never been the case.

When I learned BGP back in the 1990s, it was explicitly said that you
control your outbound traffic with your BGP policy, but that all you can
do is try to influence the decisions of other networks for your inbound
traffic (using a combination of prepends, communities, and sometimes
other tricks), but sometimes they'll take a path that isn't what you'd
prefer (and you just have to accept that).  Just like your outbound
policy is 100% in your control, so it is with every other network.

We always took that kind of thing into account when choosing where to
buy transit.  When not buying from a "big guy" with a well-connected
nationwide network, we'd check BGP announcements and traceroutes to see
where things went.

-- 
Chris Adams 


Re: Networks ignoring prepends?

2024-01-24 Thread James Jun
On Wed, Jan 24, 2024 at 09:22:06AM -0800, William Herrin wrote:
> On Wed, Jan 24, 2024 at 8:39 AM James Jun  wrote:
> > On Wed, Jan 24, 2024 at 08:16:56AM -0800, William Herrin wrote:
> > > Sophistry. I buy IP transit from 3 providers, one of which has a 3 AS
> > > path to 3356.
> >
> > Again you omit context.
> 
> What you're calling context, I call deceptive.
> 
> For one thing, Centurylink's process is, like a spammer, opt-out
> rather than opt-in. 

Nope. Your allegation that Lumen (Centurylink)'s "process" is opt-out like a
spammer is factually and historically incorrect.  Rather, Lumen's practice is
compliant with best common practices and experiences as documented in RFC 4277
and provided for by RFC 4271.

Lumen/Centurylink's alleged "opt-out spamming" practice predates their very
existence and was established during the NSFNET era, with an operational need at
the time to differentiate commercial networks from R&E networks. Just as R&E
networks needed to treat commercial network traffic differently to meet the
needs of the NSFNET, commercial operators of the Internet are also expected and
required to prioritize traffic from their paying customers over traffic from
non-paying customers.

> 3356 enables the local pref unless told through a
> BGP community not to. There's no evidence that 47787 even knows that
> Centurylink is preferring them despite shorter AS paths elsewhere, let
> alone desires that behavior. Indeed, given the prepends that 47787
> added, it's quite possible they desire the opposite.

The evidence is widely documented and is in best common practices of every 
major ASN exercising routing policy and subsequent RFCs and BCPs published 
concerning discussions herein.  Internet standards and documented widely 
accepted current practices exist for a good reason.  Your, or allegedly 47787's,
possible failure, ignorance, or act of omission in being informed of how
the current practices work does not make you any less responsible for
identifying the problem at hand.  Your allegation and arguments that currently
adopted and documented inter-AS traffic engineering practices are deceptive and 
"opt-out" in a bad-faith nature are simply too tenuous a connection and amount 
to reductio ad absurdum.  You are however welcome to participate in IETF 
process to propose to alter the way BGP practices work for the better, as you 
wish.  That's what's so great about community input-based policy development 
processes.

> 
> For another, a key implication in your "context" is that if one
> customer intentionally pays 3356 to intentionally send another
> customer's packets on a longer, slower trip than 3356 otherwise would,
> that's a legitimate above-board business transaction. Not obviously
> corrupt.

False.  None of the parties described herein, neither 47787 nor 3356, is
liable for "intentionally" sending traffic of another customer on a longer, less
efficient path.  What they are likely liable for, however, are contractual
obligations and commercial expectations of bilateral parties engaged in an 
ongoing transaction.  You fit into the chain of buying from 53356 without 
understanding the underlying infrastructure and connectivity relationships that 
53356 has toward 3356.  And you're now litigating that it's corrupt and is 
possibly some kind of a coordinated scheme or a racket without your consent.  
You gave your consent by agreeing to run BGP with 53356 as your vendor, which 
you awarded that business to, and began advertising your prefix.  It's not 
working the way you want, so engage with your vendor to fix it, or fire them.  
This is not hard.

James


Re: Networks ignoring prepends?

2024-01-24 Thread William Herrin
On Wed, Jan 24, 2024 at 8:39 AM James Jun  wrote:
> On Wed, Jan 24, 2024 at 08:16:56AM -0800, William Herrin wrote:
> > Sophistry. I buy IP transit from 3 providers, one of which has a 3 AS
> > path to 3356.
>
> Again you omit context.

What you're calling context, I call deceptive.

For one thing, Centurylink's process is, like a spammer, opt-out
rather than opt-in. 3356 enables the local pref unless told through a
BGP community not to. There's no evidence that 47787 even knows that
Centurylink is preferring them despite shorter AS paths elsewhere, let
alone desires that behavior. Indeed, given the prepends that 47787
added, it's quite possible they desire the opposite.

For another, a key implication in your "context" is that if one
customer intentionally pays 3356 to intentionally send another
customer's packets on a longer, slower trip than 3356 otherwise would,
that's a legitimate above-board business transaction. Not obviously
corrupt.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-24 Thread James Jun
On Wed, Jan 24, 2024 at 08:16:56AM -0800, William Herrin wrote:
> On Wed, Jan 24, 2024 at 8:11 AM James Jun  wrote:
> > You (AS11875) have an operational need for good connectivity
> > into 3356 but, you made a poor purchasing decision by buying
> > IP transit for 11875 from a provider who has 10-AS path into
> > 3356 instead of <=3 AS path. You've done a _bad_ job here
> > in selecting an inferior pathway into 3356, and what you
> > SHOULD have done is to select an IP transit provider who
> > has an optimal AS-path into 3356 to meet your operational
> > need of having good connectivity into 3356.
> 
> Sophistry. I buy IP transit from 3 providers, one of which has a 3 AS
> path to 3356.

Again you omit context.

We've already established as per the RFC, that calculation of degree of 
preference takes precedence over and overrides AS_PATH (Phase 1 decision).  

Therefore, let's rephrase what you've just said above:

You're buying IP transit from 3 providers, two of which are configured with the 
following known constraints:

- 20473 who buys from 1299, who has lower degree of preference into 3356, as 
1299 and 3356 are interconnection (could be settlement-free or paid-peer) 
peering partners.
- 53356 who buys from 47787 as a prioritized downstream customer, and then 
47787 too subsequently connects into 3356 as a prioritized downstream customer.

It's clear that the 53356 path you've bought has a priority ticket into
3356 no matter how inferior or long its AS_PATH may be, and the solution is 
right in front of you.  Next.
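
A minimal sketch of the preference ordering described above, as seen from 3356. The LOCAL_PREF numbers and the number of prepends are assumptions for illustration only; the thread does not state the actual values any of these networks use.

```python
from dataclasses import dataclass

# Hypothetical LOCAL_PREF values by business relationship; real networks
# choose their own numbers.
LOCAL_PREF_BY_RELATIONSHIP = {"customer": 200, "peer": 100, "provider": 50}


@dataclass(frozen=True)
class Route:
    learned_from: str  # "customer", "peer", or "provider"
    as_path: tuple     # AS numbers, nearest neighbor first


def best_route(routes):
    # Highest LOCAL_PREF wins; AS_PATH length is only consulted on a tie.
    return min(
        routes,
        key=lambda r: (-LOCAL_PREF_BY_RELATIONSHIP[r.learned_from], len(r.as_path)),
    )


# Illustrative paths toward 11875; the prepend count is made up.
via_customer = Route("customer", (47787, 53356, 11875, 11875, 11875))
via_peer = Route("peer", (1299, 20473, 11875))

chosen = best_route([via_customer, via_peer])
print(chosen.learned_from, len(chosen.as_path))  # prints: customer 5
```

The longer, prepended path wins here simply because it was learned from a customer; the shorter peer path would only be used if the customer route disappeared.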

James


Re: Networks ignoring prepends?

2024-01-24 Thread William Herrin
On Wed, Jan 24, 2024 at 8:11 AM James Jun  wrote:
> You (AS11875) have an operational need for good connectivity
> into 3356 but, you made a poor purchasing decision by buying
> IP transit for 11875 from a provider who has 10-AS path into
> 3356 instead of <=3 AS path. You've done a _bad_ job here
> in selecting an inferior pathway into 3356, and what you
> SHOULD have done is to select an IP transit provider who
> has an optimal AS-path into 3356 to meet your operational
> need of having good connectivity into 3356.

Sophistry. I buy IP transit from 3 providers, one of which has a 3 AS
path to 3356.

-Bill


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-24 Thread James Jun
On Wed, Jan 24, 2024 at 07:25:42AM -0800, William Herrin wrote:

[ snip ]

> or I chose my words poorly. What I did say, and stand behind, was that
> applying local prefs moves BGP's route selection off the _defaults_,
> and if Centurylink was routing to me based instead on the defaults
> they'd have made a _good_ route selection instead of a _bad_ one.

This cuts both ways, Bill.  First, 3356 is making an intended route selection;
their customer who interconnects directly into 3356 demands this.  That
customer who connects into 3356 probably had no idea that you (AS11875) would 
someday decide to take IP transit from a downstream AS of them, and your 
situation was likely never in their minds of consideration in their network 
planning.

_You_ want better connectivity from 3356 to 11875 for the explicit benefit of 
11875, which _you_ operate and control.  That's good, so let's continue.


> 
> I do care whether you're routing packets in a reasonable way. When you
> pick the 10-AS path over the 3-AS path because the 10-AS path arrives
> from a customer, the odds that you're routing those packets in a
> _good_ way are very low. I get that a lot of you do that. I'm telling
> you that when you do, you're doing a _bad_ job. If you think you're
> justified, well, it's your business. But don't doubt for a second that
> you've served your customers poorly.

Conversely at the same time, the below is also equally true:

You (AS11875) have an operational need for good connectivity into 3356 but, you 
made a poor purchasing decision by buying IP transit for 11875 from a provider 
who has 10-AS path into 3356 instead of <=3 AS path.  You've done a _bad_ job 
here in selecting an inferior pathway into 3356, and what you SHOULD have done 
is to select an IP transit provider who has an optimal AS-path into 3356 to 
meet your operational need of having good connectivity into 3356.


> And before you suggest that I'm not your customer, let me point out
> what should be obvious: if none of your paying customers were trying
> to reach my network, I wouldn't notice which direction you routed my
> packets, let alone care. It's not about serving me, it's about serving
> your paying customers. My packets are their packets, and when you send
> _their_ packets along the scenic route, you have done a bad job.

We can do this all day long.  You (AS11875) also have the responsibility to
yourself and your end-users to select and award business to an IP transit
provider and make every reasonable effort to ensure that 11875 has good
connectivity into 3356 as your operational needs require.  You've abrogated
that responsibility in your own AS and decided to spew nonsense over the most
critical knob in BGP-4 (LOCAL_PREF), which is more important than AS_PATH and
has been developed since the NSFNET days, and are telling us that we're doing
a poor job.  Your argument fails.

The internet works upon the principle of "best-effort."  What you're describing 
is the net effect of that "best-effort", and you, as the operator and 
controller of AS11875 which is involved in the path are just as culpable and 
responsible.  Moreover, you, by being the operator of an AS in the problematic 
path, have the wherewithal and commercial ability to fix it, without involving 
the rest of us.  The answer is right in front of you.

James


Re: Networks ignoring prepends?

2024-01-24 Thread William Herrin
On Wed, Jan 24, 2024 at 5:23 AM Chris Adams  wrote:
> Once upon a time, William Herrin  said:
> > On Tue, Jan 23, 2024 at 4:00 PM Chris Adams  wrote:
> > > Once upon a time, William Herrin  said:
> > > > Nevertheless, in the protocol's design, the one expressed in the
> > > > RFC's, AS path length = distance.
> > >
> > > The RFC doesn't make any equivalence between AS path length and
> > > distance.  You are the one trying to make that equivalence,
> >
> > Respectfully Chris, you are mistaken.
> >
> > https://datatracker.ietf.org/doc/html/rfc4271#section-9.1.2.2
> >
> > "a) Remove from consideration all routes that are not tied for having
> > the smallest number of AS numbers present in their AS_PATH
> > attributes."
> >
> > So literally, the first thing BGP does when picking the best next hop
> > is to discard all but the routes with the shortest AS path.
>
> That's literally not the first thing - you skipped section 9.1.1.

Phase 1 is local pref. That's what 9.1.1 says. As implied by the word
"local," it's set locally by the local operator, not by the origin,
though many providers offer haphazard mechanisms that sometimes have
some impact if the origin doesn't mind playing whack-a-mole with BGP
communities.

Unless locally configured to selectively change the local pref off the
default, all routes have the same local pref. So it moves to phase 2
(section 9.1.2). This matches what I've been saying for the entire
thread: unless the operator intentionally makes the route worse, it
follows the shortest AS path. Per the RFC.
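
A minimal sketch of that two-step behavior, assuming a default LOCAL_PREF of 100 and using documentation-range AS numbers; the function and route names are hypothetical.

```python
DEFAULT_LOCAL_PREF = 100  # assumed default; not taken from the RFC text


def select(routes):
    """routes: list of (name, local_pref, as_path). Returns the winning name."""
    top = max(local_pref for _, local_pref, _ in routes)
    tied = [r for r in routes if r[1] == top]      # Phase 1: degree of preference
    return min(tied, key=lambda r: len(r[2]))[0]   # Phase 2, step (a): shortest AS_PATH


short = ("short", DEFAULT_LOCAL_PREF, [64501, 64510])
prepended = ("prepended", DEFAULT_LOCAL_PREF, [64502, 64510, 64510, 64510])

# No local policy: both routes tie in phase 1, so the prepends take effect.
print(select([short, prepended]))  # prints: short

# A local policy raises LOCAL_PREF on the prepended route: path length
# is never compared, because only one route survives phase 1.
raised = ("prepended", 200, prepended[2])
print(select([short, raised]))     # prints: prepended
```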


> It also literally says nothing about distance.

BGP is a distance-vector protocol. BGP's authors preferred different
terminology so they used different terminology. Nevertheless, BGP is a
distance-vector protocol and when you ask what it uses to determine
distance, the answer is the AS path length because all the other
criteria are policy functions not distance functions.

Want to go another few rounds with pedantry over word choice, or can
we leave it there?

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-24 Thread Owen DeLong via NANOG
> 
> When you twist a policy knob to move BGP off its defaults, you take
> responsibility for making a better routing choice. And for correcting
> that choice if it should prove faulty. What I've seen here in this
> thread is a bunch of folks abdicating that responsibility. That's not
> unexpected, but it is disappointing.

Better is in the eye of the beholder. From your perspective, better is the
lowest latency. From almost any ISP's perspective, better is the revenue
positive path, followed by the revenue neutral path, with last choice being the
revenue negative path.

From 3356's perspective, they ARE choosing the best route… the route that pays
them.

Owen




Re: Networks ignoring prepends?

2024-01-24 Thread William Herrin
On Wed, Jan 24, 2024 at 7:02 AM Jon Lewis  wrote:
> In one of his messages, William complained that the big bad networks are
> breaking the BGP rules by ignoring as-path length.

To be clear, I don't really care whether you're "breaking the rules."
Moreover, if my words suggested that I thought using BGP's local pref
capability was "breaking the rules," then either you misunderstood me
or I chose my words poorly. What I did say, and stand behind, was that
applying local prefs moves BGP's route selection off the _defaults_,
and if Centurylink was routing to me based instead on the defaults
they'd have made a _good_ route selection instead of a _bad_ one.

I do care whether you're routing packets in a reasonable way. When you
pick the 10-AS path over the 3-AS path because the 10-AS path arrives
from a customer, the odds that you're routing those packets in a
_good_ way are very low. I get that a lot of you do that. I'm telling
you that when you do, you're doing a _bad_ job. If you think you're
justified, well, it's your business. But don't doubt for a second that
you've served your customers poorly.

And before you suggest that I'm not your customer, let me point out
what should be obvious: if none of your paying customers were trying
to reach my network, I wouldn't notice which direction you routed my
packets, let alone care. It's not about serving me, it's about serving
your paying customers. My packets are their packets, and when you send
_their_ packets along the scenic route, you have done a bad job.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-24 Thread Jon Lewis

On Wed, 24 Jan 2024, Jay R. Ashworth wrote:

> - Original Message -
>> From: "Jon Lewis"
>>
>> On Mon, 22 Jan 2024, William Herrin wrote:
>>> It gives me, your paying customer, less control over my routing
>>> through your network than if I wasn't your paying customer. That
>>> seems... backwards.
>>
>> Not at all.  Think like a service provider.
>>
>> "I've got packets to deliver.  I've got 3 different classes of paths I can
>> use.  One of them, I get paid to use.  One is cost neutral.  The last one,
>> I pay to use."
>>
>> Which path would you pick (assuming you're trying to maximize revenue
>> from your network)?
>
> And here, you nail it, Jon:
>
> The Internet stopped being an engineering construct many years ago, to its--and
> our--detriment; things work much more poorly, and harder to understand and
> diagnose and fix, because of this.
>
> His example, of packets going from Miami to Ft Lauderdale via One Wilshire,
> is a classic example.

It can be a whole lot worse.  At a previous job, running an anycast CDN, 
we had POPs originating the same prefixes all over the world.  Cogent was 
one of our transit providers in most POPs (i.e. all the POPs in North 
America and Europe).


Toward the end of my time there, Cogent started making some progress 
breaking into the transit market in Asia.  So, we saw some eyeball 
networks in Asia hitting our anycast IPs via Cogent.  Trouble was, the 
established "tier 1's" in Asia wouldn't peer with Cogent in Asia (for 
business reasons - i.e. they didn't want Cogent coming into their market 
and upsetting their apple carts).  Our Asian POPs had lots of peering (IX 
and private) and transit from established Asian tier 1's.  So this traffic 
from Cogent's Asian customers would land in our LA and San Jose POPs.  As 
you can imagine, the RTT from an eyeball in Tokyo is "a bit higher" when 
talking to our LA POP vs our Tokyo POP.  Cogent has some BGP community 
controls available, but nothing that says "keep this route in-region". 
IIRC, the closest to it they had was lower localpref when sharing with 
region X.  Lowering localpref doesn't matter if region X has no path other 
than the one received from an out-of-region customer session.  Our options 
were "stop advertising anycast to Cogent globally" or "connect to Cogent 
in Asia so we can serve that traffic locally from our Asian POPs."
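
A minimal sketch of why lowering localpref did not help, assuming a region simply picks the candidate with the highest localpref. The values 70 and 100 and the session names are made up for illustration and are not Cogent's actual settings.

```python
def pick(candidates):
    """candidates: dict of path name -> localpref. Highest value wins."""
    return max(candidates, key=candidates.get) if candidates else None


# The only route the Asian region has is the one learned from the
# out-of-region customer session, with localpref lowered by community.
asia_view = {"customer session in LA": 70}
print(pick(asia_view))  # prints: customer session in LA

# Connecting in-region adds a second candidate, and only then does the
# lowered localpref on the LA route change the outcome.
asia_view["customer session in Tokyo"] = 100
print(pick(asia_view))  # prints: customer session in Tokyo
```

A lowered preference only matters when there is something else to prefer, which is why the remaining options were to withdraw the route or to add an in-region session.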


In one of his messages, William complained that the big bad networks are 
breaking the BGP rules by ignoring as-path length.  That's nonsense.  If 
you look at the BGP best path decision algorithm, there are several 
attributes considered before as-path length.  Localpref is one of 
them...and since most networks exist to make money, it's standard practice 
to use localpref to make sure you route traffic economically rather than 
efficiently (via the shortest as-path, which may still not be the shortest 
actual path).  For traffic you care about, obviously there's a balance 
between cost and performance.  If you've made poor/cheap choices in your 
transit providers, nobody cares that your traffic takes the scenic route. 
At least not the networks carrying your traffic that you're not directly 
paying...and you're likely to find, as above, even when you are directly 
paying, their interests are likely to outweigh yours.


--
 Jon Lewis, MCP :)  |  I route
 Blue Stream Fiber, Sr. Neteng  |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: Networks ignoring prepends?

2024-01-24 Thread Chris Adams
Once upon a time, William Herrin  said:
> On Tue, Jan 23, 2024 at 4:00 PM Chris Adams  wrote:
> > Once upon a time, William Herrin  said:
> > > Nevertheless, in the protocol's design, the one expressed in the
> > > RFC's, AS path length = distance.
> >
> > The RFC doesn't make any equivalence between AS path length and
> > distance.  You are the one trying to make that equivalence,
> 
> Respectfully Chris, you are mistaken.
> 
> https://datatracker.ietf.org/doc/html/rfc4271#section-9.1.2.2
> 
> "a) Remove from consideration all routes that are not tied for having
> the smallest number of AS numbers present in their AS_PATH
> attributes."
> 
> So literally, the first thing BGP does when picking the best next hop
> is to discard all but the routes with the shortest AS path.

That's literally not the first thing - you skipped section 9.1.1.

It also literally says nothing about distance.

-- 
Chris Adams 


Re: Networks ignoring prepends?

2024-01-24 Thread James Jun
On Tue, Jan 23, 2024 at 10:12:33PM -0800, William Herrin wrote:
> Respectfully Chris, you are mistaken.
> 
> https://datatracker.ietf.org/doc/html/rfc4271#section-9.1.2.2
> 
> "a) Remove from consideration all routes that are not tied for having
> the smallest number of AS numbers present in their AS_PATH
> attributes."
> 
> So literally, the first thing BGP does when picking the best next hop
> is to discard all but the routes with the shortest AS path.

Not true.  Read the whole RFC--you've omitted Sections 9.1 and 9.1.1, which
are very critical.

Discarding all but the routes with shortest AS path is _not_ literally the 
first thing BGP does as you stated above.

The first thing BGP does is to calculate the degree of preference whenever BGP
receives a new route, withdrawn route or replacement route (See Section 9.1.1).
The determination of the degree of preference is considered to be a local
matter for each Autonomous System exercising route policy, typically expressed
using LOCAL_PREF, to execute upon the configured administrative policy to
classify the incoming routes.

After completion of 9.1.1, Sections 9.1.2 and 9.1.2.2, which you cited, begin
(Phase 2: Route Selection).  Route selection under 9.1.2 is only invoked after
the degree of preference is determined (called the 'Phase 1' decision), as
clearly described in Section 9.1.

In fact, even in 9.1.2.2 that you cited above, it clearly states:

   In its Adj-RIBs-In, a BGP speaker may have several routes to the same
   destination that have the same degree of preference. 

   [ snip ]

   The following tie-breaking procedure assumes that, for each candidate
   route, all the BGP speakers within an autonomous system can ascertain
   the cost of a path (interior distance) to the address depicted by the
   NEXT_HOP attribute of the route, and follow the same route selection
   algorithm.

   The tie-breaking algorithm begins by considering all equally
   preferable routes to the same destination, and then selects routes to
   be removed from consideration.  The algorithm terminates as soon as
   only one route remains in consideration.  The criteria MUST be
   applied in the order specified.

   [ snip ]

      a) Remove from consideration all routes that are not tied for
         having the smallest number of AS numbers present in their
         AS_PATH attributes.  Note that when counting this number, an
         AS_SET counts as 1, no matter how many ASes are in the set.
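
The counting rule in step (a) can be sketched as follows. The segment representation is a hypothetical simplification, not a wire-format parser, and the AS numbers are from the documentation range.

```python
def as_path_length(segments):
    """Count AS numbers as step (a) describes.

    segments: list of (kind, asns) where kind is "AS_SEQUENCE" or "AS_SET".
    """
    total = 0
    for kind, asns in segments:
        if kind == "AS_SET":
            total += 1              # a whole set counts as 1
        elif kind == "AS_SEQUENCE":
            total += len(asns)      # every entry counts, including prepends
        else:
            raise ValueError(f"unknown segment type: {kind}")
    return total


path = [
    ("AS_SEQUENCE", [64500, 64501, 64501, 64501]),  # 64501 prepended twice
    ("AS_SET", {64502, 64503, 64504}),
]
print(as_path_length(path))  # prints: 5
```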



So you see, the comparison of AS_PATH, and therefore the route selection
process, can only begin after routes are first resolved by their degree of
preference, typically exercised by LOCAL_PREF across the AS (or other similar
import knobs, such as Cisco's "weight" parameter, which is applied before
LOCAL_PREF and is locally significant to the router where it's been
configured).  The route selection process, including the elimination of routes
with inferior AS paths, is a tie-breaker algorithm that runs after the degree
of preference is first calculated, which is what we've been trying to tell
you.  So no, AS_PATH comparison is not literally the first thing BGP does.

You're ignoring Section 9.1.1 in its entirety, which chronologically begins 
before Section 9.1.2.2 (the section you cited), which also clearly specifies 
that route selection process described in it (including AS_PATH comparison) is 
a tie-breaking procedure. 
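
The ordering argued above can be sketched as a filter on degree of preference followed by tie-breakers applied strictly in order, stopping as soon as one route remains. This is a simplified subset under stated assumptions: only AS path length, origin, and a stand-in for the BGP Identifier are modeled, and MED, eBGP versus iBGP, and interior cost are left out. The Route fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    local_pref: int   # degree of preference; higher is preferred
    as_path_len: int  # lower is preferred
    origin: int       # 0 = IGP, 1 = EGP, 2 = INCOMPLETE; lower is preferred
    router_id: int    # stand-in for the BGP Identifier; lower is preferred


# Simplified subset of the tie-breakers, applied strictly in this order.
TIE_BREAKERS = [
    lambda r: r.as_path_len,
    lambda r: r.origin,
    lambda r: r.router_id,
]


def decide(routes):
    # Degree of preference first: keep only the most preferred routes.
    top = max(r.local_pref for r in routes)
    candidates = [r for r in routes if r.local_pref == top]
    # Tie-breaking: terminate as soon as only one route remains.
    for key in TIE_BREAKERS:
        if len(candidates) == 1:
            break
        best = min(key(r) for r in candidates)
        candidates = [r for r in candidates if key(r) == best]
    return candidates[0]


a = Route("a", local_pref=100, as_path_len=2, origin=0, router_id=1)
b = Route("b", local_pref=200, as_path_len=6, origin=2, router_id=9)
c = Route("c", local_pref=200, as_path_len=6, origin=0, router_id=5)
print(decide([a, b, c]).name)  # prints: c
```

Route "a" has the shortest AS path but never reaches the AS path comparison, because it is eliminated on degree of preference before tie-breaking starts.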


> 
> It also says that BGP implementations are -allowed- to use other
> selection criteria.


Further followed by the following clause immediately afterwards: 
  "BGP implementations MAY use any algorithm that produces the __same results__ 
as those described here."

And restricted by the following clause in the preceding paragraph:
  "The criteria MUST be applied in the order specified."

And clarified by Section 9.1:
  "as long as the implementations support the described functionality and they 
exhibit the same externally visible behavior."


> And there are many situations where doing so is
> well advised and improves the result. But AS path length is
> unambiguously the default, off which a user has to move it.


So, when a BGP implementation is written into router software, how does the
manufacturer know whether your network is going to need to apply a lot of
degrees of preference, or none?  The vendors have no idea, and the RFC also
clarifies that degree of preference is a local policy matter.  Therefore, the
default behavior is to assume a universally identical LOCAL_PREF until a policy
is configured, which has typically been '100' across many vendor
implementations.  In this instance, since all routes have the same degree of
preference of 100, Section 9.1.2.2, which you cited, then begins to tie-break
the routes of the same preference, starting with the AS_PATH comparison, but it
is by no means the first thing BGP does.  The first thing BGP does, as clearly
specified in the RFC, is to determine the degree of preference to meet local
routing policy.

Re: Networks ignoring prepends?

2024-01-24 Thread Robert Raszuk
Bill,


> https://datatracker.ietf.org/doc/html/rfc4271#section-9.1.2.2
>
> "a) Remove from consideration all routes that are not tied for having
> the smallest number of AS numbers present in their AS_PATH
> attributes."
>
> So literally, the first thing BGP does when picking the best next hop
> is to discard all but the routes with the shortest AS path.


Not really. I have never seen a BGP implementation which would do that.
That section 9 you are referring to is just informational - no specific
order in there is mandated.

Shortest AS-PATH is used as step 4 or 5 in best path selection - not to
mention Cost Communities, which the links below do not even consider:

https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html

https://www.juniper.net/documentation/us/en/software/junos/vpn-l2/bgp/topics/concept/routing-protocols-address-representation.html

Thx,
R.


Re: Networks ignoring prepends?

2024-01-24 Thread Robert Raszuk
All,

> But that ship seems to have sailed.

The problem is well known and it consists of two orthogonal aspects:

#1  - Ability to signal the preference of which return path to choose by
arbitrary remote ASN

#2 - Actually applying this preference by remote ASN.

For #1 I have proposed some time back a new set of well known wide
communities defined in section 2.2.4 of this draft:
https://datatracker.ietf.org/doc/html/draft-ietf-idr-registered-wide-bgp-communities-02#section-2.2.4

Perhaps one day this will surface such that operators will be able to
signal their preference without extending AS-PATH or trashing the table
with more specifics.

For #2 it is quite likely that the economic aspect plays a role here. So
it could be that accepting such a preference may not be for free. But
before that happens BGP for obvious reasons should be secured and updates
should be signed. And we all know how fast that is going to happen.

Kind regards,
Robert



On Wed, Jan 24, 2024 at 5:38 AM Darrel Lewis  wrote:

>
>
> > On Jan 22, 2024, at 6:53 PM, Jeff Behrns via NANOG 
> wrote:
> >
> >>> William Herrin  wrote:
> > Until they tamper with it using localpref, BGP's default behavior with
> prepends does exactly the right thing, at least in my situation.
> >
> > I feel your pain Bill, but from a slightly different angle.  For years
> the large CDNs have been disregarding prepends.  When a source AS
> disregards BGP best path selection rules, it sets off a chain reaction of
> silliness not attributable to the transit AS's.  At the terminus of that
> chain are destination / eyeball AS's now compelled to do undesirable things
> out of necessity such as:
> >  1) Advertise specifics towards select peers - i.e. inconsistent edge
> routing policy & littering global table
> >  2) Continuing to prepend a ridiculous amount anyway
> > Gotta wonder how things would be if everyone just abided by the rules.
> >
>
> One might argue that the global routing system should allow for sites to
> signal their ingress traffic engineering preferences to remote sites in
> ways other than bloating the global routing table.  But that ship seems to
> have sailed.
>
> Regards,
>
> -Darrel
>
>
>


Re: Networks ignoring prepends?

2024-01-24 Thread William Herrin
On Wed, Jan 24, 2024 at 12:55 AM Owen DeLong  wrote:
> BGP is more of a PDVP (Policy Distance Vector Protocol).

Hi Owen,

That's a distinction without a difference. All but the most
rudimentary implementation of a distance-vector protocol supports
policy definition and enforcement. BGP has more policy knobs than
most, but at its heart it's still a distance-vector protocol and until
pushed off its default settings its first differentiator for distance
is the length of the AS path.

Only link-state protocols tend to lack policy knobs since all nodes
must agree about the correct full path, not just the next closest hop.

When you twist a policy knob to move BGP off its defaults, you take
responsibility for making a better routing choice. And for correcting
that choice if it should prove faulty. What I've seen here in this
thread is a bunch of folks abdicating that responsibility. That's not
unexpected, but it is disappointing.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-24 Thread Owen DeLong via NANOG
BGP is more of a PDVP (Policy Distance Vector Protocol). Policy will always 
override Distance in BGP and is pretty much the key difference between an EGP 
and an IGP. 

Once you recognize that, the rest makes much more sense. 

Owen


> On Jan 23, 2024, at 14:29, William Herrin  wrote:
> 
> On Tue, Jan 23, 2024 at 12:34 PM Niels Bakker  wrote:
>> BGP, while a distance vector protocol, famously does not take
>> latency into account when making routing decisions.
> 
> Unless overridden, BGP takes -distance- into account where distance =
> AS path length.
> 
> Centurylink has overridden that with a localpref so that it DOES NOT
> take distance into account. Which rather defeats the function of a
> distance vector protocol.
> 
> Regards,
> Bill Herrin
> 
> 
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/



Re: Networks ignoring prepends?

2024-01-23 Thread William Herrin
On Tue, Jan 23, 2024 at 4:00 PM Chris Adams  wrote:
> Once upon a time, William Herrin  said:
> > Nevertheless, in the protocol's design, the one expressed in the
> > RFC's, AS path length = distance.
>
> The RFC doesn't make any equivalence between AS path length and
> distance.  You are the one trying to make that equivalence,

Respectfully Chris, you are mistaken.

https://datatracker.ietf.org/doc/html/rfc4271#section-9.1.2.2

"a) Remove from consideration all routes that are not tied for having
the smallest number of AS numbers present in their AS_PATH
attributes."

So literally, the first thing BGP does when picking the best next hop
is to discard all but the routes with the shortest AS path.

It also says that BGP implementations are -allowed- to use other
selection criteria. And there are many situations where doing so is
well advised and improves the result. But AS path length is
unambiguously the default, off which a user has to move it.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-23 Thread Darrel Lewis



> On Jan 22, 2024, at 6:53 PM, Jeff Behrns via NANOG  wrote:
> 
>>> William Herrin  wrote:
> Until they tamper with it using localpref, BGP's default behavior with 
> prepends does exactly the right thing, at least in my situation.
> 
> I feel your pain Bill, but from a slightly different angle.  For years the 
> large CDNs have been disregarding prepends.  When a source AS disregards BGP 
> best path selection rules, it sets off a chain reaction of silliness not 
> attributable to the transit AS's.  At the terminus of that chain are 
> destination / eyeball AS's now compelled to do undesirable things out of 
> necessity such as:
>  1) Advertise specifics towards select peers - i.e. inconsistent edge routing 
> policy & littering global table
>  2) Continuing to prepend a ridiculous amount anyway
> Gotta wonder how things would be if everyone just abided by the rules.
> 

One might argue that the global routing system should allow for sites to signal 
their ingress traffic engineering preferences to remote sites in ways other 
than bloating the global routing table.  But that ship seems to have sailed.

Regards,

-Darrel




Re: Networks ignoring prepends?

2024-01-23 Thread Jay R. Ashworth
- Original Message -
> From: "Jon Lewis" 

> On Mon, 22 Jan 2024, William Herrin wrote:
>> It gives me, your paying customer, less control over my routing
>> through your network than if I wasn't your paying customer. That
>> seems... backwards.
> 
> Not at all.  Think like a service provider.
> 
> "I've got packets to deliver.  I've got 3 different classes of paths I can
> use.  One of them, I get paid to use.  One is cost neutral.  The last one,
> I pay to use."
> 
> Which path would you pick (assuming you're trying to maximize revenue
> from your network)?

And here, you nail it, Jon:

The Internet stopped being an engineering construct many years ago, to its--and
our--detriment; things work much more poorly, and are harder to understand,
diagnose, and fix, because of this.

His example, of packets going from Miami to Ft Lauderdale via One Wilshire, 
is a classic example.

Cheers,
-- jra

-- 
Jay R. Ashworth  Baylink   j...@baylink.com
Designer The Things I Think   RFC 2100
Ashworth & Associates   http://www.bcp38.info  2000 Land Rover DII
St Petersburg FL USA  BCP38: Ask For It By Name!   +1 727 647 1274


Re: Networks ignoring prepends?

2024-01-23 Thread James Jun
William Herrin wrote:
> Nevertheless, in the protocol's design, the one expressed in the RFC's, AS 
> path length = distance.

Since we're opening RFCs now, and somehow it is being opined that LOCAL_PREF is 
a profit-driven conspiracy and a coordinated scheme concocted by commercial 
networks to tamper with, or "override" AS_PATH desires of the majority, let us 
review the facts about what LOCAL_PREF actually does and why it was implemented 
in BGP in the first place:

RFC 4277 entitled "Experience with the BGP-4 Protocol", Section 20:

   The NSFNET program used EGP, and then BGP, to provide external
   routing information.  It was the NSF policy of offering different
   prices and providing different levels of support to the Research and
   Education (RE) and the Commercial (CO) networks that led to BGP's
   initial policy requirements.  In addition to being charged more, CO
   networks were not able to use the NSFNET backbone to reach other CO
   networks.  The rationale for higher prices was that commercial users
   of the NSFNET within the business and research entities should
   subsidize the RE community.  Recognition that the Internet was
   evolving away from a hierarchical network to a mesh of peers led to
   changes away from EGP and BGP-1 that eliminated any assumptions of
   hierarchy.

   Enforcement of NSF policy was accomplished through maintenance of the
   NSF Policy Routing Database (PRDB).  The PRDB not only contained each
   networks designation as CO or RE, but also contained a list of the
   preferred exit points to the NSFNET to reach each network.  This was
   the basis for setting what would later be called BGP LOCAL_PREF on
   the NSFNET.  Tools provided with the PRDB generated complete router
   configurations for the NSFNET.


RFC 4271 entitled "A Border Gateway Protocol 4 (BGP-4)" (supersedes RFC 1771), 
Section 5.1.5:

   A BGP speaker SHALL calculate the degree of preference for
   each external route based on the locally-configured policy, and
   include the degree of preference when advertising a route to its
   internal peers.  The higher degree of preference MUST be preferred.
   A BGP speaker uses the degree of preference learned via LOCAL_PREF in
   its Decision Process (see Section 9.1.1).
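
As an illustration of the ordering described in the two quoted sections, here
is a minimal Python sketch: the degree of preference (LOCAL_PREF) is compared
first, and AS_PATH length only breaks ties among routes that share the highest
preference. The AS paths are the ones quoted elsewhere in this thread. The
LOCAL_PREF values (100 for a customer route, 80 for a peer route) are
assumptions for illustration, not 3356's published configuration, and `Route`
and `best_route` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    as_path: tuple      # AS numbers as received, prepends included
    local_pref: int     # assigned by the receiving network's own policy


def best_route(routes):
    # Highest LOCAL_PREF wins first; AS_PATH length only breaks ties.
    # Later tie-breakers (origin, MED, ...) are omitted from this sketch.
    return max(routes, key=lambda r: (r.local_pref, -len(r.as_path)))


# Paths as received by 3356 (its own AS number is not in the received path).
# LOCAL_PREF 100 and 80 are assumed values, not taken from the thread.
via_customer = Route((47787, 47787, 47787, 47787, 53356, 11875, 11875, 11875), 100)
via_peer = Route((1299, 20473, 11875), 80)

chosen = best_route([via_customer, via_peer])
print(chosen.as_path[0])  # 47787
```

With these assumed values the eight-hop customer path wins over the three-hop
peer path, because the length comparison is never reached.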



It is clear by the experiences of NSFnet and early days of the Internet, that 
AS_PATH alone is insufficient to meet interconnection policy objectives.  In 
fact, this LOCAL_PREF "conspiracy" was actually concocted by Research and 
Education (R&E) networks to make evil commercial networks pay--but in reality, 
NSFnet and early R&E networks had actual operational and demonstrated reasons 
for this, and a path vector routing protocol where cross-border interconnection 
policies must be applied cannot simply rely on AS_PATH as its decision mechanism.  
Otherwise, it'd have been easier to just scale up RIP into a global routing 
protocol instead of using BGP.  

This is where your argument and basis of your claim fails-- a parameter to 
express administrative policy preference was required even in early days of 
NSFnet, and that is why LOCAL_PREF was put in there in the first place, despite 
your assertions claiming it is broken and being used to "override" AS_PATH to 
small guys for bad faith reasons.  This was not some later "add-on" for 
conspiracy by commercial networks; LOCAL_PREF in fact, was one of the principal 
features and reasons for developing BGP-4.  You're 29 years late to this 
conversation buddy.



> > 4. Get yourself connected to 3356 directly.
> 
> I am, just not as a BGP customer. And I won't be as a BGP customer.
> Opening a ticket with them has not yielded results. Or any response
> from network engineering at all. Just the frontline support who wants
> me to reboot my modem. :(

I get that you are not in the position to buy from 3356, and to that extent, 
that is a completely respectable and reasonable position (commercial reasons, 
personal experience/preference or otherwise, you are the customer here).  But 
you have a voice as a customer on which BGP transit provider you're purchasing 
on the other end (the far-end location or data center where your ASN is 
operating and taking transit from) -- take it as a lesson learned going 
forward:  when choosing a smaller/nimble or blended bandwidth IP provider, make 
sure to ask, what can the provider do to help you achieve better 
connectivity into 3356 or any other network you're trying to get to?  It's your 
transit provider's business to make sure your ASN's connectivity works to your 
expectations.  Otherwise why would you, the customer, choose to do business 
with a middle-man when you could just buy direct from 3356 at the data center 
for your ASN instead?  It is incumbent upon your IP transit provider to help 
you better meet your connectivity requirements (especially for retail and small 
traffic customers in data centers like yourself who are not subject to capacity 
or commercial interconnection dispute

Re: Networks ignoring prepends?

2024-01-23 Thread Majdi S. Abbas
On Tue, Jan 23, 2024 at 03:37:25PM -0800, William Herrin wrote:
> Nevertheless, in the protocol's design, the one expressed in the
> RFC's, AS path length = distance.

Bill,

The protocol was also developed at a time when everyone
utilized the same transit provider, and all other ASes were 
regional or local in scope.

Still, I'm not sure your assertion is true.  There are
senior network engineers on this list who weren't even alive 
when 1105 was published, and express contemplation of AS path
as a tiebreaker doesn't come into it until 1164:

"1. An AS can minimize the number of transit ASs.  
(Shorter AS paths can be preferred over longer ones.)"

Note the can...hardly a MUST, or a SHOULD.  AS hop
count was never intended as a large hammer, and it has never
been one in practice, since most people are making their
decisions based on local preference, which for the last couple
of decades is typically set based on internal community tagging.

--msa


Re: Networks ignoring prepends?

2024-01-23 Thread Chris Adams
Once upon a time, William Herrin  said:
> Nevertheless, in the protocol's design, the one expressed in the
> RFC's, AS path length = distance.

The RFC doesn't make any equivalence between AS path length and
distance.  You are the one trying to make that equivalence, but that's
not how BGP is used on the Internet.  You're about 30 years too late to
have any influence on that.

-- 
Chris Adams 


Re: Networks ignoring prepends?

2024-01-23 Thread William Herrin
On Tue, Jan 23, 2024 at 3:27 PM Tom Beecher  wrote:
>> Unless overridden, BGP takes -distance- into account where distance =
>> AS path length.
>
> An AS_PATH length of 10 could be a physical distance of 1 mile.
>
> An AS_PATH length of 1 could be a physical distance of 1000 miles.

Nevertheless, in the protocol's design, the one expressed in the
RFC's, AS path length = distance.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-23 Thread Tom Beecher
>
> Unless overridden, BGP takes -distance- into account where distance =
> AS path length.
>

An AS_PATH length of 10 could be a physical distance of 1 mile.

An AS_PATH length of 1 could be a physical distance of 1000 miles.

BGP TE communities exist to provide signalling in the event that the
standards implemented by a provider don't align with the desires of an ASN.
They are certainly imperfect, but they are a very useful tool in the
toolbox that can solve problems exactly as you are experiencing.

If you chose not to even attempt to use them, for whatever your reasons may
be, I guess that's all there is to say at this point.


On Tue, Jan 23, 2024 at 5:29 PM William Herrin  wrote:

> On Tue, Jan 23, 2024 at 12:34 PM Niels Bakker 
> wrote:
> > BGP, while a distance vector protocol, famously does not take
> > latency into account when making routing decisions.
>
> Unless overridden, BGP takes -distance- into account where distance =
> AS path length.
>
> Centurylink has overridden that with a localpref so that it DOES NOT
> take distance into account. Which rather defeats the function of a
> distance vector protocol.
>
> Regards,
> Bill Herrin
>
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: Networks ignoring prepends?

2024-01-23 Thread William Herrin
On Tue, Jan 23, 2024 at 12:34 PM Niels Bakker  wrote:
> BGP, while a distance vector protocol, famously does not take
> latency into account when making routing decisions.

Unless overridden, BGP takes -distance- into account where distance =
AS path length.

Centurylink has overridden that with a localpref so that it DOES NOT
take distance into account. Which rather defeats the function of a
distance vector protocol.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-23 Thread William Herrin
On Tue, Jan 23, 2024 at 12:38 PM Tom Beecher  wrote:
> 1. Experiment with 53356's TE communities to prevent them from announcing to 
> upstreams that give you poor performance to 3356.

Respectfully, I rejected that approach because it doesn't address the
other few hundred instances of this problem, nor even resolves the
current issue since Centurylink is demonstrated to then switch to yet
another customer via a different one of my upstreams that would
require yet another community, if there is one.

> 2. See if 47787 will talk to you about their path to 3356.

Haha. You're funny.

> 3. Pick an upstream that has better / more direct connectivity to 3356, use 
> them instead of /in parallel with 53356.

Haha. You're funny.

> 4. Get yourself connected to 3356 directly.

I am, just not as a BGP customer. And I won't be as a BGP customer.
Opening a ticket with them has not yielded results. Or any response
from network engineering at all. Just the frontline support who wants
me to reboot my modem. :(

> 5. Keep yelling at the clouds about 3356 , even though they are doing the 
> same thing that (to the best of my knowledge) every large transit provider 
> does.

6. Pollute the DFZ because in light of what "every large transit
provider does," that's the solution that actually works.

Regards,
Bill Herrin



-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-23 Thread Tom Beecher
>
> Because big operators think it reasonable to localpref distant routes
> ahead of nearby ones so long as the distant routes arrive from
> customers. I'll remember that the next time folks complain about the
> size of the routing table. This one you did to yourselves.
>

That has absolutely nothing to do with it, at all.

3356 is following common practice : Use customer routes before peer routes.
This is not some Illuminati based conspiracy , it's pretty standard stuff.
Nobody at 3356 is doing some magic latency based twerking to mess with you.
You kinda have a lousy upstream IMO.

It just so happens that their customer ( 47787 ) takes a
*physical* pathway that is less performant than you'd prefer.

Two people ( myself and Andrew Hoyos ) went and looked , and found that the
upstream you use ( 53356 ) provides TE communities that you can use to
prevent your advertisement from being sent to 47787, thus avoiding that
poorly performing pathway, and hopefully using someone else better. Again,
for reference ( https://docs.freerangecloud.com/en/bgp/communities ).

You can:
1. Experiment with 53356's TE communities to prevent them from announcing
to upstreams that give you poor performance to 3356.
2. See if 47787 will talk to you about their path to 3356.  ( Doubtful,
since you aren't a direct customer of theirs.)
3. Pick an upstream that has better / more direct connectivity to 3356, use
them instead of /in parallel with 53356.
4. Get yourself connected to 3356 directly.
5. Keep yelling at the clouds about 3356 , even though they are doing the
same thing that (to the best of my knowledge) every large transit provider
does.


On Tue, Jan 23, 2024 at 3:02 PM William Herrin  wrote:

> On Tue, Jan 23, 2024 at 11:45 AM Owen DeLong via NANOG 
> wrote:
> > The catch to all of that, however, is that he’s not directly peered with
> 3356 and many AS operators strip communities.
>
> And even if I didn't, the problem isn't just one ISP localprefing to
> prefer distant routes. Centurylink most directly impacts me, but as
> others have pointed out: many ISPs do the same darn thing. The only
> workable solution available to me appears to be tripling my presence
> in the DFZ tables.
>
> Because big operators think it reasonable to localpref distant routes
> ahead of nearby ones so long as the distant routes arrive from
> customers. I'll remember that the next time folks complain about the
> size of the routing table. This one you did to yourselves.
>
> Regards,
> Bill
>
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: Networks ignoring prepends?

2024-01-23 Thread Niels Bakker

* b...@herrin.us (William Herrin) [Tue 23 Jan 2024, 21:02 CET]:
> On Tue, Jan 23, 2024 at 11:45 AM Owen DeLong via NANOG  wrote:
>> The catch to all of that, however, is that he’s not directly 
>> peered with 3356 and many AS operators strip communities.
>
> And even if I didn't, the problem isn't just one ISP localprefing to 
> prefer distant routes. Centurylink most directly impacts me, but as 
> others have pointed out: many ISPs do the same darn thing. The only 
> workable solution available to me appears to be tripling my presence 
> in the DFZ tables.

Why do you buy from ISPs when you don't want to receive traffic via 
them?

Have you tried asking that upstream to interconnect more locally with 
certain other networks?

Why do you buy from ISPs that strip TE communities from your 
announcements that don't affect them in the first place?

> Because big operators think it reasonable to localpref distant 
> routes ahead of nearby ones so long as the distant routes arrive 
> from customers. I'll remember that the next time folks complain 
> about the size of the routing table. This one you did to yourselves.

BGP, while a distance vector protocol, famously does not take 
latency into account when making routing decisions.



-- Niels.


Re: Networks ignoring prepends?

2024-01-23 Thread Chris Adams
Once upon a time, William Herrin  said:
> Because big operators think it reasonable to localpref distant routes
> ahead of nearby ones so long as the distant routes arrive from
> customers. I'll remember that the next time folks complain about the
> size of the routing table. This one you did to yourselves.

This isn't some "big operators" conspiracy... it's how lots of networks
with BGP customers work (even small networks).  BGP has no knowledge of
the distance you keep emphasizing, and path prepends have always been
known to be down the decision tree.

When you receive a route over a paid link, it's not unreasonable to
assume it's because your paying customer wants that traffic from you.
It's been pretty standard practice to localpref up routes from your
customers for a long time, and then (often but not always) provide
communities for said customers to override the localpref.  Being a
customer of a customer makes that harder, but then it's basically on you
to choose your connections with that in mind.
-- 
Chris Adams 


Re: Networks ignoring prepends?

2024-01-23 Thread Niels Bakker

* nanog@nanog.org (Owen DeLong via NANOG) [Tue 23 Jan 2024, 20:47 CET]:
> The catch to all of that, however, is that he’s not directly peered 
> with 3356 and many AS operators strip communities.


Are there recent statistics on that last assertion?


-- Niels.


Re: Networks ignoring prepends?

2024-01-23 Thread William Herrin
On Tue, Jan 23, 2024 at 11:45 AM Owen DeLong via NANOG  wrote:
> The catch to all of that, however, is that he’s not directly peered with 3356 
> and many AS operators strip communities.

And even if I didn't, the problem isn't just one ISP localprefing to
prefer distant routes. Centurylink most directly impacts me, but as
others have pointed out: many ISPs do the same darn thing. The only
workable solution available to me appears to be tripling my presence
in the DFZ tables.

Because big operators think it reasonable to localpref distant routes
ahead of nearby ones so long as the distant routes arrive from
customers. I'll remember that the next time folks complain about the
size of the routing table. This one you did to yourselves.

Regards,
Bill


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-23 Thread Owen DeLong via NANOG



> On Jan 23, 2024, at 10:47, Jay Borkenhagen  wrote:
> 
> William Herrin writes:
>> 
>> The best path to me from Centurylink is: 3356 1299 20473 11875
>> 
>> The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
>> 11875 11875 11875
>> 
>> Do you want to tell me again how that's a reasonable path selection,
>> or how I'm supposed to pass communities to either 20473 or 53356 which
>> tell 3356 to behave itself?
>> 
> 
> What you want to do is pass communities to 3356 so they apply the same
> local-pref to routes from both paths, enabling as-path-length-based
> path selection to work.  That means lowering their local-pref on the
> currently-chosen customer path via 47787 to match the local-pref on
> their 1299 peer path.
> 
> as3356's TE communities are listed in their IRR aut-num: AS3356
> object: 
> 
> remarks: 
> remarks: customer traffic engineering communities - LocalPref
> remarks: 
> remarks:3356:70 - set local preference to 70
> remarks:3356:80 - set local preference to 80
> remarks:3356:90 - set local preference to 90
> remarks: 
> 
> Those communities look like RFC1998.  Thus presumably 3356's peer
> local-pref is 80, and you'll want to signal using 3356:80.  As you
> make signaling changes you should use as3356's looking glass to
> confirm.
> 
> as47787 and as53356 should pass your 3356:80 community along to
> as3356.  If they don't do so, complain to them or vote with your
> feet.

The catch to all of that, however, is that he’s not directly peered with 3356 
and many AS operators strip communities.

Owen



Re: Networks ignoring prepends?

2024-01-23 Thread Jay Borkenhagen
William Herrin writes:
 > 
 > The best path to me from Centurylink is: 3356 1299 20473 11875
 > 
 > The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
 > 11875 11875 11875
 > 
 > Do you want to tell me again how that's a reasonable path selection,
 > or how I'm supposed to pass communities to either 20473 or 53356 which
 > tell 3356 to behave itself?
 > 

What you want to do is pass communities to 3356 so they apply the same
local-pref to routes from both paths, enabling as-path-length-based
path selection to work.  That means lowering their local-pref on the
currently-chosen customer path via 47787 to match the local-pref on
their 1299 peer path.

as3356's TE communities are listed in their IRR aut-num: AS3356
object: 

remarks: 
remarks: customer traffic engineering communities - LocalPref
remarks: 
remarks:3356:70 - set local preference to 70
remarks:3356:80 - set local preference to 80
remarks:3356:90 - set local preference to 90
remarks: 

Those communities look like RFC1998.  Thus presumably 3356's peer
local-pref is 80, and you'll want to signal using 3356:80.  As you
make signaling changes you should use as3356's looking glass to
confirm.
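
The effect of that signal can be sketched in a few lines of Python. This is
only an illustration under stated assumptions: the community-to-LOCAL_PREF
table mirrors the remarks quoted above, the default customer LOCAL_PREF of 100
is an assumed value, the peer value of 80 is the presumption made above, and
the function names are hypothetical. It also assumes the community actually
reaches 3356 intact.

```python
# Community -> LOCAL_PREF, mirroring the IRR remarks quoted above.
TE_COMMUNITIES = {"3356:70": 70, "3356:80": 80, "3356:90": 90}

ASSUMED_CUSTOMER_PREF = 100  # assumption; not stated in the thread
ASSUMED_PEER_PREF = 80       # presumed above, to be confirmed via looking glass


def effective_local_pref(default_pref, communities):
    # Use the first recognized TE community; otherwise keep the default.
    for community in communities:
        if community in TE_COMMUNITIES:
            return TE_COMMUNITIES[community]
    return default_pref


def pick(routes):
    # routes: (name, local_pref, as_path_length); LOCAL_PREF is compared first.
    return max(routes, key=lambda r: (r[1], -r[2]))[0]


def select(customer_communities):
    customer_pref = effective_local_pref(ASSUMED_CUSTOMER_PREF, customer_communities)
    return pick([
        ("customer", customer_pref, 8),   # 47787 x4, 53356, 11875 x3
        ("peer", ASSUMED_PEER_PREF, 3),   # 1299, 20473, 11875
    ])


without_signal = select([])
with_signal = select(["3356:80"])
print(without_signal, with_signal)  # customer peer
```

Once both routes carry the same LOCAL_PREF, the shorter AS path decides, which
is the behavior the prepends were meant to produce.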

as47787 and as53356 should pass your 3356:80 community along to
as3356.  If they don't do so, complain to them or vote with your
feet. 

Jay B.



Re: Networks ignoring prepends?

2024-01-23 Thread Tom Beecher
>
> Apparently there is a conflict between what you want and what 47787 wants.
> As you both seem to be paying customers, you should probably ask 3356 to
> resolve that instead of us random internet folks.
>

Calling 3356 and saying "I know your global routing policy is to prefer a
customer learned route over a peer route, but can you change that for me
please?" probably won't see much success.



On Tue, Jan 23, 2024 at 8:39 AM Alex Le Heux  wrote:

>
> >> Packets don't have customers, ISPs do. And in this case you're not a
> customer of the ISP making the routing decision
> >
> > Incorrect. I am a customer of 3356. A residential customer, not a BGP
> > customer. I'm paying them to route my packets too, and they're routing
> > them poorly.
>
> Oh, you should have said that right away, or perhaps I missed it.
>
> In that case it’s simple: Stop giving them money for bad service. By
> continuing to give them money you’re incentivizing them to continue
> breaking your internet, making you the architect of your own misery ;)
>
> > Also incorrect: every packet in your network is linked to either one
> > or two customers. Never more. Never less. Routing my packet via 47787
> > in this case serves neither of us: my Internet access is severely
> > degraded and 47787 is charged money for a packet they need not have
> > handled.
>
> Nonsense. 47787 is clearly telling 3356 they *want* to handle that traffic
> and even paying for the privilege. Apparently there is a conflict between
> what you want and what 47787 wants. As you both seem to be paying
> customers, you should probably ask 3356 to resolve that instead of us
> random internet folks.
>
> >> Fact is that all prepending does is provide a vague hint to other
> >> networks about what you would like them to do.
> >
> > Until they tamper with it using localpref, BGP's default behavior with
> > prepends does exactly the right thing, at least in my situation.
>
> Try giving your money to someone who runs BGP with just its default
> settings and no policies, see how well that works out.
>
> Cheers,
>
> Alex
>
>
> > Regards,
> > Bill Herrin
> >
> > --
> > William Herrin
> > b...@herrin.us
> > https://bill.herrin.us/
>


Re: Networks ignoring prepends?

2024-01-23 Thread Tom Beecher
>
> I feel your pain Bill, but from a slightly different angle.  For years the
> large CDNs have been disregarding prepends.  When a source AS disregards
> BGP best path selection rules, it sets off a chain reaction of silliness
> not attributable to the transit AS's.  At the terminus of that chain are
> destination / eyeball AS's now compelled to do undesirable things out of
> necessity such as:
>   1) Advertise specifics towards select peers - i.e. inconsistent edge
> routing policy & littering global table
>   2) Continuing to prepend a ridiculous amount anyway
> Gotta wonder how things would be if everyone just abided by the rules.
>

What 'rule' are you asserting is being broken here?



On Mon, Jan 22, 2024 at 9:56 PM Jeff Behrns via NANOG 
wrote:

> > > William Herrin  wrote:
> Until they tamper with it using localpref, BGP's default behavior with
> prepends does exactly the right thing, at least in my situation.
>
> I feel your pain Bill, but from a slightly different angle.  For years the
> large CDNs have been disregarding prepends.  When a source AS disregards
> BGP best path selection rules, it sets off a chain reaction of silliness
> not attributable to the transit AS's.  At the terminus of that chain are
> destination / eyeball AS's now compelled to do undesirable things out of
> necessity such as:
>   1) Advertise specifics towards select peers - i.e. inconsistent edge
> routing policy & littering global table
>   2) Continuing to prepend a ridiculous amount anyway
> Gotta wonder how things would be if everyone just abided by the rules.
>
>


Re: Networks ignoring prepends?

2024-01-23 Thread Alex Le Heux


>> Packets don't have customers, ISPs do. And in this case you're not a 
>> customer of the ISP making the routing decision
> 
> Incorrect. I am a customer of 3356. A residential customer, not a BGP
> customer. I'm paying them to route my packets too, and they're routing
> them poorly.

Oh, you should have said that right away, or perhaps I missed it.

In that case it’s simple: Stop giving them money for bad service. By continuing 
to give them money you’re incentivizing them to continue breaking your 
internet, making you the architect of your own misery ;)

> Also incorrect: every packet in your network is linked to either one
> or two customers. Never more. Never less. Routing my packet via 47787
> in this case serves neither of us: my Internet access is severely
> degraded and 47787 is charged money for a packet they need not have
> handled.

Nonsense. 47787 is clearly telling 3356 they *want* to handle that traffic and 
even paying for the privilege. Apparently there is a conflict between what you 
want and what 47787 wants. As you both seem to be paying customers, you should 
probably ask 3356 to resolve that instead of us random internet folks.

>> Fact is that all prepending does is provide a vague hint to other
>> networks about what you would like them to do.
> 
> Until they tamper with it using localpref, BGP's default behavior with
> prepends does exactly the right thing, at least in my situation.

Try giving your money to someone who runs BGP with just its default settings 
and no policies, see how well that works out.

Cheers,

Alex


> Regards,
> Bill Herrin
> 
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-23 Thread Alex Le Heux



> On Jan 22, 2024, at 21:34, Forrest Christian (List Account) 
>  wrote:
> 
> I really really wish there were a couple of well-known and globally respected 
> communities which you could set to say either "this is a route of last 
> resort" or "this is my preferred route".

You're not the first to wish for this:

https://datatracker.ietf.org/doc/html/draft-dickson-idr-last-resort-05

Alex



Re: Networks ignoring prepends?

2024-01-23 Thread Alex Le Heux



> On Jan 23, 2024, at 00:43, William Herrin  wrote:
> 
> On Mon, Jan 22, 2024 at 3:34 PM Alex Le Heux  wrote:
>> This is perfectly reasonable routing _if you're 3356_
>> 
>> In this profit-driven world, expecting 3356 to do something that's 
>> unprofitable for them just because it happens to be convenient for you is, 
>> well, unreasonable.
> 
> Every packet has two customers: the one sending it and the one
> receiving it. 3356 is providing a service to its customers. ALL of its
> customers. Not just 47787. Sending the packet an extra 5,000 miles
> harms every one of 3356's customers -except for- 47787.
> 
> In this case, I am the customer on both ends. 3356's choice to route
> my packet via 47787 serves me poorly.

Packets don't have customers, ISPs do. And in this case you're not a customer 
of the ISP making the routing decision and 3356 is doing precisely what its 
customer tells it to do by adding (or not adding) specific communities to what 
is announced. In other words, 3356 is doing precisely what its customer pays it 
to do.

You can build a shorter backup path, deaggregate, get 53356 and 47787 to 
propagate your routes differently or change your transit mix. There aren't many 
other options. 

Fact is that all prepending does is provide a vague hint to other networks 
about what you would like them to do. And this is only one of the many things 
those networks take into account when formulating their routing policies. This 
is why many ASes build extensive community lists to set things like localpref 
and limit route propagation in other ways. Perhaps you can try adding 
53356:47787 to your announcement although it's anyone's guess how that'll 
affect things.

Alex
 

Re: Networks ignoring prepends?

2024-01-23 Thread Alex Le Heux


> At which point Centurylink chooses 40676 7489 11875 11875 11875 11875
> 11875 11875 11875.
> 
>> This certainly seems like a reasonable path selection, in the context that 
>> 47787 is likely a 3356 customer.
> 
> That's -why- 3356 chooses the paths. 40676 and 47787 are customers,
> 1299 is a peer. You're telling me with a straight face that you think
> that's *reasonable* routing?

The reasons why have been pointed out by others:

This is perfectly reasonable routing _if you're 3356_

In this profit-driven world, expecting 3356 to do something that's unprofitable 
for them just because it happens to be convenient for you is, well, 
unreasonable.

Deaggregation offers one loophole out of this Layer 8 problem though, making 
TCAM slots just the price we pay for "my network, my rules". Convincing 53356 
and 47787 to add 3356:70 to your route is another. Have you asked them? I know 
I would look into it if a customer comes to me with a similar request.
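
Why deaggregation works regardless of LOCAL_PREF can be shown with a short
Python sketch: forwarding follows the longest matching prefix, and LOCAL_PREF
is only compared among routes for the same prefix. The prefixes below are
placeholders rather than anyone's real address space, the table contents are
hypothetical, and the sketch assumes the more-specifics are announced only via
the preferred upstream.

```python
import ipaddress

# Hypothetical forwarding table on the remote network: prefix -> chosen exit.
# The covering /23 follows the LOCAL_PREF-preferred customer path; the two
# more-specific /24s are assumed to be heard only via the peer.
table = {
    ipaddress.ip_network("198.18.0.0/23"): "via customer 47787",
    ipaddress.ip_network("198.18.0.0/24"): "via peer 1299",
    ipaddress.ip_network("198.18.1.0/24"): "via peer 1299",
}


def lookup(addr):
    ip = ipaddress.ip_address(addr)
    matches = [net for net in table if ip in net]
    # The longest matching prefix wins, whatever LOCAL_PREF the covering
    # route carried during best-path selection.
    return table[max(matches, key=lambda net: net.prefixlen)]


print(lookup("198.18.1.10"))  # via peer 1299
```

The cost is visible in the table itself: one prefix now occupies three
entries in every router that carries the full table.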

Alex

Re: Networks ignoring prepends?

2024-01-23 Thread Andrew Hoyos

On Jan 22, 2024, at 14:35, William Herrin  wrote:
> 
> The best path to me from Centurylink is: 3356 1299 20473 11875
> 
> The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
> 11875 11875 11875
> 
> Do you want to tell me again how that's a reasonable path selection,
> or how I'm supposed to pass communities to either 20473 or 53356 which
> tell 3356 to behave itself?

This certainly seems like a reasonable path selection, in the context that 
47787 is likely a 3356 customer.

AS53356 (Free Range Cloud Hosting) appears to have some limited BGP communities 
that may help.

https://docs.freerangecloud.com/en/bgp/communities

implies that you sending 53356:19014 would block announcements to 47787.
That may turn into a game of whack a mole, but the knobs appear to be there to 
try something other than prepending to influence 3356’s selection.

—
Andrew Hoyos
hoy...@gmail.com 



RE: Networks ignoring prepends?

2024-01-22 Thread Jeff Behrns via NANOG
> William Herrin wrote:
> Until they tamper with it using localpref, BGP's default behavior with
> prepends does exactly the right thing, at least in my situation.

I feel your pain Bill, but from a slightly different angle.  For years the 
large CDNs have been disregarding prepends.  When a source AS disregards BGP 
best path selection rules, it sets off a chain reaction of silliness not 
attributable to the transit AS's.  At the terminus of that chain are 
destination / eyeball AS's now compelled to do undesirable things out of 
necessity such as:
  1) Advertise specifics towards select peers - i.e. inconsistent edge routing 
policy & littering global table
  2) Continuing to prepend a ridiculous amount anyway
Gotta wonder how things would be if everyone just abided by the rules.



Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 6:43 PM William Herrin  wrote:
> On Mon, Jan 22, 2024 at 5:59 PM James Jun  wrote:
> > CL is choosing 3356 47787[x3] 53356 11875[x3] over better path via 1299:
> >This is not a Lumen/CenturyLink/Level 3 problem.
> > What you need to be doing is
>
> Hi James,
>
> My solution has been to add two more-specific routes to -your- routing
> table so that my one prefix now consumes three routes. If you and the
> others defending Centurylink's behavior are satisfied with that
> solution, then we're done here.

Of course, I'll probably have to do the same thing with my v6 prefix
too. But hey, if that works for you I'll conquer my irritation at the
inefficiency.

-Bill

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 5:59 PM James Jun  wrote:
> CL is choosing 3356 47787[x3] 53356 11875[x3] over better path via 1299:
>This is not a Lumen/CenturyLink/Level 3 problem.
> What you need to be doing is

Hi James,

My solution has been to add two more-specific routes to -your- routing
table so that my one prefix now consumes three routes. If you and the
others defending Centurylink's behavior are satisfied with that
solution, then we're done here.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Rubens Kuhl
You can use the ultimate BOFH BGP tool, which is to include the
network you don't want those announcements to go in the AS Path.
Let's say your ASN is 65000, and the target you want to not route
through that path is 65001.

For the path you want that network to route to, announce this AS Path:
65000 65000 65000 65000 65000

For the path you don't want that network to route to, announce this AS Path:
65000 65001 65000

So your announcements still have your AS as first AS and peer AS. But
65001 loop detection will kill that announcement, regardless of local
preference or AS Path size.
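
A minimal sketch of why this works, assuming the target network applies standard eBGP loop prevention (it drops any route whose AS path already contains its own ASN). The function name is illustrative only, and networks that relax loop detection or filter such paths will behave differently:

```python
def accepts_route(local_asn: int, as_path: list[int]) -> bool:
    """Standard eBGP loop prevention: reject a path that contains our own ASN."""
    return local_asn not in as_path


# AS paths from the example above, as announced by AS 65000.
preferred_path = [65000, 65000, 65000, 65000, 65000]
poisoned_path = [65000, 65001, 65000]

for asn in (65001, 65002):
    print(asn, accepts_route(asn, preferred_path), accepts_route(asn, poisoned_path))
```

AS 65001 never gets as far as comparing local-pref or path length on the poisoned announcement, while every other network still sees a valid three-hop path.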


Rubens



On Mon, Jan 22, 2024 at 9:50 AM William Herrin  wrote:
>
> Howdy,
>
> Does anyone have suggestions for dealing with networks who ignore my
> BGP route prepends?
>
> I have a primary ingress with no prepends and then several distant
> backups with multiple prepends of my own AS number. My intention, of
> course, is that folks take the short path to me whenever it's
> reachable.
>
> A few years ago, Comcast decided it would prefer the 5000 mile,
> five-prepend loop to the short 10 mile path. I was able to cure that
> with a community telling my ISP along that path to not advertise my
> route to Comcast. Today it's Centurylink. Same story; they'd rather
> send the packets 5000 miles to the other coast and back than 10 miles
> across town. I know they have the correct route because when I
> withdraw the distant ones entirely, they see and use it. But this time
> it's not just one path; they prefer any other path except the one I
> want them to use. And Centurylink is not a peer of those ISPs, so
> there doesn't appear to be any community I can use to tell them not to
> use the route.
>
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm at a loss as to what
> else to do.
>
> Advice would be most welcome.
>
> Regards,
> Bill Herrin
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread James Jun
On Mon, Jan 22, 2024 at 02:03:48PM -0800, William Herrin wrote:
> 
> It offends my pride to handle it this way, but -you- shoulder the cost.
>

You're misdiagnosing the issue at hand.

CL is choosing 3356 47787[x3] 53356 11875[x3] over better path via 1299:

What you need to be doing is reaching out to AS53356 (your upstream provider 
supposedly) to assist with traffic engineering.  Given the # of prepends that 
53356 added themselves, it looks like you're using their communities to prepend 
on top of your own prepends (wasted effort), or they've attempted to help you 
by prepending manually, but to no avail (see our prior discussion).  The next 
level of escalation is for 53356 to now work with 47787 to implement the 
correct traffic engineering policy facing 3356.

This is really something your IP transit providers should be assisting you 
with.  You're misdiagnosing and complaining about something which BGP is 
supposed to be doing, instead of escalating with the right parties who are in 
the best position to be assisting you.

Believe it or not, there are small-medium IP transit providers who are _very 
good_ at assisting their BGP customers in traffic engineering efforts, 
especially with extensive BGP community options, competent network engineers, 
automation and the likes.  Your upstream providers need to step up their game 
to help you out here.  This is not a Lumen/CenturyLink/Level 3 problem.

HTH,
James


Re: Networks ignoring prepends?

2024-01-22 Thread Tom Beecher
>
> As I already explained, neither the primary nor any of the backup
> providers directly peer with Centurylink, thus have no communities for
> controlling announcements to Centurylink.


No, but they do have an option to not announce to 47787.

https://docs.freerangecloud.com/en/bgp/communities

53356:19014 would deny to 47787, which would seem to be the 'problematic'
intermediate ASN in your case. You could try that, see what other
upstream paths are taken, and see if that gets you over an upstream that
lines up more with your performance expectations.

Otherwise, you either have to deal with more specifics, or try to get
better connected to 3356 some other way.

3356 isn't doing anything wrong here, as much as you seem to want to
believe that to be true. This is all pretty standard customer / peer
preference handling.

On Mon, Jan 22, 2024 at 7:26 PM William Herrin  wrote:

> On Mon, Jan 22, 2024 at 4:16 PM Alex Le Heux  wrote:
> > > On Jan 23, 2024, at 00:43, William Herrin  wrote:
> > > Every packet has two customers: the one sending it and the one
> > > receiving it. 3356 is providing a service to its customers. ALL of its
> > > customers. Not just 47787. Sending the packet an extra 5,000 miles
> > > harms every one of 3356's customers -except for- 47787.
> > >
> > > In this case, I am the customer on both ends. 3356's choice to route
> > > my packet via 47787 serves me poorly.
> >
> > Packets don't have customers, ISPs do. And in this case you're not a
> customer of the ISP making the routing decision
>
> Incorrect. I am a customer of 3356. A residential customer, not a BGP
> customer. I'm paying them to route my packets too, and they're routing
> them poorly.
>
> Also incorrect: every packet in your network is linked to either one
> or two customers. Never more. Never less. Routing my packet via 47787
> in this case serves neither of us: my Internet access is severely
> degraded and 47787 is charged money for a packet they need not have
> handled.
>
> Charging your customers to make their service worse doesn't seem like
> a good business model to me, but maybe that's why I'm not a CEO.
>
>
> > Fact is that all prepending does is provide a vague hint to other
> > networks about what you would like them to do.
>
> Until they tamper with it using localpref, BGP's default behavior with
> prepends does exactly the right thing, at least in my situation.
>
> Regards,
> Bill Herrin
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 4:16 PM Alex Le Heux  wrote:
> > On Jan 23, 2024, at 00:43, William Herrin  wrote:
> > Every packet has two customers: the one sending it and the one
> > receiving it. 3356 is providing a service to its customers. ALL of its
> > customers. Not just 47787. Sending the packet an extra 5,000 miles
> > harms every one of 3356's customers -except for- 47787.
> >
> > In this case, I am the customer on both ends. 3356's choice to route
> > my packet via 47787 serves me poorly.
>
> Packets don't have customers, ISPs do. And in this case you're not a customer 
> of the ISP making the routing decision

Incorrect. I am a customer of 3356. A residential customer, not a BGP
customer. I'm paying them to route my packets too, and they're routing
them poorly.

Also incorrect: every packet in your network is linked to either one
or two customers. Never more. Never less. Routing my packet via 47787
in this case serves neither of us: my Internet access is severely
degraded and 47787 is charged money for a packet they need not have
handled.

Charging your customers to make their service worse doesn't seem like
a good business model to me, but maybe that's why I'm not a CEO.


> Fact is that all prepending does is provide a vague hint to other
> networks about what you would like them to do.

Until they tamper with it using localpref, BGP's default behavior with
prepends does exactly the right thing, at least in my situation.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Tom Beecher
>
> I’d bet that 47787 is a paying century link customer. As such, despite the
> ugliness of the path, CL probably local prefs everything advertised by them
> higher than any non-paying link. I’m willing to bet 1299 is peered and not
> paying CL.
>

It's almost as if you've done this before. :)

Community : 3356:3 3356:22 3356:100 ==> 3356:123 <++ 3356:575
3356:903 3356:2011 3356:11918 47787:1020
47787:3090 47787:3690 47787:3
Cluster : 0.0.7.15 0.0.7.19
Originator Id : 4.69.181.14 Peer Router Id : 4.69.130.10
Fwd Class : None Priority : None
Flags : Used Valid Best IGP Group-Best
Route Source : Internal
AS-Path : 47787 47787 47787 47787 53356 11875 11875 11875

3356:123 = Customer
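
The same check can be done mechanically. This is a rough sketch that parses the community list from the output above (with the highlight markers removed) and looks for 3356:123; the meaning of that community is taken from this message, and the helper name is made up for illustration:

```python
# Community list copied from the route output above, highlight markers removed.
route_communities = (
    "3356:3 3356:22 3356:100 3356:123 3356:575 3356:903 3356:2011 "
    "3356:11918 47787:1020 47787:3090 47787:3690 47787:3"
)


def parse_communities(text: str) -> set[tuple[int, int]]:
    """Split 'ASN:value' tokens into (asn, value) integer pairs."""
    return {tuple(int(part) for part in token.split(":")) for token in text.split()}


communities = parse_communities(route_communities)
learned_from_customer = (3356, 123) in communities
print(learned_from_customer)
```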


On Mon, Jan 22, 2024 at 5:45 PM Owen DeLong via NANOG 
wrote:

> I’d bet that 47787 is a paying century link customer. As such, despite the
> ugliness of the path, CL probably local prefs everything advertised by them
> higher than any non-paying link. I’m willing to bet 1299 is peered and not
> paying CL.
>
> Sending bits for revenue is almost always preferable to sending bits for
> free, so…
>
> Owen
>
>
> > On Jan 22, 2024, at 12:37, William Herrin  wrote:
> >
> > On Mon, Jan 22, 2024 at 10:19 AM James Jun 
> wrote:
> >> So, as a customer, you actually SHOULD be demanding your ISPs
> >> to positively identify and categorize their routes using local-pref
> >> and communities.
> >
> > Hi James,
> >
> > The best path to me from Centurylink is: 3356 1299 20473 11875
> >
> > The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
> > 11875 11875 11875
> >
> > Do you want to tell me again how that's a reasonable path selection,
> > or how I'm supposed to pass communities to either 20473 or 53356 which
> > tell 3356 to behave itself?
> >
> > Regards,
> > Bill Herrin
> >
> >
> > --
> > William Herrin
> > b...@herrin.us
> > https://bill.herrin.us/
>
>


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 3:34 PM Alex Le Heux  wrote:
> This is perfectly reasonable routing _if you're 3356_
>
> In this profit-driven world, expecting 3356 to do something that's 
> unprofitable for them just because it happens to be convenient for you is, 
> well, unreasonable.

Hi Alex,

Every packet has two customers: the one sending it and the one
receiving it. 3356 is providing a service to its customers. ALL of its
customers. Not just 47787. Sending the packet an extra 5,000 miles
harms every one of 3356's customers -except for- 47787.

In this case, I am the customer on both ends. 3356's choice to route
my packet via 47787 serves me poorly.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Owen DeLong via NANOG
I’d bet that 47787 is a paying century link customer. As such, despite the 
ugliness of the path, CL probably local prefs everything advertised by them 
higher than any non-paying link. I’m willing to bet 1299 is peered and not 
paying CL. 

Sending bits for revenue is almost always preferable to sending bits for free, 
so…

Owen


> On Jan 22, 2024, at 12:37, William Herrin  wrote:
> 
> On Mon, Jan 22, 2024 at 10:19 AM James Jun  wrote:
>> So, as a customer, you actually SHOULD be demanding your ISPs
>> to positively identify and categorize their routes using local-pref
>> and communities.
> 
> Hi James,
> 
> The best path to me from Centurylink is: 3356 1299 20473 11875
> 
> The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
> 11875 11875 11875
> 
> Do you want to tell me again how that's a reasonable path selection,
> or how I'm supposed to pass communities to either 20473 or 53356 which
> tell 3356 to behave itself?
> 
> Regards,
> Bill Herrin
> 
> 
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/



Re: Networks ignoring prepends?

2024-01-22 Thread Owen DeLong via NANOG
And now you are faced with an object lesson as to why TE routes are so 
prevalent. 

More specifics are your only functional alternative here. In most cases, you 
shouldn’t need more than 2 per prefix. 
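
A small sketch of the mechanics, using Python's standard ipaddress module. The prefix 198.18.0.0/23 is only a placeholder from the benchmarking range, and the ingress labels are invented. The point is that longest-prefix match happens before any BGP attribute comparison, so two more-specifics announced only on the preferred link win regardless of local-pref:

```python
import ipaddress

aggregate = ipaddress.ip_network("198.18.0.0/23")    # placeholder prefix
halves = list(aggregate.subnets(prefixlen_diff=1))   # the two covering /24s

# A remote router's view: the aggregate is heard via the backup paths,
# the two more-specifics only via the preferred ingress.
table = {aggregate: "backup ingress"}
table.update({half: "preferred ingress" for half in halves})


def lookup(destination: str) -> str:
    """Forward on the longest matching prefix, as a router's FIB would."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in table if addr in net]
    return table[max(matches, key=lambda net: net.prefixlen)]


print([str(half) for half in halves])
print(lookup("198.18.1.5"))
```

If the preferred link goes down and the more-specifics are withdrawn, the aggregate is still there to catch the traffic.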

Owen


> On Jan 22, 2024, at 12:16, William Herrin  wrote:
> 
> On Mon, Jan 22, 2024 at 5:23 AM Jon Lewis  wrote:
>> You may be limited to seeing if your backup providers have community
>> controls that would let you tell them "don't share with Centurylink"
> 
> As I already explained, neither the primary nor any of the backup
> providers directly peer with Centurylink, thus have no communities for
> controlling announcements to Centurylink.
> 
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm not hearing any
> practical alternatives. Treating my distant links as equivalent even
> though I told you with prepends that they are not leaves me with few
> knobs I can turn.
> 
> Regards,
> Bill Herrin
> 
> 
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/



Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 1:55 PM Nick Hilliard  wrote:
> You have your own ASN, you have control over your own routing policy.
> Centurylink probably aren't going to be interested in engaging with you
> if you're not a customer. It's a pickle.

It's not a pickle for me. I'll announce three prefixes instead of one,
and you get to pay for the extra two TCAM slots.

It offends my pride to handle it this way, but -you- shoulder the cost.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Nick Hilliard

William Herrin wrote on 22/01/2024 21:26:

> At which point Centurylink chooses 40676 7489 11875 11875 11875
> 11875 11875 11875 11875.
>
> [...]
> You're telling me with a straight face that you think
> that's *reasonable* routing?


yep, looks pretty reasonable, if you're Centurylink and 40676 is a 
Centurylink customer.



> Besides, I don't want to drop the path to 53356 via 47787. If the path
> through 20473 fails, the path through 53356 is the next best and I
> want Centurylink to use it.

You have your own ASN, you have control over your own routing policy. 
Centurylink probably aren't going to be interested in engaging with you 
if you're not a customer. It's a pickle.


Nick


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 1:11 PM Andrew Hoyos  wrote:
> On Jan 22, 2024, at 14:35, William Herrin  wrote:
>> The best path to me from Centurylink is: 3356 1299 20473 11875
>
>> The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
>> 11875 11875 11875
>
>> Do you want to tell me again how that's a reasonable path selection,
>> or how I'm supposed to pass communities to either 20473 or 53356 which
>> tell 3356 to behave itself?
>
> AS53356 (Free Range Cloud Hosting) appears to have some limited BGP 
> communities that may help.
> https://docs.freerangecloud.com/en/bgp/communities
>
> implies that you sending 53356:19014 would block announcements to 47787.

At which point Centurylink chooses 40676 7489 11875 11875 11875 11875
11875 11875 11875.

> This certainly seems like a reasonable path selection, in the context that 
> 47787 is likely a 3356 customer.

That's -why- 3356 chooses the paths. 40676 and 47787 are customers,
1299 is a peer. You're telling me with a straight face that you think
that's *reasonable* routing?


> That may turn into a game of whack a mole, but the knobs appear to be there 
> to try something other than prepending to influence 3356’s selection.

Whack-a-mole is not a reasonable solution to anything.

Besides, I don't want to drop the path to 53356 via 47787. If the path
through 20473 fails, the path through 53356 is the next best and I
want Centurylink to use it.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 10:19 AM James Jun  wrote:
> So, as a customer, you actually SHOULD be demanding your ISPs
> to positively identify and categorize their routes using local-pref
> and communities.

Hi James,

The best path to me from Centurylink is: 3356 1299 20473 11875

The path Centurylink chose is: 3356 47787 47787 47787 47787 53356
11875 11875 11875

Do you want to tell me again how that's a reasonable path selection,
or how I'm supposed to pass communities to either 20473 or 53356 which
tell 3356 to behave itself?

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Forrest Christian (List Account)
I really really wish there were a couple of well-known and globally
respected communities which you could set to say either "this is a route of
last resort" or "this is my preferred route".

I feel like it would avoid many of us doing exactly what you're about to do
which is pollute the routing tables with extra, more specific routes to do
basic traffic engineering.  (Resulting in 3 routes where one would do).

I'm not talking fine level control here,  just being able to say "hey this
route is better than nothing,  but not much" or "treat this as backup".

I understand the resistance to honoring various route engineering tactics,
but being able to effectively do the exact same thing that announcing more
specifics does without having to resort to announcing more specifics would
be a good thing as far as the global bgp table size goes.

On Mon, Jan 22, 2024, 1:16 PM William Herrin  wrote:

> On Mon, Jan 22, 2024 at 5:23 AM Jon Lewis  wrote:
> > You may be limited to seeing if your backup providers have community
> > controls that would let you tell them "don't share with Centurylink"
>
> As I already explained, neither the primary nor any of the backup
> providers directly peer with Centurylink, thus have no communities for
> controlling announcements to Centurylink.
>
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm not hearing any
> practical alternatives. Treating my distant links as equivalent even
> though I told you with prepends that they are not leaves me with few
> knobs I can turn.
>
> Regards,
> Bill Herrin
>
>
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/
>


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 5:23 AM Jon Lewis  wrote:
> You may be limited to seeing if your backup providers have community
> controls that would let you tell them "don't share with Centurylink"

As I already explained, neither the primary nor any of the backup
providers directly peer with Centurylink, thus have no communities for
controlling announcements to Centurylink.

I hate to litter the table with a batch of more-specifics that only
originate from the short, preferred link but I'm not hearing any
practical alternatives. Treating my distant links as equivalent even
though I told you with prepends that they are not leaves me with few
knobs I can turn.

Regards,
Bill Herrin


-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Steve Gibbard
To expand on what others have said here, I find it helpful to think of BGP as a 
policy enforcement protocol, rather than as a distance vector routing protocol. 
 

To that end, there’s a generally expected hierarchy of routes, and then a lot 
of individuality between networks.  Having done traffic engineering for some 
global CDNs, there’s a bunch of inbound traffic control that you can do by 
letting an understanding of how most other providers think about this guide 
your transit and peering policies, and a remaining portion that generally needs 
to be solved through either discussions, negotiations, or commercial 
arrangements with the sending party or their upstreams.

For the general rules, local-preference trumps everything else.  The number of 
AS path hops comes after local-preference.  Other things being equal networks 
usually like to hand off traffic to a short AS path, and at the closest point 
to its origination (there are valid performance reasons for this) but 
local-preference policies will override both of those.
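
As a rough illustration of that ordering, here is a sketch that compares only the first two steps of the decision process, using the two paths reported elsewhere in this thread as seen from 3356. The local-pref numbers are hypothetical, and real implementations have many more tie-breakers:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    learned_from: str          # relationship tier: "customer", "peer" or "transit"
    as_path: tuple[int, ...]


# Hypothetical local-pref values; every network picks its own numbers.
LOCAL_PREF = {"customer": 200, "peer": 100, "transit": 50}


def best_path(routes: list[Route]) -> Route:
    """Highest local-pref wins; AS path length only breaks ties."""
    return max(routes, key=lambda r: (LOCAL_PREF[r.learned_from], -len(r.as_path)))


via_customer = Route("customer", (47787, 47787, 47787, 47787, 53356, 11875, 11875, 11875))
via_peer = Route("peer", (1299, 20473, 11875))

print(best_path([via_customer, via_peer]).as_path[0])  # 47787
```

The eight-hop customer path beats the three-hop peer path because path length is never consulted once local-pref differs.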

Local-preferences usually have three default tiers — customer, peering, and 
transit.  In other words, get paid, hand off for free, and pay.  There are 
often some additional peers that can be selected for traffic engineering 
reasons, either internally or by customers using BGP communities.  BUT, those 
BGP communities don’t transit to other ASes, so even if you manage to signal 
one hop up stream, you may still find your upstream provider announcing your 
routes to those who have different ideas.

One example of this from the early days of anycasted DNS root servers involved 
k.root-servers.net  installing a node in Delhi, 
which pulled 60% of its traffic from North America.  This was clearly 
non-optimal.  They had attempted to get routing diversity by getting transit 
from different providers in different parts of the world, but their Delhi node 
was, if I recall correctly, a customer of a customer of a customer of Level3.  
Oops.

So, what do you do about this?

If you’re a global network operator, you probably attempt to maintain 
consistent peering/transit relationships across sites.  That way, AS paths and 
local-preferences should be fairly even, and you can let nearest exit routing 
do its thing.

If you have a smaller network, but have multiple interconnection locations that 
are far enough apart to make a performance difference, make the same transit 
and peering relationships at each one.  Make exceptions only for peers (not 
transit providers) whose customers or services only exist in one of the areas, 
and make sure they don’t announce your routes to their upstreams.  That way you 
won’t trombone traffic.

If you’ve done all that, and traffic is still coming in the wrong place, then 
you start talking to people.  “Hey, I’m buying transit from you in both Asia 
and the Western US, and all my traffic from asian-country-x is coming into San 
Jose.  Why?”  “Well, they only have a 100 Mb/s interconnection to us in Asia.  
We have to traffic engineer around it.”  And then you have to figure out how to 
convince some national telco to want to talk to you more than they want to talk 
to your transit provider.

I think in your case, I would be asking why you have a 5,000 mile, five-prepend 
loop to get to a provider ten miles away.  It suggests that your network is 
doing things 5,000 miles away that are inconsistent with what you're doing 
locally, or that you have upstreams who aren’t interconnecting locally or 
aren’t maintaining sufficient capacity or sufficient political relationships on 
those paths.  All of those would predictably have this result.  The solution is 
likely to take a look at your transit relationships, ask your transit providers 
about their transit relationships, and either supplement or switch to a set of 
transit providers who can provide the routing you want.

-Steve



> On Jan 22, 2024, at 4:49 AM, William Herrin  wrote:
> 
> Howdy,
> 
> Does anyone have suggestions for dealing with networks who ignore my
> BGP route prepends?
> 
> I have a primary ingress with no prepends and then several distant
> backups with multiple prepends of my own AS number. My intention, of
> course, is that folks take the short path to me whenever it's
> reachable.
> 
> A few years ago, Comcast decided it would prefer the 5000 mile,
> five-prepend loop to the short 10 mile path. I was able to cure that
> with a community telling my ISP along that path to not advertise my
> route to Comcast. Today it's Centurylink. Same story; they'd rather
> send the packets 5000 miles to the other coast and back than 10 miles
> across town. I know they have the correct route because when I
> withdraw the distant ones entirely, they see and use it. But this time
> it's not just one path; they prefer any other path except the one I
> want them to use. And Centurylink is not a peer of those ISPs, so
> there doesn't appear to be any community I can use to tell

Re: Networks ignoring prepends?

2024-01-22 Thread James Jun
On Mon, Jan 22, 2024 at 06:02:53AM -0800, William Herrin wrote:
> On Mon, Jan 22, 2024 at 5:24 AM Patrick W. Gilmore wrote:
> > Standard practice is to localpref your customers up, which makes prepends 
> > irrelevant. Why would anyone expect different behavior?
> 
> It gives me, your paying customer, less control over my routing
> through your network than if I wasn't your paying customer. That
> seems... backwards.
>

Nope, that is not at all backwards.

Have you actually wondered what would happen, if every major ISP stopped 
classifying routes with localpref, and treated every route received by them 
(including customers and external peers) on same local-pref, so your AS 
prepending can work easily?

Some 21 years ago, there was this little known story during early stages of the 
IPv6 development, called 6bone.  Aside from the lack of native IPv6 (where 
everything had to be tunneled), the #1 issue that guaranteed IPv6 sucked many 
times worse than IPv4 back in the day was the lack of BGP clue by most of IPv6 
DFZ participants at that time, where nobody classified any of their routes 
accordingly with localpref and communities.

Not classifying your routes with local-pref leads to complete operational 
chaos, including world-tour hair-pin sightseeing becoming very common with IPv6 
during 6bone days (which resulted in rise of as30071/occaid to dominate the 
IPv6 DFZ for several years for many to transition out of 6bone).  Not 
classifying routes with local-pref means you do not care whether a particular 
peer is a settlement-free peer or a customer-- this lack of relationship 
classification leads to operational harm:  A customer may be paying you $/bit, 
expecting you to deliver your on-net traffic to them over the paid peering 
(or transit) link they bought from you, only to find you preferring an 
IX peer (e.g. Hurricane Electric, etc. over IX) as best-path, even without any 
AS Path prepending involved. 

Further, not classifying routes with local-pref and ident communities means you 
are entirely at the mercy of prefix-lists applied on your export policy.  A 
very common occurrence is a rookie ISP appearing to give "transit" to 
a major Tier-1 backbone on what was supposed to be a customer-originated 
route, because the network selected the AS-Path via its upstream provider as 
best-path instead of the direct connection to the said customer.  This happens 
a lot on a route from a "downstream of a downstream" customer who is also 
multi-homed with the said rookie ISP's upstream Tier-1 provider, resulting in 
equidistant AS-Paths to what is supposed to be a customer-originated route.  
Scale this up to many routes and you have complete chaos and breakdown of your 
BGP routing table.



So, as a customer, you actually SHOULD be demanding your ISPs to positively 
identify and categorize their routes using local-pref and communities.  In 
fact, I will never purchase IP transit with BGP from a provider who doesn't 
categorize routes with local-pref.  As a customer, if you want more control 
over your network's incoming traffic, you need to instead ask your upstream 
providers about their BGP routing policy and how well they support BGP 
communities to let you steer traffic, and use those communities to make 
absolute traffic decisions.



Always remember this #1 rule of BGP decision process:  AS Path is a 
**tie-breaker** to local-pref classification.  When you prepend AS Path, your 
goal is to try to steer traffic from routes that are in the same category (i.e. 
customer or peer) as you.  When your goal is absolute steering (i.e. absolute 
as in, do not advertise to a particular peer, or make your connection standby 
backup where no traffic ever comes until there is complete outage on the other 
path, etc), you absolutely SHOULD be using BGP communities provided by your 
upstream IP provider.  If your IP transit provider does not provide extensive 
BGP communities to meet your requirements, cancel their service and give your 
business to someone else.

A rookie BGP mistake commonly made by those without real-world 
experience is the assumption that AS Path prepending should deliver absolute 
traffic steering -- it does not, and should NOT, by design.  The BGP Best-Path 
Selection Algorithm is taught very well in the CCIE curriculum, but last I 
looked, they don't teach you the _why_, only the how.  So it's common 
to see enterprise CCIEs working for VARs falling for the false 
assumption about AS Path.  See 
https://www.cisco.com/c/en/us/support/docs/ip/border-gateway-protocol-bgp/13753-25.html#toc-hId-1778347102

Hope this clarifies.

James


Re: Networks ignoring prepends?

2024-01-22 Thread Jon Lewis

On Mon, 22 Jan 2024, William Herrin wrote:


> On Mon, Jan 22, 2024 at 5:24 AM Patrick W. Gilmore wrote:
>> Standard practice is to localpref your customers up, which makes prepends
>> irrelevant. Why would anyone expect different behavior?
>
> It gives me, your paying customer, less control over my routing
> through your network than if I wasn't your paying customer. That
> seems... backwards.


Not at all.  Think like a service provider.

"I've got packets to deliver.  I've got 3 different classes of paths I can 
use.  One of them, I get paid to use.  One is cost neutral.  The last one, 
I pay to use."


Which path would you pick (assuming you're trying to maximize revenue 
from your network)?


--
 Jon Lewis, MCP :)  |  I route
 Blue Stream Fiber, Sr. Neteng  |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: Networks ignoring prepends?

2024-01-22 Thread Niels Bakker

* b...@herrin.us (William Herrin) [Mon 22 Jan 2024, 15:05 CET]:

> On Mon, Jan 22, 2024 at 5:24 AM Patrick W. Gilmore wrote:
>> Standard practice is to localpref your customers up, which makes
>> prepends irrelevant. Why would anyone expect different behavior?
>
> It gives me, your paying customer, less control over my routing
> through your network than if I wasn't your paying customer. That
> seems... backwards.


Most sellers of IP transit offer a "treat as peer" BGP community which 
will flatten your localpref to that of peers rather than a customer.
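
As a rough illustration of what such a community does on the provider side, 
here is a small Python sketch of an import policy.  The local-pref tiers and 
the community value 64496:80 are made up for the example; every provider 
publishes its own values, so check their documentation before relying on any 
of this.

```python
# Hypothetical local-pref tiers; real values differ from provider to provider.
LOCAL_PREF = {"customer": 200, "peer": 100, "transit": 50}

# Hypothetical "treat as peer" community; not any real provider's value.
TREAT_AS_PEER = "64496:80"


def import_local_pref(relationship, communities):
    """Return the local-pref a provider might assign to a received route."""
    if relationship == "customer" and TREAT_AS_PEER in communities:
        # The customer asked to be ranked alongside peer routes, so AS path
        # length (and therefore prepending) can break the tie again.
        return LOCAL_PREF["peer"]
    return LOCAL_PREF[relationship]


print(import_local_pref("customer", set()))
print(import_local_pref("customer", {TREAT_AS_PEER}))
```

Once the customer route sits at the same local-pref as the peer routes, the 
provider's next comparison is AS path length, which is where prepends start 
to matter.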



-- Niels.


Re: Networks ignoring prepends?

2024-01-22 Thread William Herrin
On Mon, Jan 22, 2024 at 5:24 AM Patrick W. Gilmore  wrote:
> Standard practice is to localpref your customers up, which makes prepends 
> irrelevant. Why would anyone expect different behavior?

It gives me, your paying customer, less control over my routing
through your network than if I wasn't your paying customer. That
seems... backwards.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/


Re: Networks ignoring prepends?

2024-01-22 Thread Patrick W. Gilmore
> The Internet is lying to itself, and that’s not a situation that can persist 
> forever.

I am not sure I agree.

First, prepends are a suggestion. Perhaps a request. They have never (or at least 
not for the 3 decades I’ve been doing this) been a guarantee. In the situation 
below, perhaps the 5K mile backup path is through a provider who pays 
Centurylink (Lumen?). Standard practice is to localpref your customers up, 
which makes prepends irrelevant. Why would anyone expect different behavior?

As for hiding hops, that is not lying. What happens inside my network is my 
business. If I give the world some info, say with in-addrs on hops, that’s 
fine. If I do not, I am not “lying”. This is perfectly sustainable, nothing 
will break (IMHO). In fact, I would argue without tools like MPLS, the Internet 
would have broken a long time ago.

-- 
TTFN,
patrick

> On Jan 22, 2024, at 08:13, Mel Beckman  wrote:
> 
> Prepend contraction is becoming more common. You can’t really stop providers 
> from doing it, and it reduces BGP table size, which I’ve heard as a secondary 
> rationale. I’d love to see the statistics on that though.
> 
> BGP Communities seem to be the only alternative, and that limits your 
> engineering reach to mostly immediate peers.
> 
> Another problem is providers that hide multiple router hops inside MPLS, 
> which appears as a single IP hop in traceroutes, making it impossible to know 
> the true path geographically. 
> 
> The Internet is lying to itself, and that’s not a situation that can persist 
> forever.
> 
> -mel via cell
> 
>> On Jan 22, 2024, at 4:52 AM, William Herrin  wrote:
>> 
>> Howdy,
>> 
>> Does anyone have suggestions for dealing with networks who ignore my
>> BGP route prepends?
>> 
>> I have a primary ingress with no prepends and then several distant
>> backups with multiple prepends of my own AS number. My intention, of
>> course, is that folks take the short path to me whenever it's
>> reachable.
>> 
>> A few years ago, Comcast decided it would prefer the 5000 mile,
>> five-prepend loop to the short 10 mile path. I was able to cure that
>> with a community telling my ISP along that path to not advertise my
>> route to Comcast. Today it's Centurylink. Same story; they'd rather
>> send the packets 5000 miles to the other coast and back than 10 miles
>> across town. I know they have the correct route because when I
>> withdraw the distant ones entirely, they see and use it. But this time
>> it's not just one path; they prefer any other path except the one I
>> want them to use. And Centurylink is not a peer of those ISPs, so
>> there doesn't appear to be any community I can use to tell them not to
>> use the route.
>> 
>> I hate to litter the table with a batch of more-specifics that only
>> originate from the short, preferred link but I'm at a loss as to what
>> else to do.
>> 
>> Advice would be most welcome.
>> 
>> Regards,
>> Bill Herrin
>> 
>> --
>> William Herrin
>> b...@herrin.us
>> https://bill.herrin.us/



Re: Networks ignoring prepends?

2024-01-22 Thread Jon Lewis

On Mon, 22 Jan 2024, William Herrin wrote:


> Howdy,
>
> Does anyone have suggestions for dealing with networks who ignore my
> BGP route prepends?
>
> I have a primary ingress with no prepends and then several distant
> backups with multiple prepends of my own AS number. My intention, of
> course, is that folks take the short path to me whenever it's
> reachable.
>
> A few years ago, Comcast decided it would prefer the 5000 mile,
> five-prepend loop to the short 10 mile path. I was able to cure that
> with a community telling my ISP along that path to not advertise my
> route to Comcast. Today it's Centurylink. Same story; they'd rather
> send the packets 5000 miles to the other coast and back than 10 miles
> across town. I know they have the correct route because when I
> withdraw the distant ones entirely, they see and use it. But this time
> it's not just one path; they prefer any other path except the one I
> want them to use. And Centurylink is not a peer of those ISPs, so
> there doesn't appear to be any community I can use to tell them not to
> use the route.
>
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm at a loss as to what
> else to do.


In my experience, it's pretty common for service providers to use 
localpref to differentiate paid/free/customer routes (with LP increasing 
in this order).  Since LP trumps as-path length, no amount of prepending 
will get around this.


You may be limited to seeing if your backup providers have community 
controls that would let you tell them "don't share with Centurylink" or 
seeing if your primary has similar controls that would let you advertise 
both the aggregate and more specifics, but have them not propagate the 
more specifics except to those networks (i.e. Centurylink) that you need 
to see them to get them off your backup paths.
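
The reason more specifics work where prepends do not is that forwarding uses 
longest-prefix match, and a more specific is a different prefix, so it never 
competes with the aggregate in best-path selection.  A minimal Python sketch 
of the lookup, using documentation prefixes and made-up next-hop labels (all 
assumptions for illustration):

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop of the most specific prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table.items() if addr in net]
    if not matches:
        return None
    # Longest prefix wins, whatever BGP attributes the covering routes carry.
    _, hop = max(matches, key=lambda item: item[0].prefixlen)
    return hop


# Hypothetical view from the remote network: the aggregate was learned over
# the distant backup path, the more specific only over the primary path.
table = {
    ipaddress.ip_network("2001:db8::/32"): "backup path (aggregate)",
    ipaddress.ip_network("2001:db8:1::/48"): "primary path (more specific)",
}

print(lookup(table, "2001:db8:1::10"))
print(lookup(table, "2001:db8:2::10"))
```

Addresses inside the more specific follow the primary path; everything else 
in the aggregate still follows whichever path the remote network selected for 
the aggregate.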


--
 Jon Lewis, MCP :)  |  I route
 Blue Stream Fiber, Sr. Neteng  |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: Networks ignoring prepends?

2024-01-22 Thread Mel Beckman
Prepend contraction is becoming more common. You can’t really stop providers 
from doing it, and it reduces BGP table size, which I’ve heard as a secondary 
rationale. I’d love to see the statistics on that though.
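
For anyone unfamiliar with the term, here is a small Python sketch of what I 
am assuming prepend contraction means: collapsing consecutive repeats of the 
same AS number in a received AS path.  The function name and the AS numbers 
(documentation range) are illustrative only, and this is not a description of 
any particular provider's implementation.

```python
from itertools import groupby


def contract_prepends(as_path):
    """Collapse consecutive repeats of the same AS number into one entry."""
    return [asn for asn, _ in groupby(as_path)]


# Hypothetical path: AS 64500 announced itself with five extra prepends.
path = [64510, 64500, 64500, 64500, 64500, 64500, 64500]
print(contract_prepends(path))
```

After contraction the path is the same length as an unprepended one, so the 
prepends no longer influence any comparison made downstream.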

BGP Communities seem to be the only alternative, and that limits your 
engineering reach to mostly immediate peers.

Another problem is providers that hide multiple router hops inside MPLS, which 
appears as a single IP hop in traceroutes, making it impossible to know the 
true path geographically. 

The Internet is lying to itself, and that’s not a situation that can persist 
forever.

-mel via cell

> On Jan 22, 2024, at 4:52 AM, William Herrin  wrote:
> 
> Howdy,
> 
> Does anyone have suggestions for dealing with networks who ignore my
> BGP route prepends?
> 
> I have a primary ingress with no prepends and then several distant
> backups with multiple prepends of my own AS number. My intention, of
> course, is that folks take the short path to me whenever it's
> reachable.
> 
> A few years ago, Comcast decided it would prefer the 5000 mile,
> five-prepend loop to the short 10 mile path. I was able to cure that
> with a community telling my ISP along that path to not advertise my
> route to Comcast. Today it's Centurylink. Same story; they'd rather
> send the packets 5000 miles to the other coast and back than 10 miles
> across town. I know they have the correct route because when I
> withdraw the distant ones entirely, they see and use it. But this time
> it's not just one path; they prefer any other path except the one I
> want them to use. And Centurylink is not a peer of those ISPs, so
> there doesn't appear to be any community I can use to tell them not to
> use the route.
> 
> I hate to litter the table with a batch of more-specifics that only
> originate from the short, preferred link but I'm at a loss as to what
> else to do.
> 
> Advice would be most welcome.
> 
> Regards,
> Bill Herrin
> 
> --
> William Herrin
> b...@herrin.us
> https://bill.herrin.us/


Networks ignoring prepends?

2024-01-22 Thread William Herrin
Howdy,

Does anyone have suggestions for dealing with networks who ignore my
BGP route prepends?

I have a primary ingress with no prepends and then several distant
backups with multiple prepends of my own AS number. My intention, of
course, is that folks take the short path to me whenever it's
reachable.

A few years ago, Comcast decided it would prefer the 5000 mile,
five-prepend loop to the short 10 mile path. I was able to cure that
with a community telling my ISP along that path to not advertise my
route to Comcast. Today it's Centurylink. Same story; they'd rather
send the packets 5000 miles to the other coast and back than 10 miles
across town. I know they have the correct route because when I
withdraw the distant ones entirely, they see and use it. But this time
it's not just one path; they prefer any other path except the one I
want them to use. And Centurylink is not a peer of those ISPs, so
there doesn't appear to be any community I can use to tell them not to
use the route.

I hate to litter the table with a batch of more-specifics that only
originate from the short, preferred link but I'm at a loss as to what
else to do.

Advice would be most welcome.

Regards,
Bill Herrin

-- 
William Herrin
b...@herrin.us
https://bill.herrin.us/