Re: [j-nsp] JunOS RPKI/ROA database in non-default routing instance, but require an eBGP import policy in inet.0 (default:default LI:RI) to reference it.

2023-06-06 Thread Mark Tinka via juniper-nsp



On 6/6/23 09:27, Saku Ytti wrote:


I am not implying it is pragmatic or possible, just correct from a
design point of view.

Commercial software deals with competing requirements, and these
requirements are not constructive towards producing maintainable clean
code. Over time commercial software becomes illiquid with its
technical debt.

There is no real personal reward for paying technical debt, because
almost invariably it takes a lot of time, brings no new revenue, and a
non-coder observing your work only sees the outages the debt repayment
caused. Meanwhile, another person who creates this debt by shipping new
invoiceable features and bug fixes in a ra[pb]id manner is a star to the
non-coder observers.

Not to say our open source networking is always great either; Linux
developers are notorious for not asking SMEs 'how has this problem
been solved in other software'. There are plenty of anecdotes to
choose from, but I'll give a couple.

- In the 3.6 kernel, FIB was introduced to replace flow-cache. Of course,
anyone dealing with networking could have told kernel developers on day 1
why flow-cache was a poor idea, and what FIB is, how it is done, and
why it is a better idea.
- In the 3.6 FIB implementation, ECMP was solved by essentially randomly
choosing 1 option of many, per-packet. Again they could have asked
even junior network engineers 'how does ECMP work, how should it be
done, I'm thinking of doing it like this, why do you think they've not
done this in other software?' But they didn't.
- In 4.4, random ECMP was changed to hashed ECMP.

I still catch discussions about poor TCP performance in Linux ECMP
environments. I first ask which kernel they run, then I explain why
per-packet + cubic will never ever perform. So for 4 years ECMP was
completely broken, and reading the ECMP release notes in 4.4, not even
the developers had completely understood just how bad the problem was,
so we can safely assume people were not running ECMP.

Another example was when I tried to explain to the OpenSSH mailing
list that 'TOS' isn't a thing, and got a confident reply that TOS
absolutely is a thing, prec/DSCP are not. Luckily, a few years later
Job fixed OpenSSH packet classification.

But these examples are everywhere, so it seems you either choose
software written by people who understand the problem but are forced
to write unmaintainable code, or you choose software by people who are
just now learning about the problem and then solve it without
discovering prior art, usually wrong.


I think being able to write code is one thing. Being able to write code 
to build and run an IP/MPLS network is - not a-whole-other - but another 
thing. I say this because people that know how to write code do not 
always understand how IP/MPLS networks work. And for better or worse, we 
need code to run the routers and switches that deliver IP/MPLS 
capability to network operators.


The reason traditional networking OEMs build usable code that allows us
to run IP/MPLS networks is that their raison d'être is, well, shifting
packets around the world as quickly as possible. General-purpose OS
developers optimize for service/app performance, leaving the problem of
network performance to the networking folk, for the most part. So it
does not surprise me that developers who code for a general-purpose OS
would think RIP is better than IS-IS, for example, just because it has
the word "Routing" in it and they can write code for it. It's not
because they don't know how to write code for IS-IS... they just don't
have the organizational structure set up to care about why IS-IS is a
better idea than RIP. Their organizational setup is app, app, app.


Unfortunately, not everybody can be a Cisco, Juniper, Google or AWS, who
have the benefit of plenty of people that can more easily integrate
writing code for its own sake with writing code for networking.


It is the reason most large-scale network operators will continue
to find value in IOS XR, Junos, EOS, ArcOS, etc., rather than, say, a NOS
that was put together by someone that knows how to interpret an RFC and
spit out an implementation on Linux, with zero understanding of the
overall TCP/UDP/IP/MPLS/Ethernet stack and how it all ties in together
at scale.


I like what folk like pfSense (Netgate) are doing with FRR, and also
what folk like Mikrotik can pack into 13MB of software... but at a certain
scale, you simply can't ignore the traditional networking OEMs, try as we might.


Mark.
___
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp


Re: [j-nsp] JunOS RPKI/ROA database in non-default routing instance, but require an eBGP import policy in inet.0 (default:default LI:RI) to reference it.

2023-06-06 Thread Saku Ytti via juniper-nsp
On Tue, 6 Jun 2023 at 06:54, Mark Tinka via juniper-nsp wrote:

> > While I have a lot of sympathy for Saku's pragmatism, I prefer to file off 
> > the ugly edges of old justifications when I can... but it's done one commit 
> > at a time.
>>
> Going back to re-do the implementation from scratch would be a
> non-starter. There is simply too much water under this bridge.

I am not implying it is pragmatic or possible, just correct from a
design point of view.

Commercial software deals with competing requirements, and these
requirements are not constructive towards producing maintainable clean
code. Over time commercial software becomes illiquid with its
technical debt.

There is no real personal reward for paying technical debt, because
almost invariably it takes a lot of time, brings no new revenue, and a
non-coder observing your work only sees the outages the debt repayment
caused. Meanwhile, another person who creates this debt by shipping new
invoiceable features and bug fixes in a ra[pb]id manner is a star to the
non-coder observers.

Not to say our open source networking is always great either; Linux
developers are notorious for not asking SMEs 'how has this problem
been solved in other software'. There are plenty of anecdotes to
choose from, but I'll give a couple.

- In the 3.6 kernel, FIB was introduced to replace flow-cache. Of course,
anyone dealing with networking could have told kernel developers on day 1
why flow-cache was a poor idea, and what FIB is, how it is done, and
why it is a better idea.
- In the 3.6 FIB implementation, ECMP was solved by essentially randomly
choosing 1 option of many, per-packet. Again they could have asked
even junior network engineers 'how does ECMP work, how should it be
done, I'm thinking of doing it like this, why do you think they've not
done this in other software?' But they didn't.
- In 4.4, random ECMP was changed to hashed ECMP.

I still catch discussions about poor TCP performance in Linux ECMP
environments. I first ask which kernel they run, then I explain why
per-packet + cubic will never ever perform. So for 4 years ECMP was
completely broken, and reading the ECMP release notes in 4.4, not even
the developers had completely understood just how bad the problem was,
so we can safely assume people were not running ECMP.
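
To make the difference concrete, here is a minimal sketch in Python of
the two selection strategies. It is not the kernel's code: the next-hop
names, the example 5-tuple and the use of SHA-256 as the hash are all
assumptions chosen for illustration. The point is only that per-packet
random selection spreads one TCP flow over several paths (so segments
can arrive out of order, which loss-based congestion control such as
cubic tends to treat as loss), while hashing the flow's 5-tuple pins
the flow to a single path.

```python
import hashlib
import random

# Hypothetical set of equal-cost next hops (names are illustrative only).
NEXT_HOPS = ["nh-a", "nh-b", "nh-c", "nh-d"]


def pick_random(flow, rng):
    """Per-packet selection: the flow is ignored, any next hop may be used."""
    return rng.choice(NEXT_HOPS)


def pick_hashed(flow):
    """Per-flow selection: hash the 5-tuple so one flow maps to one next hop.

    SHA-256 is used here only for convenience; real forwarding planes use
    much cheaper hash functions.
    """
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(NEXT_HOPS)
    return NEXT_HOPS[index]


# Example 5-tuple: src IP, dst IP, protocol (6 = TCP), src port, dst port.
flow = ("192.0.2.1", "198.51.100.7", 6, 49152, 443)

rng = random.Random(1)  # seeded so repeated runs behave the same
random_paths = {pick_random(flow, rng) for _ in range(100)}
hashed_paths = {pick_hashed(flow) for _ in range(100)}

print("paths used by one flow, per-packet random:", len(random_paths))
print("paths used by one flow, hashed:", len(hashed_paths))
```

With hashing, all 100 packets of the flow take the same next hop; with
per-packet random selection the same flow is sprayed across the set.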

Another example was when I tried to explain to the OpenSSH mailing
list that 'TOS' isn't a thing, and got a confident reply that TOS
absolutely is a thing, prec/DSCP are not. Luckily, a few years later
Job fixed OpenSSH packet classification.

But these examples are everywhere, so it seems you either choose
software written by people who understand the problem but are forced
to write unmaintainable code, or you choose software by people who are
just now learning about the problem and then solve it without
discovering prior art, usually wrong.


-- 
  ++ytti