On Tue, 6 Jun 2023 at 06:54, Mark Tinka via juniper-nsp <juniper-nsp@puck.nether.net> wrote:
> > While I have a lot of sympathy for Saku's pragmatism, I prefer to file off
> > the ugly edges of old justifications when I can... but it's done one commit
> > at a time.
>
> Going back to re-do the implementation from scratch would be a
> non-starter. There is simply too much water under this bridge.

I am not implying it is pragmatic or possible, just correct from a
design point of view.

Commercial software deals with competing requirements, and these
requirements are not constructive towards producing maintainable clean
code. Over time commercial software becomes illiquid with its technical
debt.

There is no real personal reward for paying technical debt, because
almost invariably it takes a lot of time, brings no new revenue, and the
non-coders observing your work only see the outages the debt repayment
caused. Meanwhile the person who creates this debt, by shipping new
invoiceable features and bug fixes in a ra[pb]id manner, is a star to
the non-coder observers.

Not to say our open source networking is always great either. Linux
developers are notorious for not asking SMEs 'how has this problem been
solved in other software'. There are plenty of anecdotes to choose
from, but I'll give one.

- In the 3.6 kernel, FIB was introduced to replace flow-cache. Of course
  anyone dealing with networking could have told kernel developers on
  day 1 why flow-cache was a poor idea, what FIB is, how it is done, and
  why it is a better idea.
- In the 3.6 FIB implementation, ECMP was solved by essentially randomly
  choosing 1 option of many, per-packet. Again they could have asked
  even junior network engineers 'how does ECMP work, how should it be
  done, I'm thinking of doing it like this, why do you think they've not
  done this in other software?' But they didn't.
- In 4.4, random ECMP was changed to hashed ECMP.

I still catch discussions about poor TCP performance in Linux ECMP
environments. I first ask which kernel they are running, then I explain
to them why per-packet + cubic will never ever perform (per-packet
spraying reorders segments within a flow, and the sender treats the
resulting duplicate ACKs as loss).

So for 4 years ECMP was completely broken, and reading the ECMP release
notes in 4.4, not even the developers had completely understood just how
bad the problem was, so we can safely assume people were not running
ECMP.

Another example was when I tried to explain to the OpenSSH mailing list
that 'TOS' isn't a thing, and got a confident reply that TOS absolutely
is a thing, prec/DSCP are not. Luckily a few years later Job fixed
OpenSSH packet classification.

But these examples are everywhere, so it seems you either choose
software written by people who understand the problem but are forced to
write unmaintainable code, or you choose software by people who are just
now learning about the problem and then solve it without discovering
prior art, usually wrongly.

-- 
  ++ytti
_______________________________________________
juniper-nsp mailing list juniper-nsp@puck.nether.net
https://puck.nether.net/mailman/listinfo/juniper-nsp