Re: Frustration with increasing information demands from Network Vendors
Rich Kulawiec wrote on 11/10/2024 15:07: Every answer to every question at every site should be different and every one of them should be wrong. This approach does lead to interesting conversations with tech support people when you explain to them that your birthday is some time in the 1700s or 1800s. But a token is a token, right? Nick (fwiw, my cat's name is "ofo0tL1!Rgz8WPQ+")
Re: Compiling RTG on EL9
Drew Weaver wrote on 12/07/2024 14:37: I am just curious with the demise of EL7 if anyone else is working on trying to compile RTG for EL9. If you don’t know what RTG is it’s just an old SNMP poller/graph plotter that some networks have found useful in the past. Drew, Whoa, that's some blast from the past. At the time of the latest release in 2003, rtg was still duking it out with mrtg and cricket, which was used by the cool kids. Still some good memories there. Out of curiosity I had a look. It barfs at my_thread_init(). Probably this is related to mysql 8.0.2 which removed my_init() entirely as it's now called implicitly from regular mysql api calls, i.e. it can be deleted from the code. Here's the reference in the release notes: https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-2.html#mysqld-8-0-2-compiling After 20 years, most of the code compiles even without warnings, which is pretty good. I'm sure it would be pretty straightforward for a C dev to get it to compile again. Whether you'd want this or not is a different issue :) Nick
Re: Open source Netflow analysis for monitoring AS-to-AS traffic
Tom Beecher wrote on 28/03/2024 18:35: Fundamentally I've always disagreed with how sFlow aggregates flow data with network state data. "can aggregate" rather than "aggregates" - this is implementation dependent and most implementations don't bother with it. Overall, sflow has one major advantage over netflow/ipfix, namely that it's a stateless sampling mechanism. Once you have hardware that can reliably pick out one in N frames, the rest of the protocol is straightforward enough, which means that it's cheap to implement in hardware. If you're ok with 1. sampling and 2. the set of data that sflow provides, then sflow is great. Netflow / ipfix, on the other hand, assumes that it's learning about flow state. For this, you need both a flow lookup mechanism and flow storage memory. Usually the flow lookup mechanism is implemented using the same technology as the packet forwarding lookup mechanism due to performance requirements, i.e. expensive. Similarly, the storage mechanism needs to be fast, which often precludes being large. Often both the lookup and storage mechanism are linked, e.g. tcam. Obviously, not all netflow/ipfix implementations implement flow state, but most do; some implement stateless sampling ala sflow. Also many netflow implementations don't export mac address information, which limits usefulness in certain situations. But this is an implementation gap rather than a protocol weakness. Tools should be chosen to fit the job. There are plenty of situations where sflow is ideal. There are others where netflow is preferable. Nick
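The state difference described above can be illustrated with a toy model. This is only a sketch: the class names are hypothetical, the counter-based 1-in-N selection is a simplification of what real samplers do, and packets are reduced to a (source, destination) pair.

```python
from collections import Counter

class OneInNSampler:
    """Stateless-style sampling: the only state kept is a packet counter."""
    def __init__(self, n):
        self.n = n
        self.seen = 0

    def offer(self, packet):
        # Return the packet if it is selected for export, otherwise None.
        self.seen += 1
        return packet if self.seen % self.n == 0 else None

class FlowCache:
    """Flow-state accounting: one table entry per distinct flow key."""
    def __init__(self):
        self.flows = Counter()

    def offer(self, packet):
        self.flows[packet] += 1

# 1000 packets spread across 100 distinct (src, dst) pairs.
packets = [("192.0.2.1", f"198.51.100.{i % 100}") for i in range(1000)]

sampler = OneInNSampler(100)
cache = FlowCache()
samples = []
for pkt in packets:
    picked = sampler.offer(pkt)
    if picked is not None:
        samples.append(picked)
    cache.offer(pkt)

print(len(samples), "samples exported; sampler state is a single counter")
print(len(cache.flows), "flow table entries held in memory")
```

The sampler's memory use does not grow with the number of flows, whereas the flow table grows by one entry per distinct key, which is where the lookup and storage costs mentioned above come from.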
Re: IPv6 uptake
Michael Thomas wrote on 18/02/2024 21:18: So it has its own wireless? I seem to recall that there were some economic reasons to use their CPE as little as possible to avoid rent. Has that changed? Or can I run down and just buy a Cablelabs certified router/modem these days? There's no short answer to this question. A third party cable modem will work with a basic CM config file if you can convince the cable operator to provision the device, but cable operators don't like running third party kit on their network for a lot of reasons. One of these reasons is bandwidth channel utilisation. Another is support. Another would be software upgrades, which can lead to issues with security. Also, if you use a vanilla cable modem config, you miss out on many of the more interesting configurable bits on cable modems. The root issue here is that cable networks are shared resources, and the cable modem polices the customers' bandwidth utilisation on instruction from the CMTS (head-end cable router) and the provisioning system. The system works well from a technical perspective when the operator has full control of all modems and they're all relatively recent, properly supported units, fully managed by the cable operator. If you start adding poor quality cheap units into the mix, it can cause service problems. For example, some cable modems provide basic spectrum analysers on the provider interface (yes, cable operators can remotely log in to cable modems) and good quality reporting about RF noise. If you get some hobbyist demanding to use their own modem, and then you run into an RF problem at their premises which could normally be diagnosed remotely using the internal cable modem diagnostics, but you can't do this because the customer has used their own kit which doesn't support this, then you've instantly driven up your cost of service because now you need to schedule a call-out for something which could previously have been diagnosed remotely. 
So you can see why this might be frustrating for the cable modem operator. Cable modem rent is a political issue. Nick
Re: IPv6 uptake
Michael Thomas wrote on 18/02/2024 20:56: That's really great to hear. Of course there is still the problem with CPE that doesn't speak v6, but that's not their fault and gives some reason to use their CPE. Already solved: cable modem ipv6 support is usually also excellent, both in terms of subscriber services handoff and management. The requirements for ipv6 support are very clearly defined in the cablelabs docsis 3.0 specification. Nick
Re: IPv6 uptake
Michael Thomas wrote on 18/02/2024 20:28: I do know that Cablelabs pretty early on -- around the time I mentioned above -- has been pushing for v6. Maybe Jason Livingood can clue us in. Getting cable operators onboard too would certainly be a good thing, availability of provider-side ipv6 support is generally excellent on docsis networks. This includes end-user device support, management, client and server side provisioning, the works. This is one of the real ipv6 success stories in the service provider arena. Nick
Re: Networks ignoring prepends?
William Herrin wrote on 22/01/2024 21:26: At which point Centurylink chooses 40676 7489 11875 11875 11875 11875 11875 11875 11875. [...] You're telling me with a straight face that you think that's *reasonable* routing? yep, looks pretty reasonable, if you're Centurylink and 40676 is a Centurylink customer. Besides, I don't want to drop the path to 53356 via 47787. If the path through 20473 fails, the path through 53356 is the next best and I want Centurylink to use it. You have your own ASN, you have control over your own routing policy. Centurylink probably aren't going to be interested in engaging with you if you're not a customer. It's a pickle. Nick
Re: Shared cache servers on an island's IXP
Jérôme Nicolle wrote on 18/01/2024 14:38: Those I'm nearly sure I could get, if I can pool caches amongst ISPs. The current constraints are issues to any content provider, not just for local ISPs. two issues here: the smaller issue is that CDNs sometimes want their own routable IP address blocks, especially if they're connecting directly to the IXP, which usually means /24 in practice. It doesn't always happen, and sometimes the CDN is happy to use provider address space (i.e. IXP), or smaller address blocks. But it's something to note. The bigger issue is: who pays the transit costs for the CDN's cache-fill requirements? CDNs typically won't pay for cache-fill for installations like this, and if one local ISP is pulling disproportionate quantities of data compared to other ISPs at the IXP, then this can cause problems unless there's a shared billing mechanism built in. Nick
Re: IPv4 address block
Matthew Petach wrote on 13/01/2024 00:27: In light of that, I strongly suspect that a second go-around at developing more beneficial post-exhaustion policies might turn out very differently than it did when many of us were naively thinking we understood how people would behave in a post-exhaustion world. Naah, any future relitigation would end up the same if new ipv4 addresses fell out of the sky and became available. The ipv4 address market turned out exactly like most people suspected: it was a market; people bought and sold addresses; the addresses cost money; there were/are some sharks; life moved on. If you limit each requesting organization to a /22 per year, we can keep the internet mostly functional for decades to come, at least in the ripe ncc service region, all this proved was that if the cost of registering a company (or LIR) and applying for an allocation was lower than the market rate of ipv4 addresses, then people would do that. The root problem is unavoidable: ipv4 is a scarce resource with an inherent demand. Every policy designed to mitigate against this will create workarounds, and the more valuable the resource, the more inventive the workaround. In terms of hard landings vs soft landings, what will make ipv6 succeed is how compelling ipv6 is, rather than whether someone created a policy to make ipv4 less palatable. In particular, any effect from a hard landing compared would have been ephemeral. Nick
Re: 202401100645.AYC Re: IPv4 address block
Matthew Petach wrote on 11/01/2024 21:05: I think that's a bit of an unfair categorization--we can't look at pre-exhaustion demand numbers and extrapolate to post-exhaustion allocations, given the difference in allocation policies pre-exhaustion versus post-exhaustion. Matt, the demand for publicly-routable ipv4 addresses would be comparable to before, with the additional pressure of several years of pent-up demand. You're right to say that allocation policies could be different, but we had discussions about run-out policies in each RIR area in the late 2000s and each RIR community settled on particular sets of policies. I don't see that if an additional set of ipv4 address blocks were to fall out of the sky, that any future run-out policies would be much different to what we had before. So 240/4 might last a month, or a year, or two, or be different in each RIR service area, but it's not going to change anything fundamental here, or permanently move the dial: ipv4 will still be a scarce resource afterwards. Nick
Re: IPv4 address block
Christopher Hawker wrote on 11/01/2024 10:54: Reclassifying this space, would add 10+ years onto the free pool for each RIR on this point: prior to RIR depletion, the annual global run-rate on /8s measured by IANA was ~13 per annum. So that suggests that 240/4 would provide a little more than 1Y of consumption, assuming no demand back-pressure, which seems an unlikely assumption. Nick
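As a back-of-envelope check of the figure above: 240/4 contains 16 blocks of /8 size, so at a run-rate of roughly 13 /8s per year it lasts a little over a year. The sketch below only restates that arithmetic; the run-rate is the approximate figure quoted in the message and the function name is made up for illustration.

```python
# A /4 contains 2**(8 - 4) = 16 blocks of /8 size.
SLASH8_BLOCKS_IN_240_4 = 2 ** (8 - 4)

# Approximate pre-depletion run-rate quoted above, in /8s per year.
RUN_RATE_PER_YEAR = 13

def years_of_supply(blocks, run_rate):
    """Years the pool lasts at a constant run-rate (no demand back-pressure)."""
    return blocks / run_rate

print(round(years_of_supply(SLASH8_BLOCKS_IN_240_4, RUN_RATE_PER_YEAR), 2))
```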
Re: 202401100645.AYC Re: IPv4 address block
Dave Taht wrote on 11/01/2024 09:40: 240/4 is intensely routable and actually used in routers along hops inside multiple networks today, but less so as a destination. 240/4 is fine for private use, but the OP needed publicly routable IP addresses, which 240/4 are definitely not. Nick
Re: 202401100645.AYC Re: IPv4 address block
Tom Beecher wrote on 10/01/2024 15:12: ( Unless people are transferring RFC1918 space these days, in which case who wants to make me an offer for 10/8? ) I'm taking bids on 256.0.0.0/8, which is every bit as publicly routable as 240/4. Nick
Re: maximum ipv4 bgp prefix length of /24 ?
William Herrin wrote on 02/10/2023 08:56: All it means is that you have to keep an eye on your FIB size as well, since it's no longer the same as your RIB size. the point Jacob is making is that when using FIB compression, the FIB size depends on both RIB size and RIB complexity. I.e. what was previously a deterministic 1:1 ratio between RIB and FIB - which is straightforward to handle from an operational point of view - becomes non-deterministic. The difficulty with this is that if you end up with a FIB overflow, your router will no longer route. That said, there are cases where FIB compression makes a lot of sense, e.g. leaf sites, etc. Conversely, it's not a generally appropriate technology for a dense dfz core device. It's a tool in the toolbox, one of many. Nick
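The dependence on RIB complexity can be seen in a small sketch. This is not how any particular router implements FIB compression: it is a naive per-next-hop aggregation using Python's ipaddress module, it assumes the RIB contains no overlapping prefixes (the naive merge is not safe otherwise), and the next-hop names are made up.

```python
import ipaddress
from collections import defaultdict

def compressed_fib_size(rib):
    """Count FIB entries after merging adjacent prefixes that share a next hop.

    Assumption: prefixes in `rib` do not overlap each other.
    """
    by_next_hop = defaultdict(list)
    for prefix, next_hop in rib.items():
        by_next_hop[next_hop].append(ipaddress.ip_network(prefix))
    return sum(
        len(list(ipaddress.collapse_addresses(nets)))
        for nets in by_next_hop.values()
    )

# Four /26s that together cover 198.51.100.0/24.
prefixes = [f"198.51.100.{i * 64}/26" for i in range(4)]

# Same RIB size, different complexity.
simple_rib = {p: "peer-a" for p in prefixes}
complex_rib = {
    p: ("peer-a" if i % 2 == 0 else "peer-b") for i, p in enumerate(prefixes)
}

print(len(simple_rib), "RIB entries ->", compressed_fib_size(simple_rib), "FIB entries")
print(len(complex_rib), "RIB entries ->", compressed_fib_size(complex_rib), "FIB entries")
```

Both tables hold four routes, but only the one with a uniform next hop compresses. A routing change that alters next hops can therefore grow the FIB without the RIB growing at all, which is the operational concern raised above.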
Re: Lossy cogent p2p experiences?
Masataka Ohta wrote on 04/09/2023 12:04: Are you saying you thought a 100G Ethernet link actually consisting of 4 parallel 25G links, which is an example of "equal speed multi parallel point to point links", were relying on hashing? this is an excellent example of what we're not talking about in this thread. A 100G serdes is an unbuffered mechanism which includes a PLL, and this allows the style of clock/signal synchronisation required for the deserialised 4x25G lanes to be reserialised at the far end. This is one of the mechanisms used for packet / cell / bit spray, and it works really well. This thread is talking about buffered transmission links on routers / switches on systems which provide no clocking synchronisation and not even a guarantee that the bearer circuits have comparable latencies. ECMP / hash based load balancing is a crock, no doubt about it; it's just less crocked than other approaches where there are no guarantees about device and bearer circuit behaviour. Nick
Re: Lossy cogent p2p experiences?
Masataka Ohta wrote on 03/09/2023 14:32: See, for example, the famous paper of "Sizing Router Buffers". With thousands of TCP connections at the backbone recognized by the paper, buffers with thousands of packets won't cause packet reordering. What you said reminds me of the old saying: in theory, there's no difference between theory and practice, but in practice there is. In theory, you can always fabricate unrealistic counter examples against theories by ignoring essential assumptions of the theories. In this case, "Without buffer bloat" is an essential assumption. I can see how this conclusion could potentially be reached in specific styles of lab configs, but the real world is more complicated and the assumptions you've made don't hold there, especially the implicit ones. Buffer bloat will make this problem worse, but small buffers won't eliminate the problem. That isn't to say that packet / cell spray arrangements can't work. There are some situations where they can work reasonably well, given specific constraints, e.g. limited distance transmission path and path congruence with far-side reassembly (!), but these are the exception. Usually this only happens inside network devices rather than between devices, but occasionally you see products on the market which support this between devices with varying degrees of success. Generally in real world situations on the internet, packet reordering will happen if you use round robin, and this will impact performance for higher speed flows. There are several reasons for this, but mostly they boil down to a lack of control over the exact profile of the packets that the devices are expected to transmit, and no guarantee that the individual bearer channels have identical transmission characteristics. Then multiply that across the N load-balanced hops that each flow will take between source and destination. 
It's true that per-hash load balancing is a nuisance, but it works better in practice on larger heterogeneous networks than RR. Nick
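A toy simulation shows the mechanism described above: when the parallel bearers do not have identical latency, round robin reorders a single flow and per-flow hashing does not. The latencies, packet spacing and flow key are invented for illustration, and the model ignores queueing entirely.

```python
import zlib

def simulate(n_packets, latencies, pick_link):
    """Return (arrival_time, seq) for packets sent one time unit apart."""
    return [(seq * 1.0 + latencies[pick_link(seq)], seq) for seq in range(n_packets)]

def count_reordered(arrivals):
    """Count packets that arrive after a packet with a higher sequence number."""
    reordered = 0
    highest_seen = -1
    for _, seq in sorted(arrivals):
        if seq < highest_seen:
            reordered += 1
        else:
            highest_seen = seq
    return reordered

# Two bearer circuits whose latencies differ (arbitrary time units).
LATENCIES = [10.0, 13.5]

# Round robin: consecutive packets of the same flow alternate between links.
rr = simulate(10, LATENCIES, lambda seq: seq % len(LATENCIES))

# Hashing: every packet of the flow maps to the same link.
flow_key = b"192.0.2.1:40000->198.51.100.1:443/tcp"
flow_link = zlib.crc32(flow_key) % len(LATENCIES)
hashed = simulate(10, LATENCIES, lambda seq: flow_link)

print("round robin reordered packets:", count_reordered(rr))
print("hash-based reordered packets:", count_reordered(hashed))
```

The trade-off is the one noted in the thread: hashing preserves ordering within a flow but cannot spread a single large flow across links.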
Re: Lossy cogent p2p experiences?
Masataka Ohta wrote on 03/09/2023 08:59: the proper thing to do is to use the links with round robin fashion without hashing. Without buffer bloat, packet reordering probability within each TCP connection is negligible. Can you provide some real world data to back this position up? What you said reminds me of the old saying: in theory, there's no difference between theory and practice, but in practice there is. Nick
Re: Lossy cogent p2p experiences?
Masataka Ohta wrote on 02/09/2023 16:04: 100 50Mbps flows are as harmful as 1 5Gbps flow. This is quite an unusual opinion. Maybe you could explain? Nick
Re: JunOS/FRR/Nokia et al BGP critical issue
Bjørn Mork wrote on 01/09/2023 10:52: But there's obviously not been enough thought applied to realize that optional transitive attributes must be considered evil by default. They can only be used after extremely careful parsing. This is the BGP version of select * from mytable where field = $unvalidated_user_input; it's not really. If the receiving BGP stack understands the attribute, then it should be parsed as default, i.e. carefully. Unfortunately, junos slipped up on this and didn't validate the input correctly, which is a parsing bug. Param validation bugs happen. They shouldn't happen, but they do. If an intermediate router doesn't understand a transitive attribute, it should be ignored, and life should move on. The problems arise in two situations: 1. malformed attribute, i.e. this situation. 2. vendors squatting path attribute values which are then assigned for other purposes. This is a subset of #1, but is messy and difficult to rectify when it happens. Great for fuzzing, not so good for production networks. Nick
Re: JunOS/FRR/Nokia et al BGP critical issue
Bjørn Mork wrote on 01/09/2023 08:17: Sounds familiar. https://supportportal.juniper.net/s/article/BGP-Malformed-AS-4-Byte-Transitive-Attributes-Drop-BGP-Sessions?language=en_US You'd think a lot of thought has gone into error handling for optional transitive attributes since then, but... A good deal of thought has gone into the problem, and this is where rfc7606 came from. Treat-as-withdraw for the NLRI in question is the default option with this approach, and should be deployed universally. Nick
Re: JunOS config yacc grammar?
Lyndon Nerenberg (VE7TFX/VE6BBM) wrote on 22/08/2023 01:27: Because I've been writing yacc grammars for decades. I just wanted to see if someone had already done it, as that would save me some time. But if there's nothing out there I'll just roll one myself. check out xorp and vyos - both contain code to parse junos style configurations. Just bear in mind that they provide basic tokeniser functionality, which parses the configurations into token trees. The config interpretation can then be handled on a modular basis. Nick
Re: JunOS config yacc grammar?
Lyndon Nerenberg (VE7TFX/VE6BBM) wrote on 21/08/2023 22:14: Any chance somebody out there has a yacc grammar that will parse a Juniper config files? My immediate interest involves v19.X on our EX4300s, but anything in the ballpark would save me having to write one from scratch. No need to reinvent that wheel: root@foo> show configuration | display xml root@foo> show configuration | display json ... then slurp into an ingestion engine in your favourite language. Nick
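A minimal sketch of the "slurp into an ingestion engine" step, assuming the output of "show configuration | display json" has been saved as text. The sample document below is hand-written for illustration and its field layout is an assumption, not captured device output; the flatten helper is hypothetical.

```python
import json

# Hand-written sample standing in for saved "display json" output (assumed layout).
SAMPLE = """
{
  "configuration": {
    "system": {"host-name": "foo"},
    "interfaces": {
      "interface": [
        {"name": "ge-0/0/0", "description": "uplink"}
      ]
    }
  }
}
"""

def flatten(node, path=()):
    """Yield (slash-separated path, value) pairs for every leaf in the tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from flatten(item, path + (str(index),))
    else:
        yield "/".join(path), node

config = json.loads(SAMPLE)
leaves = dict(flatten(config))
for leaf_path, value in leaves.items():
    print(leaf_path, "=", value)
```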
Re: Dodgy AS327933 ...?
Mike Hammett wrote on 15/08/2023 23:02: I'd say it's probably the best router UI ever, but I suppose now we'll find ourselves in a religious argument. Whatever about the web / winbox UI, there are some fairly serious weaknesses in the cli and api: 1. there's no atomic configuration commit + auto rollback. 2. the CLI is non-idempotent, for example if you're in a list context and issue the command "remove 1", it will do different things each time you execute it. 3. there is no way to delete the configuration tree or sub-trees (e.g. "config replace"), which outright blocks the possibility of clean-slate reconfiguration. 4. as a consequence of #1 and #3, it's not possible to blindly change the config on a routeros device without parsing the existing configuration. The net outcome is that orchestration is basically impossible on this platform, and it's not possible to fix. It would need a complete CLI/API redesign. Nick
Re: Dodgy AS327933 ...?
Malte Tashiro wrote on 12/08/2023 04:50: Looking at this I also saw that for a short time some prefixes belonging to AS37451 were announced by AS2454388738 (see [0] and [1]). Anybody have a smart idea which command could have caused this? AS2454388738 == AS37451.2, in asdot format. Nick
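The conversion is easy to check: asdot notation writes a 4-byte ASN as high-16-bits.low-16-bits, and ASNs below 65536 stay in plain form. A small sketch, with function names chosen for illustration:

```python
def asplain_to_asdot(asn):
    """Convert an ASN from asplain to asdot notation."""
    if not 0 <= asn <= 0xFFFFFFFF:
        raise ValueError("ASN out of 32-bit range")
    high, low = divmod(asn, 65536)
    # asdot leaves 2-byte ASNs in plain form.
    return str(low) if high == 0 else f"{high}.{low}"

def asdot_to_asplain(text):
    """Convert an ASN from asdot (or plain) notation to an integer."""
    if "." not in text:
        return int(text)
    high, low = (int(part) for part in text.split("."))
    return high * 65536 + low

print(asplain_to_asdot(2454388738))
print(asdot_to_asplain("37451.2"))
```

In other words, the odd-looking origin is 37451 in the high 16 bits followed by a 2 in the low 16 bits.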
Re: Dodgy AS327933 ...?
Mark Tinka wrote on 11/08/2023 10:33: It is not terribly clever of Mikrotik to have two commands that do different things be that close in syntax. no, indeed. That said, why are we giving the routers the ability to manually generate AS_PATH's? On any router OS, this is simply asking for it. bgp is a policy based distance vector protocol. If you can't adjust the primary inter-domain metric to handle your policy requirements, it's not much use. Nick
Re: Dodgy AS327933 ...?
Mark Tinka wrote on 11/08/2023 10:17: So how would one fumble it to the degree where a fat-finger results in what should be a prepend becoming an AS_PATH? Genuine question - I have zero experience with Mikrotik in an SP role. If your asn is 327933, then: add chain=foo prefix=192.0.2.0/24 action=accept set-bgp-prepend=2 ... will produce: "327933 327933", and: add chain=foo prefix=192.0.2.0/24 action=accept set-bgp-prepend-path=2 ... will produce: "327933 2". Routeros does command completion on the CLI, so this is finger-slip territory, and the two commands are visually similar enough to each other that it would be easy not to notice. Nick
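The difference between the two actions can be modelled in a few lines. This is a toy model of the behaviour described in the message above, not RouterOS code; the function and parameter names simply mirror the two filter actions.

```python
LOCAL_ASN = 327933

def exported_as_path(local_asn, set_bgp_prepend=None, set_bgp_prepend_path=None):
    """Toy model of the AS_PATH exported for a locally originated prefix."""
    if set_bgp_prepend is not None:
        # A count: the local ASN appears that many times in total.
        return [local_asn] * set_bgp_prepend
    if set_bgp_prepend_path is not None:
        # A literal list of ASNs: the values themselves end up in the path.
        return [local_asn] + list(set_bgp_prepend_path)
    return [local_asn]

def foreign_asns(path, local_asn):
    """ASNs in a locally originated path that are not the local ASN."""
    return [asn for asn in path if asn != local_asn]

intended = exported_as_path(LOCAL_ASN, set_bgp_prepend=2)
slipped = exported_as_path(LOCAL_ASN, set_bgp_prepend_path=[2])

print(intended, foreign_asns(intended, LOCAL_ASN))
print(slipped, foreign_asns(slipped, LOCAL_ASN))
```

A pre-deployment check along the lines of foreign_asns would flag the second case, because a locally originated route should not normally carry anyone else's ASN.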
Re: Dodgy AS327933 ...?
Mark Tinka wrote on 11/08/2023 09:43: Did I miss the memo where vendors went from explicitly defining the AS multiple times to determine the number of prepends, to, this :-)? yep, sure did. Check out the "set-bgp-prepend" action on routeros - it's right next to "set-bgp-prepend-path". https://wiki.mikrotik.com/wiki/Manual:Routing/Routing_filters Nick
Re: Prepending
Sandoiu Mihai wrote on 18/10/2022 12:59: We have witnessed a lot of prepending in the last days, we got a few internet routes that have 30…200 prepends, did you face the same issue? Not sure that this is causing an operational problem? If you don't like it, then nothing is stopping you from implementing an excess prepending policy. Nick
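One way to express an excess prepending policy is to reject any path in which a single ASN repeats more than some number of times in a row. The sketch below shows only the logic: the threshold of 10 is an arbitrary assumption, the ASNs come from the documentation range, and a real deployment would express this in the router's own policy language.

```python
from itertools import groupby

def longest_prepend_run(as_path):
    """Length of the longest run of one ASN repeated consecutively."""
    return max((sum(1 for _ in run) for _, run in groupby(as_path)), default=0)

def accept_path(as_path, max_run=10):
    """Accept a path unless some ASN is repeated more than max_run times in a row."""
    return longest_prepend_run(as_path) <= max_run

normal = [64496, 64497, 64498, 64498, 64498]
excessive = [64496, 64497] + [64498] * 30

print(longest_prepend_run(normal), accept_path(normal))
print(longest_prepend_run(excessive), accept_path(excessive))
```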
Re: 400G forwarding - how does it work?
Masataka Ohta wrote on 07/08/2022 12:16: Ethernet switches with small buffer is enough for IXes That would not be the experience of IXP operators. Nick
Re: Newbie x Cisco IOS-XR x ROV: BCP to not harassing peer(s)
Hank Nussbacher wrote on 14/05/2022 19:15: In the end, the reason for all this RPKI-thingy is to prevent route spoofing by malicious actors. a malicious actor will spoof the origin AS. The aim of RPKI is to help stop mis-origination of prefixes, and the root cause of most of this is accidental. Nick
Re: Sabotage: several severed cables at the origin of a major internet outage in France
+ pics: https://twitter.com/acontios_net/status/1519296590015606787 https://twitter.com/acontios_net/status/1519280710762348545 https://twitter.com/acontios_net/status/1519276453350805504 Nick Paul Ferguson wrote on 27/04/2022 15:17: On 4/27/22 7:08 AM, Sean Donelan wrote: Multiple physical cable cuts in multiple diverse locations in France. Several networks that connect the internet infrastructures of major French cities were cut overnight, in a short interval. A state source evokes with "the Obs" a "coordinated malicious act", which confirms SFR and Free affected. An investigation has been opened. https://www.nouvelobs.com/faits-divers/20220427.OBS57722/plusieurs-cables-sectionnes-a-l-origine-d-une-importante-panne-internet-en-france.html English language news article, fwiw: https://www.telegraph.co.uk/world-news/2022/04/27/internet-multiple-cities-across-france-suspected-sabotage/ Cheers, - ferg
Re: 2749 routes AT RISK - Re: TIMELY/IMPORTANT - Approximately 40 hours until potentially significant routing changes (re: Retirement of ARIN Non-Authenticated IRR scheduled for 4 April 2022)
Kenneth Finnegan wrote on 04/04/2022 21:05: I've taken it upon myself to create proxy registrations for all of these prefixes in ALTDB. Please don't. You're not doing the routing security ecosystem any favours by doing this. Couple of reasons why: 1. this isn't your data and this is an unexpected action on the part of the registrants, 2. this is a sure-fire way of accumulating even more cruft in ALTDB in a way which is troublesome to clean up afterwards, 3. there are several thousand objects in there which are already marked as proxy registrations, and are already likely to be inaccurate, 4. you're losing authentication information for people to self-manage their registrations, and 5. you have likely not cross-checked this data against RIR transfers / de-registrations - it's not really possible to do this with the arin-nonauth db because that db doesn't include the last-modified timestamp, and the changed: attribute is unreliable. Nick
Re: MAP-T
Bjørn Mork wrote on 27/03/2022 10:42: Yes, for traditional mobile (i.e handsets) the picture is completely different. Same view shows an average of 85% IPv6 on mobile access: https://munin.fud.no/vg.no/www.vg.no/vg_ds_telenor_mobil.html from the point of view of cgnat scaling, a more useful figure would be the number of ipv6 "sessions" vs natted ipv4 sessions. It's well established that many of the highest volume traffic sources on the internet are ipv6 enabled, but the long tail is definitely not. I.e. throughput is not necessarily a useful data point for substantiating many of the claims that are made here and elsewhere about ipv6 popularity. Nick
Re: What do you think about this airline vs 5G brouhaha?
nano...@mulligan.org wrote on 19/01/2022 21:57: If you look at 5G deployments around Japan and Europe, generally they are NOT right up next to major airports. You might want to fact-check this claim. Most airports have cell towers nearby, particularly international airports. Whatever about Japan, Europe assigned 3300 - 3800 MHz for 5G, which is a good deal further away from the radio altimeter allocation than the US 5G allocation of 3700 - 4000 MHz. Nick
Re: What do you think about this airline vs 5G brouhaha?
Mel Beckman wrote on 18/01/2022 21:25: /The collective tech industry needs to admit that it made a huge blunder when it urged the FCC’s clueless Ajit Pai to “blow off” the clearly demonstrated FAA spectrum conflict. Sorry, passengers, but if you look out your window, you’ll see that aviation owns this spectrum and is entitled to interference-free operation. Replacing all radar altimeters isn’t going to happen in time for 5G anyway — it took more than ten years just to deploy anti-collision technology. So do what you should have done from the beginning: follow the FCC rules of non-interference to existing users, who have clear priority in this case.”/ The original fixed satellite comms (space-to-earth) allocation was 3700-4200MHz, which was split into two parts in 2020: a mobile wireless spectrum allocation on 3700MHz to 4000MHz (for 5G) with 4000-4200MHz remaining allocated to satellite comms. The 4200-4400MHz range is allocated to aeronautical navigation and is used for radio altimeters. So by rights, aviation doesn't now and never did own this spectrum. That said, spectrum bleed on radio transmitters is something that happens, and I've no doubt that there are plenty of broken altimeter receiver antennas out there which will pick up signals outside their formal allocation of 4200-4400MHz. Regularly tested band pass filters should deal with most of this. Even if technically the aeronautical sector doesn't own this spectrum, the consequences of transmitter or receiver bleed from nearby allocations could be serious for the same reason that if someone walks out on a pedestrian crossing without checking and gets mown down by a drunk driver, they're not going to be jubilantly talking at their funeral about how at least they were acting within their rights. Nick
Re: Long hops on international paths
PAUL R BARFORD wrote on 18/01/2022 14:48: So, the question is what is the cost/benefit to providers to configure/maintain routes (that include long MPLS tunnels) that tend to concentrate international connectivity at a relatively small number of routers? the cost of mpls TE is pretty low: a couple of extra config lines per LSP. The benefit can be substantial in terms of having fine-grained control of how packets traverse a network, and allow optimisation of specific policy outcomes, e.g. cost / latency / throughput / pktloss / qos / etc. Nick
Re: SOHO IPv6 switches
Sean Donelan wrote on 18/01/2022 11:28: The top two capabilities: 1) MLD snooping and 2) a simple way to keep IPv6 off certain ports (i.e. ancient 10/100 devices, which don't like it. controlling the multicast floods may also help them). Most people don't use ipv6 multicast in anger (i.e. anything more than nd / bonjour / etc), so mld snooping isn't that important for small switches. For proper device access control, you also need the ability for the switch to do ND/RA + DHCP snooping / filtering. Otherwise you open yourself to rogue routers and/or address assignment. Nick
Re: Long hops on international paths
PAUL R BARFORD wrote on 17/01/2022 18:02: For example, there is a router operated by Telia (AS1299) in Chicago that has a high concentration of such links. this doesn't appear to match 1299's public network topology: https://www.teliacarrier.com/our-network.html Is ttl decrement disabled on the test paths you're measuring? Broadly speaking, if you have a point-to-point link from one location to another (or parallel set of links with a common failure path, e.g. waves on a specific fibre path), there's a single router at each end. Nick
Re: Log4j mitigation
The log4j people have updated their security advisory to say that these two mitigation measures are not sufficient to protect against the recent vulnerability: 2. start java with "-Dlog4j2.formatMsgNoLookups=true" (v2.10+ only) 3. start java with "LOG4J_FORMAT_MSG_NO_LOOKUPS=true" environment variable (v2.10+ only) The current recommended fixes are: 1. upgrade to 2.16.0 (not 2.15.0), or 2. remove the JndiLookup.class file from log4j-core-*.jar More details on: https://logging.apache.org/log4j/2.x/security.html Nick
Re: Log4j mitigation
Andy Ringsmuth wrote on 11/12/2021 03:54: The intricacies of Java are over my head, but I’ve been reading about this Log4j issue that sounds pretty bad. What do we know about this? What, if anything, can a network operator do to help mitigate this? Or even an end user? The payload can be contained in https, so there is no way of detecting / stopping this at the network level. Installations need to be upgraded / fixed. https://logging.apache.org/log4j/2.x/security.html 1. upgrade log4j to 2.15.0 and restart all java apps 2. start java with "-Dlog4j2.formatMsgNoLookups=true" (v2.10+ only) 3. start java with "LOG4J_FORMAT_MSG_NO_LOOKUPS=true" environment variable (v2.10+ only) 4. zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class There's a lot of scanning going on at the moment, so if you have an exposed java instance running something which includes log4j2, you may already be compromised. Nick
Re: Anyone else seeing DNSSEC failures from EU Commission ? (european-union.europa.eu)
Ca By wrote on 09/12/2021 14:36: Just saying, facts are on my side. Check the number of times dnssec caused an outage. Then check the number of hacks prevented by dnssec. Literally 0. it serves a purpose. There are plenty of actors, both public and private sector, who would be happy to announce their own local .root-servers.net address blocks, with consequent security issues for all end users at the receiving end (+ leakage causing collateral damage). For all its other flaws, dnssec makes this style of dns compromise difficult. Nick
Re: .bv ccTLD
Jaap Akkerhuis wrote on 04/12/2021 21:13: Similar ideas were held for MD and TM but didn't seem to work out. Furthermore, an independent Bougainville might change the name to something else (as Zimbabwe did). this is not unusual: .tp became one of the shortest-lived cctlds, and was dropped in favour of .tl. Apparently, there are two hard problems facing newly-created states: cash invalidation and naming things. Nick
Re: Redeploying most of 127/8, 0/8, 240/4 and *.0 as unicast
Joe Maimon wrote on 19/11/2021 14:30: It's very viable, since it's a local support issue only. Your ISP can advise you that they will support you using the lowest number and you may then use it if you can. All you may need is a single patched/upgraded router or firewall to get your additional static IP online. That would be an entertaining support phone call with grandma. So, she gets a new CPE which issues 192.168.1.0 to her laptop and .1 to her printer, and then her printer can no longer talk to her laptop. I'm sure that the ISP would be happy to walk her through doing a firmware upgrade on her printer or that her day would end up better for having learned about DHCP assignment policies on her CPE. They could even email her a copy of the RFC and a link to the IETF working group if she felt there was a problem. Nick
Re: Redeploying most of 127/8, 0/8, 240/4 and *.0 as unicast
John Gilmore wrote on 19/11/2021 01:54: Lowest address is in the most recent Linux and FreeBSD kernels, but not yet in any OS distros. lowest addresses will not be viable until widely supported on router (including CPE) platforms. This is hard to test in the wild - ripe atlas will only test the transit path rather than the local connection. I.e. it's not clear that what you're measuring here is a valid way of working out whether a lowest address is generally going to work, because .0 has been mostly accepted in the transit path since the 1990s (bit alarming to see that it's still not universal). The other risk with the lowest address proposal is that it will break network connectivity transitivity with no fallback or detection mechanism. I.e. consider three hosts on a broadcast domain: A, B and C. A uses the lowest address, B accepts a lowest address, but C does not. Then A can talk to B, B can talk to C, but C cannot talk to A. This does not seem to be addressed in the draft. Nick
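The three-host example can be written out as a toy model. This is only a sketch of the argument, not of any real stack: the two flags per host are hypothetical, and "can talk" is assumed to mean that each side is willing to treat the other's address as a valid unicast peer.

```python
from itertools import permutations

# Hypothetical per-host behaviour on a single broadcast domain.
hosts = {
    "A": {"uses_lowest": True, "accepts_lowest": True},
    "B": {"uses_lowest": False, "accepts_lowest": True},
    "C": {"uses_lowest": False, "accepts_lowest": False},
}


def can_talk(x, y):
    """Two hosts can talk only if each one accepts the other's address."""
    x_accepts_y = hosts[x]["accepts_lowest"] or not hosts[y]["uses_lowest"]
    y_accepts_x = hosts[y]["accepts_lowest"] or not hosts[x]["uses_lowest"]
    return x_accepts_y and y_accepts_x


def reachability():
    """Return a map of every ordered host pair to whether it can communicate."""
    return {(x, y): can_talk(x, y) for x, y in permutations(hosts, 2)}


if __name__ == "__main__":
    for pair, ok in sorted(reachability().items()):
        print(pair, "ok" if ok else "BROKEN")
```

Under these assumptions the A-B and B-C pairs work and the A-C pair does not, and nothing in the model gives A or C a way to detect that.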
Re: Redeploying most of 127/8, 0/8, 240/4 and *.0 as unicast
John Gilmore wrote on 18/11/2021 19:37: There will be no future free-for-all that burns through 300 million IPv4 addresses in 4 months. this is correct not necessarily because of the reasons you state, but because all the RIRs have changed their ipv4 allocation policies to policies which assume complete or near-complete depletion of the available pools, rather than policies which allocate / assign on the basis of stated requirement. For sure, organisations were previously requesting more than they needed, but if stated-requirement were reinstituted as a policy basis, the address space would disappear in a flash. The point remains that 127/8, 0/8, 240/4 are problematic to debogonise, and are not going to make a dramatic impact to the availability of ipv4 addresses in the longer term. Same with using the lowest ip address in a network block. Nice idea, but 30 years late. There's no problem implementing these ideas in code and quietly using the address space in private contexts. Nick
Re: WKBI #586, Redploying most of 127/8 as unicast public
John Levine wrote on 18/11/2021 03:03: The amount of work to change every computer in the world running TCP/IP and every IP application to treat 240/4 as unicast (or to treat some of 127/8) is not significantly less than the work to get them to support IPv6. So it would roughly double the work, for a 2% increase in the address space, or for 127/8 less than 1%. The code for IPv6 is already written, after all. Also, while the world has run out of free IPv4 address space, there is plenty of IPv4 if you are willing to pay for it. A 2% increase in v4 addresses would not change that. putting more numbers on the table, the pre-exhaustion burn rate of unallocated ipv4 address space was around 13 x /8 a year, i.e. a /8 every four weeks. The ask is to update every ip stack in the world (including validation, equipment retirement, reconfiguration, etc) and the gain is 4 weeks of extra ip address space in terms of estimated consumption. Nick
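The arithmetic behind the "4 weeks" figure is short enough to sketch. The burn rate is the one quoted above; the function name and the 52-week year are assumptions for illustration, and the result is only as good as the premise that consumption would return to the pre-exhaustion rate.

```python
import ipaddress

SLASH8_PER_YEAR = 13   # pre-exhaustion burn rate quoted above
WEEKS_PER_YEAR = 52    # assumption: round the year to 52 weeks


def weeks_of_supply(prefix, burn_rate=SLASH8_PER_YEAR):
    """Estimate how many weeks a block would last at the given /8-per-year rate."""
    net = ipaddress.ip_network(prefix)
    slash8_equivalents = net.num_addresses / 2**24  # a /8 holds 2^24 addresses
    return slash8_equivalents * WEEKS_PER_YEAR / burn_rate


if __name__ == "__main__":
    for block in ("127.0.0.0/8", "240.0.0.0/4"):
        print(block, weeks_of_supply(block), "weeks")
```

A single /8 works out at 4 weeks; the same calculation applied to the whole of 240/4 (sixteen /8s) gives 64 weeks.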
Re: DNS hijack?
Stephane Bortzmeyer wrote on 13/11/2021 09:25: To my mind, I simply don't understand why some people continue to use Network Solutions, with the track record they have. indeed. one aspect of this is that it's unusually difficult to migrate away compared to other registrars. Only Primary Contact accounts can request an auth code - normal "admin" accounts can't, and there's no indication about how to work around this; they unnecessarily delay issuing the epp code for 5 days; there are several prominent options for renewing the domain (can't change your mind if you do this), and only one for transferring (lots of options to change your mind). During the transfer process, several emails are issued, all which lead back to renewal. When it's all completed, the only way to formally close an account is over the phone. Also, the exorbitant renewal pricing isn't available until you log in. And you will need to prepare for a shock if the domain expires (no notification to standard "admin" contacts either). I had this little gem from NetSol for an expired domain last year: https://i.imgur.com/Vtp7BX7.png I.e. $36 for reinstatement and $40 for 1y renewal. The other option was losing the domain entirely. There are plenty of other registrars which are completely super to deal with. Nick
Re: possible rsync validation dos vuln
Barry Greene wrote on 29/10/2021 13:15: "The NCSC will try to resolve the security problem that you have reported in a system within 60 days. Once the problem has been resolved, we will decide in consultation whether and how details will be published." I would have expected you to counsel the researchers on responsible disclosure principles. there's a public statement about this from NCSC-NL: https://www.ncsc.nl/actueel/nieuws/2021/oktober/29/aanstaande-bekendmaking-cvd-procedure-rpki "In dit proces is een afweging gemaakt om de ontwikkelaar van RPKI-client pas later te informeren. Deze afweging is gemaakt op basis van het publieke standpunt van deze ontwikkelaars, namelijk steun voor ‘full disclosure’. De ontwikkelaars van RPKI-client hebben het NCSC laten weten dat zij niet akkoord gaan met betrokkenheid onder embargo." "During this process, a decision was made to inform the developer of RPKI-client at a later stage. This decision was made on the basis of the public standpoint of these developers, namely support for 'full disclosure'. The developers of RPKI-client have let the NCSC know that they do not agree with involvement under embargo." Looks like the NCSC got confused about OpenBSD's internal security vuln management process, which involves full disclosure on their terms, and the way they operate with disclosures from third parties / multiparty engagement, which involves co-operation with the disclosing party / CERT about mutually acceptable terms, including co-ordinated disclosure, i.e. standard industry practice. Some public clarity from the openbsd people would help here. + there was a screwup with the rcynic developers. It's a bit much to claim that the openbsd (+ rcynic) people didn't agree with involvement under embargo when the terms were apparently: we're releasing details in 4 days and will only tell you what the problem is if you agree to this. 
Regardless of how this misunderstanding came about, this style of approach doesn't form part of an acceptable vulnerability management process. Nick
Re: possible rsync validation dos vuln
Barry Greene wrote on 29/10/2021 13:15: That only happens if the team has the time to get the fix into the code, tested, validated, regressed, and deployed. I would say this is a classic example of "ego" to publish overruling established principles. The University of Twente should explore requiring classes for responsible disclosure. NCSC, it seems you threw out your own policy: "The NCSC will try to resolve the security problem that you have reported in a system within 60 days. Once the problem has been resolved, we will decide in consultation whether and how details will be published." I would have expected you to counsel the researchers on responsible disclosure principles. Indeed + also manage the vendor disclosure process in a more comprehensive / structured way. An interesting and worthwhile outcome here would be a presentation on how the set of inputs into the sausage factory produced the mess that's going to be served for lunch on monday. I.e. let's use this as an opportunity to learn from the mistakes that were made here. Nick
Re: possible rsync validation dos vuln
Randy Bush wrote on 29/10/2021 02:03: received this vuln notice four days before these children intend to disclose. so you can guess how inclined to embargo. The position doesn't seem to be compatible with e.g. https://www.first.org/global/sigs/vulnerability-coordination/multiparty/FIRST-Multiparty-Vulnerability-Coordination.pdf At the top of the FIRST list: 1. Establish a strong foundation of processes and relationships 2. Maintain clear and consistent communications 2.1. All parties should clearly and securely communicate and negotiate expectations and timelines. Because this didn't happen, we now get to look forward to a weekend of elevated risk, followed by people upending their calendars to handle un-coordinated upgrades on monday morning. Vulnerability researchers perform a valuable service, but enthusiasm needs to be tempered with an understanding that there are real life consequences to not handling this sort of thing in a well-structured way. It doesn't need to be said that: "1. we screwed up with your email address, and 2. we're disclosing in 4 days and aren't telling you what the problem is unless you agree to our terms" is not an appropriate way of handling anything, whatever about claiming to speak on behalf of an NCSC. This won't be the last time a screw-up of this form happens, so maybe NCSC-NL's takeaway should be to ensure that co-ordinated vuln management and disclosure happens in a reasonable way when engaging with all parties? As a separate thing, software authors also need to have clearly defined security notification points and vulnerability management policies. Most have in this situation, but not all. Nick randy From: Koen van Hove Subject: CVD: Vulnerabilities in RPKI Validators To: ra...@psg.com, s...@hactrn.net Cc: c...@ncsc.nl Date: Wed, 27 Oct 2021 14:59:21 -0700 Dear Randy Bush and Rob Austein, Apologies, this email was previously sent to the wrong email address. 
On behalf of the University of Twente and the National Cyber Security Centre of the Netherlands (NCSC-NL) we want to notify you of a Coordinated Vulnerability Disclosure for RPKI vulnerabilities that also impact rcynic developed by Dragon Research Labs. The vulnerabilities were discovered by scientific research on the implementation of RPKI validators. Together with you, the NCSC-NL, the University of Twente, and multiple other parties, we would like to come to a timely solution before the results of this research will be made public. More information about Coordinated Vulnerability Disclosure can be found here [1]. The vulnerabilities are classified as a denial of service vulnerability and impact multiple implementations of RPKI validators including rcynic. Since RPKI is of international interest we hope that you will work together with us on this CVD. The goal is to have fixes available before 1 November which will also be the date that the results of this research will become public. Before 1 November the information in the CVD, or the fact that a CVD is taking place, is to be kept strictly confidential. The fixes are to be released collectively on 1 November. Please let us know whether you agree to these terms, and want to participate in this CVD. If so, we will send you the details. We hope to hear from you. If there are any further questions, please let us know. 
Yours sincerely, Koen van Hove University of Twente [1] https://english.ncsc.nl/contact/reporting-a-vulnerability-cvd
Re: IRR for IX peers
Randy Bush wrote on 07/10/2021 15:26: it was sabotaged there was more to it than that. The grammar was too complicated to easily describe common policies and too limited to describe complex policies. The structure was difficult to extend when the routing became more complicated (e.g. mpls, route servers, ipv6, complex ibgp, etc). The tooling was too complicated for anyone to understand properly how it worked and too early to benefit from later additions, e.g. scripting language plugins. If it had been an easy problem domain to fix, it would have been fixed a long time ago, but it wasn't. Nick
Re: IRR for IX peers
Randy Bush wrote on 04/10/2021 21:15: i was hoping that, if 3130 said it is peering with martha, artemis would get a clue and stfu right. This was klunked around using the export-via and import-via rpsl constructions (draft-snijders-rpsl-via), which never quite made it to ietf wg adoption status. It did, however, point out how limited RPSL grammar was :( Nick
Re: IRR for IX peers
Randy Bush wrote on 04/10/2021 17:44: what are others in this space doing? not using import/export lines in their RS or router configs, for starters. Probably you could count the number of IXPs that inspect import/export lines on the fingers of one hand, and possibly of one finger. Generally speaking, IXPs try to aim for filters based on a single {as-set,IRRDB set} tuple per RS client configured. If you're aiming for bilat bgp sessions, then this functionality would need to be replicated. Nearly 30 years on, this is still the state of the art. Nick
Re: uPRF strict more
Saku Ytti wrote on 29/09/2021 07:03: Having said that, I'm not convinced anyone should use uRPF at all. Because you should already know what IP addresses are possible behind the port, if you do, you can do ACL, and ACL is significantly lower cost in PPS in a typical modern lookup engine. urpf has its place if your network config build processes aren't automated to the point that it's no longer necessary. It would be a net security loss to the internet not to have it widely implemented on access devices. Nick
Re: IPv6 woes - RFC
Valdis Klētnieks wrote on 26/09/2021 01:44: 19:17:38 0 [~] ping 2130706433 PING 2130706433 (127.0.0.1) 56(84) bytes of data. 64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.126 ms 64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.075 ms 64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.063 ms 64 bytes from 127.0.0.1: icmp_seq=4 ttl=64 time=0.082 ms ^C --- 2130706433 ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 84ms rtt min/avg/max/mdev = 0.063/0.086/0.126/0.025 ms Works on Fedora Rawhide based on RedHat, Debian 10, and Android 9. this is a good example of "might work but don't depend on it". The fact that it works at all is a historical curiosity which happened because the text format for ipv4 addresses was never formally specified, so when some of the tcp/ip code was originally written for bsd, it accepted unusual formats in some places, including: integers, octal, hex, binary and assuming zeros when the address is incompletely specified, among other things. The octal representation was a real problem because rfc790 specified decimal dotted quad notation with leading zeros, leading to a whole bag of pain for parsers because there is no way of knowing what a leading zero means in practice, and for 3-digit numbers where each digit is <= 7, there is no a-priori way of determining whether it's octal representation or decimal. Nick
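Both points can be shown with the Python standard library. This is a sketch only: the helper names are invented for illustration, and it says nothing about what any particular resolver or ping implementation actually does with such input.

```python
import ipaddress


def int_to_dotted_quad(value):
    """Interpret a plain integer as a 32-bit IPv4 address."""
    return str(ipaddress.IPv4Address(value))


def leading_zero_readings(octet_text):
    """Return the (decimal, octal) readings of one textual octet.

    The octal reading is None when the text contains an 8 or a 9,
    which is the only case where the ambiguity resolves itself.
    """
    decimal = int(octet_text, 10)
    is_octal_candidate = all(ch in "01234567" for ch in octet_text)
    octal = int(octet_text, 8) if is_octal_candidate else None
    return decimal, octal


if __name__ == "__main__":
    print(int_to_dotted_quad(2130706433))
    for text in ("010", "089", "127"):
        print(text, leading_zero_readings(text))
```

The integer from the ping above maps to 127.0.0.1, while an octet written as "010" can legitimately be read as either 10 or 8, which is exactly the parser problem described.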
Re: IPv6 woes - RFC
Randy Bush wrote on 13/09/2021 19:22: the specs as originally RFCed by the ietf is very telling. for your amusement, take a look at rfc 2450. it took five years of war to get rid of the tla/sla crap. and look at the /64 religion today[0]. architectural decisions were made because of a mixture of actual and perceived problems at the time, but several of the outcomes make little sense now or in some cases, actively cause problems. E.g. using mcast for address resolution because large flat l2 networks were the order of the day, that "privacy addresses" would give privacy, that client self-selected addresses should be the only game in town for auto-addressing (it took years to get any form of dhcp through), that extension headers were a great idea, that "transition mechanisms" would be viable for fundamentally incompatible protocols, etc. That said, it's easy to be critical of design decisions with 25y of hindsight, and even easier to understate how difficult it is to dislodge ipv4 which took 40 years of evolution to cement itself into its current position. Nick
Re: PeerinDB refuses to register certain networks [was: Setting sensible max-prefix limits]
Sabri Berisha wrote on 19/08/2021 00:57: On Aug 18, 2021, at 4:03 PM, Rubens kuhlrube...@gmail.com wrote: Hi, Currently RPKI can only validate origin, not paths. If/when a path validation solution is available, then one easy way to know that network A really means to peer with network B is to publish a path validation that B can use and/or forward A's announcements. Yes, that would be a relatively easy thing to calculate. if this were easy, we'd have solved the problem space years ago. It's complicated because the description mechanism needs to be able to describe the complete set of all inter-as connectivity arrangements written in a language which is simple enough for people to be able to update it easily, which can be parsed by language interpreters relatively easily (allowing toolkits to be written), and which is flexible enough to output complex instructions including optimized regexps, routing metrics, etc, on a per-prefix, per-asn, per-interconnection point basis. RPSL attempted these things and probably failed on all three points. There have been some other attempts, but none came up with any usable outputs. Nick
Re: "Tactical" /24 announcements
Jon Lewis wrote on 12/08/2021 18:09: Arista. They call it FIB compression. They mention it's a trade-off, more memory and CPU utilization (keeping track of things) in exchange for being able to keep hardware that might otherwise be out of FIB space able to cope with full tables. it also causes non-deterministic fib resource consumption. On most edge deployments this won't matter, but it wouldn't be hard to cook up a topology that could fail in interesting ways. Overall fib compression is a net win, but you need to be careful with it. Nick
Re: Juniper hardware recommendation
Adam Thompson wrote on 14/05/2021 15:44: I did not know such a thing existed! Cool! Holy murdering your port density, though. Ouch$$$. oh the port wastage is completely criminal, but it can be a handy last resort. Nick
Re: Juniper hardware recommendation
Adam Thompson wrote on 14/05/2021 14:30: However, the MX 10k family still only shows as being compatible with two QSFP cards. And yes, you can get a QSFP-SFP+ breakout cable, but those don't let you use SFP+ CWDM/DWDM transceivers. you can also get QSA adapters to convert from a QSFP form factor port to an SFP+ port. This should allow SFP+ WDM transceivers. Nick
Re: Letters of Authorization still aren't worth the paper they aren't printed
Sean Donelan wrote on 15/03/2021 17:46: It's amazing the telecommunications industry still uses or relies on "Letter of Authorization". It's less secure than faxing a piece of paper on "letterhead." LOAs aren't about authorization. They're about shifting liability and having a paper trail. Nick
Re: DOD prefixes and AS8003 / GRSCORP
Siyuan Miao wrote on 12/03/2021 11:34: My biggest concern is why the AS8003 was assigned to the company (GLOBAL RESOURCE SYSTEMS, LLC) even before its existence. GRS LLC seems to have been around since 2006. https://opencorporates.com/companies/us_fl/M0601699 AS8003 was registered to them in Sep 2020: ASNumber: 8003 ASName: GRS-DOD ASHandle: AS8003 RegDate: 2020-09-14 Updated: 2020-09-14 Ref: https://rdap.arin.net/registry/autnum/8003 No doubt there is more information about the history of 8003 in WhoWas. Nick
Re: DPDK and energy efficiency
Shane Ronan wrote on 23/02/2021 16:59: For use cases where DPDK matters, are you really concerned with power consumption? Probably yeah. Have you assessed the lifetime cost of running a multicore CPU at 100% vs at 10%, particularly as you're likely to have multiples of these devices in operation? Nick
Re: DPDK and energy efficiency
Etienne-Victor Depasquale wrote on 23/02/2021 16:03: "we found that a poll mode driver (PMD) thread accounted for approximately 99.7 percent CPU occupancy (a full core utilization)." interrupt-driven network drivers generally can't compete with polled mode drivers at higher throughputs on generic CPU / PCI card systems. On this style of config, you optimise your driver parameters based on what works best under the specific conditions. Polled mode drivers have been around for a while, e.g. https://svnweb.freebsd.org/base?view=revision&revision=87902 Nick
Re: public open resolver list?
Randy Bush wrote on 01/02/2021 18:16: is there a list of public resolvers? e.g. 1.1.1.1, 4.4.4.4, 8.8.8.8, etc.? https://public-dns.info/ ? Nick
Re: Follow up to "has virtualization become obsolete in 5G"?
Etienne-Victor Depasquale wrote on 16/01/2021 11:34: The term NFV is a bit of a stretch for what is really network-function-containerization. Like ~ everything else relating to computers, network management and service provisioning functionality boils down to executing CPU instructions on physical devices with service access handles and protocols available over a management communications layer. There is plenty of choice about what particular abstraction layer you might want to sit between the storage image and the CPU. Containers have been around for years, and have some advantages over hypervisor-based virtual machines, in relation to cost and deployment efficiency. Like everything else, there's a tradeoff, and the suitability of containers to the function at hand depends on what you're trying to achieve. The reaction of most technical people to deployment of NFV or declaration of NFV's death is going to be more along the lines of wondering why telco proponents were so late to the devops / containerisation game to start with, and what on earth did they think was so innovative about it that it deserved yet another marketing label. Nick
Re: Parler
Eric S. Raymond wrote on 11/01/2021 00:00: Yes, it would. This was an astonishingly stupid move on AWS's part; I'm pretty sure their counsel was not consulted. this is quite an innovative level of speculation. Care to provide sources? Nick
Re: A letter from the CEO
Warren Kumari wrote on 23/11/2020 16:05: They are better than terrorbits, which is what happens when anyone in the family says "My Internet is broken, can you fix it?" best to approach incidents like this with gigglebits, e.g. the sort of response that accompanies replies like "you did WHAT?? AGAIN??" Nick
Re: 100G over 100 km of dark fiber
Dale W. Carder wrote on 30/10/2020 14:33: You may also find that 100G PAM4 could work. not at 100km. This would be outside the dispersion tolerance limits for pam4. Nick
Re: Ingress filtering on transits, peers, and IX ports
Saku Ytti wrote on 15/10/2020 15:29: But you have to think about what prefixes a customer has. If BGP you need to generate prefix-list, if static you need to generate a static route. As you already have to know and manage this information, what is the incremental cost to also emit an ACL? the unfortunate reality is that many networks are run by CLI jockeys, so the incremental cost of this can be high. There are no good general-purpose networking sources of truth, which means that usually provisioning databases need to be highly customised, which is only worth it if the scale merits it. Nick
Re: Ingress filtering on transits, peers, and IX ports
Brian Knight via NANOG wrote on 13/10/2020 23:49: Strict mode won't work for us, because with our multi-homed transits and IX peers, we will almost certainly drop a legitimate packet because the best route is through another transit. there's no "almost" about it: strict mode is unfeasible for both transit and IX ports. Nick
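A small model makes the failure mode concrete. This is a sketch under stated assumptions: the FIB contents and interface names are hypothetical, only the single best path per prefix is modelled, and "strict" here means the plain best-path check rather than any feasible-path variant.

```python
import ipaddress

# Hypothetical FIB holding only the best path for each prefix.
FIB = {
    ipaddress.ip_network("198.51.100.0/24"): "transit-1",
    ipaddress.ip_network("203.0.113.0/24"): "ix-port",
}


def best_interface(src):
    """Longest-prefix match on the source address; None if there is no route."""
    addr = ipaddress.ip_address(src)
    matches = [net for net in FIB if addr in net]
    if not matches:
        return None
    return FIB[max(matches, key=lambda net: net.prefixlen)]


def urpf_accepts(src, in_iface, mode):
    """Apply a loose or strict reverse-path check to one received packet."""
    best = best_interface(src)
    if mode == "loose":
        return best is not None      # any route back to the source will do
    if mode == "strict":
        return best == in_iface      # must arrive on the best-path interface
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    # A legitimate packet arriving over the second transit, not the best path.
    print(urpf_accepts("198.51.100.7", "transit-2", "loose"))
    print(urpf_accepts("198.51.100.7", "transit-2", "strict"))
```

With asymmetric routing across two transits, the same legitimate packet passes the loose check and is dropped by the strict one.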
Re: Hand held copper Ethernet testers
Chris Boyd wrote on 30/09/2020 21:24: My old Test-Um Lanscaper died, and I was curious what people liked these days. Don’t need throughput testing or anything like that, just basic wire map testing, cable ID, cable length, PoE voltage, and DHCP client. What do y’all like? https://pockethernet.com/ is pretty neat. Nick
Re: BFD for routes learned trough Route-servers in IXPs
Ryan Hamel wrote on 16/09/2020 03:01: Install a route optimizer that constantly pings next hops or if you want a more reliable IXP experience, don't install a route optimiser and if you do, don't make it ping next-hops.
- you're not guaranteed that the icmp reply back to the route optimiser will follow the forward path.
- you are guaranteed that icmp is heavily deprioritised on ixp routers
- the busier the IXP, the busier the control planes of all the IXP routers you're going to ping, and the more likely they are to drop your ping packets. This will lead to greater route churn. If this approach is widely deployed it will lead to wider-scale routing oscillations due to control plane mismanagement.
- route optimisers are associated with serious bgp leakage issues. if you're doing this at an IXP, the danger is significantly magnified because bi-lat peering sessions rarely, if ever, implement prefix filtering.
It is true that IXPs occasionally see forwarding plane failures. These tend to be pretty unusual these days. Be careful about optimising edge cases like this. You'll often end up introducing new failure modes which may be more serious and which may occur more regularly. Nick
Re: SRv6
Saku Ytti wrote on 15/09/2020 18:05: You just move the encapsulation from in-order to inside-ip making everything harder for SW and much harder for HW, the simplicity is a lie. to quantify this, the tunneling header increased in size from a minimum of 4 octets to a minimum of 40 octets. If you want explicit path routing, you'll need to tack on a SRH which is another 8 octets + 16 octets per SID, so e.g. an mpls frame with 2-node ERO goes from 12 octets to 80 octets. This comes at a cost. What was previously a simple lookup operation on a tightly optimised format is now up to 10x bloated with little extra functionality to show for it. It's true that these devices already do ipv6, but can they do ipv6 + complex decapsulation in a single pass? If you're using an NPU, probably yes. If an ASIC, maybe not. What if the decapsulated packet has a stash of ipv6 extension headers? This gets complicated quickly, and that complication can only be solved by adding complication to silicon, which is what you want not to do when the speed of your underlying forwarding plane is increasing by leaps and bounds. Good, cheap, fast. Choose two - or maybe one. The control plane is byzantine. This also has a cost in terms of design, build and support / maintenance. As Mark points out, many companies have made their fortunes with a full stack product offering, from hardware and software to design, engineering and operations. It's not a bad business model as long as you realise that it's time-limited to the technology of the day. When it draws to a close, it's hard to turn companies around that have been used to a single-product or single-vertical cash trough which no longer exists. Some have done this successfully; others have floundered. Nick
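The header sizes quoted above can be reproduced with a few lines of arithmetic. This is a sketch: the octet counts are the ones given in the message, and reading the 12-octet MPLS figure as a three-label stack is an assumption made here to match it.

```python
MPLS_LABEL_OCTETS = 4     # one MPLS label stack entry
IPV6_HEADER_OCTETS = 40   # fixed IPv6 header
SRH_FIXED_OCTETS = 8      # segment routing header without any SIDs
SID_OCTETS = 16           # one IPv6 segment identifier


def mpls_overhead(labels):
    """Encapsulation overhead in octets for a stack of MPLS labels."""
    return labels * MPLS_LABEL_OCTETS


def srv6_overhead(sids):
    """Encapsulation overhead in octets for SRv6; no SRH when there are no SIDs."""
    if sids == 0:
        return IPV6_HEADER_OCTETS
    return IPV6_HEADER_OCTETS + SRH_FIXED_OCTETS + sids * SID_OCTETS


if __name__ == "__main__":
    print("minimum:", mpls_overhead(1), "vs", srv6_overhead(0))
    # Assumption: the 2-node ERO case uses three MPLS labels in total.
    print("2-node ERO:", mpls_overhead(3), "vs", srv6_overhead(2))
```

The minimum case is 4 octets against 40, which is where the "up to 10x" comes from, and the explicit-path case is 12 against 80.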
Re: SRv6
Mark Tinka wrote on 15/09/2020 07:04: My head hurts:-)... yep, and you're not alone - the complexity level is pretty high, right from the control plane to the hardware. It's not clear that the modest net gain in functionality is worth it. Nick
Re: SRv6
aar...@gvtc.com wrote on 14/09/2020 20:03: Thanks Nick, I only see the following layers... I see no extension headers behind the ipv6 header. I sent you the wireshark sniff directly so you can see what I'm seeing. you should see extension headers if you're doing more complex stuff? E.g. if you run a ping / traceroute with the "use-srv6-op-sid" parameter, or create a segment list under "segment-routing srv6 traffic engineering", that should throw in some EHs. Nick
Re: SRv6
aar...@gvtc.com wrote on 14/09/2020 18:57: But rather, shows my L3VPN v4 traffic riding v6 and that’s it. that is how SRv6 works. IPv6 + extension headers (+ a bit extra which is incompatible with ipv6). Let me know if I’m seeing an SRH and just don’t know it, LOL. Check out the IPv6 Extension Headers in the underlay packet. Nick
Re: BGP Community - AS0 is de-facto "no-export-to" marker - Any ASN reserved to "export-only-to"?'
Jeff Tantsura via NANOG wrote on 09/09/2020 09:03: De-facto standards are as good as people implementing them, however in order to enforce non ambiguous implementations, it has to be de-jure (e.g. a standard track RFC). While I’m sympathetic to the idea, I’m quite skeptical about its viability. A well written BCP would be much more valuable, and perhaps when we get to a critical mass, codification would be a natural process, rather than artificially enforcing people doing stuff they don’t see value (ROI) in, discussion here perfectly reflects the state of art. Last year the IETF published RFC 8642, "Policy Behavior for Well-Known BGP Communities" which described how the three well-known communities defined in RFC1997 ought to be interpreted. RFC1997 was published in 1996, 23 years prior, and the definitions looked pretty simple and unambiguous. Here's the opening paragraph: The BGP Communities attribute was specified in [RFC1997], which introduced the concept of well-known communities. In hindsight, [RFC1997] did not prescribe as fully as it should have how well-known communities may be manipulated by policies applied by operators. Currently, implementations differ in this regard, and these differences can result in inconsistent behaviors that operators find difficult to identify and resolve. I sympathise with the idea of standardised well-known communities, but if it takes us 23 years to tie down the semantics of three simple WKCs to the point that they behave consistently across vendors and operators, it's going to be a real struggle to define anything more complicated to the point that they end up doing what we want them to do, which is to say that they behave consistently across NOS implementations and different operator networks. Even mixing 16-bit communities and 32-bit communities for stuff like ixp route server no-export causes interoperability problems. Which gets evaluated first? Why? What happens if you get the order wrong? 
How can you integrate this into an existing routing policy configuration? These things look a bit academic until something breaks, at which point it becomes clear that even simple-looking stuff can be complicated and messy when it goes wrong. Nick
Re: Centurylink having a bad morning?
Shawn L via NANOG wrote on 02/09/2020 12:15: We once moved a 3u server 30 miles between data centers this way. Plug redundant psu into a ups and 2 people carried it out and put them in a vehicle. hopefully none of these server moves that people have been talking about involved spinning disks. If they did, kit damage is one of the likely outcomes - you seriously do not want to bump active spindles: www.google.com/search?q=disk+platter+damage&tbm=isch SSDs are a different story. In that case it's just a bit odd as to why you wouldn't want to power down a system to physically move it - in the sense that if your service delivery model can't withstand periodic maintenance and loss of availability of individual components, rethinking the model might be productive. Nick
Re: TCP and UDP Port 0 - Should an ISP or ITP Block it?
K. Scott Helms wrote on 26/08/2020 13:55: To be clear, UDP port 0 is not and probably shouldn't be blocked because some network gear and reporting tools may mistake a fragmented UDP PDU for port 0. That's an implementation error, but one that may be common enough to create issues for users. do you have data on this? Nick
Re: Bottlenecks and link upgrades
Mark Tinka wrote on 13/08/2020 11:31: It's great to monitor packet loss, latency, pps, etc. But packet loss at 10% link utilization is not a foreign occurrence. No amount of bandwidth upgrades will fix that. you could easily have 10% utilization and see packet loss due to insufficient bandwidth if you have egress << ingress and proportionally low buffering, e.g. UDP or iSCSI from a 40G/100G port with egress to a low-buffer 1G port. This sort of thing is less likely in the imix world, but it can easily happen with high capacity CDN nodes injecting content where the receiving port is small and subject to bursty traffic. Nick
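A back-of-envelope model shows how this happens. It is a sketch with simplifying assumptions: one burst arrives at ingress line rate into an empty queue, the buffer is a single tail-drop FIFO, and the 1 MB buffer and 5 MB burst used below are hypothetical figures rather than numbers from any real device.

```python
def time_to_fill_buffer(buffer_bytes, ingress_bps, egress_bps):
    """Seconds until an empty buffer overflows while ingress exceeds egress."""
    if ingress_bps <= egress_bps:
        return float("inf")  # the queue never builds up
    return buffer_bytes * 8 / (ingress_bps - egress_bps)


def bytes_dropped(burst_bytes, buffer_bytes, ingress_bps, egress_bps):
    """Bytes lost from one line-rate burst arriving at an empty queue."""
    if ingress_bps <= egress_bps:
        return 0.0
    burst_seconds = burst_bytes * 8 / ingress_bps
    drained = egress_bps * burst_seconds / 8  # bytes sent while the burst arrives
    return max(0.0, burst_bytes - drained - buffer_bytes)


if __name__ == "__main__":
    ingress, egress = 40e9, 1e9   # 40G in, 1G out
    buffer_size = 1_000_000       # hypothetical 1 MB of egress buffer
    print(time_to_fill_buffer(buffer_size, ingress, egress))
    print(bytes_dropped(5_000_000, buffer_size, ingress, egress))
```

With these figures the buffer fills in roughly 0.2 ms. A 5 MB burst every 400 ms is an offered load of only 100 Mbit/s, i.e. 10% of the 1G port, yet most of each burst is dropped.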
Re: BGP route hijack by AS10990
Sabri Berisha wrote on 01/08/2020 20:59: My point is that there can be operational reasons to do so, and whatever they wish to do on their network is perfectly fine. As long as they don't bother the rest of the world with it. I get what you're saying, and am a big fan of personal responsibility, but when a vendor ships a product like a BGP optimiser, it requires that you run your network with the safety controls removed. It's no different in principle to shipping guns with the safety welded to off, or hot-wiring 20kW cables to bypass your RCDs. It can produce some great results, no doubt about it, but sooner or later you're guaranteed that there's going to be a nasty accident. In any individual case, it's understandable to assign blame to an operator for messing up their configs. In the general case, shipping products with dangerous-by-default configurations is going to lead to more accidents happening. At this point, a large proportion of the major routing leaks on the internet can be associated with bgp optimisers and Noction's name appears with disturbing regularity. This is an appalling record, not least because it's almost entirely preventable. Nick
Re: BGP route hijack by AS10990
Sabri Berisha wrote on 01/08/2020 20:03: but because Noction's decision to not enable NO_EXPORT by default the primary problem is not this but that Noction reinjects prefixes into the local ibgp mesh with the as-path stripped and then prioritises these prefixes so that they're learned as the best path. The as-path is the primary loop detection mechanism in eBGP. Removing this is like hot-wiring your electrical distribution board because you found out you could get more power if you bypass those stupid RCDs. Once you strip off the as-path in the local view, it's like the AS7007 incident desperately begging to happen all over again. As long as route optimiser vendors ship their products with such deeply harmful defaults, we're going to continue to see these problems ad nauseam. Nick
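As a rough illustration of why stripping the as-path matters, the sketch below models eBGP loop detection as a plain list check. The function names and the ASNs (taken from the documentation range) are hypothetical, and real implementations have extra knobs such as allowas-in that are ignored here.

```python
def accept_ebgp_route(local_asn, as_path):
    """eBGP loop detection: reject a route whose AS_PATH already holds our ASN."""
    return local_asn not in as_path


def advertise(local_asn, as_path):
    """Prepend our own ASN when sending a route to an eBGP neighbour."""
    return [local_asn] + list(as_path)


ORIGIN_ASN = 64500     # hypothetical network that originates the prefix
OPTIMISER_ASN = 64501  # hypothetical network running the optimiser

# Normal case: 64501 learned the route with AS_PATH [64500]. If it leaks the
# route back towards 64500, the path still carries 64500 and is rejected.
normal_leak = advertise(OPTIMISER_ASN, [ORIGIN_ASN])
print(normal_leak, accept_ebgp_route(ORIGIN_ASN, normal_leak))

# Optimiser case: the prefix is reinjected with the AS_PATH stripped, so a
# leak carries only [64501] and the origin's loop check cannot catch it.
stripped_leak = advertise(OPTIMISER_ASN, [])
print(stripped_leak, accept_ebgp_route(ORIGIN_ASN, stripped_leak))
```

The second route is accepted because, with the path removed, nothing in the update tells the origin that the prefix came from itself.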
Re: BGP route hijack by AS10990
Mark Tinka wrote on 01/08/2020 12:20: The difference between us and aviation is that fundamental flaws or mistakes that impact safety are required to be fixed and checked if you want to keep operating in the industry. We don't have that, so... ... so once again, route optimisers were at the heart of another serious route leaking incident. BGP is designed to prevent loops from happening, and has tools like no-export to help prevent inadvertent leaks. When people build "BGP optimisers" which reinject a prefix into a routing mesh with the entire as-path stripped and then they refuse to apply the basic minimum of common sense by refusing point blank to tag prefixes with no-export, it's a matter of certainty that leaks are going to happen, and that when they do, they'll cause damage. It's about as responsible as shipping a shotgun with the safety disabled and then handing it to a newbie. After all, the safety makes it more difficult to operate and if the newbie shoots themselves, it was their fault. And if they shot someone else, they shouldn't have got in the way, right? Nick
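For reference, the no-export safeguard amounts to a very small check. The sketch below uses the well-known NO_EXPORT community value from RFC 1997; the function names are made up for illustration and confederation handling is deliberately left out.

```python
NO_EXPORT = 0xFFFFFF01  # well-known community from RFC 1997


def community_str(value):
    """Render a 32-bit community in the usual high:low notation."""
    return f"{value >> 16}:{value & 0xFFFF}"


def may_advertise(communities, peer_is_ebgp):
    """Simplified export rule: NO_EXPORT routes never leave the local AS.

    Confederation sub-AS boundaries are ignored in this sketch.
    """
    if peer_is_ebgp and NO_EXPORT in communities:
        return False
    return True


# A reinjected prefix tagged with NO_EXPORT stays inside the iBGP mesh.
tagged = {NO_EXPORT}
print(community_str(NO_EXPORT))
print("to iBGP peer:", may_advertise(tagged, peer_is_ebgp=False))
print("to eBGP peer:", may_advertise(tagged, peer_is_ebgp=True))

# The same prefix without the tag is free to leak.
print("untagged to eBGP peer:", may_advertise(set(), peer_is_ebgp=True))
```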
Re: BGP route hijack by AS10990
Hank Nussbacher wrote on 31/07/2020 08:21: But wait - MANRS indicates that Telia does everything right: Not only that, Telia indicates that Telia does everything right: https://www.teliacarrier.com/our-network/bgp-routing/routing-security-.html "We reject RPKI Invalids on all BGP Sessions; for both Peers and Customers." How can that be? Misconfig or oversight? Nick
Re: RFC 5549 - IPv4 Routes with IPv6 next-hop - Does it really exists?
Mark Tinka wrote on 29/07/2020 17:06: > Meaning the initial setup would still require the use of literal IP > addresses? You can't use hostnames, if that's what you're asking. FRR will also do unnumbered BGP with auto-config. Nick
Re: RFC 5549 - IPv4 Routes with IPv6 next-hop - Does it really exists?
Mark Tinka wrote on 29/07/2020 15:51: > I'm curious to know if this is after-the-fact, as I can't think of a way > that BGP would find hostnames to setup sessions with, outside of some > kind of upper layer name resolution capability. > > The draft isn't clear on how this happens, if it is, indeed, > before-the-fact. it's a capability negotiation, so is handled on session setup. Nick
Re: RFC 5549 - IPv4 Routes with IPv6 next-hop - Does it really exists?
Mark Tinka wrote on 29/07/2020 15:09: > Are the names based on DNS look-ups, or is there some kind of protocol > association between the device underlay and its hostname, as it pertains > to neighbors? afaik, this is an implementation of draft-walton-bgp-hostname-capability. Nick
Re: cloud backup
Michael Thomas wrote on 26/07/2020 21:39: AWS S3 infrequent access is $40/month. If it's really archival backup AWS has glacier which is less than $20/month, but it's name gives you an idea of what it is. how much does a full restore cost with these options? Nick
Re: questions asked during network engineer interview
William Herrin wrote on 21/07/2020 20:21: This is happening a lot in the big shops like Amazon that can afford to employ software developers to write purpose-built network code. IOW, it works if you have a large and homogeneous enough network with a sufficiently narrow product portfolio that you can justify the cost of getting enough programming skill to make the cost/benefit ratio work. Some networks are like this; many aren't. In fairness, most networks would benefit from some degree of automation. Nick
Re: BFD for long haul circuit
Tom Hill wrote on 17/07/2020 16:06: If you're a service provider, don't buy a consumer product and hope to sell it on at a similar (or higher) SLA rate to other consumers; that way lies ruin. I was going to suggest that there wasn't much in the way of consumer grade international circuits, so why would you even bring this up? But then I lol'd. Nick
Re: Anyone running C-Data OLTs?
Mark Tinka wrote on 13/07/2020 16:03: Still don't know what "third world" means (of course I do...), but Obviously he means countries like Sweden, Ireland and Switzerland. https://en.wikipedia.org/wiki/Third_World#/media/File:Cold_War_alliances_mid-1975.svg It's not clear why there's any relationship between third world status and the choice of PON/active FTTP equipment used in 2020. Or maybe there's some subtlety that's being lost here. Hard to tell. Nick
Re: SaoPaolo to Frankfurt
Colin Stanners (lists) wrote on 13/07/2020 14:41: Looking at the Wikipedia article, it claims that Atlantis-2 “can already be upgraded with current technology to 160Gbit/s”. Would be interesting why that wasn’t already done on this 20-year-old cable – assuming that the underground infrastructure (repeaters) are compatible with the newer modulations (or additional wavelengths, but that would have necessitated much more design), the upgrade cost should be small compared to the cable’s value. 160gbit/sec split over a standard 80ch itu dwdm grid sounds like 2gbit/sec per channel (although there are more efficient options than the standard itu grid). This sounds like it's seriously not worth it for today's bandwidth requirements, which might explain why it's only viable for voice traffic. Nick
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
Masataka Ohta wrote on 22/06/2020 13:49: But, it should be noted that a single class B routing table entry "a single class B routing table entry"? Did 1993 just call and ask for its addressing back? :-) But, it should be noted that a single class B routing table entry often serves for an organization with 1s of users, which is at least our case here at titech.ac.jp. It should also be noted that, my concern is scalability in ISP side. This entire conversation is puzzling: we already have "hierarchical routing" to a large degree, to the extent that the public DFZ only sees aggregate routes exported by ASNs. Inside ASNs, there will be internal aggregation of individual routes (e.g. an ISP DHCP pool), and possibly multiple levels of aggregation, depending on how this is configured. Aggregation is usually continued right down to the end-host edge, e.g. a router might have a /26 assigned on an interface, but the hosts will be aggregated within this /26. If you have 1000 PEs, you should be serving for somewhere around 1000 customers. And, if I understand BGP-MP correctly, all the routing information of all the customers is flooded by BGP-MP in the ISP. Well, maybe. Or maybe not. This depends on lots of things. Then, it should be a lot better to let customer edges encapsulate L2 or L3 over IP, with which, routing information within customers is exchanged by customer provided VPN without requiring extra overhead of maintaining customer local routing information by the ISP. If you have 1000 or even 1s of PEs, injecting simplistic non-aggregated routing information is unlikely to be an issue. If you have 1,000,000 PEs, you'll probably need to rethink that position. 
If your proposition is that the nature of the internet be changed so that route disaggregation is prevented, or that addressing policy be changed so that organisations are exclusively handed out IP address space by their upstream providers, then this is a simple matter of misunderstanding of how impractical the proposition is: that horse bolted from the barn 30 years ago; no organisation would accept exclusive connectivity provided by a single upstream; and today's world of dense interconnection would be impossible on the terms you suggest. You may not like that there are lots of entries in the DFZ and many operators view this as a bit of a drag, but on today's technology, this can scale to significantly more than what we foresee in the medium-long term future. Nick
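The levels of aggregation described in this post can be sketched with Python's standard ipaddress module. The prefixes below come from the 192.0.2.0/24 documentation range and are purely illustrative, not anything from the thread.

```python
import ipaddress

# Edge level: hosts are covered by the /26 assigned to a router interface,
# so no per-host routes need to exist beyond that router.
interface_net = ipaddress.ip_network("192.0.2.64/26")
hosts = [ipaddress.ip_address("192.0.2.70"), ipaddress.ip_address("192.0.2.100")]
print(all(host in interface_net for host in hosts))

# Internal level: four /26 interface subnets carried inside the AS.
internal_routes = list(ipaddress.ip_network("192.0.2.0/24").subnets(new_prefix=26))
print([str(net) for net in internal_routes])

# External level: contiguous subnets collapse into the one aggregate that
# would be exported towards the DFZ.
exported = list(ipaddress.collapse_addresses(internal_routes))
print([str(net) for net in exported])
```

Each level only needs to know about the prefixes one step more specific than what it announces upwards, which is the sense in which routing is already hierarchical.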
Re: Hurricane Electric has reached 0 RPKI INVALIDs in our routing table
Mark Tinka wrote on 18/06/2020 11:56: Invalid routes being dropped creates downtime. People respond to downtime a lot more eagerly. humanity is a crisis-driven species. Nick
Re: Hurricane Electric has reached 0 RPKI INVALIDs in our routing table
Mark Tinka wrote on 18/06/2020 11:16: On 17/Jun/20 21:16, Tim Warnock wrote: How did you know? Is there some monitoring system available to let you know or do you have your own? The usual way - a customer complained :-). The customer monitoring system is very reliable and often superior to in-house solutions. Nick
Re: Mikrotik RPKI Testing
Musa Stephen Honlue wrote on 18/06/2020 03:38: Did you face any issues with IPv6 on 6.4, I personally have participated in deployment projects on Mikrotik for many large networks. mikrotik ROS6 doesn't support next-hop recursion for ipv6 routes: https://forum.mikrotik.com/viewtopic.php?t=42268 It also doesn't support ospfv3 prefixes with the LA-bit set: https://forum.mikrotik.com/viewtopic.php?t=51124#p319794 I.e. if you originate an ipv6 loopback address from another vendor, the Mikrotik will silently drop the prefix on the floor. Note the dates on these posts: 2010 and 2011. Nick
Re: Router Suggestions
Baldur Norddahl wrote on 16/06/2020 07:32: purpose in life is to be a cold spare and a lab router. Why pay someone else for having a cold spare ready for next day replacement when you can have it yourself? e.g. your production deployment might be in another country, and getting equipment in and out of the country could involve customs headwreck, delay and cost. Or you might have only a handful of a specific type of device so there would be no justification getting a cold spare / lab unit. There are lots of good reasons to pay for support, but then again there are also lots of good reasons not to pay for support. It's highly dependent on what you're trying to achieve and there's no one-size-fits-all approach. Nick
Re: Router Suggestions
Patrick Cole wrote on 15/06/2020 14:16: MX204's may have gotten chaper in the last year I don't know. But YMMV. OP needs to check the licensing package for the MX204, and work out the N-year TCO. Nick
Re: [c-nsp] LDPv6 Census Check
Phil Bedard wrote on 11/06/2020 17:49: Just to clarify the only routers who potentially need to inspect or do anything with those headers are endpoints who require information in the extension header or hops in an explicit path. In the simple example I gave, there are no extension headers at all. perhaps, but no-one planning to use srv6 is going to invest in kit which can handle srv6 but not the TE component. Or deploy srv6 on existing kit which can't handle TE. Nick