Re: 60 ms cross-continent
Serious HFT moved to shortwave years ago. The Chicago-NYC microwave routes still exist, but are only for things that need higher data rates (as measured in kbps). It's hard to hide a giant log-periodic or Yagi-Uda antenna. The sites near Chicago that are aimed at London are well known to those in the industry.

On Sun, Jun 21, 2020 at 10:53 AM Brett Frankenberger wrote:
> On Sun, Jun 21, 2020 at 02:17:08PM -0300, Rubens Kuhl wrote:
> > On Sat, Jun 20, 2020 at 5:05 PM Marshall Eubanks <marshall.euba...@gmail.com> wrote:
> > > This was also pitched as one of the killer-apps for the SpaceX
> > > Starlink satellite array, particularly for cross-Atlantic and
> > > cross-Pacific trading.
> > >
> > > https://blogs.cfainstitute.org/marketintegrity/2019/06/25/fspacex-is-opening-up-the-next-frontier-for-hft/
> > >
> > > "Several commentators quickly caught onto the fact that an extremely
> > > expensive network whose main selling point is long-distance,
> > > low-latency coverage has a unique chance to fund its growth by
> > > addressing the needs of a wealthy market that has a high willingness
> > > to pay — high-frequency traders."
> >
> > This is a nice plot for a movie, but not how HFT is really done. It's so
> > much easier to colocate in the same datacenter as the exchange and run
> > algorithms from there; while those algorithms need humans to guide their
> > strategy, the human thought process takes a couple of seconds anyway. So
> > the real HFTs keep using the defined strategy while the human controller
> > doesn't tell it otherwise.
>
> For faster access to one exchange, yes, absolutely, colocate at the
> exchange. But there's more than one exchange.
>
> As one example, many index futures trade in Chicago. The stocks that
> make up those indices mostly trade in New York. There's money to be
> made on the arbitrage, if your Chicago algorithms get faster
> information from New York (and vice versa) than everyone else's
> algorithms.
>
> More expensive but shorter fiber routes have been built between NYC and
> Chicago for this reason, as have microwave paths (to get
> speed-of-light in air rather than in glass). There's competition to
> have the microwave towers as close as possible to the data centers,
> because the last mile is fiber, so the longer your last mile, the less
> valuable your network.
>
> https://www.bloomberg.com/news/features/2019-03-08/the-gazillion-dollar-standoff-over-two-high-frequency-trading-towers
>
> -- Brett
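The physics behind the microwave builds is easy to put numbers on. A back-of-the-envelope sketch follows; the 1,145 km distance and the refractive indices are assumed round figures (roughly the NYC-Chicago great circle; real fiber routes run longer), not data from the thread:

```python
# One-way propagation delay at c/n for an assumed ~1,145 km great-circle
# NYC-Chicago path (a round figure; deployed fiber routes are longer).
C_KM_PER_S = 299_792.458  # speed of light in vacuum

def one_way_latency_ms(distance_km: float, refractive_index: float) -> float:
    """Delay in milliseconds for a signal travelling at c / refractive_index."""
    return distance_km * refractive_index / C_KM_PER_S * 1000

fiber = one_way_latency_ms(1145, 1.468)       # silica fiber, n ~ 1.468
microwave = one_way_latency_ms(1145, 1.0003)  # air is very nearly vacuum

print(f"fiber:     {fiber:.2f} ms one-way")      # ~5.6 ms
print(f"microwave: {microwave:.2f} ms one-way")  # ~3.8 ms
print(f"edge:      {(fiber - microwave) * 1000:.0f} us one-way")
```

Even on these idealized numbers, air beats glass by well over a millisecond each way, which is the entire business case for the towers.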
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 23:01, Robert Raszuk wrote:
> Nope. You need to get to the PQ node via potentially many hops. So you
> need to have either ordered or independent label distribution to its
> loopback in place.

I have some testing I want to do with IS-IS only announcing the Loopback from a set of routers to the rest of the backbone, and LDP allocating labels for it accordingly, to solve a particular problem. I'll test this out and see what happens re: LDP LFA.

Mark.
Re: Devil's Advocate - Segment Routing, Why?
> Wouldn't T-LDP fix this, since LDP LFA is a targeted session?

Nope. You need to get to the PQ node via potentially many hops. So you need to have either ordered or independent label distribution to its loopback in place.

Best,
R.

On Sun, Jun 21, 2020 at 10:58 PM Mark Tinka wrote:
> On 21/Jun/20 22:21, Robert Raszuk wrote:
> > Well this is true for one company :) Name starts with j
> >
> > Other company, name starting with c - at least some time back, by default
> > allocated labels for all routes in the RIB, either connected or static or
> > sourced from the IGP. Sure, you could always limit that with a knob if desired.
>
> Juniper allocates labels to the Loopback only.
>
> Cisco allocates labels to all IGP and interface routes.
>
> Neither allocates labels to BGP routes for the global table.
>
> > The issue with allocating labels only for BGP next hops is that your
> > IP/MPLS LFA breaks (or, more directly, is not possible) as you do not have a
> > label to the PQ node upon failure. Hint: the PQ node is not even running BGP :).
>
> Wouldn't T-LDP fix this, since LDP LFA is a targeted session?
>
> Need to test.
>
> > Sure, selective folks still count on "IGP Convergence" to restore
> > connectivity. But I hope those will move to much faster connectivity
> > restoration techniques soon.
>
> We are happy :-).
>
> Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 22:21, Robert Raszuk wrote:
> Well this is true for one company :) Name starts with j
>
> Other company, name starting with c - at least some time back, by
> default allocated labels for all routes in the RIB, either connected or
> static or sourced from the IGP. Sure, you could always limit that with a
> knob if desired.

Juniper allocates labels to the Loopback only.

Cisco allocates labels to all IGP and interface routes.

Neither allocates labels to BGP routes for the global table.

> The issue with allocating labels only for BGP next hops is that your
> IP/MPLS LFA breaks (or, more directly, is not possible) as you do not
> have a label to the PQ node upon failure. Hint: the PQ node is not even
> running BGP :).

Wouldn't T-LDP fix this, since LDP LFA is a targeted session?

Need to test.

> Sure, selective folks still count on "IGP Convergence" to restore
> connectivity. But I hope those will move to much faster connectivity
> restoration techniques soon.

We are happy :-).

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 21:15, adamv0...@netconsultings.com wrote:
> I wouldn't say it's known to many, as not many folks are actually limited by
> only up to ~1M customer connections, or next level up, only up to ~1M
> customer VPNs.

It's probably less of a problem now than it was 10 years ago. But, yes, I don't have any real-world experience.

> Well yeah, things work differently in VRFs, not a big surprise.
> And what about an example of bad flowspec routes/filters cutting the boxes
> off net, where having those flowspec routes/filters contained within an
> Internet VRF would not have such an effect?
> See, it goes either way.
> Would be interesting to see a comparison of good vs. bad for Internet
> routes in a VRF vs. Internet routes in the global/default routing table.

Well, the global table is the basics, and VRF's is where sexy lives :-).

> No, that's just a result of having a finite FIB/RIB size; if you want to cut
> these resources into virtual pieces you'll naturally get your equations above.
> But if you actually construct your testing to showcase the delta between how
> much FIB/RIB space is taken by x prefixes with each in a VRF as opposed to
> all in a single default VRF (global routing table), the delta is negligible.
> (Yes, negligible even in the case of the per-prefix VPN label allocation
> method, which I'm assuming no one is using anyway as it inherently doesn't
> scale and would limit you to ~1M VPN prefixes, though the per-CE/per-next-hop
> VPN label allocation method gives one the same functionality as the
> per-prefix one while pushing the limit to ~1M PE-CE links/IFLs, which from my
> experience is sufficient for most folks out there.)

Like I said, with today's CPUs and memory, probably not an issue. But it's not an area I play in, so those with more experience - like yourself - would know better.

Mark.
Re: Devil's Advocate - Segment Routing, Why?
> > I should point out that all of my input here is based on simple MPLS
> forwarding of IP traffic in the global table. In this scenario, labels
> are only assigned to BGP next-hops, which is typically an IGP Loopback
> address.

Well, this is true for one company :) Name starts with j

Other company, name starting with c - at least some time back, by default allocated labels for all routes in the RIB, either connected or static or sourced from the IGP. Sure, you could always limit that with a knob if desired.

The issue with allocating labels only for BGP next hops is that your IP/MPLS LFA breaks (or, more directly, is not possible) as you do not have a label to the PQ node upon failure. Hint: the PQ node is not even running BGP :).

Sure, selective folks still count on "IGP Convergence" to restore connectivity. But I hope those will move to much faster connectivity restoration techniques soon.

> Labels don't get assigned to BGP routes in a global table. There is no
> use for that.

Sure - True.

Cheers,
R,
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 19:34, Robert Raszuk wrote:
> That is true for P routers ... not so much for PEs.
>
> Please observe that label space in each PE router is divided between the IGP
> and BGP, as well as other label-hungry services ... there are many
> consumers of the local label block.
>
> So it is always the case that the LFIB table (max 2^20 entries - 1M) on
> PEs is much larger than the LFIB on P nodes.

I should point out that all of my input here is based on simple MPLS forwarding of IP traffic in the global table. In this scenario, labels are only assigned to BGP next-hops, which is typically an IGP Loopback address.

Labels don't get assigned to BGP routes in a global table. There is no use for that.

Of course, as this is needed in VRF's and other BGP-based VPN services, the extra premium customers pay for that privilege may be considered warranted :-).

Mark.
RE: Devil's Advocate - Segment Routing, Why?
> From: NANOG On Behalf Of Mark Tinka
> Sent: Friday, June 19, 2020 7:28 PM
>
> On 19/Jun/20 17:13, Robert Raszuk wrote:
> > So I think Ohta-san's point is about scalability of services, not flat
> > underlay RIB and FIB sizes. Many years ago we had requests to support
> > 5M L3VPN routes while the underlay was just 500K IPv4.
>
> Ah, if the context, then, was L3VPN scaling, yes, that is a known issue.

I wouldn't say it's known to many, as not many folks are actually limited by only up to ~1M customer connections, or next level up, only up to ~1M customer VPNs.

> Apart from the global table vs. VRF parity concerns I've always had (one of
> which was illustrated earlier this week, on this list, with RPKI in a VRF),

Well yeah, things work differently in VRFs, not a big surprise.

And what about an example of bad flowspec routes/filters cutting the boxes off net, where having those flowspec routes/filters contained within an Internet VRF would not have such an effect? See, it goes either way. Would be interesting to see a comparison of good vs. bad for Internet routes in a VRF vs. Internet routes in the global/default routing table.

> the other reason I don't do Internet in a VRF is because it was always a
> trade-off:
>
> - More routes per VRF = fewer VRF's.
> - More VRF's = fewer routes per VRF.

No, that's just a result of having a finite FIB/RIB size; if you want to cut these resources into virtual pieces you'll naturally get your equations above. But if you actually construct your testing to showcase the delta between how much FIB/RIB space is taken by x prefixes with each in a VRF as opposed to all in a single default VRF (global routing table), the delta is negligible.
(Yes, negligible even in the case of the per-prefix VPN label allocation method, which I'm assuming no one is using anyway as it inherently doesn't scale and would limit you to ~1M VPN prefixes, though the per-CE/per-next-hop VPN label allocation method gives one the same functionality as the per-prefix one while pushing the limit to ~1M PE-CE links/IFLs, which from my experience is sufficient for most folks out there.)

adam
Re: 60 ms cross-continent
> > This is a nice plot for a movie, but not how HFT is really done. It's so
> > much easier to colocate in the same datacenter as the exchange and run
> > algorithms from there; while those algorithms need humans to guide their
> > strategy, the human thought process takes a couple of seconds anyway. So
> > the real HFTs keep using the defined strategy while the human controller
> > doesn't tell it otherwise.
>
> For faster access to one exchange, yes, absolutely, colocate at the
> exchange. But there's more than one exchange.

Yes, but to do real HFT you will need to colocate at each exchange. Otherwise your competitors have a head start on you.

> As one example, many index futures trade in Chicago. The stocks that
> make up those indices mostly trade in New York. There's money to be
> made on the arbitrage, if your Chicago algorithms get faster
> information from New York (and vice versa) than everyone else's
> algorithms.

Most traded index futures run longer than just that day's close, usually months to a year in advance. They are influenced mostly by traders' perception of the economic future, and current stock valuations are a poor proxy for it. There is a better chance in reading the news feeds and speculating on their impact on perception than in watching the stocks.

Rubens
Re: 60 ms cross-continent
On 6/21/20 1:53 PM, Brett Frankenberger wrote:

On Sun, Jun 21, 2020 at 02:17:08PM -0300, Rubens Kuhl wrote:

On Sat, Jun 20, 2020 at 5:05 PM Marshall Eubanks wrote:

This was also pitched as one of the killer-apps for the SpaceX Starlink satellite array, particularly for cross-Atlantic and cross-Pacific trading.

https://blogs.cfainstitute.org/marketintegrity/2019/06/25/fspacex-is-opening-up-the-next-frontier-for-hft/

"Several commentators quickly caught onto the fact that an extremely expensive network whose main selling point is long-distance, low-latency coverage has a unique chance to fund its growth by addressing the needs of a wealthy market that has a high willingness to pay — high-frequency traders."

This is a nice plot for a movie, but not how HFT is really done. It's so much easier to colocate in the same datacenter as the exchange and run algorithms from there; while those algorithms need humans to guide their strategy, the human thought process takes a couple of seconds anyway. So the real HFTs keep using the defined strategy while the human controller doesn't tell it otherwise.

For faster access to one exchange, yes, absolutely, colocate at the exchange. But there's more than one exchange.

As one example, many index futures trade in Chicago. The stocks that make up those indices mostly trade in New York. There's money to be made on the arbitrage, if your Chicago algorithms get faster information from New York (and vice versa) than everyone else's algorithms.

More expensive but shorter fiber routes have been built between NYC and Chicago for this reason, as have microwave paths (to get speed-of-light in air rather than in glass). There's competition to have the microwave towers as close as possible to the data centers, because the last mile is fiber, so the longer your last mile, the less valuable your network.

https://www.bloomberg.com/news/features/2019-03-08/the-gazillion-dollar-standoff-over-two-high-frequency-trading-towers

...

and similar to this:

https://www.extremetech.com/extreme/122989-1-5-billion-the-cost-of-cutting-london-toyko-latency-by-60ms

-- Brett
Re: 60 ms cross-continent
On Sun, Jun 21, 2020 at 02:17:08PM -0300, Rubens Kuhl wrote:
> On Sat, Jun 20, 2020 at 5:05 PM Marshall Eubanks wrote:
> > This was also pitched as one of the killer-apps for the SpaceX
> > Starlink satellite array, particularly for cross-Atlantic and
> > cross-Pacific trading.
> >
> > https://blogs.cfainstitute.org/marketintegrity/2019/06/25/fspacex-is-opening-up-the-next-frontier-for-hft/
> >
> > "Several commentators quickly caught onto the fact that an extremely
> > expensive network whose main selling point is long-distance,
> > low-latency coverage has a unique chance to fund its growth by
> > addressing the needs of a wealthy market that has a high willingness
> > to pay — high-frequency traders."
>
> This is a nice plot for a movie, but not how HFT is really done. It's so
> much easier to colocate in the same datacenter as the exchange and run
> algorithms from there; while those algorithms need humans to guide their
> strategy, the human thought process takes a couple of seconds anyway. So
> the real HFTs keep using the defined strategy while the human controller
> doesn't tell it otherwise.

For faster access to one exchange, yes, absolutely, colocate at the exchange. But there's more than one exchange.

As one example, many index futures trade in Chicago. The stocks that make up those indices mostly trade in New York. There's money to be made on the arbitrage, if your Chicago algorithms get faster information from New York (and vice versa) than everyone else's algorithms.

More expensive but shorter fiber routes have been built between NYC and Chicago for this reason, as have microwave paths (to get speed-of-light in air rather than in glass). There's competition to have the microwave towers as close as possible to the data centers, because the last mile is fiber, so the longer your last mile, the less valuable your network.
https://www.bloomberg.com/news/features/2019-03-08/the-gazillion-dollar-standoff-over-two-high-frequency-trading-towers -- Brett
Re: Devil's Advocate - Segment Routing, Why?
> The LFIB in each node need only be as large as the number of LDP-enabled routers in the network.

That is true for P routers ... not so much for PEs.

Please observe that label space in each PE router is divided between the IGP and BGP, as well as other label-hungry services ... there are many consumers of the local label block.

So it is always the case that the LFIB table (max 2^20 entries - 1M) on PEs is much larger than the LFIB on P nodes.

Thx,
R.

On Sun, Jun 21, 2020 at 6:01 PM Mark Tinka wrote:
> On 21/Jun/20 15:48, Robert Raszuk wrote:
> > Actually when the IGP changes, LSPs are not recomputed with LDP or SR-MPLS
> > (when used without TE :).
> >
> > The "LSP" term is perhaps what drives your confusion --- in LDP MPLS there is
> > no "Path" - in spite of the acronym (Label Switched *Path*). Labels are
> > locally significant and swapped at each LSR - resulting essentially in a
> > bunch of one-hop crossconnects.
> >
> > In other words, MPLS LDP strictly follows the IGP SPT at each LSR hop.
>
> Yep, which is what I tried to explain as well. With LDP, MPLS-enabled
> nodes simply push, swap and pop. There is no concept of an "end-to-end
> LSP" as such. We just use the term "LSP" to define an FEC. But really, each
> node in the FEC's path is making its own push, swap and pop decisions.
>
> The LFIB in each node need only be as large as the number of LDP-enabled
> routers in the network. You can get scenarios where FECs are also created
> for infrastructure links, but if you employ filtering to save on FIB slots,
> you really just need to allocate labels to Loopback addresses only.
>
> Mark.
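The ~1M ceiling Robert cites falls straight out of the shim header: the label field is 20 bits wide, so 2^20 = 1,048,576 values is the hard limit that every consumer on a PE (IGP/LDP, BGP services, etc.) must share. A minimal sketch of the RFC 3032 label-stack entry encoding, for illustration only:

```python
import struct

def pack_shim(label: int, tc: int = 0, s: int = 1, ttl: int = 64) -> bytes:
    """Pack one MPLS label-stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    assert 0 <= label < 2 ** 20, "label must fit in the 20-bit field"
    word = (label << 12) | (tc << 9) | (s << 8) | ttl
    return struct.pack("!I", word)

def unpack_shim(entry: bytes) -> tuple:
    """Recover (label, tc, s, ttl) from the 4-byte wire format."""
    (word,) = struct.unpack("!I", entry)
    return word >> 12, (word >> 9) & 0x7, (word >> 8) & 0x1, word & 0xFF

# The largest label value any LSR could ever allocate:
print(2 ** 20 - 1)  # 1048575
assert unpack_shim(pack_shim(2 ** 20 - 1)) == (1048575, 0, 1, 64)
```

(Values 0-15 of that range are reserved for special-purpose labels, so the usable space is slightly smaller still.)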
Re: 60 ms cross-continent
On Sat, Jun 20, 2020 at 5:05 PM Marshall Eubanks wrote:
> This was also pitched as one of the killer-apps for the SpaceX
> Starlink satellite array, particularly for cross-Atlantic and
> cross-Pacific trading.
>
> https://blogs.cfainstitute.org/marketintegrity/2019/06/25/fspacex-is-opening-up-the-next-frontier-for-hft/
>
> "Several commentators quickly caught onto the fact that an extremely
> expensive network whose main selling point is long-distance,
> low-latency coverage has a unique chance to fund its growth by
> addressing the needs of a wealthy market that has a high willingness
> to pay — high-frequency traders."

This is a nice plot for a movie, but not how HFT is really done. It's so much easier to colocate in the same datacenter as the exchange and run algorithms from there; while those algorithms need humans to guide their strategy, the human thought process takes a couple of seconds anyway. So the real HFTs keep using the defined strategy while the human controller doesn't tell it otherwise.

And in order to preserve equality among traders, each exchange already physically adds (with loops of fiber or copper cable) some ns to the closer racks, so everyone gets to the system at the same time.

And then comes the really high added latency of the trade risk controller, which limits what a trader is allowed to expose itself to, versus what is deposited or agreed with the exchange. And this comes with both latency and jitter due to its implementation, making even the fastest HFT only faster on average, not faster at every transaction.

Rubens
Re: 60 ms cross-continent
Mel Beckman wrote:
> An intriguing development in fiber optic media is hollow core optical
> fiber, which achieves 99.7% of the speed of light in a vacuum.
>
> https://www.extremetech.com/computing/151498-researchers-create-fiber-network-that-operates-at-99-7-speed-of-light-smashes-speed-and-latency-records

Here's an update from 7 years after that article which hints at the downside of hollow-core fibre:

https://phys.org/news/2020-03-hollow-core-fiber-technology-mainstream-optical.html

It sounds like attenuation was a big problem: "in the space of 18 months the attenuation in data-transmitting hollow-core fibers has been reduced by over a factor of 10, from 3.5 dB/km to only 0.28 dB/km, within a factor of two of the attenuation of conventional all-glass fiber technology."

Tony.
--
f.anthony.n.finch  http://dotat.at/
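To put the quoted dB/km figures in perspective: loss accumulates linearly in dB with distance, so the fraction of launch power remaining after a span is 10^(-dB/10). The 50 km span length below is an illustrative choice, not a number from the article:

```python
# Remaining optical power after a span, from the attenuation figures
# quoted above: hollow-core loss fell from 3.5 dB/km to 0.28 dB/km,
# vs ~0.2 dB/km for good conventional all-glass fibre (assumed figure).

def power_fraction(loss_db_per_km: float, km: float) -> float:
    """Fraction of launched power left after `km` of fibre."""
    return 10 ** (-loss_db_per_km * km / 10)

span_km = 50  # illustrative amplifier spacing
early_hollow = power_fraction(3.5, span_km)   # 175 dB of loss: nothing usable left
new_hollow = power_fraction(0.28, span_km)    # 14 dB of loss: ~4% survives
all_glass = power_fraction(0.2, span_km)      # 10 dB of loss: 10% survives

print(early_hollow, new_hollow, all_glass)
```

At 3.5 dB/km a 50 km span eats 175 dB, which is why the older hollow-core fibre was unusable for transport; at 0.28 dB/km the loss budget starts to look like ordinary fibre with amplifiers.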
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 15:48, Robert Raszuk wrote:
> Actually when the IGP changes, LSPs are not recomputed with LDP or SR-MPLS
> (when used without TE :).
>
> The "LSP" term is perhaps what drives your confusion --- in LDP MPLS there
> is no "Path" - in spite of the acronym (Label Switched *Path*). Labels
> are locally significant and swapped at each LSR - resulting
> essentially in a bunch of one-hop crossconnects.
>
> In other words, MPLS LDP strictly follows the IGP SPT at each LSR hop.

Yep, which is what I tried to explain as well. With LDP, MPLS-enabled nodes simply push, swap and pop. There is no concept of an "end-to-end LSP" as such. We just use the term "LSP" to define an FEC. But really, each node in the FEC's path is making its own push, swap and pop decisions.

The LFIB in each node need only be as large as the number of LDP-enabled routers in the network. You can get scenarios where FECs are also created for infrastructure links, but if you employ filtering to save on FIB slots, you really just need to allocate labels to Loopback addresses only.

Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 14:58, Baldur Norddahl wrote:
> Not really the same. Let's say the best path is through transit 1, but
> the customer thinks transit 1 sucks balls and wants his egress traffic
> to go through your transit 2. Only the VRF approach lets every BGP
> customer, even single-homed ones, make his own choices about upstream
> traffic.
>
> You would be more like a transit broker than a traditional ISP with a
> routing mix. Your service is to buy one place, but get the exact same
> product as you would have if you bought from the top X transits in your
> area. Delivered as X distinct BGP sessions to give you total freedom
> to send traffic via any of the transit providers.

We received such requests years ago, and weighed the cost of the complexity against BGP communities. In the end, if the customer wants to use a particular upstream on our side, we'd rather set up an EoMPLS circuit between them and they can have their own contract.

Practically, 90% of our traffic is peering. We don't do that much with upstream providers.

> This is also the reason you do not actually need any routes in the FIB
> for each of those transit VRFs. Just a default route, because all
> traffic will unconditionally go to said transit provider. The customer
> routes would still be there, of course.

Glad it works for you. We just found it too complex, not just for the problems it would solve, but also for the parity issues between VRF's and the global table.

Mark.
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
On 21/Jun/20 14:36, Masataka Ohta wrote: > > > That is a tragedy. Well... > If all the link-wise (or, worse, host-wise) information of possible > destinations is distributed in advance to all the possible sources, > it is not hierarchical but flat (host) routing, which scales poorly. > > Right? Host NLRI is summarized in iBGP within the domain, and eBGP outside the domain. It's no longer novel to distribute end-user NLRI in the IGP. If folk are still doing that, I can't feel sympathy for the pain they may experience. > > Why, do you think, flat routing does not but hierarchical > routing does scale? > > It is because detailed information to reach destinations > below certain level is advertised not globally but only for > small part of the network around the destinations. > > That is, with hierarchical routing, detailed information > around destinations is actively hidden from sources. > > So, with hierarchical routing, routing protocols can > carry only rough information around destinations, from > which, source side can not construct detailed (often > purposelessly nested) labels required for MPLS. But hosts often point default to a clever router. That clever router could also either point default to the provider, or carry a full BGP table from the provider. Neither the host nor their first-hop gateway need to be MPLS-aware. There are use-cases where a customer CPE can be MPLS-aware, but I'd say that in nearly 99.999% of all cases, CPE are never MPLS-aware. > According to your theory to ignore routing traffic, we can be happy > with global *host* routing table with 4G entries for IPv4 and a lot > lot lot more than that for IPv6. CIDR should be unnecessary > complication to the Internet Not sure what Internet you're running, but I, generally, accept aggregate IPv4 and IPv6 BGP routes from other AS's. I don't need to know every /32 or /128 host that sits behind them. 
> > With nested labels, you don't need so many labels at a certain nesting
> > level, which was the point of Yakov, which does not mean you don't
> > need so much information to create the entire nested labels at or near
> > the sources.

I don't know what Yakov advertised back in the day, but looking at what I and a ton of others are running in practice, in the real world, today, I don't see what you're talking about. Again, if you can identify an actual scenario today, in a live, large-scale (or even small-scale) network, I'd like to know. I'm talking about what's in practice, not theory.

> > The problem is that we can't afford the traffic (and associated processing
> > by all the related routers or things like those) and storage (at or
> > near the source) for routing (or MPLS, SR* or whatever) with such detailed
> > routing at the destinations.

Again, I disagree, as I mentioned earlier, because you won't be able to buy a router today that does only IP any cheaper than one that does both IP and MPLS. MPLS has become so mainstream that its economies of scale have made the choice between it and plain IP a non-starter. Heck, you can even do it in Linux...

Mark.
Re: Devil's Advocate - Segment Routing, Why?
> I'm saying that, if some failure occurs and the IGP changes, a
> lot of LSPs must be recomputed, which does not scale
> if the # of LSPs is large, especially in a large network
> where the IGP needs hierarchy (such as OSPF areas).
>
> Masataka Ohta

Actually when the IGP changes, LSPs are not recomputed with LDP or SR-MPLS (when used without TE :).

The "LSP" term is perhaps what drives your confusion --- in LDP MPLS there is no "Path" - in spite of the acronym (Label Switched *Path*). Labels are locally significant and swapped at each LSR - resulting essentially in a bunch of one-hop crossconnects.

In other words, MPLS LDP strictly follows the IGP SPT at each LSR hop.

Many thx,
R.
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
Let's clarify a few things ...

On Sun, Jun 21, 2020 at 2:39 PM Masataka Ohta <mo...@necom830.hpcl.titech.ac.jp> wrote:

> If all the link-wise (or, worse, host-wise) information of possible
> destinations is distributed in advance to all the possible sources,
> it is not hierarchical but flat (host) routing, which scales poorly.
>
> Right?

Neither link-wise nor host-wise information is required to accomplish, say, L3VPN services. Imagine you have three sites which would like to interconnect, each with 1000s of users. All you are exchanging as part of the VPN overlay is three subnets. Moreover, if you have 1000 PEs and those three sites are attached to only 6 of them, only those 6 PEs will need to learn those routes (hint: RT Constrain, RFC 4684).

> It is because detailed information to reach destinations
> below certain level is advertised not globally but only for
> small part of the network around the destinations.

Same thing here.

> That is, with hierarchical routing, detailed information
> around destinations is actively hidden from sources.

Same thing here. That is why, as described, we use a label stack. The top label is responsible for getting you to the egress PE. The service label sitting behind the top label is responsible for getting you through to the customer site (with or without an IP lookup at the egress PE).

> So, with hierarchical routing, routing protocols can
> carry only rough information around destinations, from
> which, source side can not construct detailed (often
> purposelessly nested) labels required for MPLS.

Usually sources have no idea of MPLS. MPLS to the host never took off.

> According to your theory to ignore routing traffic, we can be happy
> with global *host* routing table with 4G entries for IPv4 and a lot
> lot lot more than that for IPv6. CIDR should be unnecessary
> complication to the Internet

I do not think anyone is saying that here.
> With nested labels, you don't need so many labels at a certain nesting
> level, which was the point of Yakov, which does not mean you don't
> need so much information to create the entire nested labels at or near
> the sources.

The label stack has been here from day one. Each layer of the stack has a completely different role. That is your hierarchy.

Kind regards,
R.
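The two-layer hierarchy Robert describes (a transport label swapped hop-by-hop toward the egress PE, with a service label underneath that nothing touches in transit) can be sketched as a toy model. All router names and label values below are invented for illustration:

```python
# Toy model of the transport/service label hierarchy: P routers only ever
# look at (and rewrite) the top label; the service label underneath stays
# intact until the egress PE uses it to pick the customer VRF/site.

# Per-router transport LFIB: in-label -> (action, out-label)
lfib = {
    "P1":  {100: ("swap", 200)},
    "P2":  {200: ("swap", 300)},
    "PE2": {300: ("pop", None)},  # egress PE pops the transport label
}

def forward(stack: list, path: list) -> list:
    """Carry a label stack across the path; only the top label is rewritten."""
    for router in path:
        action, out_label = lfib[router][stack[0]]
        if action == "swap":
            stack[0] = out_label
        else:  # "pop"
            stack.pop(0)
    return stack

# [transport label, service label] as imposed by the ingress PE
remaining = forward([100, 9001], ["P1", "P2", "PE2"])
print(remaining)  # [9001]
```

Note that the P routers carry no knowledge of the service label at all, which is the hiding-of-detail that makes the hierarchy scale.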
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
It is destination-based flat routing, distributed 100% before any data packet, within each layer - yes. But the layers are decoupled, so in a sense this is what defines the overall hierarchy.

So transport is using MPLS LSPs; most often host (loopback) IGP routes are matched with LDP FECs and flooded everywhere, in spite of RFC 5283 at least allowing the IGP to be aggregated. Then, say, L2VPNs or L3VPNs, with their own choice of routing protocols, in turn distribute reachability for the customer sites. Those are service routes, linked to transport by their BGP next hop(s).

Many thx,
R.

On Sun, Jun 21, 2020 at 1:11 PM Masataka Ohta <mo...@necom830.hpcl.titech.ac.jp> wrote:
> Robert Raszuk wrote:
> > MPLS LDP or L3VPNs was NEVER flow driven.
> >
> > Since day one till today it was and still is purely destination based.
>
> If information to create labels at or near sources to all the
> possible destinations is distributed in advance, may be. But
> it is effectively flat routing, or, in extreme cases, flat host
> routing.
>
> Or, if information to create labels to all the active destinations
> is supplied on demand, it is flow driven.
>
> On day one, Yakov said MPLS had scaled because of nested labels
> corresponding to routing hierarchy.
>
> Masataka Ohta
Re: Devil's Advocate - Segment Routing, Why?
On Sun, Jun 21, 2020 at 1:30 PM Mark Tinka wrote:
> On 21/Jun/20 12:45, Baldur Norddahl wrote:
> > Yes, I once made a plan to have one VRF per transit provider plus a peering
> > VRF. That way our BGP customers could have a session with each of those
> > VRFs to allow them full control of the route mix. I would of course also
> > need an Internet VRF for our own needs.
> >
> > But the reality of that would be too many copies of the DFZ in the routing
> > tables. Although not necessary in the FIB, as each of the transit VRFs could
> > just have a default route installed.
>
> We just opted for BGP communities :-).

Not really the same. Let's say the best path is through transit 1, but the customer thinks transit 1 sucks balls and wants his egress traffic to go through your transit 2. Only the VRF approach lets every BGP customer, even single-homed ones, make his own choices about upstream traffic.

You would be more like a transit broker than a traditional ISP with a routing mix. Your service is to buy one place, but get the exact same product as you would have if you bought from the top X transits in your area. Delivered as X distinct BGP sessions to give you total freedom to send traffic via any of the transit providers.

This is also the reason you do not actually need any routes in the FIB for each of those transit VRFs. Just a default route, because all traffic will unconditionally go to said transit provider. The customer routes would still be there, of course.

Regards,
Baldur
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
Mark Tinka wrote: >> If information to create labels at or near sources to all the possible destinations is distributed in advance, may be. > But this is what happens today. That is a tragedy. > Whether you do it manually or use a label distribution protocol, FEC's are pre-computed ahead of time. What am I missing? If all the link-wise (or, worse, host-wise) information about possible destinations is distributed in advance to all the possible sources, it is not hierarchical but flat (host) routing, which scales poorly. Right? >> But it is effectively flat routing, or, in extreme cases, flat host routing. > I still don't get it. Why, do you think, does flat routing not scale but hierarchical routing does? It is because detailed information to reach destinations below a certain level is advertised not globally, but only to a small part of the network around the destinations. That is, with hierarchical routing, detailed information around destinations is actively hidden from sources. So, with hierarchical routing, routing protocols can carry only rough information about destinations, from which the source side cannot construct the detailed (often purposelessly nested) labels required for MPLS. > So why create labels on-demand if > a box to handle the traffic is already in place and actively working, > day-in, day-out? According to your theory of ignoring routing traffic, we could be happy with a global *host* routing table with 4G entries for IPv4, and a lot, lot, lot more than that for IPv6; CIDR would then be an unnecessary complication to the Internet. With nested labels, you don't need so many labels at a certain nesting level, which was the point of Yakov; but that does not mean you don't need so much information to create the entire nested label stack at or near the sources. The problem is that we can't afford the traffic (and the associated processing by all the related routers or things like those) and the storage (at or near the source) for routing (or MPLS, SR*, or whatever) with such detailed routing at the destinations.
Masataka Ohta
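The hierarchical-routing argument above is easy to quantify with a minimal sketch (assumed prefixes): routers near the destinations carry the detailed routes, while the rest of the network sees only one covering aggregate, so the detail is hidden from sources.

```python
import ipaddress

# 256 detailed /24 routes exist only inside the destination area.
detailed = [ipaddress.ip_network(f"10.0.{i}.0/24") for i in range(256)]

# Everyone outside the area carries a single covering aggregate instead.
aggregate = list(ipaddress.collapse_addresses(detailed))

print(len(detailed), len(aggregate))  # -> 256 1
print(aggregate[0])                   # -> 10.0.0.0/16
```

The flip side, which is Ohta-san's point, is that a source holding only `10.0.0.0/16` cannot construct per-/24 (let alone per-host) labels: the information needed to do so was deliberately never exported.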
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
On 21/Jun/20 13:11, Masataka Ohta wrote: > > If information to create labels at or near sources to all the > possible destinations is distributed in advance, may be. But this is what happens today. Whether you do it manually or use a label distribution protocol, FEC's are pre-computed ahead of time. What am I missing? > But > it is effectively flat routing, or, in extreme cases, flat host > routing. I still don't get it. > > Or, if information to create labels to all the active destinations > is supplied on demand, it is flow driven. What would the benefit of this be? Ingress and egress nodes don't come and go. They are stuck in racks in data centres somewhere, and won't disappear until a human wants them to. So why create labels on-demand if a box to handle the traffic is already in place and actively working, day-in, day-out? Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 12:45, Baldur Norddahl wrote: > > Yes I once made a plan to have one VRF per transit provider plus a > peering VRF. That way our BGP customers could have a session with each > of those VRFs to allow them full control of the route mix. I would of > course also need a Internet VRF for our own needs. > > But the reality of that would be too many copies of the DFZ in the > routing tables. Although not necessary in the FIB as each of the > transit VRFs could just have a default route installed. We just opted for BGP communities :-). Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 12:10, Masataka Ohta wrote: > > It was implemented and some technology was used by commercial > router from Furukawa (a Japanese vendor selling optical > fiber now not selling routers). I won't lie, never heard of it. > GMPLS, you are using, is the mechanism to guarantee QoS by > reserving wavelength resource. It is impossible for GMPLS > not to offer QoS. That is/was the idea. In practice (at least in our Transport network), deploying capacity as an offline exercise is significantly simpler. In such a case, we wouldn't use GMPLS for capacity reservation, just path re-computation in failure scenarios. Our Transport network isn't overly meshed. It's just stretchy. Perhaps if one was trying to build a DWDM backbone into, out of and through every city in the U.S., capacity reservation in GMPLS may be a use-case. But unless someone is willing to pipe up and confess to implementing it in this way, I've not heard of it. > > Moreover, as some people says they offer QoS with MPLS, they > should be using some prioritized queueing mechanisms, perhaps > not poor WFQ. It would be a combination - PQ and WFQ depending on the traffic type and how much customers want to pay. But carrying an MPLS EXP code point does not make MPLS unscalable. It's no different to carrying a DSCP or IPP code point in plain IP. Or even an 802.1p code point in Ethernet. > They are different, of course. But, GMPLS is to reserve bandwidth > resource. In theory. What are people doing in practice? I just told you our story. > MPLS, in general, is to reserve label values, at least. MPLS is the forwarding paradigm. Label reservation/allocation can be done manually or with a label distribution protocol. MPLS doesn't care how labels are generated and learned. It will just push, swap and pop as it needs to. > I didn't say scaling problem caused by QoS. 
> > But, as you are avoiding to extensively use MPLS, I think you > are aware that extensive use of MPLS needs management of a > lot of labels, which does not scale. > > Or, do I misunderstand something? I'm not avoiding extensive use of MPLS. I want extensive use of MPLS. In IPv4, we forward in MPLS 100%. In IPv6, we forward in MPLS 80%. This is due to vendor nonsense. Trying to fix. > No. IntServ specifies format to carry QoS specification in RSVP > packets without assuming any specific model of QoS. Then I'm failing to understand your point, especially since it doesn't sound like any operator is deploying such a model, or if so, publicly suffering from it. > No. As experimental switches are working years ago and making > it work >10Tbps is not difficult (switching is easy, generating > 10Tbps packets needs a lot of parallel equipment), there is little > remaining for research. We'll get there. This doesn't worry me so much :-). Either horizontally or vertically. I can see a few models to scale IP/MPLS carriage. > > SDN, maybe. Though I'm not saying SDN scale, it should be no > worse than MPLS. I still can't tell you what SDN is :-). I won't suffer it in this decade, thankfully. > I did some retrospective research. > > https://en.wikipedia.org/wiki/Multiprotocol_Label_Switching > History > 1994: Toshiba presented Cell Switch Router (CSR) ideas to IETF BOF > 1996: Ipsilon, Cisco and IBM announced label switching plans > 1997: Formation of the IETF MPLS working group > 1999: First MPLS VPN (L3VPN) and TE deployments > 2000: MPLS traffic engineering > 2001: First MPLS Request for Comments (RFCs) released > > as I was a co-chair of 1994 BOF and my knowledge on MPLS is > mostly on 1997 ID: > > https://tools.ietf.org/html/draft-ietf-mpls-arch-00 > > there seems to be a lot of terminology changes. 
My comment to that was in reference to your text, below: "What if, an inner label becomes invalidated around the destination, which is hidden, for route scalability, from the equipments around the source?" I've never heard of such an issue in 16 years. > > I'm saying that, if some failure occurs and IGP changes, a > lot of LSPs must be recomputed, which does not scale > if # of LSPs is large, especially in a large network > where IGP needs hierarchy (such as OSPF area). That happens everyday, already. Links fail, IGP re-converges, LDP keeps humming. RSVP-TE too, albeit all that state does need some consideration especially if code is buggy. Particularly, where you have LFA/IP-FRR both in the IGP and LDP, I've not come across any issue where IGP re-convergence caused LSP's to fail. In practice, IGP hierarchy (OSPF Areas or IS-IS Levels) doesn't help much if you are running MPLS. FEC's are forged against /32 and /128 addresses. Yes, as with everything else, it's a trade-off. Mark.
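Mark's point that IGP hierarchy doesn't help much under MPLS comes from LDP's exact-match rule: per RFC 5036, a label is bound to an exact FEC, typically a PE loopback /32, and by default the binding is only usable when that exact prefix is in the RIB. A simplified sketch (addresses and labels invented):

```python
def usable_bindings(ldp_bindings: dict[str, int], rib: set[str]) -> dict[str, int]:
    """Keep only bindings whose FEC exactly matches an installed route."""
    return {fec: label for fec, label in ldp_bindings.items() if fec in rib}

# LDP label bindings, one per PE loopback /32.
bindings = {"192.0.2.1/32": 24001, "192.0.2.2/32": 24002}

# With host routes flooded everywhere, both LSPs come up:
print(usable_bindings(bindings, {"192.0.2.1/32", "192.0.2.2/32"}))

# Summarize the loopbacks into 192.0.2.0/24 at the area border, and the
# exact match fails: no usable bindings, so no LSPs.
print(usable_bindings(bindings, {"192.0.2.0/24"}))  # -> {}
```

This is exactly the gap RFC 5283 ("LDP Extension for Inter-Area LSPs"), mentioned earlier in the thread, was meant to close by relaxing the exact match to a longest match.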
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
Robert Raszuk wrote: > MPLS LDP or L3VPNs was NEVER flow driven. > > Since day one till today it was and still is purely destination based. If information to create labels at or near sources to all the possible destinations is distributed in advance, may be. But it is effectively flat routing, or, in extreme cases, flat host routing. Or, if information to create labels to all the active destinations is supplied on demand, it is flow driven. On day one, Yakov said MPLS had scaled because of nested labels corresponding to routing hierarchy. Masataka Ohta
Re: Devil's Advocate - Segment Routing, Why?
On Sun, Jun 21, 2020 at 9:56 AM Mark Tinka wrote: > > > On 20/Jun/20 22:00, Baldur Norddahl wrote: > > > I can't speak for the year 2000 as I was not doing networking at this > level at that time. But when I check the specs for the base mx204 it says > something like 32 VRFs, 2 million routes in FIB and 6 million routes in > RIB. Clearly those numbers are the total of routes across all VRFs > otherwise you arrive at silly numbers (64 million FIB if you multiply, 128k > FIB if you divide by 32). My conclusion is that scale wise you are ok as > long you do not try to have more than one VRF with a complete copy of the > DFZ. > > > I recall a number of networks holding multiple VRF's, including at least > 2x Internet VRF's, for numerous use-cases. I don't know if they still do > that today, but one can get creative real quick :-). > > Yes I once made a plan to have one VRF per transit provider plus a peering VRF. That way our BGP customers could have a session with each of those VRFs to allow them full control of the route mix. I would of course also need a Internet VRF for our own needs. But the reality of that would be too many copies of the DFZ in the routing tables. Although not necessary in the FIB as each of the transit VRFs could just have a default route installed. Regards, Baldur
Re: Devil's Advocate - Segment Routing, Why?
Mark Tinka wrote: >> There are many. So, our research group tried to improve RSVP. > I'm a lot younger than the Internet, but I read a fair bit about its history. I can't remember ever coming across an implementation of RSVP between a host and the network in a commercial setting. No, of course, because, as we agreed, RSVP has a lot of problems. > Was "S-RSVP" ever implemented, and deployed? It was implemented, and some of the technology was used by a commercial router from Furukawa (a Japanese vendor now selling optical fiber, not routers). However, perhaps, most people think the showstopper for RSVP is the lack of scalability of weighted fair queueing, though that is not a problem specific to RSVP, and MPLS shares the same problem. > QoS has nothing to do with MPLS. You can do QoS with or without MPLS. GMPLS, which you are using, is a mechanism to guarantee QoS by reserving wavelength resources. It is impossible for GMPLS not to offer QoS. Moreover, as some people say they offer QoS with MPLS, they should be using some prioritized queueing mechanisms, perhaps not poor WFQ. > I should probably point out, also, that RSVP (or RSVP-TE) is not MPLS. They are different, of course. But GMPLS is to reserve bandwidth resources; MPLS, in general, is to reserve label values, at least. > All MPLS can do is convey IPP or DSCP values as an EXP code point in the core. I'm not sure how that creates a scaling problem within MPLS itself. I didn't say the scaling problem is caused by QoS. But, as you are avoiding extensive use of MPLS, I think you are aware that extensive use of MPLS needs management of a lot of labels, which does not scale. Or do I misunderstand something? > If I understand this correctly, would this be the IntServ QoS model? No. IntServ specifies a format to carry QoS specifications in RSVP packets without assuming any specific model of QoS. >> I didn't attempt to standardize our result in IETF, partly because optical packet switching was a lot more interesting. > Still is, even today :-)? No.
As experimental switches were working years ago, and making it work at >10Tbps is not difficult (switching is easy; generating 10Tbps of packets needs a lot of parallel equipment), there is little remaining for research. https://www.osapublishing.org/abstract.cfm?URI=OFC-2010-OWM4 >> Assuming a central controller (and its collocated or distributed back up controllers), we don't need complicated protocols in the network to maintain integrity of the entire network. > Well, that's a point of view, I suppose. I still can't walk into a shop and "buy a controller". I don't know what this controller thing is, 10 years on. SDN, maybe. Though I'm not saying SDN scales, it should be no worse than MPLS. > I can't say I've ever come across that scenario running MPLS since 2004. I did some retrospective research. https://en.wikipedia.org/wiki/Multiprotocol_Label_Switching History 1994: Toshiba presented Cell Switch Router (CSR) ideas to IETF BOF 1996: Ipsilon, Cisco and IBM announced label switching plans 1997: Formation of the IETF MPLS working group 1999: First MPLS VPN (L3VPN) and TE deployments 2000: MPLS traffic engineering 2001: First MPLS Request for Comments (RFCs) released As I was a co-chair of the 1994 BOF and my knowledge of MPLS is mostly from the 1997 ID: https://tools.ietf.org/html/draft-ietf-mpls-arch-00 there seem to be a lot of terminology changes. I'm saying that, if some failure occurs and the IGP changes, a lot of LSPs must be recomputed, which does not scale if the # of LSPs is large, especially in a large network where the IGP needs hierarchy (such as OSPF areas). Masataka Ohta
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
On 20/Jun/20 17:12, Robert Raszuk wrote: > > MPLS is not flow driven. I sent some mail about it but perhaps it > bounced. > > MPLS LDP or L3VPNs was NEVER flow driven. > > Since day one till today it was and still is purely destination based. > > Transport is using LSP to egress PE (dst IP). > > L3VPNs are using either per dst prefix, or per CE or per VRF labels. > No implementation does anything upon "flow detection" - to prepare any > nested labels. Even in FIBs all information is preprogrammed in > hierarchical fashion well before any flow packet arrives. If you really don't like LDP or RSVP-TE, you can statically assign labels and manually configure FEC's across your entire backbone. If trading state for administration is your thing, of course :-). Mark.
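What "statically assign labels and manually configure FEC's" trades away can be seen in a toy model (all router names, labels and interfaces invented): every LSR holds a hand-maintained LFIB, so there is no LDP/RSVP-TE protocol state, but every entry on every box is yours to administer by hand.

```python
# Per-router static LFIB: in_label -> (action, out_label, out_interface).
# In real life this would be static MPLS configuration on each LSR.
lfib = {
    "P1": {100: ("swap", 200, "to-P2")},
    "P2": {200: ("pop", None, "to-PE2")},   # penultimate/egress pop
}

def forward(router: str, label: int) -> tuple:
    """One MPLS forwarding step: push/swap/pop is all an LSR ever does."""
    return lfib[router][label]

print(forward("P1", 100))  # -> ('swap', 200, 'to-P2')
print(forward("P2", 200))  # -> ('pop', None, 'to-PE2')
```

Every topology change means editing these tables on every affected router, which is precisely the administrative burden a label distribution protocol automates away.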
Re: Hurricane Electric has reached 0 RPKI INVALIDs in our routing table
Hi, On Thu, Jun 18, 2020, at 04:01, Jon Lewis wrote: > > Just like I said, if you create an ROA for an aggregate, forgetting that > you have customers using subnets of that aggregate (or didn't create ROAs > for customer subnets with the right origin ASNs), you're literally telling > those using RPKI to verify routes "don't accept our customers' routes." > That might not be bad for "your network", but it's probably bad for > someone's. That makes you a bad upstream operator, one that does things without understanding the consequences. This may still be the unfortunate norm, but it's by no means something to be considered an acceptable state. Put otherwise: if you have downstream customers that you allow to announce part of your address space in the GRT, make sure you can still provide the service after making changes (like RPKI signing). Put yet another way: if you lease IP space (with or without connectivity), make sure all the additional services are included in one way or another. Those services should include RPKI signing and reverse DNS, and the strict minimum (only slightly better than not doing it at all) should be via "open a service ticket"; the more automated, the better.
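Jon's failure mode follows directly from the origin-validation algorithm of RFC 6811, sketched below in simplified form (ASNs and prefixes are invented): a ROA for the aggregate, with no maxLength and only the upstream's ASN, makes a customer's more-specific announcement Invalid rather than NotFound.

```python
import ipaddress

# ROAs as (prefix, max_length, origin_asn). Only the aggregate is signed.
roas = [(ipaddress.ip_network("203.0.113.0/24"), 24, 64500)]

def validate(prefix: str, origin_asn: int) -> str:
    """Simplified RFC 6811: valid / invalid / not-found for one route."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_len, roa_asn in roas:
        if net.subnet_of(roa_net):
            covered = True  # at least one ROA covers this prefix
            if origin_asn == roa_asn and net.prefixlen <= max_len:
                return "valid"
    # Covered by a ROA but no match -> invalid; no covering ROA -> not-found.
    return "invalid" if covered else "not-found"

print(validate("203.0.113.0/24", 64500))    # -> valid (your own aggregate)
print(validate("203.0.113.128/25", 64501))  # -> invalid (customer subnet!)
print(validate("198.51.100.0/24", 64501))   # -> not-found (unsigned space)
```

Before signing the aggregate, the customer's /25 would have been NotFound and typically accepted; after signing, it flips to Invalid, which is the "don't accept our customers' routes" outcome.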
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
On 20/Jun/20 17:08, Robert Raszuk wrote: > > > But with that let's not forget that aggregation here is still not > spec-ed out well and to the best of my knowledge it is also not > shipping yet. I recently proposed an idea how to aggregate SRGBs .. > one vendor is analyzing it. Hence why I think SR still needs time to grow up. There are some things I can be maverick about. I don't think SR is it, today. Mark.
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
On 20/Jun/20 15:39, Masataka Ohta wrote: > Ipsilon was hopeless because, as Yakov correctly pointed out, flow > driven approach to automatically detect flows does not scale. > > The problem of MPLS, however, is that, it must also be flow driven, > because detailed route information at the destination is necessary > to prepare nested labels at the source, which costs a lot and should > be attempted only for detected flows. Again, I think you are talking about what RSVP should have been. RSVP != MPLS. > Routing table at IPv4 backbone today needs at most 16M entries to be > looked up by simple SRAM, which is as fast as MPLS look up, which is > one of the reasons why we should obsolete IPv6. I'm not sure I should ask this for fear of taking this discussion way off tangent... aaah, what the heck: So if we can't assign hosts IPv4 anymore because it has run out, should we obsolete IPv6 in favour of CGN? I know this works. > > Though resource reserved flows need their own routing table entries, > they should be charged proportional to duration of the reservation, > which can scale to afford the cost to have the entries. RSVP failed to take off when it was designed. Outside of capturing Netflow data (or tracking firewall state), nobody really cares about handling flows at scale (no, I'm not talking about ECMP). Why would we want to do that in 2020 if we didn't in 2000? Mark.
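Ohta-san's "16M entries in simple SRAM" claim refers to a flat lookup at /24 granularity: 2^24 = 16M slots, so the top 24 bits of the destination address index the memory directly, with no tree walk. A sketch of that idea (next-hop values invented; prefixes longer than /24 would need a second stage, which is the granularity caveat):

```python
import ipaddress

# 2^24 one-byte next-hop indices: 16 MB of "SRAM", indexed directly by
# the top 24 bits of the destination address.
table = bytearray(1 << 24)

def install(prefix: str, nh: int) -> None:
    net = ipaddress.ip_network(prefix)
    assert net.prefixlen <= 24, "host bits would need a second-stage table"
    base = int(net.network_address) >> 8
    # A shorter prefix simply fills all the /24 slots it covers.
    for i in range(1 << (24 - net.prefixlen)):
        table[base + i] = nh

def lookup(dst: str) -> int:
    """One memory read, comparable to a single MPLS label lookup."""
    return table[int(ipaddress.ip_address(dst)) >> 8]

install("198.51.100.0/24", 7)
print(lookup("198.51.100.42"))  # -> 7
```

Whether one agrees with the conclusion about IPv6, the mechanism is real: a direct-indexed table turns longest-prefix match into a single memory access, at the cost of 16M entries regardless of how few routes exist.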
Re: why am i in this handbasket? (was Devil's Advocate - Segment Routing, Why?)
On 20/Jun/20 01:32, Randy Bush wrote: > there is saku's point of distributing labels in IGP TLVs/LSAs. i > suspect he is correct, but good luck getting that anywhere in the > internet vendor task force. and that tells us a lot about whether we > can actually effect useful simplification and change. This is shipping today with SR-MPLS. Besides still being brand new and not yet fully field tested by the community, my other concern is unless you are running a Juniper and have the energy to pull a "Vijay Gill" and move your entire backbone to IS-IS, you'll get either no SR-ISISv6 support, no SR-OSPFv3 support, or both, with all the vendors. Which brings me back to the same piss-poor attention LDPv6 is getting, which is, really, poor attention to IPv6. Kind of hard for operators to take IPv6 seriously at this level if the vendors, themselves, aren't. > is a significant part of the perception that there is a forwarding > problem the result of the vendors, 25 years later, still not > designing for v4/v6 parity? I think the forwarding is fine, if you're carrying the payload in MPLS. The problem is the control plane. It's not insurmountable; the vendors just want to do less work. The issue is IPv4 is gone, and trying to keep it around will only lead to the creation of more hacks, which will further complicate the control and data plane. > > there is the argument that switching MPLS is faster than IP; when the > pressure points i see are more at routing (BGP/LDP/RSVP/whatever), > recovery, and convergence. Either way, the MPLS or IP problem already has an existing solution. If you like IP, you can keep it. If you like MPLS, you can keep it. So I'd be spending less time on the forwarding (of course, if there are ways to improve that and someone has the time, why not), and as you say, work on fixing the control plane and the signaling for efficiency and scale. 
> > did we really learn so little from IP routing that we need to > recreate analogous complexity and fragility in the MPLS control > plane? ( sound of steam emanating from saku's ears :) The path to SR-MPLS's inherent signaling carried in the IGP is an optimal solution, one that even I have been wanting since inception. But it's still too fresh, global deployment is terrible, and there is still much to be learned about how it behaves outside of the lab. For me, a graceful approach toward SR via LDPv6 makes sense. But, as always, YMMV. > and then there is buffering; which seems more serious than simple > forwarding rate. get it there faster so it can wait in a queue? my > principal impression of the Stanford/Google workshops was the parable > of the blind men and the elephant. though maybe Matt had the main > point: given scaling 4x, Moore's law can not save us and it will all > become paced protocols. will we now have a decade+ of BBR evolution > and tuning? if so, how do we engineer our networks for that? This deserves a lot more attention than it's receiving. The problem is it doesn't sound sexy enough to compile into a PPT that you can project to suits whom you need to persuade to part with cash. It doesn't have that 5G or SRv6 or Controller or IoT ring to it :-). It's been a while since vendors that control a large portion of the market paid real attention to their geeky side. The buffer problem, for me, would fall into that category. Maybe a smaller, more agile, more geeky start-up can take the lead on this one. > and up 10,000m, we watch vendor software engineers hand crafting in > an assembler language with if/then/case/for, and running a chain of > checking software to look for horrors in their assembler programs. > it's the bleeping 21st century. why are the protocol specs not > formal and verified, and the code formally generated and verified?
> and don't give me too slow given that the hardware folk seem to be > able to do 10x in the time it takes to run valgrind a few dozen > times. And for today's episode of Jeopardy: "What used to be the IETF?" > we're extracting ore with hammers and chisels, and then hammering it > into shiny objects rather than safe and securable network design and > construction tools. Rush it out the factory, fast, even though it's not ready. Get all their money before they board the ship and sail for Mars. Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 20/Jun/20 22:00, Baldur Norddahl wrote: > > I can't speak for the year 2000 as I was not doing networking at this > level at that time. But when I check the specs for the base mx204 it > says something like 32 VRFs, 2 million routes in FIB and 6 million > routes in RIB. Clearly those numbers are the total of routes across > all VRFs otherwise you arrive at silly numbers (64 million FIB if you > multiply, 128k FIB if you divide by 32). My conclusion is that scale > wise you are ok as long you do not try to have more than one VRF with > a complete copy of the DFZ. I recall a number of networks holding multiple VRF's, including at least 2x Internet VRF's, for numerous use-cases. I don't know if they still do that today, but one can get creative real quick :-). > > More worrying is that 2 million routes will soon not be enough to > install all routes with a backup route, invalidating BGP FRR. I have a niggling feeling this will be solved before we get there. Now, whether we can afford it is a whole other matter. Mark.
Re: Devil's Advocate - Segment Routing, Why?
On 21/Jun/20 00:54, Sabri Berisha wrote: > That will be very advantageous in a datacenter environment, or any other > environment dealing with a lot of ECMP paths. > > I can't tell you how often during my eBay time I've been troubleshooting > end-to-end packetloss between hosts in two datacenters where there were at > least > 10 or more layers of up to 16 way ECMP between them. Having a record of which > path is being taken by a packet is very helpful to determine the one with a > crappy > transceiver. > > That work is already underway, albeit not specifically for MPLS. For example, > I've worked with an experimental version of In-Band Network Telemetry (INT) > as described in this draft: > https://tools.ietf.org/html/draft-kumar-ippm-ifa-02 > > I even demonstrated a very basic implementation during SuperCompute 19 in > Denver > last year. Most people who were interested in the demo were academics however, > probably because it wasn't a real networking event. > > Note that there are several caveats that come with this draft and previous > versions, and that it is still very much work in progress. But the potential > is > huge, at least in the DC. Alright, we'll wait and see, then. > That's a different story, but not entirely impossible. A probe packet can > be sent across AS borders, and as long as the two NOCs are cooperating, the > entire path can be reconstructed. Yes, for once-off troubleshooting, I suppose that would work. My concern is if it's for normal day-to-day operations. But who knows, maybe someone will propose that too :-). Mark.
Re: 60 ms cross-continent
On Sat, 20 Jun 2020 at 23:14, Bryan Fields wrote: > I think he might be referring to the newer modulation types (QAM) on long haul > transport. There's quite a bit of time in µs that the encoding takes into QAM > and adding FEC. You typically won't see this at the pluggable level between > switches and stuff. FEC is low tens of meters (i.e. low tens of nanoseconds), QAM is less. Won't impact the pipeline or NPU scenarios meaningfully; it will impact the low latency scenario. -- ++ytti
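Saku's habit of quoting latency in meters works because time and path length are interchangeable via propagation speed. A small converter (the fibre group index of ~1.468 is a typical assumed value for silica fibre, not from the thread) shows why the same added delay "costs" less distance in fibre than over the air, which is part of the shortwave/microwave appeal mentioned at the top of the thread:

```python
C_VACUUM = 299_792_458            # speed of light in vacuum, m/s
C_FIBER = C_VACUUM / 1.468        # assumed group index for silica fibre

def meters_to_ns(meters: float, speed: float = C_FIBER) -> float:
    """Equivalent one-way propagation delay for a given path length."""
    return meters / speed * 1e9

# Tens of meters of equivalent fibre path is on the order of 100 ns;
# the same distance through air (microwave/shortwave) is shorter in time.
print(round(meters_to_ns(20)))            # fibre
print(round(meters_to_ns(20, C_VACUUM)))  # free space
```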