On Mon, Dec 20, 2021 at 3:11 PM Templin (US), Fred L <fred.l.temp...@boeing.com> wrote:
> Tom, in modern reassembly it is not going to wait for the MSL for all fragments to arrive anymore; either they all get there after a very small inter-fragment delay, or you send an immediate FRAGREP and possibly also a PTB soft error, then quickly declare the reassembly dead if that doesn’t help. And you make sure to inspect IDs of received fragments before admitting them into the reassembly cache so you don’t end up caching garbage that will just have to be discarded later.

Fred,

It doesn't matter, in the sense that reassembly is a non-work-conserving mechanism. In order to perform reassembly, packet fragments need to be held, which means memory will be consumed, and since memory is a finite resource it needs to be managed. Managing memory means that some policy is needed for when to time out a reassembly or which fragment train to discard under memory pressure. A network that implements some arbitrary policy can cause problems for unsuspecting hosts. For instance, there are mechanisms for hosts to try to guess what the timeout is in a NAT box and send a keepalive packet before an idle NAT state is evicted. This is just a guess that may or may not be right, and in fact there might not even be a NAT in the path, in which case the host is just wasting energy sending keepalives. Also, the second we introduce a new exhaustible resource in the path, it becomes yet another denial-of-service vector (consider the case where an attacker spoofs a whole bunch of IP parcels).

Unless the network can coordinate very specifically with the host about what it's doing on behalf of the host stack, I think it's much better for the network to just focus on forwarding packets without delay and let the host handle the details of receive processing, reassembly, security, etc.
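As a rough illustration of the memory-management point above, here is a minimal sketch of a reassembly cache with a byte budget, a timeout, and an eviction rule. Everything in it is an assumption made for illustration: the class name ReassemblyCache, its parameters, and the "drop the oldest fragment train" rule are hypothetical and are not taken from any real stack or from the parcels draft. The only point is that each of these knobs is a policy choice that a host on the far side of the device cannot see.

```python
import time
from collections import OrderedDict


class ReassemblyCache:
    """Toy fragment cache (hypothetical): byte budget plus per-train timeout."""

    def __init__(self, max_bytes, timeout_s, clock=time.monotonic):
        self.max_bytes = max_bytes      # finite memory that must be managed
        self.timeout_s = timeout_s      # how long an incomplete train may wait
        self.clock = clock
        self.used = 0
        # key -> (first_seen, {offset: payload}); insertion order tracks age
        self.trains = OrderedDict()

    def _drop(self, key):
        _, frags = self.trains.pop(key)
        self.used -= sum(len(p) for p in frags.values())

    def expire(self):
        """Policy choice 1: time out reassemblies that have waited too long."""
        now = self.clock()
        stale = [k for k, (t0, _) in self.trains.items()
                 if now - t0 > self.timeout_s]
        for key in stale:
            self._drop(key)

    def add(self, key, offset, payload):
        """Hold one fragment; return False if it can never be admitted."""
        self.expire()
        size = len(payload)
        if size > self.max_bytes:
            return False
        if key in self.trains and offset in self.trains[key][1]:
            return True  # duplicate fragment, nothing new to store
        # Policy choice 2: under memory pressure, discard the oldest train.
        while self.used + size > self.max_bytes:
            self._drop(next(iter(self.trains)))
        if key not in self.trains:
            self.trains[key] = (self.clock(), {})
        self.trains[key][1][offset] = payload
        self.used += size
        return True
```

With a budget of 100 bytes, admitting a 60-byte fragment for train "B" silently discards an earlier 60-byte fragment for train "A". If "B" is spoofed traffic, that is the denial-of-service vector in miniature.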
Tom

> Fred
>
> *From:* Tom Herbert [mailto:t...@herbertland.com]
> *Sent:* Monday, December 20, 2021 1:06 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* to...@strayalpha.com; int-area@ietf.org
> *Subject:* Re: IP parcels
>
> On Mon, Dec 20, 2021 at 12:03 PM Templin (US), Fred L <fred.l.temp...@boeing.com> wrote:
>
> Tom, sorry I will try to use my words more carefully; I am using GSO/GRO also for a UDP-based transport protocol – not QUIC but something similar. I like GSO/GRO very much; I am glad the service is available and I want to see it continue. But, my understanding of the services is that they leverage the IP ID field in whole IPv4 packets that are not eligible for fragmentation and those are limitations I am seeking to improve on.
>
> I want to enable a facility similar to GSO/GRO that works for both IPv4 and IPv6 packets and allows for lower layers to fragment if necessary. And, I want to use a well-behaved 32-bit IPv6 ID instead of the 16-bit IPv4 one where the use is not well defined when DF=1.
>
> There has been a lot of work in this area. For instance, you might want to take a look at https://www.youtube.com/watch?v=ccUeG1dAhbw
>
> About reassembly, that would only happen on the end systems themselves or on a very capable device that is very close to the end systems; I would not want for a high-speed core router to have to reassemble.
>
> Even so, an intermediate device close to the end system still has to provide service to more than one host. Reassembly requires memory to store fragments. A host would need enough memory to service all of its own flows, but an intermediate node would need the number of hosts it serves times that amount of memory to perform reassembly.
> This is a fundamental scaling problem of stateful services in the network: inevitably the network nodes cannot scale to the number of users or flows that require service. In the best-case scenario, when resources are not available the network won't attempt the stateful operation and will just forward the packet unimpeded (which is fine because a host will never rely on this class of optimization). In the worst-case scenario, the network will take a detrimental action such as forcibly breaking a connection (e.g. this is what can happen when a NAT evicts a TCP connection because it has run out of memory). IMO, maintaining state in the network is a bad, albeit unfortunately prevalent, idea.
>
> Tom
>
> Again, GSO/GRO is nice work and much respect is due to those who made it possible.
>
> Fred
>
> *From:* Tom Herbert [mailto:t...@herbertland.com]
> *Sent:* Monday, December 20, 2021 9:20 AM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* to...@strayalpha.com; int-area@ietf.org
> *Subject:* Re: [Int-area] [EXTERNAL] Re: IP parcels
>
> The world is not just TCP anymore. QUIC and other UDP-based transports have already shown performance increases using facilities like GSO/GRO which are essentially a short term and non-standard implementation of what parcels promise to do in the long term.
>
> Fred,
>
> Can you explain why GSO/GRO aren't sufficient and are only short term solutions? We've been using these for almost twenty years now with good effect. These are widely deployed with TCP, TSO works well to offload transmit, LRO is defined and is in much better shape to offload RX now that programmable devices are emerging. For TCP it's hard to see how IP parcels would help significantly, but even for UDP we now have UDP GSO, sendmmsg, and recvmmsg that mitigate the cost of system calls and interrupts to which the draft refers.
> The reason these aren't standards in the IETF is that they're implementation techniques and not protocols (although I will point out that GSO/GRO/sendmmsg/recvmmsg are in all Linux devices, so that effectively makes them a de facto implementation standard).
>
> I am also concerned about the idea that intermediate devices would perform reassembly. This has a whole bunch of implications: middleboxes are no longer work conserving, and there seems to be an implicit requirement that the device has to be in the path of every packet in a parcel (i.e. even in the case of the last hop performing reassembly). Also, as simply a matter of resources and capabilities, hosts are in a much better position to perform tasks like reassembly. I don't readily see that having intermediate devices perform reassembly would be a win for hosts, and even if it were, host implementations still would need the capability to perform reassembly themselves since they will never rely on the network to always do it for them.
>
> Tom
>
> Thanks - Fred
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com]
> *Sent:* Sunday, December 19, 2021 11:53 AM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
> *Subject:* Re: [Int-area] IP parcels
>
> Hi, Fred (et al.),
>
> On Dec 19, 2021, at 10:21 AM, Templin (US), Fred L <fred.l.temp...@boeing.com> wrote:
>
> Joe, your insistence on using html makes it impossible to respond to all of your points inline which is the reason for my top-posts.
>
> I use MacOS mail, IOS mail, and Thunderbird on Windows, all using default configurations, FWIW. I appear to be able to post inside everyone else’s responses. I don’t know if the IETF’s mailers are munging formats, though.
>
> I’ve made my position clear.
> However:
>
> - You still haven’t shown any evidence that end systems need to do all this extra work so they can somehow run faster, nor that this will be noticeably faster than large (i.e., 20-60KB) IPv4 packets.
>
> - You still haven’t shown any reason why this is feasible; in fact, below you add the idea of on-path fragmentation, which is largely deprecated because fragments won’t traverse tunnels (in your case, notably for single chunks larger than 64KB). Never mind that the fragmentation is both expensive and slow-path at routers.
>
> - You have claimed that both routers and transports will somehow adopt this when we can’t even get reasonably large MTUs that already fit within IPv4 across heterogeneous enterprises.
>
> IPv4 is over; even if you don’t think so, any way forward with larger packets starts with:
>
> a) getting ~64KB IP packets across the net
>
> b) after (a), prove that >64KB are needed based on the IPv6 jumbo approach
>
> Any way forward with a lot of small packets inside one large one (where both chunks and total length are less than 64K) starts by proving there’s a need and by fixing how TCP interacts with its inherent burstiness and loss correlation.
>
> Only THEN will this issue be worth more discussion.
>
> Joe
>
> Parcels that contain a single segment whether 64K or considerably less are still sent as (singleton) parcels and not ordinary packets. That way, nodes in the network can know that it is OK to encapsulate and fragment since by asserting its interest in receiving parcels the destination has also subscribed to being able to reassemble up to a full 64K.
>
> Parcels do not set (Payload Length / Total Length) to 0; they set it to the length of the first element of the parcel (which is also the same length of each non-final element of the parcel).
> What happens then is that network equipment will see a unit with an L3 length that may be considerably shorter than the L2 length. You are right that legacy routers might not like this (or, they might truncate the packet according to L3 length), and so for paths that might traverse legacy routers the first-hop node that recognizes parcels instead encapsulates the parcel in an IPv4 or IPv6 header then performs (source) fragmentation if necessary. These IP fragments will then travel through legacy routers just fine.
>
> About RFC793bis, you and Wes Eddy know far more about its status than I do; I only noted that this is something with TCP implications and so made mention of it in case there is still room for a few more engine tweaks while the hood is still open.
>
> About IPv4, I am currently running IPv4 edge networks with IPv4-in-IPv6 tunnel endpoints connected to an IPv6 transit network and it works really well. End systems get to use smaller addresses and smaller headers, and they can talk to remote correspondents using IPv4 as if they were all on the same IPv4 network. So yes, I think we might still want to consider IPv4 for edge networks like that.
>
> About getting 64K packets across, only the edge networks or end systems see them as large packets; in the core they are typically broken up into something much smaller by ingress nodes that apply segmentation/fragmentation. We don’t need the core to move to jumbo links; we only need that at the edges. ATM taught us that.
>
> About our “nail”, end systems get to see larger packets/parcels and get to take advantage of the reduced interrupts and system call overhead they provide. That is what makes it worthwhile.
>
> Fred
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com <to...@strayalpha.com>]
> *Sent:* Saturday, December 18, 2021 8:13 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
> *Subject:* Re: [Int-area] IP parcels
>
> Hi, Fred,
>
> If you have one segment that’s less than 64K, you don’t need the parcel option at all.
>
> If you have something longer than 64K, either as a single segment or multiple smaller segments, by setting total length to 0, you end up being dropped by legacy routers, which either ignore options they don’t understand or drop packets with options they don’t support.
>
> RFC793bis does talk about IPv6 jumbos, but this new work is out of scope for RFC793bis - furthermore, it’s too late. It has passed WGLC, IETF LC, and is currently in IESG review for publication.
>
> You also haven’t addressed why the IETF should be taking up this *new* work for IPv4, which I thought also had been considered ineligible.
>
> But overall, again, what’s the point? We can’t even get 64K IP packets through the Internet; making them larger doesn’t make that easier or more likely. Such large sizes are of diminishing benefit; routers already forward at 40Gbps per link for minimal packets and end systems have other problems that this exacerbates.
>
> This seems a lot like a huge hammer in search of a nail. Where’s the nail?
>
> Joe
>
> --
> Joe Touch, temporal epistemologist
> www.strayalpha.com
>
> On Dec 18, 2021, at 7:18 PM, Templin (US), Fred L <fred.l.temp...@boeing.com> wrote:
>
> Joe, I never said that I wanted to restrict this to small transport segments; in fact, I want just the opposite – I want large segments. A perfectly legal parcel is one which includes 1 ~64KB segment - another legal parcel is one which includes 64 of them!
> If you want bigger segments than that, then true jumbos are necessary and this spec does not preclude that.
>
> About RFC793(bis), I see there is a section on Jumbos and IP parcels is (sort of) an application of Jumbos.
>
> Fred
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com <to...@strayalpha.com>]
> *Sent:* Saturday, December 18, 2021 4:57 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
> *Subject:* [EXTERNAL] Re: [Int-area] IP parcels
>
> Hi, Fred,
>
> Regarding 793bis, new ideas are out of scope. It’s supposed to be a roll-in of existing items only.
>
> Never mind the problems below, which “TCP will find a way” doesn’t magically fix.
>
> The problem is this:
>
> - end systems need to send larger transport segments (not just IP segments)
>
> - if they can do that, they should, filling up to the largest IP payload
>
> Having an IP packet have the opportunity to include lots of small transport packets assumes transport packets either work better or faster when they’re small. It’s the opposite.
>
> Joe
>
> --
> Joe Touch, temporal epistemologist
> www.strayalpha.com
>
> On Dec 18, 2021, at 4:42 PM, Templin (US), Fred L <fred.l.temp...@boeing.com> wrote:
>
> Joe, TCP will find a way to adapt – it always has. I also see that TCP is currently undergoing a second edition revision so the timing seems right to consider IP parcels in the analysis. I am Cc’ing the second edition editor for his information – Wesley, please consider this new concept called “IP Parcels” as it relates to your document.
>
> Here is the latest draft version – it expands on the “Motivation” section and adds a number of important features such as a new “Parcels Permitted” TCP option:
>
> https://datatracker.ietf.org/doc/draft-templin-intarea-parcels/
>
> Fred
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com <to...@strayalpha.com>]
> *Sent:* Friday, December 17, 2021 6:01 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org
> *Subject:* Re: [Int-area] IP parcels
>
> Hi, Fred,
>
> I’m first concerned at the use of an IP option at all, due to the problems with *any* options forcing processing to slow-path.
>
> From TCP’s viewpoint, it seems like you’ve just created a nightmare for SACK and ECN, basically because you will encourage drops of large bursts of packets.
>
> This will also increase the burstiness of TCP, i.e., rather than letting the ACKs support pacing.
>
> Any part of the system that currently coalesces TCP packets is likely to generate errors here, because they might see only the first TCP segment.
>
> However, AFAICT the most significant consideration is that the issue with per-packet performance is at the TCP and UDP layers, not as much at the IP layer.
>
> So what problem is this trying to solve?
>
> Joe
>
> --
> Joe Touch, temporal epistemologist
> www.strayalpha.com
>
> On Dec 17, 2021, at 5:06 PM, Templin (US), Fred L <fred.l.temp...@boeing.com> wrote:
>
> Here's one that should help with shipping, just in time for Christmas. Thanks to everyone for the past and future list exchanges.
>
> Fred
>
> -----Original Message-----
> From: I-D-Announce [mailto:i-d-announce-boun...@ietf.org <i-d-announce-boun...@ietf.org>] On Behalf Of internet-dra...@ietf.org
> Sent: Friday, December 17, 2021 5:00 PM
> To: i-d-annou...@ietf.org
> Subject: I-D Action: draft-templin-intarea-parcels-00.txt
>
> A New Internet-Draft is available from the on-line Internet-Drafts directories.
>
>     Title    : IP Parcels
>     Author   : Fred L. Templin
>     Filename : draft-templin-intarea-parcels-00.txt
>     Pages    : 8
>     Date     : 2021-12-17
>
> Abstract:
>    IP packets (both IPv4 and IPv6) are understood to contain a unit of data which becomes the retransmission unit in case of loss. Upper layer protocols such as the Transmission Control Protocol (TCP) prepare data units known as "segments", with traditional arrangements including a single segment per packet. This document presents a new construct known as the "IP Parcel" which permits a single packet to carry multiple segments. The parcel can be opened at middleboxes on the path with the included segments broken out into individual packets, then rejoined into one or more repackaged parcels to be forwarded further toward the final destination. Reordering of segments within parcels is unimportant; what matters is that the number of parcels delivered to the final destination should be kept to a minimum, and that loss or receipt of individual segments (and not parcel size) determines the retransmission unit.
>
> The IETF datatracker status page for this draft is:
> https://datatracker.ietf.org/doc/draft-templin-intarea-parcels/
>
> There is also an htmlized version available at:
> https://datatracker.ietf.org/doc/html/draft-templin-intarea-parcels-00
>
> Internet-Drafts are also available by rsync at rsync.ietf.org::internet-drafts
>
> _______________________________________________
> I-D-Announce mailing list
> i-d-annou...@ietf.org
> https://www.ietf.org/mailman/listinfo/i-d-announce
> Internet-Draft directories: http://www.ietf.org/shadow.html
> or ftp://ftp.ietf.org/ietf/1shadow-sites.txt
>
> _______________________________________________
> Int-area mailing list
> Int-area@ietf.org
> https://www.ietf.org/mailman/listinfo/int-area