Re: [Int-area] IP parcels

Tom Herbert Mon, 20 Dec 2021 16:14:14 -0800

On Mon, Dec 20, 2021 at 3:11 PM Templin (US), Fred L <
fred.l.temp...@boeing.com> wrote:


> Tom, in modern reassembly it is not going to wait for the MSL for all
> fragments to arrive anymore; either they all get there after a very small
> inter-fragment delay, or you send an immediate FRAGREP and possibly also a
> PTB soft error then quickly declare the reassembly dead if that doesn’t
> help. And, you make sure to inspect IDs of received fragments before
> admitting them into the reassembly cache so you don’t end up caching
> garbage that will just have to be discarded later.
>

Fred,

It doesn't matter in the sense that reassembly is a non-work-conserving
mechanism. In order to perform reassembly, packet fragments need to be held,
which means memory will be consumed, and since memory is a finite resource
it needs to be managed.  Managing memory means that some policy is needed
for when to time out a reassembly or which fragment train to discard under
memory pressure. A network that implements some arbitrary policy can cause
problems for unsuspecting hosts. For instance, there are mechanisms for
hosts to try to guess what the timeout is in a NAT box and send a keepalive
packet before an idle NAT state is evicted. But this is just a guess that
may or may not be right, and in fact there might not even be a NAT in the
path, in which case the host is just wasting energy sending keepalives.
Also, the second we introduce a new exhaustible resource in the path, that
becomes yet another denial of service vector (consider the case that an
attacker spoofs a whole bunch of IP parcels).
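
To make the memory-management point concrete, here is a minimal sketch of a
reassembly cache with a fixed byte budget and a per-train timeout. Everything
in it is an assumption made for illustration: the class and method names, the
oldest-first eviction policy, and the default limits are hypothetical and are
not taken from the parcels draft or from any real stack. It also assumes
fragments never overlap.

```python
import time
from collections import OrderedDict


class ReassemblyCache:
    """Toy fragment cache: fixed byte budget plus a per-train timeout.

    Hypothetical sketch; names, limits and the oldest-first eviction
    policy are illustrative assumptions, not any real implementation.
    """

    def __init__(self, max_bytes=4096, timeout=0.5, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.timeout = timeout
        self.clock = clock
        self.used = 0
        # key -> (time first fragment arrived, {offset: payload})
        self.trains = OrderedDict()

    def _drop(self, key):
        _, frags = self.trains.pop(key)
        self.used -= sum(len(p) for p in frags.values())

    def expire(self):
        """Discard every train older than the timeout; return their keys."""
        now = self.clock()
        dead = [k for k, (t0, _) in self.trains.items()
                if now - t0 > self.timeout]
        for k in dead:
            self._drop(k)
        return dead

    def add(self, key, offset, payload, total_len):
        """Store one fragment; return the full datagram once complete."""
        self.expire()
        if len(payload) > self.max_bytes:
            return None  # can never fit in the budget
        if key not in self.trains:
            self.trains[key] = (self.clock(), {})
        frags = self.trains[key][1]
        if offset in frags:
            return None  # duplicate fragment, ignore it
        # Policy choice: under memory pressure, evict the oldest other train.
        while self.used + len(payload) > self.max_bytes:
            victim = next((k for k in self.trains if k != key), None)
            if victim is None:
                self._drop(key)  # nothing else to evict; give up on this one
                return None
            self._drop(victim)
        frags[offset] = payload
        self.used += len(payload)
        if sum(len(p) for p in frags.values()) == total_len:
            data = b"".join(frags[o] for o in sorted(frags))
            self._drop(key)
            return data
        return None
```

With an 8-byte budget, a single spoofed 6-byte fragment is enough to push out
a legitimate half-finished train, which then can never complete. The sender
has no way of knowing which timeout or eviction policy the box applied.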

Unless the network can coordinate very specifically with the host about
what it's doing on behalf of the host stack, I think it's much better for
the network to just focus on forwarding packets without delay and let the
host handle the details of receive processing, reassembly, security, etc.

Tom


>
> Fred
>
>
>
> *From:* Tom Herbert [mailto:t...@herbertland.com]
> *Sent:* Monday, December 20, 2021 1:06 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* to...@strayalpha.com; int-area@ietf.org
> *Subject:* Re: IP parcels
>
>
>
>
>
> On Mon, Dec 20, 2021 at 12:03 PM Templin (US), Fred L <
> fred.l.temp...@boeing.com> wrote:
>
> Tom, sorry I will try to use my words more carefully; I am using GSO/GRO
> also for
>
> a UDP-based transport protocol – not QUIC but something similar. I like
> GSO/GRO
>
> very much; I am glad the service is available and I want to see it
> continue. But, my
>
> understanding of the services is that they leverage the IP ID field in
> whole IPv4
>
> packets that are not eligible for fragmentation and those are limitations
> I am
>
> seeking to improve on.
>
>
>
> I want to enable a facility similar to GSO/GRO that works for both IPv4
> and IPv6
>
> packets and allows for lower layers to fragment if necessary. And, I want
> to use
>
> a well-behaved 32-bit IPv6 ID instead of the 16-bit IPv4 one where the use
> is not
>
> well defined when DF=1.
>
>
>
> There has been a lot of work in this area. For instance, you might want to
> take a look at https://www.youtube.com/watch?v=ccUeG1dAhbw
>
>
>
> About reassembly, that would only happen on the end systems themselves or
> on
>
> a very capable device that is very close to the end systems; I would not
> want for
>
> a high-speed core router to have to reassemble.
>
>
>
> Even so, an intermediate device close to the end system still has to
> provide service to more than one host. Reassembly requires memory to store
> fragments. A host would need enough memory to service all of its own flows,
> but an intermediate node would need that amount of memory times the number
> of hosts it serves to perform reassembly.  This is a fundamental scaling
> problem of stateful services in the network: inevitably the network nodes
> cannot scale to the number of users or flows that require service. In the
> best case scenario, when resources are not available the network won't
> attempt the stateful operation and will just forward the packet unimpeded
> (which is fine because hosts will never rely on this class of optimization).
> In the worst case scenario, the network will take a detrimental action such
> as forcibly breaking a connection (e.g. this is what can happen when a NAT
> evicts a TCP connection because it has run out of memory). IMO, maintaining
> state in the network is a bad, albeit unfortunately prevalent, idea.
>
>
>
> Tom
>
>
>
>
>
> Again, GSO/GRO is nice work and much respect is due to those who made it
> possible.
>
>
>
> Fred
>
>
>
> *From:* Tom Herbert [mailto:t...@herbertland.com]
> *Sent:* Monday, December 20, 2021 9:20 AM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* to...@strayalpha.com; int-area@ietf.org
> *Subject:* Re: [Int-area] [EXTERNAL] Re: IP parcels
>
>
>
> The world is not just TCP anymore. QUIC and other UDP-based transports
> have already
>
> shown performance increases using facilities like GSO/GRO which are
> essentially a short
>
> term and non-standard implementation of what parcels promise to do in the
> long term.
>
>
>
> Fred,
>
>
>
> Can you explain why GSO/GRO aren't sufficient and are only short term
> solutions? We've been using these for almost twenty years now with good
> effect. These are widely deployed with TCP, TSO works well to offload
> transmit, LRO is defined and is in much better shape to offload RX now that
> programmable devices are emerging. For TCP it's hard to see how IP parcels
> would help significantly, but even for UDP we now have UDP GSO, sendmmsg,
> and recvmmsg that mitigate the cost of system calls and interrupts to which
> the draft refers. The reason these aren't standards in IETF is because
> they're implementation techniques and not protocol (although I will point
> out that GSO/GRO/sendmmsg/recvmmsg are in all Linux devices so that
> effectively makes it a de facto implementation standard).
>
>
>
> I am also concerned about the idea that intermediate devices would perform
> reassembly. This has a whole bunch of implications: middleboxes are no
> longer work conserving, and there seems to be an implicit requirement that
> the device has to be in the path of every packet in a parcel (i.e. even in
> the case of the last hop performing reassembly). Also, as simply a matter of
> resources
> and capabilities, hosts are in a much better position to perform tasks like
> reassembly. I don't readily see that having intermediate devices perform
> reassembly would be a win for hosts, and even if it were, host
> implementations still would need the capability to perform reassembly
> themselves since they will never rely on the network to always do it for
> them.
>
>
>
> Tom
>
>
>
>
>
> Thanks - Fred
>
>
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com]
> *Sent:* Sunday, December 19, 2021 11:53 AM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
> *Subject:* Re: [Int-area] IP parcels
>
>
>
> Hi, Fred (et al.),
>
>
>
> On Dec 19, 2021, at 10:21 AM, Templin (US), Fred L <
> fred.l.temp...@boeing.com> wrote:
>
>
>
> Joe, your insistence on using html makes it impossible to respond to all
> of your points inline
>
> which is the reason for my top-posts.
>
>
>
> I use MacOS mail, IOS mail, and Thunderbird on Windows, all using default
> configurations, FWIW. I appear to be able to post inside everyone else’s
> responses. I don’t know if the IETF’s mailers are munging formats, though.
>
>
>
> I’ve made my position clear. However:
>
>
>
> - You still haven’t shown any evidence that end systems need to do all
> this extra work so they can somehow run faster, nor that this will be
> noticeably faster than large (i.e., 20-60KB) IPv4 packets.
>
>
>
> - You still haven’t shown any reason why this is feasible; in fact, below
> you add the idea of on-path fragmentation, which is largely deprecated
> because fragments won’t traverse tunnels (in your case, notably for single
> chunks larger than 64KB). Nevermind that the fragmentation is both
> expensive and slow-path at routers.
>
>
>
> - You have claimed that both routers and transports will somehow adopt
> this when we can’t even get reasonably large MTUs that already fit within
> IPv4 across heterogeneous enterprises.
>
>
>
> IPv4 is over; even if you don’t think so, any way forward with larger
> packets starts with:
>
>                a) getting ~64KB IP packets across the net
>
>                b) after (a), prove that >64KB are needed based on the IPv6
> jumbo approach
>
>
>
> Any way forward with a lot of small packets inside one large one (where
> both chunks and total length are less than 64K) starts by proving there’s a
> need and by fixing how TCP interacts with its inherent burstiness and loss
> correlation.
>
>
>
> Only THEN will this issue be worth more discussion.
>
>
>
> Joe
>
>
>
>
>
> Parcels that contain a single segment whether 64K or considerably less are
> still sent as
>
> (singleton) parcels and not ordinary packets. That way, nodes in the
> network can know
>
> that it is OK to encapsulate and fragment since by asserting its interest
> in receiving parcels
>
> the destination has also subscribed to being able to reassemble up to a
> full 64K.
>
>
>
> Parcels do not set (Payload Length / Total Length) to 0; they set it to
> the length of the
>
> first element of the parcel (which is also the same length of each
> non-final element of
>
> the parcel). What happens then is that network equipment will see a unit
> with an L3
>
> length that may be considerably shorter than the L2 length. You are right
> that legacy
>
> routers might not like this (or, they might truncate the packet according
> to L3 length),
>
> and so for paths that might traverse legacy routers the first-hop node
> that recognizes
>
> parcels instead encapsulates the parcel in an IPv4 or IPv6 header then
> performs (source)
>
> fragmentation if necessary. These IP fragments will then travel through
> legacy routers
>
> just fine.
>
>
>
> About RFC793bis, you and Wes Eddy know far more about its status than I
> do; I only
>
> noted that this is something with TCP implications and so made mention of
> it in case
>
> there is still room for a few more engine tweaks while the hood is still
> open.
>
>
>
> About IPv4, I am currently running IPv4 edge networks with IPv4-in-IPv6
> tunnel endpoints
>
> connected to an IPv6 transit network and it works really well. End systems
> get to use
>
> smaller addresses and smaller headers, and they can talk to remote
> correspondents using
>
> IPv4 as if they were all on the same IPv4 network. So yes, I think we
> might still want to
>
> consider IPv4 for edge networks like that.
>
>
>
> About getting 64K packets across, only the edge networks or end systems
> see them as
>
> large packets; in the core they are typically broken up into something much
> smaller by
>
> ingress nodes that apply segmentation/fragmentation. We don’t need the
> core to move
>
> to jumbo links; we only need that at the edges. ATM taught us that.
>
>
>
> About our “nail”, end systems get to see larger packets/parcels and get to
> take advantage
>
> of the reduced interrupts and system call overhead they provide. That is
> what makes it
>
> worthwhile.
>
>
>
> Fred
>
>
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com
> <to...@strayalpha.com>]
> *Sent:* Saturday, December 18, 2021 8:13 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
> *Subject:* Re: [Int-area] IP parcels
>
>
>
> Hi, Fred,
>
>
>
> If you have one segment that’s less than 64K, you don’t need the parcel
> option at all.
>
>
>
> If you have something longer than 64K, either as a single segment or
> multiple smaller segments, by setting total length to 0, you end up being
> dropped by legacy routers, which either ignore options they don’t
> understand or drop packets with options they don’t support.
>
>
>
> RFC793bis does talk about IPv6 jumbos, but this new work is out of scope
> for RFC793bis - furthermore, it’s too late. It has passed WGLC, IETF LC,
> and is currently in IESG review for publication.
>
>
>
> You also haven’t addressed why the IETF should be taking up this *new*
> work for IPv4, which I thought also had been considered ineligible.
>
>
>
> But overall, again, what’s the point? We can’t even get 64K IP packets
> through the Internet; making them larger doesn’t make that easier or more
> likely. Such large sizes are of diminishing benefit; routers already
> forward at 40Gbps per link for minimal packets and end systems have other
> problems that this exacerbates.
>
>
>
> This seems a lot like a huge hammer in search of a nail. Where’s the nail?
>
>
>
> Joe
>
>
>
> —
>
> Joe Touch, temporal epistemologist
>
> www.strayalpha.com
>
>
>
> On Dec 18, 2021, at 7:18 PM, Templin (US), Fred L <
> fred.l.temp...@boeing.com> wrote:
>
>
>
> Joe, I never said that I wanted to restrict this to small transport
> segments; in fact, I want
>
> just the opposite – I want large segments. A perfectly legal parcel is one
> which includes 1
>
> ~64KB segment - another legal parcel is one which includes 64 of them! If
> you want bigger
>
> segments than that, then true jumbos are necessary and this spec does not
> preclude that.
>
>
>
> About RFC793(bis), I see there is a section on Jumbos and IP parcels is
> (sort of) an application
>
> of Jumbos.
>
>
>
> Fred
>
>
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com
> <to...@strayalpha.com>]
> *Sent:* Saturday, December 18, 2021 4:57 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org; Wes Eddy <w...@mti-systems.com>
> *Subject:* [EXTERNAL] Re: [Int-area] IP parcels
>
>
>
> Hi, Fred,
>
>
>
> Regarding 793bis, new ideas are out of scope. It’s supposed to be a
> roll-in of existing items only.
>
>
>
> Nevermind the problems below, which “TCP will find a way” doesn’t
> magically fix.
>
>
>
> The problem is this:
>
> - end systems need to send larger transport segments (not just IP segments)
>
> - if they can do that, they should, filling up to the largest IP payload
>
>
>
> Having an IP packet have the opportunity to include lots of small
> transport packets assumes transport packets either work better or faster
> when they’re small. It’s the opposite.
>
>
>
> Joe
>
>
>
> —
>
> Joe Touch, temporal epistemologist
>
> www.strayalpha.com
>
>
>
> On Dec 18, 2021, at 4:42 PM, Templin (US), Fred L <
> fred.l.temp...@boeing.com> wrote:
>
>
>
> Joe, TCP will find a way to adapt – it always has. I also see that TCP is
> currently undergoing
>
> a second edition revision so the timing seems right to consider IP parcels
> in the analysis.
>
> I am Cc’ing the second edition editor for his information – Wesley, please
> consider this
>
> new concept called “IP Parcels” as it relates to your document.
>
>
>
> Here is the latest draft version – it expands on the “Motivation” section
> and adds a number
>
> of important features such as a new “Parcels Permitted” TCP option:
>
>
>
> https://datatracker.ietf.org/doc/draft-templin-intarea-parcels/
>
>
>
> Fred
>
>
>
> *From:* to...@strayalpha.com [mailto:to...@strayalpha.com
> <to...@strayalpha.com>]
> *Sent:* Friday, December 17, 2021 6:01 PM
> *To:* Templin (US), Fred L <fred.l.temp...@boeing.com>
> *Cc:* int-area@ietf.org
> *Subject:* Re: [Int-area] IP parcels
>
>
>
> Hi, Fred,
>
>
>
> I’m first concerned at the use of an IP option at all, due to the problems
> with *any* options forcing processing to slow-path.
>
>
>
> From TCP’s viewpoint, it seems like you’ve just created a nightmare for
> SACK and ECN, basically because you will encourage drops of large bursts of
> packets.
>
>
>
> This will also increase the burstiness of TCP, rather than letting
> the ACKs support pacing.
>
>
>
> Any part of the system that currently coalesces TCP packets is likely to
> generate errors here, because they might see only the first TCP segment.
>
>
>
> However, AFAICT the most significant consideration is that  the issue with
> per-packet performance is at the TCP and UDP layers, not as much at the IP
> layer.
>
>
>
> So what problem is this trying to solve?
>
>
>
> Joe
>
> —
>
> Joe Touch, temporal epistemologist
>
> www.strayalpha.com
>
>
>
>
> On Dec 17, 2021, at 5:06 PM, Templin (US), Fred L <
> fred.l.temp...@boeing.com> wrote:
>
>
>
> Here's one that should help with shipping, just in time for Christmas.
> Thanks
> to everyone for the past and future list exchanges.
>
> Fred
>
> -----Original Message-----
> From: I-D-Announce [mailto:i-d-announce-boun...@ietf.org
> <i-d-announce-boun...@ietf.org>] On Behalf Of internet-dra...@ietf.org
> Sent: Friday, December 17, 2021 5:00 PM
> To: i-d-annou...@ietf.org
> Subject: I-D Action: draft-templin-intarea-parcels-00.txt
>
>
> A New Internet-Draft is available from the on-line Internet-Drafts
> directories.
>
>
>        Title           : IP Parcels
>        Author          : Fred L. Templin
>        Filename        : draft-templin-intarea-parcels-00.txt
>        Pages           : 8
>        Date            : 2021-12-17
>
> Abstract:
>   IP packets (both IPv4 and IPv6) are understood to contain a unit of
>   data which becomes the retransmission unit in case of loss.  Upper
>   layer protocols such as the Transmission Control Protocol (TCP)
>   prepare data units known as "segments", with traditional arrangements
>   including a single segment per packet.  This document presents a new
>   construct known as the "IP Parcel" which permits a single packet to
>   carry multiple segments.  The parcel can be opened at middleboxes on
>   the path with the included segments broken out into individual
>   packets, then rejoined into one or more repackaged parcels to be
>   forwarded further toward the final destination.  Reordering of
>   segments within parcels is unimportant; what matters is that the
>   number of parcels delivered to the final destination should be kept
>   to a minimum, and that loss or receipt of individual segments (and
>   not parcel size) determines the retransmission unit.
>
>
> The IETF datatracker status page for this draft is:
> https://datatracker.ietf.org/doc/draft-templin-intarea-parcels/
>
> There is also an htmlized version available at:
> https://datatracker.ietf.org/doc/html/draft-templin-intarea-parcels-00
>
>
> Internet-Drafts are also available by rsync at rsync.ietf.org::internet-drafts
>
>
> _______________________________________________
> I-D-Announce mailing list
> i-d-annou...@ietf.org
> https://www.ietf.org/mailman/listinfo/i-d-announce
> Internet-Draft directories: http://www.ietf.org/shadow.html
> or ftp://ftp.ietf.org/ietf/1shadow-sites.txt
>
> _______________________________________________
> Int-area mailing list
> Int-area@ietf.org
> https://www.ietf.org/mailman/listinfo/int-area