-----Original Message-----
From: Int-area [mailto:int-area-boun...@ietf.org] On Behalf Of
Templin (US), Fred L
Sent: Thursday, March 24, 2022 9:45 AM
To: Tom Herbert <t...@herbertland.com>
Cc: int-area <int-area@ietf.org>; Eggert, Lars <l...@netapp.com>;
l...@eggert.org
Subject: Re: [Int-area] IP Parcels improves performance for end
systems
Hi Tom - responses below:
-----Original Message-----
From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Thursday, March 24, 2022 9:09 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: Eggert, Lars <l...@netapp.com>; int-area <int-area@ietf.org>;
l...@eggert.org
Subject: Re: [Int-area] IP Parcels improves performance for end
systems
On Thu, Mar 24, 2022 at 7:27 AM Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:
Tom - see below:
-----Original Message-----
From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Thursday, March 24, 2022 6:22 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: Eggert, Lars <l...@netapp.com>; int-area
<int-area@ietf.org>; l...@eggert.org
Subject: Re: [Int-area] IP Parcels improves performance for end
systems
On Wed, Mar 23, 2022 at 10:47 AM Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:
Tom, looks like you have switched over to HTML which can be a real
conversation-killer.
But, to some points you raised that require a response:
You can't turn off UDP checksums for IPv6 (except for the narrow case of
encapsulation).
That sounds like a good reason to continue to use IPv4 - at
least as far as end system
addressing is concerned - right?
Not at all. All NICs today provide checksum offload and so it's
basically zero cost to perform the UDP checksum. The fact that
we don't have to do extra checks on the UDPv6 checksum field to
see if it's zero actually is a performance improvement over UDPv4.
(btw, I will present implementation of the Internet checksum at
TSVGWG Friday, this will include discussion of checksum offloads).
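For reference, a minimal sketch of the standard Internet checksum (RFC 1071) and of the zero-field check discussed above. The function names are hypothetical, the UDP pseudo-header is left out, and the RFC6935/RFC6936 tunneling exception is deliberately ignored; this is an illustration only, not how any NIC or stack implements it.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement Internet checksum (RFC 1071) over data."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def receiver_must_verify(checksum_field: int, is_ipv6: bool) -> bool:
    """Simplified model: UDPv4 treats a zero field as 'no checksum sent',
    so the receiver has to test for zero first. UDPv6 has no such case
    here (assumption: the tunneling exception is out of scope)."""
    if is_ipv6:
        return True
    return checksum_field != 0
```

With the worked example from RFC 1071, `internet_checksum(bytes.fromhex("0001f203f4f5f6f7"))` gives `0x220D`, and running the function over the data with that checksum appended yields zero, which is the usual receive-side test.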
Actually, my assertion wasn't good to begin with because for IPv6
even if UDP checksums are turned off the OMNI encapsulation layer
includes a checksum that ensures the integrity of the IPv6 header.
UDP checksums off for IPv6 when OMNI encapsulation is used is perfectly fine.
I assume you are referring to RFC6935 and RFC6936, which allow the
UDPv6 checksum to be zero for tunneling under a very constrained set of conditions.
If it's a standard per packet Internet checksum then a lot of HW could do it.
If it's something like CRC32 then probably not.
The integrity check is covered in RFC5327, and I honestly
haven't had a chance to
look at that myself yet.
LTP is a nice experiment, but I'm more interested as to the interaction between
IP parcels and TCP or QUIC.
Please be aware that while LTP may seem obscure at the moment
that may be changing now
that the core DTN standards have been published. As DTN use
becomes more widespread I
think we can see LTP also come into wider adoption.
My assumption is that IP parcels is intended to be a general
solution of all protocols. Maybe in the next draft you could
discuss the details of TCP in IP parcels including how to
offload the TCP checksum.
I could certainly add that. For TCP, each of the concatenated
segments would include its own TCP header with checksum field
included. Any hardware that knows the structure of an IP Parcel
can then simply do the TCP checksum offload function for each segment.
To be honest, the odds of ever getting support in NIC hardware for
IP parcels are extremely slim. Hardware vendors are driven by
economics, so the only way they would do that would be to
demonstrate widespread deployment of the protocol. But even then,
with all the legacy hardware in deployment it will take many years
before there's any appreciable traction. IMO, the better approach
is to figure out how to leverage the existing hardware features for use with IP
parcels.
There will be two kinds of links that will need to be "Parcel-capable":
1) Edge network (physical) links that natively forward large
parcels, and
2) OMNI (virtual) links that forward parcels using encapsulation
and fragmentation.
The category 1) links are not yet in existence, but once parcels
start to enter the mainstream innovation will drive the creation of
new kinds of data links (1TB Ethernet?) that will be rolled out as
new hardware. And that new hardware can be made to understand the
structure of parcels from the beginning. The category 2) links
might take a large parcel from the upper layers on the local node
(or one that has been forwarded by a parcel-capable link) and break
it down into smaller sub-parcels then apply IP fragmentation to
each sub-parcel and send the fragments to an OMNI link egress node.
You know better than me how checksum offload could be applied in an environment
like that.
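A rough sketch of the category 2) behavior described above, under assumptions that are not in the thread: a parcel is modeled as a plain list of segments, both size limits are hypothetical parameters, and all parcel, IP and OMNI headers are omitted. The function names are made up for illustration.

```python
from typing import List, Tuple


def split_into_subparcels(segments: List[bytes], max_bytes: int) -> List[List[bytes]]:
    """Group whole segments into sub-parcels of at most max_bytes each."""
    subparcels: List[List[bytes]] = []
    current: List[bytes] = []
    size = 0
    for seg in segments:
        if len(seg) > max_bytes:
            raise ValueError("segment larger than the sub-parcel limit")
        if current and size + len(seg) > max_bytes:
            subparcels.append(current)
            current, size = [], 0
        current.append(seg)
        size += len(seg)
    if current:
        subparcels.append(current)
    return subparcels


def fragment(payload: bytes, max_fragment_payload: int) -> List[Tuple[int, bool, bytes]]:
    """Return (offset in 8-byte units, more-fragments flag, data) tuples.
    Non-final fragments carry a multiple of 8 bytes, as IP fragmentation requires."""
    step = max_fragment_payload - (max_fragment_payload % 8)
    if step <= 0:
        raise ValueError("fragment payload limit must be at least 8 bytes")
    fragments = []
    for off in range(0, len(payload), step):
        more = off + step < len(payload)
        fragments.append((off // 8, more, payload[off:off + step]))
    return fragments
```

For example, five 1000-byte segments with a 2500-byte sub-parcel limit come out as groups of 2, 2 and 1 segments, and each sub-parcel would then be passed through `fragment` before being sent toward the egress node.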
There was quite a bit of work and discussion on this in Linux.
I believe the deviation from the standard was motivated by some
deployed devices that required the IPID to be set on receive, and
setting the IPID with DF equal to 1 is thought to be innocuous.
You may want to look at Alex Duyck's papers on UDP GSO; he wrote a
lot of code in this area.
RFC6864 has quite a bit to say about coding IP ID with DF=1 - mostly in the
negative.
But, what I have seen in the Linux code seems to indicate that there is
not even any coordination between the GSO source and the GRO destination -
instead, GRO simply starts gluing together packets that appear to have
consecutive IP IDs without ever first checking that they were sent by a
peer that was earnestly doing GSO. These aspects would make it very
difficult to work GSO/GRO into an IETF standard, plus it doesn't work for
IPv6 at all where there is no IP ID included by default. IP Parcels
addresses all of these points, and can be made into a standard.
Huh? GRO/GSO works perfectly fine with IPv6.
Where is the spec for that? My understanding is that GSO/GRO
leverages the IP ID for IPv4. But, for IPv6, there is no IP ID unless you
include a Fragment Header.
Does IPv6 somehow do GSO/GRO differently?
GRO and GSO don't use the IPID to match a flow. The primary match
is the TCP 4-tuple.
Correct, the 5-tuple (src-ip, src-port, dst-ip, dst-port, proto) is
what is used to match the flow. But, you need more than that in
order to correctly paste back together with GRO the segments of an
original ULP buffer that was broken down by GSO - you need
Identifications and/or other markings in the IP headers to give a reassembly
context.
Otherwise, GRO might end up gluing together old and new pieces of
ULP data and/or impart a lot of reordering. IP Parcels have well
behaved Identifications and Parcel IDs so that the original ULP buffer context
is honored during reassembly.
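To make the distinction between flow matching and reassembly context concrete, here is a toy model. It is not the Linux GRO code and not the IP Parcels mechanism: the class names are hypothetical, and the "context" is assumed to be a per-segment byte sequence number (as TCP carries), with wraparound ignored. Segments are merged only when they belong to the same 5-tuple and are contiguous with what has been collected so far.

```python
from dataclasses import dataclass, field
from typing import Dict, List, NamedTuple


class FlowKey(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: int


@dataclass
class Pending:
    start_seq: int
    data: bytearray = field(default_factory=bytearray)

    @property
    def next_seq(self) -> int:
        return self.start_seq + len(self.data)


class ToyCoalescer:
    """Merge contiguous segments per flow; anything else starts a new buffer."""

    def __init__(self, max_bytes: int = 65535) -> None:
        self.max_bytes = max_bytes
        self.pending: Dict[FlowKey, Pending] = {}
        self.delivered: List[bytes] = []

    def receive(self, key: FlowKey, seq: int, payload: bytes) -> None:
        cur = self.pending.get(key)
        if cur is not None and (
            seq != cur.next_seq or len(cur.data) + len(payload) > self.max_bytes
        ):
            self.flush(key)  # not contiguous, or too large: deliver what we have
            cur = None
        if cur is None:
            cur = self.pending[key] = Pending(start_seq=seq)
        cur.data += payload

    def flush(self, key: FlowKey) -> None:
        cur = self.pending.pop(key, None)
        if cur is not None:
            self.delivered.append(bytes(cur.data))
```

In this model the 5-tuple alone only selects the buffer; it is the sequence check that keeps old and new data from being glued together.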
There's also another possibility with IPv6-- use jumbograms. For
instance, instead of GRO reassembling segments up to a 64K packet,
it could be modified to reassemble up to a 4G packet using IPv6
jumbograms where one really big packet is given to the stack.
But we probably don't even need jumbograms for that. In Linux, GRO
might be taught to reassemble up to 4G super packet and set a flag
bit in the skbuf to ignore the IP payload field and get the length
from the skbuf len field (as though a jumbogram was received).
This trick would work for IPv4 and IPv6 and GSO as well. It should
also work with TSO if the device takes the IP payload length to be that for each
segment.
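For context on the jumbogram variant, a small sketch of the IPv6 Jumbo Payload option as defined in RFC 2675: a Hop-by-Hop Options header carrying option type 0xC2 with a 32-bit length, while the 16-bit Payload Length field in the IPv6 header is set to zero. The helper names are hypothetical, and the parser assumes the jumbo option is the first and only option in the header.

```python
import struct

JUMBO_OPTION_TYPE = 0xC2  # Jumbo Payload option type (RFC 2675)
JUMBO_OPTION_LEN = 4      # option data is one 32-bit length


def build_jumbo_hop_by_hop(next_header: int, jumbo_length: int) -> bytes:
    """Build an 8-byte Hop-by-Hop Options header holding only the jumbo option."""
    if not 65535 < jumbo_length <= 0xFFFFFFFF:
        raise ValueError("jumbo payload length must be in 65536..2**32-1")
    # Fields: Next Header, Hdr Ext Len (0 means 8 bytes), option type,
    # option data length, 32-bit Jumbo Payload Length.
    return struct.pack("!BBBBI", next_header, 0,
                       JUMBO_OPTION_TYPE, JUMBO_OPTION_LEN, jumbo_length)


def effective_payload_length(payload_length_field: int, hop_by_hop: bytes) -> int:
    """Use the 16-bit field unless it is zero and a jumbo option is present."""
    if payload_length_field != 0:
        return payload_length_field
    _nh, _ext_len, opt_type, opt_len, length = struct.unpack("!BBBBI", hop_by_hop[:8])
    if opt_type != JUMBO_OPTION_TYPE or opt_len != JUMBO_OPTION_LEN:
        raise ValueError("no jumbo payload option found (sketch checks first option only)")
    return length
```

The flag-in-the-buffer idea above would sidestep this header entirely by taking the length from the buffer metadata instead of from either length field.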
Yes, I was planning to give that a try to see what kind of
performance can be gotten with GSO/GRO when you exceed 64KB. But,
my concern with GSO/GRO is that the reassembly is (relatively)
unguided and haphazard and can result in mis-ordered
concatenations. And, there is no protocol by which the GRO receiver
can infer that the things it is gluing together actually originated
from a sender that is earnestly doing GSO. So, I do not see how
GSO/GRO as I see it in the implementation could be made into a
standard, whereas there is a clear path for standardizing IP parcels.
Another thing I forgot to mention is that in my experiments with
GSO/GRO I found that it won't let me set a GSO segment size that
would cause the resulting IP packets to exceed the path MTU (i.e., it won't
allow fragmentation).
I fixed that by configuring IPv4-in-IPv6 encapsulation per RFC2473
and then allowed the IPv6 layer to apply fragmentation to the encapsulated
packet.
That way, I can use IPv4 GSO segment sizes up to ~64KB.
Fred
Tom
Thanks - Fred
Tom
Fred
From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Wednesday, March 23, 2022 9:37 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: Eggert, Lars <l...@netapp.com>; int-area
<int-area@ietf.org>; l...@eggert.org
Subject: Re: [EXTERNAL] Re: [Int-area] IP Parcels improves
performance for end systems
On Wed, Mar 23, 2022, 9:54 AM Templin (US), Fred L <fred.l.temp...@boeing.com>
wrote:
Hi Tom,
-----Original Message-----
From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Wednesday, March 23, 2022 6:19 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: Eggert, Lars <l...@netapp.com>; int-area@ietf.org;
l...@eggert.org
Subject: Re: [Int-area] IP Parcels improves performance for
end systems
On Tue, Mar 22, 2022 at 10:38 AM Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:
Tom, see below:
-----Original Message-----
From: Tom Herbert [mailto:t...@herbertland.com]
Sent: Tuesday, March 22, 2022 10:00 AM
To: Templin (US), Fred L <fred.l.temp...@boeing.com>
Cc: Eggert, Lars <l...@netapp.com>; int-area@ietf.org
Subject: Re: [Int-area] IP Parcels improves performance for
end systems
On Tue, Mar 22, 2022 at 7:42 AM Templin (US), Fred L
<fred.l.temp...@boeing.com> wrote:
Lars, I did a poor job of answering your question. One of the most
important aspects of IP Parcels in relation to TSO and GSO/GRO is that
transports get to use a full 4MB buffer instead of the 64KB limit in
current practices. This is possible due to the IP Parcel jumbo payload
option encapsulation which provides a 32-bit length field instead of just
a 16-bit.
By allowing the transport to present the IP layer with a buffer of up to
4MB, it reduces the overhead, minimizes system calls and interrupts, etc.
So, yes, IP Parcels is very much about improving the performance for end
systems in comparison with current practice (GSO/GRO and TSO).
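The arithmetic behind the buffer-size argument can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes "4MB" and "64KB" mean 4 MiB and 64 KiB, and `handoffs` is a hypothetical helper that counts how many buffers the transport must hand to the IP layer for a given amount of data.

```python
MAX_16_BIT_LENGTH = 2**16 - 1   # largest value a 16-bit length field can hold
MAX_32_BIT_LENGTH = 2**32 - 1   # largest value a 32-bit length field can hold

KIB = 1024
MIB = 1024 * KIB


def handoffs(total_bytes: int, buffer_limit: int) -> int:
    """Number of buffers of at most buffer_limit bytes needed for total_bytes."""
    if buffer_limit <= 0:
        raise ValueError("buffer_limit must be positive")
    return -(-total_bytes // buffer_limit)  # ceiling division


# A 4 MiB transfer needs 64 hand-offs at 64 KiB each, but only one at 4 MiB.
per_64k = handoffs(4 * MIB, 64 * KIB)
per_4m = handoffs(4 * MIB, 4 * MIB)
```

A 4 MiB buffer does not fit a 16-bit length field but fits a 32-bit one with ample room, which is the point made above about the jumbo payload option.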
Hi Fred,
The nice thing about TSO/GSO/GRO is that they don't require
any changes to the protocol, as they are just implementation
techniques. Also, they're one-sided optimizations, meaning for
instance that TSO can be used at the sender without requiring GRO to be used at
the receiver.
My understanding is that IP parcels requires a new protocol
that would need to be implemented on both endpoints and possibly in some
routers.
It is not entirely true that the protocol needs to be
implemented on both endpoints. Sources that send IP Parcels
send them into a Parcel-capable path which ends at either the
final destination or a router for which the next hop is not
Parcel-capable. If the Parcel-capable path extends all the
way to the final destination, then the Parcel is delivered to
the destination which knows how to deal with it. If the
Parcel-capable path ends at a router somewhere in the middle,
the router opens the Parcel and sends each enclosed segment
as an independent IP packet. The final destination is then
free to