On Mon, Apr 22, 2024 at 9:23 AM Gunter Van de Velde via Datatracker <nore...@ietf.org> wrote:
> ###COMMENTS:
> ##generic comments:
> The abstract implies the possibility of utilizing various resources to
> enhance performance for the same traffic selector, yet the document
> consistently mentions only the CPU. If multiple CPUs are indeed the sole
> resource envisaged for the child SAs associated with a single traffic
> selector, it would be advantageous for the document to specify this more
> clearly in the abstract. Generally, the term "resource" encompasses a
> broad range of elements within networking (i.e. bandwidth, QoS queues,
> optical paths, ECMP paths, etc); however, in this draft, it appears to
> specifically refer to computing resources.

Indeed. Originally, the draft talked about multiple queues for QoS/diffserv
as well as CPUs, but that became complicated and no one really had a
real-world need for QoS SAs. While the focus is on CPU, if there are other
resources that could be available this could apply to those as well, which
is why I think the generic phrasing is okay.

> I was on the verge of making the ballot a DISCUSS because section 2
> talking about performance bottlenecks and detailing that some state can
> not be shared without impacting performance justifies the existence of
> this rfc-to-be. While it is well-understood when using different CPUs
> (sequence numbers etc), it is not so simple to understand what the
> performance benefit is when separate queues are the resource differences.
> Maybe I misunderstood how this operates together in symbiosis? Or
> misunderstood the word queue (it may have a different meaning in IPsec
> than on L3-network-interfaces).

Some NIC cards also have different CPUs/queues on them to load balance,
usually on the hash(addr+proto+ports). Some NICs now run Linux on them, so
keeping it broader, we don't really care if it is a CPU or a queue, or
multiple pieces of NIC silicon. In all cases, the hardware can act
independently of other hardware instead of needing vast amounts of locking.
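To make the hash-based load balancing concrete, here is a minimal sketch of how a flow's 5-tuple hash picks a queue (and thus a CPU). Real NICs use hardware hashes such as Toeplitz for RSS; the digest, queue count, and function name below are illustrative assumptions, not anything from the draft.

```python
import hashlib

NUM_QUEUES = 4  # hypothetical number of NIC queues/CPUs

def select_queue(src, dst, proto, sport, dport, num_queues=NUM_QUEUES):
    """Pick a NIC queue (and thus a CPU) from a hash of the flow 5-tuple.

    A plain SHA-256 digest stands in for the NIC's hardware hash; only the
    idea matters: same flow -> same queue, every time.
    """
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_queues

# Packets of one flow always land on the same queue, so the per-queue
# (per-CPU) Child SA state needs no cross-CPU locking.
q1 = select_queue("10.0.0.1", "10.0.0.2", 6, 12345, 443)
q2 = select_queue("10.0.0.1", "10.0.0.2", 6, 12345, 443)
assert q1 == q2
```

This is why "CPU or queue or NIC silicon" can be treated uniformly: whatever the resource, the steering hash keeps a flow pinned to one of them.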
> ##detailed comments
>
> 134 1.2. Terminology
>
> This section only has the pre-existing terminology. I was wondering if
> terms like the new SA_RESOURCE_INFO should be mentioned to have
> everything documented in a single place.

I read Terminology sections as "stuff you should know or read the
reference for to understand this document", so I would not put stuff we
define in this document in this section.

> This section could be a good place to be more explicit about what is
> exactly meant with the term resource in the context of this draft.

That's a good point. Added:

  <t>This document also uses the following terms defined in
  <xref target="RFC4301"/>: SPD, SA.

> should SADB_ACQUIRE be mentioned? SPD?

SADB_ACQUIRE is used once and we show RFC2367 as reference right there. I
wouldn't want to promote it to the Terminology section, as the whole RFC
(PF_KEYv2) is kinda obsolete (at least in Linux; BSD still uses it) but
the replacement API has no RFC (there is a Netlink RFC but no XFRM RFC).

> 141 2. Performance bottlenecks
>
> Here is the header 'performance bottlenecks'. Only a single bottleneck
> is mentioned in the section. Maybe the section title can be phrased such
> that it covers the content more explicitly, for readability. For a
> network generalist it is unclear which other bottlenecks exist.

That is getting pretty close to specific hardware considerations that we
were trying to avoid mentioning, as hardware constraints of today might
not be those of tomorrow. See how in the past crypto offloading was done
to the PCI bus, only later to be folded back onto CPUs, to be again
offloaded but now to NIC hardware resources.
> 143 There are a number of practical reasons why most implementations have
> 144 to limit a Child SA to only one specific hardware resource, but a key
> 145 limitation is that sharing the cryptographic state, counters and
> 146 sequence numbers between multiple CPUs that are trying to use these
> 147 shared states at the same time is not feasible without a significant
> 148 performance penalty. There is a need to negotiate and establish
> 149 multiple Child SAs with identical TSi/TSr on a per-resource basis.
>
> This phrase is rather long and not so easy to digest. What about this
> re-edit? I also took the liberty to expand TSi/TSr on first use:
>
> "There are several pragmatic reasons why most implementations must
> restrict a Child Security Association (SA) to a single specific hardware
> resource. A primary limitation arises from the challenges associated
> with sharing cryptographic states, counters, and sequence numbers among
> multiple CPUs. When these CPUs attempt to simultaneously utilize shared
> states, it becomes impractical to do so without incurring a significant
> performance penalty. It is necessary to negotiate and establish multiple
> Child Security Associations (SAs) with identical Traffic Selector
> initiator (TSi) and Traffic Selector responder (TSr) on a per-resource
> basis."

Sure, taken your text.

> 168 Upon installation, each resource-specific Child SA is associated with
> 169 an additional local selector, such as CPU or queue. These resource-
> 170 specific Child SAs MUST be negotiated with identical Child SA
> 171 properties that were negotiated for the initial Child SA. This
>
> In section 2 it is written that for improved performance "the
> cryptographic state, counters and sequence numbers between multiple
> CPUs" is difficult to share. This is trivial to understand with CPUs,
> but how does that explicitly correlate with queues?

I removed "or queue".
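The per-resource state the quoted text is about can be sketched as follows. The field and class names are hypothetical, and the SA is reduced to just the parts that matter here; the point is that each CPU owns its own sequence counter, so no cross-CPU synchronization is needed.

```python
from dataclasses import dataclass

@dataclass
class ChildSA:
    """Minimal per-resource Child SA state (hypothetical field names).

    Each CPU gets its own Child SA with identical traffic selectors, so
    the outbound sequence number can be incremented lock-free: exactly
    one CPU ever touches it.
    """
    spi: int
    cpu_id: int
    seq: int = 0  # outbound sequence number, owned by one CPU only

    def next_seq(self):
        self.seq += 1
        return self.seq

# One SA per CPU; only the SPI and the owning CPU differ, the negotiated
# Child SA properties (algorithms, TSi/TSr, ...) are identical.
sas = [ChildSA(spi=0x1000 + cpu, cpu_id=cpu) for cpu in range(4)]
assert sas[0].next_seq() == 1
assert sas[1].next_seq() == 1  # counters are fully independent
```

A single shared SA would instead need an atomic or locked counter on the sequence-number hot path, which is exactly the performance penalty the draft's section 2 describes.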
It was also kind of a leftover from when we talked about different
QoS-based SAs.

> 192 There are various considerations that an implementation can use to
> 193 determine the best way to install multiple Child SAs.
>
> The best 'way' or the best 'procedure'?

Changed to procedure.

> 195 A simple distribution could be to install one additional Child SA on
> 196 each CPU. An implementation MAY ensure that one Child SA can be used
>
> A distribution of what? Is this referring to an implementation?

Changed to procedure.

> 213 When the number of queue or CPU resources are different between the
> 214 peers, the peer with the least amount of resources may decide to not
> 215 install a second outbound Child SA for the same resource as it will
> 216 never use it to send traffic. However, it MUST install all inbound
> 217 Child SAs as it has committed to receiving traffic on these
> 218 negotiated Child SAs.
>
> Is there a risk of creating an overload of SAs for a single resource?

Not really. These are just a few bytes per SA and a single entry in a
lookup table that hopefully uses radix trees or some other non-serial
lookup.

> 224 Section 2.9. Based on the trigger TSi entry, an implementations can
>
> s/implementations/implementation/

Fixed.

> 243 All multi-octet fields representing integers are laid out in big
> 244 endian order (also known as "most significant byte first", or
> 245 "network byte order").
>
> Is this necessary to be explained? Is that not part of what RFC7296
> specifies anyway?

Yes, but as the sentence above it says, it "is copied here for
convenience" :)

> 261 * Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.
> 263 * SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.
>
> and
>
> 280 * Protocol ID (1 octet) - MUST be 0. MUST be ignored if not 0.
> 282 * SPI Size (1 octet) - MUST be 0. MUST be ignored if not 0.
>
> no code-point reservations needed for experimental? or future use?

No.
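For illustration, here is a sketch of parsing the Notify payload body fields discussed above (Protocol ID, SPI Size, Notify Message Type, then any opaque data), big endian throughout per RFC 7296. The notify type number and the length cap are assumptions, not values from the draft.

```python
import struct

NOTIFY_SA_RESOURCE_INFO = 16434  # assumed value; check the IANA IKEv2 registry
MAX_OPAQUE_LEN = 64              # local policy cap, not mandated by the draft

def parse_notify(data):
    """Parse a Notify payload body (generic payload header not included):
    Protocol ID (1 octet), SPI Size (1 octet), Notify Message Type
    (2 octets, big endian), optional SPI, then notification data."""
    proto_id, spi_size, msg_type = struct.unpack_from("!BBH", data, 0)
    # Per the quoted draft text: Protocol ID and SPI Size MUST be 0, and
    # MUST be ignored (not rejected) if they are not 0.
    opaque = data[4 + spi_size:]
    if len(opaque) > MAX_OPAQUE_LEN:
        # Overlong opaque data is not stored; a real stack might log once.
        opaque = b""
    return msg_type, opaque

body = struct.pack("!BBH", 0, 0, NOTIFY_SA_RESOURCE_INFO) + b"\x00\x03"
msg_type, opaque = parse_notify(body)
assert msg_type == NOTIFY_SA_RESOURCE_INFO and opaque == b"\x00\x03"
```

Because the type field is 2 octets, IKE-level notifies simply zero out the SA-specific fields rather than reserving extra code points, which matches the "No." answer above.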
There are IKE SAs and Child (IPsec) SAs. Every message pertaining to the
IKE SA sets the Protocol ID and SPI Size to 0. If it pertains to a Child
SA, it sets the Protocol ID (e.g. AH or ESP) and the SPI size.

> 267 * Resource Identifier (optional). This opaque data may be set to
> 268 convey the local identity of the resource.
>
> Should there be no restrictions on what can be considered as a local
> identity? What if this identity is an extremely long blockchain blob?
> What would happen? Is that allowed?

It would not be stored and would just produce a long debug log message
once?

> 290 Implementations supporting per-CPU SAs SHOULD extend their local SPD
>
> Later in the text per-queue is also mentioned... Does this behave
> differently than the per-CPU principle?

I can no longer find "per queue" or "per-queue" references?

> 344 An implementation that does not accept any further resource specific
> 345 Child SAs MUST NOT return the NO_ADDITIONAL_SAS error because this
> 346 can be interpreted by the peer that no other Child SAs with different
> 347 TSi/TSr are allowed either. Instead, it MUST return TS_MAX_QUEUE.
>
> Should anything be mentioned about state kept on the implementation that
> has no more resources? What if the remote side tries to open 10M SAs?
> (Is it an attack vector?)

That is done in the Security Considerations:

  Similar to how an implementation should limit the number of half-open
  SAs to limit the impact of a denial of service attack, it is
  RECOMMENDED that an implementation limits the maximum number of
  additional Child SAs allowed per unique TSi/TSr.

Paul
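The recommended per-TSi/TSr cap from the Security Considerations quoted above can be sketched as a simple admission check. The function name, dictionary shape, and the numeric limit are all hypothetical; the draft recommends a limit but does not fix a value.

```python
MAX_SAS_PER_TS = 64  # local policy; the draft recommends a cap, no number given

def accept_additional_sa(installed, ts_pair, max_sas=MAX_SAS_PER_TS):
    """Decide whether to accept one more resource-specific Child SA for
    the given (TSi, TSr) pair; `installed` maps pairs to current counts.

    Returns the error notify name to send, or None to accept. Per the
    quoted text, NO_ADDITIONAL_SAS MUST NOT be used here, since the peer
    could read it as "no further Child SAs for any TSi/TSr at all";
    TS_MAX_QUEUE is returned instead.
    """
    if installed.get(ts_pair, 0) >= max_sas:
        return "TS_MAX_QUEUE"
    installed[ts_pair] = installed.get(ts_pair, 0) + 1
    return None

installed = {}
ts = ("10.0.1.0/24", "10.0.2.0/24")
assert accept_additional_sa(installed, ts, max_sas=2) is None
assert accept_additional_sa(installed, ts, max_sas=2) is None
assert accept_additional_sa(installed, ts, max_sas=2) == "TS_MAX_QUEUE"
```

This bounds the state a peer can force us to keep (the 10M-SAs concern) while still distinguishing "this traffic selector is full" from "no more Child SAs of any kind".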
_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec