William Allen Simpson writes:
> > So it might make sense to have the ICV at the end because it is
> > likely cache hot when needed.
> 
> But after removing padding for these stream algorithms, then the ICV is
> very likely not aligned.  For zero-copy RDMA, it is rather inconvenient.
> And the IP header cache lines are likely still hot.
> 
> Anyway, that's why I'd like to consider at least a negotiated option --
> as long as it's possible to implement efficiently in Linux and others.
> We need to hear from more implementers.

I think it would be a bad idea to have such an option. Yes, it might
offer small gains in environments where the option is picked exactly
when both ends like it, but when one end picks it in a way the other
end does not like, we usually end up with big performance penalties.

It also again multiplies the testing effort, as you now need to add
this option to your test suites combined with all possible ciphers etc.

This was one of the main problems with IKEv1: there were so many
different combinations of different options that testing all of them
required thousands or tens of thousands of test cases.
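
To make the multiplication concrete, here is a minimal sketch. The
option names and value lists below are made up for illustration (they
are not a real test matrix, and "icv_placement" is a hypothetical name
for the proposed option); the point is only that every new negotiated
two-valued option doubles the number of combinations.

```python
from itertools import product

# Hypothetical option dimensions, for illustration only.
dimensions = {
    "cipher": ["aes-cbc", "aes-ctr", "aes-gcm", "chacha20-poly1305"],
    "mode": ["tunnel", "transport"],
    "ipcomp": ["off", "on"],
    "nat_traversal": ["off", "on"],
}


def count_cases(dims):
    """Count every combination of one value per option."""
    return sum(1 for _ in product(*dims.values()))


before = count_cases(dimensions)

# Add one negotiated, two-valued option (hypothetical name).
with_option = dict(dimensions, icv_placement=["trailer", "header"])
after = count_cases(with_option)

print(before, after)
```

With these assumed lists the matrix goes from 32 to 64 cases, and that
is before testing what happens when the two ends prefer different
values.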

An example of a similar optimization causing problems was found during
interop events, when some implementation (I think it was Linux) was
sending fragmented packets in reverse order, so the first network
packet sent/received was the last fragment. The idea was that when the
receiver saw the last fragment it would immediately know how big the
final packet would be, and could allocate a big enough buffer for it.

Then when you combined that with IPsec with per-flow policy, meaning
each TCP/UDP flow might be using a different SA, the SGW was required
to store all fragments in memory until it got the last packet from the
network, which was the first fragment (only the first fragment carries
the TCP/UDP ports that identify the flow), and only after that could
it check whether this packet is allowed to pass, and which SA it needs
to use.

Then it sent the fragments out, but there was a lot of added latency
because of this. Actually I think there were also implementations
which did not even store the later fragments; they simply checked
them, found that they had not seen the first fragment that would allow
them to be passed, and dropped them.

The SGW of course then sent the fragments out in order, so that the
receiving SGW could efficiently do exit tunnel checks (i.e., check from
the first fragment that this packet should be allowed, match later
fragments with the same fragment ID to that, and allow them to pass),
and so did not need to do the same buffering. Of course the final
destination host now did not benefit at all from this optimization, as
it was negated by the SGWs in the middle.
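
The buffering behaviour described above can be sketched roughly as
follows. This is a toy model under stated assumptions, not any real
SGW implementation: the Fragment class and sgw_forward function are
hypothetical, offsets are simple fragment indexes rather than real
8-byte units, and the policy check is assumed to always allow the flow
once the first fragment has been seen.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    frag_id: int  # identification value shared by all fragments of a packet
    offset: int   # simplified fragment index; 0 is the first fragment


def sgw_forward(fragments):
    """Return (offsets in output order, peak number of buffered fragments)."""
    known = set()   # frag_ids whose first fragment passed the policy check
    pending = {}    # frag_id -> later fragments held until the first arrives
    out = []
    peak = 0
    for frag in fragments:
        if frag.offset == 0:
            # First fragment: flow is identifiable, so the check can run
            # (assumed to succeed here) and held fragments can be released.
            known.add(frag.frag_id)
            out.append(frag)
            held = pending.pop(frag.frag_id, [])
            out.extend(sorted(held, key=lambda f: f.offset))
        elif frag.frag_id in known:
            # Later fragment of an already accepted packet: pass through.
            out.append(frag)
        else:
            # Later fragment seen before the first one: must be buffered.
            pending.setdefault(frag.frag_id, []).append(frag)
            peak = max(peak, sum(len(v) for v in pending.values()))
    return [f.offset for f in out], peak


in_order = [Fragment(frag_id=7, offset=i) for i in range(4)]
reverse_order = in_order[::-1]

in_order_result = sgw_forward(in_order)
reverse_result = sgw_forward(reverse_order)
print(in_order_result, reverse_result)
```

In this model the in-order sender causes no buffering at all, while the
reverse-order sender makes the SGW hold every fragment except the first
one, and the fragments still leave the SGW in order, so the end host
gets nothing out of the reordering.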

So as soon as the options to use start to depend on the platform,
implementation etc., it gets harder and harder to find out which will
be the optimal combination between two devices, and they might end up
using a suboptimal feature set. And all of those add more complexity
and combinations that need to be tested.

Of course if this option is added, it should be something like IPCOMP
or transport mode, meaning it is off by default, and everybody MUST
implement the case where it is not enabled. Then you would only
implement (and propose) it in cases where it actually benefits you,
and everybody else would never ever implement it.
-- 
kivi...@iki.fi

_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec