Re: [lng-odp] [API-NEXT PATCH v3 1/4] api: ipsec: extend lookaside API

Bill Fischofer Tue, 28 Mar 2017 05:33:42 -0700

On Tue, Mar 28, 2017 at 7:21 AM, Bogdan Pricope
<bogdan.pric...@linaro.org> wrote:
> There are two aspects to consider:
> 1. How to process odp_ipsec_in(), odp_ipsec_out(),etc. with explicit
> SA reference after an odp_ipsec_sa_disable() (or after
> odp_ipsec_sa_destroy())?
> 2. When is safe to odp_ipsec_sa_destroy() after odp_ipsec_sa_disable()?
>
>
>
> If application holds a SA handler and IS NOT doing book keeping, it
> may end up using a SA after it was disabled and destroyed – accessing
> invalid/reused SA etc.
>
> If application holds a SA handler and IS doing book keeping it may
> avoid above case…. but will be difficult to tell when all workers
> stopped using that SA (pending operations inside application, etc.).
>
>
> We can ask application to do book keeping for the SA used directly and
> use a timeout to guarantee to application that SA will not be
> destroyed for a determined amount of time: we can have an
> odp_ipsec_sa_shutdown() to do: disable, timeout and destroy. Than
> odp_ipsec_in(), odp_ipsec_out(), etc. with explicit SA will return
> error between disable and destroy and trigger fatal error after
> destroy.


Passing an undefined / invalid handle to an ODP API always results in
undefined behavior except for those APIs explicitly designed for
validation (e.g., odp_packet_is_valid()). These are always application
programming errors.

To avoid such errors during termination paths in multithreaded
environments, it's necessary to have some sort of quiesce/flush
protocol to ensure that the application is finished using a handle
before it is destroyed. While applications can design their own such
protocols, having ODP provide support for this common need improves
application simplicity and robustness since getting this sort of thing
right is often non-trivial in the general case. In general, it's
better for an ODP implementation do to things like this correctly in a
manner optimal for its platform than to ask every ODP application to
figure out how to best do this on its own.

In the specific case of SA's the ODP SA is simply a handle
(odp_ipsec_sa_t) that is opaque to the application. For the
application to be able to track the SA state independent of the ODP
implementation would require that it allocate and maintain its own
"shadow SA" that contains the needed state information, which would be
both error-prone and wasteful..That's why ODP handles things like SA
expiration processing, and why, I argue, it should do the same for
disablement status.

>
> On 28 March 2017 at 14:54, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>> On Tue, Mar 28, 2017 at 4:58 AM, Savolainen, Petri (Nokia - FI/Espoo)
>> <petri.savolai...@nokia-bell-labs.com> wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
>>>> Sent: Tuesday, March 28, 2017 4:32 AM
>>>> To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolainen@nokia-bell-
>>>> labs.com>
>>>> Cc: lng-odp-forward <lng-odp@lists.linaro.org>
>>>> Subject: Re: [lng-odp] [API-NEXT PATCH v3 1/4] api: ipsec: extend
>>>> lookaside API
>>>>
>>>> On Mon, Mar 27, 2017 at 5:27 AM, Savolainen, Petri (Nokia - FI/Espoo)
>>>> <petri.savolai...@nokia-bell-labs.com> wrote:
>>>> >
>>>> >> >  /**
>>>> >> > + * Configuration options for IPSEC inbound processing
>>>> >> > + */
>>>> >> > +typedef struct odp_ipsec_inbound_config_t {
>>>> >> > +       /** Default destination queue for IPSEC events
>>>> >> > +        *
>>>> >> > +        *  When inbound SA lookup fails in asynchronous or inline
>>>> >> modes,
>>>> >> > +        *  resulting IPSEC events are enqueued into this queue.
>>>> >> > +        */
>>>> >> > +       odp_queue_t default_queue;
>>>> >> > +
>>>> >> > +       /** Constraints for SPI values of inbound SAs for which
>>>> lookup
>>>> >> is
>>>> >> > +        *  enabled. Minimal range size and unique SPI values may
>>>> >> improve
>>>> >> > +        *  performance. */
>>>> >> > +       struct {
>>>> >> > +               /** Minimum inbound SPI value. Default value is 0. */
>>>> >> > +               uint32_t min;
>>>> >> > +
>>>> >> > +               /** Maximum inbound SPI value. Default value is
>>>> >> UINT32_MAX. */
>>>> >> > +               uint32_t max;
>>>> >> > +
>>>> >> > +               /** Inbound SPI values may overlap. Default value is
>>>> 0.
>>>> >> */
>>>> >> > +               odp_bool_t overlap;
>>>> >>
>>>> >> It's not clear what you mean by SPI values overlapping since an SPI is
>>>> >> a unit32_t value. Some explanation / illustration seems warranted
>>>> >> here.
>>>> >
>>>> > Application tells if it uses unique SPI values for all (inbound) SAs it
>>>> creates (== no two SAs have the same SPI), which results simpler lookup
>>>> for implementation. This should be the default setting, since application
>>>> can decide inbound SPI values. Since default values are usually zero, I
>>>> used term "overlap" instead of e.g. "non_unique".
>>>>
>>>> As we discussed today, this is really saying whether lookups will be
>>>> done via a combination of SPI + dest addr or SPI only. In the latter
>>>> case SPI values must be unique. Either the name should make that
>>>> distinction obvious or else some additional documentation text should
>>>> be included here to avoid any possible confusion on the part of the
>>>> application writer or ODP implementer as to what the intent is here.
>>>> Perhaps "lookup_spi_and_dest" ?
>>>>
>>>> >
>>>> >
>>>> >> >
>>>> >> >  /**
>>>> >> > + * Disable IPSEC SA
>>>> >> > + *
>>>> >> > + * Application must use this call to disable a SA before destroying
>>>> it.
>>>> >> The call
>>>> >> > + * marks the SA disabled, so that IPSEC implementation stops using
>>>> it.
>>>> >> For
>>>> >> > + * example, inbound SPI lookups will not match any more. Application
>>>> >> must
>>>> >> > + * stop providing the SA as parameter to new IPSEC input/output
>>>> >> operations
>>>> >> > + * before calling disable. Packets in progress during the call may
>>>> >> still match
>>>> >> > + * the SA and be processed successfully.
>>>> >>
>>>> >> Does this mean that it is an application responsibility to ensure that
>>>> >> there are no "in flight" IPsec operations at the time this call is
>>>> >> made? That seems unnecessarily burdensome, as one of the purposes of
>>>> >> an API like this is to flush the request pipeline before issuing a
>>>> >> destroy call. I'd prefer the following sort of wording:
>>>> >>
>>>> >> "Operations initiated or in progress at the time this call is made
>>>> >> will complete normally, however no further operations on this SA will
>>>> >> be accepted. For example, inbound SPI lookups will not match any more,
>>>> >> and outbound operations referencing this SA will fail."
>>>> >
>>>> >
>>>> > Application must not pass the same SA as parameter simultaneously to
>>>> disable() and e.g. ipsec_in() or ipsec_enq(). This way implementation does
>>>> not (may not) need to synchronize, between slow path (disable) and fast
>>>> path (ipsec_in) calls. Appplication should be able to synchronize itself
>>>> so that one thread does not disable/destroy a SA that other thread still
>>>> actively uses.
>>>> >
>>>> > Look ups for in-flight packet are still possible during disable, since
>>>> that SA usage is in control of implementation. So, application can still
>>>> continue passing packets without SA (for lookup), but it must not
>>>> explicitly refer use SA itself on other (fast path calls).
>>>>
>>>> An IPsec operation referencing an SA can do so explicitly (via an
>>>> application odp_ipsec_in/out call) or implicitly (via an inline
>>>> lookup). As part of any such SA reference, the ODP implementation must
>>>> always determine whether the SA is valid. Reasons for an SA not being
>>>> valid include:
>>>>
>>>> - The SA lookup failed (implicit only)
>>>> - The SA has expired due to elapsed time or number of bytes processed
>>>> - The SA has been administratively disabled
>>>>
>>>
>>> When application refers explicitly to SA, implementation can trust that the 
>>> SA exist. It does not do look up, just refer to the SA directly. Expiration 
>>> counters are part of SA context, SA exist even if counters exceed their 
>>> limits. Just an error is returned. The last bullet concerns mostly the look 
>>> up phase (implicit usage of SA).
>>
>> You can think of a disabled SA as equivalent to one that has been
>> expired by the application rather than time or usage, so any SA may be
>> disabled just as it may be expired.
>>
>>>
>>>
>>>> Since the ODP implementation has to do these checks for every
>>>> operation anyway, having it do the checks for administrative
>>>> disablement, which is what odp_ipsec_sa_disable() does, is not adding
>>>> any real additional work. The proposal that the implementation does
>>>> the check for administrative disablement only for implicit SA
>>>> references imposes an additional burden on the part of every ODP
>>>> application that uses IPsec that seems unnecessary.
>>>
>>> Our expectation is that when application passes SA, implementation does not 
>>> check SA validity but just uses it. Disable step is needed to phase out SA 
>>> usage in the implicit case (implementation does look up and continues from 
>>> there to process the packet).
>>
>> Agreed. Attempting to reference a destroyed SA is a programming error,
>> the same as trying to reference any other stale handle. However an
>> expired/disabled SA is still a valid SA reference even though it's not
>> usable for IPsec operations, just as a stopped pktio is not usable for
>> I/O.
>>
>>>
>>> SA state check concerns application only when itself does the SA look up, 
>>> and thus needs to synchronize SA validity anyway.
>>
>> No, because an SA may also be unusable because it is expired. It's the
>> implementation's responsibility to check for expirations since asking
>> a worker thread to do this would require knowledge it does not easily
>> have (e.g., how many bytes were sent on this SA by this or any other
>> thread). As noted above, disable is simply a way for the application
>> to force an SA into an expired state so that operations against it can
>> be flushed prior to destroying it.
>>
>>>
>>>
>>>>
>>>> I view odp_ipsec_sa_disable() to be analogous to odp_pktio_stop(),
>>>> which is used to quiesce an odp_pktio_t prior to closing it. Note the
>>>> analogous documentation in odp_pktio_stop():
>>>>
>>>>  * Stop packet receive and transmit on a previously started interface. New
>>>>  * packets are not received from or transmitted to the network. Packets
>>>> already
>>>>  * received from the network may be still available from interface and
>>>>  * application can receive those normally. New packets may not be accepted
>>>> for
>>>>  * transmit. Packets already stored for transmit are not freed. A
>>>> following
>>>>  * odp_packet_start() call restarts packet receive and transmit.
>>>>
>>>> The "new packets may not be accepted for transmit" is the way I'd like
>>>> to see odp_ipsec_out() behave following an odp_ipsec_sa_disable()
>>>> call. Disable simply marks the SA so that in-flight operations can
>>>> complete but no new operations on that SA will be accepted. Once it
>>>> returns (indicating that the operation "pipeline" for that SA has been
>>>> flushed) the application can know that it is safe to destroy the SA.
>>>
>>> Another analog is queue API.
>>>
>>> 1) Application first stops itself from adding new events into the queue
>>> 2) Application calls dequeue so many times that nothing comes out
>>> 3) Application destroys queue only after it's empty
>>>
>>> 1) stop explicit SA usage
>>> 2) disable SA
>>> 3) process all inflight IPSEC packets for the SA
>>> 4) SA can be destroyed only after all inflight IPSEC for the SA is finished 
>>> (disable event or return value indicate that)
>>
>> I'd argue that adding an odp_queue_disable() API that would prevent
>> further enqueues while allowing dequeues would be a good add and is
>> something we're going to need anyway as we have more queues that can
>> have events added to it by the ODP implementation itself (e.g., SA
>> error queues). Otherwise we'll run into similar race conditions trying
>> to destroy them. The odp_pktio_stop() model should be the norm going
>> forward for various async operations to permit easy and well-defined
>> termination paths.
>>
>>>
>>>
>>>>
>>>> If odp_ipsec_sa_disable() does not behave this way, then the
>>>> application would have to separately track and check SA enable/disable
>>>> status before it makes any odp_ipsec_out() call. But note that we're
>>>> not asking the application to track and check time/data expiration
>>>> limits on the SA, so this is inconsistent. Either the application
>>>> using lookaside processing should be responsible for tracking
>>>> everything or the ODP implementation should do this for the
>>>> application. The ODP philosophy has been that it's better to do such
>>>> things within the implementation than to require every ODP application
>>>> to invent their own scheme for doing this sort of thing.
>>>>
>>>> Hope that clarifies things.
>>>
>>>
>>> To me it's hard to believe that application would not have any book keeping 
>>> of active SAs or inflight IPSEC requests. With explicit SA usage, worker 
>>> thread synchronization through IPSEC block sounds an odd design ("a call 
>>> failed on a SA,
>> so some other thread must have destroyed the SA").
>>
>> Disabled, not destroyed. And no different than an operation failure
>> because the SA has expired (which could occur at any time from the
>> worker's perspective).
>>
>>>
>>> -Petri
>>>
>>>

Re: [lng-odp] [API-NEXT PATCH v3 1/4] api: ipsec: extend lookaside API

Reply via email to