Re: [lng-odp] Compression API: hashing

2018-12-11 Thread Peltonen, Janne (Nokia - FI/Espoo)
I cannot really speak for OFP but OFP is not even using the compression API.

I do not really have an opinion one way or the other. Moving digest to 
odp_comp_packet_result_t does sound a little more flexible and maybe the 
overhead is not too bad with --disable-abi-compat.

 Janne

From: Bill Fischofer 
Sent: Monday, December 10, 2018 11:27 PM
To: Dmitry Eremin-Solenikov ; Peltonen, 
Janne (Nokia - FI/Espoo) 
Cc: lng-odp-forward 
Subject: Re: [lng-odp] Compression API: hashing

I'd like to hear from Janne as to what OFP would prefer. I'll also add this to 
the discussion agenda for tomorrow's public call. Thanks.

On Mon, Dec 10, 2018 at 2:28 PM Dmitry Eremin-Solenikov
<dmitry.ereminsoleni...@linaro.org> wrote:
On 10/12/2018 22:11, Bill Fischofer wrote:
> I assume that's an API change? Can you put together a draft PR of what
> that would look like so we can discuss?

I can sketch a draft PR, however I'd like to hear an opinion first.

Basically my suggestion is to move the digest from packet data to
odp_comp_packet_result_t. This will make an extra API call necessary
to access the digest data.

>
> On Mon, Dec 10, 2018 at 1:00 PM Dmitry Eremin-Solenikov
> <dmitry.ereminsoleni...@linaro.org> wrote:
>
> Hello,
>
> I have been reworking compression API implementation to properly
> allocate/deallocate memory. Right now I've stumbled upon digest part of
> compression API. Currently on both compression and decompression digest
> should be written after co/decompressed output data. However both our
> hardware and current DPDK compressdev use out-of-band data to pass
> resulting hash. Do we want to stick with in-package data or do we want
> to switch to OOB way to return message digest?
>
> --
> With best wishes
> Dmitry
>


--
With best wishes
Dmitry


Re: [lng-odp] wrong push to master branch was reverted

2018-04-09 Thread Peltonen, Janne (Nokia - FI/Espoo)
My repo, into which I pulled from ODP before the odp-dpdk merge happened,
has the following commit, which has not disappeared from the ODP repo. Is that
intentional? (Btw, its commit message may not be totally accurate
anyway, i.e. does it really fix the bug?):

Janne

commit 273264b459d98140a3c1a217bcea2c7bd680c826 (HEAD -> master, origin/master, 
origin/HEAD)
Author: Josep Puigdemont 
AuthorDate: Tue Apr 3 09:44:51 2018 +0200
Commit: Maxim Uvarov 
CommitDate: Wed Apr 4 15:58:26 2018 +0300

fdserver: handle interruption by signal in accept

This patch fixes: https://bugs.linaro.org/show_bug.cgi?id=3690

Suggested-by: Janne Peltonen 
Signed-off-by: Josep Puigdemont 
Reviewed-by: Bill Fischofer 
Signed-off-by: Maxim Uvarov 



> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Maxim 
> Uvarov
> Sent: Monday, April 09, 2018 10:37 AM
> To: LNG ODP Mailman List 
> Subject: [lng-odp] wrong push to master branch was reverted
> 
> Colleagues,
> 
> I apologize for the wrong push to the master branch. I had a new environment
> and mistakenly pushed odp-dpdk changes to the master branch (pushed to the
> wrong git remote). I then quickly corrected that with a force push. I hope the
> current history is as it was before, but if you use automatic pulls from
> master, you will need to rebase.
> 
> I apologize for that accident.
> 
> Maxim.


Re: [lng-odp] Inside Secure IPsec HW capabilities

2018-01-17 Thread Peltonen, Janne (Nokia - FI/Espoo)
Pascal Van Leeuwen wrote:
> I get the impression that
> the IPsec payload can just be random garbage, i.e. it doesn't have to decrypt 
> to anything
> that makes sense and it doesn't have to authenticate properly (to facilitate 
> fast
> generation of such packets by skipping actual cryptographic operations).

The next header field is encrypted, so decryption must complete successfully
before the dummy TFC packets can be recognized. On top of that, the RFC
specifies that the authentication check is also done before that. See RFC 4303,
section 3.4: Inbound Packet Processing and section 3.4.4.1, step 4.

Janne




Re: [lng-odp] IPsec and crypto performance and OpenSSL

2017-12-12 Thread Peltonen, Janne (Nokia - FI/Espoo)
Dmitry Eremin-Solenikov [mailto:dmitry.ereminsoleni...@linaro.org] wrote:
> On 12 December 2017 at 14:00, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> >> Also note that this will break explicit IV support.
> >
> > Why so? The iv is set in every operation.
> 
> Sorry, I meant implicit IV, when there is no override_iv_ptr, but IV
> is 'kept' inside session.

I still do not see the problem. iv_ptr points to either the override-IV
or the IV kept in a session. I just moved context creation and cipher
selection and key configuration to session creation.

Janne



Re: [lng-odp] IPsec and crypto performance and OpenSSL

2017-12-12 Thread Peltonen, Janne (Nokia - FI/Espoo)
> Also note that this will break explicit IV support.

Why so? The iv is set in every operation.

Janne

> -Original Message-
> From: Dmitry Eremin-Solenikov [mailto:dmitry.ereminsoleni...@linaro.org]
> Sent: Tuesday, December 12, 2017 12:36 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
> Cc: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>; Bill 
> Fischofer
> <bill.fischo...@linaro.org>; lng-odp@lists.linaro.org
> Subject: Re: Suspected SPAM - Re: [lng-odp] IPsec and crypto performance and 
> OpenSSL
> 
> Hello,
> 
> On 12 December 2017 at 12:38, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> >
> >> So, I'd suggest to preallocate Open SSL (per thread) context memory in 
> >> global_init(). I
> >> guess context allocation depends on algorithm, etc config, but we could 
> >> e.g. pre-
> allocate
> >> the most obvious ones (e.g. AES+SHA1, or AES-GCM) and leave all others as 
> >> they are
> today.
> >> A balance between simple solution, process support and better performance 
> >> for the
> common
> >> case.
> >
> > That would help but since the same context would be shared by
> > many crypto sessions the keys would have to be configured in the
> > context in every crypto op.
> >
> > The current code looks like this (in every crypto op):
> >
> > ctx = EVP_CIPHER_CTX_new();
> > EVP_EncryptInit_ex(ctx, session->cipher.evp_cipher, NULL,
> >session->cipher.key_data, NULL);
> > EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr);
> > EVP_CIPHER_CTX_set_padding(ctx, 0);
> >
> > ret = internal_encrypt(ctx, pkt, param);
> >
> > EVP_CIPHER_CTX_free(ctx);
> >
> > I tried this, with big speedup (and broken process support):
> >
> > ctx = session->perthread[odp_thread_id()].cipher_ctx;
> > EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr);
> >
> > ret = internal_encrypt(ctx, pkt, param);
> 
> Also note that this will break explicit IV support.
> 
> > What you propose, would be in between. Maybe I should try and
> > see how fast it would be.
> >
> > Janne
> >
> >
> >> -Original Message-
> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> >> Savolainen, Petri
> >> (Nokia - FI/Espoo)
> >> Sent: Tuesday, December 12, 2017 10:30 AM
> >> To: Bill Fischofer <bill.fischo...@linaro.org>; Dmitry Eremin-Solenikov
> >> <dmitry.ereminsoleni...@linaro.org>
> >> Cc: lng-odp@lists.linaro.org
> >> Subject: Suspected SPAM - Re: [lng-odp] IPsec and crypto performance and 
> >> OpenSSL
> >>
> >> We should not deliberately break process support. Since ODP and OFP are 
> >> libraries, it's
> >> the application (e.g. NGINX) that creates the threads and it may have 
> >> historical or
> other
> >> valid reasons to fork processes instead of creating pthreads. Processes 
> >> may be forked
> at
> >> many points. If we target fork after global init, we are making process 
> >> support better
> >> instead of making it worse.
> >>
> >> So, I'd suggest to preallocate Open SSL (per thread) context memory in 
> >> global_init(). I
> >> guess context allocation depends on algorithm, etc config, but we could 
> >> e.g. pre-
> allocate
> >> the most obvious ones (e.g. AES+SHA1, or AES-GCM) and leave all others as 
> >> they are
> today.
> >> A balance between simple solution, process support and better performance 
> >> for the
> common
> >> case.
> >>
> >> -Petri
> >>
> >>
> >>
> >> > -Original Message-
> >> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill
> >> > Fischofer
> >> > Sent: Tuesday, December 12, 2017 3:23 AM
> >> > To: Dmitry Eremin-Solenikov <dmitry.ereminsoleni...@linaro.org>
> >> > Cc: lng-odp@lists.linaro.org
> >> > Subject: Re: [lng-odp] IPsec and crypto performance and OpenSSL
> >> >
> >> > I think we've pretty much abandoned the notion that linux-generic will
> >> > support threads as separate processes as there seems to be little
> >> > justification for it. The current plan is to continue this assumption in
> >> > the "2.0" code base

Re: [lng-odp] Suspected SPAM - Re: IPsec and crypto performance and OpenSSL

2017-12-12 Thread Peltonen, Janne (Nokia - FI/Espoo)

> So, I'd suggest to preallocate Open SSL (per thread) context memory in 
> global_init(). I
> guess context allocation depends on algorithm, etc config, but we could e.g. 
> pre-allocate
> the most obvious ones (e.g. AES+SHA1, or AES-GCM) and leave all others as 
> they are today.
> A balance between simple solution, process support and better performance for 
> the common
> case.

That would help, but since the same context would be shared by
many crypto sessions, the keys would have to be configured in the
context in every crypto op.

The current code looks like this (in every crypto op):

ctx = EVP_CIPHER_CTX_new();
EVP_EncryptInit_ex(ctx, session->cipher.evp_cipher, NULL,
   session->cipher.key_data, NULL);
EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr);
EVP_CIPHER_CTX_set_padding(ctx, 0);

ret = internal_encrypt(ctx, pkt, param);

EVP_CIPHER_CTX_free(ctx);

I tried this, with big speedup (and broken process support):

ctx = session->perthread[odp_thread_id()].cipher_ctx;
EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr);

ret = internal_encrypt(ctx, pkt, param);

What you propose would be in between. Maybe I should try it and
see how fast it would be.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> Savolainen, Petri
> (Nokia - FI/Espoo)
> Sent: Tuesday, December 12, 2017 10:30 AM
> To: Bill Fischofer <bill.fischo...@linaro.org>; Dmitry Eremin-Solenikov
> <dmitry.ereminsoleni...@linaro.org>
> Cc: lng-odp@lists.linaro.org
> Subject: Suspected SPAM - Re: [lng-odp] IPsec and crypto performance and 
> OpenSSL
> 
> We should not deliberately break process support. Since ODP and OFP are 
> libraries, it's
> the application (e.g. NGINX) that creates the threads and it may have 
> historical or other
> valid reasons to fork processes instead of creating pthreads. Processes may 
> be forked at
> many points. If we target fork after global init, we are making process 
> support better
> instead of making it worse.
> 
> So, I'd suggest to preallocate Open SSL (per thread) context memory in 
> global_init(). I
> guess context allocation depends on algorithm, etc config, but we could e.g. 
> pre-allocate
> the most obvious ones (e.g. AES+SHA1, or AES-GCM) and leave all others as 
> they are today.
> A balance between simple solution, process support and better performance for 
> the common
> case.
> 
> -Petri
> 
> 
> 
> > -Original Message-
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill
> > Fischofer
> > Sent: Tuesday, December 12, 2017 3:23 AM
> > To: Dmitry Eremin-Solenikov <dmitry.ereminsoleni...@linaro.org>
> > Cc: lng-odp@lists.linaro.org
> > Subject: Re: [lng-odp] IPsec and crypto performance and OpenSSL
> >
> > I think we've pretty much abandoned the notion that linux-generic will
> > support threads as separate processes as there seems to be little
> > justification for it. The current plan is to continue this assumption in
> > the "2.0" code base as well. So having OpenSSL rely on threads sharing the
> > same address space should not be a problem in practice.
> >
> > Any platform that has native HW crypto capabilities would certainly use
> > those in preference OpenSSL, so again such restrictions should not be of
> > concern here.
> >
> > I agree, however, that this sort of tuning should be a follow-on activity,
> > perhaps under the general performance improvement work we want for 2.0,
> > but
> > should not be a gating consideration for the Tiger Moth release.
> >
> > On Mon, Dec 11, 2017 at 5:19 PM, Dmitry Eremin-Solenikov <
> > dmitry.ereminsoleni...@linaro.org> wrote:
> >
> > > On 11 December 2017 at 19:14, Maxim Uvarov <maxim.uva...@linaro.org>
> > > wrote:
> > > > odp_init_global() allocates shm, then odp_init_local() /
> > odp_term_local()
> > > > allocates/destroys per thread contexts in array in that shm. I think
> > that
> > > > has to work.
> > >
> > > The problem lies in OpenSSL 1.1 "opaque structures" approach. They
> > stopped
> > > providing exact struct definitions which can be embedded somewhere.
> > Another
> > > option would be to switch to libnettle licensed under LGPL-2.1+. It may
> > be
> > > however less optimized compared to OpenSSL.
> > >
> > > > On 11 December 2017 at 17:02, Francois Ozog <francois.o...@linaro.org>
> > > > wrote:
> > > >

[lng-odp] IPsec and crypto performance and OpenSSL

2017-12-11 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

When playing with IPsec I noticed that the Linux generic
ODP implementation creates a separate OpenSSL crypto context
for each crypto-operation as opposed to doing it at ODP
crypto session creation. With IPsec this adds a lot of
overhead for every packet processed and significantly
reduces packet throughput.

I wonder what, if anything, should be done about it.

I already almost sent a patch to create and initialize
crypto contexts only once per session but realized that
it is not that easy.

Here are some alternatives that came to my mind, but all
of them have their own problems:

a) Create per-thread OpenSSL contexts at crypto session
   creation time.
   - Does not work with ODP threads that do not share
 their address space since OpenSSL is allocating
 memory through malloc() during context creation.

b) Do a) plus provide OpenSSL a custom memory allocator
   on top of shared memory.
   - There is no generic heap allocator in ODP code base.

c) Create per-thread contexts lazily when needed.
   - Creation would work as it would happen in the right
 thread but there would be no way to delete the
 contexts. The thread destroying the ODP crypto
 session cannot delete the per-thread contexts that
 might reside in a different address spaces. That
 thread could ask every other thread to do the
 per-thread cleanup, except that there is no mechanism
 for that without application assistance or big
 changes in the generic ODP code.

d) Create a limited-size cache of per-thread contexts.
   - This would allow postponing the deletion of each
 context either to the point the cache slot needs
 to be reused or all the way to ODP termination,
 both occurring in the right thread. But this is
 getting complicated and sizing the cache is nasty.

Any thoughts?

Janne




Re: [lng-odp] [PATCH API-NEXT v2 0/3] api: ones complement metadata

2017-10-26 Thread Peltonen, Janne (Nokia - FI/Espoo)
As discussed offline with Petri, I feel that it would be useful
to let the ODP implementation tell from which packet offset the
sum calculation starts (or even let the implementation specify
the range, i.e. also the end).

The rationale is that even though starting from L4 offset is
reasonable from API point-of-view, different HW may start the
sum calculation from different offsets and fixing the sum in
SW is somewhat costly in those cases where the application is
not actually interested in the sum, for example:

- An application both forwards packets and receives packets locally,
  and needs checksum checking only for the locally received packets.
- An application receives all incoming traffic locally, but drops
  some of it before it even wants to check the checksum (e.g.
  QoS filtering, one fragment lost, packet to wrong TCP port).

Janne

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> Savolainen, Petri
> (Nokia - FI/Espoo)
> Sent: Thursday, October 26, 2017 10:00 AM
> To: Github ODP bot ; lng-odp@lists.linaro.org
> Subject: Suspected SPAM - Re: [lng-odp] [PATCH API-NEXT v2 0/3] api: ones 
> complement
> metadata
> 
> [This sender failed our fraud detection checks and may not be who they appear 
> to be. Learn
> about spoofing at http://aka.ms/LearnAboutSpoofing]
> 
> Ping.
> 
> > -Original Message-
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> > Github ODP bot
> > Sent: Monday, October 23, 2017 1:00 PM
> > To: lng-odp@lists.linaro.org
> > Subject: [lng-odp] [PATCH API-NEXT v2 0/3] api: ones complement metadata
> >
> > Added packet metadata for ones complement sum over IP
> > payload in a packet. Some NICs calculate the sum
> > during packet input (at least for IP fragments) and
> > store the value into the packet descriptor. This offloads
> > L4 checksum calculation for IP fragments as SW does not
> > need sum all payload data, but just combine pre-calculated
> > sums from packet descriptors and remove extra header fields
> > from the sum.


Re: [lng-odp] [PATCH API-NEXT v1 1/3] api: ipsec: rework ODP_IPSEC_SA_DISABLE into packet error

2017-10-24 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Comments below:

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Github 
> ODP bot
> Sent: Tuesday, October 24, 2017 2:00 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCH API-NEXT v1 1/3] api: ipsec: rework 
> ODP_IPSEC_SA_DISABLE into
> packet error
> 
> From: Dmitry Eremin-Solenikov 
> 
> According to the discussion on mailing list, most of implementations
> will not be able to support odp_ipsec_sa_disable() status event
> directly.  Instead they will submit a dummy packet to that SA. Then
> after receiving this packet after odp_ipsec_result() will detect this
> packet and will report it as a packet with
> odp_ipsec_error_t->sa_disabled bit set.
> 
> Signed-off-by: Dmitry Eremin-Solenikov 
> Cc: Nikhil Agarwal 
> Cc: Balasubramanian Manoharan 
> ---
> /** Email created from pull request 256 (lumag:ipsec_sa_disable_v2)
>  ** https://github.com/Linaro/odp/pull/256
>  ** Patch: https://github.com/Linaro/odp/pull/256.patch
>  ** Base sha: 825f75ed8644ef57c5648961e7982daf13cd9375
>  ** Merge commit sha: 1bab6bb255422c8eb061c6482397eb498fc7b1cc
>  **/
>  include/odp/api/spec/ipsec.h | 38 ++
>  1 file changed, 14 insertions(+), 24 deletions(-)
> 
> diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> index 26e852fca..f550b6bac 100644
> --- a/include/odp/api/spec/ipsec.h
> +++ b/include/odp/api/spec/ipsec.h
> @@ -842,11 +842,12 @@ odp_ipsec_sa_t odp_ipsec_sa_create(const 
> odp_ipsec_sa_param_t
> *param);
>   * the SA and be processed successfully.
>   *
>   * When in synchronous operation mode, the call will return when it's 
> possible
> - * to destroy the SA. In asynchronous mode, the same is indicated by an
> - * ODP_EVENT_IPSEC_STATUS event sent to the queue specified for the SA. The
> - * status event is guaranteed to be the last event for the SA, i.e. all
> + * to destroy the SA. In asynchronous mode, the same is indicated by a packet
> + * event sent to the queue specified for the SA having sa_disabled error bit
> + * set. The packet is guaranteed to be the last event for the SA, i.e. all
>   * in-progress operations have completed and resulting events (including 
> status
> - * events) have been enqueued before it.

Maybe it should be clarified that the packet event with sa_disabled
error bit set will be sent even when there is no traffic on the SA,
i.e. no actual packets being processed.

Then I wonder if "packet event with sa_disabled error bit set" is too
shortened an expression that requires the reader to figure out that it
really means such ODP_EVENT_PACKET with subtype ODP_EVENT_PACKET_IPSEC
that odp_ipsec_result(odp_ipsec_packet_from_event(ev)) indicates
sa_disabled through the error bits.

If the packet resulting from odp_ipsec_packet_from_event() must never
be used as a packet as Bill summarized in his post, then that should be
specified somewhere, maybe in the documentation of 
odp_ipsec_packet_from_event().

event.h says this about ODP_EVENT_PACKET:

* - ODP_EVENT_PACKET
* - Packet event (odp_packet_t) containing packet data and plenty of
*   packet processing related metadata

But that does not really apply to the "sa_disabled" packet events.

Then event.h says this regarding event subtypes:

* When
* subtype is known, these subtype specific functions should be preferred over
* the event type general function (e.g. odp_packet_from_event()).

This implies that even though odp_ipsec_packet_from_event() is "preferred",
odp_packet_from_event() is ok even for the sa_disabled ipsec packet events.
But if the resulting packet must not be used as a packet, then one needs
to always check the subtype too. If some packet events are not really
packets, the whole packet event type and subtype thing does not seem
terribly useful to me.
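Spelled out, the defensive handling implied above looks roughly like this (pseudocode; the ODP names are real API, while the sa_disabled bit is the one proposed in this patch):

```
ev = odp_schedule(&from_queue, wait);
if (odp_event_type(ev) == ODP_EVENT_PACKET &&
    odp_event_subtype(ev) == ODP_EVENT_PACKET_IPSEC) {
        pkt = odp_ipsec_packet_from_event(ev);
        odp_ipsec_result(&res, pkt);
        if (res.status.error.sa_disabled) {
                /* Not a real packet: do SA teardown bookkeeping and
                 * free the event; never touch the packet data. */
        } else {
                /* Normal IPsec packet processing. */
        }
}
```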


> + * events) have been enqueued before it. The will be no more packets coming
> + * from SA queue.

The last sentence (which has a typo, btw) is not correct since more
packets can be coming through the queue, not for the SA that was
disabled but for other SAs.

>   *
>   * @param sa  IPSEC SA to be disabled
>   *
> @@ -914,6 +915,8 @@ typedef struct odp_ipsec_error_t {
> 
>   /** Hard lifetime expired: packets */
>   uint32_t hard_exp_packets : 1;
> +
> + uint32_t sa_disabled : 1;

Doxygen comment is missing.

>   };
> 
>   /** All error bits
> @@ -927,7 +930,12 @@ typedef struct odp_ipsec_error_t {
> 
>  } odp_ipsec_error_t;
> 
> -/** IPSEC warnings */
> +/** IPSEC warnings
> + *
> + * For outbound SAs in ODP_IPSEC_OP_MODE_INLINE mode warnings can be reported
> + * only as status events. In all other cases warnings can be reported either 
> as
> + * a part of packet result or via separate ODP status event.
> + */

Reporting 

Re: [lng-odp] IPsec fragmentation ambiguity

2017-09-18 Thread Peltonen, Janne (Nokia - FI/Espoo)

The odp_ipsec_error_t, which is part of the packet result, includes the
following error bit:

/** Packet does not fit into the given MTU size */
uint32_t mtu  : 1;

In ODP_IPSEC_FRAG_CHECK mode an operation fails and indicates the
mtu error if fragmentation would have been needed (i.e. the packet
is already too big or would become too big after IPsec encapsulation,
compared to the MTU configured for the SA).

Janne

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Tuesday, September 19, 2017 6:31 AM
> To: lng-odp-forward 
> Subject: [lng-odp] IPsec fragmentation ambiguity
> 
> While working on the IPsec user documentation, I've run across an
> ambiguity that may indicate a gap in the currently-defined API.
> 
> For outbound lookaside processing, one of the fields in the
> odp_ipsec_out_param_t struct is the odp_ipsec_out_opt_t that contains
> the odp_ipsec_frag_mode_t. This is defined in the spec as:
> 
> /**
>  * Fragmentation mode
>  *
>  * These options control outbound IP packet fragmentation offload. When 
> offload
>  * is enabled, IPSEC operation will determine if fragmentation is needed and
>  * does it according to the mode.
>  */
> typedef enum odp_ipsec_frag_mode_t {
> /** Do not fragment IP packets */
> ODP_IPSEC_FRAG_DISABLED = 0,
> 
> /** Fragment IP packet before IPSEC operation */
> ODP_IPSEC_FRAG_BEFORE,
> 
> /** Fragment IP packet after IPSEC operation */
> ODP_IPSEC_FRAG_AFTER,
> 
> /** Only check if IP fragmentation is needed,
>  * do not fragment packets. */
> ODP_IPSEC_FRAG_CHECK
> } odp_ipsec_frag_mode_t;
> 
> All this seems straightforward except for the last options. If the
> mode is specified as ODP_IPSEC_FRAG_CHECK, it's not clear how the
> check results are returned to the application. I'd expect this to be a
> warning flag in the odp_ipsec_packet_result_t, but no such flag is
> defined.
> 
> Is this an oversight? Or have I missed how this information is returned?


Re: [lng-odp] IPsec API finialization

2017-09-11 Thread Peltonen, Janne (Nokia - FI/Espoo)

Bill Fischofer [mailto:bill.fischo...@linaro.org] wrote:
> On Fri, Sep 8, 2017 at 6:10 AM, Janne Peltonen <janne.pelto...@nokia.com> 
> wrote:
> >
> >
> > On Fri, 8 Sep 2017, Nikhil Agarwal wrote:
> >> On 7 September 2017 at 14:09, Peltonen, Janne (Nokia - FI/Espoo)
> >> <janne.pelto...@nokia.com> wrote:>
> >> > Bill Fischofer wrote:
> >> > >  if (odp_ipsec_result(_res, pkt) < 0) { /* Stale
> >> > > event, discard */
> >>
> >> > There is a race condition here if the idea is that calling
> >> > odp_ipsec_sa_disable() (or destroy()) marks the SA stale to prevent
> >> > further processing for that SA in the result event handling.
> >>
> >> > Maybe the odp_ipsec_result() succeeds but the SA becomes stale
> >> > right after that, before rest of the processing is done.
> >>
> >> Same race condition exist in the proposed API when one thread received the 
> >> last
> >> packet of SA and still process it while the other thread on receiving 
> >> disable
> >> event destroy the SA, application will always need to synchronize its own
> >> thread.
> >
> > There is no race in the API, only in incorrect use of it. As I explained,
> > synchronization is needed even with the disable event and it can be
> > easily done using an ordered lock that ensures that other event handling
> > threads are finished processing the SA (they have released the queue
> > context through another call to schedule or explicitly). So the event
> > handler for the disable event can destroy the SA safely without any race
> > if it uses an ordered lock (or if the SA queue is atomic).
> >
> > I wrote a piece of pseudocode about this in my reply, maybe you missed it.
> >
> >>
> >> Let me reiterate the solution we discussed yesterday. After the SA is 
> >> disabled
> >> successfully, implementations will stop enqueuing any more events from 
> >> that SA
> >> and any call to odp_ipsec_result will indicate that SA is disabled.(SA 
> >> Entry is
> >> still valid though). Application may choose to synchronize its database and
> >> process that packet or may drop it. Then it will call odp_ipsec_sa_destroy 
> >> which
> >> will delete the SA and any further call to odp_ipsec_result pertaining to 
> >> that
> >> SA will result in invalid event. Hope this resolves the issue.
> >
> > That clarifies the API you are proposing but not how you intend it to
> > be used. It is easy to say that an application just should do whatever
> > synchronization is necessary.
> >
> > That said, I think your proposal could be made to work from application
> > perspective, but with some inconvenience:
> >
> > After the packet event handler checks (for every IPsec packet event)
> > that the event is not stale, it has to prevent odp_ipsec_sa_destroy()
> > from being called in any other thread until it no longer uses the SA.
> > While this could be done e.g. using an RCU based or epoch based
> > synchronization scheme to delay the destroy() just long enough, it
> > may perhaps not be easy for all application authors to do so since
> > ODP does not really provide anything than the basic atomics for that.
> 
> Wouldn't this happen automatically if the event queue were atomic? Or
> if an ordered lock were used to protect odp_ipsec_result() for the
> duration of the SA context reference? For example:

That approach would work if the event handler destroyed the SA in
the same queue context as it handles the packet events of the SA.

The problem is that SA deletion starts elsewhere in some control thread
and there would have to be a way to reliably delegate the actual destroy
to the correct queue context by having an event enqueued through the
SA queue. And after the proposed API change the ODP implementation would
not necessarily send such an event (e.g. disable complete event) and
would not allow the application to send any event either.

The approach would work if there are unhandled IPsec packet events
for the SA in the SA queues but that may not be the case. In async case
one could send a dummy packet after disable() to get such an event, but
in the inline case that would not work (well, if one shared the same
event queue between inline and async processing, then that trick could
work).

The atomic queue or ordered lock approach works with the current API
just because there is the disable completion event that can be used
to transfer control to the correct packet processing thread (even if
the actual destroy would be further deferred to some 

Re: [lng-odp] IPsec API finialization

2017-09-07 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Comments below:

Bill Fischofer wrote:
> As discussed during today's ARCH call, we need to close on the
> remaining IPsec confusion surrounding the definition of
> odp_ipsec_sa_disable().
> 
> The main scenario that this API is designed to address, from an
> application standpoint is as follows:
> 
> - Application has one or more active SAs that are receiving inline RX
> IPsec traffic.
> 
> - An active SA needs to be deactivated / destroyed for some reason
> 
> - Application needs to be sure that "in flight" events that it is not
> yet aware of are in a defined state.
> 
> odp_ipsec_sa_disable() allows the application to tell the ODP
> implementation that it wants to deactivate an SA so that it can be
> destroyed. In response to this the implementation is expected to:
> 
> - Stop making internal use of this SA
> 
> - Tell the application when the SA is quiesced as far as the
> implementation is concerned.
> 
> As currently defined, the way this second step is communicated to the
> application is via a status event posted to the event queue associated
> with the SA.
> 
> The main point of the discussion is that NXP HW does not have a direct
> analog to this API. As Nikhil explained it, the "quiescing" function
> is assumed to be an application responsibility such that by the time
> an application calls odp_ipsec_sa_destroy() it already knows that the
> SA is inactive.
> 
> When an application is using IPsec in lookaside mode this is a not
> unreasonable requirement, as the application explicitly invokes every
> IPsec operation, so it knows which operations are in flight.

Yes, except that with fragmentation offload combined with IPsec
offload the application cannot easily know when it has received all
the result packets generated by single IPsec operation. If there
were always one result packet per operation, simple SA reference
counting in the application would work.

We discussed this with Petri when designing the API and considered
various options but concluded that the disable completion event
is a simple solution with low overhead when an SA is not being
disabled. And since event queues are a central concept of ODP and
of its asynchronous APIs, informing the disable completion through
an event seems natural.

Other options that do not work that well:

- Application inspects the result fragments to see when it has got
  them all. This would require almost the same work as fragment
  reassembly and it would introduce state that would have to be
  synchronized between multiple threads (since the result packets
  may get scheduled to different threads).

- ODP implementation marks the last result fragment with a special
  bit. This may be nasty for some HW implementations that
  parallelize the processing and do not know which of the fragments
  gets queued last. If the application reference counts the
  SAs (as it most likely has to do to safely destroy the SA and
  safely delete its own state for the SA), it can unref an SA
  when it sees the marked "last" result fragment, but only after
  it has processed the other fragments. This imposes synchronization
  overhead (e.g. an ordered lock if the SA queue is ordered and
  scheduled) on the normal processing path when the SA is not
  being deleted.

- Variations of the above (e.g. returning the total number of result
  fragments for one IPsec operation in all the results) have
  similar problems regarding parallelization in the implementation
  and synchronization need in the application.

The disable completion event also requires synchronization (e.g.
in the form of an ordered lock), but only when the SA is being deleted.

> 
> For inline processing, however, it's less clear what the application
> can do since packets may be in process that the application has yet to
> see. If an application calls odp_ipsec_sa_destroy() the implementation
> can pick a point after which all packets matching that SA are
> ignored/dropped, but the issue is what about packets that have been
> processed and are sitting in event queues that the application is not
> yet aware of?
> 
> If an application receives an event that was processed under a
> destroyed SA, the concern is that it not segfault or experience some
> other anomaly trying to access any application context that it had
> associated with that now-defunct SA.
> 
> The solution proposed in PR #109[1] is that odp_ipsec_sa_disable() be
> allowed to complete synchronously rather than asynchronously, which
> means that the NXP implementation can treat it as an effective NOP
> since it doesn't map to any HW function. While this solves one
> problem, there is still the problem of how to deal with events that
> are already enqueued at the time of this call. The intent of
> odp_ipsec_sa_disable() is that after the application is assured that
> the SA is disabled it may safely call odp_ipsec_sa_destroy() with
> assurance that it doesn't have to deal with expired SA handles.

Yes, but not only that. The disable 

Re: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be reported synchronously.

2017-08-16 Thread Peltonen, Janne (Nokia - FI/Espoo)
I think we are pretty much on the same page about the sync issues. It is
then just an API design choice whether ODP always sends a disable
completion event in async and inline mode (current API) or whether it is
left to the application if it needs it (the patch). I was preferring the
former, but maybe it is just me.

> So it's a simple matter to drain those events if needed before calling 
> destroy.

I think an event (enqueued either by ODP or by the application after
disabling an SA) is needed to do the draining safely. Maybe a timing-based
approach would do too, with reasonable assumptions about the ODP
implementation.

> If the application took events off the queue and was still doing something 
> else with them
> then it's an application responsibility to know when it's safe to issue the 
> destroy call for the SA.

Yes. One way to do it is to configure the SA queue as atomic; another is
to use an ordered queue and order the disable completion event handling
with respect to other SA events using an ordered lock.

 Janne


From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
Sent: Tuesday, August 15, 2017 3:34 PM
To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
Cc: Github ODP bot <odp...@yandex.ru>; lng-odp@lists.linaro.org
Subject: Re: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be 
reported synchronously.



On Mon, Aug 14, 2017 at 11:42 PM, Peltonen, Janne (Nokia - FI/Espoo) 
<janne.pelto...@nokia.com<mailto:janne.pelto...@nokia.com>> wrote:
> The definition of odp_ipsec_sa_disable() is that RC = 0 means the SA is now 
> guaranteed idle

It means that no further events for that SA will be posted to the SA queue.

> and may be safely destroyed,

Only from ODP point-of-view. The application may still have IPsec subtype 
packets for the SA
in flight in other threads.

True. Applications must always know this themselves. If an application
dequeues an event from an SA queue and re-enqueues it elsewhere,
presumably it's the application's responsibility to track that event as
well.


> so there would be no other events to dequeue.

There can still be unhandled events in the event queues and outside event 
queues in other threads.

My point in my comments is that the application needs to synchronize
between regular IPsec completion event handling and destroying an SA, and
for that an "end marker" event in the SA queue would be quite convenient,
or even necessary to avoid more costly synchronization.

Completion of the disable operation means that the SA is "idle" from an ODP 
standpoint. As you noted, it's always up to the application to say when it is 
idle from the application's standpoint. That's why disable is different from 
destroy. A disabled SA is still valid, it will just not be used by ODP. 
Applications can still reference (but not initiate work on) disabled SAs.


Let’s consider IPsec packet reception in inline mode as an example:

As long as an SA is active (not disabled), incoming packets can match it
and end up as IPsec packet events in the SA queue. Thus when an
application disables the SA, there can be unhandled events for that SA in
the event queues and/or under processing in other threads. If the thread
that disabled the SA immediately destroyed the SA, then event handling
would attempt to use a destroyed SA (e.g. it would call
odp_ipsec_sa_context() for an SA that was already destroyed), resulting in
undefined behavior.

This is a good example. Upon return from odp_ipsec_sa_disable() RC 0 guarantees 
that ODP will make no further reference to this SA. So no additional packets 
will match it. But it's still up to the application to decide when to destroy 
the SA. As far as events go, if an SA event queue is used then RC = 0 says that 
queue is "idle", not empty, which means that anything ODP wanted to add to that 
queue has been added and the application is guaranteed that no further events 
will be added by ODP. So it's a simple matter to drain those events if needed 
before calling destroy.

Note that the async completion means exactly the same; it's just that its
receipt also tells the application that the queue is empty. But again,
that's only a statement about this queue. If the application took events
off the queue and was still doing something else with them, then it's an
application responsibility to know when it's safe to issue the destroy
call for the SA.


If ODP sent an SA disable completion event to the SA queue, or if the
application did it itself after the proposed synchronous SA disable
completion, then it would be able to postpone destroying the SA until all
events for the SA have been fully handled.

 Janne


From: Bill Fischofer 
[mailto:bill.fischo...@linaro.org<mailto:bill.fischo...@linaro.org>]
Sent: Monday, August 14, 2017 9:19 PM
To: Peltonen, Janne (Nokia - FI/Espoo) 
<

Re: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be reported synchronously.

2017-08-14 Thread Peltonen, Janne (Nokia - FI/Espoo)
> The definition of odp_ipsec_sa_disable() is that RC = 0 means the SA is now 
> guaranteed idle

It means that no further events for that SA will be posted to the SA queue.

> and may be safely destroyed,

Only from ODP point-of-view. The application may still have IPsec subtype 
packets for the SA
in flight in other threads.

> so there would be no other events to dequeue.

There can still be unhandled events in the event queues and outside event 
queues in other threads.

My point in my comments is that the application needs to synchronize
between regular IPsec completion event handling and destroying an SA, and
for that an "end marker" event in the SA queue would be quite convenient,
or even necessary to avoid more costly synchronization.

Let’s consider IPsec packet reception in inline mode as an example:

As long as an SA is active (not disabled), incoming packets can match it
and end up as IPsec packet events in the SA queue. Thus when an
application disables the SA, there can be unhandled events for that SA in
the event queues and/or under processing in other threads. If the thread
that disabled the SA immediately destroyed the SA, then event handling
would attempt to use a destroyed SA (e.g. it would call
odp_ipsec_sa_context() for an SA that was already destroyed), resulting in
undefined behavior.

If ODP sent an SA disable completion event to the SA queue, or if the
application did it itself after the proposed synchronous SA disable
completion, then it would be able to postpone destroying the SA until all
events for the SA have been fully handled.

 Janne


From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
Sent: Monday, August 14, 2017 9:19 PM
To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
Cc: Github ODP bot <odp...@yandex.ru>; lng-odp@lists.linaro.org
Subject: Re: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be 
reported synchronously.



On Mon, Aug 14, 2017 at 9:41 AM, Peltonen, Janne (Nokia - FI/Espoo) 
<janne.pelto...@nokia.com<mailto:janne.pelto...@nokia.com>> wrote:

Hi,

Why is this change needed or preferred over the current API?

I would guess that many applications that use inline or asynchronous
mode need an end marker in the SA queue so that they know when they
can free their own per-SA state and destroy the ODP SA (i.e. when
all events for the SA being disabled have been dequeued and processed).

If odp_ipsec_sa_disable() may return synchronously, then such an
application would need to submit an end marker event to the SA queue
itself. The application cannot create an IPsec status event itself so
it has to define another type of event. But the application still has
to be able to process both the IPsec status event and its own event
in the event handler.

So, based on the above, it does not look like this change would make
the life of an application any easier, but maybe even the opposite.
Does it make ODP implementation significantly easier?

If this change is anyway made, the API should make it clear that after
return with retval 0 the implementation will not enqueue any further
events for that SA to the SA queue.

The definition of odp_ipsec_sa_disable() is that RC = 0 means the SA is now 
guaranteed idle and may be safely destroyed, so there would be no other events 
to dequeue. If the implementation cannot guarantee this then it cannot return 
synchronously, so I don't see any ambiguity here.

Aside to Maxim: the GitHub to mailing list path has been working well but the 
mailing list to GitHub return path is not. Any idea what's needed to enable 
that path?  Alternatively, Janne, you might want to reply via GitHub to help 
keep the discussion in one place along with the PR.


Janne


> -Original Message-
> From: lng-odp 
> [mailto:lng-odp-boun...@lists.linaro.org<mailto:lng-odp-boun...@lists.linaro.org>]
>  On Behalf Of Github ODP bot
> Sent: Thursday, August 10, 2017 9:00 AM
> To: lng-odp@lists.linaro.org<mailto:lng-odp@lists.linaro.org>
> Subject: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be 
> reported
> synchronously.
>
> From: Nikhil Agarwal 
> <nikhil.agar...@linaro.org<mailto:nikhil.agar...@linaro.org>>
>
> IPSEC events may be delivered synchronously or asynchronously
> depending on the implementation. The application will know based on the
> return value of the odp_ipsec_sa_disable() API.
>
> Signed-off-by: Nikhil Agarwal 
> <nikhil.agar...@linaro.org<mailto:nikhil.agar...@linaro.org>>
> ---
> /** Email created from pull request 109 (NikhilA-Linaro:disable_event)
>  ** https://github.com/Linaro/odp/pull/109
>  ** Patch: https://github.com/Linaro/odp/pull/109.patch
>  ** Base sha: e420668cd3886f003c8bd6022e210bf08a0ee3b5
>  ** Merge commit sha: f65d49b122f2f60a7fc9af3e0a09067dcfd369d9
>  **/
>  include/odp/api/spec/ipsec.

Re: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be reported synchronously.

2017-08-14 Thread Peltonen, Janne (Nokia - FI/Espoo)

Hi,

Why is this change needed or preferred over the current API?

I would guess that many applications that use inline or asynchronous
mode need an end marker in the SA queue so that they know when they
can free their own per-SA state and destroy the ODP SA (i.e. when
all events for the SA being disabled have been dequeued and processed).

If odp_ipsec_sa_disable() may return synchronously, then such an
application would need to submit an end marker event to the SA queue
itself. The application cannot create an IPsec status event itself so
it has to define another type of event. But the application still has
to be able to process both the IPsec status event and its own event
in the event handler.

So, based on the above, it does not look like this change would make
the life of an application any easier, but maybe even the opposite.
Does it make ODP implementation significantly easier?

If this change is anyway made, the API should make it clear that after
return with retval 0 the implementation will not enqueue any further
events for that SA to the SA queue.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Github 
> ODP bot
> Sent: Thursday, August 10, 2017 9:00 AM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCH API-NEXT v5 1/1] API:IPSEC: IPSEC events may be 
> reported
> synchronously.
> 
> From: Nikhil Agarwal 
> 
> IPSEC events may be delivered synchronously or asynchronously
> depending on the implementation. The application will know based on the
> return value of the odp_ipsec_sa_disable() API.
> 
> Signed-off-by: Nikhil Agarwal 
> ---
> /** Email created from pull request 109 (NikhilA-Linaro:disable_event)
>  ** https://github.com/Linaro/odp/pull/109
>  ** Patch: https://github.com/Linaro/odp/pull/109.patch
>  ** Base sha: e420668cd3886f003c8bd6022e210bf08a0ee3b5
>  ** Merge commit sha: f65d49b122f2f60a7fc9af3e0a09067dcfd369d9
>  **/
>  include/odp/api/spec/ipsec.h | 17 ++---
>  1 file changed, 10 insertions(+), 7 deletions(-)
> 
> diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> index 7085bc0d..3f02635a 100644
> --- a/include/odp/api/spec/ipsec.h
> +++ b/include/odp/api/spec/ipsec.h
> @@ -831,16 +831,19 @@ odp_ipsec_sa_t odp_ipsec_sa_create(const 
> odp_ipsec_sa_param_t
> *param);
>   * before calling disable. Packets in progress during the call may still 
> match
>   * the SA and be processed successfully.
>   *
> - * When in synchronous operation mode, the call will return when it's 
> possible
> - * to destroy the SA. In asynchronous mode, the same is indicated by an
> - * ODP_EVENT_IPSEC_STATUS event sent to the queue specified for the SA. The
> - * status event is guaranteed to be the last event for the SA, i.e. all
> - * in-progress operations have completed and resulting events (including 
> status
> - * events) have been enqueued before it.
> + * A return value 0 indicates that the disable request has completed
> + * synchronously and the SA is now disabled. A return value 1 indicates that 
> the
> + * disable request has been accepted and completion will be indicated by an
> + * ODP_EVENT_IPSEC_STATUS sent to the queue specified for the SA. This event 
> is
> + * guaranteed to be the last event for the SA, i.e., all in-progress 
> operations
> + * have completed and resulting events (including status events) have been
> + * enqueued before it. In synchronous mode of operation, disable requests are
> + * guaranteed to complete synchronously as there is no queue associated with
> + * the SA.
>   *
>   * @param sa  IPSEC SA to be disabled
>   *
> - * @retval 0  On success
> + * @retval 0  When SA is disabled successfully.
> + * @retval 1  Disable event will be posted on SA queue.
>   * @retval <0 On failure
>   *
>   * @see odp_ipsec_sa_destroy()



Re: [lng-odp] [PATCH API-NEXT v1 1/1] api:pktio: Adds MTU set function.

2017-08-14 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

The API patch should make it more clear what MTU means in the
context of ODP packet I/O.

I think it should mean something close to this:

ODP must be able to send and receive packets up to the MTU size
successfully. Packets larger than the MTU may get dropped or may
get received successfully (i.e. setting the MTU does not guarantee
that larger packets get dropped by ODP). Trying to send packets
larger than the MTU results in undefined behavior (or maybe that is
too harsh and it should instead be that such packets may get dropped?).

The transmit behavior with respect to MTU in case of IPsec inline
output probably requires clarification.

An exact definition of which packet (or L2 frame) bytes get counted
towards the MTU is also needed (e.g. is Ethernet FCS counted? Or
VLAN tag?)

The API should document what the default MTU value is before
the API user calls odp_pktio_mtu_set().

> >> +   /** Allow app to set MTU size */
> >> +   uint32_t mtu_set : 1;

It is not clear why this would be needed. If the underlying
implementation has a fixed MTU, then at API level ODP could
still allow the MTU to be set to any lower value. It just would
not affect anything. 

> sure, will add min_mtu_size field in odp_pktio_capability_t.

Similarly, it is not clear why min MTU is needed. If an application
says MTU of 1 byte is enough, the ODP implementation could just
accept it, even if some underlying HW MTU would be left to some
larger value.

If the MTU setting guaranteed that all packets bigger than the
MTU would be accurately dropped, then the API would probably
need the min MTU. But I do not see a real use case for such
accurate packet-length-based filtering, and it could be a bit
of a pain for some implementations or some pktio types.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of vamsi a
> Sent: Friday, August 11, 2017 8:22 AM
> To: Bill Fischofer 
> Cc: Mahipal Challa ; vattun...@cavium.com; Shally Verma
> ; lng-odp-forward 
> Subject: Re: [lng-odp] [PATCH API-NEXT v1 1/1] api:pktio: Adds MTU set 
> function.
> 
> On Tue, Aug 8, 2017 at 5:08 PM, Bill Fischofer 
> wrote:
> 
> >
> >
> > On Tue, Aug 8, 2017 at 2:38 AM, Vamsi Attunuru 
> > wrote:
> >
> >> Signed-off-by: Vamsi Attunuru 
> >> Signed-off-by: Shally Verma 
> >> Signed-off-by: Mahipal Challa 
> >>
> >> ---
> >>  include/odp/api/spec/packet_io.h | 21 +
> >>  1 file changed, 21 insertions(+)
> >>
> >> diff --git a/include/odp/api/spec/packet_io.h
> >> b/include/odp/api/spec/packet_io.h
> >> index d42cebf..be81c3d 100644
> >> --- a/include/odp/api/spec/packet_io.h
> >> +++ b/include/odp/api/spec/packet_io.h
> >> @@ -452,6 +452,9 @@ typedef union odp_pktio_set_op_t {
> >> struct {
> >> /** Promiscuous mode */
> >> uint32_t promisc_mode : 1;
> >> +
> >> +   /** Allow app to set MTU size */
> >> +   uint32_t mtu_set : 1;
> >> } op;
> >> /** All bits of the bit field structure.
> >>   * This field can be used to set/clear all flags, or bitwise
> >> @@ -480,6 +483,9 @@ typedef struct odp_pktio_capability_t {
> >>
> >> /** @deprecated Use enable_loop inside odp_pktin_config_t */
> >> odp_bool_t ODP_DEPRECATE(loop_supported);
> >> +
> >> +   /** Maximum MTU size supported */
> >> +   uint32_t max_mtu_size;
> >>  } odp_pktio_capability_t;
> >>
> >>  /**
> >> @@ -910,6 +916,21 @@ int odp_pktout_send(odp_pktout_queue_t queue, const
> >> odp_packet_t packets[],
> >>  uint32_t odp_pktio_mtu(odp_pktio_t pktio);
> >>
> >>  /**
> >> + * Set MTU value of a packet IO interface.
> >> + *
> >> + * Application should pass a value up to max_mtu_size as indicated by
> >> + * odp_pktio_capability_t:max_mtu_size. Any value beyond the max_mtu_size
> >> + * limit will result in failure. An mtu value < 68 also results in failure.
> >>
> >
> > Seems like it would be better to have an explicit min_mtu_size in the
> > odp_pktio_capability() than this arbitrary note buried as a comment.
> >
> 
> sure, will add min_mtu_size field in odp_pktio_capability_t.
> 
> From the last discussion, odp_pktio_config() api was suggested instead of
> set_mtu() api, if my understanding is correct, Is the following sequence
> allowed to occur multiple times like stop pktio interface, call
> pktio_config() and start pktio interface.
> 
> >
> >
> >> + *
> >> + * @param pktio  Packet IO handle.
> >> + * @param mtuMTU value to be set.
> >> + *
> >> + * @return  0 on success
> >> + * @retval <0 on failure
> >> + */
> >> +int odp_pktio_mtu_set(odp_pktio_t pktio, uint32_t mtu);
> >> +
> >> +/**
> >>   * Enable/Disable promiscuous mode on a packet IO interface.
> >>   *
> >>   * @param[in] pktioPacket IO 

Re: [lng-odp] [EXT] Re: ODP1.15 buffer alignment issue

2017-08-14 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

The alignment must be for the buffer start address that is
returned by odp_buffer_addr(), since that is the only memory
address visible in the buffer API.

The internal representation of the buffer and the associated
metadata is not visible to the application through the API
so the alignment of those is up to the implementation.

odp_buffer_t is an opaque handle. Therefore it does not
make sense for an application to request any specific
"alignment" for it. In some implementations the handle may
actually be a pointer to an implementation specific internal
buffer structure, but in that case its alignment is not
an API issue.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of shally 
> verma
> Sent: Friday, August 11, 2017 8:05 AM
> To: Maxim Uvarov 
> Cc: Elo, Matias (Nokia - FI/Espoo) ; Petri Savolainen
> ; lng-odp@lists.linaro.org
> Subject: Re: [lng-odp] [EXT] Re: ODP1.15 buffer alignment issue
> 
> just to add: for some implementations, Buffer == data itself (without any
> metadata or headroom). So while you work on this, please get us
> clarity for such implementations as well.
> 
> On Fri, Aug 11, 2017 at 12:36 AM, Maxim Uvarov  
> wrote:
> > On 08/10/17 21:41, Liron Himi wrote:
> >> Hi,
> >>
> >> I think it is more than just naming replacement.
> >> The odp-pool API state that the alignment is for buffer alignment and not 
> >> for data. So
> it seems like a bug, right?
> >>
> >> Here is a snipped from the API:
> >>   /** Minimum buffer alignment in bytes. Valid values 
> >> are
> >>   powers of two. Use 0 for default alignment.
> >>   Default will always be a multiple of 8. */
> >>   uint32_t align;
> >>   } buf;
> >>
> >> Regards,
> >> Liron
> >>
> >
> >
> > Ok, in my understanding this value comes from:
> >
> > typedef struct odp_pool_capability_t {
> > .
> >
> > /** Buffer pool capabilities  */
> > struct {
> > /** Maximum buffer data alignment in bytes */
> > uint32_t max_align;
> >
> >
> > So it has to be "buffer data alignment". Please give me some time to
> > double check with community that we all understand this api definition
> > right.
> >
> > Thank you,
> > Maxim.
> >
> >
> >> -Original Message-
> >> From: Maxim Uvarov [mailto:maxim.uva...@linaro.org]
> >> Sent: Thursday, August 10, 2017 19:45
> >> To: Liron Himi ; lng-odp@lists.linaro.org
> >> Cc: Petri Savolainen ; Elo, Matias (Nokia - 
> >> FI/Espoo)
> 
> >> Subject: [EXT] Re: [lng-odp] ODP1.15 buffer alignment issue
> >>
> >> External Email
> >>
> >> --
> >> On 08/08/17 09:49, Liron Himi wrote:
> >>> Hi,
> >>>
> >>> See comments inline
> >>>
> >>> Regards,
> >>> Liron
> >>>
> >>> -Original Message-
> >>> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> >>> Maxim Uvarov
> >>> Sent: Monday, August 07, 2017 23:21
> >>> To: lng-odp@lists.linaro.org
> >>> Subject: Re: [lng-odp] ODP1.15 buffer alignment issue
> >>>
> >>> it's implementation specific.
> >>>
> >>> Full code is:
> >>>
> >>> +   offset = pool->headroom;
> >>> +
> >>> +   /* move to correct align */
> >>> +   while (((uintptr_t)&data[offset]) % pool->align != 0)
> >>> +   offset++;
> >>> 
> >>> +
> >>> +   /* Pointer to data start (of the first segment) */
> >>> +   buf_hdr->addr[0] = &data[offset];
> >>> +   /* Store buffer into the global pool */
> >>> +   ring_enq(ring, mask, (uint32_t)(uintptr_t)buf_hdl);
> >>>
> >>> If I understood idea right there should be odp specific packet header 
> >>> with odp fields
> which is needed to implement api but missing in hardware buffer field. Then 
> hw buffer
> which is aligned by default with pointer to data. Everything depends on 
> implementation and
> it's not mandatory.
> >>> [L.H.] As part of the linux-generic there is a 
> >>> 'ODP_CONFIG_BUFFER_ALIGN_MIN' which
> should be used for the buffer alignment.
> >>> However, the code above use this value for the data section alignment in 
> >>> the buffer
> and not the buffer address alignment.
> >>> On ODP1.11 the 'ODP_CONFIG_BUFFER_ALIGN_MIN' was used correctly for 
> >>> buffer alignment.
> >>> In order to continue we need to understand what was the real intention 
> >>> here:
> >>> 1. To make sure the data section is aligned?
> >>> 2. To make sure the buffer address is aligned? If so, there is a bug in 
> >>> this code. A
> possible fix is as follow:
> >>>  -   offset = pool->headroom;
> >>>  +   offset = 0;
> >>>
> >>>  /* move to correct 

Re: [lng-odp] [API-NEXT PATCH v1 1/1] pktio APIs to set the MAC address and MTU size.

2017-07-14 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

I think MAC addresses and MTU are a bit different.

There are valid use cases for application wanting to set the MAC address, even 
dynamically. For instance, with VRRP one wants to receive packets destined to a 
VRRP MAC address in addition to the fixed MAC address of a port. The 
possibility of configuring multiple MAC addresses (for the purpose of RX frame 
filtering) to the same port and changing them dynamically would be useful to 
some apps.

The physical MTU of a port is more like a capability, the maximum that the
port can do. The ODP application can then use any frame sizes it wants, up
to the maximum size, without configuring anything to ODP; e.g. IP-layer
MTUs can be maintained by an IP stack application while the physical MTU
at ODP level stays at the fixed maximum. An application does not need to
tell ODP that the maximum frame size should be limited, but an ODP
implementation might conceivably wish to get such information from the
application or from elsewhere if supporting big frames (i.e. enabling
jumbo frames) involves some costs that the ODP implementation would rather
avoid if the application does not really need big frames.

Even if MTU configuration is not added in ODP API, it is useful to be able to 
query the maximum physical MTU of a port as that can vary also among HW that 
supports pretty big frame sizes. The meaning of the returned MTU value (i.e. 
which bytes are counted) should be clarified.

 Janne


From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
Sent: Friday, July 14, 2017 5:39 PM
To: Bogdan Pricope <bogdan.pric...@linaro.org>; Petri Savolainen 
<petri.savolai...@linaro.org>
Cc: Narayana Prasad Athreya <pathr...@caviumnetworks.com>; Peltonen, Janne 
(Nokia - FI/Espoo) <janne.pelto...@nokia.com>; mcha...@cavium.com; 
lng-odp@lists.linaro.org; pathr...@cavium.com; vattun...@cavium.com; 
sve...@cavium.com
Subject: Re: [lng-odp] [API-NEXT PATCH v1 1/1] pktio APIs to set the MAC 
address and MTU size.



On Fri, Jul 14, 2017 at 9:11 AM, Bogdan Pricope 
<bogdan.pric...@linaro.org<mailto:bogdan.pric...@linaro.org>> wrote:
Hi,

I think we need both MTU and MAC address to be settable from ODP.

In OFP, at some point I was considering using tap pktio instead of ofp
tap interface for slow path processing - not being able to set MAC
address was A problem.
What about DHCP by MAC address? Maybe there are systems where
predefined MAC address is a requirement... don't know...

MTU: what if default value is not 9K but something else?

I'll add this to the agenda for Monday's ARCH call. If Petri (+cc) wants to 
chime in before he leaves for vacation we'll take that input as well.


BR,
Bogdan


On 14 July 2017 at 16:41, Narayana Prasad Athreya
<pathr...@caviumnetworks.com<mailto:pathr...@caviumnetworks.com>> wrote:
> Hi Bill
> The reasons below don't jibe with what ODP does today: the routines
> odp_pktio_mtu() and odp_pktio_mac_addr() exist today, and they assume that MTU
> can be configured and returned to the datapath. As an application developer,
> it does not make sense for me to use 2 different paths to talk to the same
> underlying device for the same setting - one path for configuring and one
> for reading the configuration.
>
> So from this standpoint the ODP should either support both get and set
> routines or not support either of them.
>
> On the control-plane vs data-plane debate, again as an application
> developer, my integrated application will need access to the underlying
> device for both control and data. If I am forced to write separate code and
> potentially drivers for 2 paths, where is my incentive to use ODP API,
> because anyway i'm losing portability.
>
> Thanks
> Prasad
>
> On Friday 14 July 2017 06:33 PM, Bill Fischofer wrote:
>>
>>
>>
>> On Fri, Jul 14, 2017 at 6:05 AM, Peltonen, Janne (Nokia - FI/Espoo)
>> <janne.pelto...@nokia.com<mailto:janne.pelto...@nokia.com> 
>> <mailto:janne.pelto...@nokia.com<mailto:janne.pelto...@nokia.com>>> wrote:
>>
>> Hi,
>>
>> ODP API should somewhere define what exactly MTU means in the
>> context of ODP.
>>
>> One can guess that transmission and reception of L2 frames larger
>> than the
>> configured MTU is not guaranteed to succeed, but which bytes are
>> taken into
>> account? For instance, is Ethernet FCS counted towards the MTU?
>>
>>
>> We've had this discussion before. The whole concept of MTU is a bit of an
>> anachronism from the early days of Ethernet. At speeds ODP is designed for
>> (10Gb and above) all equipment should be running with jumbo frames (MTU =
>> 9K). All equipment capable of running at these speeds is capable of this
>> size, and 9K is the reco

Re: [lng-odp] [API-NEXT PATCH v1 1/1] pktio APIs to set the MAC address and MTU size.

2017-07-14 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

ODP API should somewhere define what exactly MTU means in the context of ODP. 

One can guess that transmission and reception of L2 frames larger than the
configured MTU is not guaranteed to succeed, but which bytes are taken into
account? For instance, is Ethernet FCS counted towards the MTU?

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Vamsi 
> Attunuru
> Sent: Friday, July 14, 2017 1:05 PM
> To: lng-odp@lists.linaro.org
> Cc: mcha...@cavium.com; pathr...@cavium.com; vattun...@cavium.com; 
> sve...@cavium.com
> Subject: [lng-odp] [API-NEXT PATCH v1 1/1] pktio APIs to set the MAC address 
> and MTU size.
> 
> Adds new pktio APIs to set MTU and MAC address on pktio interface.
> 
> Signed-off-by: Vamsi Attunuru 
> Signed-off-by: Mahipal Challa 
> Signed-off-by: Shally Verma   
> 
> ---
>  include/odp/api/spec/packet_io.h | 45 
> 
>  1 file changed, 45 insertions(+)
> 
> diff --git a/include/odp/api/spec/packet_io.h 
> b/include/odp/api/spec/packet_io.h
> index 8802089..1269f44 100644
> --- a/include/odp/api/spec/packet_io.h
> +++ b/include/odp/api/spec/packet_io.h
> @@ -451,6 +451,10 @@ typedef union odp_pktio_set_op_t {
>   struct {
>   /** Promiscuous mode */
>   uint32_t promisc_mode : 1;
> + /** MAC address */
> + uint32_t mac : 1;
> + /** MTU size */
> + uint32_t mtu : 1;
>   } op;
>   /** All bits of the bit field structure.
> * This field can be used to set/clear all flags, or bitwise
> @@ -482,6 +486,12 @@ typedef struct odp_pktio_capability_t {
>* A boolean to denote whether loop back mode is supported on this
>* specific interface. */
>   odp_bool_t loop_supported;
> +
> + /** Maximum MTU size supported */
> + uint32_t max_mtu_size;
> +
> + /** Length of MAC address supported on this specific interface */
> + uint32_t mac_addr_len;
>  } odp_pktio_capability_t;
> 
>  /**
> @@ -912,6 +922,21 @@ int odp_pktout_send(odp_pktout_queue_t queue, const 
> odp_packet_t
> packets[],
>  uint32_t odp_pktio_mtu(odp_pktio_t pktio);
> 
>  /**
> + * Set MTU value of a packet IO interface.
> + *
> + * Application should pass a value up to max_mtu_size as indicated by
> + * odp_pktio_capability_t:max_mtu_size. Any value beyond max_mtu_size
> + * limit will result in failure.
> + *
> + * @param pktio  Packet IO handle.
> + * @param mtuMTU value to be set.
> + *
> + * @return  0 on success
> + * @retval <0 on failure
> + */
> +int odp_pktio_mtu_set(odp_pktio_t pktio, uint32_t mtu);
> +
> +/**
>   * Enable/Disable promiscuous mode on a packet IO interface.
>   *
>   * @param[in] pktio  Packet IO handle.
> @@ -946,6 +971,26 @@ int odp_pktio_promisc_mode(odp_pktio_t pktio);
>  int odp_pktio_mac_addr(odp_pktio_t pktio, void *mac_addr, int size);
> 
>  /**
> + * Set the MAC address of a packet IO interface.
> + *
> + * Application should pass a mac_addr buffer with size >=
> + * odp_pktio_capability_t:mac_addr_len; a size smaller than what the
> + * implementation supports will result in API failure.
> + *
> + * On success, the implementation reads mac_addr buffer bytes up
> + * to the mac_addr_len value indicated in the capability information.
> + *
> + * @param pktioPacket IO handle
> + * @param mac_addr Pointer to MAC address to be set
> + * @param size Size of MAC address buffer
> + *
> + * @return  0 on success
> + * @retval <0 on failure
> + */
> +int odp_pktio_mac_addr_set(odp_pktio_t pktio, const uint8_t *mac_addr,
> +   int size);
> +
> +/**
>   * Setup per-port default class-of-service.
>   *
>   * @param[in]pktio   Ingress port pktio handle.
> --
> 1.9.3



Re: [lng-odp] [API-NEXT PATCH 1/9] api: ipsec: add salt parameter

2017-07-12 Thread Peltonen, Janne (Nokia - FI/Espoo)
Dmitry Eremin-Solenikov wrote:
> On 11.07.2017 15:31, Petri Savolainen wrote:
> > Added a parameter for passing salt for AES GCM. Currently,
> > only option for length is 4 bytes, but later on other algorithms
> > may need more/less salt data.
> >
> > Signed-off-by: Petri Savolainen 
> > ---
> >  include/odp/api/spec/ipsec.h | 10 ++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> > index e602e4b8..15dbb164 100644
> > --- a/include/odp/api/spec/ipsec.h
> > +++ b/include/odp/api/spec/ipsec.h
> > @@ -384,6 +384,16 @@ typedef struct odp_ipsec_crypto_param_t {
> > /** Authentication key */
> > odp_crypto_key_t auth_key;
> >
> > +   /** Salt for SA's that use AES_GCM algorithm. Other algorithms ignore
> > +*  these fields. */
> 
> I'd suggest more generic wording:
> - Additional static IV data used as salt or nonce for block IV
> computation if algorithm/block mode requires it. For example, GCM
> requires 4 bytes salt, CCM requires 3 bytes salt, CTR mode requires 4
> bytes nonce.

I do not think that would be very good either since this is not for
computing ESP IV but GCM/CCM IV which is called nonce in the context
of ESP. IOW, I think the description should not refer to IV.

Maybe it should just be called 'additional keying material' and maybe the
structure name should also be changed from salt to something more generic.
An alternative would be to require that much more bytes in the key field(s)
but that would sort of change the meaning of odp_crypto_key_t when used in
the context of IPsec (the same algorithm would require different key lengths
depending on whether it is used through the IPsec API or the crypto API).

Janne

> 
> > +   struct {
> > +   /** Pointer to salt data. */
> > +   const uint8_t *ptr;
> > +
> > +   /** Salt length. Valid value for AES_GCM is 4. */
> > +   uint32_t len;
> > +   } salt;
> > +
> >  } odp_ipsec_crypto_param_t;
> >
> >  /**
> >
> 
> 
> --
> With best wishes
> Dmitry
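The salt under discussion is the static part of the GCM nonce. As background (RFC 4106 layout; the constants are standard, the function name is hypothetical and not ODP API), a minimal sketch of how the 4-byte SA salt combines with the 8-byte per-packet ESP IV to form the 12-byte nonce fed to AES-GCM:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define GCM_SALT_LEN  4u
#define ESP_IV_LEN    8u
#define GCM_NONCE_LEN (GCM_SALT_LEN + ESP_IV_LEN)

/* Build the AES-GCM nonce: static salt from the SA followed by the
 * per-packet IV carried in the ESP packet. */
void gcm_nonce_build(uint8_t nonce[GCM_NONCE_LEN],
                     const uint8_t salt[GCM_SALT_LEN],
                     const uint8_t iv[ESP_IV_LEN])
{
    memcpy(nonce, salt, GCM_SALT_LEN);            /* static part, from SA */
    memcpy(nonce + GCM_SALT_LEN, iv, ESP_IV_LEN); /* per-packet part */
}
```

This is why the salt is keying material tied to the SA rather than part of the per-packet IV, which matches the objection to describing it as IV data.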


Re: [lng-odp] [API-NEXTv2] api: ipsec: reorganize odp_ipsec_sa_param_t structure based on SA direction

2017-06-30 Thread Peltonen, Janne (Nokia - FI/Espoo)
> >>   /** Initial sequence number */
> >>   uint64_t seq;
> >
> > This may be moved to outbound struct.
> 
> I thought the same.. yet, some say is needed on inbound as well ("to
> know from where to start expecting frames") a.k.a. antireplay
> mechanism.

It is not needed in inbound since the antireplay window will anyway
slide to the correct position when the first packet is received.

Actually, this field is probably not needed in outbound either since
the ESP RFC says (but maybe does not strictly require it?) that
the first ESP packet sent on an SA will have sequence number 1.

So I suppose the initial sequence number field should just be removed
from the param struct.

Janne
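The inbound argument can be seen in a minimal sliding anti-replay window (a sketch, not the ODP or RFC 4303 implementation): the window position is derived from the first accepted packet, whatever its sequence number is, so no configured initial sequence number is needed.

```c
#include <assert.h>
#include <stdint.h>

#define REPLAY_WIN 64u

typedef struct {
    uint64_t top;    /* highest sequence number accepted so far */
    uint64_t bitmap; /* bit i set => (top - i) already seen */
} replay_state_t;

/* Returns 1 if the packet is accepted, 0 if it is a replay or too old.
 * Note there is no "initial" sequence number: the first accepted
 * packet slides the window to the right position by itself. */
int replay_check_update(replay_state_t *st, uint64_t seq)
{
    if (seq == 0)
        return 0; /* ESP sequence numbers start from 1 */
    if (seq > st->top) {
        uint64_t shift = seq - st->top;

        st->bitmap = (shift >= 64) ? 0 : st->bitmap << shift;
        st->bitmap |= 1;
        st->top = seq;
        return 1;
    }
    uint64_t off = st->top - seq;

    if (off >= REPLAY_WIN)
        return 0; /* older than the window */
    if (st->bitmap & (1ull << off))
        return 0; /* replayed */
    st->bitmap |= 1ull << off;
    return 1;
}
```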




Re: [lng-odp] [API-NEXT PATCH] linux-gen: queue: clean up after modular interface

2017-06-16 Thread Peltonen, Janne (Nokia - FI/Espoo)

Honnappa Nagarahalli wrote:
> On 12 June 2017 at 06:11, Petri Savolainen  
> wrote:
> > Clean up function and parameter naming after modular interface
> > patch. Queue_t type is referred as "queue internal": queue_int or
> > q_int. Term "handle" is reserved for API level handles (e.g.
> > odp_queue_t, odp_pktio_t, etc) throughout the linux-gen implementation.
> >
> 
> "queue_t" type should be referred to as "handle_int". "handle_int" is
> clearly different from "handle".
> If we look at the definition of "queue_t":
> 
> typedef struct { char dummy; } _queue_t;
> typedef _queue_t *queue_t;
> 
> it is nothing but a definition of a handle. Why should it be called
> with some other name and create confusion? Just like how odp_queue_t
> is an abstract type, queue_t is also an abstract type. Just like how
> odp_queue_t is a handle, queue_t is also a handle, albeit a handle
> towards internal components.

I do not see how calling variables of type queue_t handles instead
of queue_int or q_int makes the call any clearer or less confusing.
If the term handle is reserved for ODP API level handles, then I
suppose this code should adhere to that. And 'handle_int' is not
very descriptive as a variable name anyway.

> > +static inline queue_entry_t *handle_to_qentry(odp_queue_t handle)
> 
> Why is there a need to add this function? We already have
> 'queue_from_ext' and 'qentry_from_int' which are a must to implement.
> The functionality provided by 'handle_to_qentry' can be achieved from
> these two functions. 'handle_to_qentry' is adding another layer of
> wrapper. This adds to code complexity.

There is a need to convert from handle to queue entry in quite many
places in the code. Having a function for that makes perfect sense
since it reduces code duplication and simplifies all the call sites
that no longer need to know how the conversion is done.

This is also how the code was before you changed it (unnecessarily,
one might think), so this change merely brings back the old code
structure (with a naming change).

> >  static odp_queue_type_t queue_type(odp_queue_t handle)
> >  {
> > -   return qentry_from_int(queue_from_ext(handle))->s.type;
> > +   return handle_to_qentry(handle)->s.type;
> 
> No need to introduce another function.
> qentry_from_int(queue_from_ext(handle)) clearly shows the current
> status that there exists an internal handle. handle_to_qentry(handle)
> hides that fact and makes code less readable. This comment applies to
> all the instances of similar change.

Hiding is good. Only handle_to_qentry() needs to know that the
conversion is (currently) done through queue_t. I would argue that
the code is more readable with handle_to_qentry() than with having
to read the same conversion code all the time. An if the code ever
changes so that the conversion is better done in another way, having
handle_to_qentry() avoids the need to change all its call sites.

Janne
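The point about hiding the conversion can be sketched with stub types (assumptions standing in for the real linux-generic definitions, not the actual ODP code): call sites go through one helper, so only the helper knows that the handle-to-entry mapping currently passes through queue_t.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t odp_queue_t;           /* stub API-level handle */
typedef struct { char dummy; } _queue_t;
typedef _queue_t *queue_t;               /* stub internal handle */
typedef struct { int type; } queue_entry_t;

queue_entry_t table[4];

queue_t queue_from_ext(odp_queue_t handle)
{
    return (queue_t)handle;
}

queue_entry_t *qentry_from_int(queue_t q_int)
{
    return (queue_entry_t *)(void *)q_int;
}

/* The single conversion point: if the mapping ever changes, only this
 * helper changes, not every call site that needs a queue entry. */
queue_entry_t *handle_to_qentry(odp_queue_t handle)
{
    return qentry_from_int(queue_from_ext(handle));
}
```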




Re: [lng-odp] Suspected SPAM - Re: [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-02 Thread Peltonen, Janne (Nokia - FI/Espoo)


> -----Original Message-----
> From: Ola Liljedahl [mailto:ola.liljed...@arm.com]
> Sent: Friday, June 02, 2017 1:54 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>; 
> Savolainen, Petri
> (Nokia - FI/Espoo) <petri.savolai...@nokia.com>; Honnappa Nagarahalli
> <honnappa.nagaraha...@linaro.org>
> Cc: Elo, Matias (Nokia - FI/Espoo) <matias@nokia.com>; nd <n...@arm.com>; 
> Kevin Wang
> <kevin.w...@arm.com>; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>; 
> lng-
> o...@lists.linaro.org
> Subject: Re: Suspected SPAM - Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add 
> scalable scheduler
> 
> 
> >
> >
> >> -----Original Message-----
> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> >>Savolainen, Petri
> >> (Nokia - FI/Espoo)
> >> Sent: Friday, June 02, 2017 1:18 PM
> >> To: Honnappa Nagarahalli <honnappa.nagaraha...@linaro.org>; Ola
> >>Liljedahl
> >> <ola.liljed...@arm.com>
> >> Cc: Elo, Matias (Nokia - FI/Espoo) <matias@nokia.com>; nd
> >><n...@arm.com>; Kevin Wang
> >> <kevin.w...@arm.com>; Honnappa Nagarahalli
> >><honnappa.nagaraha...@arm.com>; lng-
> >> o...@lists.linaro.org
> >> Subject: Suspected SPAM - Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add
> >>scalable scheduler
> >>
> >>
> >>
> >> > -----Original Message-----
> >> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> >> > Honnappa Nagarahalli
> >> > Sent: Thursday, June 01, 2017 11:30 PM
> >> > To: Ola Liljedahl <ola.liljed...@arm.com>
> >> > Cc: lng-odp@lists.linaro.org; Honnappa Nagarahalli
> >> > <honnappa.nagaraha...@arm.com>; Elo, Matias (Nokia - FI/Espoo)
> >> > <matias@nokia.com>; Kevin Wang <kevin.w...@arm.com>; nd
> >><n...@arm.com>
> >> > Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> >> >
> >> > On 1 June 2017 at 15:20, Ola Liljedahl <ola.liljed...@arm.com> wrote:
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
> >> > > <honnappa.nagaraha...@linaro.org> wrote:
> >> > >
> >> > >>On 1 June 2017 at 15:09, Ola Liljedahl <ola.liljed...@arm.com>
> >>wrote:
> >> > >>>
> >> > >>>
> >> > >>> On 01/06/2017, 21:03, "Bill Fischofer" <bill.fischo...@linaro.org>
> >> > >>>wrote:
> >> > >>>
> >> > >>>>On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
> >> > >>>><honnappa.nagaraha...@linaro.org> wrote:
> >> > >>>>> On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
> >> > >>>>> <matias@nokia.com> wrote:
> >> > >>>>>>
> >> > >>>>>>> On 31 May 2017, at 23:53, Bill Fischofer
> >> > <bill.fischo...@linaro.org>
> >> > >>>>>>>wrote:
> >> > >>>>>>>
> >> > >>>>>>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia -
> >>FI/Espoo)
> >> > >>>>>>> <matias@nokia.com> wrote:
> >> > >>>>>>>>
> >> > >>>>>>>>>>> What's the purpose of calling ord_enq_multi() here? To
> >>save
> >> > >>>>>>>>>>>(stash)
> >> > >>>>>>>>>>> packets if the thread is out-of-order?
> >> > >>>>>>>>>>> And when the thread is in-order, it is re-enqueueing the
> >> > packets
> >> > >>>>>>>>>>>which
> >> > >>>>>>>>>>> again will invoke pktout_enqueue/pktout_enq_multi but this
> >> > time
> >> > >>>>>>>>>>> ord_enq_multi() will not save the packets, instead they
> >>will
> >> > >>>>>>>>>>>actually be
> >> > >>>>>>>>>>> transmitted by odp_pktout_send()?
> >> > >>>>>>>>>>>
> >> > >>>>>>>>>>
> >

Re: [lng-odp] Suspected SPAM - Re: [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-06-02 Thread Peltonen, Janne (Nokia - FI/Espoo)


> -----Original Message-----
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> Savolainen, Petri
> (Nokia - FI/Espoo)
> Sent: Friday, June 02, 2017 1:18 PM
> To: Honnappa Nagarahalli ; Ola Liljedahl
> 
> Cc: Elo, Matias (Nokia - FI/Espoo) ; nd ; 
> Kevin Wang
> ; Honnappa Nagarahalli ; 
> lng-
> o...@lists.linaro.org
> Subject: Suspected SPAM - Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable 
> scheduler
> 
> 
> 
> > -----Original Message-----
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> > Honnappa Nagarahalli
> > Sent: Thursday, June 01, 2017 11:30 PM
> > To: Ola Liljedahl 
> > Cc: lng-odp@lists.linaro.org; Honnappa Nagarahalli
> > ; Elo, Matias (Nokia - FI/Espoo)
> > ; Kevin Wang ; nd 
> > Subject: Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler
> >
> > On 1 June 2017 at 15:20, Ola Liljedahl  wrote:
> > >
> > >
> > >
> > >
> > > On 01/06/2017, 22:15, "Honnappa Nagarahalli"
> > >  wrote:
> > >
> > >>On 1 June 2017 at 15:09, Ola Liljedahl  wrote:
> > >>>
> > >>>
> > >>> On 01/06/2017, 21:03, "Bill Fischofer" 
> > >>>wrote:
> > >>>
> > On Thu, Jun 1, 2017 at 10:59 AM, Honnappa Nagarahalli
> >  wrote:
> > > On 1 June 2017 at 01:26, Elo, Matias (Nokia - FI/Espoo)
> > >  wrote:
> > >>
> > >>> On 31 May 2017, at 23:53, Bill Fischofer
> > 
> > >>>wrote:
> > >>>
> > >>> On Wed, May 31, 2017 at 8:12 AM, Elo, Matias (Nokia - FI/Espoo)
> > >>>  wrote:
> > 
> > >>> What's the purpose of calling ord_enq_multi() here? To save
> > >>>(stash)
> > >>> packets if the thread is out-of-order?
> > >>> And when the thread is in-order, it is re-enqueueing the
> > packets
> > >>>which
> > >>> again will invoke pktout_enqueue/pktout_enq_multi but this
> > time
> > >>> ord_enq_multi() will not save the packets, instead they will
> > >>>actually be
> > >>> transmitted by odp_pktout_send()?
> > >>>
> > >>
> > >> Since transmitting packets may fail, out-of-order packets
> > cannot
> > >>be
> > >> stashed here.
> > > You mean that the TX queue of the pktio might be full so not all
> > >packets
> > > will actually be enqueued for transmission.
> > 
> >  Yep.
> > 
> > > This is an interesting case but is it a must to know how many
> > >packets are
> > > actually accepted? Packets can always be dropped without notice,
> > >the
> > > question is from which point this is acceptable. If packets
> > >enqueued onto
> > > a pktout (egress) queue are accepted, this means that they must
> > >also be
> > > put onto the driver TX queue (as done by odp_pktout_send)?
> > >
> > 
> >  Currently, the packet_io/queue APIs don't say anything about
> > packets
> > being
> >  possibly dropped after successfully calling odp_queue_enq() to a
> > pktout
> >  event queue. So to be consistent with standard odp_queue_enq()
> > operations I
> >  think it is better to return the number of events actually
> > accepted
> > to the TX queue.
> > 
> >  To have more leeway one option would be to modify the API
> > documentation to
> >  state that packets may still be dropped after a successful
> > odp_queue_enq() call
> >  before reaching the NIC. If the application would like to be sure
> > that the
> >  packets are actually sent, it should use odp_pktout_send()
> > instead.
> > >>>
> > >>> Ordered queues simply say that packets will be delivered to the
> > next
> > >>> queue in the pipeline in the order they originated from their
> > source
> > >>> queue. What happens after that depends on the attributes of the
> > >>>target
> > >>> queue. If the target queue is an exit point from the application,
> > >>>then
> > >>> this is outside of ODP's scope.
> > >>
> > >> My point was that with stashing the application has no way of
> > knowing
> > >>if an
> > >> ordered pktout enqueue call actually succeed. In case of parallel
> > and
> > >>atomic
> > >> queues it does. So my question is, is this acceptable?
> > >>
> > > Also, currently, it is not possible for the application to have a
> > > consistent 'wait/drop on destination queue full' policy for all the
> > 

Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-31 Thread Peltonen, Janne (Nokia - FI/Espoo)


Ola Liljedahl wrote:
> On 23/05/2017, 16:49, "Peltonen, Janne (Nokia - FI/Espoo)"
> <janne.pelto...@nokia.com> wrote:
> 
> 
> >
> >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
> >> +   int num, int *ret)
> >> +{
> >> +  (void)queue_index;
> >> +  (void)p_buf_hdr;
> >> +  (void)num;
> >> +  (void)ret;
> >> +  return 0;
> >> +}
> >
> >How is packet order maintained when enqueuing packets read from an ordered
> >queue to a pktout queue? Matias' recent fix uses the ord_enq_multi
> >scheduler
> >function for that, but this version does not do any ordering. Or is the
> >ordering guaranteed by some other means?
> The scalable scheduler standard queue enqueue function also handles
> ordered queues. odp_queue_scalable.c can refer to the same thread-specific
> data as odp_schedule_scalable.c so we don¹t need this internal interface.
> We could perhaps adapt the code to use this interface but I think this
> interface is just an artefact of the implementation of the default
> queues/scheduler.

The problem is that odp_pktout_queue_config() sets qentry->s.enqueue
to pktout_enqueue() and that does not have any of the scalable scheduler
specific magic that odp_queue_scalable.c:queue_enq{_multi}() has. So
ordering does not happen for pktout queues even if it works for other
queues, right?

Janne

> 
> >
> >> +static void order_lock(void)
> >> +{
> >> +}
> >> +
> >> +static void order_unlock(void)
> >> +{
> >> +}
> >
> >Is it ok that these are no-ops? tm_enqueue() seems to use these.
> No these ought to be implemented. We have fixed that now. Thanks.
> 
> 
> -- Ola
> 
> Ola Liljedahl, Networking System Architect, ARM
> Phone: +46 706 866 373  Skype: ola.liljedahl
> 
> 
> 
> 
> >
> >> +
> >> +const schedule_fn_t schedule_scalable_fn = {
> >> +  .pktio_start= pktio_start,
> >> +  .thr_add= thr_add,
> >> +  .thr_rem= thr_rem,
> >> +  .num_grps   = num_grps,
> >> +  .init_queue = init_queue,
> >> +  .destroy_queue  = destroy_queue,
> >> +  .sched_queue= sched_queue,
> >> +  .ord_enq_multi  = ord_enq_multi,
> >> +  .init_global= schedule_init_global,
> >> +  .term_global= schedule_term_global,
> >> +  .init_local = schedule_init_local,
> >> +  .term_local = schedule_term_local,
> >> +  .order_lock = order_lock,
> >> +  .order_unlock   = order_unlock,
> >> +};
> >
> > Janne
> >
> >



Re: [lng-odp] Summary of Expiration Discussion

2017-05-29 Thread Peltonen, Janne (Nokia - FI/Espoo)
> -Original Message-
> From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
> Sent: Friday, May 26, 2017 5:04 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
> Cc: lng-odp-forward <lng-odp@lists.linaro.org>
> Subject: Re: [lng-odp] Summary of Expiration Discussion
> 
> On Wed, May 24, 2017 at 8:15 AM, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> > Bill Fischofer wrote:
> >>
> >> The following data items MUST be in the SAD:
> >
> > But that does not necessarily mean that they need to be in the ODP SA.
> >
> >> With this background, it's clear that should we choose to remove
> >> time-based expiration from the ODP IPsec API spec we would have an
> >> incomplete IPsec specification that would require any ODP
> >> implementation that wishes to be IPsec compliant to "fill in" the spec
> >> on its own. That would not be good.
> >
> > Regardless of how complete the life time support is in ODP, there
> > are other things that the application needs to do for IPsec RFC
> > compliancy. For example, ODP does not do any policy enforcement
> > but leaves it to the application (which is good since it adds
> > flexibility).
> >
> >> Note also that the RFC says
> >> nothing about counting packets, only bytes, so perhaps we shouldn't be
> >> offering packet counts?
> 
> Packets are certainly a more convenient approximation for data-based
> limits as these are not intended to be precise. I don't believe anyone
> ever tests compliance based on exact byte counts (e.g., the limit is
> 43210495 and you better not allow that 43210496th byte to be
> transmitted). Cryptographic degradation is gradual and statistical,
> not a step function.
> 
> >
> > I suppose packet based life times could be added later as a backward
> > compatible API change if they are not included now. But I think some
> > other IPsec implementations provide them.
> >
> >> Given that we want to keep these, the question is then how are they to
> >> be handled? The argument I would make is that these are control
> >> signals as they indicate that the control plane should begin rekeying
> >> on an SA (soft expiration) or that an SA's lifetime has ended (hard
> >> expiration). As such they should be odp_ipsec_status_t events, which
> >> seems to be what IPsec users like OFP would prefer.
> >
> > Even if all life time expirations are signaled as status events,
> > a normal operation completion event for a packet is needed when
> > IPsec processing is attempted but fails due to hard life time
> > (or byte or packet limit) expiration. So those error bits are
> > still needed in the op_result.
> 
> From a worker perspective, an SA whose hard limit has been reached is
> no different than one that has been disabled. Effectively, when the
> limit is reached odp_ipsec_sa_disable() is called by the
> implementation. The SA is still visible until it is destroyed, but any
> further attempt to use that SA will fail. Saying that the operation
> failed because the SA is disabled is sufficient.

Given how odp_ipsec_sa_disable() has now been defined, it is not
quite the same as reaching a hard limit. When an application calls
SA disable it promises that it will not initiate any new operations
that explicitly refer to the SA. An application cannot promise
the same (not using the SA in an IPsec operation) after reaching
a hard byte/packet/time limit (unless the application itself keeps
track of the limits and synchronizes the state between threads).

So from application perspective, it would be acceptable to have
IPsec operations on a disabled SA result in undefined behavior (as
currently defined) but not so for an SA whose hard limit has been
reached. But maybe there does not have to be hard limit specific
error bits in the result (or per packet metadata, if we switch to
that) but some generic operation error could be used for this case
too.

Another difference is that when an SA has been disabled, inbound
lookups will not match it anymore and incoming packets end up in
the default queue (in async mode) or in pktin (in inline mode).
Currently, after reaching a hard limit, the SA still matches
and the error result goes to the SA specific queue.

> 
> >
> > I am not sure OFP has any preference one way or the other.
> >
> >> But perhaps this is too coarse. Having a dest_cos makes sense for
> >> PIPELINE_CLS output but the current dest_queue is used for both
> >> odp_ipsec_result_t and odp_ipsec_status_t events, which inherently
> >> have different "consumers". So might it make more 

Re: [lng-odp] Summary of Expiration Discussion

2017-05-24 Thread Peltonen, Janne (Nokia - FI/Espoo)
Bill Fischofer wrote:
> 
> The following data items MUST be in the SAD:

But that does not necessarily mean that they need to be in the ODP SA.

> With this background, it's clear that should we choose to remove
> time-based expiration from the ODP IPsec API spec we would have an
> incomplete IPsec specification that would require any ODP
> implementation that wishes to be IPsec compliant to "fill in" the spec
> on its own. That would not be good.

Regardless of how complete the life time support is in ODP, there
are other things that the application needs to do for IPsec RFC
compliancy. For example, ODP does not do any policy enforcement
but leaves it to the application (which is good since it adds
flexibility).

> Note also that the RFC says
> nothing about counting packets, only bytes, so perhaps we shouldn't be
> offering packet counts?

I suppose packet based life times could be added later as a backward
compatible API change if they are not included now. But I think some
other IPsec implementations provide them.

> Given that we want to keep these, the question is then how are they to
> be handled? The argument I would make is that these are control
> signals as they indicate that the control plane should begin rekeying
> on an SA (soft expiration) or that an SA's lifetime has ended (hard
> expiration). As such they should be odp_ipsec_status_t events, which
> seems to be what IPsec users like OFP would prefer.

Even if all life time expirations are signaled as status events,
a normal operation completion event for a packet is needed when
IPsec processing is attempted but fails due to hard life time
(or byte or packet limit) expiration. So those error bits are
still needed in the op_result.

I am not sure OFP has any preference one way or the other.

> But perhaps this is too coarse. Having a dest_cos makes sense for
> PIPELINE_CLS output but the current dest_queue is used for both
> odp_ipsec_result_t and odp_ipsec_status_t events, which inherently
> have different "consumers". So might it make more sense to split this
> into:
> 
> /**
> * Output queue for IPSEC results
> *
> * Operations in asynchronous or inline mode enqueue result events are placed
> * onto this queue.
> */
> odp_queue_t result_queue;
> 
> /**
> * Output queue for IPSEC status
> *
> * Status events relating to this SA are placed onto this queue.
> */
> odp_queue_t status_queue;

There are some status events (at least the disable SA completion
event) that need to be ordered with operation completion events.
When an application gets the disable completion event, it must
be able to count on that it has seen all operation completion
events for that SA or else it becomes much more difficult to
tear down SAs safely. To get that ordering, disable completion
needs to come through the same queue as the normal IPsec
operation events.

This ordering is not currently explicitly specified in the API
but we have discussed the need to clarify the API with Petri.
I think the ordering becomes even more important if we change
operation result events to packet events with IPsec specific
packet metadata.

Janne
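The soft/hard distinction being argued here can be sketched as a byte-count accounting function (names and exact semantics are illustrative assumptions, not ODP API): a soft limit only signals the control plane to start rekeying, while a hard limit makes further operations on the SA fail.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { SA_OK, SA_SOFT_EXPIRED, SA_HARD_EXPIRED } sa_status_t;

typedef struct {
    uint64_t bytes;      /* bytes processed so far on this SA */
    uint64_t soft_limit; /* start-rekeying threshold */
    uint64_t hard_limit; /* SA must not be used beyond this */
} sa_life_t;

/* Account one packet against the SA lifetime. A hard-expired SA
 * rejects the operation entirely; a soft-expired SA still processes
 * the packet but signals that rekeying should begin. */
sa_status_t sa_account(sa_life_t *sa, uint64_t pkt_bytes)
{
    if (sa->bytes >= sa->hard_limit)
        return SA_HARD_EXPIRED; /* operation fails, packet not processed */
    sa->bytes += pkt_bytes;
    if (sa->bytes >= sa->soft_limit)
        return SA_SOFT_EXPIRED; /* processed, but time to rekey */
    return SA_OK;
}
```

The hard-expired branch is exactly the per-packet error result discussed above: it must be reported on the operation itself, not only as a status event, because other threads may already have operations in flight.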




Re: [lng-odp] [API-NEXT PATCH v6 6/6] Add scalable scheduler

2017-05-24 Thread Peltonen, Janne (Nokia - FI/Espoo)
Honnappa Nagarahalli wrote:
> On 23 May 2017 at 09:49, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> >
> >> +static int ord_enq_multi(uint32_t queue_index, void *p_buf_hdr[],
> >> +  int num, int *ret)
> >> +{
> >> + (void)queue_index;
> >> + (void)p_buf_hdr;
> >> + (void)num;
> >> + (void)ret;
> >> + return 0;
> >> +}
> >
> > How is packet order maintained when enqueuing packets read from an ordered
> > queue to a pktout queue? Matias' recent fix uses the ord_enq_multi scheduler
> > function for that, but this version does not do any ordering. Or is the
> > ordering guaranteed by some other means?
> >
> 
> We have been running TM related test cases and they are working fine.
> So, I would assume we do not need to do anything.

The packet ordering question above is not specific to TM but is
relevant also when TM is not used. And a test case may not reveal
ordering problems unless you are specifically looking for it and
somehow making sure that the right order does not happen by accident.

Here is a simple use case for packet ordering at output: Application
gets incoming packets from single pktin through single ordered and
scheduled queue. The packets get spread to multiple threads by the
scheduler. The threads output the packets to the same pktout queue
used in queued mode. The output packets in the wire should be in the
same order as they were received, even if the worker threads processed
the packets at very different speeds.

> 
> What is the use case from TM? Does TM use ODP queues?

I am not familiar with the TM, but it looks like the scheduler may want to
ensure correct packet order using the order_lock function and therefore
that function being no-op in the scalable scheduler looked alarming to me.

Janne
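The use case above can be modelled with a toy in-order release stage (a plain single-threaded C model of the guarantee, not ODP code): completions may arrive in any order, but packets must leave in arrival order.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_PKTS 8

int  released[NUM_PKTS]; /* order in which packets hit "the wire" */
int  released_cnt;
bool done[NUM_PKTS];     /* packet i has finished processing */
int  next_to_release;

/* Some thread finishes processing packet i (packet i arrived i-th).
 * Release every packet that is now in order; out-of-order completions
 * are effectively stashed until their turn comes. */
void complete(int i)
{
    done[i] = true;
    while (next_to_release < NUM_PKTS && done[next_to_release])
        released[released_cnt++] = next_to_release++;
}
```

However the completions are scrambled, the released sequence is 0..7; this is the property a pktout queue fed from an ordered queue has to preserve.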

> 
> >> +static void order_lock(void)
> >> +{
> >> +}
> >> +
> >> +static void order_unlock(void)
> >> +{
> >> +}
> >
> > Is it ok that these are no-ops? tm_enqueue() seems to use these.
> >
> >> +
> >> +const schedule_fn_t schedule_scalable_fn = {
> >> + .pktio_start= pktio_start,
> >> + .thr_add= thr_add,
> >> + .thr_rem= thr_rem,
> >> + .num_grps   = num_grps,
> >> + .init_queue = init_queue,
> >> + .destroy_queue  = destroy_queue,
> >> + .sched_queue= sched_queue,
> >> + .ord_enq_multi  = ord_enq_multi,
> >> + .init_global= schedule_init_global,
> >> + .term_global= schedule_term_global,
> >> + .init_local = schedule_init_local,
> >> + .term_local = schedule_term_local,
> >> + .order_lock = order_lock,
> >> + .order_unlock   = order_unlock,
> >> +};
> >
> > Janne
> >
> >


Re: [lng-odp] [PATCH] linux-gen: sched: fix ordered enqueue to pktout queue

2017-05-23 Thread Peltonen, Janne (Nokia - FI/Espoo)

> -----Original Message-----
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Matias 
> Elo
> Sent: Monday, May 22, 2017 12:39 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCH] linux-gen: sched: fix ordered enqueue to pktout 
> queue
> 
> Make sure packet order is maintained if enqueueing packets from an ordered
> queue.
> 
> Fixes https://bugs.linaro.org/show_bug.cgi?id=3002
> 
> Signed-off-by: Matias Elo 
> ---
>  platform/linux-generic/odp_packet_io.c   | 8 
>  platform/linux-generic/odp_queue.c   | 1 +
>  platform/linux-generic/odp_schedule.c| 5 -
>  platform/linux-generic/odp_schedule_iquery.c | 5 -
>  4 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/platform/linux-generic/odp_packet_io.c b/platform/linux-
> generic/odp_packet_io.c
> index 98460a5..7e45c63 100644
> --- a/platform/linux-generic/odp_packet_io.c
> +++ b/platform/linux-generic/odp_packet_io.c
> @@ -586,6 +586,10 @@ int pktout_enqueue(queue_entry_t *qentry, 
> odp_buffer_hdr_t *buf_hdr)
>   int len = 1;
>   int nbr;
> 
> + if (sched_fn->ord_enq_multi(qentry->s.index, (void **)&buf_hdr, len,
> + &nbr))
> + return nbr;
> +

The return value is not right here. If the ord_enq_multi() returns 1,
the packet was successfully enqueued and odp_queue_enq() should return 0,
not 1.

After this patch the default scheduler always returns 0 in this case so
the branch is never taken and the bug is not exposed.

>   nbr = odp_pktout_send(qentry->s.pktout, &pkt, len);
>   return (nbr == len ? 0 : -1);
>  }

Janne
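The mix-up can be shown with a stripped-down model (stub functions, not the real linux-generic code): ord_enq_multi() reports through its output parameter how many events it accepted, but the enqueue function itself must follow the 0-on-success convention.

```c
#include <assert.h>

/* Stub: pretend the scheduler stashed all events for ordered
 * processing; *ret receives the number of events accepted and the
 * non-zero return value means "handled here, do not send". */
int ord_enq_multi_stub(int num, int *ret)
{
    *ret = num;
    return 1;
}

/* Model of the fixed path: map the accepted-event count back to the
 * 0/-1 convention instead of returning the count (1) directly, which
 * would look like a failure to the caller. */
int pktout_enqueue_model(void)
{
    int len = 1;
    int nbr;

    if (ord_enq_multi_stub(len, &nbr))
        return nbr == len ? 0 : -1;

    /* otherwise the packet would be sent with odp_pktout_send() */
    return -1;
}
```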




Re: [lng-odp] [API-NEXT PATCH] api: packet: add per packet checksum control

2017-05-19 Thread Peltonen, Janne (Nokia - FI/Espoo)

Bill Fischofer wrote: 
> If the pktout configuration says "don't checksum" and the per-packet bits
> say "do checksum", then it's up to the implementation to insert the requested
> checksum or else fail the odp_pktout_send() request.

The API spec says the following, implying that it is perfectly fine to request
checksumming through the override and expect it to also happen at packet output
even when the pktio level configuration is "don't checksum". If it were not,
then value 1 for the l3 parameter would not be very useful.

/**
 * Layer 3 checksum insertion override
 *
 * Override checksum insertion configuration per packet. This per packet setting
 * overrides a higher level configuration for checksum insertion into a L3
 * header during packet output processing.
 *
 * @param pkt Packet handle
 * @param l3  0: do not insert L3 checksum
 *1: insert L3 checksum
 */
void odp_packet_l3_chksum_insert(odp_packet_t pkt, int l3);

Whether the data returned through the capability query is relevant for the
above is not clear in the API and obviously different people make different
interpretations or assumptions now.

> it shouldn't matter whether HW is involved. Note that the 
> odp_pktout_config_opt_t
> makes no reference to how checksums are performed.

Yes, now it is an internal ODP implementation issue whether HW is involved
in the calculation or not.

> One could argue that these capability bits say whether checksums are 
> HW-assisted
> since any implementation should be able to insert SW checksums as needed.
> Perhaps that's the clarification that's needed? In this case the
> odp_pktin_config_opt_t and odp_pktout_config_opt_t returned by
> odp_pktio_capability() should be clarified as referring to HW support
> for checksum validation/insertion.

I think that is also a possible way to resolve the unclarity. Maybe Petri
wants to comment which way he would prefer when he comes back.

        Janne




Re: [lng-odp] [API-NEXT PATCH] api: packet: add per packet checksum control

2017-05-19 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi Bogdan,

I am discussing the ODP API, not a particular implementation.

> if HW csum is [...] not configured by application [...]
> then packet will not be modified

Having checksum insertion disabled in the config does not mean that the
checksum will not be inserted in the packet (in api-next). It means that
only when the override is not set. If the override is set to "insert", then
the config setting does not matter (I think this is clear in the API). Whether
the capability matters is more debatable, but see the point below.

> if HW csum is NOT calculated (not supported by HW [...]
> then packet will not be modified

The API says "Use odp_pktio_capability() to see which options are
supported by the implementation.". So if the checksum config option
is not supported, it quite obviously should not be set by the
application. But that wording does not make it clear whether requesting
checksum offload through override is still allowed. The API should
be made more clear about it.

The possible alternatives for how the behavior of override("insert")
could be defined in the API when the checksum capability is zero include:
undefined behavior (i.e. the application must never call override("insert")
in that case), not modifying the packet (but calling override("insert")
would be ok), and actually doing the insert (which, IMHO, is implied
by the current text in the API spec even if not intended that way).

> 3. Override set: insert checksum => depends on “HW Supported”

That is just one possible way to define the behavior. Now it is left
ambiguous in the API.

Janne


> -Original Message-
> From: Bogdan Pricope [mailto:bogdan.pric...@linaro.org]
> Sent: Friday, May 19, 2017 11:07 AM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
> Cc: Bala Manoharan <bala.manoha...@linaro.org>; Petri Savolainen
> <petri.savolai...@linaro.org>; lng-odp-forward <lng-odp@lists.linaro.org>
> Subject: Re: [lng-odp] [API-NEXT PATCH] api: packet: add per packet checksum 
> control
> 
> Hi Janne,
> 
> There is no undefined behavior: if HW csum is NOT calculated (not
> supported by HW OR not configured by application OR override “NOT
> insert”) then packet will not be modified (packet will be sent with
> the value set by the application).
> 
> “HW Supported” = what dpdk reports as HW capabilities (odp_pktio_capability())
> “Configured” = what is configured by application with
> odp_pktio_config(). This is a subset of what is “HW Supported”.
> 
> 
> Behavior:
> 
> 1. Override not set (default) => depends on "Configured"
> 
> 2. Override set: do not insert checksum => HW csum NOT calculated
> 
> 3. Override set: insert checksum => depends on “HW Supported”
> 
> BR,
> Bogdan


[lng-odp] Packet order lost when enqueuing to pktout queue

2017-05-17 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

It seems that the packet enqueue order to pktout queues does
not necessarily match the dequeue order of an ordered source
queue.

Enqueue to a normal queue ensures the right order by interacting
with the scheduler in enq_multi() but the pktout specific enqueue
implementation pktout_enqueue() does not.

This changed in the following commits. Apparently something was
missing from the later one.

a267605 linux-gen: sched: new ordered queue implementation
0d6d092 linux-gen: sched: remove old ordered queue implementation

Janne




Re: [lng-odp] [RFC] dpdk: enable hardware checksum support

2017-05-15 Thread Peltonen, Janne (Nokia - FI/Espoo)

Bogdan Pricope wrote:
> On 10 May 2017 at 20:06, Savolainen, Petri (Nokia - FI/Espoo)
>  wrote:
> >
> >
> >> -Original Message-
> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> >> Bogdan Pricope
> >> Sent: Wednesday, May 10, 2017 4:50 PM
> >> To: lng-odp@lists.linaro.org
> >> Subject: [lng-odp] [RFC] dpdk: enable hardware checksum support
> >>
> >> Purpose of this RFC is to highlight/discuss different issue observed while
> >> enabling DPDK checksum calculation/validation:
> >>  - TX: DPDK needs L2 len, L3 len, L3 protocol, L4 protocol. Should we
> >> force application to set L2 offset, L3 offset etc. or ODP should parse?
> >
> >
> > /**
> >  * Packet output configuration options bit field
> >  *
> >  * Packet output configuration options listed in a bit field structure. 
> > Packet
> >  * output checksum insertion may be enabled or disabled. When it is enabled,
> >  * implementation will calculate and insert checksum into every outgoing 
> > packet
> >  * by default. Application may use a packet metadata flag to disable 
> > checksum
> >  * insertion per packet bases. For correct operation, packet metadata must
> >  * provide valid offsets for the appropriate protocols. For example, UDP
> >  * checksum calculation needs both L3 and L4 offsets (to access IP and UDP
> > -
> >  * headers). When application (e.g. a switch) does not modify L3/L4 data and
> >  * thus checksum does not need to be updated, output checksum insertion 
> > should
> >  * be disabled for optimal performance.
> >  */
> > typedef union odp_pktout_config_opt_t {
> >
> >
> >
> > API already says that outgoing packet must have (L3/L4) offsets set 
> > correctly if (e.g.
> L4) checksum calculation is offloaded.
> 
> Fine with me. Applications (like OFP) need to make sure that they will
> always set all the offsets/protocols, etc.

The API says that the L3/L4 offsets need to be set correctly but it does
not say anything about the packet flags. If the packet flags need to be
set too, then I think the API should explicitly say so. And tell also what
happens if the packet flags do not match the packet (e.g. L3 offset is
set and points to an IPv4 header but has_ipv4 packet flag is not set).

> 
> >
> >
> >
> >
> >> Note: application cannot compute checksum or let HW do it until it
> >> identifies TX interface. This is odd considering OSI model.
> >>  - RX: application needs to know interface offload settings in order to
> >> trust error flags received with the packet (e.g. validation was performed
> >> and result is OK vs. validation was not performed due to no support). It
> >> may not be trivial for application so I suggest this information to be
> >> delivered by ODP in packet metadata.
> >
> > /**
> >  * Packet input configuration options bit field
> >  *
> >  * Packet input configuration options listed in a bit field structure. 
> > Packet
> >  * input timestamping may be enabled for all packets or at least for those 
> > that
> >  * belong to time synchronization protocol (PTP).
> >  *
> >  * Packet input checksum checking may be enabled or disabled. When it is
> >  * enabled, implementation will verify checksum correctness on incoming 
> > packets
> >  * and depending on drop configuration either deliver erroneous packets with
> >  * appropriate flags set (e.g. odp_packet_has_l3_error()) or drop those.
> >  * When packet dropping is enabled, application will never receive a packet
> >  * with the specified error and may avoid to check the error flag.
> >  */
> > typedef union odp_pktin_config_opt_t {
> >
> >
> > Application itself configures input checksum offload on/off and should 
> > remember what it
> configured. Right?
> 
> As I said, is not trivial: in odp_generator I used:
> 
> itf = [odp_pktio_index(odp_packet_input(pkt))];
> 
> That is, for this basic application, pktio index is the interface
> index in application's array. That may not be the case in a complex
> application where pktios are created/deleted multiple times - you are
> forcing application to mirror odp implementation of pktio array or use
> workarounds (OFP is looping through the list of interfaces searching
> by pktio handle).
> 
> Then, application may need to store offload capabilities of the
> interface with the packet for easy processing: application may use its
> own metadata in pkt user area or use packet user pointer
> ofp:
> odp_packet_user_ptr_set(pkt, ifnet);
> 
> Bottom line is, will be more efficient if ODP will provide this data
> (what offloading operation were performed on a packet) with pkt
> metadata since result in case of error is already there.

Or, in the case the application can anyway easily find the correct
pktio, there is extra overhead in requiring ODP to insert more
metadata in every packet, right? So it is not so clear cut to me.

OTOH, L4 checksum error bit may be clear not just because checksumming
was 

Re: [lng-odp] [PATCH API-NEXT v4 2/10] api: ipsec: note that soft_exp bits are set only once

2017-05-15 Thread Peltonen, Janne (Nokia - FI/Espoo)

Dmitry Eremin-Solenikov  wrote:
> 
> Add an explicit note telling that soft_exp bits are set only once, for
> the packet actually crossing the boundary. They will not be set for
> further packets.

Isn't this perhaps a bit too restrictive on implementations that
may process packets in parallel and may not be able to easily track
the number of bytes and packets per SA with per-packet accuracy?

Would it be better to define the bits so that the soft_exp bits
are set for at least one packet when the limit is crossed or
slightly before that but may also be set for later packets? Or
at least allow the bit to be set for more than single packet
when the soft limit is being crossed?

Then I wonder what happens when inline inbound IPsec traffic exceeds
the processing capacity of the SW and packets get dropped. If the
soft expiration bit is set only for one packet or some packets,
then the information may get lost if just that packet or those
packets get dropped before SW processing.

Janne




Re: [lng-odp] Packet validation question

2017-05-15 Thread Peltonen, Janne (Nokia - FI/Espoo)

Bill Fischofer wrote:
> On Fri, May 12, 2017 at 6:27 AM, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> > Dmitry Eremin-Solenikov wrote:
> >> Another generic question regarding ODP. If the package gets passed to
> >> IPsec API, should I trust e.g. IP header values? IOW, can I assume, that
> >> ip->tot_len + l3_offset is equal to odp_packet_len(), or is that an error?
> >
> > Good question.
> >
> > I think there are several somewhat different cases that may merit
> > a bit different handling. In any case some clarification is needed
> > in the API spec.
> >
> > I-1) Inbound: IP length < packet length - L3 offset
> >
> > Received packets may contain some L2 data at the end (e.g. to pad
> > Ethernet frames to the minimum frame length).
> 
> One of the advantages of using the ODP classifier is that the ODP
> parser is assumed to validate that the packet is well formed and to
> flag any that are not as errors. So if the error bits are clean coming
> out of classification you can assume that you're dealing with a
> well-formed packet. If you bypass the classifier, it's the
> application's responsibility to validate the packet structure.
> 
> In the case of IPsec, the range of bytes over which encrypt/decrypt
> operations occurs is per the RFCs (the IP length). Any additional L2
> bytes are ignored and not propagated to the output. Remember that the
> range of operation for IPsec may not be the entire packet due to
> tunnel modes. See section 5 of RFC 4301 for details.

I am not sure if you agree or disagree that ODP should just ignore
the extra packet data and process the packet successfully in this case.

> 
> >
> > I think it would make most sense to require ODP implementations to
> > discard the extra packet data (i.e. truncate the packet) before any
> > IPsec processing and then perform IPsec decapsulation normally.
> >
> > I do not think it would be very useful to require the application to
> > truncate the packet to the correct length since the ODP implementation
> > must anyway be able to handle the extra packet data in the inbound
> > inlined offload case.
> >
> > I-2) Inbound: IP length > packet length - L3 offset
> >
> > I think ODP should return an error (which one?) through the operation
> > status. Specifying undefined behavior may not be very useful since the
> > ODP implementation must anyway be able to gracefully handle this case
> > for inline processed packets.
> 
> Various length underruns/overruns are classic attack vectors and are
> the reasons why parsers must validate packet structure. An inner
> section whose purported length is larger than its containing structure
> is always an error.

I think the API should say if such input results in undefined
behavior (e.g. crash) or an error report through the op status.

> If it is less than its containing structure this
> may or may not be a problem as padding bytes are generally benign,
> though can also be used as a covert channel.
> 
> >
> > O-1) Outbound: IP length < packet length - L3 offset
> >
> > One possibility would be to discard the extra packet data at the end
> > before IPsec processing, just like in the inbound case. But here we
> > could also consider treating the extra packet data as TFC padding that
> > gets encrypted along with the actual payload (but that might be more
> > complex, especially with fragmentation offload).
> >
> > O-2) Outbound: IP length > packet length - L3 offset
> >
> > I think ODP should return an error in the op status as in the inbound
> > case, but I suppose undefined behavior could be considered too.
> 
> For output since the application is supplying the packet it's the
> application's responsibility to ensure that output packets are well
> formed. Results are undefined if this is not the case.

The current API spec does not say the packet needs to be well formed,
so the API spec needs improvement if that is what is desired. And the
meaning of "well formed" should be made clear enough. Extra packet data
after the end of the IP packet does not make the IP packet ill-formed
and I think it would be reasonable to allow it in the API.

Janne




Re: [lng-odp] Packet validation question

2017-05-12 Thread Peltonen, Janne (Nokia - FI/Espoo)
Dmitry Eremin-Solenikov wrote:
> Another generic question regarding ODP. If the package gets passed to
> IPsec API, should I trust e.g. IP header values? IOW, can I assume, that
> ip->tot_len + l3_offset is equal to odp_packet_len(), or is that an error?

Good question.

I think there are several somewhat different cases that may merit
a bit different handling. In any case some clarification is needed
in the API spec.

I-1) Inbound: IP length < packet length - L3 offset

Received packets may contain some L2 data at the end (e.g. to pad
Ethernet frames to the minimum frame length).

I think it would make most sense to require ODP implementations to
discard the extra packet data (i.e. truncate the packet) before any
IPsec processing and then perform IPsec decapsulation normally.

I do not think it would be very useful to require the application to
truncate the packet to the correct length since the ODP implementation
must anyway be able to handle the extra packet data in the inbound
inlined offload case.

I-2) Inbound: IP length > packet length - L3 offset

I think ODP should return an error (which one?) through the operation
status. Specifying undefined behavior may not be very useful since the
ODP implementation must anyway be able to gracefully handle this case
for inline processed packets.

O-1) Outbound: IP length < packet length - L3 offset

One possibility would be to discard the extra packet data at the end
before IPsec processing, just like in the inbound case. But here we
could also consider treating the extra packet data as TFC padding that
gets encrypted along with the actual payload (but that might be more
complex, especially with fragmentation offload).

O-2) Outbound: IP length > packet length - L3 offset

I think ODP should return an error in the op status as in the inbound
case, but I suppose undefined behavior could be considered too.

Janne




Re: [lng-odp] Suspected SPAM - Re: [PATCH API-NEXT v1 2/2] api: ipsec: move soft limits expiration to flags, rather than errors

2017-05-05 Thread Peltonen, Janne (Nokia - FI/Espoo)
There is perhaps some ambiguity now in the API on whether expired soft
lifetime means that a packet is not "successfully processed", which in
turn determines whether inline mode outbound packets get sent directly out.

Having the life time expiration in the error field could be interpreted
to imply that soft life time expiration in inline output is not considered
successful processing, so an application can expect to get a result event
with a successfully transformed packet but with one of the soft lifetime
error bits set. That works, but I do not think we want to drop out from
inline processing mode just because of soft lifetime (bytes/packets)
expiration.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> Savolainen, Petri
> (Nokia - FI/Espoo)
> Sent: Friday, May 05, 2017 11:18 AM
> To: lng-odp@lists.linaro.org
> Subject: Suspected SPAM - Re: [lng-odp] [PATCH API-NEXT v1 2/2] api: ipsec: 
> move soft
> limits expiration to flags, rather than errors
> 
> 
> 
> > -Original Message-
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> > Github ODP bot
> > Sent: Thursday, May 04, 2017 8:00 PM
> > To: lng-odp@lists.linaro.org
> > Subject: [lng-odp] [PATCH API-NEXT v1 2/2] api: ipsec: move soft limits
> > expiration to flags, rather than errors
> >
> > From: Dmitry Eremin-Solenikov 
> >
> > Soft limit expiration isn't an error per se. It does not mean, that we
> > received invalid or unprocessed packet. They look more like flags,
> > noting that soft limit on this SA was expired.
> 
> Advantage of keeping those among error flags, is the easy/fast check in the 
> common case:
> 
> Only check (error.u64 == 0) to see that everything is OK vs.
> 
> checking always both (error.u64 == 0 && (status.u64 == 0 || 
> (status.soft_exp_sec == 0 &&
> status.soft_exp_bytes == 0 && status.soft_exp_packets == 0)))
> 
> 
> -Petri



Re: [lng-odp] [PATCH API-NEXT v1 1/2] api: ipsec: add soft limit expiration event

2017-05-05 Thread Peltonen, Janne (Nokia - FI/Espoo)
If this ends up being the selected solution, then I think
there needs to be a bit more documentation in the API on
what this event means and when it will come. And maybe the
application wants to know which of the limits was reached.

Janne

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Github 
> ODP bot
> Sent: Thursday, May 04, 2017 8:00 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCH API-NEXT v1 1/2] api: ipsec: add soft limit 
> expiration event
> 
> From: Dmitry Eremin-Solenikov 
> 
> If outbound packet was processed in inline mode, soft limit expiration
> event is not reported, as packet goes to the interface. Instead report
> this as an ODP_IPSEC_STATUS_SA_SOFT_EXPIRED.
> 
> Signed-off-by: Dmitry Eremin-Solenikov 
> ---
> /** Email created from pull request 22 (lumag:ipsec-limits)
>  ** https://github.com/Linaro/odp/pull/22
>  ** Patch: https://github.com/Linaro/odp/pull/22.patch
>  ** Base sha: 0707c974ed19c859fb92778c35a2f92bf7cd9fc6
>  ** Merge commit sha: bff71bdc47fecb62fced59449c139d3ea4b44def
>  **/
>  include/odp/api/spec/ipsec.h | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> index 384c43d..2f8a007 100644
> --- a/include/odp/api/spec/ipsec.h
> +++ b/include/odp/api/spec/ipsec.h
> @@ -1080,7 +1080,10 @@ typedef struct odp_ipsec_op_result_t {
>   */
>  typedef enum odp_ipsec_status_id_t {
>   /** Response to SA disable command */
> - ODP_IPSEC_STATUS_SA_DISABLE = 0
> + ODP_IPSEC_STATUS_SA_DISABLE = 0,
> +
> + /** Soft limit expired on this SA */
> + ODP_IPSEC_STATUS_SA_SOFT_EXPIRED
> 
>  } odp_ipsec_status_id_t;
> 



Re: [lng-odp] IPsec limits support

2017-05-05 Thread Peltonen, Janne (Nokia - FI/Espoo)
Now I realize that there is a problem in the current API in the
outbound inline case when soft byte or packet limit is reached.

Either such packet is considered successfully processed and sent
out directly, in which case there is no indication to the application
that the soft limit was reached. Or alternatively the packet is
not considered successfully handled and is sent back to the
application, in which case inline processing changes to slower
async processing.

So I guess something needs to change in the API also for the
byte and packet based lifetimes.

Janne


> -Original Message-
> From: Peltonen, Janne (Nokia - FI/Espoo)
> Sent: Friday, May 05, 2017 11:37 AM
> To: Bill Fischofer <bill.fischo...@linaro.org>; Dmitry Eremin-Solenikov
> <dmitry.ereminsoleni...@linaro.org>
> Cc: lng-odp-forward <lng-odp@lists.linaro.org>
> Subject: RE: [lng-odp] IPsec limits support
> 
> 
> Bill Fischofer wrote:
> > On Thu, May 4, 2017 at 11:44 AM, Dmitry Eremin-Solenikov <
> > dmitry.ereminsoleni...@linaro.org> wrote:
> >
> > > On 04.05.2017 19:35, Bill Fischofer wrote:
> > > >
> > > >
> > > > On Thu, May 4, 2017 at 11:25 AM, Dmitry Eremin-Solenikov
> > > > <dmitry.ereminsoleni...@linaro.org
> > > > <mailto:dmitry.ereminsoleni...@linaro.org>> wrote:
> > > >
> > > > Hello,
> > > >
> > > > I have been working on limits support in IPsec. Now I have several
> > > > questions:
> > > >
> > > >  - Is hard limit crossing fatal? IOW, should I start returning
> > > > unprocessed packets after crossing it?
> > > >
> > > >
> > > > The reason for having soft and hard limits is this distinction. When a
> > > > soft limit is reached a notification event should be issued. When a hard
> > > > limit is reached the SA is treated as disabled. So an operation against
> > > > an SA that's reached it's hard limit should be treated the same as an
> > > > operation against a disabled SA.
> > >
> > > Argh. There is no 'event' for soft limits, just a status in the error
> > > flags. BTW: should we move soft_exp_* to flags instead of errors?
> > >
> >
> > This is one of the "to do" areas we'll hopefully cover next week. Reaching
> > a soft limit should result in an odp_ipsec_status_t event being issued to
> > alert the application that the soft limit was reached.
> 
> There is a difference between time based limits and the byte/packet based
> limits. For the latter the API should already be ok since byte and packet
> based limits are tracked on per-packet basis as part of processing the
> packets.
> 
> Expiration of the byte and packet life times are indicated in the status
> bits of the packet result. When a soft byte or packet limit is exceeded,
> the corresponding status bit should be set in the result but otherwise the
> IPsec operation should complete normally. If a hard limit is exceeded,
> then the IPsec operation must fail and the original unprocessed packet
> must be returned to the application through the operation result struct.
> 
> Handling of time based lifetimes on the other hand is indeed a to-do
> item, since their expiration is not tied to any packets. The current
> API can signal time based life time expiration only when a packet for
> the SA is getting processed but not if there is no traffic for the SA.
> This means that the application needs to track at least the soft life
> time itself to be able to rekey in time even if there is no traffic
> just then through the SA.
> 
> One alternative would be to signal time based life time expiration
> through the status event. Another would be to just drop them from
> the API completely and let the application handle them (which would
> still allow introducing them back in the API later, if desired).
> 
> >
> >
> > >
> > > And also there is no way to treat hard-expired SA as disabled. We should
> > > report hard_exp_* through result errors.
> > >
> >
> > That's fine. The point is the operation fails. It's an error to continue to
> > process packets against an SA that's reached a hard limit.
> >
> 
> Passing a disabled SA explicitly to IPsec processing function is disallowed
> by the API and can therefore cause undefined behavior as we have discussed.
> 
> But attempting an IPsec operation for an SA for which a hard life time
> has been reached is not disallowed, so an application may do so. The ODP
> implementation must then indicate the error through a status bit and
> return 

Re: [lng-odp] IPsec limits support

2017-05-05 Thread Peltonen, Janne (Nokia - FI/Espoo)

Bill Fischofer wrote:
> On Thu, May 4, 2017 at 11:44 AM, Dmitry Eremin-Solenikov <
> dmitry.ereminsoleni...@linaro.org> wrote:
> 
> > On 04.05.2017 19:35, Bill Fischofer wrote:
> > >
> > >
> > > On Thu, May 4, 2017 at 11:25 AM, Dmitry Eremin-Solenikov
> > >  > > > wrote:
> > >
> > > Hello,
> > >
> > > I have been working on limits support in IPsec. Now I have several
> > > questions:
> > >
> > >  - Is hard limit crossing fatal? IOW, should I start returning
> > > unprocessed packets after crossing it?
> > >
> > >
> > > The reason for having soft and hard limits is this distinction. When a
> > > soft limit is reached a notification event should be issued. When a hard
> > > limit is reached the SA is treated as disabled. So an operation against
> > > an SA that's reached it's hard limit should be treated the same as an
> > > operation against a disabled SA.
> >
> > Argh. There is no 'event' for soft limits, just a status in the error
> > flags. BTW: should we move soft_exp_* to flags instead of errors?
> >
> 
> This is one of the "to do" areas we'll hopefully cover next week. Reaching
> a soft limit should result in an odp_ipsec_status_t event being issued to
> alert the application that the soft limit was reached.

There is a difference between time based limits and the byte/packet based
limits. For the latter the API should already be ok since byte and packet
based limits are tracked on per-packet basis as part of processing the
packets.

Expiration of the byte and packet life times are indicated in the status
bits of the packet result. When a soft byte or packet limit is exceeded,
the corresponding status bit should be set in the result but otherwise the
IPsec operation should complete normally. If a hard limit is exceeded,
then the IPsec operation must fail and the original unprocessed packet
must be returned to the application through the operation result struct.

Handling of time based lifetimes on the other hand is indeed a to-do
item, since their expiration is not tied to any packets. The current
API can signal time based life time expiration only when a packet for
the SA is getting processed but not if there is no traffic for the SA.
This means that the application needs to track at least the soft life
time itself to be able to rekey in time even if there is no traffic
just then through the SA.

One alternative would be to signal time based life time expiration
through the status event. Another would be to just drop them from
the API completely and let the application handle them (which would
still allow introducing them back in the API later, if desired).

> 
> 
> >
> > And also there is no way to treat hard-expired SA as disabled. We should
> > report hard_exp_* through result errors.
> >
> 
> That's fine. The point is the operation fails. It's an error to continue to
> process packets against an SA that's reached a hard limit.
> 

Passing a disabled SA explicitly to IPsec processing function is disallowed
by the API and can therefore cause undefined behavior as we have discussed.

But attempting an IPsec operation for an SA for which a hard life time
has been reached is not disallowed, so an application may do so. The ODP
implementation must then indicate the error through a status bit and
return the unprocessed packet back to the application.

Requiring that an application never attempts IPsec processing for an
SA if it has gotten the hard life time expiry error once would require
synchronization between different threads of the application for that
purpose and I do not think it would make the ODP implementation much
easier.

> 
> >
> > >  - Does 'bytes' limit count packet bytes before or after the IPsec
> > > operation? Does it count 'useful' payload or the whole odp_packet_len()?

It should count the bytes to which the crypto/auth operations are applied,
i.e. the number of bytes for which the keys have been used, including padding
bytes.
Counting the bytes outside of that (e.g. outer tunnel header or L2 header)
is not useful since those bytes do not affect the secrecy of the keys.

But I do not think counting some extra bytes would be a big problem.
Maybe the API should say something about it.

> > >
> > >
> > > It's typically easier to just count packets and not be overly concerned
> > > about trying to cut off packets mid-stream on byte limits. For byte
> > > counting the SA would simply count the number of bytes processed for
> > > each operation and compare that to the limits as the operation finishes
> > > up. Limits are statistical in nature and as such if the odd in-flight
> > > packet or byte slips past it's not something to worry about.

For soft limits I would agree, but I am not sure if that is always ok for
the hard limits. Failing a little too early in case of a hard limit should
be ok, but letting too many bytes or packets through may become a security
issue. At least the API should 

Re: [lng-odp] [PATCH 3/4] linux-generic: crypto: add SHA-1 authentication support

2017-05-04 Thread Peltonen, Janne (Nokia - FI/Espoo)

Dmitry Eremin-Solenikov [mailto:dmitry.ereminsoleni...@linaro.org] wrote:
> 
> On 03.05.2017 17:17, Peltonen, Janne (Nokia - FI/Espoo) wrote:
> > Hi,
> >
> > I think there is a problem or ambiguity in the ODP API spec regarding
> > the truncation of the HMAC output.
> >
> > According to RFC 2104 the terminology goes so that HMAC-{hash-func} (e.g.
> > HMAC-SHA-512) means the HMAC with the full, non-truncated output and
> > HMAC-{hash-func}-{bits} (e.g. HMAC-SHA-512-256) means the HMAC output
> > truncated to {bits} bits.
> >
> > The existing implementation (and this patch too) calculates the
> > truncated output, even though the algorithm names do not indicate
> > it. The deprecated names did indicate the output length (but not
> > that the algorithms were HMACs).
> 
> Output is truncated up to the length specified in
> param->auth_range.length.

No, output (i.e. the digest) is truncated to the length hard-coded for
each algorithm. param->auth_range.length specifies the range of input
data to be authenticated.

Janne




Re: [lng-odp] [PATCH 3/4] linux-generic: crypto: add SHA-1 authentication support

2017-05-03 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

I think there is a problem or ambiguity in the ODP API spec regarding
the truncation of the HMAC output.

According to RFC 2104 the terminology goes so that HMAC-{hash-func} (e.g.
HMAC-SHA-512) means the HMAC with the full, non-truncated output and
HMAC-{hash-func}-{bits} (e.g. HMAC-SHA-512-256) means the HMAC output
truncated to {bits} bits.

The existing implementation (and this patch too) calculates the
truncated output, even though the algorithm names do not indicate
it. The deprecated names did indicate the output length (but not
that the algorithms were HMACs).

Now it is not possible to get the non-truncated output, which is
used at least in IKE PRFs.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Github 
> ODP bot
> Sent: Wednesday, May 03, 2017 12:00 AM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCH 3/4] linux-generic: crypto: add SHA-1 
> authentication support
> 
> From: Dmitry Eremin-Solenikov 
> 
> Signed-off-by: Dmitry Eremin-Solenikov 
> ---
> /** Email created from pull request 10 (lumag:crypto-update)
>  ** https://github.com/Linaro/odp/pull/10
>  ** Patch: https://github.com/Linaro/odp/pull/10.patch
>  ** Base sha: 0b1dbf37b4030c6da4c6f13645c63fd4ac8ff923
>  ** Merge commit sha: 72489ef29ea4586f487b1c806cf37fab63272c7a
>  **/
>  platform/linux-generic/odp_crypto.c | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/platform/linux-generic/odp_crypto.c 
> b/platform/linux-generic/odp_crypto.c
> index 0670755..79e582e 100644
> --- a/platform/linux-generic/odp_crypto.c
> +++ b/platform/linux-generic/odp_crypto.c
> @@ -54,6 +54,9 @@ static const odp_crypto_cipher_capability_t 
> cipher_capa_aes_gcm[] = {
>  static const odp_crypto_auth_capability_t auth_capa_md5_hmac[] = {
>  {.digest_len = 12, .key_len = 16, .aad_len = {.min = 0, .max = 0, .inc = 0} 
> } };
> 
> +static const odp_crypto_auth_capability_t auth_capa_sha1_hmac[] = {
> +{.digest_len = 12, .key_len = 20, .aad_len = {.min = 0, .max = 0, .inc = 0} 
> } };
> +
>  static const odp_crypto_auth_capability_t auth_capa_sha256_hmac[] = {
>  {.digest_len = 16, .key_len = 32, .aad_len = {.min = 0, .max = 0, .inc = 0} 
> } };
> 
> @@ -561,7 +564,7 @@ int odp_crypto_capability(odp_crypto_capability_t *capa)
> 
>   capa->auths.bit.null = 1;
>   capa->auths.bit.md5_hmac = 1;
> - capa->auths.bit.sha1_hmac= 0;
> + capa->auths.bit.sha1_hmac= 1;
>   capa->auths.bit.sha256_hmac  = 1;
>   capa->auths.bit.sha512_hmac  = 0;
>   capa->auths.bit.aes_gcm  = 1;
> @@ -635,6 +638,10 @@ int odp_crypto_auth_capability(odp_auth_alg_t auth,
>   src = auth_capa_md5_hmac;
>   num = sizeof(auth_capa_md5_hmac) / size;
>   break;
> + case ODP_AUTH_ALG_SHA1_HMAC:
> + src = auth_capa_sha1_hmac;
> + num = sizeof(auth_capa_sha1_hmac) / size;
> + break;
>   case ODP_AUTH_ALG_SHA256_HMAC:
>   src = auth_capa_sha256_hmac;
>   num = sizeof(auth_capa_sha256_hmac) / size;
> @@ -740,6 +747,9 @@ odp_crypto_session_create(odp_crypto_session_param_t 
> *param,
>   case ODP_AUTH_ALG_MD5_96:
>   rc = process_auth_param(session, 96, 16, EVP_md5());
>   break;
> + case ODP_AUTH_ALG_SHA1_HMAC:
> + rc = process_auth_param(session, 96, 20, EVP_sha1());
> + break;
>   case ODP_AUTH_ALG_SHA256_HMAC:
>/* deprecated */
>   case ODP_AUTH_ALG_SHA256_128:



Re: [lng-odp] [PATCH 3/3] linux-generic: classification: implement packet hashing in classifier

2017-05-03 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> @@ -846,11 +897,91 @@ int cls_classify_packet(pktio_entry_t *entry, const 
> uint8_t *base,
> 
>   *pool = cos->s.pool;
>   pkt_hdr->p.input_flags.dst_queue = 1;
> - pkt_hdr->dst_queue = cos->s.queue->s.handle;
> 
> + if (!cos->s.queue_group) {
> + pkt_hdr->dst_queue = cos->s.queue->s.handle;
> + return 0;
> + }
> +
> + hash = packet_rss_hash(pkt_hdr, cos->s.hash_proto, base);
> + hash = hash & 0x1F;
> + hash = hash & cos->s.num_queue;

This works only if cos->s.num_queue is one less than a power of two,
which it does not seem to be in general. (The mask should presumably be
num_queue - 1, or a plain modulo should be used.)

> + tbl_index = (cos->s.index * ODP_COS_QUEUE_MAX) + hash;
> +
> + if (!queue_grp_tbl->s.queue[tbl_index]) {
> + LOCK(&queue_grp_tbl->s.lock);
> + if (!queue_grp_tbl->s.queue[tbl_index])
> + UNLOCK(&queue_grp_tbl->s.lock);

You probably did not mean that; presumably the intent was to skip the
rest and unlock only after the queue has been created or found to exist.

But I think it may be bad to have queue creation as a side effect of
packet processing since the creation (including possible registration
to the scheduler) can be somewhat slow and cause uncontrolled latency
to packet handling.

Another issue is that there is no practical way of getting the queue
handle in case one wants to use the queue directly instead of through
the scheduler. And even if there were, one would need to learn about
new queues as soon as they get created.

Where does the packet get delivered if queue creation here fails?
How would one know that such a problem occurred?

Janne

> +
> + sprintf(queue_name, "%s%d", "queue_group", tbl_index);
> + queue = odp_queue_create(queue_name, &cos->s.queue_param);
> + queue_grp_tbl->s.queue[tbl_index] = queue;
> + UNLOCK(&queue_grp_tbl->s.lock);
> + }
> +
> + pkt_hdr->dst_queue = queue_grp_tbl->s.queue[tbl_index];
>   return 0;
>  }
> 



Re: [lng-odp] [[RFCv2] 3/4] api: ipsec: change semantics of odp_ipsec_result function

2017-04-28 Thread Peltonen, Janne (Nokia - FI/Espoo)

Bill Fischofer [mailto:bill.fischo...@linaro.org] wrote:
> On Fri, Apr 28, 2017 at 5:44 AM, Dmitry Eremin-Solenikov <
> dmitry.ereminsolenikov at linaro.org> wrote:
> 
> > On 28.04.2017 12:03, Peltonen, Janne (Nokia - FI/Espoo) wrote:
> > >
> > >> -Original Message-
> > >> From: lng-odp [mailto:lng-odp-bounces at lists.linaro.org] On Behalf Of
> > Bill Fischofer
> > >> Sent: Friday, April 28, 2017 4:06 AM
> > >> To: Dmitry Eremin-Solenikov 
> > >> Cc: lng-odp-forward 
> > >> Subject: Re: [lng-odp] [[RFCv2] 3/4] api: ipsec: change semantics of
> > odp_ipsec_result
> > >> function
> > >>
> > >> On Thu, Apr 27, 2017 at 7:50 PM, Dmitry Eremin-Solenikov <
> > >> dmitry.ereminsolenikov at linaro.org> wrote:
> > >>
> > >>> On 28.04.2017 03:45, Bill Fischofer wrote:
> > >>>>
> > >>>>
> > >>>> On Thu, Apr 27, 2017 at 5:49 PM, Dmitry Eremin-Solenikov
> > >>>> <dmitry.ereminsolenikov at linaro.org> wrote:
> > >>>>
> > >>>> On 28.04.2017 01:46, Bill Fischofer wrote:
> > >>>> >
> > >>>> >
> > >>>> > On Thu, Apr 27, 2017 at 6:51 AM, Dmitry Eremin-Solenikov
> > >>>> > <dmitry.ereminsolenikov at linaro.org> wrote:
> >
> > >>>> Also, shouldn't the signature of this API be:
> > >>>>
> > >>>> int odp_ipsec_result(odp_ipsec_op_result_t *result,
> > odp_ipsec_result_t
> > >>>> ipsec_ev)?
> > >>>
> > >>> odp_ipsec_result_t is an internal API used to odp_event code to pass
> > >>> event to odp_ipsec_result_* handling. Maybe I should drop it
> > alltogether
> > >>> and replace with just odp_event_t. What do you think?
> > >>>
> > >>
> > >> odp_event_t is a generic type that encompasses any specific event so
> > that
> > >> odp_queue_enq/deq() don't have to be generic functions (since generic
> > >> functions aren't supported in C99). The events that come out of IPsec
> > >> operations are odp_ipsec_result_t events so best to use that here, no?
> > >>
> > >
> > > In the ODP API there are only generic events from which one can get
> > > odp_ipsec_op_result_t objects through this function and that is why
> > > ipsec_ev should be odp_event_t here.
> > >
> > > odp_ipsec_result_t is an internal type used in this implementation
> > > where ipsec events happen to be of that type. In other implementations
> > > they may be equal to other types. But from application point of view
> > > ipsec events are opaque (and thus odp_event_t is all one needs) and
> > > need to be interpreted through this API function.
> >
> > Most probably we should also make odp_crypto_* API to use odp_event_t
> > and hide odp_crypto_compl_t as internal API.
> >
> 
> Packets also arrive as ODP events, but the odp_packet_xxx() APIs take
> odp_packet_t objects, not odp_event_t. The same for odp_buffer_xxx() APIs
> taking odp_buffer_t. odp_timeout_xxx() APIs taking odp_timeout_t,  and
> odp_crypto_xxx() APIs taking odp_crypto_xxx_ events. This is an
> odp_ipsec_xxx API, not an odp_event_xxx API hence it should take an
> odp_ipsec_xxx type for consistency.
> 

I think I misunderstood the original comment. So the odp_ipsec_xxx type
would also be opaque to the application and would still have to be
converted through the odp_ipsec_result() function to a non-opaque
type (odp_ipsec_op_result_t). That sounds reasonable to me and looks
consistent with the crypto API. There are currently two IPsec specific
event types, so maybe two new opaque types would be needed.

Maybe Petri wants to chime in and comment.

> 
> >
> > > It would be more correct to say that the function is valid  only
> > > for events of type ODP_EVENT_IPSEC_RESULT. At API level there is
> > > no separate type defined as it would not be very useful.
> >
> > It is stated so in API docs.
> >
> 
> The consistent way to state this (as is done for other APIs) is to have it
> take an odp_ipsec_xxx type and note that the odp_ipsec_result_from_event()
> call is only meaningful when given events of type ODP_EVENT_IPSEC_RESULT.
> 
> 
> >
> > --
> > With best wishes
> > Dmitry
> >



Re: [lng-odp] IPsec SA disabling

2017-04-28 Thread Peltonen, Janne (Nokia - FI/Espoo)
 sequences need to be combined (possibly
requiring more than one refcount per SA).

Do you see any problems with the above? It is how the ODP IPsec
draft patch work (except that only the async-without-lookup case
has been sent to the list so far).

And as said, I think there are probably many other ways that do
not necessarily involve so much per-SA reference counting.

Janne


> -Original Message-
> From: Dmitry Eremin-Solenikov [mailto:dmitry.ereminsoleni...@linaro.org]
> Sent: Friday, April 28, 2017 4:03 PM
> To: Bala Manoharan <bala.manoha...@linaro.org>
> Cc: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>; 
> lng-odp-forward  o...@lists.linaro.org>
> Subject: Re: [lng-odp] IPsec SA disabling
> 
> On 28.04.2017 15:32, Bala Manoharan wrote:
> > On 28 April 2017 at 15:41, Dmitry Eremin-Solenikov
> > <dmitry.ereminsoleni...@linaro.org> wrote:
> >> On 28.04.2017 12:27, Peltonen, Janne (Nokia - FI/Espoo) wrote:
> >>> Hi,
> >>>
> >>>> -Original Message-
> >>>> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> >>>> Dmitry Eremin-
> >>>> Solenikov
> >>>> Sent: Friday, April 28, 2017 11:59 AM
> >>>> To: lng-odp-forward <lng-odp@lists.linaro.org>
> >>>> Subject: [lng-odp] IPsec SA disabling
> >>>>
> >>>> Hello,
> >>>>
> >>>> While responding to Janne's email, I've come to an issue which warrants
> >>>> a separate discussion IMO.
> >>>>
> >>>> Consider the following scenario. Application with two threads.
> >>>>
> >>>> - First thread: gets outbound packet, finds SA through SPD, configures
> >>>> IPsec request, submits it.
> >>>>
> >>>> - Second thread: wants to disable SA, pending deletion.
> >>>>
> >>>> Is that an application error if SA was disabled after being looked up,
> >>>> but before being submitted by another thread? Yes it is.
> >>>
> >>> Yes, it would be an error.
> >>>
> >>>> Right now SA
> >>>> disabling/deletion is susceptible to race conditions. It requires an app
> >>>> to lock SA after lookup, unlock it after submission, etc.
> >>>
> >>> To avoid races one indeed currently needs some synchronization in the
> >>> application side. Not necessarily locks but something.
> >>>
> >>> The OFP IPsec draft code (which is just draft, trying to use the ODP API),
> >>> refcounts SA usage and disables an SA only after it has been removed
> >>> from the SAD and after ongoing operations (tracked through the refcount)
> >>> have completed.
> >>
> >> Basically as we already have refcounting on ODP side, I'd suggest
> >> reusing it in applications.
> >>
> >>>
> >>> Because of the above, one could wonder why even have disable() in the
> >>> API. The main reason is to be able to disable offloaded inbound SA
> >>> lookup for the SA so that the SA can be safely deleted (the application
> >>> needs to be able to clean up its own SA state but cannot do it if there
> >>> are in-flight inbound operations that may still refer (through the ctx 
> >>> ptr)
> >>> to the application SA state).
> >>
> >> How do you track references of SA instances returned from ODP after
> >> lookup? How will application cope with disabling/deleting SA if it is
> >> returned to another thread by ODP?
> >>
> >>>
> >>>> Should ODP
> >>>> provide API to 'lock' SA?
> >>>
> >>> It would be convenient to the application if disable() could be called
> >>> at any time. But that could impose difficult synchronization needs
> >>> in the ODP implementation side and therefore we thought that it is
> >>> better to leave that to the application which may need to do some
> >>> synchronization anyway due to its own needs.
> >>>
> >>> But this is definitely a good topic to ponder about. It is possible
> >>> that the current API got this wrong, but it is not easy to see how
> >>> things should be if they need to be changed.
> >>
> >> Well, right now there is a need to do synchronization on ODP side
> >> because odp_ipsec_sa_disable() is not an optional call.
> >>
> >>>> I was thinking about smth like:
> >>>>
> >>

Re: [lng-odp] IPsec SA disabling

2017-04-28 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Dmitry 
> Eremin-
> Solenikov
> Sent: Friday, April 28, 2017 11:59 AM
> To: lng-odp-forward 
> Subject: [lng-odp] IPsec SA disabling
> 
> Hello,
> 
> While responding to Janne's email, I've come to an issue which warrants
> a separate discussion IMO.
> 
> Consider the following scenario. Application with two threads.
> 
> - First thread: gets outbound packet, finds SA through SPD, configures
> IPsec request, submits it.
> 
> - Second thread: wants to disable SA, pending deletion.
> 
> Is that an application error if SA was disabled after being looked up,
> but before being submitted by another thread? Yes it is.

Yes, it would be an error.

> Right now SA
> disabling/deletion is susceptible to race conditions. It requires an app
> to lock SA after lookup, unlock it after submission, etc. 

To avoid races one indeed currently needs some synchronization in the
application side. Not necessarily locks but something.

The OFP IPsec draft code (which is just a draft, trying to use the ODP API)
refcounts SA usage and disables an SA only after it has been removed
from the SAD and after ongoing operations (tracked through the refcount)
have completed.

Because of the above, one could wonder why even have disable() in the
API. The main reason is to be able to disable offloaded inbound SA
lookup for the SA so that the SA can be safely deleted (the application
needs to be able to clean up its own SA state but cannot do it if there
are in-flight inbound operations that may still refer (through the ctx ptr)
to the application SA state).

> Should ODP
> provide API to 'lock' SA?

It would be convenient to the application if disable() could be called
at any time. But that could impose difficult synchronization needs
in the ODP implementation side and therefore we thought that it is
better to leave that to the application which may need to do some
synchronization anyway due to its own needs.

But this is definitely a good topic to ponder about. It is possible
that the current API got this wrong, but it is not easy to see how
things should be if they need to be changed.

> 
> I was thinking about smth like:
> 
> int odp_ipsec_sa_use(odp_ipsec_sa_t sa);
> 
> void odp_ipsec_sa_unuse(odp_ipsec_sa_t sa);
> 
> One would use the first function when an app has looked up an SA in its own
> tables and is going to submit it via one of the odp_ipsec_* processing calls
> (except or including SA deletion/disabling -- TBD).

I am not sure how this would help or be better than something done
entirely on the application side. This would anyway require a bit more
specification to define the semantics: for instance, when unuse() may
be called with respect to the enqueue operations and what exactly it
would mean.

Janne

> 
> One would use the second function when it doesn't want to use that SA for
> one reason or another.
> 
> --
> With best wishes
> Dmitry


Re: [lng-odp] [[RFCv2] 3/4] api: ipsec: change semantics of odp_ipsec_result function

2017-04-28 Thread Peltonen, Janne (Nokia - FI/Espoo)

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Friday, April 28, 2017 4:06 AM
> To: Dmitry Eremin-Solenikov 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [[RFCv2] 3/4] api: ipsec: change semantics of 
> odp_ipsec_result
> function
> 
> On Thu, Apr 27, 2017 at 7:50 PM, Dmitry Eremin-Solenikov <
> dmitry.ereminsoleni...@linaro.org> wrote:
> 
> > On 28.04.2017 03:45, Bill Fischofer wrote:
> > >
> > >
> > > On Thu, Apr 27, 2017 at 5:49 PM, Dmitry Eremin-Solenikov wrote:
> > >
> > > On 28.04.2017 01:46, Bill Fischofer wrote:
> > > >
> > > >
> > > > On Thu, Apr 27, 2017 at 6:51 AM, Dmitry Eremin-Solenikov wrote:
> > > >
> > > >  - Move packets from the event instead of copying them. This
> > simplifies
> > > >event handling/freeing code, which now does not have to
> > track, which
> > > >packets were copied from the event and which packets should
> > be freed.
> > > >
> > > >  - Do not require to free the event before processing packets.
> > This
> > > >allows one to copy packets from the event in small batches
> > and
> > > >process them accordingly.
> > > >
> > > >  - Freeing the event in odp_ipsec_result() leaves space for
> > optimized
> > > >implementations, where an event is actually a packet with
> > additional
> > > >metadata.
> > > >
> > > > Signed-off-by: Dmitry Eremin-Solenikov
> > > > ---
> > > >  include/odp/api/spec/ipsec.h | 22 ++
> > > >  1 file changed, 14 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/include/odp/api/spec/ipsec.h
> > > b/include/odp/api/spec/ipsec.h
> > > > index 7f43e81c..255c5850 100644
> > > > --- a/include/odp/api/spec/ipsec.h
> > > > +++ b/include/odp/api/spec/ipsec.h
> > > > @@ -1275,17 +1275,23 @@ int odp_ipsec_out_inline(const
> > > > odp_ipsec_op_param_t *op_param,
> > > >   * Get IPSEC results from an ODP_EVENT_IPSEC_RESULT event
> > > >   *
> > > >   * Copies IPSEC operation results from an event. The event
> > > must be of
> > > > - * type ODP_EVENT_IPSEC_RESULT. It must be freed before the
> > > > application passes
> > > > - * any resulting packet handles to other ODP calls.
> > > > + * type ODP_EVENT_IPSEC_RESULT. The event will be freed
> > > > automatically if
> > > > + * odp_ipsec_result() returns 0.  In all other case it must be
> > > > freed via
> > > > + * odp_event_free().
> > > >   *
> > > > - * @param[out]result  Pointer to operation result for
> > output.
> > > > Maybe NULL, if
> > > > - *application is interested only on
> > the
> > > > number of
> > > > - *packets.
> > > > + * @param[out]result  Pointer to operation result for
> > output.
> > > > May be
> > > > + *NULL, if application is interested
> > only
> > > > on the
> > > >
> > > >
> > > > Correct typo in original: is interested only *in* the ...
> > > >
> > > >
> > > > + *number of packets.
> > > >   * @param event   An ODP_EVENT_IPSEC_RESULT event
> > > >   *
> > > > - * @return Number of packets in the event. If this is larger
> > than
> > > > - * 'result.num_pkt', all packets did not fit into
> > result
> > > > struct and
> > > > - * application must call the function again with a
> > larger
> > > > result struct.
> > > > + * @return Number of packets remaining in the event.
> > > > + * @retval > 0All packets did not fit into result struct
> > and
> > > > + *application must call the function again.
> > > Packets
> > > > + *returned during previous calls will not be
> > > returned
> > > > + *again in subsequent calls.
> > > > + * @retval 0  All packets were returned. The event was
> > > freed during
> > > > + *this call. Application should not access
> > > the event
> > > > 

Re: [lng-odp] [[RFCv2] 2/4] api: ipsec: add default queue for outbound events

2017-04-28 Thread Peltonen, Janne (Nokia - FI/Espoo)


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Dmitry 
> Eremin-
> Solenikov
> Sent: Friday, April 28, 2017 3:44 AM
> To: Bill Fischofer 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [[RFCv2] 2/4] api: ipsec: add default queue for 
> outbound events
> 
> On 28.04.2017 03:33, Bill Fischofer wrote:
> >
> >
> > On Thu, Apr 27, 2017 at 6:19 PM, Dmitry Eremin-Solenikov wrote:
> >
> > On 28.04.2017 01:23, Bill Fischofer wrote:
> > >
> > >
> > > On Thu, Apr 27, 2017 at 6:51 AM, Dmitry Eremin-Solenikov wrote:
> > >
> > > If an application has passed invalid SA in async mode, there is 
> > no way
> > > to report it back to application except using default queue 
> > (which does
> > > not exist at this moment).
> > >
> > > Signed-off-by: Dmitry Eremin-Solenikov
> > > ---
> > >  include/odp/api/spec/ipsec.h | 7 +++
> > >  1 file changed, 7 insertions(+)
> > >
> > > diff --git a/include/odp/api/spec/ipsec.h 
> > b/include/odp/api/spec/ipsec.h
> > > index 4f746f67..7f43e81c 100644
> > > --- a/include/odp/api/spec/ipsec.h
> > > +++ b/include/odp/api/spec/ipsec.h
> > > @@ -190,6 +190,13 @@ typedef struct odp_ipsec_inbound_config_t {
> > >   * Configuration options for IPSEC outbound processing
> > >   */
> > >  typedef struct odp_ipsec_outbound_config_t {
> > > +   /** Default destination queue for IPSEC events
> > > +*
> > > +*  When passed invalid SA in the asynchronous mode,
> > > +*  resulting IPSEC events are enqueued into this queue.
> > > +*/
> > > +   odp_queue_t default_queue;
> > >
> > >
> > > Is an invalid SA going to be passed for any legitimate reason or is 
> > this
> > > always a programming error? If the latter then this seems unnecessary 
> > as
> > > this case falls under the "results are undefined" category. If the
> > > former, then why can't the call simply be failed before being accepted
> > > for async processing? The caller is "owed" an odp_ipsec_result_t only 
> > if
> > > the async call is successful.
> >
> > Unfortunately returning an error or 'processing stopped here' does not
> > allow application to determine the cause of that error/processing stop.

Passing invalid SAs through the API is a programming error, so one could argue
that it is not the application but the programmer who needs to figure it out
and do the debugging.

> > So, it well might resubmit the packet later. Thus it might be better to
> > accept such packet and return an error through event.
> >
> >
> > Communicating error cause is what odp_errno is designed for. If you're
> > saying an application might loop on an error, that's certainly a
> > possibility but not something that ODP can control nor would that be
> > unique to this API.
> 
> I mean that it can loop, because (at least now) API docs don't say a
> thing about possible error causes.

Since calling the ipsec API with an invalid SA is a programming error
and not something that every underlying implementation needs to be
prepared to handle, anything can happen. The application may not even
get a chance to decide whether to loop since the ODP implementation
may already crash before that.

Requiring extensive and safe validity checks for the parameters provided
through the API can incur significant performance penalty, including a
need for locking.

Even if the implementation manages to return an error for an invalid SA,
I do not think it would be reasonable for the application to retry.
Generally a failure does not go away by just trying again, and the cases
where trying again is the right thing to do are typically documented and
have a specific error code (EAGAIN, EWOULDBLOCK and similar).

If adding a specific error code in the API for certain special cases
would help, that would be a lighter-weight thing than adding a whole
new queue.
 
> Moreover, another point for default_queue. Consider the odp_ipsec_out()
> call. There is no need to stop processing on an invalid/disabled SA.

Some programming errors may be more easily detectable and may be
less catastrophic in the synchronous API than in the asynchronous API.

Or maybe not. What does invalid SA mean anyway? ODP_IPSEC_SA_INVALID
specifically or 

Re: [lng-odp] [API_NEXT v2] API: IPSEC: Updating ipsec APIs to support sNIC implementation.

2017-04-28 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Nikhil 
> Agarwal
> Sent: Friday, April 28, 2017 9:50 AM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [API_NEXT v2] API: IPSEC: Updating ipsec APIs to support 
> sNIC
> implementation.
> 
> Signed-off-by: Nikhil Agarwal 
> ---
>  include/odp/api/spec/ipsec.h | 62 
> +---
>  include/odp/api/spec/packet_io.h | 10 +++
>  2 files changed, 36 insertions(+), 36 deletions(-)
> 
> diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> index e83494d..a2624ff 100644
> --- a/include/odp/api/spec/ipsec.h
> +++ b/include/odp/api/spec/ipsec.h
> @@ -59,8 +59,10 @@ typedef enum odp_ipsec_op_mode_t {
>   /** Inline IPSEC operation
> *
> * Packet input/output is connected directly to IPSEC inbound/outbound
> -   * processing. Application uses asynchronous or inline IPSEC
> -   * operations.
> +   * processing. Application may use asynchronous IPSEC operations.
> +   * Packet post IPSEC operations are delivered to PKTIO queues. Further
> +   * classification/Hashing(inbound) will be applicaed to packet post 
> IPSEC as
> +   * defined in PKTIO configuration.
> */
>   ODP_IPSEC_OP_MODE_INLINE,

The added comment is not valid for asynchronously processed packets, which are
also allowed in the inline operating mode. So maybe the inline-specific comment
should be reformulated or located elsewhere.

> 
> @@ -226,6 +228,24 @@ typedef struct odp_ipsec_outbound_config_t {
> 
>  } odp_ipsec_outbound_config_t;
> 
> +typedef union odp_ipsec_protocols_t {
> + /** Cipher algorithms */
> + struct {
> + /** ODP_IPSEC_ESP */
> + uint32_t esp: 1;
> +
> + /** ODP_IPSEC_AH */
> + uint32_t ah : 1;
> +
> + } bit;
> +
> + /** All bits of the bit field structure
> +  *
> +  * This field can be used to set/clear all flags, or bitwise
> +  * operations over the entire structure. */
> + uint32_t all_bits;
> +} odp_ipsec_protocols_t;
> +
>  /**
>   * IPSEC capability
>   */
> @@ -264,6 +284,9 @@ typedef struct odp_ipsec_capability_t {
>*/
>   uint8_t hard_limit_sec;
> 
> + /** Supported ipsec Protocols */
> + odp_ipsec_protocols_t protocols;
> +
>   /** Supported cipher algorithms */
>   odp_crypto_cipher_algos_t ciphers;
> 
> @@ -554,21 +577,6 @@ typedef enum odp_ipsec_lookup_mode_t {
>  } odp_ipsec_lookup_mode_t;
> 
>  /**
> - * Result event pipeline configuration
> - */
> -typedef enum odp_ipsec_pipeline_t {
> - /** Do not pipeline */
> - ODP_IPSEC_PIPELINE_NONE = 0,
> -
> - /** Send IPSEC result events to the classifier.
> -  *
> -  *  IPSEC capability 'pipeline_cls' determines if pipelined
> -  *  classification is supported. */
> - ODP_IPSEC_PIPELINE_CLS
> -
> -} odp_ipsec_pipeline_t;
> -
> -/**
>   * IPSEC Security Association (SA) parameters
>   */
>  typedef struct odp_ipsec_sa_param_t {
> @@ -632,31 +640,13 @@ typedef struct odp_ipsec_sa_param_t {
>*/
>   uint32_t mtu;
> 
> - /** Select pipelined destination for IPSEC result events
> -  *
> -  *  Asynchronous and inline modes generate result events. Select where
> -  *  those events are sent. Inbound SAs may choose to use pipelined
> -  *  classification. The default value is ODP_IPSEC_PIPELINE_NONE.
> -  */
> - odp_ipsec_pipeline_t pipeline;
> -

So for asynchronously processed packets this proposal removes the possibility
of ODP distributing the packets decapsulated from one fat tunnel to multiple
queues for better parallelism.

>   /** Destination queue for IPSEC events
>*
> -  *  Operations in asynchronous or inline mode enqueue resulting events
> +  *  Operations in asynchronous mode enqueue resulting events
>*  into this queue.
>*/
>   odp_queue_t dest_queue;

So where do the inline processed packets end up? After this patch the
API would not say it.

> 
> - /** Classifier destination CoS for IPSEC result events
> -  *
> -  *  Result events for successfully decapsulated packets are sent to
> -  *  classification through this CoS. Other result events are sent to
> -  *  'dest_queue'. This field is considered only when 'pipeline' is
> -  *  ODP_IPSEC_PIPELINE_CLS. The CoS must not be shared between any pktio
> -  *  interface default CoS.
> -  */
> - odp_cos_t dest_cos;
> -
>   /** User defined SA context pointer
>*
>*  User defined context pointer associated with the SA.
> diff --git a/include/odp/api/spec/packet_io.h 
> b/include/odp/api/spec/packet_io.h
> index 8802089..0744b1a 100644
> --- a/include/odp/api/spec/packet_io.h
> +++ b/include/odp/api/spec/packet_io.h
> @@ -1136,6 +1136,16 @@ uint64_t odp_pktin_ts_res(odp_pktio_t pktio);
>   * 

Re: [lng-odp] [PATCH] linux-generic: rwlock: fix odp_rwlock_read_trylock()

2017-04-26 Thread Peltonen, Janne (Nokia - FI/Espoo)

That does not work. Since the value of cnt is not checked, the code
would happily take the lock even when a writer already has it.

Janne

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Github 
> ODP bot
> Sent: Wednesday, April 26, 2017 8:00 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [PATCH] linux-generic: rwlock: fix 
> odp_rwlock_read_trylock()
> 
> From: Dmitry Eremin-Solenikov 
> 
> odp_rwlock_read_trylock() currently works only if there are no readers
> (and writers) as it compares counter with 0. Make it actually work in
> case there are other active readers.
> 
> Fixes: https://bugs.linaro.org/show_bug.cgi?id=2974
> 
> Signed-off-by: Dmitry Eremin-Solenikov 
> ---
> /** Email created from pull request 19 (lumag:fix-rwlock)
>  ** https://github.com/Linaro/odp/pull/19
>  ** Patch: https://github.com/Linaro/odp/pull/19.patch
>  ** Base sha: 9b993a1531c94782b48292adff54a95de9d2be5c
>  ** Merge commit sha: 3af3b11b695d157ce1649955b52e8a165cc82937
>  **/
>  platform/linux-generic/odp_rwlock.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/platform/linux-generic/odp_rwlock.c 
> b/platform/linux-generic/odp_rwlock.c
> index 13c17a2..e0c46ce 100644
> --- a/platform/linux-generic/odp_rwlock.c
> +++ b/platform/linux-generic/odp_rwlock.c
> @@ -33,9 +33,9 @@ void odp_rwlock_read_lock(odp_rwlock_t *rwlock)
> 
>  int odp_rwlock_read_trylock(odp_rwlock_t *rwlock)
>  {
> - uint32_t zero = 0;
> + uint32_t cnt = odp_atomic_load_u32(&rwlock->cnt);
> 
> - return odp_atomic_cas_acq_u32(&rwlock->cnt, &zero, (uint32_t)1);
> + return odp_atomic_cas_acq_u32(&rwlock->cnt, &cnt, cnt + 1);
>  }
> 
>  void odp_rwlock_read_unlock(odp_rwlock_t *rwlock)



Re: [lng-odp] [API-NEXT PATCH 8/8] linux-gen: time: use hw time counter when available

2017-04-26 Thread Peltonen, Janne (Nokia - FI/Espoo)
> > odp_sched_latency currently uses clock_gettime. It is my understanding
> > that clock_gettime does not have the overhead of the system call. Can
> > you elaborate more on the 'improved significantly' part?
> >
> 
> clock_gettime() uses the same TSC, but when you profile it with perf you can 
> see tens of
> kernel functions including system call entry, RCU maintenance, etc.

clock_gettime() does not use the vdso implementation without syscall overhead
on x86 if the clock id is CLOCK_MONOTONIC_RAW, as it seems to be in ODP. I think
new enough kernels do support CLOCK_MONOTONIC_RAW in the vdso for arm64, though.

CLOCK_MONOTONIC is supported in vdso in x86 and would not cause syscall
overhead provided that the kernel time source is tsc (which it often is,
but not always (e.g. in some VMs)).

Janne




Re: [lng-odp] [PATCH v2] linux-generic: crypto: properly handle errors in packet copy

2017-04-24 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> 
> There is connected interesting question, which should be though wrt. all
> 'packet-consuming' functions. Should such functions always consume and
> free incoming packet? IOW:
> 
>  - odp_crypto_operation() returned -1. Should the app free inbound
> packet afterwards?

The API spec is not crystal clear about it, but the way I read it is
that on failure the crypto operation does nothing to the input packets.
So the input packets continue to be owned by the application, which needs
to take care of them (maybe free them).

> 
>  - odp_ipsec_*() returned -1. Should inbound packets be freed by an app?
> 

I think the IPsec API is slightly more clear since it says that a positive
return value indicates the number of packets consumed by the operation.
So on error none of the input packets were consumed. In that case they
need to be freed or otherwise taken care of by the application which
still owns them.

> Would it be sane for the app to assume that it should re-read the inbound
> packet/packet array from params(!) and free all non-INVALID packets (or
> requeue them, etc.)?

When the IPsec processing function returns a negative status, none
of the input packets were consumed. ODP does not update the operation
parameter struct and the input packet table in there, but there is also
no need to check the status packet-by-packet in that case. Per-packet
errors can occur only when the operation as a whole succeeds. Per-packet
errors are then indicated in the operation result struct.

Janne




Re: [lng-odp] [API-NEXT PATCH v2 00/16] A scalable software scheduler

2017-04-19 Thread Peltonen, Janne (Nokia - FI/Espoo)

Ola Liljedahl [mailto:ola.liljed...@linaro.org] wrote:
> 
> On 10 April 2017 at 10:56, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> > Hi,
> >
> > Ola Liljedahl <mailto:ola.liljed...@linaro.org> wrote:
> >> Peltonen, Janne (Nokia - FI/Espoo) <mailto:janne.pelto...@nokia.com> wrote:
> >> > In an IPsec GW (as a use case example) one might want to do all
> >> > stateless processing (like ingress and egress IP processing and the
> >> > crypto operations) in ordered contexts to get parallelism, but the
> >> > stateful part (replay-check and sequence number generation) in
> >> > an atomic context (or holding an ordered lock) so that not only
> >> > the packets are output in ingress order but also their sequence
> >> > numbers are in the same order.
> >>
> >> To what extent does IPsec sequence number ordering have to equal actual
> >> transmission order? If an IPsec GW preserves ingress-to-egress packet 
> >> order,
> >> does it matter that this order might not be the same as the IPsec sequence
> >> number order? If IPsec SN's are not used for reordering, just for replay
> >> protection, I don't see why the two number series have to match.
> >
> > The sequence numbers of course do not need to fully match the transmission
> > order because the anti-replay window mechanism can tolerate out-of-sequence
> > packets (caused either by packet reordering in the network or by sequence
> > number assignment not quite following the transmission order).
> >
> > But assigning sequence numbers out of order with respect to the packet
> > order (which hopefully stays the same between the ingress and egress of
> > an SGW) eats into the out-of-sequence tolerance budget (i.e. the window
> > size at the other end) and leaves less of the budget for actual
> > reordering in the network.
> >
> > Whether out-of-sequence sequence number assignment is ok or problematic
> > depends on the peer configuration, network, possible QoS induced
> > packet reordering and the magnitude of the possible sequence number
> > (not packet) reordering with respect to transmission order in the sender.
> >
> > Often the antireplay window is something like 32 or 64 packets
> That seems like very small window(s). I understand the simplicity enabled
> by such small windows but is it really enough for modern high speed networks
> with multiple L2/L3 switch/router hops (possibly with some link aggregation
> in there as well).

Maybe it is, maybe it isn't. But in the spirit of being conservative
in what one sends to the network, it would not be so nice to have an
implementation that imposes additional requirements, like bigger
windows, on the peers compared to an implementation that does not.

> 
> > and maybe
> > not all of that can be used by the IPsec sender for relaxed ordering of
> > the sequence number assignment. One issue is that the size of the replay
> > window is not negotiated so the sender cannot tell the receiver that
> > a bigger window than normal is needed.
> The receiver should be able to adapt the size of antireplay window by 
> monitoring
> the amount of (supposedly) stale packets (SN's). Has such a design been tried?

I am not aware of any such implementation (which does not mean that
they do not exist).

> Do you have any "quality" requirements here, how large proportion of packets 
> is
> allowed to be dropped due to limited size of antireplay window? I assume there
> are higher level SLA's that control packet loss, perhaps it is up to
> the service provider
> to use that packet loss budget as it sees fit.
> 
> >
> >> > That said, some might argue that IPsec replay window can take care
> >> > not only of packet reordering in the network but also of reordering
> >> > inside an IPsec GW and therefore the atomic context (or ordered lock)
> >> > is not necessarily needed in all implementations.
> >>
> >> Do you mean that the replay protection also should be responsible for
> >> end-to-end (IPsec GW to GW) order restoration?
> >
> > No, I do not mean that and I think it is not in general correct for
> > an IPsec GW to reorder received packets to the sequence number order.
> If order restoration adds latency, it can definitively do harm. And even if it
> could be done without adding latency (e.g. in some queue), we don't know if
> the SN order is the "real" order and order restoration actually is beneficial.
> 
> >
> > What I mean (but formulated sloppily) is that the window mechanism of
> >

Re: [lng-odp] [API-NEXT][RFC][rebased] linux-gen: ipsec: draft IPsec implementation

2017-04-18 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> >> +   /* Setup parameters and call crypto library to create session */
> >> +   crypto_param.op = (ODP_IPSEC_DIR_INBOUND == param->dir) ?
> >> +   ODP_CRYPTO_OP_DECODE :
> >> +   ODP_CRYPTO_OP_ENCODE;
> >> +   crypto_param.auth_cipher_text = 1;
> >> +
> >> +   // FIXME: is it possible to use ASYNC crypto with ASYNC IPsec?
> >
> > Did you mean to say SYNC crypto here, since that's what you're doing?
> 
> No. I meant what I said. Using ASYNC crypto events to to power ASYNC
> IPsec events. Unfortunately it doesn't seem possible ATM.

I also thought SYNC crypto was meant there in the FIXME comment.

Using ASYNC crypto to provide ASYNC IPsec is clearly possible. Maybe
it cannot be done as efficiently as one might want, but one could
e.g. do an async crypto-op, and then upon its completion create and
queue an IPsec completion event. Event queuing would occur twice.

However, since the IPsec implementation is part of the ODP implementation,
it does not necessarily have to restrict itself to the public ODP
crypto API but can use internal APIs too, which may make things easier.

> >> +   /* Generate an IV */
> >> +   if (ipsec_sa->esp_iv_len) {
> >> +   crypto_param.iv.data = ipsec_sa->iv;
> >> +   crypto_param.iv.length = 
> >> odp_random_data(crypto_param.iv.data,
> ipsec_sa->esp_iv_len, 1);
> >
> > The odp_random_data() API has been changed. This call should now be:
> > crypto_param.iv.length = odp_random_data(crypto_param.iv.data,
> > ipsec_sa->esp_iv_len, ODP_RANDOM_CRYPTO);

Note that for AES-GCM one cannot use a random IV, since IV values must
never be reused in GCM. With AES-GCM a counter or the like would work.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Dmitry 
> Eremin-
> Solenikov
> Sent: Friday, April 14, 2017 3:00 PM
> To: Bill Fischofer 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [API-NEXT][RFC][rebased] linux-gen: ipsec: draft IPsec
> implementation
> 
> Bill,
> 
> Thanks a lot for the review! I will send updated patches in few hours
> hopefully.
> 
> On 14.04.2017 00:42, Bill Fischofer wrote:
> > Next version should be marked API-NEXT, whether or not it is still an
> 
> Yes, that was an error from my side.
> 
> > On Tue, Apr 11, 2017 at 8:41 AM, Dmitry Eremin-Solenikov
> >  wrote:
> >> For now it's only a preview with the following limitation:
> >>  - No inline processing support
> >>  - No SA lookups
> >>  - Only IPv4 support
> >>  - No tunnel support
> >>  - No header modification according to RFCs
> >>
> >> Signed-off-by: Dmitry Eremin-Solenikov 
> >> ---
> >>  platform/linux-generic/include/odp_internal.h  |4 +
> >>  .../linux-generic/include/odp_ipsec_internal.h |   71 ++
> >>  platform/linux-generic/include/protocols/ip.h  |   52 +
> >>  platform/linux-generic/odp_event.c |5 +
> >>  platform/linux-generic/odp_init.c  |   13 +
> >>  platform/linux-generic/odp_ipsec.c | 1172 
> >> +++-
> >>  6 files changed, 1287 insertions(+), 30 deletions(-)
> >>  create mode 100644 platform/linux-generic/include/odp_ipsec_internal.h
> >>
> >> diff --git a/platform/linux-generic/include/odp_internal.h 
> >> b/platform/linux-
> generic/include/odp_internal.h
> >> index 05c8a422..fd7848ac 100644
> >> --- a/platform/linux-generic/include/odp_internal.h
> >> +++ b/platform/linux-generic/include/odp_internal.h
> >> @@ -71,6 +71,7 @@ enum init_stage {
> >> CLASSIFICATION_INIT,
> >> TRAFFIC_MNGR_INIT,
> >> NAME_TABLE_INIT,
> >> +   IPSEC_INIT,
> >> MODULES_INIT,
> >> ALL_INIT  /* All init stages completed */
> >>  };
> >> @@ -130,6 +131,9 @@ int _odp_ishm_init_local(void);
> >>  int _odp_ishm_term_global(void);
> >>  int _odp_ishm_term_local(void);
> >>
> >> +int odp_ipsec_init_global(void);
> >> +int odp_ipsec_term_global(void);
> >> +
> >>  int _odp_modules_init_global(void);
> >>
> >>  int cpuinfo_parser(FILE *file, system_info_t *sysinfo);
> >> diff --git a/platform/linux-generic/include/odp_ipsec_internal.h 
> >> b/platform/linux-
> generic/include/odp_ipsec_internal.h
> >> new file mode 100644
> >> index ..c7620b88
> >> --- /dev/null
> >> +++ b/platform/linux-generic/include/odp_ipsec_internal.h
> >> @@ -0,0 +1,71 @@
> >> +/* Copyright (c) 2017, Linaro Limited
> >> + * All rights reserved.
> >> + *
> >> + * SPDX-License-Identifier:BSD-3-Clause
> >> + */
> >> +
> >> +/**
> >> + * @file
> >> + *
> >> + * ODP internal IPsec routines
> >> + */
> >> +
> >> +#ifndef ODP_IPSEC_INTERNAL_H_
> >> +#define ODP_IPSEC_INTERNAL_H_
> >> +
> >> +#ifdef __cplusplus
> >> +extern "C" {
> >> +#endif
> >> +
> >> +#include 
> >> +#include 
> >> +
> >> +/** @ingroup odp_ipsec
> >> + *  @{
> >> + */
> >> +

Re: [lng-odp] [API-NEXT][RFC][rebased] linux-gen: ipsec: draft IPsec implementation

2017-04-18 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Some comments on the patch:

> >>  - No tunnel support
> >>  - No header modification according to RFCs

There is some tunneling and header modification code in the patch,
so it is not easy to tell which parts of the code are supposed to work
and which are not. Other limitations include: there is no replay
protection, and the IV is not generated per packet but is a constant.

The code does not implement the two-phase SA deletion of the latest
IPsec API, in which an application has to first call
odp_ipsec_sa_disable() and later, when disabling has completed,
odp_ipsec_sa_destroy().

The code performs potentially unaligned accesses when accessing
and modifying packet headers. I wonder if that is ok in ODP or
whether the linux generic implementation needs to support targets
that do not support unaligned accesses.

The code makes assumptions that the relevant packet headers
are in one contiguous segment of a packet. Even if that assumption
held for packet received from the network, it does not necessarily
hold for packets processed by the application. Related to this,
the success of odp_packet_pull_head() etc. is not checked, but such
calls cannot be assumed to always succeed.

There is also an assumption that the L3 header is in the same
segment as the start of the packet (see the pointer arithmetic
in odp_ipsec_in_single()).

There is an assumption that the size of an IP header equals
sizeof(_odp_ipv4hdr_t) which is not true when the IP packet
has options.

SA creation does not work for AES-GCM, which requires that auth
algo is set to ODP_AUTH_ALG_AES_GCM and cipher algo is set to
ODP_CIPHER_ALG_AES_GCM. I suppose ICV length in that case would
always be 16 since the IPsec API lacks a way to specify it (it
assumes that the algorithm fully determines the ICV length which
is not true for AES-GCM).

The meaning of ipsec_sa->reserved seems to vary. Sometimes 1
means reserved (as in: in use) and sometimes 0.

The comment of odp_ipsec_alloc_pkt_ctx() does not quite match
the code which does not associate the ctx with a packet buffer.

As discussed in one arch meeting, checking whether an SA handle
passed by the application is valid is unnecessary.

odp_ipsec_out_single() zeroes several packet fields for AH
processing but does not restore the fields anywhere.

The content of user area is not copied from input packets
to output packets.

The IPv4 ID field generation for the outer IP header of tunneled
packets is weird. There is no point in randomizing its value
at wraparound.

A few of the comments are written as if AH is the authentication
and ESP is the encryption of an IPsec transformation, even
though AH versus ESP is the protocol selection and then each
protocol can include authentication and encryption (well, except
that AH cannot have encryption). The code could also be written
so that AH and ESP are mutually exclusive for one SA (and
the IPsec API does not support AH+ESP in a single operation).

odp_ipsec_result() does not work according to the API spec,
but probably this is because the patch was not meant for the
latest API-NEXT but on top of other patches that modified
the API?

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Dmitry 
> Eremin-
> Solenikov
> Sent: Tuesday, April 11, 2017 4:42 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [API-NEXT][RFC][rebased] linux-gen: ipsec: draft IPsec 
> implementation
> 
> For now it's only a preview with the following limitation:
>  - No inline processing support
>  - No SA lookups
>  - Only IPv4 support
>  - No tunnel support
>  - No header modification according to RFCs
> 
> Signed-off-by: Dmitry Eremin-Solenikov 
> ---
>  platform/linux-generic/include/odp_internal.h  |4 +
>  .../linux-generic/include/odp_ipsec_internal.h |   71 ++
>  platform/linux-generic/include/protocols/ip.h  |   52 +
>  platform/linux-generic/odp_event.c |5 +
>  platform/linux-generic/odp_init.c  |   13 +
>  platform/linux-generic/odp_ipsec.c | 1172 
> +++-
>  6 files changed, 1287 insertions(+), 30 deletions(-)
>  create mode 100644 platform/linux-generic/include/odp_ipsec_internal.h
> 
> diff --git a/platform/linux-generic/include/odp_internal.h b/platform/linux-
> generic/include/odp_internal.h
> index 05c8a422..fd7848ac 100644
> --- a/platform/linux-generic/include/odp_internal.h
> +++ b/platform/linux-generic/include/odp_internal.h
> @@ -71,6 +71,7 @@ enum init_stage {
>   CLASSIFICATION_INIT,
>   TRAFFIC_MNGR_INIT,
>   NAME_TABLE_INIT,
> + IPSEC_INIT,
>   MODULES_INIT,
>   ALL_INIT  /* All init stages completed */
>  };
> @@ -130,6 +131,9 @@ int _odp_ishm_init_local(void);
>  int _odp_ishm_term_global(void);
>  int _odp_ishm_term_local(void);
> 
> +int odp_ipsec_init_global(void);
> +int odp_ipsec_term_global(void);
> +
>  int _odp_modules_init_global(void);
> 
>  int 

[lng-odp] IPsec API: signaling lifetime expiration

2017-04-11 Thread Peltonen, Janne (Nokia - FI/Espoo)
[Moving the discussion to the ODP list]

> > Bogdan: I wonder why ODP did not considered to signal SA expiry as a status 
> > event (like
> ODP_IPSEC_STATUS_SA_DISABLE)... meaning, what if there is no packet?
> 
> This is an ODP question and that discussion should be moved to the ODP
> mailing list. The answer is that IPsec exceptions have yet to be
> defined as the IPsec API is still a working definition and not yet
> finalized. This is something we should finalize at next month's design
> sprint.
>

Commenting from the point-of-view of the current API draft:

Soft expiry for byte or packet based lifetimes does not indicate
any error in processing the packet and therefore using the normal
per-SA completion queue is desired since then packet order is
reserved and different SAs can use different queues for more
parallelism.

For time based soft lifetimes the current mechanism is indeed not
very useful as the expiration can be signaled only if there was
a packet for the SA *). Therefore the application would need to
keep track of time based lifetimes anyway.

I would still keep byte and packet based lifetime expirations in the
operation result since they are related to the processed packets,
but maybe soft time based lifetimes should be moved to the status
event (or even removed altogether?).

Even if the expiration of hard time based lifetimes could be
signaled through the status event, the status is needed also
in the operation result for the case an attempt is made to
use an expired SA.

*) Actually, time based life time expiration could be signaled
through the operation result e.g. by generating an event (or
one additional result to an event) that does not correspond to
any input packet and does not have a valid packet pointer.
But if that was the intention, the API would have said something
about it. 

Janne




Re: [lng-odp] [API-NEXT PATCH v2 00/16] A scalable software scheduler

2017-04-10 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Ola Liljedahl <mailto:ola.liljed...@linaro.org> wrote:
> Peltonen, Janne (Nokia - FI/Espoo) <mailto:janne.pelto...@nokia.com> wrote:
> > In an IPsec GW (as a use case example) one might want to do all
> > stateless processing (like ingress and egress IP processing and the
> > crypto operations) in ordered contexts to get parallelism, but the
> > stateful part (replay-check and sequence number generation) in
> > an atomic context (or holding an ordered lock) so that not only
> > the packets are output in ingress order but also their sequence
> > numbers are in the same order.
>
> To what extent does IPsec sequence number ordering have to equal actual
> transmission order? If an IPsec GW preserves ingress-to-egress packet order,
> does it matter that this order might not be the same as the IPsec sequence
> number order? If IPsec SN's are not used for reordering, just for replay
> protection, I don't see why the two number series have to match.

The sequence numbers of course do not need to fully match the transmission
order because the anti-replay window mechanism can tolerate out-of-sequence
packets (caused either by packet reordering in the network or by sequence
number assignment not quite following the transmission order).

But assigning sequence numbers out of order with respect to the packet
order (which hopefully stays the same between the ingress and egress of
an SGW) eats into the out-of-sequence tolerance budget (i.e. the window
size at the other end) and leaves less of the budget for actual
reordering in the network.

Whether out-of-sequence sequence number assignment is ok or problematic
depends on the peer configuration, network, possible QoS induced
packet reordering and the magnitude of the possible sequence number
(not packet) reordering with respect to transmission order in the sender.

Often the antireplay window is something like 32 or 64 packets and maybe
not all of that can be used by the IPsec sender for relaxed ordering of
the sequence number assignment. One issue is that the size of the replay
window is not negotiated so the sender cannot tell the receiver that
a bigger window than normal is needed.

> > That said, some might argue that IPsec replay window can take care
> > not only of packet reordering in the network but also of reordering
> > inside an IPsec GW and therefore the atomic context (or ordered lock)
> > is not necessarily needed in all implementations.
>
> Do you mean that the replay protection also should be responsible for
> end-to-end (IPsec GW to GW) order restoration?

No, I do not mean that and I think it is not in general correct for
an IPsec GW to reorder received packets to the sequence number order.

What I mean (but formulated sloppily) is that the window mechanism of
replay protection can tolerate out-of-sequence sequence numbers to some
extent even when the cause is not the network but the sending IPsec GW.

So, depending on the implementation and on the circumstances, one might
want to ensure that sequence number gets assigned in the transmission
order or one might decide not to worry about it and let the window
mechanism in the receiver handle it.

> It would be great if each QoS class would have its own IPsec SA but
> is that always the case?

Yes it would and that is what the RFC suggests but it is not often
the case. And since some antireplay window needs to be left for
QoS-caused out-of-order tolerance, it may not always be a good idea
to have the sender essentially consume a big chunk of the window even
before the IPsec packets enter the network.

Janne




Re: [lng-odp] [API-NEXT] API: IPSEC: Updating ipsec APIs to support sNIC implementation.

2017-04-07 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> >  Shally - enque in order-of-completion or in-order-submission? What does 
> > this mean in-
> > context of ipsec?
> > In general, there is bit of confusion when we are using terms async in ODP 
> > context here.
> > It imply that queue is used to output events. an async implementation can 
> > queue events in
> > order of their completion (which may be different from order of their 
> > submission).
> > If we are queuing events in-order-of submission, then it is actually a 
> > "Synchronous queued
> > operation" (as we block outputs until previous ones are complete).
> > Can we make it bit more explicit.
> 
> It is order of submission. I  think it is mentioned in some API 
> documentation, will update
> it here as well.

My understanding was that the completion queuing order is neither the
submission order nor operation completion order, but the packet input
order with respect to the ordered context of the thread(s) submitting
the packet to IPsec (i.e. the order packets arrived in the system if
there is a chain of ordered contexts before IPsec enqueue).

But now that I look at the latest IPsec API I am not sure it reads that
way. Maybe I am mistaken or maybe the API doc is just not clear enough.

Anyway, I think original packet order is the most useful alternative
since then I can construct the application data path like this:

pktin->ordered queue->app processing->ipsec enque ...
ipsec event->ordered queue->app processing->ordered output queue->pktout

Original packet order would be maintained through the whole chain
without the application having to take any atomic context.

If IPsec offload only respected the submission order, then maintaining
the original packet order would require switching to an atomic context
(going through an atomic queue) or taking an ordered lock before IPsec
operation submission.

Janne




Re: [lng-odp] [API-NEXT PATCH] abi: event: add ODP_EVENT_IPSEC_STATUS

2017-04-07 Thread Peltonen, Janne (Nokia - FI/Espoo)
I think it is a good suggestion. But I understood that in C++ the
trailing comma was not allowed until C++11, so I wonder if it is
ok to add it in ODP (in C it has been ok since C99).

Perhaps someone can just add the comma when pushing the patch?

Janne

> -Original Message-
> From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
> Sent: Friday, April 07, 2017 3:53 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
> Cc: lng-odp-forward <lng-odp@lists.linaro.org>
> Subject: Re: [lng-odp] [API-NEXT PATCH] abi: event: add ODP_EVENT_IPSEC_STATUS
> 
> On Fri, Apr 7, 2017 at 4:54 AM, Janne Peltonen <janne.pelto...@nokia.com> 
> wrote:
> >
> > Update ABI spec with the new IPsec event type.
> >
> > Signed-off-by: Janne Peltonen <janne.pelto...@nokia.com>
> 
> Reviewed-by: Bill Fischofer <bill.fischo...@linaro.org>
> 
> > ---
> >  include/odp/arch/default/api/abi/event.h | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/odp/arch/default/api/abi/event.h
> b/include/odp/arch/default/api/abi/event.h
> > index ab14157..87220d6 100644
> > --- a/include/odp/arch/default/api/abi/event.h
> > +++ b/include/odp/arch/default/api/abi/event.h
> > @@ -29,7 +29,8 @@ typedef enum odp_event_type_t {
> > ODP_EVENT_PACKET   = 2,
> > ODP_EVENT_TIMEOUT  = 3,
> > ODP_EVENT_CRYPTO_COMPL = 4,
> > -   ODP_EVENT_IPSEC_RESULT = 5
> > +   ODP_EVENT_IPSEC_RESULT = 5,
> > +   ODP_EVENT_IPSEC_STATUS = 6
> 
> It's perfectly legal C for the last element in an enum to be
> comma-terminated and that makes it easier to keep extending the enum
> with additional entries since the previous entry doesn't have to keep
> being modified to add a comma, so I'd suggest adopting that convention
> here. Other than that nit, this looks fine.
> 
> >  } odp_event_type_t;
> >
> >  /**
> > --
> > 2.5.0
> >


Re: [lng-odp] [API-NEXT PATCH v2 00/16] A scalable software scheduler

2017-04-07 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

On Thu, Apr 6, 2017 at 1:46 PM, Bill Fischofer  
wrote:
> On Thu, Apr 6, 2017 at 1:32 PM, Ola Liljedahl  
> wrote:
> > On 6 April 2017 at 13:48, Jerin Jacob  
> > wrote:

> >> We see ORDERED->ATOMIC as main use case for basic packet forward.Stage
> >> 1(ORDERED) to process on N cores and Stage2(ATOMIC) to maintain the ingress
> >> order.
> > Doesn't ORDERED scheduling maintain the ingress packet order all the
> > way to the egress interface? A least that's my understanding of ODP
> > ordered queues.
> > From an ODP perspective, I fail to see how the ATOMIC stage is needed.

For basic IP forwarding I also do not see why an atomic stage would be
needed, but for stateful things like IPsec or some application specific
higher layer processing the situation can be different.

At the risk of stating the obvious: Ordered scheduling maintains ingress
order when packets are placed in the next queue (toward the next pipeline
stage or to pktout), but it allows parallel processing of packets of the
same flow between the points where order is maintained. To guarantee packet
processing in the ingress order in some section of code, the code needs
to be executed in an atomic context or protected using an ordered lock.

> As pointed out earlier, ordered locks are another option to avoid a
> separate processing stage simply to do in-sequence operations within
> an ordered flow. I'd be curious to understand the use-case in a bit
> more detail here. Ordered queues preserve the originating queue's
> order, however to achieve end-to-end ordering involving multiple
> processing stages requires that flows traverse only ordered or atomic
> queues. If a parallel queue is used ordering is indeterminate from
> that point on in the pipeline.

Exactly.

In an IPsec GW (as a use case example) one might want to do all
stateless processing (like ingress and egress IP processing and the
crypto operations) in ordered contexts to get parallelism, but the
stateful part (replay-check and sequence number generation) in
an atomic context (or holding an ordered lock) so that not only
the packets are output in ingress order but also their sequence
numbers are in the same order.

That said, some might argue that IPsec replay window can take care
not only of packet reordering in the network but also of reordering
inside an IPsec GW and therefore the atomic context (or ordered lock)
is not necessarily needed in all implementations.

Janne




Re: [lng-odp] odp_ipsec_result behaviour

2017-04-04 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,


>  - odp_ipsec_result() would change buffer type from IPSEC_RESULT to
> PACKET, extract ipsec_op_result, etc. After this operation original
> event is unsuitable for further operations.

An added complication with ODP IPsec is fragmentation offload that can cause
multiple result packets for one input packet in outbound IPsec offload.

> Ugh. An application does not have a way to determine the amount of
> memory it needs (well, unless it depends on 1:1 event <-> packet
> correspondence as ipsec_offload does).

Currently there is no 1:1 correspondence since one input packet
can cause multiple output packets and in the current API multiple
such sets of fragmented packets may come through the same event.

Note also that an application using the IPsec API needs to know how many
result packets it gets out of certain input packet to be able to track
SA usage and the operations in flight. The current API promises that
all the result fragments of an input packet are delivered in the same
event in consecutive slots in a way that allows the necessary bookkeeping
by the application.

Janne




Re: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC support

2017-03-23 Thread Peltonen, Janne (Nokia - FI/Espoo)
> > This is part of odp_ipsec_sa_param_t, so lookup config is per SA.
> 
> Does that mean that both SPI lookup and DEST ADDR lookup are mandatory?
> Is there a reason lookup mode is per SA?

The lookup mode is per SA to make it possible to have SAs for which
not ODP but the application performs the lookup (e.g. currently for
multicast IPsec SAs that do not have unique SPIs and require source
address lookup too, or for any other SAs with unusual lookup rules)
and to use look-aside IPsec operations with an explicit SA to do the
IPsec transforms.

Thus, I think at minimum the ODP_IPSEC_LOOKUP_DISABLED needs to
be per-SA even if the SPI versus SPI+dstaddr selection would be
global.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bala 
> Manoharan
> Sent: Thursday, March 23, 2017 4:42 PM
> To: Savolainen, Petri (Nokia - FI/Espoo) 
> 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC 
> support
> 
> Regards,
> Bala
> 
> 
> On 23 March 2017 at 17:40, Savolainen, Petri (Nokia - FI/Espoo)
>  wrote:
> >> >
> >> >  /**
> >> > @@ -381,11 +547,29 @@ typedef enum odp_ipsec_lookup_mode_t {
> >> > ODP_IPSEC_LOOKUP_DISABLED = 0,
> >> >
> >> > /** Inbound SA lookup is enabled. Used SPI values must be
> >> unique. */
> >> > -   ODP_IPSEC_LOOKUP_IN_UNIQUE_SA
> >> > +   ODP_IPSEC_LOOKUP_IN_UNIQUE_SPI,
> >> > +
> >> > +   /** Inbound SA lookup is enabled. Lookup matches both SPI and
> >> > + * destination IP address. Used SPI values must be unique. */
> >> > +   ODP_IPSEC_LOOKUP_IN_DSTADDR_UNIQUE_SPI
> >> >
> >> >  } odp_ipsec_lookup_mode_t;
> >>
> >> odp_ipsec_lookup_mode_t is not added either in odp_ipsec_config() or
> >> in odp_ipsec_capability().
> >> I believe this should be added in both these struct?
> >
> >
> > This is part of odp_ipsec_sa_param_t, so lookup config is per SA.
> 
> Does that mean that both SPI lookup and DEST ADDR lookup are mandatory?
> Is there a reason lookup mode is per SA?
> 
> >
> >
> >>
> >> >
> >> >
> >> >  /**
> >> > + * Result event destination
> >> > + */
> >> > +typedef enum odp_ipsec_dest_mode_t {
> >> > +   /** Destination for IPSEC result events is a queue. */
> >> > +   ODP_IPSEC_DEST_QUEUE = 0,
> >> > +
> >> > +   /** Destination for IPSEC result events is the classifier.
> >> > +*  IPSEC capability 'cls_inline' determines if inline
> >> classification
> >> > +*  is supported. */
> >> > +   ODP_IPSEC_DEST_CLS
> >> > +
> >> > +} odp_ipsec_dest_mode_t;
> >>
> >> Should'nt we add a dest_mode ODP_IPSEC_DEST_PKTIO for outbound inline
> >> when the packet is sent out through interface directly.
> >
> > This selection is for result events. For output direction, queue are the 
> > only option
> (for events). Queue vs inline pktout is selected by odp_ipsec_out_enq() vs
> odp_ipsec_out_inline(). Selection of output pktio (or TM queue in the future) 
> is
> parameters to odp_ipsec_out_inline().
> 
> Yes. But the odp_ipsec_dest_mode_t is available in SA params and if
> the SA is configured in outbound direction and linked to the pktio
> then the configuration of dest_mode cannot be ODP_IPSEC_DEST_QUEUE.
> 
> >
> > -Petri
> >
> >
> >


Re: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC support

2017-03-23 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

A few quick comments below. Petri will probably comment the other points.

Janne

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Nikhil 
> Agarwal
> Sent: Thursday, March 23, 2017 2:02 PM
> To: Petri Savolainen ; lng-odp@lists.linaro.org
> Subject: Re: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC 
> support
> 
> Few initial comments:
> 
> - Mixing inline and async processing must maintain the same SA context in HW
> and in the implementation. In that case, we should shift to async SA creation
> so that while creating an SA, the HW interface can return the context to SW
> and async processing can be done for that context.
> - Why do we want to add different levels of header retention? We could just
> have retain header or discard.
> - Why is parsing optional and layer-wise? Parsing should be the same as on
> the pktio interface. Even in the lookaside APIs L3 parsing was mandatory.
> Otherwise the application will have to do the parsing on its own. Moreover,
> the application will never know the L3 protocol if there are no parse
> results.
> - In case outer headers are retained in the packet, hardware will parse the
> packet based on the outer headers, not the inner ones. The same goes for
> classification.

Outer headers are not provided _in the packet_. The retained outer headers
are not part of the packet data but are stored with the packet in some
implementation-defined way.

Therefore parsing and classification are done based on the inner headers (or
rather, the packet as it appears after IPsec decapsulation).

> - Checksum configuration (for inbound as well as outbound) can be on a
> per-flow basis, as requirements can be different for different flows.

It depends on the definition of flow here. Both locally generated and
forwarded traffic may be sent out through the same outbound SA. For forwarded
traffic L4 checksums should not be (re)generated, but for locally originated
traffic they should. Also, some UDP protocols, like VxLAN, do not use
checksums.

So, I think checksum generation should be selectable per packet. This is not
part of the latest IPsec patch since it probably needs to be done in the
generic packet API. In fact, packet_io.h currently says: "Application may use
a packet metadata flag to disable checksum insertion per packet bases." Such
a feature is just missing.

> - Support for ESP/AH and Tunnel/Transport mode should be exposed via 
> capability.
> - Inline SA lookup should also support 3 tuple lookups (SIP, DIP, SPI).

Currently, lookups that need the source IP would have to be done by the
application in SW. Support in the inline IPsec API can be added later, but I
am not sure if it is needed there right away (it would only be needed for
multicast IPsec traffic anyway, right?).

Right now the API assumes that inbound SAs have unique SPIs, which can make
things easier for implementations. With source address matching in addition
to the destination address and SPI you would presumably have non-unique SPIs
too.

> - Why there are separate lookup prams in SA, when SA tunnel params already 
> include SIP and
> DIP?

Transport mode SAs do not have the IP addresses.

> - What about enabling HASH distribution post IPSEC for FAT tunnel use case.
> - In per-SA params, if dest_mode is ODP_IPSEC_DEST_CLS, why is output to a
> new CoS and not to the default CoS of that pktio? Are we saying that we will
> have two levels of classification on that pktio, one before IPsec and one
> after?

One reason is that separate CoS allows one to distinguish IPsec
processed packets from other packets in the classification rules.

> - Who holds the responsibility of freeing the l2hdr in
> odp_ipsec_inline_op_param_t post transmission? Is it mandated to be inside
> the packet? Then why not just use l2_offset?

The implementation copies the l2 header data to its internal storage during
the enqueue call. ODP implementation will not dereference the provided pointer
after the enqueue call returns, so the application is then free to do anything
with it. 
 
> 
> Regards
> Nikhil
> 
> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Petri 
> Savolainen
> Sent: Tuesday, March 21, 2017 7:47 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC 
> support
> 
> Added support for inline IPSEC processing on packet input and output. Inline 
> mode IPSEC
> and traffic manager cannot be enabled
> (currently) on the same pktio interface.
> 
> Signed-off-by: Petri Savolainen 
> ---
>  include/odp/api/spec/ipsec.h | 348 
> ---
>  include/odp/api/spec/packet_io.h |  32 
>  2 files changed, 355 insertions(+), 25 deletions(-)
> 
> diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h index
> e951e49..23d02cf 100644
> --- a/include/odp/api/spec/ipsec.h
> +++ 

Re: [lng-odp] [API-NEXT PATCH v2 1/3] api: ipsec: extend lookaside API

2017-03-23 Thread Peltonen, Janne (Nokia - FI/Espoo)

> >  /**
> > + * Disable IPSEC SA
> > + *
> > + * Application must use this call to disable a SA before destroying it. 
> > The call
> > + * marks the SA disabled, so that IPSEC implementation stops using it. For
> > + * example, inbound SPI lookups will not match any more. Application must
> > + * stop providing the SA as parameter to new IPSEC input/output operations
> > + * before calling disable. Packets in progress during the call may still 
> > match
> > + * the SA and be processed successfully.
> > + *
> > + * When in synchronous operation mode, the call will return when it's 
> > possible
> > + * to destroy the SA. In asynchronous mode, the same is indicated by an
> > + * ODP_EVENT_IPSEC_STATUS event sent to the queue specified for the SA.
> > 
> > During synchronous operation mode, it is possible that this call is
> > executing on one core while the SA is being used in an IPsec operation by
> > some other core, so this call might take considerable CPU cycles waiting
> > for other cores to finish.
> > It might be easier if we dictate that the result is returned using the
> > status event even for synchronous mode, or we could add a new API
> > odp_ipsec_sa_use() which specifies whether the SA has been disabled or not.
> 
> 
> This is included into the spec already: " Application must stop providing the 
> SA as
> parameter to new IPSEC input/output operations before calling disable."
> 
> So, first the application needs to synchronize between all threads, so that
> none of them is using the SA for current/new calls any more. Only after that
> does one application thread go and call disable. In synchronous mode there
> are no packets even in flight, if the application has done the above sync.
> 

Yes, but the SA may be in use through calls that do not specify the SA
explicitly as a parameter but let ODP do the lookup. In this case sa_disable()
may have to block for some time to let sync operations on the same SA in
other threads complete.

Janne




Re: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC support

2017-03-22 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> > +
> > +   /** Outer headers for inline output operation
> > +*
> > +*  Outbound inline IPSEC operation uses this information to
> > prepend
> > +*  outer headers to the IPSEC packet before sending it out.
> > +*/
> > +   struct {
> > +   /** Points to first byte of outer headers to be copied in
> > +*  front of the outgoing IPSEC packet. Implementation
> > copies
> > +*  the headers during odp_ipsec_out_inline() call. */
> > +   uint8_t *ptr;
> >
> 
> Should this be an odp_packet_t rather than a raw set of bytes? The
> rationale for making this an odp_packet_t is that this could then be a
> packet reference which can be used for (re)transmit tracking. For input

If it were odp_packet_t, the IPsec implementation would have to be
prepared for the data to be segmented in arbitrary ways, which probably
would not improve performance. And to me it is not really a packet anyway.

I did not quite understand the point about retransmit tracking in this
context.

One thing we already discussed with Petri is that to enable IP and UDP
based tunneling protocols (e.g. IPIP, GRE, VxLAN) outside the IPsec
packet in the inline output case, the right packet length would have
to be inserted in the right place of the added header. There is the
same issue with L2 protocols that would contain the total packet
length (e.g. some proprietary fabric encapsulation). In this case
the outer header is rather like a template. This is something to be
considered in some future versions of the API but one possible
way to accommodate some of these cases is to have an offset that
specifies which word of the complete packet shall be incremented
by the resulting packet size after IPsec encapsulation.

> having this be raw bytes makes sense since the ODP implementation is
> controlling their allocation and placement, however for output how is the
> application going to allocate these? Presumably it would create a header
> via odp_packet_alloc() and then use odp_packet_data() to get this address,
> but that seems cumbersome compared to simply passing the odp_packet_t
> directly. The alternative would seem to require that the application use
> some sort of odp_shm_reserve() call, but that gets awkward and is far less
> efficient than packet allocation, which is a fastpath operation.
>

I do not think applications would dynamically allocate the storage for the
header (through packet alloc or shm alloc or whatever) for every packet sent
but would just have the pointer point to a cached header somewhere (e.g.
ready-made L2 header cached in the next hop structure found by L3 route
search). If the header or part of it does have to be constructed on the
fly, the application can use the stack or other thread specific storage
that does not have to be allocated separately for every output operation.
Or maybe even some existing per-packet memory like the user area.

Btw, the API should make it clear if the header data can be in the headroom
or the user area of the packet being enqueued (I suppose the latter should
be fine).
 
Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Wednesday, March 22, 2017 12:17 AM
> To: Petri Savolainen 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [API-NEXT PATCH v2 2/3] api: ipsec: add inline IPSEC 
> support
> 
> On Tue, Mar 21, 2017 at 9:17 AM, Petri Savolainen <
> petri.savolai...@linaro.org> wrote:
> 
> > Added support for inline IPSEC processing on packet input and
> > output. Inline mode IPSEC and traffic manager cannot be enabled
> > (currently) on the same pktio interface.
> >
> > Signed-off-by: Petri Savolainen 
> > ---
> >  include/odp/api/spec/ipsec.h | 348 ++
> > ++---
> >  include/odp/api/spec/packet_io.h |  32 
> >  2 files changed, 355 insertions(+), 25 deletions(-)
> >
> > diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> > index e951e49..23d02cf 100644
> > --- a/include/odp/api/spec/ipsec.h
> > +++ b/include/odp/api/spec/ipsec.h
> > @@ -19,6 +19,8 @@ extern "C" {
> >  #endif
> >
> >  #include 
> > +#include 
> > +#include 
> >
> >  /** @defgroup odp_ipsec ODP IPSEC
> >   *  Operations of IPSEC API.
> > @@ -51,11 +53,43 @@ typedef enum odp_ipsec_op_mode_t {
> >   * Application uses asynchronous IPSEC operations,
> >   * which return results via events.
> >   */
> > -   ODP_IPSEC_OP_MODE_ASYNC
> > +   ODP_IPSEC_OP_MODE_ASYNC,
> > +
> > +   /** Inline IPSEC operation
> > + *
> > + * Packet input/output is connected directly to IPSEC
> > inbound/outbound
> > + * processing. Application uses asynchronous or inline IPSEC
> > + * operations.
> > + */
> > +   ODP_IPSEC_OP_MODE_INLINE,
> > +
> > + 

[lng-odp] Linking problem with --disable-abi-compat

2017-03-08 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

I have a hard time getting the following fine ODP application linked
against the shared ODP library when ODP has been configured with
--disable-abi-compat. It seems to me that the problem is in the
'local' binding of the _odp_packet_inline symbol in the .so, caused
by the -fvisibility=hidden argument in the ODP compilation.

$ cat test.c
#include 

int main(void)
{
return !odp_packet_user_ptr(0);
}

$ gcc -pthread -I../ODP/include test.c -L../ODP/lib -lodp-linux
/tmp/cc8GPUfA.o: In function `_odp_packet_user_ptr':
test.c:(.text+0xb): undefined reference to `_odp_packet_inline'
collect2: error: ld returned 1 exit status
Makefile:10: recipe for target 'dynamic2' failed
make: *** [dynamic2] Error 1

$ objdump -t ../ODP/lib/libodp-linux.so | grep _odp_packet_inline
0003b040 l  O .rodata  0070  _odp_packet_inline

Linking statically works. In the archive the symbol is global,
although marked hidden:

$ objdump -t ../ODP/lib/libodp-linux.a | grep 'rodata.*_odp_packet_inline'
0080 g  O .rodata  0070  .hidden  _odp_packet_inline

I have no idea what the right fix is, but I got things working
with this:

diff --git a/platform/linux-generic/odp_packet.c b/platform/linux-generic/odp_packet.c
index 60eef3a..10a99ca 100644
--- a/platform/linux-generic/odp_packet.c
+++ b/platform/linux-generic/odp_packet.c
@@ -25,7 +25,7 @@
 #define BASE_LEN  CONFIG_PACKET_MAX_SEG_LEN

 /* Fill in packet header field offsets for inline functions */
-const _odp_packet_inline_offset_t _odp_packet_inline ODP_ALIGNED_CACHE = {
+const _odp_packet_inline_offset_t _odp_packet_inline ODP_ALIGNED_CACHE __attribute__ ((visibility ("default"))) = {
.data   = offsetof(odp_packet_hdr_t, buf_hdr.seg[0].data),
.seg_len= offsetof(odp_packet_hdr_t, buf_hdr.seg[0].len),
.frame_len  = offsetof(odp_packet_hdr_t, frame_len),


After this patch:
$ objdump -t ../ODP/lib/libodp-linux.so | grep _odp_packet_inline
0003b080 g  O .rodata  0070  _odp_packet_inline

So, should this now be fixed in ODP or should I do something
differently when compiling and linking ODP apps against ODP
without ABI compatibility?

Janne




Re: [lng-odp] [PATCH] linux-gen: abi: fix include/odp/api/abi symlink creation

2017-03-08 Thread Peltonen, Janne (Nokia - FI/Espoo)
> >   install-data-hook:
> > if [ -h $(prefix)/include/odp/api/abi ]; then \
> > -   : \
> > +   : ; \
> > else \
> > $(LN_S) -rf $(prefix)/include/odp/arch/@ARCH_ABI@/odp/api/abi \
> > $(prefix)/include/odp/api/abi; \
> 
> Which bash are you using?  Mine was ok without ";".
> 
> Maxim.
> 

Are you saying that in your shell this works (echoes foobar):
if false ; then : else echo foobar ; fi

When you tested the original code, did you check that the symlink
really got created when it did not yet exist?
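The difference is easy to demonstrate in any POSIX shell: without the semicolon the ':' builtin consumes the words "else echo foobar" as plain arguments, so the if statement has no else branch at all and the command after ':' silently never runs.

```shell
# Then-branch is the single command ": else echo foobar" -- no else branch,
# so nothing is echoed when the condition is false:
out=$(if false; then : else echo foobar; fi)
[ -z "$out" ] && echo "no semicolon: nothing echoed"

# With the semicolon the parser sees a real else branch:
out=$(if false; then : ; else echo foobar; fi)
echo "with semicolon: $out"
```

That matches the Makefile case: with the missing ';' the $(LN_S) command becomes arguments to ':' and the symlink is never created.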

Janne



Re: [lng-odp] [API-NEXT PATCHv7 2/5] linux-generic: packet: implement reference apis

2017-02-20 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Comments below.

Janne

> -Original Message-
> From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
> Sent: Friday, February 17, 2017 10:39 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
> Cc: lng-odp@lists.linaro.org
> Subject: Re: [lng-odp] [API-NEXT PATCHv7 2/5] linux-generic: packet: 
> implement reference
> apis
> 
> First off, thank you very much for this review.
> 
> Please note that this code has been streamlined in patch
> http://patches.opendataplane.org/patch/7879/ and has been further
> refined with patch http://patches.opendataplane.org/patch/8145/ but
> the exposure you identify still exists in that code.
> 
> On Fri, Feb 17, 2017 at 11:31 AM, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> > Hi,
> >
> > I took a look at the packet references and it seems to me that
> > either the implementation is a bit racy or I confused myself
> > when reading the code. Or maybe I got the intended concurrency
> > semantics of the packet references wrong?
> >
> > My first issue is that packet_free() may access freed packet
> > header or corrupt unshared_len.
> >
> > The packet free function looks like this:
> >
> > static inline void packet_free(odp_packet_hdr_t *pkt_hdr)
> > {
> > odp_packet_hdr_t *ref_hdr;
> > uint32_t ref_count;
> >
> > do {
> > ref_hdr = pkt_hdr->ref_hdr;
> > ref_count = packet_ref_count(pkt_hdr) - 1;
> > free_bufs(pkt_hdr, 0, pkt_hdr->buf_hdr.segcount);
> >
> > if (ref_count == 1)
> > pkt_hdr->unshared_len = pkt_hdr->frame_len;
> >
> > pkt_hdr = ref_hdr;
> > } while (pkt_hdr);
> > }
> >
> > The problem here is that decrementing the ref_count, checking
> > its value and updating unshared_len is not single atomic
> > operation. By the time packet_free() checks if ref_count == 1
> > (i.e. if there is exactly one other reference left somewhere),
> > the true ref_count may have already been changed by another
> > thread doing packet_free() or packet_ref().
> >
> > For example, if two threads have a reference to the same packet
> > then execution (or the relevant memory ops) may get "interleaved"
> > as follows:
> >
> > T1: call packet_free()
> > T1: ref_count = packet_ref_count(pkt_hdr) - 1;
> > At this point ref_count variable is 1
> > T1: call free_bufs()
> > T1: call packet_ref_dec()
> > Now the ref_count of the packet header is 1.
> > T2: call and complete packet_free()
> > Thread 2 sees refcount 1 in the packet and frees the buffers
> > T1: pkt_hdr->unshared_len = pkt_hdr->frame_len;
> > Thread 1 accesses freed buffer for reading and writing.
> 
> I agree. These steps should be reversed so that the code should read:
> 
> if (ref_count == 1)
>pkt_hdr->unshared_len = pkt_hdr->frame_len;
> 
> free_bufs(pkt_hdr, 0, pkt_hdr->buf_hdr.segcount);
> 
> Or using the code with the above two patches applied, the code should read:
> 
> static inline void packet_free(odp_packet_hdr_t *pkt_hdr)
> {
> odp_packet_hdr_t *ref_hdr;
> uint32_t ref_count;
> int num_seg;
> 
> do {
> ref_count = packet_ref_count(pkt_hdr);
> num_seg = pkt_hdr->buf_hdr.segcount;
> ref_hdr = pkt_hdr->ref_hdr;
> 
> if (odp_likely((CONFIG_PACKET_MAX_SEGS == 1 || num_seg == 1) 
> &&
> ref_count == 1)) {
> buffer_free_multi((odp_buffer_t
> *)&pkt_hdr->buf_hdr.handle.handle, 1);
> } else {
> if (ref_count == 2)
> pkt_hdr->unshared_len = pkt_hdr->frame_len;
> 
> free_bufs(pkt_hdr, 0, num_seg);
>  }
> 
>  pkt_hdr = ref_hdr;
> } while (pkt_hdr);
> }
> 
> The mistake was trying to optimize things so that unshared_len is not
> set if the buffers are being freed, but that exposes these race
> conditions. So the worst that should now happen is that it is set
> unnecessarily before being freed.
> 
> If you concur I'll fold this fix into a v3 for patch
> http://patches.opendataplane.org/patch/8145/

I commented separately the patch you sent.

> 
> >
> > Similarly, if T2 created a new reference, T1 would have
> > a wrong idea of the number of remaining references and
> > would adjust the 

Re: [lng-odp] [API-NEXT PATCHv2] linux-generic: packet: avoid race condition in packet_free processing

2017-02-20 Thread Peltonen, Janne (Nokia - FI/Espoo)

Hi,

I do not think this really fixes the problem.

If we ignore memory ordering issues for now, I think this would fix the
problem that unshared_len would be modified after the packet has been
freed (since now the current thread keeps its own reference over the
unshared_len modification).

However, unshared_len may still be set to frame_len when the packet
is actually shared, because checking the ref count and then adjusting
unshared_len is not atomic.

If packet_ref_dec() returns 1 then the current thread must be the only
one who has a reference to the packet header. It is thus safe to
assume that the ref count stays at one, as there is no other thread that
could be concurrently incrementing it. But if packet_ref_dec() returns
greater than 1, then another thread may have a reference to the
packet hdr. At any moment the other thread may create a new reference,
incrementing the reference count of the base packet. Therefore the
reference count returned by packet_ref_dec() may already be old
and incorrect when the caller gets it.

IOW, this is always racy:

> + if (ref_count == 2)
> + pkt_hdr->unshared_len = pkt_hdr->frame_len;

ref_count of the pkt_hdr may be exactly one just before the
assignment, but already e.g. three (implying that pkt_hdr is
shared by more than one other user) when the assignment takes
place.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Saturday, February 18, 2017 2:52 PM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [API-NEXT PATCHv2] linux-generic: packet: avoid race 
> condition in
> packet_free processing
> 
> Resolve bug https://bugs.linaro.org/show_bug.cgi?id=2892 by setting
> unshared_len prior to potentially freeing segments.
> 
> Reported-by: Janne Peltonen 
> Signed-off-by: Bill Fischofer 
> ---
>  platform/linux-generic/odp_packet.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/platform/linux-generic/odp_packet.c 
> b/platform/linux-generic/odp_packet.c
> index 81bbcedd..b18b69a9 100644
> --- a/platform/linux-generic/odp_packet.c
> +++ b/platform/linux-generic/odp_packet.c
> @@ -658,6 +658,9 @@ static inline odp_packet_hdr_t 
> *packet_free_to_list(odp_packet_hdr_t
> *pkt_hdr,
>   if (*nfree + num_seg >= nbufs)
>   break;
> 
> + if (ref_count == 2)
> + pkt_hdr->unshared_len = pkt_hdr->frame_len;
> +
>   for (i = 0; i < num_seg; i++) {
>   odp_packet_hdr_t *hdr =
>   pkt_hdr->buf_hdr.seg[i].hdr;
> @@ -666,9 +669,6 @@ static inline odp_packet_hdr_t 
> *packet_free_to_list(odp_packet_hdr_t
> *pkt_hdr,
>   packet_ref_dec(hdr) == 1)
>   buf[(*nfree)++] = buffer_handle(hdr);
>   }
> -
> - if (ref_count == 2)
> - pkt_hdr->unshared_len = pkt_hdr->frame_len;
>   }
> 
>   pkt_hdr = ref_hdr;
> @@ -693,10 +693,10 @@ static inline void packet_free(odp_packet_hdr_t 
> *pkt_hdr)
> buffer_free_multi((odp_buffer_t *)
> &pkt_hdr->buf_hdr.handle.handle, 1);
>   } else {
> - free_bufs(pkt_hdr, 0, num_seg);
> -
>   if (ref_count == 2)
>   pkt_hdr->unshared_len = pkt_hdr->frame_len;
> +
> + free_bufs(pkt_hdr, 0, num_seg);
>   }
> 
>   pkt_hdr = ref_hdr;
> --
> 2.12.0.rc1



Re: [lng-odp] [API-NEXT PATCHv7 2/5] linux-generic: packet: implement reference apis

2017-02-17 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

I took a look at the packet references and it seems to me that
either the implementation is a bit racy or I confused myself
when reading the code. Or maybe I got the intended concurrency
semantics of the packet references wrong?

My first issue is that packet_free() may access freed packet
header or corrupt unshared_len.

The packet free function looks like this:

static inline void packet_free(odp_packet_hdr_t *pkt_hdr)
{
odp_packet_hdr_t *ref_hdr;
uint32_t ref_count;

do {
ref_hdr = pkt_hdr->ref_hdr;
ref_count = packet_ref_count(pkt_hdr) - 1;
free_bufs(pkt_hdr, 0, pkt_hdr->buf_hdr.segcount);

if (ref_count == 1)
pkt_hdr->unshared_len = pkt_hdr->frame_len;

pkt_hdr = ref_hdr;
} while (pkt_hdr);
}

The problem here is that decrementing the ref_count, checking
its value and updating unshared_len is not single atomic
operation. By the time packet_free() checks if ref_count == 1
(i.e. if there is exactly one other reference left somewhere),
the true ref_count may have already been changed by another
thread doing packet_free() or packet_ref().

For example, if two threads have a reference to the same packet
then execution (or the relevant memory ops) may get "interleaved"
as follows:

T1: call packet_free()
T1: ref_count = packet_ref_count(pkt_hdr) - 1;
At this point ref_count variable is 1
T1: call free_bufs()
T1: call packet_ref_dec()
Now the ref_count of the packet header is 1.
T2: call and complete packet_free()
Thread 2 sees refcount 1 in the packet and frees the buffers
T1: pkt_hdr->unshared_len = pkt_hdr->frame_len;
Thread 1 accesses freed buffer for reading and writing.

Similarly, if T2 created a new reference, T1 would have
a wrong idea of the number of remaining references and
would adjust the unshared_len to an incorrect value.

Right?

Maybe other modifications of unshared_len are also racy.



The second issue is that the atomic ops for setting and
reading the ref count seem to have too relaxed memory
ordering. In particular, packet_ref_dec() must not happen
(be visible to other threads) before its caller is done
with the packet and the related memory accesses have
completed. Now there does not seem to be any optimization
and memory barrier to prevent the ref count decrementing
happening too early. So I think it is at least theoretically
possible that a thread e.g. reads from a packet buffer
after it has already been freed by another thread, somehow
like this:

Source code order:
T1: interesting_data = read_from_pkt(pkt)
T1: packet_free(pkt)

Order visible to T2:
1: ref count decr
2: read from pkt

Now if T2 goes and frees the remaining reference between
steps 1 and 2, T1 may get even more interesting data.

Right?

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Wednesday, January 11, 2017 4:34 AM
> To: lng-odp@lists.linaro.org
> Subject: [lng-odp] [API-NEXT PATCHv7 2/5] linux-generic: packet: implement 
> reference apis
> 
> Implement the APIs:
> - odp_packet_ref_static()
> - odp_packet_ref()
> - odp_packet_ref_pkt()
> - odp_packet_has_ref()
> - odp_packet_unshared_len()
> 
> This also involves functional upgrades to the existing packet manipulation
> APIs to work with packet references as input arguments.
> 
> Signed-off-by: Bill Fischofer 
> ---
>  .../linux-generic/include/odp_packet_internal.h|  87 +++-
>  platform/linux-generic/odp_packet.c| 536 
> +
>  2 files changed, 516 insertions(+), 107 deletions(-)
> 
> diff --git a/platform/linux-generic/include/odp_packet_internal.h 
> b/platform/linux-
> generic/include/odp_packet_internal.h
> index e6e9d74..607560d 100644
> --- a/platform/linux-generic/include/odp_packet_internal.h
> +++ b/platform/linux-generic/include/odp_packet_internal.h
> @@ -19,6 +19,7 @@ extern "C" {
> 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -168,7 +169,7 @@ typedef struct {
>   * packet_init(). Because of this any new fields added must be reviewed for
>   * initialization requirements.
>   */
> -typedef struct {
> +typedef struct odp_packet_hdr_t {
>   /* common buffer header */
>   odp_buffer_hdr_t buf_hdr;
> 
> @@ -184,6 +185,13 @@ typedef struct {
>   uint32_t headroom;
>   uint32_t tailroom;
> 
> + /* Fields used to support packet references */
> + uint32_t unshared_len;
> + struct odp_packet_hdr_t *ref_hdr;
> + uint32_t ref_offset;
> + uint32_t ref_len;
> + odp_atomic_u32_t ref_count;
> +
>   /*
>* Members below are not initialized by packet_init()
>*/
> @@ -212,6 +220,55 @@ static inline odp_packet_hdr_t 
> *odp_packet_hdr(odp_packet_t pkt)
>   return (odp_packet_hdr_t *)buf_hdl_to_hdr((odp_buffer_t)pkt);
>  }
> 
> +static inline odp_packet_hdr_t 

Re: [lng-odp] [openfastpath] Performance issues found in OFP 2.0 upon integration into Memcached

2017-01-26 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> > I took a look at the default_event_dispatcher() in OFP v2.0 code and noticed
> > that it calls odp_schedule_multi() and it blocks until it receives the 
> > burst size.
> > This will likely cause the starvation issues Bill mentioned.

I am not sure about that. The dispatcher calls odp_schedule_multi()
with ODP_SCHED_WAIT, but that does not necessarily imply waiting for
the full burst. I think the API allows waiting for the full burst
as well as returning when there is even one event available and as
far as I can see odp and odp-dpdk are both doing the latter.

> > >> My workload is not stressing the full bandwidth of the NIC.

I was not referring to that either. I just think that odp-dpdk reads
packets from the NIC (from the DPDK PMD) in bursts bigger than
4 packets if the stack is very highly loaded. The bursts seen at the
ODP API are then another matter.

Janne


> -Original Message-
> From: Bill Fischofer [mailto:bill.fischo...@linaro.org]
> Sent: Thursday, January 26, 2017 3:59 PM
> To: Geoffrey Blake <geoffrey.bl...@arm.com>
> Cc: Mike Holmes <mike.hol...@linaro.org>; Sorin Vultureanu 
> <sorin.vulture...@enea.com>;
> Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>; Bogdan Pricope
> <bogdan.pric...@enea.com>; openfastp...@list.openfastpath.org; nd 
> <n...@arm.com>; lng-odp  o...@lists.linaro.org>
> Subject: Re: [openfastpath] Performance issues found in OFP 2.0 upon 
> integration into Memcached
> 
> On Thu, Jan 26, 2017 at 6:55 AM, Geoffrey Blake <geoffrey.bl...@arm.com> 
> wrote:
> > I will reiterate that I am using OFP and ODP in ways that were not 
> > originally intended and the
> problems I have exposed are in a way unique to that usage.  But the problem 
> with batching is
> still relevant.  I took a look at the default_event_dispatcher() in OFP v2.0 
> code and noticed
> that it calls odp_schedule_multi() and it blocks until it receives the burst 
> size.  This will
> likely cause the starvation issues Bill mentioned.
> 
> OK, this is good to know and it does confirm that waiting for the
> "straggler" events to show up will inject unacceptable levels of
> latency.
> 
> >
> > ODP does have a timeout in ns specified for odp_schedule_multi() to wait 
> > for the requested
> burst already that could be used here instead.  But this might not be the 
> correct semantics, you
> don’t want to wait for the burst to happen if an event has been sitting on 
> the queue since the
> last time through the schedule call, you would want to handle that event more 
> quickly so instead
> the semantics should be in OFP (or ODP) “I want to call odp_schedule_multi() 
> at a specific
> frequency”. If you are calling odp_schedule_multi() slower than your max-rate 
> it immediately
> grabs what is available to avoid starvation, and if you are calling it too 
> fast, it blocks for
> the remaining time left in the period.  Since Bill is on this chain, an event 
> digest may also be
> useful for the scheduler to provide to upper layers so they can query how 
> much work may be
> available to make decisions on whether to call the full schedule routine.  I 
> am not sure if that
> is really possible though with polling pktio, but would be nice to have.
> 
> I can see how you'd want to establish some sort of "TDM"-like cadence
> on scheduling, however that model isn't likely to be very portable
> across different implementations since the scheduler itself may or may
> not be an active element in the system. In odp-linux, for example, the
> scheduler is only run at the time odp_schedule() is called and so we
> have no idea how long an event has been "available". To get that
> information would require recording an arrival timestamp of some sort
> at odp_queue_enq() time, which of course adds its own overhead. Since
> we now do have a framework for a pluggable scheduler and proposal to
> extend that modularity to other ODP components like queues, in theory
> we should be able to develop and experiment with a TDM-scheduler that
> would behave more like this to see what benefits might be had. The
> idea behind supporting different scheduler models is that one
> universal scheduling strategy is unlikely to be optimal for all
> applications/workloads so we'd like to support the ability for more
> sophisticated applications to choose a scheduling model that they know
> is better suited to their workloads. Today that's done at ODP compile
> time, but we intend to move that capability to odp_init_global() time
> at some point.
> 
> >
> > Thanks,
> > Geoffrey Blake
> >
> > On 1/25/17, 1:48 PM, "Bill Fischofer"

Re: [lng-odp] clarification of pktout checksum offload feature

2016-12-05 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> > There is also a second thing: how to disable checksum calculation
> > per-packet?

> We've talked about doing this but currently have no APIs for exposing this
> capability. Do you have a use case in mind?

The use cases include many UDP based tunneling protocols that want to disable
checksumming to save processing capacity both locally and in the other end.
These include VxLAN, IPsec-over-UDP, MPLS-over-UDP, GTP.

For instance, the VxLAN RFC says that the UDP checksum SHOULD be set to zero,
but currently there is no reliable way for an ODP application to do so if
the checksum offload feature is enabled in the output interface.

The pktin side could also benefit from checksum API changes. It seems that
now the ODP application must assume that L4 checksum was calculated if the
feature was enabled in the pktin config and if the received packet was UDP,
TCP or SCTP and if the packet was not a fragment. Per-packet checksum flags
could be more robust and convenient.

Janne


> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Wednesday, October 19, 2016 6:25 PM
> To: Maciej Czekaj 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] clarification of pktout checksum offload feature
> 
> Hi Maciej, sorry for the delay in getting a response to you on this. See
> inline.
> 
> On Thu, Oct 13, 2016 at 6:33 AM, Maciej Czekaj <
> maciej.cze...@caviumnetworks.com> wrote:
> 
> >
> > Guys,
> >
> > I was going to implement checksum offload for OFP project based on Monarch
> > checksum offload capability and I found out that there is no example for
> > using that API. Also, documentation seems to leave some room for various
> > interpretations, so I would like to clarify that and post a patch to
> > documentation, too.
> >
> >
> > This is an excerpt from pktio.h in Monarch LTS:
> >
> >
> > /**
> >  * Packet output configuration options bit field
> >  *
> >  * Packet output configuration options listed in a bit field structure.
> > Packet
> >  * output checksum insertion may be enabled or disabled. When it is
> > enabled,
> >  * implementation will calculate and insert checksum into every outgoing
> > packet
> >  * by default. Application may use a packet metadata flag to disable
> > checksum
> >  * insertion per packet bases. For correct operation, packet metadata must
> >  * provide valid offsets for the appropriate protocols. For example, UDP
> >  * checksum calculation needs both L3 and L4 offsets (to access IP and UDP
> >  * headers). When application (e.g. a switch) does not modify L3/L4 data
> > and
> >  * thus checksum does not need to be updated, output checksum insertion
> > should
> >  * be disabled for optimal performance.
> >
> >
> >
> > From my contact with various NICs, including Octeon PKO & VNIC from
> > ThunderX, offloading H/W needs at least:
> >
> > For L4 offload:
> > L4 packet type: TCP/UDP/SCTP
> > L4 header offset
> > L3 header offset
> > L3 type may or may not be required but it is good to define it for
> > consistency
> >
> > For L3 checksum:
> > L3 packet type: IPv4
> > L3 header offset
> >
> 
> Yes, this info will be set by the ODP classifier. If you're constructing
> packets "by hand" then these fields should be set with appropriate
> odp_packet_xxx_set() calls.
> 
> 
> >
> > There is also a second thing: how to disable checksum calculation
> > per-packet?
> > If packet has no type in metadata, then obviously checksum will not be
> > computed. I think that would be the recommended method for now, even if ODP
> > community plans to extend the odp_packet API in the future to cover that case.
> >
> 
> We've talked about doing this but currently have no APIs for exposing this
> capability. Do you have a use case in mind?
> 
> 
> >
> > Maybe that is implicit that packet types should be set along header
> > offsets, but it is good to state that clearly and provide some usage
> > example, e.g. in examples/generator. I can send a patch for both doc and
> > generator but I would like to make sure we are on the same page.
> >
> 
> Patches always welcome. :)
> 
> Thanks.
> 
> 
> >
> >
> > Regards
> > Maciej
> >
> >
> >
> >


Re: [lng-odp] [API-NEXT PATCH] api: ipsec: added IPSEC API

2016-11-29 Thread Peltonen, Janne (Nokia - FI/Espoo)

Hi,

> I wonder how the lifetime in bytes will be calculated for asynchronous
> inbound operations, and how the application can figure it out from the
> output packets: is it only the output packet data, or should it also
> include the ESP header/trailer and the outer IP header?

RFC 4301 says this about byte based SA lifetime:

 If byte count is used, then the implementation SHOULD count the
 number of bytes to which the IPsec cryptographic algorithm is
 applied.  For ESP, this is the encryption algorithm (including
 Null encryption) and for AH, this is the authentication
 algorithm.  This includes pad bytes, etc.

I did not quite get how async input operations would be any different
(compared to other operations) for lifetime handling or what exactly
the application needs to figure out from the output packets regarding
the lifetime calculation. If the lifetime as bytes gets exhausted,
the application will get to know about it through an event.

Janne




Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API

2016-11-16 Thread Peltonen, Janne (Nokia - FI/Espoo)
Comments below.

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bala 
> Manoharan
> Sent: Wednesday, November 16, 2016 11:16 AM
> To: Savolainen, Petri (Nokia - FI/Espoo) 
> 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API
> 
> Regards,
> Bala
> 
> 
> On 15 November 2016 at 20:17, Savolainen, Petri (Nokia - FI/Espoo)
>  wrote:
> >
> >
> >> -Original Message-
> >> From: Bala Manoharan [mailto:bala.manoha...@linaro.org]
> >> Sent: Tuesday, November 15, 2016 4:15 PM
> >> To: Savolainen, Petri (Nokia - FI/Espoo)  >> labs.com>
> >> Cc: lng-odp-forward 
> >> Subject: Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API
> >>
> >> Regards,
> >> Bala
> >>
> >>
> >> On 15 November 2016 at 14:17, Petri Savolainen
> >>  wrote:
> >> > Added definitions for a look-a-side IPSEC offload API. In addition to
> >> > IPSEC packet transformations, it also supports:
> >> > * inbound SA look up
> >> > * outbound IP fragmentation
> >> >
> >> > Signed-off-by: Petri Savolainen 
> >> > ---
> >> >  include/odp/api/spec/event.h   |   2 +-
> >> >  include/odp/api/spec/ipsec.h   | 720
> >> +
> >> >  include/odp_api.h  |   1 +
> >> >  platform/Makefile.inc  |   1 +
> >> >  platform/linux-generic/Makefile.am |   2 +
> >> >  platform/linux-generic/include/odp/api/ipsec.h |  36 ++
> >> >  .../include/odp/api/plat/event_types.h |   1 +
> >> >  .../include/odp/api/plat/ipsec_types.h |  39 ++
> >> >  8 files changed, 801 insertions(+), 1 deletion(-)
> >> >  create mode 100644 include/odp/api/spec/ipsec.h
> >> >  create mode 100644 platform/linux-generic/include/odp/api/ipsec.h
> >> >  create mode 100644 platform/linux-
> >> generic/include/odp/api/plat/ipsec_types.h
> >> >
> >> > diff --git a/include/odp/api/spec/event.h b/include/odp/api/spec/event.h
> >> > index fdfa52d..75c0bbc 100644
> >> > --- a/include/odp/api/spec/event.h
> >> > +++ b/include/odp/api/spec/event.h
> >> > @@ -39,7 +39,7 @@ extern "C" {
> >> >   * @typedef odp_event_type_t
> >> >   * ODP event types:
> >> >   * ODP_EVENT_BUFFER, ODP_EVENT_PACKET, ODP_EVENT_TIMEOUT,
> >> > - * ODP_EVENT_CRYPTO_COMPL
> >> > + * ODP_EVENT_CRYPTO_COMPL, ODP_EVENT_IPSEC_RESULT
> >> >   */
> >> >
> >
> >
> >> > +
> >> > +   /** Destination queue for IPSEC events
> >> > +*
> >> > +*  Operations in asynchronous mode enqueue resulting events
> >> into
> >> > +*  this queue.
> >> > +*/
> >> > +   odp_queue_t dest_queue;
> >> > +
> >> > +   /** User defined SA context pointer
> >> > +*
> >> > +*  User defined context pointer associated with the SA.
> >> > +*  The implementation may prefetch the context data. Default
> >> value
> >> > +*  of the pointer is NULL.
> >> > +*/
> >> > +   void *context;
> >> > +
> >> > +   /** Context data length
> >> > +*
> >> > +*  User defined context data length in bytes for prefetching.
> >> > +*  The implementation may use this value as a hint for the
> >> number of
> >> > +*  context data bytes to prefetch. Default value is zero (no
> >> hint).
> >> > +*/
> >> > +   uint32_t context_len;
> >> > +
> >> > +} odp_ipsec_sa_param_t;
> >>
> >> What will be the total number of SA in the system? Will it be around 100K?
> >> If that is the case IMO it might be better to add a "context_packet"
> >> which could be copied to the output packet instead of having 100K
> >> different queues?
> >
> > Yes, number of SAs maybe large.
> >
> > Are you asking why dest_queue or context ptr is on SA level ?
> >
> > Some speculative answers:
> > * crypto API has a queue per session (== IPSEC SA if application is IPSEC)
> > * Application may choose to use one ODP queue for all results, or one per 
> > SA, or anything in
> between
> > * In sync mode there would not be queues and no queue context, so only 
> > place for application's
> SA context pointer is this struct
> > * Packet context ptr is specified to be copied from input packets to output 
> > packets.
> Application could use it for SA context. We could remove SA specific context 
> pointer if
> application can always use packet context pointer for SA (and does not need 
> it for anything
> else)
> 
> Packet context ptr will not work in the case in which the packet is
> directly received from the interface by the offload engine.
> Is the above case valid or all the packet comes only through the application?

In this API draft the packets come only through the application,
but I think SA context is still quite useful in 

Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API

2016-11-15 Thread Peltonen, Janne (Nokia - FI/Espoo)

Hi,

> > +   /** IPv6 Differentiated Services Code Point */
> > +   uint8_t dscp;
> >
> 
> IPv6 does not use a DSCP. The field is called Traffic Class in IPv6 so this
> should be uint8_t traffic_class;

RFC 2474 renames IPv4 ToS and IPv6 traffic class fields to DS field and
defines that six bits of the DS field carry DSCP. See also RFC 3260.

The dscp tunnel header parameter in this API draft is only about DSCP
and not the full DS field (or the full traffic class field of IPv6) that
contains ECN bits too. This is because DSCP and ECN normally require
different handling (even if DSCP is set to a configured value at tunnel
encapsulation, the ECN bits are expected to be copied as defined in
RFC 6040).

If the API was to support non-standard handling of ECN (e.g. setting
to a configured value), it would make sense to have separate struct
fields for it.

Janne
 

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Wednesday, November 16, 2016 4:52 AM
> To: Savolainen, Petri (Nokia - FI/Espoo) 
> 
> Cc: lng-odp-forward 
> Subject: Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API
> 
> On Tue, Nov 15, 2016 at 2:47 AM, Petri Savolainen <
> petri.savolai...@nokia.com> wrote:
> 
> > Added definitions for a look-a-side IPSEC offload API. In addition to
> > IPSEC packet transformations, it also supports:
> > * inbound SA look up
> > * outbound IP fragmentation
> >
> > Signed-off-by: Petri Savolainen 
> > ---
> >  include/odp/api/spec/event.h   |   2 +-
> >  include/odp/api/spec/ipsec.h   | 720
> > +
> >  include/odp_api.h  |   1 +
> >  platform/Makefile.inc  |   1 +
> >  platform/linux-generic/Makefile.am |   2 +
> >  platform/linux-generic/include/odp/api/ipsec.h |  36 ++
> >  .../include/odp/api/plat/event_types.h |   1 +
> >  .../include/odp/api/plat/ipsec_types.h |  39 ++
> >  8 files changed, 801 insertions(+), 1 deletion(-)
> >  create mode 100644 include/odp/api/spec/ipsec.h
> >  create mode 100644 platform/linux-generic/include/odp/api/ipsec.h
> >  create mode 100644 platform/linux-generic/include/odp/api/plat/ipsec_
> > types.h
> >
> > diff --git a/include/odp/api/spec/event.h b/include/odp/api/spec/event.h
> > index fdfa52d..75c0bbc 100644
> > --- a/include/odp/api/spec/event.h
> > +++ b/include/odp/api/spec/event.h
> > @@ -39,7 +39,7 @@ extern "C" {
> >   * @typedef odp_event_type_t
> >   * ODP event types:
> >   * ODP_EVENT_BUFFER, ODP_EVENT_PACKET, ODP_EVENT_TIMEOUT,
> > - * ODP_EVENT_CRYPTO_COMPL
> > + * ODP_EVENT_CRYPTO_COMPL, ODP_EVENT_IPSEC_RESULT
> >   */
> >
> >  /**
> > diff --git a/include/odp/api/spec/ipsec.h b/include/odp/api/spec/ipsec.h
> > new file mode 100644
> > index 000..b45bcd4
> > --- /dev/null
> > +++ b/include/odp/api/spec/ipsec.h
> > @@ -0,0 +1,720 @@
> > +/* Copyright (c) 2016, Linaro Limited
> > + * All rights reserved.
> > + *
> > + * SPDX-License-Identifier: BSD-3-Clause
> > + */
> > +
> > +/**
> > + * @file
> > + *
> > + * ODP IPSEC API
> > + */
> > +
> > +#ifndef ODP_API_IPSEC_H_
> > +#define ODP_API_IPSEC_H_
> > +#include 
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include 
> > +
> > +/** @defgroup odp_ipsec ODP IPSEC
> > + *  Operations of IPSEC API.
> > + *  @{
> > + */
> > +
> > +/**
> > + * @typedef odp_ipsec_sa_t
> > + * IPSEC Security Association (SA)
> > + */
> > +
> > + /**
> > + * @def ODP_IPSEC_SA_INVALID
> > + * Invalid IPSEC SA
> > + */
> > +
> > +/**
> > + * IPSEC SA direction
> > + */
> > +typedef enum odp_ipsec_dir_t {
> > +   /** Inbound IPSEC SA */
> > +   ODP_IPSEC_DIR_INBOUND = 0,
> > +
> > +   /** Outbound IPSEC SA */
> > +   ODP_IPSEC_DIR_OUTBOUND
> > +
> > +} odp_ipsec_dir_t;
> > +
> > +/**
> > + * IPSEC operation mode
> > + */
> > +typedef enum odp_ipsec_op_mode_t {
> > +   /** Synchronous IPSEC operation
> > + *
> > + * Application uses synchronous IPSEC operations,
> > + * which output all results on function return.
> > + */
> > +   ODP_IPSEC_OP_MODE_SYNC = 0,
> > +
> > +   /** Asynchronous IPSEC operation
> > + *
> > + * Application uses asynchronous IPSEC operations,
> > + * which return results via events.
> > + */
> > +   ODP_IPSEC_OP_MODE_ASYNC
> > +
> > +} odp_ipsec_op_mode_t;
> > +
> > +/**
> > + * IPSEC protocol mode
> > + */
> > +typedef enum odp_ipsec_mode_t {
> > +   /** IPSEC tunnel mode */
> > +   ODP_IPSEC_MODE_TUNNEL = 0,
> > +
> > +   /** IPSEC transport mode */
> > +   ODP_IPSEC_MODE_TRANSPORT
> > +
> > +} odp_ipsec_mode_t;
> > +
> > +/**
> > + * IPSEC protocol
> > + */
> > +typedef enum odp_ipsec_proto_t {

Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API

2016-11-15 Thread Peltonen, Janne (Nokia - FI/Espoo)

Hi,

> What will be the total number of SA in the system? Will it be around 100K?

Yes, the API should work with that many SAs.

> If that is the case IMO it might be better to add a "context_packet"
> which could be copied to the output packet instead of having 100K
> different queues?

I think there is a packet context already in the ODP packet
buffers, so the application can use that for per-packet context.
Then there is SA specific context pointer in this API for the
application to use. Thus the application could use just single
output queue without losing track of the SAs and packets,
but it could also use more if it wants to. Right?

> > +odp_ipsec_sa_t odp_ipsec_sa_create(odp_ipsec_sa_param_t *param);
> 
> How do you update the SA associated once the policy is expired? Or is
> it not in the scope of this RFC?

You create a new SA and destroy the old one. And only the application
knows about policies.

Janne




Re: [lng-odp] [API-NEXT DRAFT] api: ipsec: added IPSEC API

2016-11-15 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

> > +typedef struct odp_ipsec_lifetime_t {
> > +   /** Soft expiry limits for the session */
> > +   struct {
> > +   /** Limit in seconds */
> > +   uint64_t sec;
> 
> Does expiry time mean ADD_TIME (calculated when the policy is
> added to the database) or
> USE_TIME (calculated when the policy gets used for the first time)?

The lifetimes are associated with SAs in this API draft, which leaves
all policies to the application to handle.

I think the intent was that the expiry occurs the given number of
seconds after the creation of the SA, regardless of the use of the SA.
Maybe the documentation could make it more clear.

> > +   /** MTU for outbound IP fragmentation offload
> > +*
> > +*  This is the maximum length of IP packets that outbound IPSEC
> > +*  operations may produce. The value may be updated later with
> > +*  odp_ipsec_mtu_update().
> > +*/
> > +   uint32_t mtu;
> 
> What happens when this MTU is greater than the MTU of the output
> pktio interface? The packet might get fragmented by the hardware since
> the MTU of the pktio is smaller than the packet size.

This is a look-a-side type API so the application gets the packet
back after the IPsec operation and has to deal with the situation.
Maybe the application fragments the ESP packet and/or maybe it
updates the MTU of the SA so that later packets can make use of the
fragmentation offload or possible red side fragmentation.

Since this API does not handle policy lookups, the application
can know the eventual output interface before submitting a packet
for IPsec offload and could therefore ensure the correct MTU is
used. But I am not sure the odp_ipsec_mtu_update() way is good
enough for this or whether it should be possible to pass the
MTU value also with each packet.

Note that the MTU value is not always necessarily the link MTU
but it could also be a path MTU estimate.

> > +   /** Destination queue for IPSEC events
> > +*
> > +*  Operations in asynchronous mode enqueue resulting events into
> > +*  this queue.
> > +*/
> > +   odp_queue_t dest_queue;
> 
> Can this dest_queue be linked to a pktio output queue in the outbound direction?

I do not think there exists any mechanism that could do that.

> Does application get the packet from dest_queue and then have to
> enqueue to pktio output queue/ TM queue.

Yes. This API is for look-a-side type IPsec acceleration.
Inline IPsec acceleration would require many additions to the API.

Janne




Re: [lng-odp] Fwd: IpSec protocol offload proposal

2016-09-30 Thread Peltonen, Janne (Nokia - FI/Espoo)

Quick comments on a couple of points:

> > > > What happens when there is a policy but no session (since IKE has not
> > > > negotiated the SA yet and expects a traffic based trigger)? How do I
> > get
> > > > to know which SA I need to negotiate?
> > >
> > > This was not covered in the scope of the initial proposal, but we can
> > > add an error queue or notification queue to the session so that the
> > > implementation can send notification message through the notification
> > > queue.
> >
> > Yes, something like that is needed, since I think on-demand negotiation
> > of SAs is an important use case.
> >
> 
> What we discussed is that the classifier would not do any SPI lookups since
> it doesn't have that knowledge. The intent is that the classifier simply
> recognizes the ESP header and can then send the packet directly to the
> IPsec decrypt engine. That engine would then discover that it can't process
> the packet and forward it to an appropriate output queue that would be
> input to the application. So the application would see the undecrypted
> IPsec packet and can then kick off whatever higher-level processing is
> needed to establish the credentials that would allow future IPsec packets
> for this SPI to be handled properly.

I was referring to outbound IPsec processing. When the ODP application
sends a clear text packet out, the packet is matched against SPD, with
basically three possible results:

1) IPsec is not required. The packet can be sent out as such without IPsec.

2) IPsec is required and a suitable SA is found. The packet gets encrypted
and sent out.

3) IPsec is required but no suitable SA exists. IKE negotiation is triggered.

So some sort of notification channel from ODP to the application is
needed for the last case. The information needed by the application
would be the packet itself plus the SPD entry that the packet matched.

This of course assumes that the outbound SPD lookup is performed as
part of the outbound IPsec processing provided by ODP. If ODP did not
do the policy lookup but the ODP application (like OFP) did it, then
triggering IKE would not involve ODP.

Whether ODP should offer the outbound policy lookup service is
a good question. I think it would make sense if there is HW for it.
Maybe in some cases a separate outbound IPsec policy lookup
would not be needed if the application already has classified
packets to flows that easily map to IPsec sessions. In such a case
it might be useful to be able to skip the policy lookup step even
if offered by ODP and let the application directly specify the session
(i.e. outbound SA).

> > Maybe there could be (P)MTU info in the session or in the policy
> > and application could keep it up to date as it changes.
> >
> 
> Please suggest a set of APIs or API extensions that cover these use cases,
> as they are worth discussing in one of the ARCH calls.

I am not sure I would be able to do it, at least not yet.

The PMTUD thing might be as simple as having it as an attribute
of the session and making it possible to modify it on the fly.

If it turns out that outbound policy lookup will not be offered by
ODP but remains as a responsibility of the application, then the
application could just pass the PMTU value to ODP somehow
with each packet.

> > > Yes. The crypto keys have to be provided by the implementation.
> >
> 
> Keys are negotiated as part of IKE processing, which is outside the scope
> of this work for now.

Yes, but the crypto/mac keys needed by ESP and AH need to
be given to ODP when an ipsec session (SA) is being created.

Janne

> -Original Message-
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill 
> Fischofer
> Sent: Friday, September 30, 2016 12:47 AM
> To: LNG ODP Mailman List <lng-odp@lists.linaro.org>
> Subject: [lng-odp] Fwd: IpSec protocol offload proposal
> 
> I hit reply rather than reply all on this. Sorry, this should have been a
> reply to the mailing list.
> 
> Bill
> -- Forwarded message --
> From: Bill Fischofer <bill.fischo...@linaro.org>
> Date: Thu, Sep 29, 2016 at 8:07 AM
> Subject: Re: [lng-odp] IpSec protocol offload proposal
> To: "Peltonen, Janne (Nokia - FI/Espoo)" <janne.pelto...@nokia.com>
> 
> 
> Additional thoughts on this thread...
> 
> On Thu, Sep 29, 2016 at 1:43 AM, Peltonen, Janne (Nokia - FI/Espoo) <
> janne.pelto...@nokia.com> wrote:
> 
> > Hi,
> >
> > Comment inline:
> >
> > > From: Bala Manoharan [mailto:bala.manoha...@linaro.org]
> > > Comments inline...
> > >
> > > On 27 September 2016 at 10:07, Peltonen, Janne (Nokia - FI/Espoo)
> > > <janne.pelto...@noki

Re: [lng-odp] IpSec protocol offload proposal

2016-09-29 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Comment inline:

> From: Bala Manoharan [mailto:bala.manoha...@linaro.org]
> Comments inline...
> 
> On 27 September 2016 at 10:07, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> > Hi,
> >
> > A few questions and comments:
> >
> > Can there be VLAN tags in the received packets? How do I know what L2
> > headers a received packet originally had and through which interface it
> > was received?
> 
> The original Header of the packet will be stored as part of the
> headroom of the decrypted packet.

Having some sort of port id available might be quite useful.
Deducing the actual port from just the packet header may be
difficult or even impossible with e.g. link aggregation.

> 
> >
> > How do I fall back to look-a-side mode when e.g. an IPsec packet came in
> > inside another tunnel (e.g. VxLAN or GRE) that I need to terminate first in SW
> > and then give the packet to ODP for IPsec decapsulation using the same SA
> > that is also used in inline acceleration? And the same for outbound
> > processing.
> 
> We can add additional API to send the packet to the IpSec offload
> engine (enhancement to odp_ipsec_send()) where the packet is first
> received by the application and then passed to the offload engine. In
> the model in which the packets are received by the application, does
> the application do SA lookup or is it something which has to be done
> by the implementation?

Just to be clear, I was referring to input processing in the case
that application first receives a packet (say, a VxLAN packet),
processes it (say, strips VxLAN header) resulting in an IPsec
packet. The IPsec packet is then given to the ODP implementation
for IPsec inbound processing (i.e. decapsulation) and the ODP
implementation later gives the clear text packet back to the
application for the application specific processing of the higher
layers.

So, an API function with "send" in the name would not sound
right. The SA lookup based on the SPI of the received packet
could be done by ODP or maybe in some cases by the application.

In the output path (when sending a packet for which IPsec
encapsulation needs to be done), something analogous is
needed. Maybe the packet needs some additional tunneling
encapsulation that needs to be done by the application
after IPsec encapsulation.

At the risk of stating the obvious, the use cases I have in
mind always involve application specific processing in SW by
the ODP application. In receive and transmit paths of those
packets IPsec processing (decapsulation and encapsulation,
respectively) may be needed.

Ideally, in inline-case, the application receives the clear text
packet after IPsec decapsulation and sends a clear text
packet that gets IPsec encapsulated and sent out by ODP.
Sometimes this will not be possible and the application must
fall back to look-a-side style acceleration. But in both cases
the application does process the packets as it terminates
the higher protocol layers.

> 
> >
> > What happens when there is a policy but no session (since IKE has not
> > negotiated the SA yet and expects a traffic based trigger)? How do I get
> > to know which SA I need to negotiate?
> 
> This was not covered in the scope of the initial proposal, but we can
> add an error queue or notification queue to the session so that the
> implementation can send notification message through the notification
> queue.

Yes, something like that is needed, since I think on-demand negotiation
of SAs is an important use case.

> 
> >
> > How do I implement byte based SA lifetimes? I think some sort of advance
> > notification about impending byte based life time expiry (and also
> > seqno overflow) would be needed.
> 
> This could be part of a notification message, Again since the control
> part is handled by the application, which part of SA expiry should be
> initiated by the implementation? Is it possible that the application
> take care of SA lifetime expiry while the implementation
> handles/notifies the byte based expiry?

Yes, I think it would be reasonable to let the application be
responsible of handling time based lifetimes and have
a notification mechanism for the byte based SA expiry.

> 
> >
> > Should there be support for ordering/prioritization of policy selectors
> > so that overlapping policies could be supported? Should there be discard
> > rules?
> 
> Yes. I had not defined the security accessor selector structure but
> that should include ordering and prioritisation.
> 
> >
> > Does ODP check that a decapsulated packet matches a policy associated
> > with the SA the packet used?
> >
> > Virtual routing and forwarding support (i.e. multiple IPsec instances)
> > would be 

Re: [lng-odp] [PATCH] api:crypto: Adding IPSEC protocol APIs.

2016-09-27 Thread Peltonen, Janne (Nokia - FI/Espoo)
Hi,

Is this work toward a look-a-side IPsec offload or toward inline
IPsec offload?

How would the application offer packets to IPsec processing and how
would it get the processed packets back and how would it know which
policy and session was applied to the packet? What happens when
an outbound packet matches a policy for which there is no session?

Should there be support for "discard" policies/sessions that just
drop the packets?

It would be nice to have support for multiple instances of SPD and SAD
so that one would e.g. be able to do IPsec with virtual routing and
forwarding (or network namespaces) even when addresses etc overlap.

Does the API promise to do inbound policy check for decapsulated
packets, i.e. check that the clear text packets match a policy
associated with the SA that was used in the decapsulation,
or is it something that the application has to do itself?
IOW, I want to check that packets came through a tunnel they
were supposed to come and not through some unrelated tunnel.

> +typedef enum odp_ipsec_ar_ws {
> + ODP_IPSEC_AR_WS_NONE,  /**< Anti-replay is not enabled */
> + ODP_IPSEC_AR_WS_32,/**< Anti-replay window size 32 */
> + ODP_IPSEC_AR_WS_64,/**< Anti-replay window size 64 */
> + ODP_IPSEC_AR_WS_128,   /**< Anti-replay window size 128 */
> +} odp_ipsec_ar_ws_t;

I do not see why this would be better than specifying the
anti-replay window as an integer. The implementation would
then be free to select any window size greater or equal to
what was requested. The proposed API would impose a somewhat
artificial restriction and would need to be updated
if/when bigger window sizes would be supported.

> +typedef struct odp_ipsec_params {

There is no information about the used crypto-algorithms
and their keys here.

> + odp_bool_t auto_iv; /** Auto IV generation for each operation. */

What if this is false? Is the IV provided with the packet?
How?

> + odp_bool_t remove_outer_hdr; /** remove outer header - tunnel mode 
> decap */

What happens if this is false? ESP/AH gets removed and the
packet appears as IPIP-tunneled packet? Something else?

> +int odp_crypto_ipsec_session_create(odp_crypto_session_params_t *ses_params,
> + odp_ipsec_proto_t ipsec_proto,
> + odp_ipsec_params_t *ipsec_params,
> + odp_crypto_session_t *session_out,
> + odp_crypto_ses_create_err_t *status);

The first parameter looks redundant and the type has not been
defined. I wonder why ipsec_proto is a separate parameter and
not included in ipsec_params like all the other stuff?

> +struct odp_ipsec_spd_params{

How do I put the policies in priority order in case I have
overlapping policies? Not supporting overlapping policies
at all would of course be simpler but a bit restrictive.

> + enum odp_ipsec_policy_redside_fragmentation redside;
> + /**< Fragmentation before Encapsulation option: TRUE/FALSE */

Where does the MTU value come from? I suppose it could be
part of session or policy so that the application could
update it e.g. based on the current idea of PMTU.

I suppose the application needs to control, on a per-packet basis,
to which packets fragmentation is applied. I suppose normally one
might want to apply it to locally generated packets regardless of
DF and to forwarded packets with DF not set.

BR,
Janne