> Also note that this will break explicit IV support.

Why so? The IV is still set in every operation: the per-op
EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr) call remains.

        Janne
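
For reference, alternative d) from the original mail quoted at the bottom
(a limited-size cache of per-thread contexts) could look roughly like the
sketch below. It is only an illustration under assumptions: fake_ctx_t
stands in for an OpenSSL EVP_CIPHER_CTX, and ctx_cache_get(),
ctx_cache_term(), CTX_CACHE_SLOTS and the per-session generation counter
are hypothetical names, not existing ODP code. Each thread would own one
ctx_cache_t, so contexts are always created and freed in the thread (and
address space) that uses them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CTX_CACHE_SLOTS 4 /* hypothetical size; picking it is the hard part */

/* Hypothetical stand-in for an OpenSSL EVP_CIPHER_CTX */
typedef struct {
    uint32_t session_id;
} fake_ctx_t;

typedef struct {
    fake_ctx_t *ctx;     /* NULL when the slot is empty */
    uint32_t session_id; /* session this context was set up for */
    uint32_t generation; /* bumped when a session id is re-used */
    uint64_t last_used;  /* logical clock value, for LRU eviction */
} ctx_slot_t;

/* One instance per thread, e.g. in thread-local storage */
typedef struct {
    ctx_slot_t slot[CTX_CACHE_SLOTS];
    uint64_t clock;
    unsigned created;   /* statistics only */
    unsigned destroyed;
} ctx_cache_t;

/* Return the calling thread's context for a session, creating it on a miss.
 * A stale or least recently used context is freed here, in the owning
 * thread, so no other thread ever has to delete it. */
fake_ctx_t *ctx_cache_get(ctx_cache_t *cache, uint32_t session_id,
                          uint32_t generation)
{
    ctx_slot_t *victim;
    int i;

    assert(cache != NULL);
    victim = &cache->slot[0];
    cache->clock++;

    for (i = 0; i < CTX_CACHE_SLOTS; i++) {
        ctx_slot_t *s = &cache->slot[i];

        if (s->ctx != NULL && s->session_id == session_id) {
            if (s->generation == generation) {
                s->last_used = cache->clock;
                return s->ctx; /* hit: no allocation, no key setup */
            }
            victim = s; /* same id, old session: replace in place */
            break;
        }
        /* Prefer an empty slot, otherwise the least recently used one */
        if (victim->ctx != NULL &&
            (s->ctx == NULL || s->last_used < victim->last_used))
            victim = s;
    }

    if (victim->ctx != NULL) {
        free(victim->ctx); /* the postponed deletion happens here */
        victim->ctx = NULL;
        cache->destroyed++;
    }
    victim->ctx = malloc(sizeof(*victim->ctx));
    if (victim->ctx == NULL)
        return NULL;
    victim->ctx->session_id = session_id;
    victim->session_id = session_id;
    victim->generation = generation;
    victim->last_used = cache->clock;
    cache->created++;
    return victim->ctx;
}

/* Free everything that is left; meant for the thread's local termination */
void ctx_cache_term(ctx_cache_t *cache)
{
    int i;

    assert(cache != NULL);
    for (i = 0; i < CTX_CACHE_SLOTS; i++) {
        if (cache->slot[i].ctx != NULL) {
            free(cache->slot[i].ctx);
            cache->slot[i].ctx = NULL;
            cache->destroyed++;
        }
    }
}
```

A crypto operation would call ctx_cache_get() with its session id and
generation and then set only the IV on the returned context, while
ctx_cache_term() would run when the thread terminates. The generation
counter is just one possible way to notice a session that was destroyed
and re-created with the same id.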

> -----Original Message-----
> From: Dmitry Eremin-Solenikov [mailto:dmitry.ereminsoleni...@linaro.org]
> Sent: Tuesday, December 12, 2017 12:36 PM
> To: Peltonen, Janne (Nokia - FI/Espoo) <janne.pelto...@nokia.com>
> Cc: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>; Bill Fischofer <bill.fischo...@linaro.org>; lng-odp@lists.linaro.org
> Subject: Re: Suspected SPAM - Re: [lng-odp] IPsec and crypto performance and OpenSSL
> 
> Hello,
> 
> On 12 December 2017 at 12:38, Peltonen, Janne (Nokia - FI/Espoo)
> <janne.pelto...@nokia.com> wrote:
> >
> >> So, I'd suggest preallocating OpenSSL (per-thread) context memory in
> >> global_init(). I guess context allocation depends on the algorithm and
> >> other configuration, but we could, e.g., preallocate the most obvious ones
> >> (e.g. AES+SHA1 or AES-GCM) and leave all others as they are today. That
> >> would be a balance between a simple solution, process support, and better
> >> performance for the common case.
> >
> > That would help but since the same context would be shared by
> > many crypto sessions the keys would have to be configured in the
> > context in every crypto op.
> >
> > The current code looks like this (in every crypto op):
> >
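> >         /* New context, keyed from scratch for this one operation */
> >         ctx = EVP_CIPHER_CTX_new();
> >         EVP_EncryptInit_ex(ctx, session->cipher.evp_cipher, NULL,
> >                            session->cipher.key_data, NULL);
> >         /* Second call sets only the IV; cipher and key are kept */
> >         EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr);
> >         EVP_CIPHER_CTX_set_padding(ctx, 0);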
> >
> >         ret = internal_encrypt(ctx, pkt, param);
> >
> >         EVP_CIPHER_CTX_free(ctx);
> >
> > I tried this, with big speedup (and broken process support):
> >
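> >         /* Reuse a per-thread context that already has the key set */
> >         ctx = session->perthread[odp_thread_id()].cipher_ctx;
> >         /* Only the IV is configured per operation */
> >         EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv_ptr);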
> >
> >         ret = internal_encrypt(ctx, pkt, param);
> 
> Also note that this will break explicit IV support.
> 
> > What you propose, would be in between. Maybe I should try and
> > see how fast it would be.
> >
> >         Janne
> >
> >
> >> -----Original Message-----
> >> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of 
> >> Savolainen, Petri
> >> (Nokia - FI/Espoo)
> >> Sent: Tuesday, December 12, 2017 10:30 AM
> >> To: Bill Fischofer <bill.fischo...@linaro.org>; Dmitry Eremin-Solenikov
> >> <dmitry.ereminsoleni...@linaro.org>
> >> Cc: lng-odp@lists.linaro.org
> >> Subject: Suspected SPAM - Re: [lng-odp] IPsec and crypto performance and 
> >> OpenSSL
> >>
> >> We should not deliberately break process support. Since ODP and OFP are
> >> libraries, it's the application (e.g. NGINX) that creates the threads, and
> >> it may have historical or other valid reasons to fork processes instead of
> >> creating pthreads. Processes may be forked at many points. If we target
> >> fork after global init, we are making process support better instead of
> >> making it worse.
> >>
> >> So, I'd suggest preallocating OpenSSL (per-thread) context memory in
> >> global_init(). I guess context allocation depends on the algorithm and
> >> other configuration, but we could, e.g., preallocate the most obvious ones
> >> (e.g. AES+SHA1 or AES-GCM) and leave all others as they are today. That
> >> would be a balance between a simple solution, process support, and better
> >> performance for the common case.
> >>
> >> -Petri
> >>
> >>
> >>
> >> > -----Original Message-----
> >> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill
> >> > Fischofer
> >> > Sent: Tuesday, December 12, 2017 3:23 AM
> >> > To: Dmitry Eremin-Solenikov <dmitry.ereminsoleni...@linaro.org>
> >> > Cc: lng-odp@lists.linaro.org
> >> > Subject: Re: [lng-odp] IPsec and crypto performance and OpenSSL
> >> >
> >> > I think we've pretty much abandoned the notion that linux-generic will
> >> > support threads as separate processes as there seems to be little
> >> > justification for it. The current plan is to continue this assumption in
> >> > the "2.0" code base as well. So having OpenSSL rely on threads sharing
> >> > the same address space should not be a problem in practice.
> >> >
> >> > Any platform that has native HW crypto capabilities would certainly use
> >> > those in preference to OpenSSL, so again such restrictions should not be
> >> > of concern here.
> >> >
> >> > I agree, however, that this sort of tuning should be a follow-on
> >> > activity, perhaps under the general performance improvement work we want
> >> > for 2.0, but should not be a gating consideration for the Tiger Moth
> >> > release.
> >> >
> >> > On Mon, Dec 11, 2017 at 5:19 PM, Dmitry Eremin-Solenikov <
> >> > dmitry.ereminsoleni...@linaro.org> wrote:
> >> >
> >> > > On 11 December 2017 at 19:14, Maxim Uvarov <maxim.uva...@linaro.org>
> >> > > wrote:
> >> > > > odp_init_global() allocates shm, then odp_init_local() /
> >> > > > odp_term_local() allocates/destroys per-thread contexts in an array
> >> > > > in that shm. I think that has to work.
> >> > >
> >> > > The problem lies in the OpenSSL 1.1 "opaque structures" approach. They
> >> > > stopped providing exact struct definitions which can be embedded
> >> > > somewhere. Another option would be to switch to libnettle, licensed
> >> > > under LGPL-2.1+. It may, however, be less optimized compared to
> >> > > OpenSSL.
> >> > >
> >> > > > On 11 December 2017 at 17:02, Francois Ozog <francois.o...@linaro.org>
> >> > > > wrote:
> >> > > >
> >> > > >> I favor finishing ODP (ex 2.0) integration rather than optimizing
> >> > > >> linux-generic at this stage.
> >> > > >>
> >> > > >> On 11 December 2017 at 14:39, Peltonen, Janne (Nokia - FI/Espoo) <
> >> > > >> janne.pelto...@nokia.com> wrote:
> >> > > >>
> >> > > >> > Hi,
> >> > > >> >
> >> > > >> > When playing with IPsec I noticed that the Linux generic
> >> > > >> > ODP implementation creates a separate OpenSSL crypto context
> >> > > >> > for each crypto-operation as opposed to doing it at ODP
> >> > > >> > crypto session creation. With IPsec this adds a lot of
> >> > > >> > overhead for every packet processed and significantly
> >> > > >> > reduces packet throughput.
> >> > > >> >
> >> > > >> > I wonder what, if anything, should be done about it.
> >> > > >> >
> >> > > >> > I already almost sent a patch to create and initialize
> >> > > >> > crypto contexts only once per session but realized that
> >> > > >> > it is not that easy.
> >> > > >> >
> >> > > >> > Here are some alternatives that came to my mind, but all
> >> > > >> > of them have their own problems:
> >> > > >> >
> >> > > >> > a) Create per-thread OpenSSL contexts at crypto session
> >> > > >> >    creation time.
> >> > > >> >    - Does not work with ODP threads that do not share
> >> > > >> >      their address space since OpenSSL is allocating
> >> > > >> >      memory through malloc() during context creation.
> >> > > >> >
> >> > > >> > b) Do a) plus provide OpenSSL a custom memory allocator
> >> > > >> >    on top of shared memory.
> >> > > >> >    - There is no generic heap allocator in ODP code base.
> >> > > >> >
> >> > > >> > c) Create per-thread contexts lazily when needed.
> >> > > >> >    - Creation would work as it would happen in the right
> >> > > >> >      thread but there would be no way to delete the
> >> > > >> >      contexts. The thread destroying the ODP crypto
> >> > > >> >      session cannot delete the per-thread contexts that
> >> > > >> >      might reside in different address spaces. That
> >> > > >> >      thread could ask every other thread to do the
> >> > > >> >      per-thread cleanup, except that there is no mechanism
> >> > > >> >      for that without application assistance or big
> >> > > >> >      changes in the generic ODP code.
> >> > > >> >
> >> > > >> > d) Create a limited-size cache of per-thread contexts.
> >> > > >> >    - This would allow postponing the deletion of each
> >> > > >> >      context either to the point the cache slot needs
> >> > > >> >      to be reused or all the way to ODP termination,
> >> > > >> >      both occurring in the right thread. But this is
> >> > > >> >      getting complicated and sizing the cache is nasty.
> >> > > >> >
> >> > > >> > Any thoughts?
> >> > > >> >
> >> > > >> >         Janne
> >> > > >> >
> >> > > >> >
> >> > > >> >
> >> > > >>
> >> > > >>
> >> > > >> --
> >> > > >> François-Frédéric Ozog | *Director Linaro Networking Group*
> >> > > >> T: +33.67221.6485
> >> > > >> francois.o...@linaro.org | Skype: ffozog
> >> > > >>
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > With best wishes
> >> > > Dmitry
> >> > >
> 
> 
> 
> --
> With best wishes
> Dmitry
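
As a side note on alternative b) in the original mail (a custom allocator
on top of shared memory): a very small fixed-block pool might be enough
to experiment with. The sketch below is an assumption-laden illustration,
not ODP or OpenSSL code; shm_pool_t, shm_pool_init(), shm_pool_alloc(),
shm_pool_free() and both size constants are hypothetical. It has no
locking and no realloc, and because the returned pointers are plain
addresses, the region would have to be mapped at the same address in
every process (for example by reserving it before fork).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE 512 /* hypothetical: must fit the largest request */
#define POOL_NUM_BLOCKS 8   /* hypothetical */
#define POOL_NONE (-1)

typedef struct {
    int32_t free_head;             /* index of first free block */
    int32_t next[POOL_NUM_BLOCKS]; /* free-list links kept as indices */
    union {
        max_align_t align;         /* forces malloc-like alignment */
        unsigned char data[POOL_BLOCK_SIZE];
    } block[POOL_NUM_BLOCKS];
} shm_pool_t;

/* Put every block on the free list; call once on the fresh region */
void shm_pool_init(shm_pool_t *p)
{
    int32_t i;

    for (i = 0; i < POOL_NUM_BLOCKS; i++)
        p->next[i] = (i + 1 < POOL_NUM_BLOCKS) ? i + 1 : POOL_NONE;
    p->free_head = 0;
}

/* Hand out one whole block, or NULL if the request is too big or the
 * pool is exhausted */
void *shm_pool_alloc(shm_pool_t *p, size_t size)
{
    int32_t i = p->free_head;

    if (size > POOL_BLOCK_SIZE || i == POOL_NONE)
        return NULL;
    p->free_head = p->next[i];
    return p->block[i].data;
}

/* Return a block to the free list; NULL is ignored like in free() */
void shm_pool_free(shm_pool_t *p, void *ptr)
{
    ptrdiff_t off;
    int32_t i;

    if (ptr == NULL)
        return;
    off = (unsigned char *)ptr - (unsigned char *)p->block;
    assert(off >= 0 && (size_t)off < sizeof(p->block));
    assert((size_t)off % sizeof(p->block[0]) == 0);
    i = (int32_t)((size_t)off / sizeof(p->block[0]));
    p->next[i] = p->free_head;
    p->free_head = i;
}
```

In a real experiment the shm_pool_t would live in a shared memory block
instead of ordinary memory and would need a lock that works across
processes. Whether fixed-size blocks are sufficient for what OpenSSL
allocates during context creation is an open question here.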
