We should not deliberately break process support. Since ODP and OFP are 
libraries, it's the application (e.g. NGINX) that creates the threads and it 
may have historical or other valid reasons to fork processes instead of 
creating pthreads. Processes may be forked at many points. If we target fork 
after global init, we are making process support better instead of making it 
worse.

So, I'd suggest preallocating the OpenSSL (per-thread) context memory in 
global_init(). I guess context allocation depends on the algorithm and other 
configuration, but we could e.g. pre-allocate the most obvious ones (e.g. 
AES+SHA1, or AES-GCM) and leave all others as they are today. That would be a 
balance between a simple solution, process support and better performance for 
the common case.

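As a rough sketch of the idea (not actual ODP or OpenSSL code): struct 
crypto_ctx, MAX_THREADS, the ALGO_* names and all three functions below are 
made up for illustration, and a plain malloc() stands in for whatever OpenSSL 
call would really create the context (presumably something like 
EVP_CIPHER_CTX_new()). The point is only that memory allocated before fork() 
is still there in every forked worker.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_THREADS 8 /* hypothetical upper bound on ODP threads */

/* Only the "obvious" algorithms get a pre-allocated context. */
enum { ALGO_AES_SHA1, ALGO_AES_GCM, ALGO_COUNT };

/* Hypothetical stand-in for an opaque OpenSSL context. */
struct crypto_ctx {
    int algo;
    int ready;
};

/* One context per (thread, algorithm), filled in before any fork(). */
static struct crypto_ctx *ctx_pool[MAX_THREADS][ALGO_COUNT];

/* Would be called from global init, i.e. before workers are forked. */
int global_init_prealloc(void)
{
    for (int t = 0; t < MAX_THREADS; t++) {
        for (int a = 0; a < ALGO_COUNT; a++) {
            /* Assumption: real code would create an OpenSSL context here. */
            struct crypto_ctx *c = malloc(sizeof(*c));

            if (c == NULL)
                return -1;
            c->algo = a;
            c->ready = 1;
            ctx_pool[t][a] = c;
        }
    }
    return 0;
}

/* Per-operation lookup: no allocation on the fast path. */
struct crypto_ctx *ctx_lookup(int thr, int algo)
{
    if (thr < 0 || thr >= MAX_THREADS || algo < 0 || algo >= ALGO_COUNT)
        return NULL;
    return ctx_pool[thr][algo];
}

/* Fork a worker after init; returns 1 if the child could use its context. */
int worker_sees_ctx(int thr, int algo)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: the pointer is valid in its copy of the address space. */
        struct crypto_ctx *c = ctx_lookup(thr, algo);

        _exit(c != NULL && c->ready && c->algo == algo ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Each forked worker ends up with its own private copy of the contexts, which 
is fine as long as they are strictly per thread. Algorithms outside the 
pre-allocated set would simply keep the current per-operation path.
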
-Petri



> -----Original Message-----
> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Bill
> Fischofer
> Sent: Tuesday, December 12, 2017 3:23 AM
> To: Dmitry Eremin-Solenikov <dmitry.ereminsoleni...@linaro.org>
> Cc: lng-odp@lists.linaro.org
> Subject: Re: [lng-odp] IPsec and crypto performance and OpenSSL
> 
> I think we've pretty much abandoned the notion that linux-generic will
> support threads as separate processes as there seems to be little
> justification for it. The current plan is to continue this assumption in
> the "2.0" code base as well. So having OpenSSL rely on threads sharing the
> same address space should not be a problem in practice.
> 
> Any platform that has native HW crypto capabilities would certainly use
> those in preference to OpenSSL, so again such restrictions should not be of
> concern here.
> 
> I agree, however, that this sort of tuning should be a follow-on activity,
> perhaps under the general performance improvement work we want for 2.0, but
> should not be a gating consideration for the Tiger Moth release.
> 
> On Mon, Dec 11, 2017 at 5:19 PM, Dmitry Eremin-Solenikov <
> dmitry.ereminsoleni...@linaro.org> wrote:
> 
> > On 11 December 2017 at 19:14, Maxim Uvarov <maxim.uva...@linaro.org>
> > wrote:
> > > odp_init_global() allocates shm, then odp_init_local() / odp_term_local()
> > > allocates/destroys per thread contexts in array in that shm. I think that
> > > has to work.
> >
> > The problem lies in OpenSSL 1.1 "opaque structures" approach. They stopped
> > providing exact struct definitions which can be embedded somewhere. Another
> > option would be to switch to libnettle licensed under LGPL-2.1+. It may be
> > however less optimized compared to OpenSSL.
> >
> > > On 11 December 2017 at 17:02, Francois Ozog <francois.o...@linaro.org>
> > > wrote:
> > >
> > >> I favor finishing ODP (ex 2.0) integration rather than optimizing
> > >> linux-generic at this stage.
> > >>
> > >> On 11 December 2017 at 14:39, Peltonen, Janne (Nokia - FI/Espoo) <
> > >> janne.pelto...@nokia.com> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > When playing with IPsec I noticed that the Linux generic
> > >> > ODP implementation creates a separate OpenSSL crypto context
> > >> > for each crypto-operation as opposed to doing it at ODP
> > >> > crypto session creation. With IPsec this adds a lot of
> > >> > overhead for every packet processed and significantly
> > >> > reduces packet throughput.
> > >> >
> > >> > I wonder what, if anything, should be done about it.
> > >> >
> > >> > I already almost sent a patch to create and initialize
> > >> > crypto contexts only once per session but realized that
> > >> > it is not that easy.
> > >> >
> > >> > Here are some alternatives that came to my mind, but all
> > >> > of them have their own problems:
> > >> >
> > >> > a) Create per-thread OpenSSL contexts at crypto session
> > >> >    creation time.
> > >> >    - Does not work with ODP threads that do not share
> > >> >      their address space since OpenSSL is allocating
> > >> >      memory through malloc() during context creation.
> > >> >
> > >> > b) Do a) plus provide OpenSSL a custom memory allocator
> > >> >    on top of shared memory.
> > >> >    - There is no generic heap allocator in ODP code base.
> > >> >
> > >> > c) Create per-thread contexts lazily when needed.
> > >> >    - Creation would work as it would happen in the right
> > >> >      thread but there would be no way to delete the
> > >> >      contexts. The thread destroying the ODP crypto
> > >> >      session cannot delete the per-thread contexts that
> > >> >      might reside in a different address space. That
> > >> >      thread could ask every other thread to do the
> > >> >      per-thread cleanup, except that there is no mechanism
> > >> >      for that without application assistance or big
> > >> >      changes in the generic ODP code.
> > >> >
> > >> > d) Create a limited-size cache of per-thread contexts.
> > >> >    - This would allow postponing the deletion of each
> > >> >      context either to the point the cache slot needs
> > >> >      to be reused or all the way to ODP termination,
> > >> >      both occurring in the right thread. But this is
> > >> >      getting complicated and sizing the cache is nasty.
> > >> >
> > >> > Any thoughts?
> > >> >
> > >> >         Janne
> > >> >
> > >> >
> > >> >
> > >>
> > >>
> > >> --
> > >> [image: Linaro] <http://www.linaro.org/>
> > >> François-Frédéric Ozog | *Director Linaro Networking Group*
> > >> T: +33.67221.6485
> > >> francois.o...@linaro.org | Skype: ffozog
> > >>
> >
> >
> >
> > --
> > With best wishes
> > Dmitry
> >
