Another interesting portable atomics library is
https://github.com/mintomic/mintomic
FYI, I took a stab at a simple portable atomics that uses GCC/clang
__atomic, or __sync, or Win32 Interlocked*, or a single global lock, and
with a fallback to unsafe, non-atomic implementations for no-threads
con
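
A minimal sketch of the layering described in the message above, written in C. This is not the poster's library: the function name `my_atomic_add_32` and the `NO_THREADS` macro are hypothetical, and only the order of fallbacks (GCC/clang `__atomic`, then `__sync`, then Win32 Interlocked*, then a single global lock, then a non-atomic version) follows the text.

```c
#include <stdint.h>

#if defined(__GNUC__)
/* GCC and clang provide the builtins directly; no header is needed. */
#elif defined(_WIN32)
#include <windows.h>
#elif !defined(NO_THREADS)          /* NO_THREADS is a hypothetical build flag */
#include <pthread.h>
static pthread_mutex_t global_atomic_lock = PTHREAD_MUTEX_INITIALIZER;
#endif

/* Hypothetical helper: add delta to *p and return the new value. */
static int32_t my_atomic_add_32(volatile int32_t *p, int32_t delta)
{
#if defined(__GNUC__) && defined(__ATOMIC_SEQ_CST)
    /* Newer GCC/clang: __atomic builtins with an explicit memory order. */
    return __atomic_add_fetch(p, delta, __ATOMIC_SEQ_CST);
#elif defined(__GNUC__)
    /* Older GCC: __sync builtins, which imply a full barrier. */
    return __sync_add_and_fetch(p, delta);
#elif defined(_WIN32)
    /* InterlockedExchangeAdd returns the old value, so add delta back. */
    return (int32_t)InterlockedExchangeAdd((volatile LONG *)p, (LONG)delta)
           + delta;
#elif !defined(NO_THREADS)
    /* No true atomics: serialize every update through one global lock. */
    int32_t result;
    pthread_mutex_lock(&global_atomic_lock);
    result = (*p += delta);
    pthread_mutex_unlock(&global_atomic_lock);
    return result;
#else
    /* Unsafe, non-atomic version for builds configured without threads. */
    return (*p += delta);
#endif
}
```

Because the selection happens in the preprocessor, callers see one function regardless of which backend was compiled in.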
On Tue, Dec 15, 2015 at 10:35:58PM +0100, Florian Weimer wrote:
> * Kurt Roeckx:
>
> > On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
> >> * Nico Williams:
> >>
> >> > On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
> >> >> > Maybe http://trac.mpich.org/projects/o
* Kurt Roeckx:
> On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
>> * Nico Williams:
>>
>> > On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
>> >> > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
>> >>
>> >> It seems to have trouble keeping up wi
On Tue, Dec 15, 2015 at 07:54:35PM +0100, Kurt Roeckx wrote:
> Also, if you want to use atomics we really want the C11 / C++11
> memory model which prevents certain important optimizations.
Right, because compilers can reorder some operations. But we've been
living with this pre-C11 for decades.
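
As an illustration of what the C11 memory model adds, here is a small sketch using `<stdatomic.h>`. The names `payload`, `ready`, `publish` and `try_consume` are made up for this example and do not come from the thread. The release store and acquire load tell the compiler and CPU which reorderings are not allowed, which pre-C11 code had to arrange with compiler-specific barriers.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* ordinary, non-atomic data */
static atomic_bool ready;    /* static storage, so it starts out false */

/* Producer side: write the data first, then publish the flag. */
static void publish(int value)
{
    payload = value;
    /* Release: the write to payload may not be moved after this store. */
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: returns true and fills *out once the data is published. */
static bool try_consume(int *out)
{
    /* Acquire: the read of payload may not be moved before this load. */
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```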
On Tue, Dec 15, 2015 at 06:15:32PM +0000, Salz, Rich wrote:
> > I.e., between compiler non-C11 atomic intrinsics, C11 intrinsics, OS atomic
> > function libraries, and portable open-source atomics libraries, we can cover
> > almost all the bases.
>
> Agreed.
Thanks. This is helpful. I now think
On Tue, Dec 15, 2015 at 09:57:32AM -0600, Benjamin Kaduk wrote:
> On 12/15/2015 06:43 AM, Kurt Roeckx wrote:
> > On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
> >> * Nico Williams:
> >> Not on Windows.
> >>
> >>> What's the alternative anyways?
> >> Using C++11.
> > I think this i
> I.e., between compiler non-C11 atomic intrinsics, C11 intrinsics, OS atomic
> function libraries, and portable open-source atomics libraries, we can cover
> almost all the bases.
Agreed.
> We have a surfeit of options, not a dearth of them. I don't think lack of
> atomics primitives is remotel
On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
> * Nico Williams:
>
> > On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
> >> > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
> >>
> >> It seems to have trouble keeping up with new architectures.
>
On Tue, Dec 15, 2015 at 09:57:32AM -0600, Benjamin Kaduk wrote:
> On 12/15/2015 06:43 AM, Kurt Roeckx wrote:
> > On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
> >> Using C++11.
> > I think this is a relevant article:
> > http://herbsutter.com/2012/05/03/reader-qa-what-about-vc-and
On 12/15/2015 06:43 AM, Kurt Roeckx wrote:
> On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
>> * Nico Williams:
>> Not on Windows.
>>
>>> What's the alternative anyways?
>> Using C++11.
> I think this is a relevant article:
> http://herbsutter.com/2012/05/03/reader-qa-what-about-vc
On Tue, Dec 15, 2015 at 01:24:12PM +0100, Florian Weimer wrote:
> * Nico Williams:
>
> > On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
> >> > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
> >>
> >> It seems to have trouble keeping up with new architectures.
>
* Nico Williams:
> On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
>> > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
>>
>> It seems to have trouble keeping up with new architectures.
>
> New architectures are not really a problem because between a) decent
> com
On Thu, Dec 10, 2015 at 07:06:15AM +1000, Paul Dale wrote:
> Thanks for the clarification. I was making an assumption that
> following the existing locking model, which did seem overcomplicated,
> was desirable. Now that that is shot down, things can be much
> simpler.
Exactly :)
Sorry if I wa
Nico,
Thanks for the clarification. I was making an assumption that following the
existing locking model, which did seem overcomplicated, was desirable. Now
that that is shot down, things can be much simpler.
It would make more sense to have a structure containing the reference counter
and
-dev] [openssl-team] Discussion: design issue: async
and -lpthread
> The "have-atomics" is intended to test if the callback was installed by the
> user.
I want to move away from runtime callback installations. It makes it too hard
to know what the library is doing, if it
> The "have-atomics" is intended to test if the callback was installed by the
> user.
I want to move away from runtime callback installations. It makes it too hard
to know what the library is doing, if it is correct, and it complicates the
code. There is almost never any reason for the flexib
> "have-atomics" must be known at compile time.
>
> "lock" should not be needed because we should always have atomics, even
> when we don't have true atomics: just use a global lock in a stub
> implementation of atomic_add() and such. KISS. Besides, this will add
> pressure to add true atomics
On Wed, Dec 09, 2015 at 02:33:46AM -0600, Nico Williams wrote:
> No more installing callbacks to get locking and atomics.
I should explain why.
First, lock callbacks are a serious detriment to usability.
Second, they are an admission that OpenSSL is incomplete.
Third, if we have lock callbacks
No more installing callbacks to get locking and atomics. This has to all
work out of the box (the user could be allowed to supply their
own implementations of these things at OpenSSL build time, but that's it,
not at run-time).
Nico
--
___
openssl-dev
The "have-atomics" is intended to test if the callback was installed by the
user. If we're using an atomic library or compiler support, then it isn't
required since we know we've got them.
Likewise, the lock argument isn't required if atomics are used everywhere.
However, some code will need
On Wed, Dec 09, 2015 at 09:27:16AM +1000, Paul Dale wrote:
> It will be possible to support atomics in such a way that there is no
> performance penalty for machines without them or for single threaded
> operation. My sketchy design is along the lines of adding a new API
> CRYPTO_add_atomic that ta
It will be possible to support atomics in such a way that there is no
performance penalty for machines without them or for single threaded operation.
My sketchy design is along the lines of adding a new API CRYPTO_add_atomic that
takes the same arguments as CRYPTO_add (i.e. reference to counter, v
On Tue, Dec 08, 2015 at 11:19:32AM +0100, Florian Weimer wrote:
> > Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
>
> It seems to have trouble keeping up with new architectures.
New architectures are not really a problem because between a) decent
compilers with C11 and/or non-C
* Nico Williams:
> Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
It seems to have trouble keeping up with new architectures.
___
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev
Maybe http://trac.mpich.org/projects/openpa/ would fit the bill?
On Mon, Dec 07, 2015 at 02:41:35PM +0100, Florian Weimer wrote:
> On 11/25/2015 06:48 PM, Kurt Roeckx wrote:
> > Please note that we use C, not C++. But C11 has the same atomics
> > extensions as C++11.
>
> C++11 support is much more widespread than C11 support. You will have
> trouble finding r
On 11/25/2015 06:48 PM, Kurt Roeckx wrote:
> On Wed, Nov 25, 2015 at 01:02:29PM +0100, Florian Weimer wrote:
>> On 11/23/2015 11:08 PM, Kurt Roeckx wrote:
>>
>>> I think that we currently don't do any compile / link test to
>>> detect features but that we instead explicitly say so for each
>>> plat
The figures were for connection reestablishment, RSA computations etc simply
don't feature. For initial connection establishment, on the other hand, they
are the single largest factor. The crypto is definitely not the bottleneck for
this case.
Pauli
--
Oracle
Dr Paul Dale | Cryptographer |
On Tuesday 01 December 2015 09:21:34 Paul Dale wrote:
> > are you sure that the negotiated cipher suite is the same and that
> > the NSS is not configured to reuse the server key share if you're
> > using DHE or ECDHE?
>
> There is definitely scope for improvement here. My atomic operation
> sugg
I'd suggest checking where the bottlenecks are before making major structural changes. I'll admit we have made a few changes to the basic OpenSSL sources but I don't see unacceptable amounts of locking even on large machines (hundreds of processing units) with thousands of threads. Blinding and the RN
On Tue, Dec 01, 2015 at 09:21:34AM +1000, Paul Dale wrote:
> However, the obstacle preventing 100% CPU utilisation for both stacks
> is lock contention. The NSS folks apparently spent a lot of effort
> addressing this and they have a far more scalable locking model than
> OpenSSL: one lock per con
> are you sure that the negotiated cipher suite is the same and that the
> NSS is not configured to reuse the server key share if you're using DHE
> or ECDHE?
The cipher suite was the same. I'd have to check to see exactly which was
used. It is certainly possible that NSS was configured as yo
On Mon, Nov 23, 2015 at 11:56:54PM +0000, Viktor Dukhovni wrote:
> > It may be a good idea to rethink locking completely.
>
> There is some glimmer of hope in that as various libcrypto structures
> become opaque, the locking moves from application code into the
> library. For example, we now have
On Tuesday 24 November 2015 10:49:26 Paul Dale wrote:
> On Mon, 23 Nov 2015 11:11:37 PM Alessandro Ghedini wrote:
> > Is this TLS connections?
>
> Yes, this is just measuring the TLS handshake. Renegotiations
> predominately. We deliberately didn't test the bulk symmetric crypto
> phase of the co
On Mon, 23 Nov 2015 11:11:37 PM Alessandro Ghedini wrote:
> Is this TLS connections?
Yes, this is just measuring the TLS handshake. Renegotiations predominately.
We deliberately didn't test the bulk symmetric crypto phase of the connection.
> I'd like to know more...
The data are a bit rough a
On Wed, Nov 25, 2015 at 01:02:29PM +0100, Florian Weimer wrote:
> On 11/23/2015 11:08 PM, Kurt Roeckx wrote:
>
> > I think that we currently don't do any compile / link test to
> > detect features but that we instead explicitly say so for each
> > platform.
> >
> > Anyway, the gcc documentati
On 11/23/2015 11:08 PM, Kurt Roeckx wrote:
> I think that we currently don't do any compile / link test to
> detect features but that we instead explicitly say so for each
> platform.
>
> Anyway, the gcc documentation is here:
> https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
>
> TLS su
On Tue, Nov 24, 2015 at 03:16:59PM +0000, Jonathan Larmour wrote:
> On 23/11/15 20:34, Matt Caswell wrote:
> > One other option we could pursue is to use the "__thread" syntax for
> > thread local variables and avoid the need for libpthread altogether. An
> > earlier version of the code did this. I
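
For readers unfamiliar with the "__thread" syntax mentioned above, a minimal sketch follows. It assumes GCC or clang, where `__thread` is a storage-class extension; the structure and function names are invented for illustration and are not taken from the OpenSSL async code. No pthread symbol is referenced, which is why this approach was being considered as a way around -lpthread.

```c
/* Hypothetical per-thread state; the field names are illustrative only. */
struct thread_state {
    int last_error;
    unsigned long calls;
};

/* Each thread gets its own zero-initialized copy of this variable. */
static __thread struct thread_state tls_state;

/* Record an error code in the calling thread's private state. */
static void set_last_error(int err)
{
    tls_state.last_error = err;
    tls_state.calls++;
}

/* Read back the error code recorded by the calling thread. */
static int get_last_error(void)
{
    return tls_state.last_error;
}
```

The compiler and loader resolve the per-thread address, so no key creation or lookup call is made at run time.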
On 24/11/15 15:16, Jonathan Larmour wrote:
> On 23/11/15 20:34, Matt Caswell wrote:
>> On 23/11/15 17:49, Nico Williams wrote:
>>
>>> Still, if -lpthread avoidance were still desired, you'd have to find an
>>> alternative to pthread_key_create(), pthread_getspecific(), and friends.
>>
>> Just a p
On 23/11/15 20:34, Matt Caswell wrote:
> On 23/11/15 17:49, Nico Williams wrote:
>
>> Still, if -lpthread avoidance were still desired, you'd have to find an
>> alternative to pthread_key_create(), pthread_getspecific(), and friends.
>
> Just a point to note about this. The async code that introd
> On Nov 24, 2015, at 2:13 AM, Nico Williams wrote:
>
> If the OpenSSL team finally decides to do something about sane locking
> by default, then it will be a huge improvement. If this thread provides
> the impetus, so much the better.
I hope that happens. It would certainly make a big contri
On Tue, Nov 24, 2015 at 11:32:32AM +1000, Peter Waltenberg wrote:
> I wasn't saying there was anything wrong with mmap(), just that guard pages
> only work if you can guarantee your overrun hits the guard page (and
> doesn't just step over it). Large stack allocations increase the odds of
> 'steppi
On Tue, Nov 24, 2015 at 11:32:32AM +1000, Peter Waltenberg wrote:
> As for fibres, I doubt it'll work in general; the issue there is simply
> the range of OS's OpenSSL supports. If you wire it in you still have to run
> with man+dog+world in the process, that's a hard ask. One of the good
> point
e ends up in the same
process.
Peter
From: Nico Williams
To: openssl-dev@openssl.org
Date: 24/11/2015 10:42
Subject: Re: [openssl-dev] [openssl-team] Discussion: design issue:
async and -lpthread
Sent by: "openssl-dev"
On Mon, Nov 23, 2015 at
It does, but it also requires code changes in a few places. probable_prime()
in bn_prime.c being far and away the worst offender.
This is fixed in master which uses malloc and free. Actually, I think all
egregious stack consumption has been fixed in master.
On Mon, Nov 23, 2015 at 09:53:15PM +1000, Peter Waltenberg wrote:
>
> "
> Please do. It will make this much safer. Also, you might want to run
> some experiments to find the best stack size on each platform. The
> smaller the stack you can get away with, the better.
> "
>
> It does, but it als
On Mon, Nov 23, 2015 at 05:28:18PM -0600, Nico Williams wrote:
> It may be a good idea to rethink locking completely.
There is some glimmer of hope in that as various libcrypto structures
become opaque, the locking moves from application code into the
library. For example, we now have (yet to be
valuable - i.e. way back in the dim distant past when Linux had
multiple thread packages available.
Peter
From: Nico Williams
To: openssl-dev@openssl.org
Date: 24/11/2015 06:49
Subject: Re: [openssl-dev] [openssl-team] Discussion: design issue:
async and -lpthread
S
On Mon, Nov 23, 2015 at 10:18:27PM +0000, Matt Caswell wrote:
> On 23/11/15 21:56, Paul Dale wrote:
> > Somewhat tangentially related to this is how thread locking in
> > OpenSSL is slowing things up.
>
> Alessandro has submitted an interesting patch to provide a much better
> threading API. S
On Tue, Nov 24, 2015 at 07:56:15AM +1000, Paul Dale wrote:
> Somewhat tangentially related to this is how thread locking in OpenSSL is
> slowing things up.
>
> We've been doing some connection establishment performance analysis recently
> and have discovered a lot of waiting on locks is occurr
Thanks for the quick reply. That patch looks much improved on this front.
We'll wait for the changes and then retest performance.
Thanks again,
Pauli
On Mon, 23 Nov 2015 10:18:27 PM Matt Caswell wrote:
>
> On 23/11/15 21:56, Paul Dale wrote:
> > Somewhat tangentially related to this is the h
> https://github.com/openssl/openssl/pull/451
> I'm not sure what the current status of this is though.
I've made several comments I think need to be addressed before we should merge
it.
On 23/11/15 21:56, Paul Dale wrote:
> Somewhat tangentially related to this is how thread locking in
> OpenSSL is slowing things up.
Alessandro has submitted an interesting patch to provide a much better
threading API. See:
https://github.com/openssl/openssl/pull/451
I'm not sure what the
On Mon, Nov 23, 2015 at 02:48:25PM -0600, Nico Williams wrote:
>
> I use this in an autoconf project (I know, OpenSSL doesn't use autoconf):
>
> dnl Thread local storage
> have___thread=no
> AC_MSG_CHECKING(for thread-local storage)
> AC_LINK_IFELSE([AC_LANG_SOURCE([
> static __thread i
Somewhat tangentially related to this is how thread locking in OpenSSL is
slowing things up.
We've been doing some connection establishment performance analysis recently
and have discovered a lot of waiting on locks is occurring. By far the worst
culprit is CRYPTO_LOCK_EVP_PKEY in CRYPTO_a
On Mon, Nov 23, 2015 at 08:34:29PM +0000, Matt Caswell wrote:
> On 23/11/15 17:49, Nico Williams wrote:
> > On a slightly related note, I asked and Viktor tells me that fiber
> > stacks are allocated with malloc(). I would prefer that they were
> > allocated with mmap(), because then you get a gua
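
A rough sketch of allocating a stack with mmap() and an inaccessible guard page, which other messages in this thread discuss. It assumes a POSIX system with MAP_ANONYMOUS and a stack that grows downward; the helper names `stack_alloc_guarded` and `stack_free_guarded` are hypothetical and are not part of OpenSSL.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map usable_size bytes (rounded up to whole pages)
 * plus one guard page below them. Returns the lowest usable address, or
 * NULL on failure. *mapped_size receives the total mapping length.
 */
static void *stack_alloc_guarded(size_t usable_size, size_t *mapped_size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t rounded = (usable_size + page - 1) / page * page;
    size_t total = rounded + page;
    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return NULL;
    /* Make the lowest page inaccessible so a small overrun faults. */
    if (mprotect(base, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    *mapped_size = total;
    return base + page;
}

/* Release a stack obtained from stack_alloc_guarded(). */
static void stack_free_guarded(void *usable, size_t mapped_size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    munmap((char *)usable - page, mapped_size);
}
```

A single guard page only catches overruns that actually touch it; a large stack allocation can step over the page entirely, which is the caveat Peter Waltenberg raises elsewhere in the thread.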
On 23/11/15 17:49, Nico Williams wrote:
> [Resend, with slight edits.]
>
> [Viktor asked me for my advice on this issue and bounced me the post
> that I'm following up to. -Nico]
>
> The summary of what I've to say is that making libcrypto and libssl need
> -lpthread is something that does re
On Mon, Nov 23, 2015 at 01:53:47AM +0000, Viktor Dukhovni wrote:
[NetBSD header commentary extracts:]
> /*
> * Use macros to rename many pthread functions to the corresponding
> * libc symbols which are either trivial/no-op stubs or the real
No renaming is necessary if one's link-editor and RT
[Resend, with slight edits.]
[Viktor asked me for my advice on this issue and bounced me the post
that I'm following up to. -Nico]
The summary of what I've to say is that making libcrypto and libssl need
-lpthread is something that does require discussion, as it will have
detrimental effects on
[Viktor asked me for my advice on this issue and bounced me the post
that I'm following up to. -Nico]
The summary of what I've to say is that making libcrypto and libssl need
-lpthread is something that does require discussion, as it will have
detrimental effects on some users. Personally, I th