On Fri, Apr 18, 2014 at 02:41:07PM -0400, Thor Lancelot Simon wrote:
>
> Of the few systems which do have instructions that accellerate AES, on
> the most common implementation -- x86 -- we cannot use the instructions
> in question in the kernel because they use CPU state we do not save/
> restore
Thor Lancelot Simon wrote:
> On Wed, Apr 16, 2014 at 09:52:22PM -0400, Thor Lancelot Simon wrote:
> >
> > Attached is a patch which makes cprng_fast per-CPU and lockless. *IT
> > IS NOT WELL TESTED YET (I haven't even run test vectors) AND IS ONLY
> > FOR REVIEW.*
>
> New diff, with some missin
Date: Fri, 18 Apr 2014 21:50:46 -0400
From: Thor Lancelot Simon
> Are there actually any callers of cprng_fast at IPL_HIGH? Are there
> actually any legitimate random decisions to be made at IPL_HIGH? I'm
> sceptical.
What do you get if you cross an elephant and a rhinocerous
> I would still suggest Salsa20 or ChaCha. My measurements with naive C
> code suggest that, if you buffer the output for short outputs, these
> take on average 40-50 Ivy Bridge cycles per request. (If you don't
> buffer the output, it's 300 cycles.) Long requests get ~4 cpb. In
> contrast, lib
Miscellaneous notes -- I'm doing the benchmarking we discussed and
pondering whether it is right to switch from hc-128 to salsa20, separately.
On Thu, Apr 17, 2014 at 09:33:28PM +, Taylor R Campbell wrote:
>
> > +void
> > +hc128_init(hc128_state_t *state, const uint8_t *key, const uint8_t *iv
On Fri, Apr 18, 2014 at 05:05:37PM -0400, Thor Lancelot Simon wrote:
> On Fri, Apr 18, 2014 at 05:00:50PM -0400, Thor Lancelot Simon wrote:
> >
> > Unfortunately, the virtual machines on this laptop that I use for most
> > NetBSD development don't expose the AES-NI instructions to guests, even
> >
tls@ wrote:
> > Note the caller of this hc128_init() is:
> >
> > I'm afraid "9KB stack on rekeying" is fatal on most ports.
>
> Well, the cipher should hardly ever get rekeyed. The rekeying
> intervals could be considerably larger; I did not want to
> increase them too much compared to the old
On Apr 18, 2014, at 11:23 AM, Markku-Juhani Olavi Saarinen wrote:
> It has been there on all new systems purchased in some last 3 years,
> so I would *guess* that it would be > 50% of systems fielded out
> there.
Not everything is x86 based.
On Fri, Apr 18, 2014 at 11:04:04PM +0200, Markku-Juhani Olavi Saarinen wrote:
> Hi,
>
> Just one last thought: it will be on all *future* systems ?
No, it won't, unless you have some funny definition of "system" that
excludes anything that's not a high-end x86 implementation from one
particular m
On Fri, Apr 18, 2014 at 05:00:50PM -0400, Thor Lancelot Simon wrote:
>
> Unfortunately, the virtual machines on this laptop that I use for most
> NetBSD development don't expose the AES-NI instructions to guests, even
> when doing hardware assisted virtualization. Not RDRAND neither, for
So, sin
Hi,
Just one last thought: it will be on all *future* systems ?
Cheers,
- markku
Dr. Markku-Juhani O. Saarinen US +1 (424) 666 2713
On Fri, Apr 18, 2014 at 11:00 PM, Thor Lancelot Simon wrote:
> On Fri, Apr 18, 2014 at 09:54:09PM +0100, Roland C. Dowdeswell wrote:
>> On Fri, Apr 18, 2014 at
On Fri, Apr 18, 2014 at 09:54:09PM +0100, Roland C. Dowdeswell wrote:
> On Fri, Apr 18, 2014 at 08:23:11PM +0200, Markku-Juhani Olavi Saarinen wrote:
> >
>
> > Agreed. AES is worse if you don't have AES-NI.
> >
> > It has been there on all new systems purchased in some last 3 years,
> > so I woul
On Fri, Apr 18, 2014 at 08:23:11PM +0200, Markku-Juhani Olavi Saarinen wrote:
>
> Agreed. AES is worse if you don't have AES-NI.
>
> It has been there on all new systems purchased in some last 3 years,
> so I would *guess* that it would be > 50% of systems fielded out
> there.
It hasn't been the
On Fri, Apr 18, 2014 at 08:33:20PM +0200, Thomas Klausner wrote:
> On Fri, Apr 18, 2014 at 01:39:18PM -0400, Thor Lancelot Simon wrote:
> > How do you count to 9K? I see:
> >
> > 2K for p
> > 2K for q
> > 1280 bytes for w
>
> Are you talking about this w?
> + uint32_t w[1280],
On Fri, Apr 18, 2014 at 06:11:39PM +, Taylor R Campbell wrote:
>
> The majority of systems certainly don't have AES-NI. Only some recent
> Intel CPUs do, and we can't use it in the kernel anyway.
Right: plenty of systems accellerate AES, but in the wide world of
systems that are not all x86
On Fri, Apr 18, 2014 at 01:39:18PM -0400, Thor Lancelot Simon wrote:
> How do you count to 9K? I see:
>
> 2K for p
> 2K for q
> 1280 bytes for w
Are you talking about this w?
+ uint32_t w[1280], *p = state->p, *q = state->q;
This looks like 1280x4 bytes to me.
Thomas
On Fri, Apr 18, 2014 at 8:11 PM, Taylor R Campbell
wrote:
>Date: Fri, 18 Apr 2014 19:58:06 +0200
>From: Markku-Juhani Olavi Saarinen
>
>If you want to get rid of RC4, use AES in CTR mode. It is standard,
>compact, clean, and really fast solution. May sound boring, but gives
>m
Date: Fri, 18 Apr 2014 19:58:06 +0200
From: Markku-Juhani Olavi Saarinen
If you want to get rid of RC4, use AES in CTR mode. It is standard,
compact, clean, and really fast solution. May sound boring, but gives
me a feel of solid security engineering.
We use that for /dev/u?random
Hi,
If you want to get rid of RC4, use AES in CTR mode. It is standard,
compact, clean, and really fast solution. May sound boring, but gives
me a feel of solid security engineering.
Note that majority of systems now have the AES-NI instructions which
speed up AES implementations by an order of m
Date: Fri, 18 Apr 2014 12:38:38 -0400
From: Thor Lancelot Simon
3) If the algorithm's use of state-dependent array indices
presents a real weakness in practice, why aren't there any
published results on this and why was it chosen as, and
On Fri, Apr 18, 2014 at 10:27:45PM +0900, Izumi Tsutsui wrote:
>
> Note the caller of this hc128_init() is:
>
> > > +static void
> > > +cprng_fast_randrekey(cprng_fast_ctx_t *ctx)
> > > +{
> > > + uint8_t key[16], iv[16];
> > > + hc128_state_t tempstate;
> > > + int s;
> > > +
> > > + int have_in
On Fri, Apr 18, 2014 at 04:26:19PM +, Taylor R Campbell wrote:
>
> Closer inspection of HC-128 reveals that it uses secret-dependent
> array indices[*], so for that reason alone I don't think we should
> adopt it. It also has a very large state, which is going to hurt the
> cache on big system
On Fri, Apr 18, 2014 at 10:20:24PM +0900, Izumi Tsutsui wrote:
>
> LITTLE_ENDIAN != x86
>
> This should simply be le32dec(9) otherwise
> it will cause unaligned trap on arm and mips etc.
I believe the input is -- though declared as uint8_t -- required to
always be alinged (see the comment on the
Closer inspection of HC-128 reveals that it uses secret-dependent
array indices[*], so for that reason alone I don't think we should
adopt it. It also has a very large state, which is going to hurt the
cache on big systems and hurt the stack on little systems.
So, could you please split your chan
On Fri, Apr 18, 2014 at 10:27:45PM +0900, Izumi Tsutsui wrote:
>
> Note the caller of this hc128_init() is:
>
> I'm afraid "9KB stack on rekeying" is fatal on most ports.
Well, the cipher should hardly ever get rekeyed. The rekeying
intervals could be considerably larger; I did not want to
incr
campbell+netbsd-tech-kern@ wrote:
> > +void
> > +hc128_init(hc128_state_t *state, const uint8_t *key, const uint8_t *iv)
> > +{
> > + unsigned int i;
> > + uint32_t w[1280], *p = state->p, *q = state->q;
>
> 5 KB on the stack is a lot! Granted, this is a leaf routine which in
> our case will
tls@ wrote:
> @@ -160,6 +160,7 @@ include "crypto/cast128/files.cast128"
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ crypto/hc128/hc128.c 17 Apr 2014 03:17:18 -
:
> +static inline uint32_t
> +pack_littleendian(const uint8_t *v)
> +{
> +#ifdef LITTLE_ENDIAN
> + return *((const uin
Thank you for looking at this!
On Thu, Apr 17, 2014 at 09:33:28PM +, Taylor R Campbell wrote:
>
> And the only performance constraint is that its single-threaded
> performance should not be worse than the existing arc4random-based
> cprng_fast.
This, I don't agree with, unless we're going to
Repeating what I said privately for the public record:
The changes to cprng.h expose a lot of guts of the new cprng_fast
implementation. I don't see justification for this. As I understand
it, the goals of rewriting cprng_fast are:
1. Replace the unsafe arc4random by a safer algorithm.
2. Impro
On Wed, Apr 16, 2014 at 09:52:22PM -0400, Thor Lancelot Simon wrote:
>
> Attached is a patch which makes cprng_fast per-CPU and lockless. *IT IS NOT
> WELL TESTED YET (I haven't even run test vectors) AND IS ONLY FOR REVIEW.*
New diff, with some missing files and incorporating some more comments
On Wed, Apr 16, 2014 at 09:52:22PM -0400, Thor Lancelot Simon wrote:
>
> Attached is a patch which makes cprng_fast per-CPU and lockless. *IT IS NOT
> WELL TESTED YET (I haven't even run test vectors) AND IS ONLY FOR REVIEW.*
> The diff is against the head of the tls-earlyentropy branch.
Taylor
On Wed, Apr 09, 2014 at 05:16:55PM +0100, Mindaugas Rasiukevicius wrote:
>
> Perhaps I missed it, but were the benchmark results published? Also, how
> about cprng_fast32/64()? If the current mechanism is indeed faster, then
> I am glad that you have made an improvement.
To quote http://mail-in
32 matches
Mail list logo