>> The names are the order they were written in. "One" is the lib/sha1.c
>> code (547 bytes with -Os). "Four" is a 5x unrolled C version (1106 bytes).
>
> I'd like to see your version four.
Here's the test driver wrapped around the earlier assembly code.
It's an ugly mess of copy & paste code, of course.
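For readers wondering what a "5x unrolled" SHA-1 loop looks like, here is a minimal sketch. It is not the "version four" mentioned above, which does not appear in this listing; every name in it (`sha1_rolled`, `sha1_unrolled5`, `ROUND`, `expand`) is made up for illustration. The idea it shows is that five rounds bring the working variables a..e back to their starting roles, so unrolling by five lets the variable names rotate and removes the per-round register shuffling.

```c
#include <stdint.h>

#define ROL(v, n) (((v) << (n)) | ((v) >> (32 - (n))))

/* Round function selected by round index t (0..79). */
static uint32_t f(int t, uint32_t b, uint32_t c, uint32_t d)
{
    if (t < 20) return (b & c) | (~b & d);
    if (t < 40) return b ^ c ^ d;
    if (t < 60) return (b & c) | (c & d) | (d & b);
    return b ^ c ^ d;
}

/* Round constant selected by round index t (0..79). */
static uint32_t k(int t)
{
    return t < 20 ? 0x5a827999u : t < 40 ? 0x6ed9eba1u
         : t < 60 ? 0x8f1bbcdcu : 0xca62c1d6u;
}

/* Expand 16 message words (already big-endian decoded) to 80. */
static void expand(uint32_t w[80], const uint32_t in[16])
{
    int t;
    for (t = 0; t < 16; t++)
        w[t] = in[t];
    for (; t < 80; t++)
        w[t] = ROL(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1);
}

/* Rolled form: one round per iteration, five moves to shift the state. */
static void sha1_rolled(uint32_t h[5], const uint32_t in[16])
{
    uint32_t w[80], a = h[0], b = h[1], c = h[2], d = h[3], e = h[4], tmp;
    int t;

    expand(w, in);
    for (t = 0; t < 80; t++) {
        tmp = ROL(a, 5) + f(t, b, c, d) + e + k(t) + w[t];
        e = d; d = c; c = ROL(b, 30); b = a; a = tmp;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* One round that updates e and b in place instead of shifting a..e. */
#define ROUND(a, b, c, d, e, t) do {                          \
        e += ROL(a, 5) + f((t), b, c, d) + k(t) + w[(t)];     \
        b = ROL(b, 30);                                       \
    } while (0)

/* 5x unrolled form: the argument order rotates, the data stays put. */
static void sha1_unrolled5(uint32_t h[5], const uint32_t in[16])
{
    uint32_t w[80], a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    int t;

    expand(w, in);
    for (t = 0; t < 80; t += 5) {
        ROUND(a, b, c, d, e, t);
        ROUND(e, a, b, c, d, t + 1);
        ROUND(d, e, a, b, c, t + 2);
        ROUND(c, d, e, a, b, t + 3);
        ROUND(b, c, d, e, a, t + 4);
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}
```

Both functions take the five-word chaining state and one 16-word block. Because 20 is a multiple of 5, the round function and constant never change in the middle of an unrolled group.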
On Tue, Jun 12, 2007 at 01:05:44AM -0400, [EMAIL PROTECTED] wrote:
> > I got this code from Nettle, originally, and I never looked at the SHA-1
> > round structure very closely. I'll give that approach a try.
>
> Attached is some (tested, working, and public domain) assembly code for
> three different
> I got this code from Nettle, originally, and I never looked at the SHA-1
> round structure very closely. I'll give that approach a try.
Attached is some (tested, working, and public domain) assembly code for
three different sha_transform implementations. Compared to C code, the
timings to
Benjamin Gilbert wrote:
> Jan Engelhardt wrote:
>> UTF-8 please. Hint: it should most likely be an ö.
>
> Whoops, I had thought I had gotten that right. I'll get updates for
> parts 2 and 3 sent out on Monday.
I'm sending the corrected parts 2 and 3 as replies to this email. The
UTF-8 fix is the
[EMAIL PROTECTED] wrote:
/* Majority: (x&y)|(y&z)|(z&x) = (x & z) + ((x ^ z) & y)
#define F3(x,y,z,dest) \
movl z, TMP; \
andl x, TMP; \
addl TMP, dest; \
movl z, TMP; \
xorl x, TMP; \
andl
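The identity in the Majority comment, (x&y)|(y&z)|(z&x) = (x & z) + ((x ^ z) & y), can be checked with a short C sketch. It holds because a bit set in x & z is clear in x ^ z, so the two terms never overlap and the addition cannot carry. That is presumably why the macro can add each term to dest separately, although the snippet is cut off before that is visible. The function names here are made up for illustration and do not come from the posted code.

```c
#include <stdint.h>

/* Textbook SHA-1 majority function (rounds 40..59). */
static uint32_t maj_or(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & y) | (y & z) | (z & x);
}

/* Additive form: the two terms share no set bits, so + behaves like |. */
static uint32_t maj_add(uint32_t x, uint32_t y, uint32_t z)
{
    return (x & z) + ((x ^ z) & y);
}

/*
 * Both forms act on each bit position independently (no carries occur),
 * so testing the 8 combinations of all-zeros/all-ones words covers
 * every case. Returns 1 if the forms agree everywhere, 0 otherwise.
 */
static int maj_forms_agree(void)
{
    int i;

    for (i = 0; i < 8; i++) {
        uint32_t x = (i & 1) ? 0xffffffffu : 0;
        uint32_t y = (i & 2) ? 0xffffffffu : 0;
        uint32_t z = (i & 4) ? 0xffffffffu : 0;

        if (maj_or(x, y, z) != maj_add(x, y, z))
            return 0;
        /* The two addends must be disjoint for the + to be carry-free. */
        if ((x & z) & ((x ^ z) & y))
            return 0;
    }
    return 1;
}
```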
Matt Mackall wrote:
In 2003, I was getting 17MB/s out of my Athlon. Now I'm getting 2.7MB/s.
Were your tests with or without the latest /dev/urandom fixes? This
one in particular:
Matt Mackall <[EMAIL PROTECTED]> writes:
>
> Have you benchmarked this against lib/sha1.c? Please post the results.
> Until then, I'm frankly skeptical that your unrolled version is faster
> because when I introduced lib/sha1.c the rolled version therein won by
> a significant margin and had
+#define F3(x,y,z) \
+ movl x, TMP2; \
+ andl y, TMP2; \
+ movl x, TMP; \
+ orl y, TMP;
I got this code from Nettle, originally, and I never looked at the SHA-1
round structure very closely. I'll give that approach a try.
Attached is some (tested, working, and public domain) assembly code for
three different sha_transform implementations. Compared to C code, the
timings to hash
On Sun, Jun 10, 2007 at 12:47:19PM -0400, Benjamin Gilbert wrote:
> Matt Mackall wrote:
> >On Sat, Jun 09, 2007 at 08:33:25PM -0400, Benjamin Gilbert wrote:
> >>It's not just the loop unrolling; it's the register allocation and
> >>spilling. For comparison, I built SHATransform() from the
>
Matt Mackall wrote:
> On Sat, Jun 09, 2007 at 08:33:25PM -0400, Benjamin Gilbert wrote:
>> It's not just the loop unrolling; it's the register allocation and
>> spilling. For comparison, I built SHATransform() from the
>> drivers/char/random.c in 2.6.11, using gcc 3.3.5 with -O2 and
>> SHA_CODE_SIZE == 3
On Sat, Jun 09, 2007 at 08:33:25PM -0400, Benjamin Gilbert wrote:
> Jeff Garzik wrote:
> >Matt Mackall wrote:
> >>Have you benchmarked this against lib/sha1.c? Please post the results.
> >>Until then, I'm frankly skeptical that your unrolled version is faster
> >>because when I introduced lib/sha1.c the
Jan Engelhardt wrote:
On Jun 8 2007 17:42, Benjamin Gilbert wrote:
@@ -0,0 +1,299 @@
+/*
+ * x86-optimized SHA1 hash algorithm (i486 and above)
+ *
+ * Originally from Nettle
+ * Ported from M4 to cpp by Benjamin Gilbert <[EMAIL PROTECTED]>
+ *
+ * Copyright (C) 2004, Niels M?ller
+ * Copyright
Jeff Garzik wrote:
Matt Mackall wrote:
Have you benchmarked this against lib/sha1.c? Please post the results.
Until then, I'm frankly skeptical that your unrolled version is faster
because when I introduced lib/sha1.c the rolled version therein won by
a significant margin and had 1/10th the
On Sat, Jun 09, 2007 at 04:23:27PM -0400, Jeff Garzik wrote:
> Matt Mackall wrote:
> >On Fri, Jun 08, 2007 at 05:42:53PM -0400, Benjamin Gilbert wrote:
> >>Add x86-optimized implementation of the SHA-1 hash function, taken from
> >>Nettle under the LGPL. This code will be enabled on kernels
Matt Mackall wrote:
On Fri, Jun 08, 2007 at 05:42:53PM -0400, Benjamin Gilbert wrote:
Add x86-optimized implementation of the SHA-1 hash function, taken from
Nettle under the LGPL. This code will be enabled on kernels compiled for
486es or better; kernels which support 386es will use the
On Fri, Jun 08, 2007 at 05:42:53PM -0400, Benjamin Gilbert wrote:
> Add x86-optimized implementation of the SHA-1 hash function, taken from
> Nettle under the LGPL. This code will be enabled on kernels compiled for
> 486es or better; kernels which support 386es will use the generic
>
On Jun 8 2007 17:42, Benjamin Gilbert wrote:
>@@ -0,0 +1,299 @@
>+/*
>+ * x86-optimized SHA1 hash algorithm (i486 and above)
>+ *
>+ * Originally from Nettle
>+ * Ported from M4 to cpp by Benjamin Gilbert <[EMAIL PROTECTED]>
>+ *
>+ * Copyright (C) 2004, Niels M?ller
>+ * Copyright (C) 2006-2007 Carnegie
Add x86-optimized implementation of the SHA-1 hash function, taken from
Nettle under the LGPL. This code will be enabled on kernels compiled for
486es or better; kernels which support 386es will use the generic
implementation (since we need BSWAP).
We disable building lib/sha1.o when an
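The BSWAP requirement mentioned in the patch description comes from SHA-1 reading its input as big-endian 32-bit words, while x86 is little-endian. BSWAP reverses the bytes of a register in one instruction and was introduced with the 486. The C sketch below shows the operation in portable form; `swap32` and `load_be32` are illustrative names, not functions from the patch.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit word (what BSWAP does in hardware). */
static uint32_t swap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}

/* Read a big-endian 32-bit word from a byte buffer on any host. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

On a little-endian host, `load_be32(p)` gives the same result as a native 32-bit load followed by `swap32`.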