Re: [PATCH] - trivial - Improve appletalk checksum calculation
Herbert Xu [EMAIL PROTECTED] writes: But your code differs significantly from Stephen's version. However, if it is correct it does look like a good improvement. So please write a simple test program. It can't that bad since there are only 65536 values to test :) I think the code is simple enough to see immediately that it should calculate the same checksum. But I have written a short test program to show this and tested it on i686 (gcc-4.2.2) and powerpc (gcc-3.4.5) without optimization and with -O6. BTW, shouldn't unsigned long be replaced by unsigned int to avoid 64-bit operations one some platforms? urs #include stdio.h unsigned long f1(unsigned long sum) { sum = 1; if (sum 0x1) { sum++; sum = 0x; } return sum; } unsigned long f2(unsigned long sum) { sum = ((sum 0x8000)15) | ((sum 0x7fff)1); return sum; } int main() { unsigned long s; for (s = 0; s 65536; s++) { if (f1(s) != f2(s) || f1(s) 65535) printf(%ld\n, s); } return 0; } - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
On Sun, 2007-10-28 at 21:18 +0100, Urs Thuermann wrote: I have written a short test program to show this and tested it on i686 (gcc-4.2.2) and powerpc (gcc-3.4.5) without optimization and with -O6. Thanks for that. BTW, shouldn't unsigned long be replaced by unsigned int to avoid 64-bit operations one some platforms? probably unsigned long should be u16. Compiled, untested. diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c index 7c0b515..c3890cc 100644 --- a/net/appletalk/ddp.c +++ b/net/appletalk/ddp.c @@ -922,25 +922,19 @@ static int atrtr_ioctl(unsigned int cmd, void __user *arg) * Checksum: This is 'optional'. It's quite likely also a good * candidate for assembler hackery 8) */ -static unsigned long atalk_sum_partial(const unsigned char *data, - int len, unsigned long sum) +static u16 atalk_sum_partial(const unsigned char *data, +int len, u16 sum) { - /* This ought to be unwrapped neatly. I'll trust gcc for now */ while (len--) { - sum += *data; - sum = 1; - if (sum 0x1) { - sum++; - sum = 0x; - } - data++; + sum += *data++; + sum = ((sum 0x8000) 15) | ((sum 0x7fff) 1); } return sum; } /* Checksum skb data -- similar to skb_checksum */ -static unsigned long atalk_sum_skb(const struct sk_buff *skb, int offset, - int len, unsigned long sum) +static u16 atalk_sum_skb(const struct sk_buff *skb, int offset, +int len, u16 sum) { int start = skb_headlen(skb); int i, copy; @@ -1010,13 +1004,13 @@ static unsigned long atalk_sum_skb(const struct sk_buff *skb, int offset, static __be16 atalk_checksum(const struct sk_buff *skb, int len) { - unsigned long sum; + u16 sum; /* skip header 4 bytes */ sum = atalk_sum_skb(skb, 4, len-4, 0); /* Use 0x for 0. 0 itself means none */ - return sum ? htons((unsigned short)sum) : htons(0x); + return cpu_to_be16(sum ? sum : 0x); } static struct proto ddp_proto = { * - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
On Sun, 2007-10-28 at 14:01 -0700, Joe Perches wrote: probably unsigned long should be u16. Or not. x86 u16 performs much worse than u32 (~ 50% worse) - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
From: Joe Perches [EMAIL PROTECTED] Date: Mon, 22 Oct 2007 18:36:28 -0700 On Mon, 2007-10-22 at 17:35 -0700, David Miller wrote: Your code is rotating bit 15 down by one bit and bits 0-14 up by one bit. Yes, a 16 bit rotate left. There was a discussion a few years ago: http://oss.sgi.com/archives/netdev/2003-10/msg00734.html From the spec: ... Reception of a datagram with CkSum equal to 0 implies that a checksum is not performed. Ok, but again did you test it? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
On Mon, 2007-10-22 at 18:43 -0700, David Miller wrote: Ok, but again did you test it? Nope. Stephen Hemminger did in 2003. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
On Mon, 2007-10-22 at 17:35 -0700, David Miller wrote: Your code is rotating bit 15 down by one bit and bits 0-14 up by one bit. Yes, a 16 bit rotate left. There was a discussion a few years ago: http://oss.sgi.com/archives/netdev/2003-10/msg00734.html From the spec: Implementers of DDP should treat generating the checksum as an optional feature. The 16-bit DDP checksum is computed as follows: CkSum := 0 ; FOR each datagram byte starting with the byte immediately following this Checksum field REPEAT the following algorithm: CkSum := CkSum + byte; (unsigned addition) Rotate CkSum left one bit, rotating the most significant bit in least significant bit; IF, at the end, CkSum = 0 THEN CkSum := $ (all ones). Reception of a datagram with CkSum equal to 0 implies that a checksum is not performed. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
On Mon, 22 Oct 2007 12:36:19 -0700 Joe Perches [EMAIL PROTECTED] wrote: It's a bit after 2.6.1 now... Removes unnecessary if, uses 16 bit rotate left. Performance improves ~30% Signed-off-by: Joe Perches [EMAIL PROTECTED] diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c index 7c0b515..1c50f4c 100644 --- a/net/appletalk/ddp.c +++ b/net/appletalk/ddp.c @@ -925,15 +925,9 @@ static int atrtr_ioctl(unsigned int cmd, void __user *arg) static unsigned long atalk_sum_partial(const unsigned char *data, int len, unsigned long sum) { - /* This ought to be unwrapped neatly. I'll trust gcc for now */ while (len--) { - sum += *data; - sum = 1; - if (sum 0x1) { - sum++; - sum = 0x; - } - data++; + sum += *data++; + sum = ((sum 0x8000)15) | ((sum 0x7fff)1); } return sum; } The end of the message you quoted was: Corrected fast code is: while (len--) { sum += *data++; sum = 1; sum = (((sum 0x1) 16) + sum) 0x; } At least it is correct on the standalone random data test, and the new code is 30% faster for the cached memory case (13.7 clks/byte vs 18 clks/byte). Your code looks different... -- Stephen Hemminger [EMAIL PROTECTED] - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
Joe Perches [EMAIL PROTECTED] wrote: On Mon, 2007-10-22 at 18:43 -0700, David Miller wrote: Ok, but again did you test it? Nope. Stephen Hemminger did in 2003. But your code differs significantly from Stephen's version. However, if it is correct it does look like a good improvement. So please write a simple test program. It can't that bad since there are only 65536 values to test :) Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmVHI~} [EMAIL PROTECTED] Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] - trivial - Improve appletalk checksum calculation
On Mon, 2007-10-22 at 20:30 -0700, Stephen Hemminger wrote: Corrected fast code is: while (len--) { sum += *data++; sum = 1; sum = (((sum 0x1) 16) + sum) 0x; } At least it is correct on the standalone random data test, and the new code is 30% faster for the cached memory case (13.7 clks/byte vs 18 clks/byte). Your code looks different... Both are 16 bit rotate lefts. Which looks clearer? - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html