[PATCH 0/5] IP checksum improvements

2010-04-25 Thread Joakim Tjernlund
Here are a series of performance improvements on the Internet checksum. With these changes applied I get about 20-30% better performance on x86 and PowerPC. Even though we got off on the wrong foot I got curious enough to do some more investigation and I learned more about "add with carry" and how

[PATCH 4/5] checksum: optimize loop and get rid of add16()

2010-04-25 Thread Joakim Tjernlund
Use some better variable names and get rid of add16 as add32 will do just as well. Fold the 32 bit checksum into 16 bits in the end. Signed-off-by: Joakim Tjernlund --- lib/checksum.c | 47 +++ 1 files changed, 23 insertions(+), 24 deletions(-) diff -
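
For readers following the thread, the folding step mentioned in this patch can be sketched as below. This is only an illustration of the general end-around-carry fold used by Internet checksums, not the code from the patch itself; the function name fold32 is a hypothetical chosen for this example.

```c
#include <stdint.h>

/*
 * Fold a 32-bit one's-complement accumulator into 16 bits.
 * fold32 is a hypothetical name, not taken from lib/checksum.c.
 */
static inline uint16_t fold32(uint32_t sum)
{
    /* Add the high half to the low half; the result may need 17 bits. */
    sum = (sum & 0xFFFFu) + (sum >> 16);
    /* A second pass absorbs the possible carry out of bit 15. */
    sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}
```

Two passes suffice because the first pass yields at most 0x1FFFE, so the second pass cannot carry again.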

[PATCH 1/5] checksum: improve add32

2010-04-25 Thread Joakim Tjernlund
Gcc does not recognize z + (z < sum) as an "add with carry". However, x86 recognizes if (z < x) z++ as an "add with carry" operation so let's use that instead. Signed-off-by: Joakim Tjernlund --- lib/checksum.c |4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/lib/checksu
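
The two forms contrasted in this message can be written side by side as in the sketch below. The function names add32_expr and add32_if and the exact signatures are assumptions for illustration; the real add32() in lib/checksum.c may differ. Both compute the same 32-bit one's-complement (end-around carry) sum, and whether a compiler turns either into an add-with-carry instruction depends on the compiler version and target.

```c
#include <stdint.h>

/* Expression form: add the carry as the 0/1 result of a comparison.
 * (hypothetical name) */
static inline uint32_t add32_expr(uint32_t x, uint32_t y)
{
    uint32_t z = x + y;
    return z + (z < x);   /* z < x is 1 exactly when x + y wrapped */
}

/* Conditional-increment form, as described in the patch text.
 * (hypothetical name) */
static inline uint32_t add32_if(uint32_t x, uint32_t y)
{
    uint32_t z = x + y;
    if (z < x)            /* unsigned wrap-around means a carry occurred */
        z++;
    return z;
}
```

Comparing the generated assembly of the two functions (for example with gcc -O2 -S) is one way to check the claim on a given toolchain.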

[PATCH 2/5] checksum: Optimize add32() for PowerPC

2010-04-25 Thread Joakim Tjernlund
PowerPC does not recognize add32() as an "add with carry" operation so use inline assembler instead. Signed-off-by: Joakim Tjernlund --- lib/checksum.c | 12 +++- 1 files changed, 11 insertions(+), 1 deletions(-) diff --git a/lib/checksum.c b/lib/checksum.c index bf70cab..cd0fefd 1006

[PATCH 3/5] checksum: use pre increment.

2010-04-25 Thread Joakim Tjernlund
Some archs (RISC-like archs) can do pre increment and load in one insn but gcc optimization often fails to take advantage of that. Help gcc to do the right thing by using pre increment instead of post increment. Signed-off-by: Joakim Tjernlund --- lib/checksum.c |8 +++- 1 files changed,
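
The difference between the two loop styles can be sketched as follows. This is a simplified illustration, not the patch: it uses a plain wrapping sum instead of the checksum's end-around carry, and the names sum_post and sum_pre are hypothetical. The pre-increment variant loads the first word before the loop so the pointer never moves before the start of the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Post-increment style: load, then advance the pointer. */
static uint32_t sum_post(const uint32_t *p, size_t n)
{
    uint32_t s = 0;
    while (n--)
        s += *p++;
    return s;
}

/* Pre-increment style: advance the pointer, then load.
 * Some RISC targets have a load-with-update instruction
 * that matches this pattern. */
static uint32_t sum_pre(const uint32_t *p, size_t n)
{
    uint32_t s;
    if (n == 0)
        return 0;
    s = *p;               /* first word is loaded outside the loop */
    while (--n)
        s += *++p;
    return s;
}
```

Both functions return the same value for the same input; only the addressing pattern presented to the compiler differs.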

[PATCH 5/5] checksum: Optimize first addition.

2010-04-25 Thread Joakim Tjernlund
The first add op. adds two 16 bit nums which cannot overflow one 32 bits receiver so use plain addition instead. Signed-off-by: Joakim Tjernlund --- lib/checksum.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/lib/checksum.c b/lib/checksum.c index 7b183d9..ae759d7 100

[PATCH 4/5 v2] checksum: optimize loop and get rid of add16()

2010-04-25 Thread Joakim Tjernlund
Use some better variable names and get rid of add16 as add32 will do just as well. Fold the 32 bit checksum into 16 bits in the end. --- v2 - bug fix lib/checksum.c | 47 +++ 1 files changed, 23 insertions(+), 24 deletions(-) diff --git a/lib/checksu

[PATCH 5/5 v2] checksum: Optimize first addition.

2010-04-25 Thread Joakim Tjernlund
The first add op. adds two 16 bit nums which cannot overflow one 32 bits receiver so use plain addition instead. --- v2 - adapt after bug fix. lib/checksum.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/lib/checksum.c b/lib/checksum.c index 9e30bdc..d4ffdfd 100644 --

Re: OSPF performance/SPF calculations

2010-04-25 Thread Joakim Tjernlund
Slowly getting back to SPF again .. Ondrej Zajicek wrote on 2010/04/23 16:01:16: > > On Fri, Apr 23, 2010 at 03:27:10PM +0200, Joakim Tjernlund wrote: > > > Anyhow I looked at the new code and it is an improvement but I think there > > is a flaw: It looks like the ptp code just find ANY interface

VRRP, OSPF and 1-way state.

2010-04-25 Thread Rob Epping
Hi list, Today I spent a lot of time debugging why BIRD doesn't get our IPSO VRRP cluster in 2WAY state, while junos and IOS routers do. Here's the setup. 1 VLAN with all OSPF routers. Network is X.Y.0.0/28 .1 and .2 are the IPSO nodes, .3 is the VRRP address. IPSO nodes run OSPF priority 0. At t

Re: OSPF performance/SPF calculations

2010-04-25 Thread Ondrej Zajicek
On Sun, Apr 25, 2010 at 06:08:04PM +0200, Joakim Tjernlund wrote: > > This is not a problem because both SPF and calc_next_hop() chooses the > > cheapest (full) ptp link. They both uses the same (local) metrics. > > Our ptp links typically have the same cost between the same two routers so > it is

Re: [PATCH 0/5] IP checksum improvements

2010-04-25 Thread Ondrej Zajicek
On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote: > Here are a series of performance improvements on the > Internet checksum. With these changes applied I get about > 20-30% better performance on x86 and PowerPC. Although i agree with Martin Mares that such kind of optimizations sh

Re: VRRP, OSPF and 1-way state.

2010-04-25 Thread Ondrej Zajicek
On Sun, Apr 25, 2010 at 06:13:58PM -, Rob Epping wrote: > Hi list, > > Today I spend a lot of time debugging why BIRD doesn't get our > IPSO VRRP cluster in 2WAY state, while junos and IOS routers do. ... > Both IPSO routers send OSPF messages with RID .3, see below. ... > My guess is that nei

Re: [PATCH 0/5] IP checksum improvements

2010-04-25 Thread Martin Mares
Hello! > Here are a series of performance improvements on the > Internet checksum. With these changes applied I get about > 20-30% better performance on x86 and PowerPC. > > Even though we got off on the wrong foot I got curious > enough to do some more investigation and I learned > more about "ad

Re: [PATCH 3/5] checksum: use pre increment.

2010-04-25 Thread Martin Mares
> Some archs(RISC like archs) can do pre increment and load > in one insn but gcc optimization often fails to take advantage > of that. Help gcc to do the right thing by using pre increment > instead of post increment. This one is a little bit dubious, I would rather not twist the code so much in

Re: [PATCH 3/5] checksum: use pre increment.

2010-04-25 Thread Joakim Tjernlund
Martin Mares wrote on 2010/04/25 23:33:49: > > > Some archs(RISC like archs) can do pre increment and load > > in one insn but gcc optimization often fails to take advantage > > of that. Help gcc to do the right thing by using pre increment > > instead of post increment. > > This one is a little

Re: [PATCH 0/5] IP checksum improvements

2010-04-25 Thread Joakim Tjernlund
Ondrej Zajicek wrote on 2010/04/25 23:20:52: > > On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote: > > Here are a series of performance improvements on the > > Internet checksum. With these changes applied I get about > > 20-30% better performance on x86 and PowerPC. > > Although i

Re: [PATCH 0/5] IP checksum improvements

2010-04-25 Thread Joakim Tjernlund
> Ondrej Zajicek wrote on 2010/04/25 23:20:52: > > > > On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote: > > > Here are a series of performance improvements on the > > > Internet checksum. With these changes applied I get about > > > 20-30% better performance on x86 and PowerPC. >

Patch ping

2010-04-25 Thread Joakim Tjernlund
Haven't seen any action on this patch, forgotten? http://marc.info/?l=bird-users&m=127200916013140&w=2

Re: OSPF performance/SPF calculations

2010-04-25 Thread Joakim Tjernlund
Ondrej Zajicek wrote on 2010/04/25 23:20:33: > > On Sun, Apr 25, 2010 at 06:08:04PM +0200, Joakim Tjernlund wrote: > > > This is not a problem because both SPF and calc_next_hop() chooses the > > > cheapest (full) ptp link. They both uses the same (local) metrics. > > > > Our ptp links typically

Re: [PATCH 0/5] IP checksum improvements

2010-04-25 Thread Joakim Tjernlund
> > > Ondrej Zajicek wrote on 2010/04/25 23:20:52: > > > > > > On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote: > > > > Here are a series of performance improvements on the > > > > Internet checksum. With these changes applied I get about > > > > 20-30% better performance on x86 a

Re: [PATCH 0/5] IP checksum improvements

2010-04-25 Thread Otto Solares
On Sun, Apr 25, 2010 at 11:41:17AM +0200, Joakim Tjernlund wrote: > Here are a series of performance improvements on the > ... Joakim, although you seem to have a strong character I appreciate your performance tuning on BIRD, everyday less people do this kind of analysis on F/OSS projects so than