One fix: the data was for 2000 sessions --> 4000 connections (sorry).

More info on the setup:

    client a (100Mbit) ----> switch ----> client b (100Mbit)
                               |
                               v
                          proxy (1Gbit)
Anyway:

With 2000 connections I get ~90Mbits in + ~90Mbits out.
With 4000 connections I get ~60Mbits in + ~60Mbits out.
With 8000 connections and no connection tracking in the kernel (test
setup): ~85Mbits in + ~85Mbits out.

90Mbits*2 ==> ~15000 packets/second (i.e. close to full-size 1500-byte
packets).

grep conntrack /proc/slabinfo (about 4200 connections at the time):

    ip_conntrack 4527 4532 352 412 412 1

syslog reports: 8192 buckets, 65536 max.

I guess the hash function fails for my setup (note: this is a test
setup, with all connections originating from a single IP to a single
IP).

Thanx
aviv

-----Original Message-----
From: Patrick Schaaf [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 18, 2002 8:10 PM
To: Aviv Bergman
Cc: '[EMAIL PROTECTED]'
Subject: Re: [Q] connection tracking scaling

> I'm working on a proxy type program, using REDIRECT to catch (tcp)
> traffic, and I'm seeing severe network degradation above ~2000
> connections.
>
> (computer: 1GHz P3, 2GB memory, kernel 2.4.18 + aa1 patch)

Two questions first: how many packets/second are these 2000
connections? And do you have a feeling for how far the system would
scale if it weren't for connection tracking?

Also, what is the total number of conntrack entries you have? Do this:

    grep conntrack /proc/slabinfo

and show us the output line, please.

> I've profiled the kernel and found that > 50% of the cpu time is in
> __ip_conntrack_find

Ouch.

> - is there a patch to make connection tracking use a more scalable
> data structure (as I understand it uses a list), or to improve its
> performance?

Please look a bit more carefully at the code. While
__ip_conntrack_find() itself operates on a list, this is only the
"inner loop" of a hash table implementation. The setup is the usual
"array of pointers to hash lists, and hopefully they don't collide"
design. Nothing is really wrong with that, in principle, as far as
data structures go.

The most likely cause of that 50% CPU usage is that the hash table is
too small for your application.
Unlike ip_conntrack_max, which limits the total number of entries in
the conntracking, the hash table size is not modifiable at runtime. In
ip_conntrack_core.c, that size is taken from the variable
ip_conntrack_htable_size. This variable is computed at boot / module
load time, depending on the amount of RAM in your system.

IF you run ip_conntrack as a module, you can override the computed
hash table size by specifying "hashsize=XXX" as a module load
parameter. You can see the active value chosen for the hash table
size in syslog:

    ip_conntrack (XXX buckets, YYY max)

The XXX is the number I'm talking about. Given ZZZ active conntrack
entries (as seen in /proc/slabinfo), you'll have, on average, a list
length within __ip_conntrack_find() of ZZZ/XXX.

The secondary reason for overly long lists would be a bad hash
function. If you want to find out whether that could be the case, you
could instrument __ip_conntrack_find() to count the length of each
list during traversal, remember that somewhere, and occasionally
printk() an average, minimum, and maximum.

Hope this helps...
  Patrick