Henrik,
        I ran some tests in August 2007 on similar hardware.

Thread "OpenBSD on Dell issues ?"
http://lists.sdbug.org/pipermail/sdbug/2007-August/thread.html

        So, what do these data mean for production use ?...

        In actual production use one would never want to go near 100% interrupt
time. In production I notice things get a little shaky at 50% interrupt
time (brief transients can be too much). My chief complaint is
that OpenBSD doesn't use both cores for this work. I mean, I don't even
think you can buy a single core CPU anymore... can you ? (rhetorical
question).
        In actuality, my CPU utilization is much higher in real use than in
*my* tests. My tests were awfully simple and not realistic in that I
only hurled a single protocol through the firewall at a time. There was
no mixed protocol traffic, no traffic generated to be dropped by the
firewall, and my ruleset was pass everything.
        With my current real ruleset and real traffic, PF uses less than 10%
interrupt time, and the network traffic uses about 40% interrupt time
(determined by turning PF off to see the delta in interrupt time). But,
if I could use both cores (and assuming an unrealistic 100% linearity),
I could handle twice the traffic. This is where I need to be for the
next 12 months. Beyond that, who knows.
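        For reference, a minimal sketch of how that delta can be measured
(pfctl is what toggles PF; I'm assuming top is what's used to read the
interrupt percentage from its CPU states line):

    top -s 1     # watch the "interrupt" figure under a steady traffic load
    pfctl -d     # in another terminal: disable PF and note the new interrupt %
    pfctl -e     # re-enable PF; the difference between readings is roughly PF's share
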
        Your numbers are 75kpps at 100% (right?). In practice I'd not want to
run at over 50% of maximum capacity on a regular basis, to leave room for
anomalies. So, we'd want to be at under ~37.5kpps. In my network, my
traffic is very close to even in pps inbound vs outbound. So, that's
~19kpps in each direction. I use 13kpps in each direction each afternoon
now @ 40% interrupt time total. So, my "real" numbers confirm your test
numbers. Nice test ! :o)
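        Spelling out that back-of-the-envelope math (assuming Henrik's ~75kpps
per-bridge ceiling and my roughly even in/out split):

    echo '75 * 0.5' | bc -l      # 37.5  -> total kpps budget at 50% of max
    echo '75 * 0.5 / 2' | bc -l  # 18.75 -> kpps per direction
    echo '13 * 2 / 75' | bc -l   # ~0.35 -> my current 2x13kpps is ~35% of max,
                                 #          in line with the ~40% interrupt time I see
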
        I have asked this question before: Will FreeBSD's ability to use both
cores, despite its older PF code, give it a higher overall capacity in pps
than OpenBSD with the newer improvements in PF, on the same multi-core
hardware ? My hypothesis is YES, since the majority of the interrupt time
is not PF but the network traffic. Need to test to get that answer.
        Thanks for posting your results !

Mike



On Fri, 2008-02-01 at 15:09 +0100, Henrik Johansen wrote:
> Hi list,
> 
> I have been conducting a variety of OpenBSD / PF performance tests
> during the last week and I might as well share my findings with you.
> 
> The firewall tested is a Dell 2950 PE II equipped with 2 x Intel Quad 
> and 1 x Intel Dual NICs running OpenBSD 4.2-CURRENT.
> 
> The test machines were 6 Dell PE II servers (1950 & 2950) running 
> Knoppix. We used iperf, nmap and hping for our tests.
> 
> Our data was pulled from a Dell PowerConnect 5324 switch using SNMP and
> Cacti for drawing nice graphs. We did take random samples from netstat
> and other OpenBSD tools on the firewall itself to verify that the SNMP 
> numbers were matching up nicely.
> 
> The first test was about raw throughput:
> 
> Using iperf we pulled ~920 Mbits/s per bridge over 2 bridge devices when 
> PF was disabled. When PF was enabled with a pass all ruleset throughput 
> was measured to ~760 Mbits/s.
> 
> A bidirectional iperf reached ~340 Mbit/s per bridge with a pass all PF 
> ruleset, again over 2 bridges.
> 
> When a 2100 line ruleset from an old production firewall was used 
> throughput went down to ~320 Mbit/s using a single bridge. That was 
> improved a _great deal_ by proper ruleset optimisation but your mileage may 
> vary.
> 
> The second test was a simulated DDOS SYN-flood attack against 1 of our 4 
> bridge devices with the following ruleset:
> 
> ######################################################################
> set timeout { adaptive.start 0, adaptive.end 0 interval 10 frag 10}
> set limit { states 250000, frags 10000 }
> set optimization normal
> set block-policy drop
> set skip on { lo0, em0, em1, em3, em5, em7, em9 }
> scrub in all 
> 
> table <block_test> persist file "/root/block_test"
> 
> block in quick on em4 from <block_test> to any
> pass in all no state
> pass out all no state
> #######################################################################
> 
> The block_test table contained 2000 entirely random IPv4 addresses.
> 
> The results were pretty impressive:
> 
> ~120k pps  ~40% interrupt load (measured using top)
> ~160k pps  ~65% interrupt load (-"-)
> ~240k pps  ~85% interrupt load (-"-)
> 
> We reached a maximum of ~330k pps before the box went into a livelock.
> 
> As a side note, when flooding the NICs that shared irq 6 with the LSI SAS 
> controller we discovered that noticeably fewer packets were needed to send 
> the box into a livelock, but the exact numbers have unfortunately escaped my 
> notes :(
> 
> Flooding all 4 bridges simultaneously yielded max ~75k pps per bridge.
> 
> For comparison purposes we actually repeated this test with a Cizzco ASA 
> 5510 Security Plus firewall using the same ruleset and it was killed by 
> ~50k pps on a single bridge.
> 
> During the course of our tests we tuned the following sysctl parameters
> to yield the best performance with our hardware, ruleset, traffic pattern,
> etc :
> 
> kern.maxclusters  
> net.inet.ip.ifq.maxlen
> 
> Our tests showed that you can degrade your performance by blindly tuning
> these values, so caution (and proper testing) is advised.
> 
-- 
************************************************************
Michael J. McCafferty
Principal, Security Engineer
M5 Hosting
http://www.m5hosting.com

You can have your own custom Dedicated Server up and running today !
RedHat Enterprise, CentOS, Fedora, Debian, OpenBSD, FreeBSD, and more
************************************************************
