Re: Expected throughput in an OpenBSD virtual server

2011-08-26 Thread Henning Brauer
* Christer Solskogen christer.solsko...@gmail.com [2011-08-22 22:53]:
 On Mon, Aug 22, 2011 at 10:04 PM, Stuart Henderson s...@spacehopper.org 
 wrote:
  - faster ram
 Are you sure about that? In almost every benchmark I've seen, fast RAM
 makes almost no difference.

you have only seen very bad benchmarks then. memory access speed
(foremost: latency) is the #1 bottleneck on firewall/router style
setups. 

 I would be delighted if what I've been
 reading is wrong :-)

be delighted

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de
Full-Service ISP - Secure Hosting, Mail and DNS Services
Dedicated Servers, Rootservers, Application Hosting



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Patrick Lamaiziere
On Tue, 23 Aug 2011 19:21:32 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

Hello,

  Here we reach 400 MBits/s with a CPU rate ~70% but we
  run OpenBSD 4.9.

 How fast is your CPU ?

cpu0: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz, 2261.30 MHz
It's a Dell R610 with 4 GB RAM.



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Per-Olov Sjöholm
On 24 aug 2011, at 12:01, Patrick Lamaiziere wrote:
 On Tue, 23 Aug 2011 19:21:32 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

 Hello,

 Here we reach 400 MBits/s with a CPU rate ~70% but we
 run OpenBSD 4.9.

 How fast is your CPU ?

 cpu0: Intel(R) Xeon(R) CPU E5520 @ 2.27GHz, 2261.30 MHz
 It's a Dell R610 with 4 GB RAM.



Maybe it is normal, then (given similar NIC quality, tuning and RAM), that I
reach 400 Mbit at 100% with one dedicated Xeon 5504 2 GHz core.
(I have two Intel(R) Xeon(R) CPU E5504 @ 2.00GHz, stepping 05.)

You run on a physical server, right? Since I run on a virtual server with
near-identical performance and a slower CPU, it seems I have very good
performance. It's hard for me to try faster CPUs just for fun, as the faster
ones are VERY expensive...

/Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Per-Olov Sjöholm
On 23 aug 2011, at 19:30, Tomas Bodzar wrote:
 On Tue, Aug 23, 2011 at 7:21 PM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 23 aug 2011, at 10:54, Patrick Lamaiziere wrote:
 On Mon, 22 Aug 2011 22:49:47 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

 Hello,
 Have not tried current, but will try current as soon as I can.
 Also... I will try to do some lab experiments with the CPU speed of the
 core the OpenBSD virtual machine has, to see how interrupts and
 throughput relate to the CPU speed of the allocated core.

 It would be nice to know if current is better with Intel em(4) cards.
 because of this commit : http://freshbsd.org/2011/04/13/00/19/01

 Here we reach 400 MBits/s with a CPU rate ~70% but we
 run OpenBSD 4.9.

 Regards.



 How fast is your CPU ?

 Yes, I can see that the 1.254 commit with this change came in after the
 4.9 release that I use. I can try to see if I can measure any performance
 gain with this update.

 I will try this from aug 17...
 http://ftp.sunet.se/pub/os/OpenBSD/snapshots/i386/install50.iso

 Can't see that mirror at http://www.openbsd.org/ftp.html; it's
 better to use something more official


 I'll get back

 [ YES !! More fun tests :D ]

 Regards
 Per-Olov





Have tried it now... I tried the 5.0 snapshot from Aug 17 with the improved
em driver. Also tested with more allocated cores and the SMP kernel.

Result on 5.0 snapshot with improved em driver:

- SMP
worse. Really sucks! _Dramatically_ reduced throughput.

- One processor core (as most of my tests have used)
An improvement, but very little; maybe 10% better.


/Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Tomas Bodzar
On Wed, Aug 24, 2011 at 7:00 PM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 23 aug 2011, at 19:30, Tomas Bodzar wrote:
 On Tue, Aug 23, 2011 at 7:21 PM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 23 aug 2011, at 10:54, Patrick Lamaiziere wrote:
 On Mon, 22 Aug 2011 22:49:47 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

 Hello,
 Have not tried current, but will try current as soon as I can.
 Also... I will try to do some lab experiments with the CPU speed of the
 core the OpenBSD virtual machine has, to see how interrupts and
 throughput relate to the CPU speed of the allocated core.

 It would be nice to know if current is better with Intel em(4) cards.
 because of this commit : http://freshbsd.org/2011/04/13/00/19/01

 Here we reach 400 MBits/s with a CPU rate ~70% but we
 run OpenBSD 4.9.

 Regards.



 How fast is your CPU ?

 Yes, I can see that the 1.254 commit with this change came in after the
 4.9 release that I use. I can try to see if I can measure any performance
 gain with this update.

 I will try this from aug 17...
 http://ftp.sunet.se/pub/os/OpenBSD/snapshots/i386/install50.iso

 Can't see that mirror at http://www.openbsd.org/ftp.html; it's
 better to use something more official


 I'll get back

 [ YES !! More fun tests :D ]

 Regards
 Per-Olov





 Have tried it now... I tried the 5.0 snapshot from Aug 17 with the improved
 em driver. Also tested with more allocated cores and the SMP kernel.

 Result on 5.0 snapshot with improved em driver:

 - SMP
 worse. Really sucks! _Dramatically_ reduced throughput.

It would be good to see systat; systat mbufs; netstat -m; vmstat -i, and
compare them with the previous version. Including dmesg (if something
changed in dmesg).
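That suggestion can be captured non-interactively so the counters from the two kernels can be diffed afterwards; a sketch, where the output directory and file names are my own invention, not from the thread:

```shell
# Snapshot the suggested counters into dated files, so the runs on the
# 4.9 kernel and on the 5.0 snapshot can be compared afterwards.
out=/var/tmp/net-baseline-$(date +%Y%m%d-%H%M)
mkdir -p "$out"
netstat -m > "$out/netstat-m.txt"   # mbuf/cluster pool usage
vmstat -i  > "$out/vmstat-i.txt"    # per-device interrupt counts
dmesg      > "$out/dmesg.txt"       # diff this to spot driver changes
# systat is interactive: run `systat mbufs` in another terminal while
# the throughput test is in flight.
```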


 - One processor core (as most of my tests have used)
 An improvement, but very little. Maybe 10% better

As stated in some of the links and posts sent to you, SMP doesn't offer
better throughput/speed automatically. You need to test i386
non-SMP/SMP and amd64 non-SMP/SMP to see what's best.



 /Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Lars Hansson
If you want a comparison, I have run a small OpenBSD router under KVM
and it easily sustained 80 Mbps. It was connected to a FastEthernet
switch so it couldn't actually go much higher. This was using the
emulated e1000 KVM device and OpenBSD 4.9 release with mpbios & iic
disabled (disabling iic removes some annoying boot messages). The KVM
server was a modest 3 GHz Core2 Duo with 4 GB RAM and a lot of other
VMs running.

Cheers,
Lars



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread LeviaComm Networks

On 8/24/2011 11:31 AM, Lars Hansson wrote:

If you want a comparison, I have run a small OpenBSD router under KVM
and it easily sustained 80 Mbps. It was connected to a FastEthernet
switch so it couldn't actually go much higher. This was using the
emulated e1000 KVM device and OpenBSD 4.9 release with mpbios & iic
disabled (disabling iic removes some annoying boot messages). The KVM
server was a modest 3 GHz Core2 Duo with 4 GB RAM and a lot of other
VMs running.

Cheers,
Lars





You might see a bit more performance by load-balancing across two or
more VMs. Where I work, we have a couple of virtual routers / firewalls
(these systems are internal-only, so security on these machines isn't
critical). I found that having 2 VMs load-balanced with CARP gave more
performance than doubling the resources on a single system. No tweaking
was done on the systems, which makes them much easier to maintain. Plus
we can spin up more to add additional throughput without any downtime.

Recently we have added a few more firewalls to load-balance with, each
using the same configuration and adding performance to the cluster. We
are seeing diminishing returns on each firewall we add (overhead due to
pfsync, CARP, etc.).

The VM host runs VMware ESXi with 16 GB RAM and two 8-core Opterons
(6128 HE, 2 GHz), and has two 10 Gb network cards (inside and outside)
and 2x 1 Gb cards (management and inter-host network). The VMs are
configured with a single processor core, 256 MB RAM and 3 virtual Gb
network cards (inside, outside and pfsync).
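A minimal sketch of the kind of CARP/pfsync pair described above; the interface names, VHID, addresses and password are invented for illustration (see carp(4) and pfsync(4) for the real knobs):

```shell
# Firewall A: preferred master for the shared address (lowest advskew wins)
ifconfig carp0 create
ifconfig carp0 vhid 1 carpdev em0 pass s3cret advskew 0 10.0.0.1/24

# Firewall B: backup for the same VHID (higher advskew)
ifconfig carp0 create
ifconfig carp0 vhid 1 carpdev em0 pass s3cret advskew 100 10.0.0.1/24

# Both: synchronise PF state over a dedicated interface, so existing
# connections survive a failover between the two firewalls
ifconfig pfsync0 syncdev em2 up
```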




Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Per-Olov Sjöholm
On 24 aug 2011, at 19:13, Tomas Bodzar wrote:
 On Wed, Aug 24, 2011 at 7:00 PM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 23 aug 2011, at 19:30, Tomas Bodzar wrote:
 On Tue, Aug 23, 2011 at 7:21 PM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 23 aug 2011, at 10:54, Patrick Lamaiziere wrote:
 On Mon, 22 Aug 2011 22:49:47 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

 Hello,
 Have not tried current, but will try current as soon as I can.
 Also... I will try to do some lab experiments with the CPU speed of the
 core the OpenBSD virtual machine has, to see how interrupts and
 throughput relate to the CPU speed of the allocated core.

 It would be nice to know if current is better with Intel em(4) cards.
 because of this commit : http://freshbsd.org/2011/04/13/00/19/01

 Here we reach 400 MBits/s with a CPU rate ~70% but we
 run OpenBSD 4.9.

 Regards.



 How fast is your CPU ?

 Yes, I can see that the 1.254 commit with this change came in after the
 4.9 release that I use. I can try to see if I can measure any performance
 gain with this update.

 I will try this from aug 17...
 http://ftp.sunet.se/pub/os/OpenBSD/snapshots/i386/install50.iso

 Can't see that mirror at http://www.openbsd.org/ftp.html; it's
 better to use something more official


 I'll get back

 [ YES !! More fun tests :D ]

 Regards
 Per-Olov





 Have tried it now... I tried the 5.0 snapshot from Aug 17 with the improved
 em driver. Also tested with more allocated cores and the SMP kernel.

 Result on 5.0 snapshot with improved em driver:

 - SMP
 worse. Really sucks! _Dramatically_ reduced throughput.

 It would be good to see systat; systat mbufs; netstat -m; vmstat -i, and
 compare them with the previous version. Including dmesg (if something
 changed in dmesg).


 - One processor core (as most of my tests have used)
 An improvement, but very little. Maybe 10% better

 As stated in some of the links and posts sent to you, SMP doesn't offer
 better throughput/speed automatically. You need to test i386
 non-SMP/SMP and amd64 non-SMP/SMP to see what's best.



 /Per-Olov





YES, YES and YES again !!!

I made a huge mistake during my tests. Too much kernel copying... The
result was that the kernel with mpbios disabled was /bsd.old. Very
embarrassing.

I now have a throughput of no less than 560 Mbit/s. And that is through the
VIRTUAL firewall with more than 50% idle CPU. Y e e e e e e s s! How is that
even possible? But it is...

### Summary: ###
- KVM-virtualized STOCK OpenBSD 4.9 + stable updates + sysctl.conf tuning +
disabled mpbios, running the uniprocessor kernel
- 324-line PF ruleset
- 2 Intel PRO/1000 MT (82574L) desktop NICs used through PCI passthrough
from the KVM virtualization host
- OpenBSD got 512 MB RAM and one CPU core from the host (Xeon 5504, 2.0 GHz)
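For reference, this is roughly how mpbios can be disabled persistently with config(8); a sketch rather than a transcript from this test (see config(8) and boot_config(8)):

```shell
# Write the modified kernel to a separate file first, so /bsd and
# /bsd.old cannot be mixed up the way described above.
config -e -o /bsd.new /bsd
#   UKC> disable mpbios
#   UKC> quit
# Then install it explicitly, keeping a fallback copy:
cp /bsd /bsd.orig
mv /bsd.new /bsd
```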

Test:
An SCP transfer, with the crypto overhead (default ciphers) that entails,
from a 64-bit SUSE Linux host through the firewall to my MacBook Pro
(quad-core i7 2.2 GHz, 8 GB RAM, OCZ-Vertex 3 SSD). Several tests with DVD
ISO files between 3 and 6 GB in size. 540 Mbit/s was the _lowest_ average
speed in the tests and 560 Mbit/s was the highest.
#


I am really satisfied with this. I was going to test FreeBSD 9 beta with its
PF 4.5 just for fun, but I will skip that since the results ended up this
good.

OpenBSD really does perform V E R Y well in this area.


Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-24 Thread Ryan McBride
On Wed, Aug 24, 2011 at 07:00:09PM +0200, Per-Olov Sjöholm wrote:
 - SMP
 worse. Really sucks! _Dramatically_ reduced throughput.

This is probably a result of you testing a virtualised guest rather than
real hardware.
 

 - One processor core (as most of my tests have used)
 An improvement, but very little. Maybe 10% better

10% is fantastic.  What were you expecting? 10x improvement from a
network driver change?  All the easy optimizations have already been
done.



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Per-Olov Sjöholm
On 23 aug 2011, at 01:32, john slee wrote:
 On 22 August 2011 23:45, Per-Olov Sjöholm p...@incedo.org wrote:
 As http://www.openbsd.org/faq/faq6.html states, there's little you can
 tweak
 to improve your numbers; just get a nice-clocked, good cache-sized CPU and
 give it some loving.

 The FAQ you refer to seems to be of no use at all and is totally
 unrelated to this post.

 It is quite pertinent, actually. See the beginning of section 6.6;

 http://www.openbsd.org/faq/faq6.html#Tuning

 John



If you would please explain how baddynamic and avoiding certain ports
affects what we are talking about...

Naaahh, let's forget that section

/Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Patrick Lamaiziere
On Mon, 22 Aug 2011 22:49:47 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

Hello,
 Have not tried current, but will try current as soon as I can.
 Also... I will try to do some lab experiments with the CPU speed of the
 core the OpenBSD virtual machine has, to see how interrupts and
 throughput relate to the CPU speed of the allocated core.

It would be nice to know if current is better with Intel em(4) cards,
because of this commit: http://freshbsd.org/2011/04/13/00/19/01

Here we reach 400 MBits/s with a CPU rate ~70% but we
run OpenBSD 4.9.

Regards.



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Patrick Lamaiziere
On Mon, 22 Aug 2011 20:04:50 +0000 (UTC), Stuart Henderson s...@spacehopper.org wrote:

Hello,

 OpenBSD has another way to handle this, MCLGETI.

Is there documentation (for the human being, not the developer) about
how MCLGETI works? (I can't find a lot about it.)

Thanks, regards.



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Ryan McBride
On Tue, Aug 23, 2011 at 09:10:05AM +0200, Per-Olov Sjöholm wrote:
 If you would please explain how baddynamic and avoiding certain ports
 affects what we are talking about...
 
 Naaahh, let's forget that section

I believe people are referring to the text above that:

   One goal of OpenBSD is to have the system Just Work for the vast
   majority of our users. Twisting knobs you don't understand is far more
   likely to break the system than it is to improve its performance. Always
   start from the default settings, and only adjust things you actually see
   a problem with.

   VERY FEW people will need to adjust any networking parameters!


Earlier you asked:

 So the question remains. Is it likely that a faster CPU core will give
 better performance (not that I need it; just doing some lab experiments
 here)? Is a faster CPU the best / only way to increase throughput?

Yes, all other things being equal, a faster CPU will help. Other hardware
factors include:

- CPU vendor (AMD vs Intel)
- CPU cache, bus, chipset
- PCI bus
- Network card
- If you are doing IPSec, AES-specific instructions (AES-NI on Intel) 

Some CPU architectures have much better IO and interrupt performance for
a given clock speed (Sparc64, for example), but cost makes them an
unlikely choice for a firewall. 

Things that seem to make very little difference in testing:

- MP vs SP kernel
- i386 vs AMD64


 Of course we assume the OS tweaks are OK and that reasonable
 NICs are used.

OS tweaks are usually not OK. The general rule of thumb is that if you
have to ask about them on misc@ because there is no documentation and
you don't understand the effects, then you shouldn't touch it.

PF configuration can have a big effect on your performance for some
types of traffic. In general it's better to worry about making your
ruleset correct and maintainable, but if you MUST write your ruleset
with performance in mind, the following article discusses most of the
issues:

http://www.undeadly.org/cgi?action=article&sid=20060927091645
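To illustrate the kind of advice that article covers, a hypothetical pf.conf fragment; the table name, addresses and ports are invented, not taken from the thread:

```shell
# pf evaluates a ruleset last-match-wins unless `quick` stops it early,
# and a table lookup replaces a long linear list of address rules.
cat <<'EOF' > /tmp/pf.conf.example
table <blocked> persist { 192.0.2.0/24, 198.51.100.0/24 }

# one table lookup instead of one rule per network, ended early by `quick`
block in quick from <blocked>

# `quick` + `keep state`: later packets of a connection match the state
# table and bypass ruleset evaluation entirely
pass in quick proto tcp to port { 22, 80, 443 } keep state
EOF
pfctl -nf /tmp/pf.conf.example   # parse-check only, does not load rules
```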


 Is there a plan to change the interrupt handling model in OpenBSD to
 device polling in future releases ?

No. 



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Tomas Bodzar
On Tue, Aug 23, 2011 at 11:10 AM, Patrick Lamaiziere
patf...@davenulle.org wrote:
 On Mon, 22 Aug 2011 20:04:50 +0000 (UTC), Stuart Henderson s...@spacehopper.org wrote:

 Hello,

 OpenBSD has another way to handle this, MCLGETI.

 Is there documentation (for the human being, not the developer) about
 how MCLGETI works? (I can't find a lot about it.)

Maybe these?
http://blogs.oracle.com/video/entry/mclgeti_effective_network_livelock_mitigation
https://www.youtube.com/watch?v=fv-AQJqUzRI
http://wikis.sun.com/display/KCA2009/KCA2009+Conference+Agenda  (see
Friday 17th)

looks like only David Gwynne may point to something useful.



 Thanks, regards.



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Stuart Henderson
On 2011-08-22, Per-Olov Sjöholm p...@incedo.org wrote:
 MCLGETI ?? Is it in if_em.c if I want to see how it is implemented?

it's in various files, see mbuf(9) and look for videos/slides from talks
by dlg (David Gwynne), there's an asiabsdcon talk with more details and quite
possibly some others.



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Ryan McBride
On Tue, Aug 23, 2011 at 10:42:59AM +, Stuart Henderson wrote:
 On 2011-08-22, Per-Olov Sjöholm p...@incedo.org wrote:
  MCLGETI ?? Is it in if_em.c if I want to see how it is implemented?
 
 it's in various files, see mbuf(9) and look for videos/slides from talks
 by dlg (David Gwynne), there's an asiabsdcon talk with more details and quite
 possibly some others.

The effects of MCLGETI are quite visible in the PF testing I did for our
'10 years of PF' talk, see pages 70-74 of the slides from BSDCan for
example:

http://quigon.bsws.de/papers/2011/pf10yrs/



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Per-Olov Sjöholm
On 23 aug 2011, at 10:54, Patrick Lamaiziere wrote:
 On Mon, 22 Aug 2011 22:49:47 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

 Hello,
 Have not tried current, but will try current as soon as I can.
 Also... I will try to do some lab experiments with the CPU speed of the
 core the OpenBSD virtual machine has, to see how interrupts and
 throughput relate to the CPU speed of the allocated core.

 It would be nice to know if current is better with Intel em(4) cards,
 because of this commit: http://freshbsd.org/2011/04/13/00/19/01

 Here we reach 400 MBits/s with a CPU rate ~70% but we
 run OpenBSD 4.9.

 Regards.



How fast is your CPU ?

Yes, I can see that the 1.254 commit with this change came in after the 4.9
release that I use. I can try to see if I can measure any performance gain
with this update.

I will try this from aug 17...
http://ftp.sunet.se/pub/os/OpenBSD/snapshots/i386/install50.iso

I'll get back

[ YES !! More fun tests :D ]

Regards
Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-23 Thread Tomas Bodzar
On Tue, Aug 23, 2011 at 7:21 PM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 23 aug 2011, at 10:54, Patrick Lamaiziere wrote:
 On Mon, 22 Aug 2011 22:49:47 +0200, Per-Olov Sjöholm p...@incedo.org wrote:

 Hello,
 Have not tried current, but will try current as soon as I can.
 Also... I will try to do some lab experiments with the CPU speed of the
 core the OpenBSD virtual machine has, to see how interrupts and
 throughput relate to the CPU speed of the allocated core.

 It would be nice to know if current is better with Intel em(4) cards,
 because of this commit: http://freshbsd.org/2011/04/13/00/19/01

 Here we reach 400 MBits/s with a CPU rate ~70% but we
 run OpenBSD 4.9.

 Regards.



 How fast is your CPU ?

 Yes, I can see that the 1.254 commit with this change came in after the
 4.9 release that I use. I can try to see if I can measure any performance
 gain with this update.

 I will try this from aug 17...
 http://ftp.sunet.se/pub/os/OpenBSD/snapshots/i386/install50.iso

Can't see that mirror at http://www.openbsd.org/ftp.html; it's
better to use something more official


 I'll get back

 [ YES !! More fun tests :D ]

 Regards
 Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Per-Olov Sjöholm
On 22 aug 2011, at 07:45, Tomas Bodzar wrote:
 Try OpenBSD outside of KVM on real HW and you will see where's the
 bottleneck. Anyway getting 400Mbit/s under virtualization seems pretty
 fine or try to compare with OpenBSD running in VMware as there's fine
 support for that use.

 Of course security is around zero in this scenario, but as you said
 you're doing it for fun :-)

 On Mon, Aug 22, 2011 at 2:03 AM, Per-Olov Sjöholm p...@incedo.org wrote:
 Hi Misc

 # Background #

 I have done some fun lab experiments with a virtual, fully patched OpenBSD
 4.9 firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The
 virtual OpenBSD got 512 MB RAM and one core from a system with two
 quad-core Xeon 5504 (2 GHz) CPUs sitting in a Dell T410 tower server. I
 have given the OpenBSD FW 2 dedicated Intel PRO/1000 MT (82574L) physical
 NICs via PCI passthrough, so OpenBSD sees and uses the real NICs (they are
 then unusable to Linux as they are unbound).

 I have not measured packets per second, which of course is more relevant.
 But as I try to tweak the speed I don't care if I measure packets or Mbits,
 as long as my tweaks give a higher value during the next test. Going in on
 one physical NIC and out on the other with my small ruleset that uses keep
 state everywhere gives me about 400 Mbit. AFP, SMB, SCP and NFS give
 similar results (I copy large files, a few gigs each). I started with a
 lower value and after a few tweaks in sysctl.conf ended up with this speed
 of 400 Mbit. At this speed I can see that the interrupts in the firewall
 simply eat all resources. I have no ip.ifq.drops or any other drops that I
 am aware of...


 # Question #

 I now simply wonder if I can increase this speed. I did one test and
 replaced these two physical desktop Intel NICs with a dual-port server
 adapter (also Intel, 82546GB). I was interested to see whether a dual-port,
 more expensive server adapter could lower my interrupt load. However...
 OpenBSD yelled something about being unable to reset the PCI device, so I
 went back to the two desktop adapters. These low-price desktop adapters,
 in an Intel i7 desktop workstation, download over SMB from my server at
 119 MByte/s and fill up the Gig pipe. So they cannot be too bad...


 As PF cannot use SMP, is the only way to bump up the firewall throughput
 (in this scenario) to increase the speed of the processor core (i.e. change
 server)? Or are there any other interesting configs to try?


 Regards

 /Per-Olov
 --
 GPG keyID: 5231C0C4
 GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
 GPG key:
 http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4





Plz, don't top post

VMware is commercial software = avoid if I can. Also, Linux guests with
virtio drivers give much better performance on the same hardware when using
KVM instead of VMware. Also, there is no need for VMware tools, as
everything is in the stock Linux kernel.

I cannot at this time give a fair test running it on the same hardware as a
physical server instead of a virtual one, as the KVM host runs 10 other
servers. I have, however, tested OpenBSD on other hardware, which ended up
with similar performance. That was a physical box with Gig Intel NICs
(82541 cards) but a weak quad-core Intel Atom 1.6 GHz processor running the
SMP kernel. At the bottleneck speed there was 100% interrupt load at around
400 Mbit (same test files and protocols, to give a fair comparison). Maybe
the Intel Atom 1.6 can be compared to a Xeon 5504 core at 2 GHz??? I am not
a processor guru. Anyone??


Regarding security, which you say is around zero: yes, this is a lab
experiment. But maybe you should say increased risk, which is a fairer
statement. I have not heard of anyone who managed to hack a scenario like
this in VMware or KVM. Also note that the host OS itself in my case cannot
even see these devices, as they are unbound. From my point of view it's
like the race on WiFi, where people say you should use WPA2 with AES to be
secure. But the fact is that standard old WPA without AES, with a
reasonable key length (20+ chars), has not been broken by anyone in the
world yet (that we know of). One person claims he managed to break a part
of it in a lab. So... WPA = secure, better performance and better
compatibility. If I were NASA or the DoD I would probably avoid WPA, as
someone someday will of course break it; otherwise not...



So the question remains. Is it likely that a faster CPU core will give
better performance (not that I need it; just doing some lab experiments
here)? Is a faster CPU the best / only way to increase throughput? Of
course we assume the OS tweaks are OK and that reasonable NICs are used. Is
there a plan to change the interrupt handling model in OpenBSD to device
polling in future releases?




Please don't make this thread about security from now on, as that is not
the main purpose.


/Per-Olov

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Alexander Hall

On 08/22/11 10:59, Per-Olov Sjöholm wrote:


Q: What is the most annoying thing in e-mail?


Rants.



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Tomas Bodzar
On Mon, Aug 22, 2011 at 10:59 AM, Per-Olov Sjöholm p...@incedo.org wrote:
 On 22 aug 2011, at 07:45, Tomas Bodzar wrote:
 Try OpenBSD outside of KVM on real HW and you will see where's the
 bottleneck. Anyway getting 400Mbit/s under virtualization seems pretty
 fine or try to compare with OpenBSD running in VMware as there's fine
 support for that use.

 Of course security is around zero in this scenario, but as you said
 you're doing it for fun :-)

 On Mon, Aug 22, 2011 at 2:03 AM, Per-Olov Sjöholm p...@incedo.org wrote:
 Hi Misc

 # Background #

 I have done some fun lab experiments with a virtual, fully patched OpenBSD
 4.9 firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The
 virtual OpenBSD got 512 MB RAM and one core from a system with two
 quad-core Xeon 5504 (2 GHz) CPUs sitting in a Dell T410 tower server. I
 have given the OpenBSD FW 2 dedicated Intel PRO/1000 MT (82574L) physical
 NICs via PCI passthrough, so OpenBSD sees and uses the real NICs (they are
 then unusable to Linux as they are unbound).

 I have not measured packets per second, which of course is more relevant.
 But as I try to tweak the speed I don't care if I measure packets or Mbits,
 as long as my tweaks give a higher value during the next test. Going in on
 one physical NIC and out on the other with my small ruleset that uses keep
 state everywhere gives me about 400 Mbit. AFP, SMB, SCP and NFS give
 similar results (I copy large files, a few gigs each). I started with a
 lower value and after a few tweaks in sysctl.conf ended up with this speed
 of 400 Mbit. At this speed I can see that the interrupts in the firewall
 simply eat all resources. I have no ip.ifq.drops or any other drops that I
 am aware of...


 # Question #

 I now simply wonder if I can increase this speed. I did one test and
 replaced these two physical desktop Intel NICs with a dual-port server
 adapter (also Intel, 82546GB). I was interested to see whether a dual-port,
 more expensive server adapter could lower my interrupt load. However...
 OpenBSD yelled something about being unable to reset the PCI device, so I
 went back to the two desktop adapters. These low-price desktop adapters,
 in an Intel i7 desktop workstation, download over SMB from my server at
 119 MByte/s and fill up the Gig pipe. So they cannot be too bad...


 As PF cannot use SMP, is the only way to bump up the firewall throughput
 (in this scenario) to increase the speed of the processor core (i.e. change
 server)? Or are there any other interesting configs to try?


 Regards

 /Per-Olov
 --
 GPG keyID: 5231C0C4
 GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
 GPG key:
 http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4





 Plz, don't top post

Sorry. Sometimes I forget, because other lists have different rules.


 VMware is commercial software = avoid if I can. Also, Linux guests with
 virtio drivers give much better performance on the same hardware when
 using KVM instead of VMware. Also, there is no need for VMware tools, as
 everything is in the stock Linux kernel.

 I cannot at this time give a fair test running it on the same hardware as
 a physical server instead of a virtual one, as the KVM host runs 10 other
 servers. I have, however, tested OpenBSD on other hardware, which ended up
 with similar performance. That was a physical box with Gig Intel NICs
 (82541 cards) but a weak quad-core Intel Atom 1.6 GHz processor running
 the SMP kernel. At the bottleneck speed there was 100% interrupt load at
 around 400 Mbit (same test files and protocols, to give a fair
 comparison). Maybe the Intel Atom 1.6 can be compared to a Xeon 5504 core
 at 2 GHz??? I am not a processor guru. Anyone??

http://marc.info/?l=openbsd-miscm=126204017310569w=2



 Regarding security, which you say is around zero: yes, this is a lab
 experiment. But maybe you should say increased risk, which is a fairer
 statement. I have not heard of anyone who managed to hack a scenario like
 this in VMware or KVM. Also note that the host OS itself in my case cannot
 even see these devices, as they are unbound. From my point of view it's
 like the race on WiFi, where people say you should use WPA2 with AES to be
 secure. But the fact is that standard old WPA without AES, with a
 reasonable key length (20+ chars), has not been broken by anyone in the
 world yet (that we know of). One person claims he managed to break a part
 of it in a lab. So... WPA = secure, better performance and better
 compatibility. If I were NASA or the DoD I would probably avoid WPA, as
 someone someday will of course break it; otherwise not...



 So the question remains. Is it likely that a faster CPU core will give
 better performance (not that I need it; just doing some lab experiments
 here)? Is a faster CPU the best / only way to increase throughput? Of
 course we assume the OS tweaks are OK and that reasonable NICs are used.
 Is there a plan to change the interrupt handling model in OpenBSD to
 device 

Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Daniel Gracia
AFAIK, the OpenBSD kernel is not designed with any form of virtualization
toy in mind, so don't even try figuring performance numbers out of it.
These will be plain wrong.

As http://www.openbsd.org/faq/faq6.html states, there's little you can
tweak to improve your numbers; just get a nice-clocked, good cache-sized
CPU and give it some loving.

If OpenBSD doesn't satisfy you as is, recode it or stay apart, as you like.

Good luck!

On 22/08/2011 2:03, Per-Olov Sjöholm wrote:

Hi Misc

# Background #

I have done som fun laborations with a virtual fully patched OpenBSD 4.9
firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The Virtual
OpenBSD got 512MB RAM and one core from a system with two quadcore Xeon 5504
(2Ghz) sitting in a Dell T410 Tower Server. I have given the OpenBSD FW 2
dedicated Intel PRO/1000 MT (82574L) physical nic:s via PCI passthorugh. So
OpenBSD sees and uses the real nic:s (they are then unusable to Linux as they
are unbound).

I have not measured packets per second, which of course is more relevant. But
as I try to tweak the speed I don't care if I measure packets or Mbits, as
long as my tweaks give a higher value during the next test. Going in on one
physical NIC and out on the other with my small ruleset that uses keep state
everywhere gives me about 400 Mbit. AFP, SMB, SCP and NFS give similar results
(I copy large files, a few gigs each). I started with a lower value and after
a few tweaks in sysctl.conf ended up with this speed of 400 Mbit. At this
speed I can see that the interrupts in the firewall simply eat all resources.
Have no ip.ifq.drops or any other drops that I am aware of...


# Question #

I now simply wonder if I can increase this speed. I did one test and
replaced these two physical desktop Intel NICs with a dual-port server adapter
(also Intel, 82546GB). I was interested to see if a dual-port, more expensive
server adapter could lower my interrupt load. However... OpenBSD yelled
something about being unable to reset the PCI device. So I went back to these
two desktop adapters. These low-price desktop adapters, however, in an Intel
i7 desktop workstation download over SMB from my server at 119 Mbyte/s and
fill up the Gig pipe. So they cannot be too bad...


As PF cannot use SMP, is the only way to bump up the firewall throughput (in
this scenario) to increase the speed of the processor core (i.e. change
server)? Or are there any other interesting configs to try?


Regards

/Per-Olov
--
GPG keyID: 5231C0C4
GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
GPG key:
http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4




Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Stuart Henderson
 Plz, don't top post

 sorry. Sometimes I forgot because here are different rules.

Just try and make your emails look nice and easy to read if you want
other people to read them, especially if you're asking others for help.
Before you hit send, read through your email, if it doesn't look good,
re-edit until it does.

A mess of hundreds of lines of irrelevant quotes with poor line-wrapping
is always hard to read, whether your text is written at the top, the bottom,
or interspersed with the quoted text.



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Per-Olov Sjöholm
On 22 aug 2011, at 12:09, Daniel Gracia wrote:
 AFAIK, the OpenBSD kernel is not designed with any form of virtualization
 toy in mind, so don't even try figuring performance numbers out of it; these
 will be plain wrong.

 As http://www.openbsd.org/faq/faq6.html states, there's little you can tweak
 to improve your numbers; just get a nicely clocked CPU with a good-sized
 cache and give it some loving.

 If OpenBSD doesn't satisfy you as is, recode it or stay apart, as you like.

 Good luck!

 On 22/08/2011 2:03, Per-Olov Sjöholm wrote:
 Hi Misc

 # Background #

 I have done some fun experiments with a virtual, fully patched OpenBSD 4.9
 firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The virtual
 OpenBSD got 512MB RAM and one core from a system with two quad-core Xeon 5504
 (2GHz) sitting in a Dell T410 tower server. I have given the OpenBSD FW 2
 dedicated Intel PRO/1000 MT (82574L) physical NICs via PCI passthrough. So
 OpenBSD sees and uses the real NICs (they are then unusable to Linux as they
 are unbound).

 I have not measured packets per second, which of course is more relevant. But
 as I try to tweak the speed I don't care if I measure packets or Mbits, as
 long as my tweaks give a higher value during the next test. Going in on one
 physical NIC and out on the other with my small ruleset that uses keep state
 everywhere gives me about 400 Mbit. AFP, SMB, SCP and NFS give similar
 results (I copy large files, a few gigs each). I started with a lower value
 and after a few tweaks in sysctl.conf ended up with this speed of 400 Mbit.
 At this speed I can see that the interrupts in the firewall simply eat all
 resources. Have no ip.ifq.drops or any other drops that I am aware of...


 # Question #

 I now simply wonder if I can increase this speed. I did one test and
 replaced these two physical desktop Intel NICs with a dual-port server
 adapter (also Intel, 82546GB). I was interested to see if a dual-port, more
 expensive server adapter could lower my interrupt load. However... OpenBSD
 yelled something about being unable to reset the PCI device. So I went back
 to these two desktop adapters. These low-price desktop adapters, however, in
 an Intel i7 desktop workstation download over SMB from my server at
 119 Mbyte/s and fill up the Gig pipe. So they cannot be too bad...


 As PF cannot use SMP, is the only way to bump up the firewall throughput (in
 this scenario) to increase the speed of the processor core (i.e. change
 server)? Or are there any other interesting configs to try?


 Regards

 /Per-Olov
 --
 GPG keyID: 5231C0C4
 GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
 GPG key:
 http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4




  AFAIK, the OpenBSD kernel is not designed with any form of virtualization
 toy in mind, so don't even try figuring performance numbers out of it; these
 will be plain wrong.

Why is that? The speed so far seems good enough for a virtual FW with this
2GHz CPU core. No matter if you use a virtual or physical server, you always
want to get the most out of it. I do NOT compare with a physical server at
all. I want to try to maximize the throughput and see what I can get out of
it as a virtual FW test. The same applies if you use a physical server. You
can hit the limit and get 100% interrupts with both a physical and a virtual
server, right? I didn't ask for a comparison with a physical server... I
asked what I can do more with it under these circumstances...


 As http://www.openbsd.org/faq/faq6.html states, there's little you can tweak
 to improve your numbers; just get a nicely clocked CPU with a good-sized
 cache and give it some loving.

The FAQ you refer to seems to be of no use at all and is totally unrelated to
this post.



But if you can give hints on how to decrease the interrupt load I am all ears.
As I see it, if the interrupt handling model in OpenBSD were changed to a
polling one, you could maybe increase the throughput at the same processor
speed (just me guessing though). But now the fact is that it is not polling.
So what can I do with what we have?

Is pure CPU speed the only way? Or is it possible to decrease the interrupt
load with even better NICs?


Regards
/Per-Olov



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Stuart Henderson
 But if you can give hints on how to decrease the interrupt load I am all ears.
 As I see it, if the interrupt handling model in OpenBSD were changed to a
 polling one, you could maybe increase the throughput at the same processor
 speed (just me guessing though). But now the fact is that it is not polling.
 So what can I do with what we have?

polling is one mechanism to ensure you aren't handling interrupts all the
time, so you can ensure userland remains responsive even when the machine is
under heavy network load. OpenBSD has another way to handle this, MCLGETI.

 Is pure cpu speed the only way? Or is it possible to decrease the interrupt
 load with even better NIC:s?

here are some things that might help:

- faster cpu
- larger cpu cache
- faster ram
- reduce overheads (things like switching VM context while handling
packets is not going to help matters)
- improving code efficiency

have you tried -current?



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Per-Olov Sjöholm
On 22 aug 2011, at 22:04, Stuart Henderson wrote:
 But if you can give hints on how to decrease the interrupt load I am all
 ears. As I see it, if the interrupt handling model in OpenBSD were changed
 to a polling one, you could maybe increase the throughput at the same
 processor speed (just me guessing though). But now the fact is that it is
 not polling. So what can I do with what we have?

 polling is one mechanism to ensure you aren't handling interrupts all the
 time, so you can ensure userland remains responsive even when the machine is
 under heavy network load. OpenBSD has another way to handle this, MCLGETI.


 With polling, if I get it right, the context switch overhead is mostly
avoided because the system can choose to look at the device when it is
already in the right context. The drawback could be increased latency in
processing events in a polling model. But according to what I have read, the
latency is reduced to very low values by raising the clock interrupt
frequency. They say polling is better from an OS time-spent-on-device-control
perspective. Note that I am not a pro in this area, but will for sure look
deeper...

MCLGETI? Is it in if_em.c if I want to see how it is implemented?


 Is pure cpu speed the only way? Or is it possible to decrease the
interrupt
 load with even better NIC:s?

 here are some things that might help:

 - faster cpu
 - larger cpu cache
 - faster ram
 - reduce overheads (things like switching VM context while handling
 packets is not going to help matters)
 - improving code efficiency

 have you tried -current?




I tried to share and use the same interrupt for my network ports as I had a
guess it could be a boost, but the BIOS did not want what I wanted...
Interrupts could be shared, but not between the ports I wanted. I simply did
not understand the interrupt allocation scheme in my Dell T410 tower server.

Have not tried -current, but will try it as soon as I can. Also... I will
try to do some experiments with the CPU speed of the core the OpenBSD virtual
machine has. This to see how the interrupts and throughput are related to the
CPU speed of the allocated core.


Tnx

/Per-Olov


GPG keyID: 5231C0C4
GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
GPG key:
http://wwwkeys.eu.pgp.net/pks/lookup?op=getsearch=0x766ED29D5231C0C4



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Christer Solskogen
On Mon, Aug 22, 2011 at 10:04 PM, Stuart Henderson s...@spacehopper.org wrote:
 - faster ram

Are you sure about that? In almost every benchmark I've seen, fast RAM makes
almost no difference. I would be delighted if what I've been reading is
wrong :-)

-- 
chs,



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Claudio Jeker
On Mon, Aug 22, 2011 at 10:53:05PM +0200, Christer Solskogen wrote:
 On Mon, Aug 22, 2011 at 10:04 PM, Stuart Henderson s...@spacehopper.org 
 wrote:
  - faster ram
 
 Are you sure about that? Almost every benchmark I've seen, fast ram
 has almost nothing to say. I would be delighted if what I've been
 reading is wrong :-)
 

Yes, memory speed matters a lot. DMA goes into main memory, which needs to be
read into the cache when the received packet is accessed. Having the memory
close to the CPU and on fast busses helps in that regard. Big caches will do
the rest.

-- 
:wq Claudio



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Claudio Jeker
On Mon, Aug 22, 2011 at 10:49:47PM +0200, Per-Olov Sjöholm wrote:
 On 22 aug 2011, at 22:04, Stuart Henderson wrote:
  But if you can give hints of how to decrease the interrupt load I am all
 ears.
  As I see it, if the interrupt handling model i OpenBSD would change to a
  polling one u could maybe increase the throughput at the same processor
 speed
  (just me guessing though). But now the fact is that it is not polling. So
 what
  can I do with what we have
 
  polling is one mechanism to ensure you aren't handling interrupts all the
  time, so you can ensure userland remains responsive even when the machine
 is
  under heavy network load. OpenBSD has another way to handle this, MCLGETI.
 
 
  With polling, if I get it right, the context switch overhead is mostly
 avoided because the system can choose to look at the device when it is
 already in the right context. The drawback could be increased latency in
 processing events in a polling model. But according to what I have read, the
 latency is reduced to very low values by raising the clock interrupt
 frequency. They say polling is better from an OS time-spent-on-device-control
 perspective. Note that I am not a pro in this area, but will for sure look
 deeper...

Polling only works reliably at insane HZ settings, which will cause other
issues in other places (some obvious, some not so obvious). In the end
polling is a poor man's interrupt mitigation (which is also enabled on
em(4), btw) since instead of using the interrupt of the network card you
use the interrupt of the clock to process the DMA rings. Polling does not
gain you much on good modern HW.
 
 MCLGETI ?? Is it in if_em.c if I want to see how it is implemented?
 

Yes. em(4) has MCLGETI().

 
  Is pure cpu speed the only way? Or is it possible to decrease the
 interrupt
  load with even better NIC:s?
 
  here are some things that might help:
 
  - faster cpu
  - larger cpu cache
  - faster ram
  - reduce overheads (things like switching VM context while handling
  packets is not going to help matters)
  - improving code efficiency
 
  have you tried -current?
 
 
 
 
 I tried to share and use the same interrupt for my network ports as I had a
 guess it could be a boost, but the BIOS did not want what I wanted...
 Interrupts could be shared, but not between the ports I wanted. I simply did
 not understand the interrupt allocation scheme in my Dell T410 tower server.

 Have not tried -current, but will try it as soon as I can. Also... I will
 try to do some experiments with the CPU speed of the core the OpenBSD
 virtual machine has. This to see how the interrupts and throughput are
 related to the CPU speed of the allocated core.
 

Also make sure that the guest can actually access the physical HW directly
without any virtualisation in between. In the end real HW is going to have
less overhead and will be faster than a VM solution.

-- 
:wq Claudio



Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread Per-Olov Sjöholm
On 22 aug 2011, at 23:28, Claudio Jeker wrote:
 On Mon, Aug 22, 2011 at 10:49:47PM +0200, Per-Olov Sjöholm wrote:
 On 22 aug 2011, at 22:04, Stuart Henderson wrote:
 But if you can give hints of how to decrease the interrupt load I am all
 ears.
 As I see it, if the interrupt handling model i OpenBSD would change to a
 polling one u could maybe increase the throughput at the same processor
 speed
 (just me guessing though). But now the fact is that it is not polling.
So
 what
 can I do with what we have

 polling is one mechanism to ensure you aren't handling interrupts all the
 time, so you can ensure userland remains responsive even when the machine
 is
 under heavy network load. OpenBSD has another way to handle this,
MCLGETI.


 With polling if I get it right the context switch overhead is mostly
avoided
 because the system can choose to look at the device when it is already in
the
 right context. The drawback could be increased latency in processsing
events
 in a polling model. But according to what I have read, the latency is
reduced
 to a very low low values by raising the clock interrupt frequency. They
say
 polling is better  from a OS time spent on device control perspective.
Note
 that I am not a pro in this area, but will for sure look deeper...

 Polling only works reliably at insane HZ settings which will cause other
 issues at other places (some obvious some not so obvious). In the end
 polling is a poor mans interrupt mitigation (which is also enabled on
 em(4) btw.) since instead of using the interrupt of the network card you
 use the interrupt of the clock to process the DMA rings. Polling does not
 gain you much on good modern HW.

 MCLGETI ?? Is it in if_em.c if I want to see how it is implemented?


 Yes. em(4) has MCLGETI().


 Is pure cpu speed the only way? Or is it possible to decrease the
 interrupt
 load with even better NIC:s?

 here are some things that might help:

 - faster cpu
 - larger cpu cache
 - faster ram
 - reduce overheads (things like switching VM context while handling
 packets is not going to help matters)
 - improving code efficiency

 have you tried -current?




 I tried to share and use the same interrupt for my network ports as I have
a
 guess it could be a boost, but the bios did not want what I wanted
 Interrupts could be shared, but not between the ports I wanted. I simple
did
 not understand the interrupt allocation scheme in my Dell T410 tower
server.

 Have not tried current, but will try current as soon as I can. Also... I
will
 try to do some laborations with CPU speed of the core the OpenBSD virtual
 machine has. This to see how the interrupts and throughput is related to
the
 CPU speed of the allocated core.


 Also make sure that the guest can actually access the physical HW directly
 without any virtualisation in between. In the end real HW is going to have
 less overhead and will be faster then a VM solution.



--snip--
The KVM hypervisor supports attaching PCI devices on the host system to
virtualized guests. PCI passthrough allows guests to have exclusive access to
PCI devices for a range of tasks. PCI passthrough allows PCI devices to appear
and behave as if they were physically attached to the guest operating system.
--snip--
From:
http://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/chap-Virtualization-PCI_passthrough.html
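For reference, the mechanism the quoted Fedora guide describes is driven by a `<hostdev>` element in the guest's libvirt domain XML; a rough sketch (the PCI address values are placeholders for whatever lspci reports for the Intel NICs, not the poster's actual config):

```xml
<!-- Sketch only: pass one physical NIC through to the KVM guest.
     Replace bus/slot/function with the values lspci shows for the NIC. -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
  </source>
</hostdev>
```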


The link above doesn't say anything about the performance cost of doing PCI
passthrough, though. But OpenBSD indeed sees and uses the correct real
physical NICs. I am of course _very_ interested in testing by installing
OpenBSD directly on the hardware, but I cannot do that at this time. This is
what OpenBSD sees...
--snip--
em0 at pci0 dev 4 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: apic 1 int 11 (irq 11), address 00:1b:21:c2:8a:b0
em1 at pci0 dev 5 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: apic 1 int 10 (irq 10), address 00:1b:21:bf:76:77
--snip--
The MACs are these adapters' real MACs. When used in OpenBSD these adapters
are totally unbound in Linux and cannot be seen or used.

This virtual, fully patched OpenBSD 4.9 has got one (of eight total) Xeon
5504 2GHz cores, 512MB RAM, the above NICs, and some raised values in sysctl.
It (as said earlier) gives at most about 400 Mbit throughput with a small
ruleset that keeps state everywhere. Have tested with NFS, AFP, SCP, SMB and
with different created 2GB ISOs. All protocols give near the same result (AFP
performs best). Another physical server with a 1.6 GHz Intel Atom with Intel
Gig cards (not the same cards) performs similarly (a little lower though) and
maxes out at near the same speed. When these systems (both the physical and
the virtual) max out, the interrupts eat 100%. Removing the firewall, the
file transfer gives 119 Mbyte/s and maxes out the Gigabit pipe.

These measurements (i.e. the comparison with the physical server) make me
believe that the virtualization is not that bad. At least not from a
performance perspective. A 

Re: Expected throughput in an OpenBSD virtual server

2011-08-22 Thread john slee
On 22 August 2011 23:45, Per-Olov Sjöholm p...@incedo.org wrote:
 As http://www.openbsd.org/faq/faq6.html states, there's little you can
tweak
 to improve your numbers; just get a nice-clocked, good cache-sized CPU and
 give it some loving.

 The FAQ you refer to seems to be of no use at all and is totally unrelated
to
 this post.

It is quite pertinent, actually. See the beginning of section 6.6:

http://www.openbsd.org/faq/faq6.html#Tuning

John



Expected throughput in an OpenBSD virtual server

2011-08-21 Thread Per-Olov Sjöholm
Hi Misc

# Background #

I have done some fun experiments with a virtual, fully patched OpenBSD 4.9
firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The virtual
OpenBSD got 512MB RAM and one core from a system with two quad-core Xeon 5504
(2GHz) sitting in a Dell T410 tower server. I have given the OpenBSD FW 2
dedicated Intel PRO/1000 MT (82574L) physical NICs via PCI passthrough. So
OpenBSD sees and uses the real NICs (they are then unusable to Linux as they
are unbound).

I have not measured packets per second, which of course is more relevant. But
as I try to tweak the speed I don't care if I measure packets or Mbits, as
long as my tweaks give a higher value during the next test. Going in on one
physical NIC and out on the other with my small ruleset that uses keep state
everywhere gives me about 400 Mbit. AFP, SMB, SCP and NFS give similar results
(I copy large files, a few gigs each). I started with a lower value and after
a few tweaks in sysctl.conf ended up with this speed of 400 Mbit. At this
speed I can see that the interrupts in the firewall simply eat all resources.
Have no ip.ifq.drops or any other drops that I am aware of...
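For illustration, the kind of sysctl.conf tweaks alluded to above might look like the following. These knob names exist on OpenBSD of that era (ip.ifq.drops is the drop counter mentioned above), but the values are guesses for the sake of the sketch, not my actual settings:

```conf
# /etc/sysctl.conf -- illustrative tuning for a PF forwarding box.
# Values are guesses; measure before and after each change.
net.inet.ip.forwarding=1       # route between the two NICs
net.inet.ip.ifq.maxlen=512     # deeper IP input queue; watch net.inet.ip.ifq.drops
kern.maxclusters=8192          # more mbuf clusters for sustained gigabit bursts
```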


# Question #

I now simply wonder if I can increase this speed. I did one test and
replaced these two physical desktop Intel NICs with a dual-port server adapter
(also Intel, 82546GB). I was interested to see if a dual-port, more expensive
server adapter could lower my interrupt load. However... OpenBSD yelled
something about being unable to reset the PCI device. So I went back to these
two desktop adapters. These low-price desktop adapters, however, in an Intel
i7 desktop workstation download over SMB from my server at 119 Mbyte/s and
fill up the Gig pipe. So they cannot be too bad...


As PF cannot use SMP, is the only way to bump up the firewall throughput (in
this scenario) to increase the speed of the processor core (i.e. change
server)? Or are there any other interesting configs to try?


Regards

/Per-Olov
--
GPG keyID: 5231C0C4
GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
GPG key:
http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4



Re: Expected throughput in an OpenBSD virtual server

2011-08-21 Thread Tomas Bodzar
Try OpenBSD outside of KVM on real HW and you will see where the bottleneck
is. Anyway, getting 400 Mbit/s under virtualization seems pretty fine; or try
to compare with OpenBSD running in VMware, as there's fine support for that
use.

Of course security is around zero in this scenario, but as you said
you're doing it for fun :-)

On Mon, Aug 22, 2011 at 2:03 AM, Per-Olov Sjöholm p...@incedo.org wrote:
 Hi Misc

 # Background #

 I have done some fun experiments with a virtual, fully patched OpenBSD 4.9
 firewall on top of SuSE Enterprise Linux 11 SP1 running KVM. The virtual
 OpenBSD got 512MB RAM and one core from a system with two quad-core Xeon 5504
 (2GHz) sitting in a Dell T410 tower server. I have given the OpenBSD FW 2
 dedicated Intel PRO/1000 MT (82574L) physical NICs via PCI passthrough. So
 OpenBSD sees and uses the real NICs (they are then unusable to Linux as they
 are unbound).

 I have not measured packets per second, which of course is more relevant. But
 as I try to tweak the speed I don't care if I measure packets or Mbits, as
 long as my tweaks give a higher value during the next test. Going in on one
 physical NIC and out on the other with my small ruleset that uses keep state
 everywhere gives me about 400 Mbit. AFP, SMB, SCP and NFS give similar
 results (I copy large files, a few gigs each). I started with a lower value
 and after a few tweaks in sysctl.conf ended up with this speed of 400 Mbit.
 At this speed I can see that the interrupts in the firewall simply eat all
 resources. Have no ip.ifq.drops or any other drops that I am aware of...


 # Question #

 I now simply wonder if I can increase this speed. I did one test and
 replaced these two physical desktop Intel NICs with a dual-port server
 adapter (also Intel, 82546GB). I was interested to see if a dual-port, more
 expensive server adapter could lower my interrupt load. However... OpenBSD
 yelled something about being unable to reset the PCI device. So I went back
 to these two desktop adapters. These low-price desktop adapters, however, in
 an Intel i7 desktop workstation download over SMB from my server at
 119 Mbyte/s and fill up the Gig pipe. So they cannot be too bad...


 As PF cannot use SMP, is the only way to bump up the firewall throughput (in
 this scenario) to increase the speed of the processor core (i.e. change
 server)? Or are there any other interesting configs to try?


 Regards

 /Per-Olov
 --
 GPG keyID: 5231C0C4
 GPG fingerprint: B232 3E1A F5AB 5E10 7561 6739 766E D29D 5231 C0C4
 GPG key:
 http://wwwkeys.eu.pgp.net/pks/lookup?op=get&search=0x766ED29D5231C0C4