On Sat, Jul 19, 2008 at 7:17 PM, Brian McGinty <[EMAIL PROTECTED]> wrote:
> G'day Kip,
>
>> I'm hoping to get to it some time in August. I'm a bit behind in my
>> contracts at the moment.
>
> A few weeks ago, I did a quick comparison of the driver between
> FreeBSD and Linux, and found quite a few
G'day Kip,
> I'm hoping to get to it some time in August. I'm a bit behind in my
> contracts at the moment.
A few weeks ago, I did a quick comparison of the driver between
FreeBSD and Linux, and found quite a few differences that are worth
pulling over. The guy from Intel working on FreeBSD, Jack?
On Fri, Jul 11, 2008 at 11:44 PM, Brian McGinty <[EMAIL PROTECTED]> wrote:
>> Hi Brian
>> I very much doubt that this is ceteris paribus. This is 384 random IPs
>> -> 384 random IP addresses with a flow lookup for each packet. Also,
>> I've read through igb on Linux - it has a lot of optimizations
On Mon, 7 Jul 2008, Robert Watson wrote:
On Mon, 7 Jul 2008, Bruce Evans wrote:
(1) sendto() to a specific address and port on a socket that has been
bound to INADDR_ANY and a specific port.
(2) sendto() on a specific address and port on a socket that has been
bound to a specific IP
> Hi Brian
> I very much doubt that this is ceteris paribus. This is 384 random IPs
> -> 384 random IP addresses with a flow lookup for each packet. Also,
> I've read through igb on Linux - it has a lot of optimizations that
> the FreeBSD driver lacks and I have yet to implement.
Hey Kip,
when wi
>> Good news is using iptables only adds a few percentage points to the CPU
>> usage. But still, what's with that..
>> So far FreeBSD got the highest pps rating for forwarding. I haven't
>> tried bridge mode. Ipfw probably takes a big hit in that too th
Hi Paul,
Paul wrote:
I tested Linux in bridge configuration with the same machine and it
CPUed out at about 600kpps through the bridge..
600kpps incoming or 600kpps incoming+ outgoing ?
That's a bit low :/ Soft interrupt using all the CPU. Same Opteron,
82571EB PCI Express NIC.
Tried
I tested Linux in bridge configuration with the same machine and it
CPUed out at about 600kpps through the bridge..
That's a bit low :/ Soft interrupt using all the CPU.
Same Opteron, 82571EB PCI Express NIC.
Tried SMP/ non-smp , load balanced irqs, etc..
Good news is using iptables onl
On Tue, 8 Jul 2008, Peter Jeremy wrote:
On 2008-Jul-07 13:25:13 -0700, Julian Elischer <[EMAIL PROTECTED]> wrote:
what you need is a speculative prefetch where you can tell the
processor "We will probably need the following address so start
getting it while we go do other stuff".
This looks li
> I have a pre-production card. With some bug fixes and some tuning of
> interrupt handling (custom stack - I've been asked to push the changes
> back in to CVS, I just don't have time right now) an otherwise
> unoptimized igb can forward 1.04Mpps from one port to another (1.04
> Mpps in on igb0 a
On Tue, Jul 8, 2008 at 1:46 PM, Brian McGinty <[EMAIL PROTECTED]> wrote:
>> I have a pre-production card. With some bug fixes and some tuning of
>> interrupt handling (custom stack - I've been asked to push the changes
>> back in to CVS, I just don't have time right now) an otherwise
>> unoptimized
But this is probably no routing table, and single source and dst ips or
very limited number of ips and ports.
The entire problem with Linux is the route cache: try to generate
random source IPs and random source/dst ports
and it won't even do 100kpps without problems.
I would like to log into
On 7/8/08, Robert Watson <[EMAIL PROTECTED]> wrote:
> There were some patches floating around for if_em to do a prefetch of the
> first bit of packet data on packets before handing them up the stack. My
I found Andre Oppermann's optimization patch mentioned in july 2005
status report:
http://lis
Hi,
Kip Macy wrote:
On Mon, Jul 7, 2008 at 6:22 PM, Paul <[EMAIL PROTECTED]> wrote:
I read through the IGB driver, and it says 82575/6 only... which is the new
chip Intel is releasing on the cards this month 2 port
and october 4 port, but the chips are on some of the motherboards right now.
On Mon, 7 Jul 2008, Erik Trulsson wrote:
On Mon, Jul 07, 2008 at 10:30:53PM +1000, Bruce Evans wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
The theoretical maximum at 64byte frames is 1,488,100. I've looked
up my notes; the 1.244Mpps number can be adjusted to 1.488Mpps.
Where is the extr
Will someone confirm whether it will support the 82571EB? I don't see a
reason why not, as it's very similar hardware and it's available now in
large quantities, so making the 82571 part of igb would, I think, be a
good idea.
Kip Macy wrote:
I have a pre-production card. With some bug fixes and some tun
Hi,
Kip Macy wrote:
On Mon, Jul 7, 2008 at 6:07 PM, Mike Tancsa <[EMAIL PROTECTED]> wrote:
At 02:44 PM 7/7/2008, Paul wrote:
Also my 82571 NIC supports multiple received queues and multiple transmit
queues so why hasn't
anyone written the driver to support this? It's not a 10gb card
On Mon, 7 Jul 2008, Artem Belevich wrote:
As was already mentioned, we can't avoid all cache misses as there's data
that's recently been updated in memory via DMA and therefore kicked out of
cache.
However, we may hide some of the latency penalty by prefetching
'interesting' data early. I.e.
On Mon, Jul 7, 2008 at 6:22 PM, Paul <[EMAIL PROTECTED]> wrote:
> I read through the IGB driver, and it says 82575/6 only... which is the new
> chip Intel is releasing on the cards this month 2 port
> and october 4 port, but the chips are on some of the motherboards right now.
> Why can't it also
On Mon, Jul 7, 2008 at 6:07 PM, Mike Tancsa <[EMAIL PROTECTED]> wrote:
> At 02:44 PM 7/7/2008, Paul wrote:
>
>> Also my 82571 NIC supports multiple received queues and multiple transmit
>> queues so why hasn't
>> anyone written the driver to support this? It's not a 10gb card and it
>> still suppo
I read through the IGB driver, and it says 82575/6 only... which is the
new chip Intel is releasing on the cards this month 2 port
and october 4 port, but the chips are on some of the motherboards right now.
Why can't it also use the 82571? It doesn't make any sense.. I haven't
tried it but just
At 02:44 PM 7/7/2008, Paul wrote:
Also my 82571 NIC supports multiple received queues and multiple
transmit queues so why hasn't
anyone written the driver to support this? It's not a 10gb card and
it still supports it and it's widely
available and not too expensive either. The new 82575/6 ch
Artem Belevich wrote:
Prefetching when you are waiting for the data isn't a help.
Agreed. Got to start prefetch around ns
before you actually need the data and move on doing other things that
do not depend on the data you've just started prefetching.
what you need is a speculative prefetch
Mike Tancsa wrote:
At 02:44 PM 7/7/2008, Paul wrote:
Also my 82571 NIC supports multiple received queues and multiple
transmit queues so why hasn't
anyone written the driver to support this? It's not a 10gb card and
it still supports it and it's widely
Intel actually maintains the driver. Not
We could add this as a part of the fastforwarding code and for a router
turn it on and for a server leave it off.
When I use a FBSD box for a router, it doesn't do anything else, so
there could be two optimized paths that is one for
routing/forwarding/firewalling only
and one for use as a serv
> Prefetching when you are waiting for the data isn't a help.
Agreed. Got to start prefetch around ns
before you actually need the data and move on doing other things that
do not depend on the data you've just started prefetching.
> what you need is a speculative prefetch where you can tell the
Peter Jeremy wrote:
On 2008-Jul-07 13:25:13 -0700, Julian Elischer <[EMAIL PROTECTED]> wrote:
what you need is a speculative prefetch where you can tell the
processor "We will probably need the following address so start
getting it while we go do other stuff".
This looks like the PREFETCH inst
On 2008-Jul-07 13:25:13 -0700, Julian Elischer <[EMAIL PROTECTED]> wrote:
>what you need is a speculative prefetch where you can tell the
>processor "We will probably need the following address so start
>getting it while we go do other stuff".
This looks like the PREFETCH instructions that exist
Artem Belevich wrote:
Hi,
As was already mentioned, we can't avoid all cache misses as there's
data that's recently been updated in memory via DMA and therefore
kicked out of cache.
However, we may hide some of the latency penalty by prefetching
'interesting' data early. I.e. we know that we wan
I hope so, if they maintain the driver then why wouldn't they make it
take advantage of their own hardware?
I hope they aren't stuck focusing on Windows users :/
Mike Tancsa wrote:
At 02:44 PM 7/7/2008, Paul wrote:
Also my 82571 NIC supports multiple received queues and multiple
transmit queues
At 02:44 PM 7/7/2008, Paul wrote:
Also my 82571 NIC supports multiple received queues and multiple
transmit queues so why hasn't
anyone written the driver to support this? It's not a 10gb card and
it still supports it and it's widely
Intel actually maintains the driver. Not sure if there are
Hi,
As was already mentioned, we can't avoid all cache misses as there's
data that's recently been updated in memory via DMA and therefore
kicked out of cache.
However, we may hide some of the latency penalty by prefetching
'interesting' data early. I.e. we know that we want to access some
etherne
On Tue, 8 Jul 2008, Bruce Evans wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Bruce Evans wrote:
So it seems that the major overheads are not near the driver (as I already
knew), and upper layers are responsible for most of the cache misses.
The packet header is accessed even in monitor m
I use low-end memory, but on the machine that does 640 kpps it somehow
has latency almost 4 times as low as on new FreeBSD cluster machines
(~42 nsec instead of ~150). perfmon (fixed for AXP and A64) and hwpmc
report an average of 11 k8-dc-misses per sendto() while sending via
bge at 640 kpps.
one that will later on handle the taskqueue to process the packets.
That adds overhead. Ideally the interrupt for each network interface
is bound to exactly one pre-determined CPU and the taskqueue is bound
to the same CPU. That way the overhead for interrupt and taskqueue
scheduling can be ke
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Bruce Evans wrote:
So it seems that the major overheads are not near the driver (as I already
knew), and upper layers are responsible for most of the cache misses.
The packet header is accessed even in monitor mode, so I think most of
the cache misses
Bruce Evans wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Paul,
to get a systematic analysis of the performance please do the following
tests and put them into a table for easy comparison:
1. inbound pps w/o loss with interface in monitor mode (ifconfig em0
monitor)
...
I won't be run
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Paul,
to get a systematic analysis of the performance please do the following
tests and put them into a table for easy comparison:
1. inbound pps w/o loss with interface in monitor mode (ifconfig em0
monitor)
...
I won't be running many of these t
On Mon, 7 Jul 2008, Bruce Evans wrote:
(1) sendto() to a specific address and port on a socket that has been
bound to INADDR_ANY and a specific port.
(2) sendto() on a specific address and port on a socket that has been
bound to a specific IP address (not INADDR_ANY) and a specific po
Bruce Evans wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Bruce Evans wrote:
What are the other overheads? I calculate 1.644Mpps counting the
inter-frame
gap, with 64-byte packets and 64-header_size payloads. If the 64 bytes
is for the payload, then the max is much lower.
The theoreti
On Mon, Jul 07, 2008 at 10:30:53PM +1000, Bruce Evans wrote:
> On Mon, 7 Jul 2008, Andre Oppermann wrote:
>
> > Bruce Evans wrote:
> >> What are the other overheads? I calculate 1.644Mpps counting the
> >> inter-frame
> >> gap, with 64-byte packets and 64-header_size payloads. If the 64 bytes
>
On Mon, 7 Jul 2008, Robert Watson wrote:
Since you're doing fine-grained performance measurements of a code path that
interests me a lot, could you compare the cost per-send on UDP for the
following four cases:
(1) sendto() to a specific address and port on a socket that has been bound
to
On Mon, 7 Jul 2008, Robert Watson wrote:
The last of these should really be quite a bit faster than the first of
these, but I'd be interested in seeing specific measurements for each if
that's possible!
And, if you're feeling particularly subject to suggestion, you might consider
comparing
On Mon, 7 Jul 2008, Bruce Evans wrote:
I use low-end memory, but on the machine that does 640 kpps it somehow has
latency almost 4 times as low as on new FreeBSD cluster machines (~42 nsec
instead of ~150). perfmon (fixed for AXP and A64) and hwpmc report an
average of 11 k8-dc-misses per se
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Bruce Evans wrote:
What are the other overheads? I calculate 1.644Mpps counting the
inter-frame
gap, with 64-byte packets and 64-header_size payloads. If the 64 bytes
is for the payload, then the max is much lower.
The theoretical maximum at 64byt
Andre Oppermann wrote:
Robert Watson wrote:
Experience suggests that forwarding workloads see significant lock
contention in the routing and transmit queue code. The former needs
some kernel hacking to address in order to improve parallelism for
routing lookups. The latter is harder to addre
Bruce Evans wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Ingo Flaschberger wrote:
I don't think you will be able to route 64byte packets at 1gbit
wirespeed (2Mpps) with a current x86 platform.
You have to take inter-frame gap and other overheads too. That gives
about 1.244Mpps max on a
Robert Watson wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Distributing the interrupts and taskqueues among the available CPUs
gives concurrent forwarding with bi- or multi-directional traffic. All
incoming traffic from any particular interface is still serialized
though.
... although
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Distributing the interrupts and taskqueues among the available CPUs gives
concurrent forwarding with bi- or multi-directional traffic. All incoming
traffic from any particular interface is still serialized though.
... although not on multiple input
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Ingo Flaschberger wrote:
I don't think you will be able to route 64byte packets at 1gbit wirespeed
(2Mpps) with a current x86 platform.
You have to take inter-frame gap and other overheads too. That gives
about 1.244Mpps max on a 1GigE interface.
Paul wrote:
SMP DISABLED on my Opteron 2212 (ULE, Preemption on)
Yields ~750kpps in em0 and out em1 (one direction)
I am miffed why this yields more pps than
a) with all 4 cpus running and b) 4 cpus with lagg load balanced over 3
incoming connections so 3 taskq threads
SMP adds quite some ov
Paul,
to get a systematic analysis of the performance please do the following
tests and put them into a table for easy comparison:
1. inbound pps w/o loss with interface in monitor mode (ifconfig em0 monitor)
2. inbound pps w/ fastforward into a single blackhole route
3. inbound pps /w fast
Robert Watson wrote:
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Robert Watson wrote:
Experience suggests that forwarding workloads see significant lock
contention in the routing and transmit queue code. The former needs
some kernel hacking to address in order to improve parallelism for
rou
Ingo Flaschberger wrote:
Dear Paul,
I tried all of this :/ still, 256/512 descriptors seem to work the best.
Happy to let you log into the machine and fiddle around if you want :)
yes, but I'm sure I will also not be able to achieve much more pps.
As it seems that you hit hardware-software-
On Mon, 7 Jul 2008, Andre Oppermann wrote:
Robert Watson wrote:
Experience suggests that forwarding workloads see significant lock
contention in the routing and transmit queue code. The former needs some
kernel hacking to address in order to improve parallelism for routing
lookups. The lat
Robert Watson wrote:
Experience suggests that forwarding workloads see significant lock
contention in the routing and transmit queue code. The former needs
some kernel hacking to address in order to improve parallelism for
routing lookups. The latter is harder to address given the hardware
y
On Sat, 5 Jul 2008, Paul wrote:
ULE + PREEMPTION for non SMP
no major differences with SMP with ULE/4BSD and preemption ON/OFF
32 bit UP test coming up with new cpu and I'm installing dragonfly
sometime this weekend :]
UP: 1mpps in one direction with no firewall/no routing table is not too
On Sat, Jul 5, 2008 at 2:03 AM, Paul <[EMAIL PROTECTED]> wrote:
> I tried all of this :/ still, 256/512 descriptors seem to work the best.
> Happy to let you log into the machine and fiddle around if you want :)
I think you need to ktr the packet processing time. Standard gigabit
max at ~1488095
I'm no expert, but I imagine the problem is because the net processing
of FreeBSD is not pipelined enough. We are now able to affordably throw
many gigabytes of RAM into a machine, as well as 2 to 8 CPUs. So why not
allow for big buffers and multiple processing steps?
I'd be happy to give up a
Bart Van Kerckhove wrote:
Perhaps you have some better luck at some different hardware systems
(ppc, mips, ..?) or use freebsd only for routing-table-updates and
special network-cards (netfpga) for real routing.
The netfpga site seems more or less dead - is this project still alive?
It does l
Bart Van Kerckhove wrote:
The netfpga site seems more or less dead - is this project still alive?
It does look like a very interesting idea, even though it's currently quite
linux-centric (and according to docs doesn't have VLAN nor ip6 support, the
former being quite a dealbreaker)
Just last
Paul / Ingo,
>
>> I tried all of this :/ still, 256/512 descriptors seem to work the
>> best. Happy to let you log into the machine and fiddle around if you
>> want :)
I've been watching this thread closely, since I'm in a very similar
situation.
A
UP 32 bit test vs 64 bit:
negligible difference in forwarding performance without polling
slightly better polling performance but still errors at lower packet rates
same massive hit with ipfw loaded
Installing dragonfly in a bit..
If anyone has a really fast PPC type system or SUN or something i'
ULE + PREEMPTION for non SMP
no major differences with SMP with ULE/4BSD and preemption ON/OFF
32 bit UP test coming up with new cpu
and I'm installing dragonfly sometime this weekend :]
UP: 1mpps in one direction with no firewall/no routing table is not too
bad, but 1mpps both directions is the
Dear Paul,
I tried all of this :/ still, 256/512 descriptors seem to work the best.
Happy to let you log into the machine and fiddle around if you want :)
yes, but I'm sure I will also not be able to achieve much more pps.
As it seems that you hit hardware-software-level-barriers, my only id
I tried all of this :/ still, 256/512 descriptors seem to work the best.
Happy to let you log into the machine and fiddle around if you want :)
Paul
Ingo Flaschberger wrote:
Dear Paul,
what could cause this?
*) kern.polling.idle_poll enabled?
*) kern.polling.user_frac ?
*) kern.polling.reg
Dear Paul,
what could cause this?
*) kern.polling.idle_poll enabled?
*) kern.polling.user_frac ?
*) kern.polling.reg_frac ?
*) kern.polling.burst_max ?
*) kern.polling.each_burst ?
I tried tons of different values for these and nothing made any significant
difference.
Idle polling makes a
Ingo Flaschberger wrote:
Dear Paul,
Opteron UP mode, no polling
            input          (em0)           output
   packets  errs      bytes   packets  errs  bytes  colls
   1071020     0   66403248         2     0    404      0
that looks good. (but seems to be near the limit).
Yes
Dear Paul,
Opteron UP mode, no polling
            input          (em0)           output
   packets  errs      bytes   packets  errs  bytes  colls
   1071020     0   66403248         2     0    404      0
that looks good. (but seems to be near the limit).
Polling turned on provided be
On Fri, 4 Jul 2008, Paul wrote:
Numbers are maximum with near 100% cpu usage and some errors occurring, just
for testing.
FreeBSD 7.0-STABLE FreeBSD 7.0-STABLE #6: Thu Jul 3 19:32:38 CDT 2008
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/ROUTER amd64
CPU: Dual-Core AMD Opteron(tm) Processor (3
Numbers are maximum with near 100% cpu usage and some errors occurring,
just for testing.
FreeBSD 7.0-STABLE FreeBSD 7.0-STABLE #6: Thu Jul 3 19:32:38 CDT
2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ROUTER amd64
CPU: Dual-Core AMD Opteron(tm) Processor (3015.47-MHz K8-class CPU)
NON-S
Opteron UP mode, no polling
            input          (em0)           output
   packets  errs      bytes   packets  errs  bytes  colls
   1071020     0   66403248         2     0    404      0
   1049793     0   65087174         2     0    356      0
   1040320     0   64499848
Dear Steve,
My next "router" appliance will be:
http://www.axiomtek.com.tw/products/ViewProduct.asp?view=429
This is exactly the device that I have been testing with (just rebranded).
cool.
what performance do you reach?
After some very quick testing with everything default, I am witnessing
Ingo Flaschberger wrote:
Dear Steve,
My next "router" appliance will be:
http://www.axiomtek.com.tw/products/ViewProduct.asp?view=429
This is exactly the device that I have been testing with (just
rebranded).
cool.
what performance do you reach?
After some very quick testing with everythi
For your information, we have measured 730Kpps using pollng and
fastforwarding
with 64-byte frames without loss (<0.001% packet loss) on a Spirent
SmartBits (Pentium D 2.8GHz + 8x GigE em)
You can find the code / and some performance report at :
http://www.netasq.com/opensource/pollng-rev1-freebsd.
Ingo Flaschberger wrote:
Dear Steve,
My next "router" appliance will be:
http://www.axiomtek.com.tw/products/ViewProduct.asp?view=429
This is exactly the device that I have been testing with (just
rebranded).
cool.
what performance do you reach?
It's hard to say right now as I've really o
Dear Steve,
My next "router" appliance will be:
http://www.axiomtek.com.tw/products/ViewProduct.asp?view=429
This is exactly the device that I have been testing with (just rebranded).
cool.
what performance do you reach?
Kind regards,
Ingo Flaschberger
Ingo Flaschberger wrote:
My next "router" appliance will be:
http://www.axiomtek.com.tw/products/ViewProduct.asp?view=429
This is exactly the device that I have been testing with (just rebranded).
Steve
___
freebsd-net@freebsd.org mailing list
http:
Err.. pciconf -lv ?
[EMAIL PROTECTED]:0:0:0: class=0x05 card=0x151115d9 chip=0x036910de
rev=0xa2 hdr=0x00
vendor = 'Nvidia Corp'
device = 'MCP55 Memory Controller'
class = memory
subclass = RAM
[EMAIL PROTECTED]:0:1:0: class=0x060100 card=0x151115d9 c
Bruce Evans wrote:
On Thu, 3 Jul 2008, Paul wrote:
Bruce Evans wrote:
No polling:
   packets  errs      bytes   packets  errs  bytes  colls
    843762 25337   52313248         1     0    178      0
    763555     0   47340414         1     0    178      0
    830189     0   51471722         1     0    178      0
    838724     0   52000892
Dear Stefan,
So my maximum without polling is close to 800kpps but if I push that it
starts locking me from doing things, or
how many kpps do you want to achieve?
Do not know for Paul but, I want to be able to route (and/or bridge to
handle) 600-700mbps syn flood,
which is something like 150
Dear Paul,
Tomorrow comes opteron so it's 1ghz faster than this one, and I can see
if it scales directly with cpu speed or what happens.
can you send me a lspci -v?
I did another SMP test with an interesting results. I took one of the cpus
out of the machine, so it was just left with a
Dear Stefan,
So my maximum without polling is close to 800kpps but if I push that it
starts locking me from doing things, or
how many kpps do you want to achieve?
Do not know for Paul but, I want to be able to route (and/or bridge to
handle) 600-700mbps syn flood,
which is something like 150
On Thu, 3 Jul 2008, Paul wrote:
Bruce Evans wrote:
No polling:
   packets  errs      bytes   packets  errs  bytes  colls
    843762 25337   52313248         1     0    178      0
    763555     0   47340414         1     0    178      0
    830189     0   51471722         1     0    178      0
    838724     0   52000892         1     0    178
Bruce Evans wrote:
On Wed, 2 Jul 2008, Paul wrote:
...
---Reboot with 4096/4096(my guess is that it will be
a lot worse, more errors..)
Without polling, 4096 is horrible, about 200kpps less ... :/
Turning on polling..
polling on, 4096 is bad,
input
On Wed, 2 Jul 2008, Paul wrote:
...
---Reboot with 4096/4096(my guess is that it will be a lot
worse, more errors..)
Without polling, 4096 is horrible, about 200kpps less ... :/
Turning on polling..
polling on, 4096 is bad,
input (em0) outp
Preliminary 32 bit results...
When I started out it looked like 32 bit was worse than 64 bit, but it's
just the timers are different.
For instance, 4000 hz in 64 bit gives better results than 4000hz in 32 bit.
Low HZ gives better result with polling on in 32 bit
Bottom line, so far I'm not able
On Mon, Jun 30, 2008 at 6:39 PM, Ingo Flaschberger <[EMAIL PROTECTED]> wrote:
>> I'm curious now... how do you change individual device polling via sysctl?
>
> not via sysctl, via ifconfig:
>
> # enable interface polling
> /sbin/ifconfig em0 polling
> /sbin/ifconfig em1 polling
> /sbin/ifconfig em2
Fastforward works with lagg; lagg just has some issues that need to be
fixed, even on a UP system. It has the same issue as IPFW.
kern.polling.idlepoll_sleeping: 1
kern.polling.stalled: 806
kern.polling.suspect: 97861
kern.polling.phase: 0
kern.polling.enable: 0
kern.polling.handlers: 2
kern.poll
2008/7/2 Stefan Lambrev <[EMAIL PROTECTED]>:
> Do not know for Paul but, I want to be able to route (and/or bridge to
> handle) 600-700mbps syn flood,
> which is something like 1500kpps in every direction. Is it unrealistic?
> If the code is optimized to fully utilize MP I do not see a reason why
Hi
Ingo Flaschberger wrote:
Dear Paul,
I still don't like the huge hit ipfw and lagg take :/
You have to try PF, then you will respect IPFW again ;)
-cut-
So my maximum without polling is close to 800kpps but if I push that
it starts locking me from doing things, or
how many kpps do you
Dear Paul,
I still don't like the huge hit ipfw and lagg take :/
I think you can't use fastforward with lagg.
** I tried polling in UP mode and I got some VERY interesting results..
CPU is 44% idle (idle polling isn't on) but I'm getting errors! It's doing
530kpps with ipfw loaded,
Ipfw rule was simply allow ip from any to any :)
This is 64bit i'm testing now.. I have a 32 bit install I tested on
another machine but it only has bge NIC and wasn't performing as well
so I'll reinstall 32 bit on this 2212 and test then drop in the
(3ghz) and test.
I still don't like the
Dear Paul,
SMP DISABLED on my Opteron 2212 (ULE, Preemption on)
Yields ~750kpps in em0 and out em1 (one direction)
I am miffed why this yields more pps than
a) with all 4 cpus running and b) 4 cpus with lagg load balanced over 3
incoming connections so 3 taskq threads
because less locking,
SMP DISABLED on my Opteron 2212 (ULE, Preemption on)
Yields ~750kpps in em0 and out em1 (one direction)
I am miffed why this yields more pps than
a) with all 4 cpus running and b) 4 cpus with lagg load balanced over 3
incoming connections so 3 taskq threads
I would be willing to set up test e
Apparently lagg hasn't been giant fixed :/ Can we do something about
this quickly?
With adaptive Giant I get more performance on lagg, but the CPU usage is
smashed at 100%
I get about 50k more pps per interface (so 150kpps total which STILL is
less than a single gigabit port)
Check it out
68 p
Ok, now THIS is absolutely a whole bunch of ridiculousness..
I set up etherchannel, and I'm evenly distributing packets over em0, em1,
and em2 to lagg0, and I get WORSE performance than with a single
interface.. Can anyone explain this one? This is horrible.
I got em0-em2 taskq's using 80% cpu EAC
Paul wrote:
ULE without PREEMPTION is now yielding better results.
            input          (em0)           output
   packets  errs      bytes   packets  errs  bytes  colls
    571595 40639   34564108         1     0    226      0
    577892 48865   34941908         1     0    178      0
ULE without PREEMPTION is now yielding better results.
            input          (em0)           output
   packets  errs      bytes   packets  errs  bytes  colls
    571595 40639   34564108         1     0    226      0
    577892 48865   34941908         1     0    178      0
    545240  8474