I'll assume you're just not that clear on the specific implementation. Hooking 
directly into if_input() bypasses all of the "cruft". It basically uses the 
driver "as-is", so any driver can be used and the result will be as good as the 
driver. The bloat starts in if_ethersubr.c, which is easily avoided completely. 
Most drivers need to be tuned (or modified a bit), as most FreeBSD drivers are 
full of bloat and forced into a bad, cookie-cutter way of doing things.
The problem with doing things in user space is that user space is 
unpredictable. Things work just dandily when nothing else is going on, but you 
can't control when a user space program gets context under heavy loads. In the 
kernel you can control almost exactly what the polling interval is through 
interrupt moderation on most modern controllers. 
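To put rough numbers on the moderation point, here is a minimal arithmetic sketch in Python. The 8000 interrupts/s setting is an assumed example value, not a figure from any particular driver, and the function names are purely illustrative.

```python
def moderation_interval_us(interrupts_per_second):
    """Time between moderated interrupts, in microseconds."""
    return 1e6 / interrupts_per_second


def packets_per_interrupt(pps, interrupts_per_second):
    """Average number of packets that arrive between two interrupts."""
    return pps / interrupts_per_second


# Assumed example values: 10GbE line rate with minimum-sized frames,
# and a hypothetical moderation setting of 8000 interrupts per second.
LINE_RATE_PPS = 14_880_952
INTERRUPT_RATE = 8000

interval = moderation_interval_us(INTERRUPT_RATE)
batch = packets_per_interrupt(LINE_RATE_PPS, INTERRUPT_RATE)

print(f"interval between interrupts: {interval:.0f} us")
print(f"packets per interrupt:       {batch:.0f}")
```

Under those assumptions the interval is fixed by the controller setting, and the receive ring has to absorb the whole batch that accumulates between interrupts.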
Many otherwise credible programmers argued for years that polling was "faster", 
but it was only faster in artificially controlled environments. That's mainly 
because 1) they're not thinking about the entire context of what "can" happen, 
2) they test under unrealistic conditions that don't represent real-world 
events, and 3) they don't have properly tuned Ethernet drivers.
BC 


     On Monday, May 4, 2015 12:37 PM, Jim Thompson <j...@netgate.com> wrote:
   

 
While it is a true statement that “You can do anything in the kernel that you 
can do in user space.”, it is not a helpful statement.  Yes, the kernel is just 
a program.
In a similar way, “You can just pop it into any kernel and it works.” is also 
not helpful.  It works, but it doesn’t work well, because of other 
infrastructure issues.
Both of your statements reduce to the age-old, “proof is left as an exercise 
for the student”.

There is a lot of kernel infrastructure that is just plain crusty(*) and which 
directly impedes performance in this area.

But there is plenty of cruft, Barney.  Here are two threads which are three 
years old, with the issues they point out still unresolved, and multiple places 
where 100ns or more is lost:
https://lists.freebsd.org/pipermail/freebsd-current/2012-April/033287.html
https://lists.freebsd.org/pipermail/freebsd-current/2012-April/033351.html

100ns is death at 10Gbps with min-sized packets.

quoting: http://luca.ntop.org/10g.pdf
---
“Taking as a reference a 10 Gbit/s link, the raw throughput is well below the 
memory bandwidth of modern systems (between 6 and 8 GBytes/s for CPU to memory, 
up to 5 GBytes/s on PCI-Express x16). However a 10Gbit/s link can generate up 
to 14.88 million Packets Per Second (pps), which means that the system must be 
able to process one packet every 67.2 ns. This translates to about 200 clock 
cycles even for the faster CPUs, and might be a challenge considering the 
per-packet overheads normally involved by general-purpose operating systems. The 
use of large frames reduces the pps rate by a factor of 20..50, which is great 
on end hosts only concerned in bulk data transfer.  Monitoring systems and 
traffic generators, however, must be able to deal with worst case conditions.”
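
For reference, the 14.88 Mpps and 67.2 ns figures follow from standard Ethernet framing: a 64-byte minimum frame plus 8 bytes of preamble/SFD and a 12-byte inter-frame gap. A minimal sketch of that arithmetic in Python; the function names are illustrative, and the 3 GHz clock is an assumption picked only to match the "about 200 clock cycles" remark.

```python
# Per-frame wire overhead in bytes (standard Ethernet framing).
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
INTER_FRAME_GAP = 12  # minimum idle time between frames, counted in bytes


def packets_per_second(link_bps, frame_bytes):
    """Maximum frame rate on a link for a given frame size (FCS included)."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD + INTER_FRAME_GAP) * 8
    return link_bps / bits_on_wire


def budget_ns(link_bps, frame_bytes):
    """Time available to process one frame at line rate, in nanoseconds."""
    return 1e9 / packets_per_second(link_bps, frame_bytes)


LINK = 10e9  # 10 Gbit/s

min_pps = packets_per_second(LINK, 64)
min_budget = budget_ns(LINK, 64)
cycles = min_budget * 3.0  # assumed 3 GHz clock, for illustration only

print(f"64-byte frames: {min_pps / 1e6:.2f} Mpps, {min_budget:.1f} ns per packet")
print(f"cycles per packet at 3 GHz: {cycles:.0f}")
```

Against a budget of 67.2 ns per packet, a single code path that costs 100ns already exceeds the time available.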

Forwarding and filtering must also be able to deal with worst case, and nobody 
does well with kernel-based networking here.  
https://github.com/gvnn3/netperf/blob/master/Documentation/Papers/ABSDCon2015Paper.pdf

10Gbps NICs are $200-$300 today, and they’ll be included on the motherboard 
during the next hardware refresh.  Broadwell-DE (Xeon-D) has 10G in the SoC, 
and others are coming.
10Gbps switches can be had at around $100/port.  This is exactly the point at 
which the adoption curve for 1Gbps Ethernet ramped over a decade ago.


(*) A few more simple examples of cruft:

Why, in 2015, does the kernel have a “fast forwarding” option, and worse, one 
that isn’t enabled by default?  Shouldn’t “fast forwarding” be the default?

Why, in 2015, does FreeBSD not ship with IPSEC enabled in GENERIC?  (Reason: 
each and every time this has come up in recent memory, someone has pointed out 
that it impacts performance.  
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=128030)

Why, in 2015, does anyone think it’s acceptable for “fast forwarding” to break 
IPSEC?

Why, in 2015, does anyone think it’s acceptable that the setkey(8) man page 
documents, of all things, DES-CBC and HMAC-MD5 for a SA?  That’s some kind of 
sick joke, right?
This completely flies in the face of RFC 4835.


> On May 4, 2015, at 10:29 AM, Barney Cordoba via freebsd-net 
> <freebsd-net@freebsd.org> wrote:
> 
> It's not faster than "wedging" into the if_input()s. It simply can't be. You're 
> getting packets at interrupt time as soon as they're processed, there's no 
> network stack involved, and you're able to receive and transmit without a 
> process switch. At worst it's the same, without the extra plumbing. It's not 
> rocket science to "bypass the network stack".
> The only advantage of bringing it into user space would be that it's easier 
> to write threaded handlers for complex uses; but not as a firewall (which is 
> the limit of the context of my comment). You can do anything in the kernel 
> that you can do in user space. The reason a kernel module with if_input() 
> hooks is better is that you can use the standard kernel without all of the 
> netmap hacks. You can just pop it into any kernel and it works.
> BC 
> 
> 
>    On Sunday, May 3, 2015 2:13 PM, Luigi Rizzo <ri...@iet.unipi.it> wrote:
> 
> 
> On Sun, May 3, 2015 at 6:17 PM, Barney Cordoba via freebsd-net <
> freebsd-net@freebsd.org> wrote:
> 
>> Frankly I'm baffled by netmap. You can easily write a loadable kernel
>> module that moves packets from 1 interface to another and hook in the
>> firewall; why would you want to bring them up into user space? It's 1000s
>> of lines of unnecessary code.
>> 
>> 
> Because it is much faster.
> 
> The motivation for netmap-like solutions (that includes Intel's DPDK,
> PF_RING/DNA and several proprietary implementations) is speed: they bypass
> the entire network stack, and a good part of the device drivers, so you can
> access packets 10+ times faster.
> So things are actually the other way around: the 1000's of unnecessary lines
> of code (not really thousands, though) are those that you'd pay going through
> the standard network stack when you don't need any of its services.
> 
> Going to userspace is just a side effect -- turns out to
> be easier to develop and run your packet processing code
> in userspace, but there are netmap clients (e.g. the
> VALE software switch) which run entirely in the kernel.
> 
> cheers
> luigi
> 
> 
> 
>> 
>> 
>>      On Sunday, May 3, 2015 3:10 AM, Raimundo Santos <rait...@gmail.com>
>> wrote:
>> 
>> 
>>  Clarifying things for the sake of documentation:
>> 
>> To use the host stack, append a ^ character after the name of the interface
>> you want to use. (Info from netmap(4) shipped with FreeBSD 10.1 RELEASE.)
>> 
>> Examples:
>> 
>> "kipfw em0" does nothing useful.
>> "kipfw netmap:em0" disconnects the NIC from the usual data path, i.e.,
>> there are no host communications.
>> "kipfw netmap:em0 netmap:em0^" or "kipfw netmap:em0+" places the
>> netmap-ipfw rules between the NIC and the host stack entry point associated
>> with the same NIC (the IP addresses configured on it with ifconfig, ARP and
>> RARP, etc.).
>> 
>> On 10 November 2014 at 18:29, Evandro Nunes <evandronune...@gmail.com>
>> wrote:
>> 
>>> dear professor luigi,
>>> i have some numbers, I am filtering 773Kpps with kipfw using 60% of CPU and
>>> system using the rest, this system is an 8-core at 2.4GHz, but only one core
>>> is in use
>>> in this next round of tests, my NIC is now an avoton with igb(4) driver,
>>> currently with 4 queues per NIC (total 8 queues for kipfw bridge)
>>> i have read in your papers we should expect something similar to 1.48Mpps
>>> how can I benefit from the other CPUs which are completely idle? I tried
>>> CPU Affinity (cpuset) kipfw but system CPU usage follows userland kipfw so
>>> I could not set one CPU to userland while other for system
>>> 
>> 
>> All the papers talk about *generating* lots of packets, not *processing*
>> lots of packets. What this netmap example does is processing. If someone
>> really wants to use the host stack, the expected performance WILL BE worse
>> - what's the point of using a host stack bypassing tool/framework if
>> someone will end up using the host stack?
>> 
>> And by generating, the papers usually mean: minimum sized UDP packets.
>> 
>> 
>>> 
>>> can you please enlighten?
>>> 
>> 
>> For everyone: read the manuals, read related and indicated materials
>> (papers, web sites, etc), and, as a last resort, read the code. With
>> netmap's code, it's easier than it sounds.
>> _______________________________________________
>> freebsd-net@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-net
>> To unsubscribe, send any mail to "freebsd-net-unsubscr...@freebsd.org"
>> 
> 
> 
> 
> -- 
> -----------------------------------------+-------------------------------
> Prof. Luigi RIZZO, ri...@iet.unipi.it  . Dip. di Ing. dell'Informazione
> http://www.iet.unipi.it/~luigi/       . Universita` di Pisa
> TEL      +39-050-2217533              . via Diotisalvi 2
> Mobile  +39-338-6809875              . 56122 PISA (Italy)
> -----------------------------------------+-------------------------------
