On Oct 23, 2006, at 12:56 PM, madhuresh wrote:
Note, however, "in kernel space" doesn't necessarily mean "at the
driver level"; on Linux, the in-kernel filtering is done by "socket
filters" above the driver, and, even on BSD, although the driver
directly calls the BPF routine to supply a packet, the BPF code,
not the driver itself, does the filtering.
To my understanding tcpdump converts the filter options (set by the
user in plain text) into BPF code and passes it to libpcap.
No, tcpdump passes the filter option string to libpcap, which
translates it into BPF code.
In the standard Linux architecture a filter may be attached to a
socket by using a setsockopt call with the SO_ATTACH_FILTER option. A
pointer to the BPF filter code is also passed to the kernel with it.
This call tries to set a filter for a socket.
That's what libpcap uses on Linux.
If we are trying to attach a heavy filter or multiple filters to a
single socket, the Linux kernel rejects all the filters, allowing
all packets to cross the kernel-user space boundary and arrive at
libpcap. In such a situation, libpcap then filters the packets in
user space and passes them to tcpdump or the calling program.
Libpcap never tries to attach multiple filters, i.e. multiple BPF
programs, to a single socket, so that's not relevant. It only tries
to attach a single BPF program. Note that "tcp or udp" is *NOT*
multiple filters. It's a single filter, that checks for TCP traffic
or UDP traffic; that's implemented as a *single* BPF program.
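To make that concrete, here is a hand-written sketch of what a single
program for "tcp or udp" can look like. It is simplified on purpose: it
assumes Ethernet framing and IPv4 only, so it is not the exact code
libpcap generates. run_filter() and classify() are hypothetical helpers;
run_filter() is a tiny interpreter that understands only the four
opcodes this program uses, so the program can be exercised without a
kernel.

```c
#include <linux/filter.h>   /* struct sock_filter, BPF_STMT, BPF_JUMP */
#include <stddef.h>
#include <stdint.h>

/* ONE program covering both alternatives (Ethernet + IPv4 assumed). */
static struct sock_filter tcp_or_udp[] = {
    BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),            /* 0: A = ethertype     */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 0x0800, 0, 4), /* 1: not IPv4 -> 6     */
    BPF_STMT(BPF_LD | BPF_B | BPF_ABS, 23),            /* 2: A = IP protocol   */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 6, 1, 0),      /* 3: TCP -> 5          */
    BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, 17, 0, 1),     /* 4: UDP -> 5, else 6  */
    BPF_STMT(BPF_RET | BPF_K, 0xFFFF),                 /* 5: accept            */
    BPF_STMT(BPF_RET | BPF_K, 0),                      /* 6: reject            */
};

/* Hypothetical mini-interpreter: returns the program's result, or 0
 * (reject) on an out-of-range load or an opcode it does not know. */
static uint32_t run_filter(const struct sock_filter *prog, size_t n,
                           const uint8_t *pkt, size_t len)
{
    uint32_t a = 0;   /* the BPF accumulator */
    size_t pc;

    for (pc = 0; pc < n; pc++) {
        const struct sock_filter *in = &prog[pc];

        switch (in->code) {
        case BPF_LD | BPF_H | BPF_ABS:   /* 16-bit big-endian load */
            if ((size_t)in->k + 2 > len)
                return 0;
            a = ((uint32_t)pkt[in->k] << 8) | pkt[in->k + 1];
            break;
        case BPF_LD | BPF_B | BPF_ABS:   /* 8-bit load */
            if ((size_t)in->k >= len)
                return 0;
            a = pkt[in->k];
            break;
        case BPF_JMP | BPF_JEQ | BPF_K:  /* offsets are relative to pc + 1 */
            pc += (a == in->k) ? in->jt : in->jf;
            break;
        case BPF_RET | BPF_K:
            return in->k;
        default:
            return 0;
        }
    }
    return 0;
}

/* Hypothetical helper: build a dummy frame and run the program on it. */
static uint32_t classify(uint16_t ethertype, uint8_t ip_proto)
{
    uint8_t pkt[34] = {0};

    pkt[12] = (uint8_t)(ethertype >> 8);
    pkt[13] = (uint8_t)(ethertype & 0xFF);
    pkt[23] = ip_proto;
    return run_filter(tcp_or_udp, sizeof(tcp_or_udp) / sizeof(tcp_or_udp[0]),
                      pkt, sizeof(pkt));
}
```

The "or" shows up as two conditional jumps to the same accept
instruction inside one program, not as two programs.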
If the filter is too large for the Linux kernel to handle, the kernel
will reject the filter, so, instead of just completely preventing the
user from capturing, libpcap does the filtering in user space. That is
not a bug; that is a feature, and will not change.
If the Linux kernel is rejecting the BPF program for a filter you're
using, you need to change the kernel to support bigger filters.
That's not a libpcap issue.
Hence it means that the filtering is still not done in kernel space
but in user space!
It means that the filtering is done in user space *IF THE FILTER
PROGRAM IS TOO BIG FOR THE LINUX KERNEL*, rather than just returning
an error and preventing the user from doing the capture they're trying
to do at all. It does *NOT* mean that we always do filtering in user
space.
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.