Re: [tcpdump-workers] About pcap rules

Jefferson Ogata Thu, 24 Aug 2006 20:50:28 -0700

On 2006-08-21 15:47, Alexander Dupuy wrote:
>>> when given a rule consisting of a set of sub rules to pcap,  if a packet 
>>> matches the rule, how do I know which sub rule it matches? 
> 
>> libpcap will not tell you that.  As far as it's concerned - and as far
>> as the kernel is concerned, on those platforms where the packet
>> filtering is done in the kernel - there are no subrules, there's just 
>> one big program that either says "matches" or "doesn't match".
> 
> If you're willing to dive below the libpcap interface and generate a custom 
> BPF program, you may be able to distinguish subrules, since the final result 
> is actually not just "matches" or "doesn't match" but rather how many bytes 
> to capture, from 0 to 64K.
> 
> If you know that all traffic of interest will be at least say 40 bytes you 
> can have a BPF program that captures 38 bytes for one subrule and 39 bytes 
> for another. This won't work, obviously, if you need to capture the entire 
> packet, or if packet lengths shorter than your BPF program returns are 
> observed. It's also a bit tricky to do this coding, and you may want to rely 
> on the Linux "any" interface so that a single BPF program would work 
> regardless of the actual NIC interface type. (if you are using Linux).
> 
> You can use tcpdump -d to see the BPF programs generated from pcap 
> expressions, which helps, but this definitely qualifies as a very advanced 
> libpcap hack, and unless the performance gains will be significant, this 
> approach is probably unwise to use. I myself have considered this for a 
> particular application, but have never actually implemented it.


Some years ago, I actually implemented a mechanism in libpcap to handle
this sort of scenario. I can't share the code for various reasons, but I
bring it up to point out a couple of issues you might run into if you do
delve more deeply into it.

First of all, BPF short-circuits its tests when it finds a successful
match. Thus if you are using

    host foo or host bar

and you give it a packet sent from host foo to host bar, you will get a
match on one of these subrules (which subrule may depend on the
optimizer) and the other rule will never be examined, because, after
all, the packet already matches the overall rule. That is, using the
approach Alexander describes, you may be able to find out which subrule
the packet matched /first/, where order is determined in part by the
optimizer, but not which other subrules the packet might have matched if
you kept looking. While it may be useful to know that the packet matches
"host foo", in most cases where you care about subrules, you would also
like to know whether the packet matches "host bar", and you don't get to
know both facts. So if you are interested in knowing /all/ of the
subrules a packet matches, you're out of luck unless you take a
radically different approach.
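
To make the short-circuit behaviour concrete, here is a minimal Python
sketch. It is a toy model, not libpcap or BPF code: the dict "packets",
the (label, predicate) subrule pairs, and the name first_match are all
assumptions made up for illustration. The list order stands in for
whatever order the optimizer happens to pick.

```python
# Illustrative model only: packets are plain dicts and each subrule is a
# (label, predicate) pair. None of these names come from libpcap.
SUBRULES = [
    ("host foo", lambda pkt: "foo" in (pkt["src"], pkt["dst"])),
    ("host bar", lambda pkt: "bar" in (pkt["src"], pkt["dst"])),
]

def first_match(pkt, subrules=SUBRULES):
    """Mimic a short-circuiting 'or': stop at the first subrule that matches."""
    for label, predicate in subrules:
        if predicate(pkt):
            return label  # the remaining subrules are never examined
    return None

pkt = {"src": "foo", "dst": "bar"}   # matches both subrules
print(first_match(pkt))                  # reports only one of them
print(first_match(pkt, SUBRULES[::-1]))  # a different ordering reports the other
```

The packet matches both subrules, yet each call can name only one of
them, and which one depends purely on evaluation order.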

If you don't want to build BPF programs from scratch, you can add a
comma operator to the pcap compiler. Then when you want to evaluate
subrules, you use comma in place of "or". For example,

    host foo, host bar

would test "host foo" first, then continue to evaluate "host bar"
regardless of the outcome of the first test.
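
The intended semantics of the comma can be sketched in the same toy
style. Again, this is only a model under assumed names (comma_eval and
the dict packets are hypothetical); it says nothing about how the
operator would actually be wired into the pcap grammar or code
generator.

```python
# Illustrative model only: subrules are (label, predicate) pairs over dict
# "packets". The names below are hypothetical, not part of libpcap.
SUBRULES = [
    ("host foo", lambda pkt: "foo" in (pkt["src"], pkt["dst"])),
    ("host bar", lambda pkt: "bar" in (pkt["src"], pkt["dst"])),
]

def comma_eval(pkt, subrules=SUBRULES):
    """Mimic the comma operator: evaluate every subrule, never stop early."""
    results = []
    for label, predicate in subrules:
        results.append((label, predicate(pkt)))  # always runs, match or not
    return results

print(comma_eval({"src": "foo", "dst": "bar"}))
print(comma_eval({"src": "foo", "dst": "baz"}))
```

Unlike the short-circuiting "or", every subrule produces a result for
every packet.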

This doesn't buy you much without some method of communicating the
subrule results outside the BPF engine. The method of returning a
distinct value per subrule doesn't work when you want to detect multiple
subrule matches. One method that works is to add a callback opcode to
the BPF engine and an appropriate operator to the pcap compiler.
(Naturally, this only works if you execute the BPF engine in userland.)
So then you would have something like:

    host foo and call 1, host bar and call 2

meaning if the packet matches "host foo", the engine should then call
out to your callback handler with the argument 1, then proceed to test
"host bar" and call out with argument 2 if the packet matches the second
subrule. Because of the comma operator, if the packet matches both
subrules, you'll get both callbacks.
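
A rough Python sketch of the idea follows. The instruction tuples are
only loosely modelled on BPF's (opcode, jt, jf, k) layout; the opcodes
JHOST and CALL, the run function, and the hand-assembled program are
all hypothetical and do not correspond to real BPF encodings or to the
code I wrote. The fixed "ret 0" at the end is also an assumption, since
in this sketch the results travel through the callback rather than the
return value.

```python
# Toy filter engine. Opcodes here, including CALL, are hypothetical.
JHOST, CALL, RET = "jhost", "call", "ret"

def run(program, pkt, callback):
    """Execute a toy program against a dict 'packet'; return the RET operand."""
    pc = 0
    while True:
        op, jt, jf, k = program[pc]
        pc += 1
        if op == JHOST:    # relative jump: jt if either address equals k
            pc += jt if k in (pkt["src"], pkt["dst"]) else jf
        elif op == CALL:   # report subrule number k, then keep executing
            callback(k)
        elif op == RET:
            return k
        else:
            raise ValueError("unknown opcode: %r" % (op,))

# Hand-assembled form of: host foo and call 1, host bar and call 2
PROGRAM = [
    (JHOST, 0, 1, "foo"),  # no match: skip the following CALL
    (CALL,  0, 0, 1),
    (JHOST, 0, 1, "bar"),
    (CALL,  0, 0, 2),
    (RET,   0, 0, 0),
]

hits = []
run(PROGRAM, {"src": "foo", "dst": "bar"}, hits.append)
print(hits)  # subrule numbers reported for this packet
```

Because a failed test only jumps over its own CALL and then falls
through to the next subrule, one pass over the program reports every
subrule that matched.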

An alternative way to communicate subrule results would be to use the
existing BPF instructions to store flags directly into the packet data,
for example in the Ethernet header, if you don't care about its
contents. This is far more kludgy, but you /might/ be able to escape
the BPF-in-userland requirement using this technique.

Yet another option, going back to userland, is to compile each of your
subrules independently, save the BPF program for each subrule, and then
execute every BPF program against each packet in turn. This is the most
programmatically clean way to do it, though much less efficient in most
cases because the optimizer can't factor out common code in all of the
subrules.
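
With real libpcap this would presumably mean one pcap_compile() call
per subrule and one bpf_filter() run per program per packet. The Python
sketch below only models the cost side of that trade-off. The
compile_subrule function is a toy stand-in that understands nothing
beyond "tcp", "udp" and "host NAME" joined by "and"; it and the other
names are assumptions for illustration. The counter shows the shared
"tcp" test being repeated in every program.

```python
def compile_subrule(expr):
    """Toy stand-in for a compiler: 'tcp and host foo' -> list of tests."""
    tests = []
    for term in expr.split(" and "):
        words = term.split()
        if words[0] == "host":
            name = words[1]
            tests.append(lambda p, name=name: name in (p["src"], p["dst"]))
        else:
            proto = words[0]
            tests.append(lambda p, proto=proto: p["proto"] == proto)
    return tests

def run_program(tests, pkt, stats):
    """Run one subrule's tests, counting each primitive test executed."""
    for test in tests:
        stats["tests"] += 1
        if not test(pkt):
            return False
    return True

def matching_subrules(programs, pkt, stats):
    """Run every independently compiled program against the packet."""
    return [expr for expr, tests in programs.items()
            if run_program(tests, pkt, stats)]

exprs = ["tcp and host foo", "tcp and host bar"]
programs = {e: compile_subrule(e) for e in exprs}
stats = {"tests": 0}
pkt = {"proto": "tcp", "src": "foo", "dst": "bar"}
print(matching_subrules(programs, pkt, stats), stats)
```

Each packet is checked for "tcp" once per program, which is exactly the
common work a single combined program could have factored out.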

My hack was to add comma and callback operators to the pcap compiler,
implement a callback opcode in the BPF engine, and do the packet
inspection in userland. If I did it again, I might do it differently,
but it works.

My main point in all this, however, is that when you start digging, the
question of "which subrule" is somewhat more subtle than it might seem
at first.

-- 
Jefferson Ogata <[EMAIL PROTECTED]>
NOAA Computer Incident Response Team (N-CIRT) <[EMAIL PROTECTED]>
"Never try to retrieve anything from a bear."--National Park Service
