On 2006-08-21 15:47, Alexander Dupuy wrote:
>>> when given a rule consisting of a set of sub rules to pcap, if a packet
>>> matches the rule, how do I know which sub rule it matches?
>
>> libpcap will not tell you that. As far as it's concerned - and as far
>> as the kernel is concerned, on those platforms where the packet
>> filtering is done in the kernel - there are no subrules, there's just
>> one big program that either says "matches" or "doesn't match".
>
> If you're willing to dive below the libpcap interface and generate a custom
> BPF program, you may be able to distinguish subrules, since the final result
> is actually not just "matches" or "doesn't match" but rather how many bytes
> to capture, from 0 to 64K.
>
> If you know that all traffic of interest will be at least say 40 bytes you
> can have a BPF program that captures 38 bytes for one subrule and 39 bytes
> for another. This won't work, obviously, if you need to capture the entire
> packet, or if packet lengths shorter than your BPF program returns are
> observed. It's also a bit tricky to do this coding, and you may want to rely
> on the Linux "any" interface so that a single BPF program would work
> regardless of the actual NIC interface type. (if you are using Linux).
>
> You can use tcpdump -d to see the BPF programs generated from pcap
> expressions, which helps, but this definitely qualifies as a very advanced
> libpcap hack, and unless the performance gains will be significant, this
> approach is probably unwise to use. I myself have considered this for a
> particular application, but have never actually implemented it.

Some years ago, I actually implemented a mechanism in libpcap to handle
this sort of scenario. I can't share the code for various reasons, but I
bring it up to point out a couple of issues you might run into if you do
delve more deeply into it.

First of all, BPF short-circuits its tests when it finds a successful
match. Thus if you are using

    host foo or host bar

and you give it a packet sent from host foo to host bar, you will get a
match on one of these subrules (which subrule may depend on the
optimizer) and the other rule will never be examined, because, after
all, the packet already matches the overall rule.

That is, using the approach Alexander describes, you may be able to find
out which subrule the packet matched /first/, where order is determined
in part by the optimizer, but not which other subrules the packet might
have matched if you kept looking. While it may be useful to know that
the packet matches "host foo", in most cases where you care about
subrules, you would also like to know whether the packet matches "host
bar", and you don't get to know both facts.

So if you are interested in knowing /all/ of the subrules a packet
matches, you're out of luck unless you take a radically different
approach.

If you don't want to build BPF programs from scratch, you can add a
comma operator to the pcap compiler. Then when you want to evaluate
subrules, you use comma in place of "or". For example,

    host foo, host bar

would test "host foo" first, then continue to evaluate "host bar"
regardless of the outcome of the first test.

This doesn't buy you much without some method of communicating the
subrule results outside the BPF engine. The method of returning a
distinct value per subrule doesn't work when you want to detect multiple
subrule matches.

One method that works is to add a callback opcode to the BPF engine and
an appropriate operator to the pcap compiler. (Naturally, this only
works if you execute the BPF engine in userland.)
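
To make the difference between "or" and comma-plus-callback evaluation
concrete, here is a toy model in plain C. It is only a sketch: it is not
libpcap or BPF code, and it is not the implementation mentioned above.
The packet struct, the addresses standing in for foo and bar, and every
function name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a packet: only IPv4 source and destination addresses. */
struct pkt {
    uint32_t src;
    uint32_t dst;
};

/* Hypothetical addresses standing in for "foo" and "bar". */
#define HOST_FOO 0x0a000001u /* 10.0.0.1 */
#define HOST_BAR 0x0a000002u /* 10.0.0.2 */

typedef int (*subrule_fn)(const struct pkt *);

/* "host X" matches when X is either the source or the destination. */
static int host_foo(const struct pkt *p)
{
    return p->src == HOST_FOO || p->dst == HOST_FOO;
}

static int host_bar(const struct pkt *p)
{
    return p->src == HOST_BAR || p->dst == HOST_BAR;
}

static const subrule_fn subrules[] = { host_foo, host_bar };
#define NSUBRULES (sizeof subrules / sizeof subrules[0])

/* "or" semantics: stop at the first subrule that matches.
 * Returns its 1-based index, or 0 if nothing matched. */
static int match_first(const struct pkt *p)
{
    for (size_t i = 0; i < NSUBRULES; i++)
        if (subrules[i](p))
            return (int)i + 1;
    return 0;
}

typedef void (*match_cb)(int id, void *arg);

/* Comma plus "call N" semantics: evaluate every subrule and invoke the
 * callback once per match. Returns the number of subrules that matched. */
static int match_all(const struct pkt *p, match_cb cb, void *arg)
{
    int n = 0;
    for (size_t i = 0; i < NSUBRULES; i++) {
        if (subrules[i](p)) {
            cb((int)i + 1, arg);
            n++;
        }
    }
    return n;
}

/* Example callback: set bit (id - 1) in the caller's unsigned mask. */
static void record(int id, void *arg)
{
    assert(id >= 1 && id <= 32);
    *(unsigned *)arg |= 1u << (id - 1);
}
```

For a packet from foo to bar, match_first() returns 1 and never looks at
the second subrule, while match_all() with record() sets both bits in
the mask (0x3). In this toy the order is fixed by the array; in real BPF
the optimizer can reorder the tests, so even "first" is not guaranteed
to be the first subrule as written.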

So then you would have something like:

    host foo and call 1, host bar and call 2

meaning if the packet matches "host foo", the engine should then call
out to your callback handler with the argument 1, then proceed to test
"host bar" and call out with argument 2 if the packet matches the second
subrule. Because of the comma operator, if the packet matches both
subrules, you'll get both callbacks.

An alternate way to communicate subrule results would be to use the
existing BPF instructions to store flags directly into the packet data,
for example in the ethernet header where you might not care. This is way
more klugey but you /might/ get away from the BPF-in-userland
requirement using this technique.

Yet another option, going back to userland, is to compile each of your
subrules independently, save the BPF program for each subrule, and then
execute every BPF program against each packet in turn. This is the most
programmatically clean way to do it, though much less efficient in most
cases because the optimizer can't factor out code common to the
subrules.

My hack was to add a comma and callback operator to the pcap compiler
and implement a callback opcode in the BPF engine, and do the packet
inspection in userland. If I did it again, I might do it differently,
but it works.

My main point in all this, however, is that when you start digging, the
question of "which subrule" is somewhat more subtle than it might seem
at first.

--
Jefferson Ogata <[EMAIL PROTECTED]>
NOAA Computer Incident Response Team (N-CIRT) <[EMAIL PROTECTED]>
"Never try to retrieve anything from a bear."--National Park Service
-
This is the tcpdump-workers list.
Visit https://cod.sandelman.ca/ to unsubscribe.