On Fri, Dec 13, 2002 at 08:31:21AM -0000, Richard Urwin wrote:
> What investment do we have in the libpcap parser?

What do you mean by "investment"?

There *is* a fair bit invested in the libpcap parser and code generator
by libpcap developers (of which I am one); I am unconvinced that it
would be a worthwhile investment of our time to attempt to duplicate -
and match the further investment in - the code generator.

The parser, itself (in the sense of "grammar.y" and "scanner.l"), is
another matter.  That's the easy part.

> One option would be to
> build our own capture filter parser somewhat closer to the display
> filter syntax.

As noted, there's more involved than just a parser.

It wouldn't be hard to build the parser.  The question then is what it
would produce.

There are two options I see, at present:

        1) producing BPF code;

        2) producing a libpcap-format capture filter string.

1) involves duplicating the effort that's gone into libpcap - and
continuing to duplicate that effort.  I'm not sure that's worthwhile.

2) doesn't involve duplicating that effort; the major problem is that we
might have to use the

          expr relop expr
               True if the relation holds, where relop is one  of
               >,  <,  >=,  <=,  =, !=, and expr is an arithmetic
               expression   composed   of    integer    constants
               (expressed  in  standard  C  syntax),  the  normal
               binary operators [+, -, *,  /,  &,  |],  a  length
               operator,  and  special packet data accessors.  To
               access data inside the packet, use  the  following
               syntax:
                    proto [ expr : size ]
               Proto is one of ether, fddi, ip, arp,  rarp,  tcp,
               udp, or icmp, and indicates the protocol layer for
               the index operation.  The byte offset, relative to
               the  indicated  protocol  layer, is given by expr.
               Size is optional and indicates the number of bytes
               in  the  field  of interest; it can be either one,
               two, or four, and defaults  to  one.   The  length
               operator,  indicated by the keyword len, gives the
               length of the packet.

feature in the generated string, but, in current versions of libpcap,
there's a leak in the code generator that can cause it, in some cases,
to forget that BPF "registers" are no longer being used, so sufficiently
complicated expressions might produce errors *and* all subsequent
compiles in the same program would also produce errors, even if those
later expressions aren't too complicated.
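
To make the "expr relop expr" case concrete, here is a minimal sketch (in
Python, purely for illustration) of turning a display-filter-style field
comparison into a string that uses the "proto [ expr : size ]" accessor.
The FIELD_LAYOUT table and the function names are hypothetical; only the
accessor syntax and the header offsets come from the man page text above
and the standard IPv4/TCP header layouts.

```python
# Hypothetical table: display-filter field name ->
# (protocol layer, byte offset within that layer, field size in bytes).
FIELD_LAYOUT = {
    "ip.ttl": ("ip", 8, 1),      # IPv4 time-to-live
    "ip.len": ("ip", 2, 2),      # IPv4 total length
    "tcp.flags": ("tcp", 13, 1)  # TCP flags byte
}

VALID_SIZES = (1, 2, 4)                     # sizes the accessor allows
RELOPS = (">", "<", ">=", "<=", "=", "!=")  # relops listed in the man page


def accessor(proto, offset, size=1):
    """Build a 'proto[offset:size]' packet data accessor."""
    if size not in VALID_SIZES:
        raise ValueError("size must be one, two, or four")
    if size == 1:
        # Size defaults to one, so it can be omitted.
        return f"{proto}[{offset}]"
    return f"{proto}[{offset}:{size}]"


def relation(field, relop, value):
    """Translate 'field relop value' into an 'expr relop expr' string."""
    if relop == "==":
        relop = "="  # map display-filter equality to the documented relop
    if relop not in RELOPS:
        raise ValueError(f"unsupported relop: {relop}")
    proto, offset, size = FIELD_LAYOUT[field]
    return f"{accessor(proto, offset, size)} {relop} {value}"


print(relation("ip.ttl", "<", 64))     # ip[8] < 64
print(relation("ip.len", "==", 1500))  # ip[2:2] = 1500
```

Strings like these are the ones that would exercise the code generator's
"register" allocation, so they are the ones at risk from the leak.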

However:

        1) I fixed that register leak in the current CVS version of
           libpcap (and 0.8 might be released early next year; it would
           include that fix);

        2) that might not be *too* severe a problem, especially for
           expressions that could be compiled into expressions not using
           that feature (e.g., "ip.addr == XXX" would compile into "ip
           host XXX").
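
As a rough illustration of point 2, a translator could recognize simple
comparisons and emit the equivalent libpcap primitive instead of a packet
data accessor.  This is only a sketch: the SHORTCUTS table, the pattern,
and the function name are assumptions made for the example, not existing
code.

```python
import re

# Hypothetical mapping from display-filter fields to libpcap primitives
# that need no "proto [ expr : size ]" accessor.
SHORTCUTS = {
    "ip.addr": "ip host",
    "tcp.port": "tcp port",
    "udp.port": "udp port",
}

# Matches "<proto>.<field> == <value>" with optional surrounding whitespace.
SIMPLE_COMPARISON = re.compile(r"^\s*([a-z]+\.[a-z]+)\s*==\s*(\S+)\s*$")


def translate_simple(expr):
    """Return a libpcap primitive for expr, or None if no shortcut applies."""
    match = SIMPLE_COMPARISON.match(expr)
    if match is None:
        return None
    primitive = SHORTCUTS.get(match.group(1))
    if primitive is None:
        return None
    return f"{primitive} {match.group(2)}"


print(translate_simple("ip.addr == 10.0.0.1"))  # ip host 10.0.0.1
print(translate_simple("tcp.port == 80"))       # tcp port 80
```

Anything that returns None here would have to go through the more general
"expr relop expr" path.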

We would, of course, want to take any string that fails to parse in our
syntax and hand it directly to "pcap_compile()", so that libpcap-format
strings continue to work.
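
The fallback could look something like the sketch below.  The parser here
is a deliberately tiny stand-in that accepts only "ip.addr == X"; its name
and the exception type are hypothetical.  The point is just the control
flow: if our parser rejects the string, the original text is passed
through unchanged, to be handed to "pcap_compile()" by the caller.

```python
class OurSyntaxError(Exception):
    """Raised by the stand-in parser when text is not in our syntax."""


def parse_our_syntax(text):
    """Stand-in parser: accepts only 'ip.addr == X' (illustration only)."""
    parts = text.split()
    if len(parts) == 3 and parts[0] == "ip.addr" and parts[1] == "==":
        return "ip host " + parts[2]
    raise OurSyntaxError(text)


def filter_for_pcap(text):
    """Return the string that should be given to pcap_compile()."""
    try:
        return parse_our_syntax(text)
    except OurSyntaxError:
        # Not our syntax: assume it is a libpcap-format string and let
        # libpcap itself accept or reject it.
        return text


print(filter_for_pcap("ip.addr == 10.0.0.1"))  # ip host 10.0.0.1
print(filter_for_pcap("tcp port 80"))          # tcp port 80
```

One consequence of this design is that a string with a typo in our syntax
gets libpcap's error message rather than ours.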
