Re: faster fast grep
On Wed, May 01, 2013 at 04:28, Jérémie Courrèges-Anglas wrote:
> Ted Unangst writes:
>
>> For simple patterns, grep has an optimization to avoid regex and run
>> about 50% faster. The problem is its idea of simple patterns is too
>> simple.
>
> IIUC the idea is to optimize for a lazy user that didn't want to
> type grep -F or fgrep:

More or less. Note that it does support . wildcards unlike -F. I also
have scripts that run grep where it would be annoying to get -F to grep
when necessary.

> The following diff also tests for ']', ')' and '}' so that unbalanced
> use of those can be caught by regcomp if the latter happens to become
> stricter.
>
> Regress passes; does it seem OK?

I think that's a good improvement.
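A minimal sketch of the kind of test being discussed, not the actual grep(1) source: the helper name `is_simple_pattern()` and the exact metacharacter set are assumptions made for illustration. A pattern takes the fast path only if it contains no regex metacharacter other than '.', and the closing ']', ')' and '}' are rejected too, so that unbalanced ones still reach regcomp(3). For simplicity this sketch also sends the anchors '^' and '$' to regcomp.

```c
#include <string.h>

/*
 * Hypothetical helper, not taken from grep(1): returns 1 if the
 * pattern could be matched without the regex engine, 0 otherwise.
 * '.' is deliberately allowed as a single-character wildcard.
 */
static int
is_simple_pattern(const char *pat)
{
	/* Characters that force the regex path, closers included. */
	static const char special[] = "\\^$*+?|[](){}";

	/* An empty pattern is left to regcomp(3) in this sketch. */
	if (*pat == '\0')
		return (0);

	/* strcspn() stops at the first special character, if any. */
	return (pat[strcspn(pat, special)] == '\0');
}
```

With this definition, `is_simple_pattern("foo.bar")` returns 1, while `is_simple_pattern("foo]")` and `is_simple_pattern("a*b")` return 0.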
Re: Test needed: ehci(4) suspend/resume rework
On Sun, Apr 28, 2013 at 15:44, Martin Pieuchot wrote:
> Diff below is a rework of the suspend/resume logic in ehci(4).
>
> In case this diff doesn't help or if you have a problem when resuming,
> I left an "#ifdef 0" block in the DVACT_RESUME. Try enabling it and tell
> me if it changes something.

Got around to testing this. Now everything works. It still prints
ehci_idone about 100 times after resume, but it doesn't print it forever.
I'd say the diff works, but only with the reset in the resume case as well.
Re: DPI for pf(4)
On May 1, 2013, at 9:41 AM, Stuart Henderson wrote:

> I should have expanded the acronym to make it clear - osfp i.e. the
> OS fingerprinting code (pf_osfp.c).

Oh, sorry, my mistake. This I can comment on. :)

The idea is the same. I'd say at this stage osfp has more complexity
due to parsing the TCP header, splitting fields, pulling in external
descriptions, etc. Looking beyond the headers is far less structured,
because applications do the structuring on their own, which in turn
makes external descriptions hard to, er, describe -- hence the
hard-wired C approach. The only "complexity" is the growing number of
application descriptions, but each application function is completely
isolated.

Here's the DPI hook function (a bit simplified for the context of this
discussion):

li_get(const struct li_packet *packet, const struct li_flow *flow)
{
	unsigned int i;

	if (!packet->app_len) {
		return (LI_UNKNOWN);
	}

	for (i = 0; i < lengthof(apps); ++i) {
		if ((apps[i].p1 == flow->type) ||
		    (apps[i].p2 == flow->type)) {
			if (apps[i].function(packet, flow)) {
				return (apps[i].number);
			}
		}
	}

	/*
	 * Set 'undefined' right away. Only one chance for
	 * each side of the flow. This makes it easier for
	 * a rules engine to do negation of policies.
	 */
	return (LI_UNDEFINED);
}

apps is an array of all of the available application functions.
It looks something like this:

static const struct li_apps apps[] = {
	LI_LIST_APP(LI_PPTP, pptp, IPPROTO_TCP, IPPROTO_GRE),
	LI_LIST_APP(LI_HTTP, http, IPPROTO_TCP, IPPROTO_MAX),
	/* more stuff here */
};

Really, that's all there is to it.

> So another example might be: "pass proto tcp app $someapp divert-packet
> $someproxy", with $someproxy handling the second stage?

Yes, that looks reasonable. "proto tcp" may be zapped as well.

If we are talking use cases, the biggest ones would be traffic shaping
and policy enforcement in general (no SMTP to the outside, blocking
non-TLS stuff on port 443, etc.)

> Yes, this is clearly a less messy approach than opendpi ;)

I probably shouldn't say I worked for these guys a few years ago.
Nobody would believe me that I never touched the DPI code, but it's
the truth!

Franco
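The dispatch idea from this message can be tried out in userland. This is a self-contained sketch only. The struct layouts, the constants (`PROTO_TCP`, `PROTO_NONE`), the toy `li_http()` detector and the plain initializer standing in for `LI_LIST_APP` are all invented here and are not Franco's code. Only the names `li_get`, `li_packet`, `li_flow`, `app_len`, `type`, `p1`, `p2`, `function` and `number` come from the message above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* All types and constants below are stand-ins invented for this sketch. */
enum { LI_UNKNOWN, LI_UNDEFINED, LI_HTTP };
enum { PROTO_TCP = 6, PROTO_NONE = 255 };

struct li_packet {
	const uint8_t *raw;	/* start of the application payload */
	size_t app_len;		/* payload length in bytes */
};

struct li_flow {
	int type;		/* IP protocol of the flow */
};

struct li_app {
	unsigned int number;	/* value reported on a match */
	int p1, p2;		/* IP protocols the detector applies to */
	int (*function)(const struct li_packet *, const struct li_flow *);
};

/* Toy detector (assumption): payload starts with "GET ". */
static int
li_http(const struct li_packet *packet, const struct li_flow *flow)
{
	(void)flow;
	return (packet->app_len >= 4 && memcmp(packet->raw, "GET ", 4) == 0);
}

static const struct li_app apps[] = {
	{ LI_HTTP, PROTO_TCP, PROTO_NONE, li_http },
};

static unsigned int
li_get(const struct li_packet *packet, const struct li_flow *flow)
{
	size_t i;

	if (packet->app_len == 0)
		return (LI_UNKNOWN);	/* nothing to inspect yet */

	for (i = 0; i < sizeof(apps) / sizeof(apps[0]); i++) {
		/* Skip detectors registered for other IP protocols. */
		if (apps[i].p1 != flow->type && apps[i].p2 != flow->type)
			continue;
		if (apps[i].function(packet, flow))
			return (apps[i].number);
	}
	return (LI_UNDEFINED);		/* inspected, no detector matched */
}
```

Each detector only sees the packet and flow it is handed, so adding an application means adding one function and one table row, which matches the isolation claim made in the message.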
Re: DPI for pf(4)
On Tue, Apr 30, 2013 at 07:14:50PM -0400, Ted Unangst wrote:
> On Wed, May 01, 2013 at 00:16, Franco Fichtner wrote:
> > Yes, I am proposing a lightweight approach: hard-wired regex-like
> > code, no allocations, no reassembly or state machines. I've seen
> > far worse things being put into Kernels and I assure you that I do
> > refrain from putting in anything that could cause segmentation
> > faults, sleeps, or other non-suitable behaviour.
>
> > And talking about complexity: 1000 LOC for 25 protocols. I'm afraid
> > it can't be simplified any more than this.
>
> Well, it's really hard to comment on code we can't see.
>
> My thoughts on the matter have always been that it would be cool to
> integrate bpf into pf (though other developers surely have other
> opinions). Then you get filtering for as many protocols as you care to
> write bpf matchers for.

My first thought was: why not have something like what squid does with
ICAP? You could forward some inspection to another app, which would
return some agreed data (a tag), and then you could work with that tag
in pf rules.
Re: DPI for pf(4)
On 2013/05/01 09:01, Franco Fichtner wrote:
> Hi Stuart,
>
> On May 1, 2013, at 1:11 AM, Stuart Henderson wrote:
>
> > On 2013/05/01 00:16, Franco Fichtner wrote:
> >>
> >> Yes, I am proposing a lightweight approach: hard-wired regex-like
> >> code, no allocations, no reassembly or state machines. I've seen
> >> far worse things being put into Kernels and I assure you that I do
> >> refrain from putting in anything that could cause segmentation
> >> faults, sleeps, or other non-suitable behaviour.
> >
> > Would it be fair to describe it as a bit more complex than osfp,
> > but not hugely so?
>
> Not sure if that's a fitting comparison; and I know too little OSPF
> to answer.

I should have expanded the acronym to make it clear - osfp i.e. the
OS fingerprinting code (pf_osfp.c).

> Let me try another route. The logic consists of an array
> of application detection functions, which can be invoked via their
> respective IP types. There's 32 bits of external state for the
> table and a single hook into the application detection. And the
> detection for TLS/SSL3.0 follows. I have really tried to condense
> it down to the bare minimum.
>
> LI_DESCRIBE_APP(tls)
> {
> 	struct tls {
> 		uint8_t record_type;
> 		uint16_t version;
> 		uint16_t data_length;
> 	} __packed *ptr = (void *)packet->app.raw;
> 	uint16_t decoded;
>
> 	if (packet->app_len < sizeof(struct tls)) {
> 		return (0);
> 	}
>
> 	decoded = be16dec(&ptr->data_length);
>
> 	if (!decoded || decoded > 0x4000) {
> 		/* no empty records possible, also <= 2^14 */
> 		return (0);
> 	}
>
> 	switch (ptr->record_type) {
> 	case 20:	/* change_cipher_spec */
> 	case 21:	/* alert */
> 	case 22:	/* handshake */
> 	case 23:	/* application_data */
> 		break;
> 	default:
> 		return (0);
> 	}
>
> 	switch (be16dec(&ptr->version)) {
> 	case 0x0300:	/* SSL 3.0 */
> 	case 0x0301:	/* TLS 1.0 */
> 	case 0x0302:	/* TLS 1.1 */
> 	case 0x0303:	/* TLS 1.2 */
> 		break;
> 	default:
> 		return (0);
> 	}
>
> 	return (1);
> }

This type of thing looks sane to me, but others will want to comment.
(I'll point others at your posts at
http://lastsummer.de/category/technology/ too :-)

> >> Would a protocol like BGP have a bright future in relayd(8)?
> >> I don't know enough, maybe Reyk can clear this up?
> >>
> >> L7 filtering is cute, but ipfw-classifyd isn't maintained, DPI in
> >> Linux netfilter is not hitting it off, and there really is no
> >> BSD DPI. Frankly, I don't care which way to go, but I believe
> >> that pf(4) is a suitable candidate. I especially like the one-
> >> rule-to-rule-them-all approach. Adding a keyword "app" to
> >> pf.conf(5) seems like the simplest solution -- much like "proto"
> >> does deal with IP types.
> >>
> >> And talking about complexity: 1000 LOC for 25 protocols. I'm afraid
> >> it can't be simplified any more than this.
> >
> > What sort of protocols do you think could be reasonably handled by
> > this approach, and what would be too complicated?
>
> Good question! Text protocols are easy, RFCs and open implementations
> are generally easy. Anything too commercial/proprietary, especially
> in binary, is more guessing than anything else and may not be worth
> the effort. I don't see "world of warcraft" happening as a supported
> application. This is what I have done so far (by no means free of
> errors, though):
>
> -- BitTorrent
> -- Gnutella
> -- Network Basic Input Output System
> -- Telecommunication Network
> -- Hypertext Transfer Protocol
> -- Post Office Protocol (Version 3)
> -- Internet Message Access Protocol
> -- Simple Mail Transfer Protocol
> -- Session Traversal Utilities for NAT
> -- Dynamic Host Configuration Protocol
> -- Point-to-Point Tunneling Protocol
> -- Lightweight Directory Access Protocol
> -- Simple Network Management Protocol
> -- Secure Shell
> -- File Transfer Protocol
> -- Session Initiation Protocol
> -- Domain Name System
> -- Real-time Transport Control Protocol
> -- Real-time Transport Protocol
> -- Routing Information Protocol
> -- Border Gateway Protocol
> -- Internet Key Exchange
> -- Datagram Transport Layer Security
> -- Transport Layer Security
> -- Concurrent Versions System
>
> > There is definitely something appealing about being able to say, for
> > example, 'block proto tcp on port 443; pass proto tcp on port 443 app tls',
> > or 'block app ssh; pass proto tcp from to port 22 app ssh'
> > without a bunch more complexity involved in passing across to a separate
> > proxy (which would then need to implement its own completely separate
> > filtering and would, I think, not really be able to integrate with
> > things like PF tags and queue
Re: DPI for pf(4)
Hi Ted,

On May 1, 2013, at 1:14 AM, Ted Unangst wrote:

> On Wed, May 01, 2013 at 00:16, Franco Fichtner wrote:
>> Yes, I am proposing a lightweight approach: hard-wired regex-like
>> code, no allocations, no reassembly or state machines. I've seen
>> far worse things being put into Kernels and I assure you that I do
>> refrain from putting in anything that could cause segmentation
>> faults, sleeps, or other non-suitable behaviour.
>
>> And talking about complexity: 1000 LOC for 25 protocols. I'm afraid
>> it can't be simplified any more than this.
>
> Well, it's really hard to comment on code we can't see.

I understand. The code is hooked up to a library feeding off of
recorded network traces at the moment. The idea doesn't feel mature
enough to me at this time, not knowing where to put it. So there's
no point in releasing a half-done code blob that does nothing on its
own, but I'm willing to share it off-list with OpenBSD developers.

> My thoughts on the matter have always been that it would be cool to
> integrate bpf into pf (though other developers surely have other
> opinions). Then you get filtering for as many protocols as you care to
> write bpf matchers for.

You mean externalising the DPI? People[1] have tried to work on such
ideas, but the general drift is that there are not enough interested
individuals in the field to drive "second tier" development for
application detections. I find C to be quite flexible and empowering
if one doesn't overcomplicate[2].

Franco

[1] https://code.google.com/p/appid/source/browse/trunk/apps/aim
[2] https://github.com/fichtner/OpenDPI/blob/master/src/lib/protocols/ssl.c
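To give a feel for how small a hard-wired check can be when it is fed raw payload bytes from a recorded trace, here is a userland sketch of the TLS record test shown elsewhere in this thread. It is a rewrite for illustration, not the posted code. The names `read_be16()` and `tls_record_plausible()` are invented, and the header is parsed byte by byte so that the sketch does not depend on `be16dec()` or `__packed`. The accepted ranges (record types 20 to 23, versions 0x0300 to 0x0303, length 1 to 0x4000) mirror the posted function.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable stand-in for be16dec(): read a big-endian 16-bit value. */
static uint16_t
read_be16(const uint8_t *p)
{
	return ((uint16_t)((p[0] << 8) | p[1]));
}

/*
 * Returns 1 if buf starts with a plausible SSL 3.0 to TLS 1.2 record
 * header (type: 1 byte, version: 2 bytes, length: 2 bytes), else 0.
 */
static int
tls_record_plausible(const uint8_t *buf, size_t len)
{
	uint16_t version, length;

	if (len < 5)
		return (0);	/* header not complete */

	version = read_be16(buf + 1);
	length = read_be16(buf + 3);

	if (buf[0] < 20 || buf[0] > 23)
		return (0);	/* not change_cipher_spec..application_data */
	if (version < 0x0300 || version > 0x0303)
		return (0);	/* not SSL 3.0 .. TLS 1.2 */
	if (length == 0 || length > 0x4000)
		return (0);	/* empty, or above the 2^14 limit */
	return (1);
}
```

A harness would call this once per captured payload. The function allocates nothing and keeps no state between calls, which is the property argued for in the thread.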
Re: DPI for pf(4)
Hi Stuart,

On May 1, 2013, at 1:11 AM, Stuart Henderson wrote:

> On 2013/05/01 00:16, Franco Fichtner wrote:
>>
>> Yes, I am proposing a lightweight approach: hard-wired regex-like
>> code, no allocations, no reassembly or state machines. I've seen
>> far worse things being put into Kernels and I assure you that I do
>> refrain from putting in anything that could cause segmentation
>> faults, sleeps, or other non-suitable behaviour.
>
> Would it be fair to describe it as a bit more complex than osfp,
> but not hugely so?

Not sure if that's a fitting comparison; and I know too little OSPF
to answer. Let me try another route. The logic consists of an array
of application detection functions, which can be invoked via their
respective IP types. There's 32 bits of external state for the
table and a single hook into the application detection. And the
detection for TLS/SSL3.0 follows. I have really tried to condense
it down to the bare minimum.

LI_DESCRIBE_APP(tls)
{
	struct tls {
		uint8_t record_type;
		uint16_t version;
		uint16_t data_length;
	} __packed *ptr = (void *)packet->app.raw;
	uint16_t decoded;

	if (packet->app_len < sizeof(struct tls)) {
		return (0);
	}

	decoded = be16dec(&ptr->data_length);

	if (!decoded || decoded > 0x4000) {
		/* no empty records possible, also <= 2^14 */
		return (0);
	}

	switch (ptr->record_type) {
	case 20:	/* change_cipher_spec */
	case 21:	/* alert */
	case 22:	/* handshake */
	case 23:	/* application_data */
		break;
	default:
		return (0);
	}

	switch (be16dec(&ptr->version)) {
	case 0x0300:	/* SSL 3.0 */
	case 0x0301:	/* TLS 1.0 */
	case 0x0302:	/* TLS 1.1 */
	case 0x0303:	/* TLS 1.2 */
		break;
	default:
		return (0);
	}

	return (1);
}

>> Would a protocol like BGP have a bright future in relayd(8)?
>> I don't know enough, maybe Reyk can clear this up?
>>
>> L7 filtering is cute, but ipfw-classifyd isn't maintained, DPI in
>> Linux netfilter is not hitting it off, and there really is no
>> BSD DPI. Frankly, I don't care which way to go, but I believe
>> that pf(4) is a suitable candidate. I especially like the one-
>> rule-to-rule-them-all approach. Adding a keyword "app" to
>> pf.conf(5) seems like the simplest solution -- much like "proto"
>> does deal with IP types.
>>
>> And talking about complexity: 1000 LOC for 25 protocols. I'm afraid
>> it can't be simplified any more than this.
>
> What sort of protocols do you think could be reasonably handled by
> this approach, and what would be too complicated?

Good question! Text protocols are easy, RFCs and open implementations
are generally easy. Anything too commercial/proprietary, especially
in binary, is more guessing than anything else and may not be worth
the effort. I don't see "world of warcraft" happening as a supported
application. This is what I have done so far (by no means free of
errors, though):

-- BitTorrent
-- Gnutella
-- Network Basic Input Output System
-- Telecommunication Network
-- Hypertext Transfer Protocol
-- Post Office Protocol (Version 3)
-- Internet Message Access Protocol
-- Simple Mail Transfer Protocol
-- Session Traversal Utilities for NAT
-- Dynamic Host Configuration Protocol
-- Point-to-Point Tunneling Protocol
-- Lightweight Directory Access Protocol
-- Simple Network Management Protocol
-- Secure Shell
-- File Transfer Protocol
-- Session Initiation Protocol
-- Domain Name System
-- Real-time Transport Control Protocol
-- Real-time Transport Protocol
-- Routing Information Protocol
-- Border Gateway Protocol
-- Internet Key Exchange
-- Datagram Transport Layer Security
-- Transport Layer Security
-- Concurrent Versions System

> There is definitely something appealing about being able to say, for
> example, 'block proto tcp on port 443; pass proto tcp on port 443 app tls',
> or 'block app ssh; pass proto tcp from to port 22 app ssh'
> without a bunch more complexity involved in passing across to a separate
> proxy (which would then need to implement its own completely separate
> filtering and would, I think, not really be able to integrate with
> things like PF tags and queue assignment)...

Yes, that would be one scenario. I like to think of lightweight
packet inspection as application "tagging". That's the first stage.
The second stage is a real parser/proxy/endpoint. It's not security
functionality per se, but it can help to break down the workload.
It doesn't care about IP versions, ports (mostly ;) ), different
flavours ("netbios" could be session, datagram, and name service as
one, for example), and so forth.

> Basically what I'm wondering if it's possible to go far enough to be
> useful whilst keeping the c