Hi all,

On Wed, 18 Feb 2009, Paolo Lucente wrote:
> In concept, and as documentation says, what you want to achieve is
> feasible and your understanding of the classifier() is correct - you
> only have to write down your own patterns: re-phrased, regular
> expressions are typically employed to recognize protocols but they can
> be of course used to recognize virtual hosts when in presence of
> text-based protocols (ie. HTTP, FTP or POP3).
>
> As you said this is quite innovative and interesting - so let me know if
> i can support you somehow (feel also free to contact me privately). For
> now i have not received any feedback which can help you dimensioning the
> solution - so can't say how easy it would be to deploy in this sense;
> perhaps somebody reading can fill this gap?

I have thought about doing this as well. The main problem that I had
with using classifiers is that I would ultimately have to implement a
TCP engine to reassemble the stream from packets (perhaps the one in
snort can be borrowed?). Otherwise the Host: header could (accidentally
or deliberately) be split across multiple packets, and a per-packet
pattern would never see it whole.

There is plenty of opportunity for exploitation here as well, e.g.
multiple Host: headers, invalid characters in headers, packets that
look like HTTP requests in the middle of streams, bad Content-Lengths,
etc.

What I was planning to do, but have not done yet, is to:

* force everyone to use an HTTP proxy (transparent or not) so that
  dealing with malicious requests becomes someone else's problem;

* use the HTTP proxy's logging features to capture the full details of
  both connections (inbound to the proxy and outbound from the proxy)
  along with the requested URI and current time;

* save all this in a separate table in the database;

* left join from pmacct's acct_v* table to the proxy table on the
  unique quadruple (ip_src,ip_dst,src_port,dst_port) and time.

This was appropriate for my situation as I wanted everyone to use a
caching proxy anyway to save bandwidth, and hopefully to authenticate.
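To make the last step of the plan above concrete, here is a rough sketch
of the join, using an in-memory SQLite database from Python. Everything
in it is an assumption for illustration only: the "acct" table is a
cut-down stand-in for pmacct's acct_v* table, "proxy_log" is a
hypothetical table that would be filled from the proxy's logs, and the
60-second window stands in for whatever accounting time bucket is
actually configured.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Simplified stand-in for pmacct's acct_v* table (assumed columns).
        CREATE TABLE acct (
            ip_src TEXT, ip_dst TEXT, src_port INTEGER, dst_port INTEGER,
            bytes INTEGER, stamp_inserted TEXT
        );
        -- Hypothetical table populated from the proxy's logging.
        CREATE TABLE proxy_log (
            ip_src TEXT, ip_dst TEXT, src_port INTEGER, dst_port INTEGER,
            uri TEXT, logged_at TEXT
        );
    """)
    conn.executemany("INSERT INTO acct VALUES (?,?,?,?,?,?)", [
        ("10.0.0.5", "10.0.0.1", 40001, 3128, 5120, "2009-02-18 10:00:00"),
        ("10.0.0.6", "10.0.0.1", 40002, 3128, 2048, "2009-02-18 10:00:00"),
    ])
    conn.executemany("INSERT INTO proxy_log VALUES (?,?,?,?,?,?)", [
        ("10.0.0.5", "10.0.0.1", 40001, 3128,
         "http://example.org/", "2009-02-18 10:00:20"),
    ])
    return conn


# Left join on the connection quadruple plus a time window, so that
# flows with no matching proxy record are kept (with a NULL uri).
JOIN_SQL = """
SELECT a.ip_src, a.src_port, a.bytes, p.uri
FROM acct AS a
LEFT JOIN proxy_log AS p
  ON  p.ip_src   = a.ip_src   AND p.ip_dst   = a.ip_dst
  AND p.src_port = a.src_port AND p.dst_port = a.dst_port
  AND p.logged_at >= a.stamp_inserted
  AND p.logged_at <  datetime(a.stamp_inserted, '+60 seconds')
ORDER BY a.src_port
"""


def flows_with_uris(conn):
    """Return one row per accounted flow, with the URI when one matched."""
    return conn.execute(JOIN_SQL).fetchall()


if __name__ == "__main__":
    for row in flows_with_uris(build_demo_db()):
        print(row)
```

The time condition matters because source ports get reused: without it,
an old proxy record could be matched against a newer flow that happens
to have the same quadruple.
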
However, I discovered that Squid's logging formats do not provide all
the information that I needed to reliably match up the connection (no
client port, see
http://www.visolve.com/squid/squid30/logs.php#logformat). The external
ACL program does have enough information for this
(http://www.visolve.com/squid/squid30/externalsupport.php#external_acl_type),
so writing a program to run as an external ACL helper and log the
information to the database is a possibility.

In our case this was still not good enough, as it does not tell us
whether the request will be served from the cache or not, and therefore
does not correspond to the client's real bandwidth usage.

I would be very interested to see what you do in this space.

Cheers, Chris.
-- 
Aptivate | http://www.aptivate.org | Phone: +44 1223 760887
The Humanitarian Centre, Fenner's, Gresham Road, Cambridge CB1 2ES

Aptivate is a not-for-profit company registered in England and Wales
with company number 04980791.

_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists