A couple of notes on the parser:

- It is pretty lightweight and self-contained.
- This HTML parser can be used for a multitude of applications in
  Apache 2.0 filter modules. Such a filter can process content generated
  by Apache, or proxied content, and rewrite URLs embedded in HTML pages:

  a) Rewrite URLs on the fly for cookieless tracking and single sign-on
  b) Allow ProxyPassReverse to modify not only HTTP headers but also to
     look inside proxied content and rewrite hardcoded URLs
  c) Strip banners or malicious JavaScript before they reach the client
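
To make (b) a bit more concrete, here is a rough sketch of the idea. It does
not use el-kabong, whose API has not been posted in this thread; it uses
Python's standard html.parser module, which has a similar callback-driven
(SAX-like) style. The names UrlRewriter, rewrite_urls and URL_ATTRS and the
host names are made up for illustration, and the attribute list is
deliberately incomplete.

```python
from html import escape
from html.parser import HTMLParser

# Attributes that commonly carry URLs (illustrative, not exhaustive).
URL_ATTRS = {"href", "src", "action"}


class UrlRewriter(HTMLParser):
    """Re-emit HTML while parsing, rewriting a URL prefix in start tags."""

    def __init__(self, old_prefix, new_prefix):
        # Keep character references as-is so text passes through unchanged.
        super().__init__(convert_charrefs=False)
        self.old_prefix = old_prefix
        self.new_prefix = new_prefix
        self.out = []

    def _rewrite(self, name, value):
        # Only touch known URL attributes that start with the backend prefix.
        if (value is not None and name in URL_ATTRS
                and value.startswith(self.old_prefix)):
            return self.new_prefix + value[len(self.old_prefix):]
        return value

    def _emit_tag(self, tag, attrs, closing):
        parts = [tag]
        for name, value in attrs:
            value = self._rewrite(name, value)
            if value is None:
                parts.append(name)  # bare attribute, e.g. "disabled"
            else:
                # The parser unescapes attribute values, so escape again.
                parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + closing)

    def handle_starttag(self, tag, attrs):
        self._emit_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def rewrite_urls(html, old_prefix, new_prefix):
    """Return html with old_prefix replaced by new_prefix in URL attributes."""
    parser = UrlRewriter(old_prefix, new_prefix)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

A call such as rewrite_urls(page, "http://backend.internal/",
"http://www.example.com/") would rewrite matching href, src and action
values and leave ordinary text alone. A real Apache filter would also have
to cope with input arriving in chunks, which this sketch ignores.
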
Those are some possibilities that having a fast, lightweight parser allows.

Jon, maybe you can post the source somewhere so people can have a look and
play with it.

Daniel

On Mon, Aug 26, 2002 at 08:32:16PM -0700, Jon Travis wrote:
> Hi all...
> Jon Travis here...
>
> Covalent has written a pretty keen HTML parser (called el-kabong)
> which we'd like to offer to the ASF for inclusion in APR-util (or
> whichever other umbrella it fits under.) It's faster than
> anything I can find, provides a SAX stylee interface, uses
> APR for most of its operations (hash tables, etc.), and has a
> pretty nice testsuite. We use it in our code to re-write HTML on
> the fly. I would be the initial maintainer of the code.
>
> Please voice any interest, thanks.
>
> -- Jon

--
Teach Yourself Apache 2 -- http://apacheworld.org/ty24/