On Fri, 2002-09-27 at 04:37, Ian Holsman wrote:
> fabio rohrich wrote:
> > I'm going to develop this topic for thesis.
> > Has anybody of you any suggestion for it? Something to
> > add in the development (like compression of the string)
> > or some feature to implement!
> >
> > And, the last thing, what do you think about it?
> >
> > Thanks a lot,
> > Fabio
> >
> > - mod_blanks: a module for the Apache web server which
> > would on-the-fly remove unnecessary blank space, comments
> > and other non-interesting things from the served page.
> > Skills needed: the C language, a bit of text parsing
> > techniques, HTML, learn Apache API. Complexity: low to
> > moderate (after learning the API).
>
> I would disagree on this.
> We have an internal module which does this, and we have found
> that HTML in general is not as easy to strip as you would think.
> If you do do this, please make sure you test your module on a lot of
> different HTML out there, as well as multiple browsers.

This comment led me to another idea: how about plugging tidy
(http://tidy.sourceforge.net/) in there instead? If you tell it to, it will
not only strip blanks but also clean up the (X)HTML.

Just a thought...

Bojan