On 3 Dec 2001, Steve Mynott wrote:

> Obvious things are is the response 200 or 404 for requests for
> default.htm or index.html?

We're going in circles here, but if it's up to me, I like to tell Apache
to map all .htm files to .html, and in the case of home pages I redirect
requests for default.htm to index.html, just so that anyone used to one or
the other naming scheme will still be able to get to the documents. So you
might want to expand that to 200, 30x, or 404 responses...

> You could also get clever and pretend to be a web proxy since there are
> probably differences in the way these get treated.  Also try to find
> things which are due to basic server differences like threads or
> whatever (simultaneous requests?) 
 
Examples? Every Apache specific feature I can think of, I can think of a
way to make it act like IIS, at least superficially. I'm sure someone
cleverer than me can do the same more discreetly, or get IIS to act like
Apache (? maybe, dunno...). 

I think probabilistic measures are the way to go here. For each host, hit
it with a battery of tests & use that to come up with a hypothesis and a
confidence value for that hypothesis. What features could go into this
battery, and what rankings/weights should they get? Criteria to try, off
the top of my head (surely an incomplete list):

    * http header strings
    * 404 response
    * 3xx responses
    * response to requests for index.html
    * response to requests for default.htm
    * response to requests for foo.cgi
    * response to requests for foo.pl
    * response to requests for foo.py
    * response to requests for foo.php
    * response to requests for foo.exe
    * response to requests for foo.asp
    * response to requests for foo.htm
    * response to requests for foo.html
    * existence of a /cgi-bin/ directory

There are probably lots more like that last one, in particular. To each of
those criteria you can attach a weight -- 0.2, 0.05, etc -- with the
weights ideally summing to 1.0, and then just work out the weighted mean
of the per-test results or whatever.
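
To make that a bit more concrete, here is a rough Python sketch of the
battery-plus-weights idea. Everything in it is an assumption for
illustration: the names probe(), bucket() and confidence() are made up, the
"foo" paths are just the placeholders from the list above, and any weights
or scores you feed it would have to come from real measurements that I
don't have. It only shows the shape of the thing, not a working
fingerprinter.

```python
import http.client

# Paths taken from the criteria list above; "foo" is only a placeholder name.
PROBE_PATHS = [
    "/index.html", "/default.htm",
    "/foo.cgi", "/foo.pl", "/foo.py", "/foo.php",
    "/foo.exe", "/foo.asp", "/foo.htm", "/foo.html",
    "/cgi-bin/",
]


def bucket(status):
    """Collapse an HTTP status code into the classes discussed: 200, 30x, 404."""
    if status == 200:
        return "200"
    if 300 <= status < 400:
        return "30x"
    if status == 404:
        return "404"
    return "other"


def probe(host, port=80, timeout=5):
    """Request each path once; record the status class and the Server header."""
    results = {}
    for path in PROBE_PATHS:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        try:
            conn.request("GET", path)
            resp = conn.getresponse()
            results[path] = (bucket(resp.status), resp.getheader("Server"))
        except (OSError, http.client.HTTPException):
            # Network or protocol failure: note it and move on to the next path.
            results[path] = ("error", None)
        finally:
            conn.close()
    return results


def confidence(scores, weights):
    """Weighted mean of per-test scores.

    scores:  test name -> value in [0, 1], how strongly that test supports
             the hypothesis (hypothetical; how to score a test is left open).
    weights: test name -> weight. Dividing by the sum of the weights actually
             used means they do not have to add up to exactly 1.0.
    """
    total = sum(weights[name] for name in scores)
    if total == 0:
        raise ValueError("no weighted tests to combine")
    return sum(weights[name] * scores[name] for name in scores) / total
```

You would call probe() on a host, turn each result into a score between 0
and 1 for a given hypothesis (say, "this is IIS"), and pass those scores to
confidence() along with the weights. Turning raw responses into scores is
the hard part, and this sketch deliberately leaves it out.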


-- 
Chris Devers

"People with machines that think, will in times of crisis, 
make up stuff and attribute it to me" - "Nikla-nostra-debo"

