Hi,

Recently some large, badly behaved image-scraping sites started mangling their requests for the images on my site. I get thousands of requests for URLs like:

"/blah/blah/images/Image-5.jpg" width="128" height="49" alt="image"/></a> </div> <div class="c0 r"><a href="/m/imgres?..."

This is the request URI verbatim, not just a snippet of the HTML on the client side.

As a result, Google Webmaster Tools has been reporting tons of Soft 404 errors on my sites.

The problem is that I have mod_speling enabled, and I am using it only to correct the case of the URL:

        CheckSpelling On
        CheckCaseOnly On

For some reason the module doesn't behave correctly and returns a 300 response:

--------&<------------&<----------
http://example.com/blah/blah/images/Image-5.jpg%22%20width=%22128%22%20height=%2249%22%20alt=%22image%22/%3E%3C/a%3E%20%3C/div%3E%20%3Cdiv%20class=%22c0%20r%22%3E%3Ca%20href=%22/m/imgres?q=foobar

Multiple Choices
The document name you requested (/blah/blah/images/Image-5.jpg" width="128" height="49" alt="image"/></a> </div> <div class="c0 r"><a href="/m/imgres) could not be found on this server. However, we found documents with names similar to the one you requested.

Available documents:

/blah/blah/images/Image-5.jpg/></a> </div> <div class="c0 r"><a href="/m/imgres?q=foobar (common basename)
--------&<------------&<----------

The image /blah/blah/images/Image-5.jpg exists on the filesystem.
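
For anyone who wants to double-check what the server actually receives, here is a minimal Python sketch that percent-decodes the request URL quoted above and compares it with the document name in the 300 page. The helper names (decode_request, looks_like_html_leak) are just made up for this illustration, and the "quotes or angle brackets" check is only my assumption about what a legitimate image path on this site never contains:

```python
from urllib.parse import urlsplit, unquote

# The request URL exactly as quoted above, split across lines for readability.
REQUEST_URL = (
    "http://example.com/blah/blah/images/Image-5.jpg"
    "%22%20width=%22128%22%20height=%2249%22%20alt=%22image%22/%3E%3C/a%3E"
    "%20%3C/div%3E%20%3Cdiv%20class=%22c0%20r%22%3E%3Ca%20href=%22/m/imgres"
    "?q=foobar"
)


def decode_request(url):
    """Return (percent-decoded path, raw query string) for a URL."""
    parts = urlsplit(url)
    return unquote(parts.path), parts.query


def looks_like_html_leak(path):
    """Heuristic (assumption): real paths here never contain quotes or angle brackets."""
    return any(ch in path for ch in '"<>')


if __name__ == "__main__":
    path, query = decode_request(REQUEST_URL)
    print(path)                         # decoded path, as the server sees it
    print(query)                        # q=foobar
    print(looks_like_html_leak(path))   # True
```

The decoded path comes out identical to the document name mod_speling prints in the "Multiple Choices" page, so the scraped HTML really is part of the request path and only "?q=foobar" is treated as the query string.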

There are two problems here:

1. Why does it return a 300 response?
2. Why is the offered "available document" bogus?

Turning CheckSpelling off makes the server correctly report this as a 404. But then I lose the ability to correct the case of miscased URLs, which is another huge problem, as many clients don't respect the case-sensitivity of URLs.

What is the right course of action in this case? Is it a bug in mod_speling, or am I missing some other configuration?

Server version: Apache/2.2.23 (Unix)

I looked through the svn history and there haven't been any changes to the module in the last 5 years, so it's probably pointless to try the very latest version.

(The same problem exists on Apache 1.3.)

Thank you.



--
________________________________________________
Stas Bekman              http://stasosphere.com
http://stason.org        http://chestofbooks.com
http://vitalitylink.com  http://healingcloud.com

