On 28 Feb 2012, at 14:57, Rich Bowen wrote:

>> That's what robots.txt is for!  Surely we can use that to stop indexing 2.0 
>> as well as 1.3?  Maybe even 2.2 once 2.4 is windows-ready and in the distros?
> 
> The rel canonical thing is a way to actively update the Google index for a 
> particular page and search term, and has been very effective in updating 
> certain searches. For example, searching Google for "rewriterule" has long 
> given the 1.3 Rewrite Guide, but within 24 hours of adding a rel canonical 
> tag, it started pointing to the 2.2 mod_rewrite docs as the top hit.

I agree with Nick.
Why not change http://httpd.apache.org/robots.txt so that the 1.3 documents are 
no longer crawled? If I wanted to make more fine-grained changes by going 
through each page, I'd only end up adding:
<meta name="robots" content="noindex">

…which does almost exactly the same thing, for more effort.
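A minimal sketch of what that robots.txt change might look like (assuming the 1.3 documentation is served under /docs/1.3/, and that we want the rule to apply to all crawlers):

```
# Hypothetical addition to http://httpd.apache.org/robots.txt
User-agent: *
Disallow: /docs/1.3/
```

One line per version to retire, and nothing to touch in the individual pages.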


The ASF doesn't really need extra help getting the top Google / Bing / whatever 
hit for “httpd”, “Apache”, etc. That's why most people use Link: … 
rel="canonical": they want to preserve their PageRank.
But this discussion has been about the 1.3 docs having *too much* PageRank.
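For comparison, the rel canonical approach Rich describes can be expressed either as a tag in the page's head or as an HTTP response header (the specific 2.2 URL below is just an illustration):

```
<link rel="canonical"
      href="http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html">
```

or, equivalently, as a header:

```
Link: <http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html>; rel="canonical"
```

Either form tells the search engine which version of a duplicated page should receive the ranking, rather than hiding pages from the index outright.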


I can spot one downside: excluding a document via robots.txt also blocks 
access to historical versions through web.archive.org.
Is this important?

-- 
Tim Bannister – [email protected]
