[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2016-09-11 Thread Barry A. Warsaw

Barry A. Warsaw added the comment:

Two things: is it worth fixing this bug given the impending move to GitHub?
Also, why is this reported here and not on the pydotorg tracker?
https://github.com/python/pythondotorg/issues

Given that the last comment was in 2014, I'm going to go ahead and close this
issue.

--
nosy: +barry
resolution:  -> wont fix
status: open -> closed

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13924
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2014-06-09 Thread Antoine Pitrou

Antoine Pitrou added the comment:

Yes, I think we should whitelist rather than blacklist. The problem with 
letting engines index the repositories is the sheer resource cost when they 
fetch many heavy pages (annotate views, etc.).

--




[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2014-06-07 Thread Emily Zhao

Emily Zhao added the comment:

I don't know too much about robots.txt, but how about:

Disallow: */rev/*
Disallow: */shortlog/*
Allow:

Are there any other directories we'd like to exclude?

--
nosy: +emily.zhao




[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2014-06-07 Thread Benjamin Peterson

Benjamin Peterson added the comment:

Unfortunately, I don't think it will be that easy because I don't think 
robots.txt supports wildcard paths like that. Possibly, we should just 
whitelist a few important repositories.
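A whitelist along those lines might look like the sketch below. Note this leans on the `Allow` directive and the `$` end-of-URL anchor, which are extensions honored by major crawlers (e.g. Googlebot, Bingbot) rather than part of the original robots.txt standard, and the repository names here are only illustrative:

```
User-agent: *
# Permit only the site root and a few repository landing pages
Allow: /$
Allow: /cpython/$
Allow: /peps/$
# Everything else (rev, shortlog, annotate, ...) stays blocked
Disallow: /
```

How a crawler resolves conflicts between `Allow` and `Disallow` also varies (most-specific-rule-wins for Google, first-match for some others), so this would need testing against the crawlers that matter.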

--
nosy: +benjamin.peterson




[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2013-08-17 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
keywords: +easy
stage:  -> needs patch




[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2012-02-02 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Can you propose a robots.txt file?

--
nosy: +georg.brandl, pitrou




[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2012-02-02 Thread Ezio Melotti

Changes by Ezio Melotti ezio.melo...@gmail.com:


--
nosy: +ezio.melotti




[issue13924] Mercurial robots.txt should let robots crawl landing pages.

2012-02-01 Thread Ivaylo Popov

New submission from Ivaylo Popov popov@gmail.com:

http://hg.python.org/robots.txt currently disallows all robots from all paths. 
This means that the site doesn't show up in Google search results when seeking, 
for instance, browsing access to the Python source:
https://www.google.com/search?ie=UTF-8&q=python+source+browse
https://www.google.com/search?ie=UTF-8&q=python+repo+browse
https://www.google.com/search?ie=UTF-8&q=hg+python+browse
etc...

Instead, robots.txt should allow access to the landing page, 
http://hg.python.org/, and the landing pages for hosted projects, e.g. 
http://hg.python.org/cpython/, while prohibiting access to the */rev/*, 
*/shortlog/*, ..., directories.
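A minimal sketch of that suggestion, with the caveat that path wildcards in `Disallow` are a widely supported extension (popularized by Google) rather than part of the original robots.txt standard, and that the directory list here is illustrative, not exhaustive:

```
User-agent: *
# Block the expensive generated views; landing pages stay crawlable
Disallow: /*/rev/
Disallow: /*/shortlog/
Disallow: /*/annotate/
```

Crawlers that only implement the original standard would ignore these wildcard rules and crawl everything, so the safer variant may be to whitelist landing pages instead.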

This change would be very easy, cost virtually nothing, and let users find the 
Mercurial repository viewer from search engines. Note that 
http://svn.python.org/ does show up in search results, as an illustration of 
how convenient this is.

--
components: None
messages: 152446
nosy: Ivaylo.Popov
priority: normal
severity: normal
status: open
title: Mercurial robots.txt should let robots crawl landing pages.
type: enhancement
