Bug#458939: allow search engines to index http://bugs.debian.org

2015-05-25 Thread Ivan Baldo
So, now in 2015, is it still necessary to block some bots and some URLs, or should everything be opened up, or should this bug be closed, or...? Just a ping :-). -- Ivan Baldo - iba...@adinet.com.uy - http://ibaldo.codigolibre.net/ From Montevideo, Uruguay, at the south of South America.

Bug#458939: allow search engines to index http://bugs.debian.org

2008-05-18 Thread Tomasz Chmielewski
So right now Google is allowed to spider bugs.debian.org, but other search engines are not. That seems discriminatory. Perhaps the web server logs could be examined to see how much load Googlebot actually generates? If the numbers are not very significant, other spiders could be allowed,
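The load check Tomasz suggests is a few lines over the access logs. A minimal sketch, assuming Apache combined-format logs; the sample lines below are made up for illustration, and a real analysis would read bugs.debian.org's own log files instead:

```python
# Hypothetical sample of Apache combined-format access log lines.
LOG_LINES = [
    '66.249.66.1 - - [09/Jan/2008:10:00:00 +0000] "GET /cgi-bin/bugreport.cgi?bug=458939 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '10.0.0.7 - - [09/Jan/2008:10:00:01 +0000] "GET /458939 HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (X11; Linux)"',
    '72.30.0.9 - - [09/Jan/2008:10:00:02 +0000] "GET /robots.txt HTTP/1.1" 200 100 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
]

def bot_share(lines, bot="Googlebot"):
    """Count requests whose user-agent contains the given substring.

    Returns (bot_requests, total_requests).
    """
    hits = sum(1 for line in lines if bot in line)
    return hits, len(lines)

hits, total = bot_share(LOG_LINES)
print(f"Googlebot made {hits} of {total} requests")
# prints: Googlebot made 1 of 3 requests
```

Comparing that ratio (and the bytes column) per crawler would show whether admitting Yahoo and MSN would be a significant extra burden.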

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-10 Thread Don Armstrong
On Thu, 10 Jan 2008, Anthony Towns wrote: (In practice, with google barely indexing anything in the BTS yet; looking up bug#459818 by googling for `medium dhclient-script' works fine; using Hyper Estraier on merkel takes ages and doesn't return any hits) That's because you actually meant to

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-10 Thread Don Armstrong
On Wed, 09 Jan 2008, Don Armstrong wrote: On Thu, 10 Jan 2008, Anthony Towns wrote: I've made those changes on rietz directly; what's the procedure for committing them? sudo -u debbugs -H bzr commit ? There was a pre-existing change in pkgreport.cgi (adding a^ to the Go away regexp) that

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-09 Thread Anthony Towns
On Thu, Jan 03, 2008 at 01:07:15PM -0800, Don Armstrong wrote: There are already mirrors which allow indexing, and you can use the BTS's own search engine which is far superior to google [...] Uh, you're kidding, right? The BTS's own search engine won't turn up hits outside the BTS, as a

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-09 Thread Anthony Towns
On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote: Getting smarturl.cgi properly done is still probably the real solution. Okay, so I've made smarturl.cgi work again; it was broken by: - Debbugs::CGI not accepting params from ARGV (smarturl.cgi changed to set QUERY_STRING)

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-09 Thread Don Armstrong
On Wed, 09 Jan 2008, Anthony Towns wrote: On Thu, Jan 03, 2008 at 01:07:15PM -0800, Don Armstrong wrote: There are already mirrors which allow indexing, and you can use the BTS's own search engine which is far superior to google [...] Uh, you're kidding, right? The BTS's own search engine

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-09 Thread Don Armstrong
On Thu, 10 Jan 2008, Anthony Towns wrote: On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote: Getting smarturl.cgi properly done is still probably the real solution. Okay, so I've made smarturl.cgi work again; it was broken by: - Debbugs::CGI not accepting params from ARGV

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-09 Thread Anthony Towns
On Wed, Jan 09, 2008 at 12:54:32PM -0800, Don Armstrong wrote: On Wed, 09 Jan 2008, Anthony Towns wrote: Uh, you're kidding, right? The BTS's own search engine won't turn up hits outside the BTS, as a trivial example... It's far superior to google for searching for results *in* the BTS.

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-09 Thread Anthony Towns
On Wed, Jan 09, 2008 at 05:58:34PM +1000, Anthony Towns wrote: Disallow: /*/ # exclude everything but the shortcuts Allow: /cgi-bin/bugreport.cgi?bug= Allow: /cgi-bin/pkgreport.cgi?pkg=*;dist=unstable$ I've set that up on rietz for Googlebot; we'll see if it works OK. I
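The shortcut rules quoted above depend on Google's wildcard extensions to robots.txt (`*` inside a path and a trailing `$` anchor), which the original robots.txt standard does not define. A small sketch of how those three rules would match, using a hypothetical matcher that implements the longest-match precedence Google documents; the rules are the ones from the message, but the matcher itself is purely illustrative and not part of the BTS:

```python
import re

def rule_to_regex(path):
    # Translate a Google-style robots.txt path pattern to a regex:
    # '*' matches any run of characters; a trailing '$' anchors the end.
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    pattern = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))

def allowed(url_path, rules):
    # Longest matching pattern wins, per Google's documented
    # Allow/Disallow precedence; no match at all means allowed.
    best = ("allow", "")
    for directive, pat in rules:
        if rule_to_regex(pat).match(url_path) and len(pat) > len(best[1]):
            best = (directive, pat)
    return best[0] == "allow"

# The three rules quoted in the message above.
rules = [
    ("disallow", "/*/"),
    ("allow", "/cgi-bin/bugreport.cgi?bug="),
    ("allow", "/cgi-bin/pkgreport.cgi?pkg=*;dist=unstable$"),
]

print(allowed("/cgi-bin/bugreport.cgi?bug=458939", rules))  # True
print(allowed("/cgi-bin/version.cgi?absolute=0", rules))    # False
```

The longer `Allow` patterns override the blanket `Disallow: /*/`, so only the per-bug and per-package shortcut URLs are crawlable; since these extensions are crawler-specific, serving them only to Googlebot (as described) is the safe choice.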

Bug#458939: Fwd: Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-04 Thread Jason Spiro
2008/1/3, Don Armstrong [EMAIL PROTECTED] wrote: On Thu, 03 Jan 2008, Jason Spiro wrote: http://en.wikipedia.org/wiki/Robots.txt#Crawl-delay_directive will help. Yahoo and MSNBot both support it. I bet other major bots support it too. So we can allow Yahoo and MSNBot (plus Googlebot, if
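As a rough illustration of the policy Jason proposes, a robots.txt along these lines would rate-limit the crawlers that honor Crawl-delay and exclude everyone else. This is a sketch only: the user-agent tokens are assumptions, and Googlebot ignores Crawl-delay anyway (its crawl rate was adjusted through Google's Webmaster Tools instead):

```
# Sketch: rate-limit known-cooperative crawlers, block the rest.
User-agent: Slurp
Crawl-delay: 10

User-agent: msnbot
Crawl-delay: 10

User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
```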

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Jason Spiro
Package: www.debian.org Severity: wishlist Please allow search engines to index http://bugs.debian.org. This can be done by deleting the file http://bugs.debian.org/robots.txt. Cheers, -- Jason Spiro: corporate trainer, web developer, IT consultant. I support Linux, UNIX, Windows, and more.

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Simon Paillard
reassign 458939 bugs.debian.org thanks On Thu, Jan 03, 2008 at 07:40:12PM +, Jason Spiro wrote: Package: www.debian.org Severity: wishlist Please allow search engines to index http://bugs.debian.org. This can be done by deleting the file http://bugs.debian.org/robots.txt. Hello, the

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Don Armstrong
On Thu, 03 Jan 2008, Jason Spiro wrote: Please allow search engines to index http://bugs.debian.org. This can be done by deleting the file http://bugs.debian.org/robots.txt. Just for the record, the reasons we disallow indexing are that the robots.txt specification isn't complete enough

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Justin Pryzby
On Thu, Jan 03, 2008 at 01:07:15PM -0800, Don Armstrong wrote: On Thu, 03 Jan 2008, Jason Spiro wrote: Please allow search engines to index http://bugs.debian.org. This can be done by deleting the file http://bugs.debian.org/robots.txt. Just for the record, the reasons why we disallow

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Jason Spiro
2008/1/3, Don Armstrong [EMAIL PROTECTED] wrote: On Thu, 03 Jan 2008, Jason Spiro wrote: Please allow search engines to index http://bugs.debian.org. This can be done by deleting the file http://bugs.debian.org/robots.txt. Just for the record, the reasons why we disallow indexing are

Bug#458939: Fwd: Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Don Armstrong
On Thu, 03 Jan 2008, Jason Spiro wrote: http://en.wikipedia.org/wiki/Robots.txt#Crawl-delay_directive will help. Yahoo and MSNBot both support it. I bet other major bots support it too. So we can allow Yahoo and MSNBot (plus Googlebot, if they support it too) and block everyone else. Google

Bug#458939: allow search engines to index http://bugs.debian.org

2008-01-03 Thread Raphael Hertzog
On Thu, 03 Jan 2008, Jason Spiro wrote: Package: www.debian.org Severity: wishlist Please allow search engines to index http://bugs.debian.org. This can be done by deleting the file http://bugs.debian.org/robots.txt. Most of the content is generated dynamically nowadays and this file has