On 10/05/2023 at 20:50, Alfred M. Szmidt wrote:
    > You've not explained the actual problem.  What are you trying to
    > solve?

    "it" is the www-commits list, which registers all changes to the www
    directory, including to pages that are not published yet. I suspect most
    of the other *-commits lists deal with source code repositories, which
    are public anyway.

If you wish to disallow access to pages, do not publish them --
www-commits is a public list, the www repository is a public
repository -- any commit is by definition published.  It is no
different from getting bug reports about commits in a software
repository that has not had a release yet.

    If you let crawlers access changes to disallowed directories, you are
    defeating the purpose of robots.txt. What was supposed to be unpublished
    is actually published.

The purpose of robots.txt is to avoid overloading a web site; it is
not to disallow access to pages.
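Indeed, robots.txt is purely advisory: a well-behaved crawler consults it
voluntarily, but nothing actually blocks a fetch of a "disallowed" path. A
minimal sketch with Python's standard urllib.robotparser (the /staging/ path
and URLs are hypothetical, chosen only for illustration):

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks crawlers to skip a staging directory.
rules = """\
User-agent: *
Disallow: /staging/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

# A polite crawler checks can_fetch() before requesting a URL;
# an impolite one simply ignores the file and fetches anyway.
print(rp.can_fetch("*", "https://example.org/staging/draft.html"))    # False
print(rp.can_fetch("*", "https://example.org/published/page.html"))   # True
```

The check happens entirely on the crawler's side; the server serves the page
to anyone who asks for it, which is the point being made above.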

This still does not explain what the problem is -- "don't let crawlers
crawl" doesn't explain it.  What are you trying to solve?  That
unpublished articles are not published before they are finished?

Yes, basically. The purpose of the staging area is to work on articles
that are not ready yet.
