To help sanity check the website changes for the 4.0 release, I ran a quick and dirty little wget --spider command against the *staging* site...

wget --spider -o wget.log -e robots=off --wait 1 -r -p http://lucene.staging.apache.org/

...this found 32 broken links, some of which are obviously just because we only have copies of the per-version docs on the live site.

So then I ran those URLs through a little perl script to sanity check which of them were broken even on the live site...

perl -MLWP::Simple -nle 'chomp; s/\.staging//; print "BAD\t$_" unless defined head $_'

...and this uncovered some genuine broken links that we should try to fix at some point (see below). Unfortunately this really simple wget approach doesn't tell you *where* each broken link comes from -- you just have to do a bit of intuition/grepping.

The good/bad news is none of them seem to relate to the edits for 4.0 -- they mostly seem like bugs in our templates...


BAD     http://lucene.apache.org/solr/images/solr-favicon.ico
BAD     http://lucene.apache.org/core/systemrequirements.html
BAD     http://lucene.apache.org/pylucene/jcc/privacy.html
BAD     http://lucene.apache.org/core/scripts/effects.js
BAD     http://lucene.apache.org/core/whoweare.html
BAD     http://lucene.apache.org/solr/scripts/search.js
BAD     http://lucene.apache.org/openrelevance/scripts/prototype.js
BAD     http://lucene.apache.org/pylucene/jcc/images/lucene-favicon.ico
BAD     http://lucene.apache.org/solr/images/lucene-favicon.ico
BAD     http://lucene.apache.org/core/scripts/slides.js
BAD     http://lucene.apache.org/openrelevance/scripts/effects.js
BAD     http://lucene.apache.org/pylucene/images/lucene-favicon.ico
BAD     http://lucene.apache.org/openrelevance/scripts/slides.js
BAD     http://lucene.apache.org/solr/scripts/effects.js
BAD     http://lucene.apache.org/core/scripts/prototype.js
BAD     http://lucene.apache.org/pylucene/version_control.html
BAD     http://lucene.apache.org/solr/scripts/prototype.js
BAD     http://lucene.apache.org/solr/scripts/slides.js
BAD     http://lucene.apache.org/core/images/lucene-favicon.ico
BAD     http://lucene.apache.org/openrelevance/scripts/search.js
BAD     http://lucene.apache.org/openrelevance/images/lucene-favicon.ico
BAD     http://lucene.apache.org/core/scripts/search.js
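
For the "where does the broken link come from" part, here is a rough Python sketch of one way to automate the grepping. It is not something I ran for this report. It assumes you already have the HTML of each page in hand (for example from a local mirror), and the names LinkCollector and find_referrers are made up for illustration. It resolves every href/src on each page to an absolute URL and records which pages point at a URL from the BAD list.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collect the raw href/src attribute values found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_referrers(pages, broken_urls):
    """Map each broken URL to the sorted list of pages that link to it.

    pages: dict of page URL -> HTML text (hypothetical input, e.g. a local mirror)
    broken_urls: iterable of absolute URLs reported as broken
    """
    broken = set(broken_urls)
    referrers = {url: set() for url in broken}
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        for link in collector.links:
            # Resolve relative links against the page and drop any #fragment.
            absolute, _fragment = urldefrag(urljoin(page_url, link))
            if absolute in broken:
                referrers[absolute].add(page_url)
    return {url: sorted(refs) for url, refs in referrers.items()}


if __name__ == "__main__":
    # Made-up page content, only to show the call shape.
    sample_pages = {
        "http://example.org/core/index.html":
            '<script src="scripts/search.js"></script>',
    }
    bad = ["http://example.org/core/scripts/search.js"]
    for url, refs in find_referrers(sample_pages, bad).items():
        print(url, "<-", ", ".join(refs) or "(no referrer found)")
```

A broken URL that comes back with no referrers was probably reached some other way (a redirect, a stylesheet reference, etc.), which this sketch does not look at.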




-Hoss
