svn commit: r219563 - in /lucene/nutch/branches/mapred/conf: crawl-urlfilter.txt.template regex-urlfilter.txt.template

2005-07-18 Thread cutting
Author: cutting Date: Mon Jul 18 13:42:37 2005 New Revision: 219563 URL: http://svn.apache.org/viewcvs?rev=219563&view=rev Log: Skip URLs with repeating segments. Modified: lucene/nutch/branches/mapred/conf/crawl-urlfilter.txt.template lucene/nutch/branches/mapred/conf/regex-urlfilter.txt

svn commit: r219564 - /lucene/nutch/branches/mapred/src/java/org/apache/nutch/mapred/JobClient.java

2005-07-18 Thread cutting
Author: cutting Date: Mon Jul 18 13:47:07 2005 New Revision: 219564 URL: http://svn.apache.org/viewcvs?rev=219564&view=rev Log: Don't log the same report twice. Modified: lucene/nutch/branches/mapred/src/java/org/apache/nutch/mapred/JobClient.java

svn commit: r219560 - /lucene/nutch/branches/mapred/src/java/org/apache/nutch/ndfs/FSNamesystem.java

2005-07-18 Thread cutting
Author: cutting Date: Mon Jul 18 12:49:10 2005 New Revision: 219560 URL: http://svn.apache.org/viewcvs?rev=219560&view=rev Log: Fix a null pointer exception. Modified: lucene/nutch/branches/mapred/src/java/org/apache/nutch/ndfs/FSNamesystem.java

svn commit: r219568 - in /lucene/nutch/branches/mapred: conf/nutch-default.xml src/java/org/apache/nutch/crawl/Generator.java

2005-07-18 Thread cutting
Author: cutting Date: Mon Jul 18 14:08:36 2005 New Revision: 219568 URL: http://svn.apache.org/viewcvs?rev=219568&view=rev Log: Add per-host url limit in generate. Modified: lucene/nutch/branches/mapred/conf/nutch-default.xml lucene/nutch/branches/mapred/src/java/org/apache/nutch/crawl/Ge

[Nutch Wiki] Update of "DissectingTheNutchCrawler" by ErikHatcher

2005-07-18 Thread Apache Wiki
The following page has been changed by ErikHatcher: http://wiki.apache.org/nutch/DissectingTheNutchCrawler The comment on the change is: spelling correction

svn commit: r219566 - in /lucene/nutch/branches/mapred/src/java/org/apache/nutch/mapred: TaskRunner.java TaskTracker.java

2005-07-18 Thread cutting
Author: cutting Date: Mon Jul 18 13:57:34 2005 New Revision: 219566 URL: http://svn.apache.org/viewcvs?rev=219566&view=rev Log: Catch Throwable, not just Exception, and always log and report it to tracker. Modified: lucene/nutch/branches/mapred/src/java/org/apache/nutch/mapred/TaskRunner.jav

svn commit: r219476 - /lucene/nutch/trunk/build.xml

2005-07-18 Thread pkosiorowski
Author: pkosiorowski Date: Mon Jul 18 05:12:28 2005 New Revision: 219476 URL: http://svn.apache.org/viewcvs?rev=219476&view=rev Log: parse-mp3 and parse-rtf plugins excluded from JavaDoc build Modified: lucene/nutch/trunk/build.xml

[Nutch Wiki] Update of "FrontPage" by ajbanck

2005-07-18 Thread Apache Wiki
The following page has been changed by ajbanck: http://wiki.apache.org/nutch/FrontPage -- ||'''About N