Re: [Nutch-dev] Nutch code now at Apache

2005-03-01 Thread Matt Kangas
Congratulations! I just checked out from the anonymous URL without a hitch. The "jar" and "war" targets worked flawlessly; "test" reported one failure on TestSegmentMergeTool, the same as in the latest CVS. --matt
On Tue, 01 Mar 2005 14:31:49 -0800, Doug Cutting <[EMAIL PROTECTED]> wrote: > The Nutch code ha
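For anyone reproducing this, those targets are run with Ant from the top of the checkout; a minimal sketch, assuming a standard Ant setup (only the target names "jar", "war", and "test" come from the message above):

% cd nutch
% ant jar    # build the Nutch jar
% ant war    # build the web application archive
% ant test   # run the unit tests (one failure in TestSegmentMergeTool at the time)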

[Nutch-dev] Nutch code now at Apache

2005-03-01 Thread Doug Cutting
The Nutch code has been moved from CVS to Subversion at Apache. The Subversion repository is:
https://svn.apache.org/repos/asf/incubator/nutch/trunk
Anonymous access is available with HTTP:
% svn co http://svn.apache.org/repos/asf/incubator/nutch/trunk nutch
Committer access is available with HTT
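For reference, a checkout sketch based on the message above; the anonymous command is quoted verbatim, while the committer command assumes HTTPS against the same repository path, since the message is truncated at that point:

# Anonymous (read-only) checkout over HTTP, as quoted above:
% svn co http://svn.apache.org/repos/asf/incubator/nutch/trunk nutch

# Committer checkout, assumed to be the same path over HTTPS:
% svn co https://svn.apache.org/repos/asf/incubator/nutch/trunk nutch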

Re: [Nutch-dev] updatedb error

2005-03-01 Thread Doug Cutting
When did you update Nutch? There was a bug in CVS, introduced on 7 February, which would cause this error. A fix was committed on 19 February. Did you grab things during that window? If so, my apologies, but a simple CVS update should fix things. Doug
Luke Baker wrote: Has anybody seen this err
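As a sketch of the suggested fix, assuming an existing CVS working copy of Nutch (the directory name is an assumption; the flags are standard CVS options):

% cd nutch
% cvs update -dP   # pulls in the 19 February fix; -d picks up new directories, -P prunes empty ones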

[Nutch-dev] [ nutch-Bugs-1066096 ] Can't build with JDK 1.5

2005-03-01 Thread SourceForge.net
Bugs item #1066096, was opened at 2004-11-14 11:13 Message generated for change (Settings changed) made by sullija721 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=491356&aid=1066096&group_id=59548 Category: indexer Group: mainline >Status: Closed >Resolution: Fix

[Nutch-dev] updatedb error

2005-03-01 Thread Luke Baker
Has anybody seen this error? I'm trying to update the db from scratch using some previously fetched segments. I've gotten this error on a 2-million-page segment during my past two attempts to update the db with this segment's data:
- 050301 012722 Processing document 200
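For context, a hypothetical invocation of the WebDB update step in that era; the db and segment paths are made up for illustration, and the exact arguments of the 0.6-era tool may differ:

% bin/nutch updatedb db segments/20050301012722   # hypothetical paths: <webdb dir> <segment dir>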

Re: [Nutch-dev] Crawl did not finish

2005-03-01 Thread sub paul
Hi Olaf, thank you very much for your response; Luke would be a great tool for getting familiar with Lucene and Nutch. I was able to build my index 3-4 times, but I had queries turned off, so it would only catch the first few pages, as most of the website uses queries. You were right when you said
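"Queries turned off" most likely refers to the stock URL-filter rule that skips query-string URLs; a sketch of the relevant lines in conf/crawl-urlfilter.txt (exact defaults vary by release, so treat this as an assumption):

# skip URLs containing certain characters as probable queries, etc. (stock rule)
-[?*!@=]
# to crawl query URLs, comment out the rule above and make sure an accept rule
# for the site is present, e.g. (hypothetical domain):
# +^http://([a-z0-9]*\.)*example.com/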