Thanks for the response. I did get it going by specifying the segment (i.e. crawl/segments/20060425173804).
Per your last email, that's probably a bug, as it looks like it is supposed to run invertlinks on all the segments (LinkDb.java: 147). I'll wait for the 0.2 release; for now this is okay for me.

As quick feedback on the tutorials, a few short lines on these commands might really help out. The commands that took me a few minutes to figure out were:

  bin/nutch inject db urls

(where db is the database directory and urls is the URL directory, not the actual url.txt file), and the line in indexing. The wiki shows:

  bin/nutch index indexes crawl/linkdb crawl/segments/*

It should be:

  bin/nutch index crawl/index crawl/crawldb crawl/linkdb crawl/segments/*

Again, as you said, maybe this is just the Windows path names bug, in which case I'll try again on Hadoop 0.2. Otherwise, everything else is fairly self-explanatory.

I'm definitely enjoying the product. When I tried 0.7.2, I was up and running in under an hour!

--- Doug Cutting <[EMAIL PROTECTED]> wrote:

> Chris Fellows wrote:
> > I'm having what appears to be the same issue on 0.8
> > trunk. I can get through inject, generate, fetch and
> > updatedb, but am getting the IOException: No input
> > directories on invertlinks and cannot figure out why.
> > I'm only using nutch on a single local windows
> > machine. Any ideas? Configuration has not changed
> > since checking out from svn.
>
> The handling of Windows pathnames is still buggy in
> Hadoop 0.1.1. You might try replacing your
> lib/hadoop-0.1.1.jar file with the latest
> Hadoop nightly jar, from:
>
> http://cvs.apache.org/dist/lucene/hadoop/nightly/
>
> The file name code has been extensively re-written.
> The next Hadoop release (0.2), containing these fixes,
> will be made in around a week.
>
> Doug
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
