Hi,

I'm trying Nutch 0.8.1 and 0.9 with these steps:

# mkdir dmoz
# bin/nutch inject crawl/crawldb dmoz
(the dmoz directory holds a file with the URLs of my target sites)
# bin/nutch generate crawl/crawldb crawl/segments
# s1=`ls -d crawl/segments/2* | tail -1`
# bin/nutch fetch $s1
# bin/nutch updatedb crawl/crawldb $s1
# bin/nutch invertlinks crawl/linkdb $s1
# bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb $s1

OK, now there are some pages in the index. Next, I try to add a new segment to the index:

# bin/nutch generate crawl/crawldb crawl/segments -topN 1000
# s2=`ls -d crawl/segments/2* | tail -1`
# bin/nutch fetch $s2
# bin/nutch updatedb crawl/crawldb $s2
# bin/nutch invertlinks crawl/linkdb $s2
# bin/nutch index crawl/indexes crawl/crawldb crawl/linkdb $s2
Exception in thread "main" java.io.IOException: Output directory /root/SE/SE_java/nutch-0.8.1/crawl/indexes already exists.

In version 0.7.2 it was simple to add segments to an index, but I don't understand how to do this in versions 0.8.1 and 0.9.

Thanks
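
For reference, the round of commands above can be wrapped in a small shell sketch. It uses only the bin/nutch subcommands quoted in this post. The NUTCH variable, the function names, and the one-index-directory-per-segment naming are hypothetical, introduced here for illustration. Writing each round to a fresh directory only sidesteps the "already exists" exception; how to combine the resulting indexes for searching is not covered here.

```shell
#!/bin/sh
# Sketch of one generate/fetch/updatedb/invertlinks/index round.
# Assumption: NUTCH points at the bin/nutch launcher (hypothetical override).
NUTCH=${NUTCH:-bin/nutch}

# Return the newest segment directory, assuming segment names sort
# chronologically (as the `tail -1` in the post implies).
latest_segment() {
    ls -d "$1"/2* 2>/dev/null | tail -1
}

# Hypothetical naming scheme: one index directory per segment, so the
# index step never targets a directory that already exists.
index_dir_for() {
    echo "$1/indexes-$(basename "$2")"
}

# Run one round against the crawl directory $1, limited to $2 top URLs.
crawl_round() {
    crawl=$1
    topn=$2
    $NUTCH generate "$crawl/crawldb" "$crawl/segments" -topN "$topn" || return 1
    seg=$(latest_segment "$crawl/segments")
    [ -n "$seg" ] || return 1
    $NUTCH fetch "$seg" || return 1
    $NUTCH updatedb "$crawl/crawldb" "$seg" || return 1
    $NUTCH invertlinks "$crawl/linkdb" "$seg" || return 1
    $NUTCH index "$(index_dir_for "$crawl" "$seg")" \
        "$crawl/crawldb" "$crawl/linkdb" "$seg"
}
```

After sourcing the file, `crawl_round crawl 1000` would run one round; setting `NUTCH=echo` first prints the commands instead of executing them, which is a convenient dry run.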
