Hi list,

Problem with my parameters... I think.

I am testing on Nutch 1.2 first, then aiming to run on the branch.

I run this as a single command after completing generate, fetch, parse,
updatedb and invertlinks:
$NUTCH_HOME/bin/nutch solrindex http://localhost:8080/wombra/data/ index 
crawl/crawldb crawl/linkdb crawl/segments/*

'wombra' is my webapp
'data' is the data directory where I wish to store the new index generated 
after a fresh daily crawl

Having looked at posts on the mailing list, I can't see any obvious problems
with my parameters, but I get the following output:

SolrIndexer: starting at 2011-04-13 19:28:29
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: 
file:/home/lewis/Downloads/nutch-1.2/crawl/linkdb/crawl_fetch
Input path does not exist: 
file:/home/lewis/Downloads/nutch-1.2/crawl/linkdb/crawl_parse
Input path does not exist: 
file:/home/lewis/Downloads/nutch-1.2/crawl/linkdb/parse_data
Input path does not exist: 
file:/home/lewis/Downloads/nutch-1.2/crawl/linkdb/parse_text
Input path does not exist: file:/home/lewis/Downloads/nutch-1.2/index/current
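
One possible reading of the output above, offered as a sketch rather than a confirmed diagnosis: if solrindex in Nutch 1.2 takes its arguments positionally as <solr url> <crawldb> <linkdb> <segment> ... (an assumption based on its usage message, worth checking by running the command with no arguments), then the extra 'index' argument would shift everything after it by one slot. The function name map_solrindex_args below is hypothetical and only prints which slot each argument would land in; it does not call Nutch.

```shell
# Sketch only. Assumes the positional order
#   <solr url> <crawldb> <linkdb> <segment> ...
# map_solrindex_args is a hypothetical helper, not part of Nutch.
map_solrindex_args() {
  echo "solr url: $1"
  echo "crawldb:  $2"
  echo "linkdb:   $3"
  shift 3
  # Everything left over would be treated as segment directories.
  echo "segments: $*"
}

# The arguments from the command above; the glob is quoted so it is not expanded here.
map_solrindex_args http://localhost:8080/wombra/data/ index \
  crawl/crawldb crawl/linkdb 'crawl/segments/*'
```

Under that assumption 'index' lands in the crawldb slot and crawl/linkdb in the first segment slot, which would match the missing index/current and crawl/linkdb/crawl_fetch paths in the error. If so, dropping 'index' from the command may be enough.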

I also have the hadoop.tmp.dir property set to a partition with plenty of free
space.

Any ideas, please?

Thank you,
Lewis

