Hi, I am trying to run the nutch-0.8.1 source code in Eclipse. In Eclipse I select New -> Project -> Java Project -> existing from build file, and it compiles all the source and includes it in my workspace, e.g. E:/workspace. I have included the conf directory as a source folder, and copied the plugins and lib directories and attached them to my project.
In parallel, I changed the settings in nutch-default.xml for the agent name and robots, in nutch-site.xml for searcher.dir, and in crawl-urlfilter.txt. I made a urltest directory in my workspace containing the seed URL http://localhost:8080/nutch-0.8.1/RATNESH/index.html.

On the Tomcat side, I copied nutch-0.8.1.war and the nutch-0.8.1 folder into the webapps folder, and inside WEB-INF/classes I changed nutch-site.xml for searcher.dir. I then started the Tomcat server and ran the crawl from Eclipse with the command: crawl -d urltest -dir crawl-result -depth 1 -topN 3

It does not show any error, and in the console window I see that the crawl finished successfully. But it does not fetch my HTML pages stored inside the webapps/nutch-0.8.1/RATNESH folder of Tomcat. Please help me find where I am going wrong.

Looking for your valuable inputs.

Thanks,
Ratnesh
V2Solutions, India
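
One thing worth checking in a setup like the one described is whether the seed URL actually passes the rules in crawl-urlfilter.txt, since a seed that is filtered out leaves nothing to fetch. The sketch below is not Nutch code. It assumes the filter file holds one rule per line, each a `+` or `-` followed by a regular expression, and that the first rule whose expression is found in the URL decides the outcome. The two rule lists are hypothetical examples, not the contents of the poster's file.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class UrlFilterSketch {

    // Assumed semantics: the first rule whose regex is found in the URL decides.
    // '+' accepts, '-' rejects; a URL that matches no rule is rejected.
    static boolean accepts(List<String> rules, String url) {
        for (String rule : rules) {
            String line = rule.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            char sign = line.charAt(0);
            if (sign != '+' && sign != '-') {
                continue; // ignore lines that are not rules
            }
            String regex = line.substring(1);
            if (Pattern.compile(regex).matcher(url).find()) {
                return sign == '+';
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Seed URL taken from the post.
        String seed = "http://localhost:8080/nutch-0.8.1/RATNESH/index.html";

        // Hypothetical rule list with an unedited placeholder domain.
        List<String> placeholderRules = Arrays.asList(
            "-^(file|ftp|mailto):",
            "+^http://([a-z0-9]*\\.)*MY.DOMAIN.NAME/",
            "-.");

        // Hypothetical rule list that explicitly allows the local Tomcat host.
        List<String> localhostRules = Arrays.asList(
            "-^(file|ftp|mailto):",
            "+^http://localhost:8080/",
            "-.");

        System.out.println("placeholder rules accept seed: " + accepts(placeholderRules, seed));
        System.out.println("localhost rules accept seed:   " + accepts(localhostRules, seed));
    }
}
```

With the first list the seed falls through to the final `-.` rule and is rejected; with the second it is accepted by the localhost rule. If the real filter file behaves like the first list, the crawl could finish without errors and still fetch nothing.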
