[Nutch-general] Crawling successful without fetching

Ratnesh,V2Solutions India Sat, 17 Mar 2007 01:49:47 -0800

Hi,
I am trying to run the nutch-0.8.1 source code in Eclipse. In Eclipse I
select New -> Project -> Java Project -> existing from build file, and it
compiles all the source and includes it in my workspace, e.g.
E:/workspace.
I have added the conf directory as a source folder, and copied the plugins
and lib directories and attached them to my project.


In parallel, I made the settings in nutch-default.xml for the agent name and
robots, in nutch-site.xml for searcher.dir, and in crawl-urlfilter.txt. I
made a urltest directory in my workspace containing the seed URL:

http://localhost:8080/nutch-0.8.1/RATNESH/index.html
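To make the crawl-urlfilter.txt part concrete, here is a minimal sketch of how I understand the rules to be applied: they are checked top to bottom, the first rule whose regex is found in the URL decides ('+' keeps it, '-' drops it), and a URL that matches no rule is dropped. This is not Nutch's own code. The class name UrlFilterSketch and the rule lines are made up for illustration and are not my actual file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class UrlFilterSketch {

    // One filter rule: '+' means accept, '-' means reject.
    static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, String regex) {
            this.accept = accept;
            this.pattern = Pattern.compile(regex);
        }
    }

    // Parse rule lines, skipping blank lines and '#' comments.
    static List<Rule> parse(String[] lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            String s = line.trim();
            if (s.isEmpty() || s.startsWith("#")) {
                continue;
            }
            char sign = s.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + s);
            }
            rules.add(new Rule(sign == '+', s.substring(1)));
        }
        return rules;
    }

    // First matching rule wins; a URL that matches nothing is rejected.
    static boolean accepts(List<Rule> rules, String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical rules for illustration only.
        String[] lines = {
            "# skip some binary suffixes",
            "-\\.(gif|jpg|png|css|zip)$",
            "# accept anything served from the local Tomcat",
            "+^http://localhost:8080/",
            "# reject everything else",
            "-."
        };
        String seed = "http://localhost:8080/nutch-0.8.1/RATNESH/index.html";
        System.out.println(seed + " accepted: " + accepts(parse(lines), seed));
    }
}
```

If the seed URL comes out as rejected under the rules actually in the file, that alone might explain a crawl that finishes with nothing fetched.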

In the Tomcat setup, I copied nutch-0.8.1.war and nutch-0.8.1 into the
webapps folder, and inside WEB-INF/classes I changed nutch-site.xml to set
searcher.dir.

Then I started the Tomcat server

and tried to crawl by giving this command in Eclipse:
crawl -d urltest -dir crawl-result -depth 1 -topN 3

What happens is that it does not show any error, and in the console window
I see that the crawl finished successfully.

But it does not fetch my HTML pages stored inside the
webapps/nutch-0.8.1/RATNESH folder of Tomcat.

So please help me find where I am going wrong.

Looking forward to your valuable input.

Thanks
Ratnesh V2Solutions, India
-- 
View this message in context: 
http://www.nabble.com/Crawling-sucessful-without-fetching-tf3418547.html#a9527809
Sent from the Nutch - User mailing list archive at Nabble.com.