Yes, Prashant,

check nutch-site.xml for the crawl directory setting and nutch-default.xml
for the agent and robots entries.
Then give the URL you want to crawl, and after that write the full address
of the site in crawl-urlfilter.txt, for example www.rajshri.com.
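
To check a filter entry outside Nutch, here is a minimal Java sketch. It is
only a simplified approximation of how I understand the regex URL filter to
behave (the first rule whose pattern is found in the URL decides, "+" accepts,
"-" rejects, "#" lines are ignored). The class name UrlFilterSketch and its
methods are made up for illustration and are not part of Nutch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not a Nutch class: mimics crawl-urlfilter.txt rules.
class UrlFilterSketch {
    // One rule: a leading '+' accepts, anything else rejects;
    // the rest of the line is a regular expression.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(String line) {
            this.accept = line.charAt(0) == '+';
            this.pattern = Pattern.compile(line.substring(1));
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    UrlFilterSketch(String... lines) {
        for (String line : lines) {
            String t = line.trim();
            // Blank lines and '#' comments are skipped, so a rule that is
            // still commented out has no effect.
            if (t.isEmpty() || t.startsWith("#")) {
                continue;
            }
            rules.add(new Rule(t));
        }
    }

    // Assumption: the first matching rule wins; no match means reject.
    boolean accepts(String url) {
        for (Rule r : rules) {
            if (r.pattern.matcher(url).find()) {
                return r.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UrlFilterSketch filter = new UrlFilterSketch(
            "# accept hosts in rajshri.com",
            "+^http://([a-z0-9]*\\.)*rajshri.com/",
            "-.");
        System.out.println(filter.accepts("http://www.rajshri.com/"));
        System.out.println(filter.accepts("www.rajshri.com"));
        System.out.println(filter.accepts("http://example.org/"));
    }
}
```

With these rules only the first URL is accepted. The bare www.rajshri.com is
rejected because the pattern expects the http:// prefix, so the URL in the
seed file has to be written the same way the filter expects it.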

To track down the error, enable Hadoop's log output by opening the
log4j.properties file and editing it.

Then let me know whether it works.

Thanks,
Ratnesh 

prashant_nutch wrote:
> 
> Any help with crawling in Eclipse in a Windows environment?
> I made the following changes:
> 
> 1. crawl-urlfilter.txt: in #+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/ I put
>    my site name.
> 2. nutch-site.xml: changed the robots agent and the agent name, and also
>    search.dir.
> 
> Then I made a folder, urldir, which contains the URL.
> Everything seems to work, because running in Eclipse gives no errors, but
> that particular site is still not crawled.
> What is the problem?
> 

-- 
View this message in context: 
http://www.nabble.com/Nutch-On-Eclipse-%28windows%29-tf3426037.html#a9571635
Sent from the Nutch - User mailing list archive at Nabble.com.


