Yes, Prashant. Check nutch-site.xml for the crawl-dir setting and nutch-default.xml for the agent and robots entries. Then supply the URL you want to crawl, and in crawl-urlfilter.txt write the full path of the site, for example www.rajshri.com.
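To make the crawl-urlfilter.txt step concrete, here is a rough sketch of how a +/- regex rule file of this kind is typically evaluated. It is not Nutch's own code: the class UrlFilterSketch and the method accepts are made-up names, and the sketch assumes that lines starting with # are comments, that the first rule found in the URL decides, and that the file ends with a catch-all "-." rule. Under those assumptions, a rule that still has a leading # in front of the + never takes effect.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch: mimics how a +/- regex rule file is read.
class UrlFilterSketch {

    // Returns true if the first rule whose pattern is found in the URL starts with '+'.
    static boolean accepts(List<String> ruleLines, String url) {
        for (String line : ruleLines) {
            String rule = line.trim();
            // Assumption: blank lines and lines starting with '#' are comments.
            if (rule.isEmpty() || rule.startsWith("#")) {
                continue;
            }
            char sign = rule.charAt(0);
            if (sign != '+' && sign != '-') {
                continue; // this sketch ignores anything that is not a +/- rule
            }
            // Assumption: unanchored search, first matching rule wins.
            if (Pattern.compile(rule.substring(1)).matcher(url).find()) {
                return sign == '+';
            }
        }
        return false; // no rule matched, so the URL is dropped
    }

    public static void main(String[] args) {
        String url = "http://www.rajshri.com/";
        // Rule left commented out, followed by an assumed catch-all reject rule.
        List<String> commented = Arrays.asList("#+^http://([a-z0-9]*\\.)*rajshri.com/", "-.");
        // Same rule with the leading '#' removed.
        List<String> active = Arrays.asList("+^http://([a-z0-9]*\\.)*rajshri.com/", "-.");
        System.out.println(accepts(commented, url)); // prints false
        System.out.println(accepts(active, url));    // prints true
    }
}
```

If the accept line in the real file still begins with #, removing that character is the first thing worth trying before looking elsewhere.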
To track down the error, enable Hadoop's log file through the log4j properties file, and then let me know whether it works out.

Thanks,
Ratnesh

prashant_nutch wrote:
>
> Any help for crawling in Eclipse on a Windows environment?
> I made the following changes:
>
> 1. crawl-urlfilter.txt: #+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/ and
>    put my site name.
> 2. nutch-site.xml: changed the robot agent and agent name, and
>    also search.dir.
>
> Then I made a folder, urldir, which holds the URL names.
> Everything seems to work fine, because running in Eclipse gives no
> error, but the site is still not crawled.
> What is the problem?
>

--
View this message in context: http://www.nabble.com/Nutch-On-Eclipse-%28windows%29-tf3426037.html#a9571635
Sent from the Nutch - User mailing list archive at Nabble.com.
