Ah, forget about it, you are on 2.x, I read in the next message. But I think it also has a freegen tool.
Markus
-----Original message-----
> From: Markus Jelsma <markus.jel...@openindex.io>
> Sent: Wednesday 24th February 2016 13:41
> To: user@nutch.apache.org
> Subject: RE: recrawling of specific URLS
>
> Hi - the easiest method is to use the freegen tool. But if you really want
> homepages, not just domain roots, you can use the hostdb with freegen for it.
>
> # Update the hostdb
> bin/nutch updatehostdb -hostdb crawl/hostdb -crawldb crawl/crawldb/
>
> # Get a list of homepages for each host
> bin/nutch readhostdb crawl/hostdb/ output -dumpHomepages
>
> Then use freegen.
>
> Markus
>
>
> -----Original message-----
> > From: harsh <harsh.sha...@orkash.com>
> > Sent: Wednesday 24th February 2016 12:49
> > To: user@nutch.apache.org
> > Subject: recrawling of specific URLS
> >
> > Hi All
> >
> > Nutch is made to update ALL the URLs after a certain point of time.
> > But I want to recrawl only the home page of each seed URL so that I could get
> > new links from the home page to crawl.
> > Currently I am using the bug "Inject command re-injects seed URLs" for
> > recrawling my seed URLs, but this is not the standard way.
> > Please give a suggestion. I have read articles/discussions on
> > re-crawling but could not find a solution.
> > Lewis, Tejas, please help!
> >
> > Thanks
> >
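
For reference, a rough sketch of how the rest of the freegen cycle could look on 1.x, picking up from the readhostdb dump quoted above. The crawl/ paths, the segment lookup, and the use of the dump directory ("output") as freegen input are assumptions based on the commands above, so adjust them to your own layout.

# Generate a segment directly from the dumped homepage URLs,
# bypassing the crawldb fetch schedule
bin/nutch freegen output crawl/segments -filter -normalize

# Pick up the segment that was just created
segment=`ls -d crawl/segments/* | tail -1`

# Fetch and parse only those homepages
bin/nutch fetch $segment
bin/nutch parse $segment

# Merge the newly discovered outlinks back into the crawldb
# so the next regular generate/fetch cycle can pick them up
bin/nutch updatedb crawl/crawldb $segment
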