Hi Stefan,

I meant the case where I want to fetch the new pages and add those pages to the index. How can I do that?
It seems like the intranet crawl using the bin/nutch crawl command is a one-time deal. You get whatever you want, and if you want to fetch again and index again (more pages), you start over. I want to fetch only the pages that are not in the index anymore.

Thanks, Stefan, for your help.

Regards,
Paul

On Thu, 3 Mar 2005 11:09:33 +0100, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> Paul,
> I do not understand what you mean.
> When you use the crawl command you should already have an updated index
> in the end.
> If you would like to reindex, maybe because you plan to use more plugins,
> simply delete index* in your segment folders and use the nutch index command.
> HTH
> Stefan
> On 02.03.2005 at 20:49, sub paul wrote:
>
> > Hi,
> >
> > I was trying to find out how to update my index after I have done the
> > initial intranet crawl.
> >
> > Should I use the same procedure as the whole-web crawl to crawl my list of
> > websites?
> >
> > Regards,
> > Paul
> >
> > _______________________________________________
> > Nutch-general mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/nutch-general
> >
> -----------information technology-------------------
> company: http://www.media-style.com
> forum: http://www.text-mining.org
> blog: http://www.find23.net
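
A minimal sketch of the reindexing step Stefan describes in the quoted reply (delete index* in each segment folder, then run the nutch index command). The helper names, the segments directory layout, and the exact argument form `bin/nutch index <segment>` are assumptions for illustration and should be checked against your Nutch version. The sketch only builds the commands; it does not run Nutch.

```python
import shutil
from pathlib import Path


def clear_segment_indexes(segments_dir):
    """Delete every index* entry directly inside each segment folder.

    Returns the sorted list of segment folders that were visited.
    Hypothetical helper: the one-folder-per-segment layout is an assumption.
    """
    segments = sorted(p for p in Path(segments_dir).iterdir() if p.is_dir())
    for segment in segments:
        for entry in segment.glob("index*"):
            if entry.is_dir():
                shutil.rmtree(entry)   # index directory
            else:
                entry.unlink()         # marker files such as index.done
    return segments


def index_commands(segments, nutch="bin/nutch"):
    """Build one 'nutch index' command per segment (assumed argument form)."""
    return [[nutch, "index", str(segment)] for segment in segments]
```

Each returned command could then be passed to `subprocess.run(cmd, check=True)` from the Nutch installation directory, once the argument form is confirmed for the installed version.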
