Hi Stefan,

I meant: when I want to fetch new pages and add those pages to
the index, how can I do that?

It seems like an intranet crawl using the bin/nutch crawl command is a
one-time deal. You get whatever you want, and if you want to fetch
again and index again (more pages), you start over.

I want to fetch only the pages that are not in the index yet.

Thanks, Stefan, for your help.

Regards,
Paul



On Thu, 3 Mar 2005 11:09:33 +0100, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> Paul,
> I do not understand what you mean.
> When you use the crawl command you should already have an updated index
> in the end.
> If you would like to reindex, maybe because you plan to use more plugins,
> simply delete index* in your segment folders and use the nutch index command.
> HTH
> Stefan
> On 02.03.2005 at 20:49, sub paul wrote:
> 
> > Hi,
> >
> > I was trying to find out how to update my index after I have done the
> > initial intranet crawl.
> >
> > Should I use the same procedure as whole-web crawl to crawl my list of
> > websites?
> >
> > Regards,
> > Paul
> >
> -----------information technology-------------------
> company:     http://www.media-style.com
> forum:           http://www.text-mining.org
> blog:                http://www.find23.net
> 
>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
