I mean the directories like this:

crawl-20110920160208
crawl-20110920211805
etc ...




On Wed, Oct 5, 2011 at 11:08, Markus Jelsma <[email protected]> wrote:

> "crawls" or segment directories? You can delete old segment files if all
> files are fetched in newer segments, that is, segments older than 30 days
> if your crawl can keep up with the limit.
>
> On Wednesday 05 October 2011 16:57:52 Fred Zimmerman wrote:
> > hi,
> >
> > I have a bunch of test crawls that I have carried out in the past sitting
> > around.  most of them are indexed by solr configured per nutch-config to
> > run again in 30 days.  these old crawls are a subset of (and redundant to)
> > my current "master" crawl. How should I get rid of these old crawls so
> > that Nutch doesn't run them again and they are no longer cluttering up my
> > directories? Also, are they all creating duplicate entries in the solr
> > index?
> >
> > Fred
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
>
