I mean directories like these: crawl-20110920160208, crawl-20110920211805, etc.

On Wed, Oct 5, 2011 at 11:08, Markus Jelsma <[email protected]> wrote:

> "crawls" or segment directories? You can delete old segment files if all
> files are fetched in newer segments, that is, older than 30 days if your
> crawl can keep up with the limit.
>
> On Wednesday 05 October 2011 16:57:52 Fred Zimmerman wrote:
> > hi,
> >
> > I have a bunch of test crawls that I have carried out in the past sitting
> > around. most of them are indexed by solr configured per nutch-config to
> > run again in 30 days. these old crawls are a subset of (and redundant to)
> > my current "master" crawl. How should I get rid of these old crawls so
> > that Nutch doesn't run them again and they are no longer cluttering up my
> > directories? Also, are they all creating duplicate entries in the solr
> > index?
> >
> > Fred
>
> --
> Markus Jelsma - CTO - Openindex
> http://www.linkedin.com/in/markus17
> 050-8536620 / 06-50258350
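
For what it's worth, here is a rough sketch of how the 30-day rule mentioned above could be applied to directory names like the ones I listed. It assumes the digits after "crawl-" are a YYYYMMDDHHMMSS timestamp (that is only inferred from the two example names), and the function name stale_crawls is made up for illustration. It only reports candidates; it does not delete anything and does not touch the solr index.

```python
import re
from datetime import datetime, timedelta

# Assumption: directory names look like crawl-YYYYMMDDHHMMSS,
# as in crawl-20110920160208. Anything else is ignored.
CRAWL_DIR = re.compile(r"crawl-(\d{14})")


def stale_crawls(names, now, max_age_days=30):
    """Return the crawl directory names whose timestamp is older than the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for name in names:
        match = CRAWL_DIR.fullmatch(name)
        if match is None:
            continue  # not a timestamped crawl directory, leave it alone
        started = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
        if started < cutoff:
            stale.append(name)
    return sorted(stale)
```

One could feed it the output of os.listdir() on the parent directory and review the returned list by hand before removing anything.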

