Don't delete the crawldb; that's pointless. You can either delete the whole segment or remove everything in it except crawl_generate and try again. You should delete the segment if you've since successfully crawled another segment, because that newer segment will contain the same URLs.
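The "remove all but crawl_generate" option above can be sketched as a small shell snippet. The segment path here is hypothetical, and the demo runs against a scratch directory rather than a real Nutch crawl; in practice you would point `SEGMENT` at the failed segment under your crawl directory and then re-run `bin/nutch fetch` on it.

```shell
# Hedged sketch: retry a failed fetch by stripping a segment down to its
# crawl_generate directory. The segment name below is a made-up example;
# we build a throwaway copy under mktemp to demonstrate the cleanup safely.
SEGMENT="$(mktemp -d)/20120908103000"
mkdir -p "$SEGMENT"/crawl_generate "$SEGMENT"/crawl_fetch "$SEGMENT"/crawl_parse \
         "$SEGMENT"/content "$SEGMENT"/parse_data "$SEGMENT"/parse_text

# Keep only the generate list so the fetch step can be re-run on this segment.
for d in crawl_fetch crawl_parse content parse_data parse_text; do
  rm -rf "$SEGMENT/$d"
done

ls "$SEGMENT"   # → crawl_generate
```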
-----Original message-----
> From: Alaak <al...@gmx.de>
> Sent: Sat 08-Sep-2012 10:43
> To: user@nutch.apache.org
> Cc: Markus Jelsma <markus.jel...@openindex.io>
> Subject: Re: Keeping an externally created field in solr.
>
> Hi,
>
> OK, thanks. Then I guess I will follow your last proposal and read the
> value from the Solr index if the URL is already there.
>
> On Sat 08 Sep 2012 00:11:41 CEST, Markus Jelsma wrote:
>
> > No, but you could modify the indexer to do so. Or make use of Solr's
> > new capability of updating specific fields. You could also modify
> > that indexer plugin to fetch the value for that field from some source
> > you have prior to indexing. I think the latter is the easiest to make,
> > but it only works for fields specifically set by Nutch.
> >
> > -----Original message-----
> >> From: Alaak <al...@gmx.de>
> >> Sent: Sat 08-Sep-2012 00:08
> >> To: user@nutch.apache.org
> >> Subject: Keeping an externally created field in solr.
> >>
> >> Hi,
> >>
> >> I have an external program which changes a field for some websites
> >> within my Solr index. Nutch sets this field to a default value using a
> >> plugin when indexing a page. My problem is that Nutch also resets the
> >> field for already indexed pages when it updates those pages. Is there
> >> any way to tell Nutch not to touch that field if it already exists in
> >> the Solr index?
> >>
> >> Thanks and regards
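The "updating specific fields" capability mentioned in the thread refers to Solr atomic updates (introduced in Solr 4.0, which was current at the time of this thread). A minimal sketch of building such an update payload follows; the field name `myfield` and the document id are hypothetical, and note that atomic updates require the other fields in the schema to be stored so Solr can reconstruct the document server-side.

```python
import json

def atomic_set(doc_id, field, value):
    """Build a Solr atomic-update JSON body that sets only one field.

    The {"set": ...} modifier tells Solr to overwrite this field and keep
    the rest of the stored document intact. POST the result to /solr/update
    with Content-Type: application/json.
    """
    return json.dumps([{"id": doc_id, field: {"set": value}}])

# Hypothetical document id and field name, for illustration only.
payload = atomic_set("http://example.com/page", "myfield", "externally-set-value")
print(payload)
```

Because only the named field carries a modifier, an external program can maintain its own field in the index without clobbering (or being clobbered by) the fields Nutch writes on a full re-index, which is the closest built-in answer to the question asked above.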