Both db.ignore.internal.links and db.ignore.external.links are true in my case. Since I crawl only one domain, I suppose setting db.ignore.external.links to true is a good idea.
So db.ignore.internal.links should be false?
From what I understand, db.ignore.external.links is a setting for the crawldb, while db.ignore.internal.links is a setting for the linkdb?

On Wed, Jun 6, 2012 at 10:02 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
> -----Original message-----
>> From:Matthias Paul <magethle.nu...@gmail.com>
>> Sent: Wed 06-Jun-2012 09:47
>> To: user@nutch.apache.org
>> Subject: Linkdb empty
>>
>> Hi all,
>
> hi
>
>>
>> I noticed that my linkdb is always empty although I use the generated
>> segments from the last crawl for the generation of the linkdb.
>
> Check the db.ignore.* settings in your config.
>
>> Do I have to keep more segments?
>> As I use Solr for indexing, I only keep the segments from the last crawl.
>
> You can discard the segments if you never do a full reindex.
>
>>
>> Thanks
>> Matthias
>>
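
For reference, the change asked about at the top of this message could be sketched as an override in conf/nutch-site.xml. This is only an illustration under an assumption: that the empty linkdb comes from internal links being ignored while every link in a single-domain crawl is internal. The two property names are the ones discussed in this thread; the values are the ones being proposed, not a confirmed fix.

```xml
<!-- Sketch of an override in conf/nutch-site.xml.
     Assumption: the linkdb is empty because internal links are ignored. -->
<configuration>
  <property>
    <name>db.ignore.internal.links</name>
    <!-- proposed value: keep links between pages of the same host -->
    <value>false</value>
  </property>
  <property>
    <name>db.ignore.external.links</name>
    <!-- unchanged: links to other hosts are still dropped -->
    <value>true</value>
  </property>
</configuration>
```

If this is the cause, the linkdb would presumably have to be regenerated from the kept segments (the invertlinks step) before any inlinks show up.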