-----Original message-----
> From:Matthias Paul <[email protected]>
> Sent: Wed 06-Jun-2012 11:49
> To: [email protected]
> Subject: Re: Linkdb empty
> 
> Both db.ignore.internal.links and db.ignore.external.links are true in my 
> case.
> Since I crawl only one domain, I suppose setting
> db.ignore.external.links to true is a good idea.
> 
> So db.ignore.internal.links should be false?

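Yes. In a single-domain crawl nearly every link is internal, so ignoring 
internal links leaves the linkdb with nothing to store.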

> 
> From what I understand db.ignore.external.links is a setting for the
> crawldb, while db.ignore.internal.links is a setting for the linkdb?

Almost correct. It is not used in the crawldb itself but in the parse job, 
whose output is the input to the crawldb.
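
To illustrate the effect on the linkdb, here is a rough Python sketch. It is 
not Nutch code: the function invert_links and the example.com URLs are made 
up, and it assumes "internal" simply means that source and target share the 
same host.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def invert_links(outlinks, ignore_internal):
    """Build a toy inlink map {target_url: [source_url, ...]}.

    outlinks: dict mapping a source URL to the URLs it links to.
    ignore_internal: stands in for db.ignore.internal.links (assumption:
    a link is "internal" when both URLs have the same host).
    """
    inlinks = defaultdict(list)
    for source, targets in outlinks.items():
        source_host = urlsplit(source).hostname
        for target in targets:
            same_host = urlsplit(target).hostname == source_host
            if ignore_internal and same_host:
                continue  # internal link is dropped, nothing is recorded
            inlinks[target].append(source)
    return dict(inlinks)


# Hypothetical single-domain crawl: every link stays on example.com.
outlinks = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b"],
}

# With the flag on, no inlinks survive; with it off, they are kept.
print(invert_links(outlinks, ignore_internal=True))
print(invert_links(outlinks, ignore_internal=False))
```

With ignore_internal=True the result is empty, which matches what you are 
seeing. With False, each page lists the pages that link to it.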

> 
> 
> 
> On Wed, Jun 6, 2012 at 10:02 AM, Markus Jelsma
> <[email protected]> wrote:
> >
> > -----Original message-----
> >> From:Matthias Paul <[email protected]>
> >> Sent: Wed 06-Jun-2012 09:47
> >> To: [email protected]
> >> Subject: Linkdb empty
> >>
> >> Hi all,
> >
> > hi
> >
> >>
> >> I noticed that my linkdb is always empty although I use the generated
> >> segments from the last crawl for the generation of the linkdb.
> >
> > Check the db.ignore.* settings in your config.
> >
> >> Do I have to keep more segments?
> >> As I use Solr for indexing, I only keep the segments from the last crawl.
> >
> > You can discard the segments if you never do a full reindex.
> >
> >>
> >> Thanks
> >> Matthias
> >>
> 
