If you post your crawldb dump, we can see the structure of your crawldb and may be able to begin pinpointing the issue.
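As a rough way to summarize such a dump before posting it, the sketch below counts records per status. It assumes the dump is the text output of `bin/nutch readdb <crawldb> -dump <dir>` and that each record carries a line of the form `Status: 2 (db_fetched)`; the function names and the example file path are hypothetical. One thing worth checking with it, offered as a possibility rather than a diagnosis, is how many of the 7000+ URLs are actually marked as fetched, since the crawldb also lists URLs that were discovered but never fetched.

```python
import re
from collections import Counter

# Assumption: each crawldb dump record has a line like "Status: 2 (db_fetched)".
STATUS_LINE = re.compile(r"^Status:\s*\d+\s*\((\w+)\)")


def count_statuses(lines):
    """Count crawldb records per status name from an iterable of dump lines."""
    counts = Counter()
    for line in lines:
        match = STATUS_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(path):
    """Read a dump file (hypothetical path) and return its status counts."""
    with open(path, encoding="utf-8", errors="replace") as dump:
        return count_statuses(dump)
```

For example, `summarize("crawldb_dump/part-00000")` would return a `Counter` keyed by status name (the file name here is an assumption about how the dump directory is laid out). `bin/nutch readdb <crawldb> -stats` should report comparable totals without a script.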
You should not need to run another crawl after inverting links for these URLs to be indexed when calling the solrindex command... there must be more to it.

On Tue, Aug 23, 2011 at 6:54 PM, abhayd <[email protected]> wrote:

> hi
> after doing invert link i see the complete link graph...THANKS
>
> I m bit confused, please help me understand..
>
> I do crawl using crawl command. I see around 7000+ urls when i dump
> crawldb.
> Then i do invertlink and i see the complete link graph.
> After this i do solrindex.
>
> After solr indexing is completed i see only 2421 docs. I was expecting
> 7000+ docs (i.e exact number of unique urls which i got from dumping
> crawldb as text)
>
> Why i just see 2421 urls/docs in solr?
> Do i need to execute crawl again after invertlink?
>
> Here are some settings
> --------------------------------------------------------------
> <name>db.update.max.inlinks</name>
> <value>10000</value>
>
> <name>db.ignore.internal.links</name>
> <value>false</value>
>
> <name>db.max.inlinks</name>
> <value>10000</value>
>
> <name>db.max.outlinks.per.page</name>
> <value>-1</value>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/readdblink-not-showing-alllinks-tp3274127p3278779.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>

--
*Lewis*

