I managed to track down the problem. The property indexer.max.tokens in nutch-default.xml was causing the top-level pages to be skipped. After raising the value to something like 30000, the crawler picked up all the pages up to the configured depth.
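For anyone hitting the same thing, here is a rough sketch of the change. It assumes the override goes in conf/nutch-site.xml, which is the usual place for local settings, instead of editing nutch-default.xml directly. The value 30000 is only the one that worked in this case, not a recommended setting.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml (assumed location for local overrides) -->
<configuration>
  <property>
    <name>indexer.max.tokens</name>
    <!-- 30000 is the example value from this thread; tune it for your pages -->
    <value>30000</value>
  </property>
</configuration>
```

The crawl most likely has to be re-run after the change for the new limit to take effect.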
muraliweb wrote:
>
> Nutch crawl does not pick up pages at depth 1 and 2 when it's configured
> for depth 3.
> When the crawl is configured at depth 2 it does not pick up the homepage.
> Can anyone please help?
> Thanks in advance,
> murali
>

--
View this message in context: http://www.nabble.com/Nutch-crawl-does-not-capture-pages-of-lower-depth-tp25084017p25271774.html
Sent from the Nutch - User mailing list archive at Nabble.com.
