I just did, and I can confirm that index-basic has no bearing on the crawl db. Here's a piece of log output from the injector and the crawl db reader. Only two plugins are registered, protocol-http and lib-http. After injection the crawldb has one entry, which is the same URL as in my seed list.

2011-10-14 15:30:03,683 INFO crawl.Injector - Injector: starting at 2011-10-14 15:30:03
2011-10-14 15:30:03,684 INFO crawl.Injector - Injector: crawlDb: crawl/crawldb
2011-10-14 15:30:03,684 INFO crawl.Injector - Injector: urlDir: urls
2011-10-14 15:30:03,684 INFO crawl.Injector - Injector: Converting injected urls to crawl db entries.
2011-10-14 15:30:04,041 INFO plugin.PluginRepository - Plugins: looking in: /home/markus/projects/apache/nutch/trunk/runtime/local/plugins
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Registered Plugins:
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Registered Extension-Points:
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2011-10-14 15:30:04,131 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2011-10-14 15:30:04,132 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2011-10-14 15:30:04,132 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2011-10-14 15:30:04,132 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2011-10-14 15:30:04,946 INFO crawl.Injector - Injector: Merging injected urls into crawl db.
2011-10-14 15:30:05,160 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2011-10-14 15:30:06,104 INFO crawl.Injector - Injector: finished at 2011-10-14 15:30:06, elapsed: 00:00:02
2011-10-14 15:30:08,727 INFO crawl.CrawlDbReader - CrawlDb statistics start: crawl/crawldb/
2011-10-14 15:30:08,836 WARN mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - Statistics for CrawlDb: crawl/crawldb/
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - TOTAL urls: 1
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - retry 0: 1
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - min score: 1.0
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - avg score: 1.0
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - max score: 1.0
2011-10-14 15:30:10,052 INFO crawl.CrawlDbReader - status 1 (db_unfetched): 1
2011-10-14 15:30:10,053 INFO crawl.CrawlDbReader - CrawlDb statistics: done

On Friday 14 October 2011 15:23:00 Radim Kolar wrote:
> try it yourself. in 1.4 remove index-basic from list of included
> plugins, then run nutch inject in hadoop mode and you will get 0 rows on
> first map output.

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350

