Hi, I have set the http.verbose and fetcher.verbose properties to true, and
this is a fragment of the logs:

2011-05-30 11:21:20,124 INFO  crawl.Injector - Injector: starting at
2011-05-30 11:21:20
2011-05-30 11:21:20,124 INFO  crawl.Injector - Injector: crawlDb:
crawl/crawldb
2011-05-30 11:21:20,124 INFO  crawl.Injector - Injector: urlDir: urls
2011-05-30 11:21:20,125 INFO  crawl.Injector - Injector: Converting injected
urls to crawl db entries.
2011-05-30 11:21:20,968 INFO  plugin.PluginRepository - Plugins: looking in:
/usr/local/inp/apache-nutch-1.2/plugins
2011-05-30 11:21:21,095 INFO  plugin.PluginRepository - Plugin
Auto-activation mode: [true]
2011-05-30 11:21:21,095 INFO  plugin.PluginRepository - Registered Plugins:
(list of plugins)

2011-05-30 11:21:21,095 INFO  plugin.PluginRepository - Registered
Extension-Points:
(list of points)

2011-05-30 11:21:21,128 WARN  regex.RegexURLNormalizer - can't find rules
for scope 'inject', using default
2011-05-30 11:21:21,819 INFO  crawl.Injector - Injector: Merging injected
urls into crawl db.
2011-05-30 11:21:23,383 INFO  crawl.Injector - Injector: finished at
2011-05-30 11:21:23, elapsed: 00:00:03
2011-05-30 11:21:24,008 INFO  crawl.Generator - Generator: starting at
2011-05-30 11:21:24
2011-05-30 11:21:24,009 INFO  crawl.Generator - Generator: Selecting
best-scoring urls due for fetch.
2011-05-30 11:21:24,009 INFO  crawl.Generator - Generator: filtering: true
2011-05-30 11:21:24,009 INFO  crawl.Generator - Generator: normalizing: true
2011-05-30 11:21:24,009 INFO  crawl.Generator - Generator: topN: 500
2011-05-30 11:21:24,010 INFO  crawl.Generator - Generator: jobtracker is
'local', generating exactly one partition.
2011-05-30 11:21:24,746 INFO  plugin.PluginRepository - Plugins: looking in:
/usr/local/inp/apache-nutch-1.2/plugins
2011-05-30 11:21:24,872 INFO  plugin.PluginRepository - Plugin
Auto-activation mode: [true]
2011-05-30 11:21:24,872 INFO  plugin.PluginRepository - Registered Plugins:
(list of plugins)

2011-05-30 11:21:24,873 INFO  plugin.PluginRepository - Registered
Extension-Points:
(list of points)

2011-05-30 11:21:24,900 INFO  crawl.FetchScheduleFactory - Using
FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2011-05-30 11:21:24,900 INFO  crawl.AbstractFetchSchedule -
defaultInterval=2592000
2011-05-30 11:21:24,900 INFO  crawl.AbstractFetchSchedule -
maxInterval=7776000
2011-05-30 11:21:25,042 INFO  plugin.PluginRepository - Plugins: looking in:
/usr/local/inp/apache-nutch-1.2/plugins
2011-05-30 11:21:25,134 INFO  plugin.PluginRepository - Plugin
Auto-activation mode: [true]
2011-05-30 11:21:25,134 INFO  plugin.PluginRepository - Registered Plugins:
(list of plugins)

2011-05-30 11:21:25,134 INFO  plugin.PluginRepository - Registered
Extension-Points:
(list of points)

2011-05-30 11:21:25,148 INFO  crawl.FetchScheduleFactory - Using
FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
2011-05-30 11:21:25,148 INFO  crawl.AbstractFetchSchedule -
defaultInterval=2592000
2011-05-30 11:21:25,149 INFO  crawl.AbstractFetchSchedule -
maxInterval=7776000
2011-05-30 11:21:25,589 WARN  crawl.Generator - Generator: 0 records
selected for fetching, exiting ...


I started a new crawl and, despite having 5 domains in the "seed file", Nutch
did not select any URLs for fetching.
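
For reference, the two verbose properties mentioned at the top would look
roughly like this. This is only a sketch: it assumes the overrides live in
conf/nutch-site.xml, which is the usual place, but the same properties could
equally have been edited elsewhere.

```xml
<!-- conf/nutch-site.xml (assumed location of the overrides) -->
<configuration>
  <!-- Verbose logging for the HTTP protocol plugin -->
  <property>
    <name>http.verbose</name>
    <value>true</value>
  </property>
  <!-- Verbose logging for the fetcher -->
  <property>
    <name>fetcher.verbose</name>
    <value>true</value>
  </property>
</configuration>
```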

-----
Regards,
Jotta

PS. Sorry for my English :)
