Hi, I'm using Nutch 0.9 with Cygwin. I started an iterative generate,
fetch, update process, which had reached a depth of 5 when it hung.
I run the command in the background with nohup. When the fetch process
reaches a site, it freezes (I'm monitoring nohup.out). The hadoop.log
shows Nutch accessing the plugins, and I don't understand what that means.
I'd appreciate any help, since I have spent 5 days on this fetching process.
Also, is there any way of resuming the process, or does someone have
a plugin to do that? When a crawl fails at greater depths, the wasted
time is painful. My log is below. Thanks again.

2007-07-03 19:10:08,728 INFO  fetcher.Fetcher - fetching
http://www.lanparty.com.uy/phpBB2/privmsg.php?folder=inbox&sid=3ffb48875ad340581a2d943a59108063
2007-07-03 19:10:09,269 INFO  fetcher.Fetcher - fetching
http://www.infoteca.com.uy/foro/faq.php?sid=952f0c285d497774edf61305910881df
2007-07-03 19:10:12,714 INFO  fetcher.Fetcher - fetching
http://www.derechocomercial.edu.uy/ClaseContSoc01.htm#_ftnref13
2007-07-03 19:41:25,417 WARN  util.NativeCodeLoader - Unable to load
native-hadoop library for your platform... using builtin-java classes where
applicable
2007-07-03 19:41:25,567 INFO  plugin.PluginRepository - Plugins: looking in:
D:\nutch\nutch-0.9\plugins
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository - Plugin
Auto-activation mode: [true]
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository - Registered Plugins:
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     CyberNeko HTML
Parser (lib-nekohtml)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Site Query
Filter (query-site)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     MSWord Parse
Plug-in (parse-msword)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Html Parse
Plug-in (parse-html)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Regex URL Filter
Framework (lib-regex-filter)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Basic Indexing
Filter (index-basic)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Pdf Parse
Plug-in (parse-pdf)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Jakarta POI -
Java API To Access Microsoft Format Files (lib-jakarta-poi)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Text Parse
Plug-in (parse-text)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Basic Query
Filter (query-basic)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Regex URL Filter
(urlfilter-regex)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     HTTP Framework
(lib-http)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     URL Query Filter
(query-url)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Parse MS
Documents Framework (lib-parsems)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Log4j
(lib-log4j)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Http Protocol
Plug-in (protocol-http)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     More Indexing
Filter (index-more)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     the nutch core
extension points (nutch-extensionpoints)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     More Query
Filter (query-more)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     OPIC Scoring
Plug-in (scoring-opic)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository - Registered
Extension-Points:
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Summarizer
(org.apache.nutch.searcher.Summarizer)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Scoring (
org.apache.nutch.scoring.ScoringFilter)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Protocol (
org.apache.nutch.protocol.Protocol)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch URL
Normalizer (org.apache.nutch.net.URLNormalizer)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch URL Filter
(org.apache.nutch.net.URLFilter)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     HTML Parse
Filter (org.apache.nutch.parse.HtmlParseFilter)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Online
Search Results Clustering Plugin (
org.apache.nutch.clustering.OnlineClusterer)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Indexing
Filter (org.apache.nutch.indexer.IndexingFilter)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Content
Parser (org.apache.nutch.parse.Parser)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Ontology Model
Loader (org.apache.nutch.ontology.Ontology)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Analysis (
org.apache.nutch.analysis.NutchAnalyzer)
2007-07-03 19:41:26,388 INFO  plugin.PluginRepository -     Nutch Query
Filter (org.apache.nutch.searcher.QueryFilter)

The log ends at this point.