Hi,

How did you set up Nutch: from the source distribution or the binary
distribution? Are you sure that all of the configuration files are
available on the classpath?
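
One quick way to check the classpath question is to ask the class loader directly from the same Eclipse run configuration. The sketch below is only an illustration: the class name ClasspathCheck is made up, and the three file names are the ones normally shipped in Nutch's conf directory, which I am assuming apply to your setup.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Nutch: reports which resources the
// current class loader cannot see.
public class ClasspathCheck {

    // Returns the subset of `resources` that is NOT visible on the classpath.
    public static List<String> findMissing(List<String> resources) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        List<String> missing = new ArrayList<String>();
        for (String name : resources) {
            URL url = loader.getResource(name);
            if (url == null) {
                missing.add(name);
            } else {
                // Show where the resource was actually picked up from.
                System.out.println("found   " + name + " -> " + url);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumed file names: the usual contents of Nutch's conf directory.
        List<String> expected = Arrays.asList(
                "nutch-default.xml", "nutch-site.xml", "regex-urlfilter.txt");
        for (String name : findMissing(expected)) {
            System.out.println("MISSING " + name);
        }
    }
}
```

If any of these are reported as MISSING, the conf directory is probably not on the build path of the Eclipse project. Note that this only checks configuration files; plugin classes are loaded separately from the plugin folder shown in the log.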

On Thu, Nov 3, 2011 at 4:22 PM, Skiming_Zhang <[email protected]> wrote:

> Hello,
>
> I get the following output in “hadoop.log” after configuring Nutch 1.3 in
> Eclipse (Win 7), but I don’t know how to resolve it. Can you help me?
> I’m new to Nutch, so please forgive any mistakes in terminology!
>
> 2011-11-03 16:51:53,300 WARN  crawl.Crawl - solrUrl is not set, indexing will be skipped...
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - crawl started in: crawl
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - rootUrlDir = urls
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - threads = 4
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - depth = 5
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - solrUrl=null
> 2011-11-03 16:51:53,502 INFO  crawl.Crawl - topN = 10
> 2011-11-03 16:51:53,518 INFO  crawl.Injector - Injector: starting at 2011-11-03 16:51:53
> 2011-11-03 16:51:53,518 INFO  crawl.Injector - Injector: crawlDb: crawl/crawldb
> 2011-11-03 16:51:53,518 INFO  crawl.Injector - Injector: urlDir: urls
> 2011-11-03 16:51:53,534 INFO  crawl.Injector - Injector: Converting injected urls to crawl db entries.
> 2011-11-03 16:51:53,658 WARN  mapred.JobClient - No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
> 2011-11-03 16:51:54,267 INFO  plugin.PluginRepository - Plugins: looking in: E:\IdealTimes\WorkSpace\Nutch1.3\plugin
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Plugin Auto-activation mode: [true]
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Registered Plugins:
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Html Parse Plug-in (parse-html)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Basic Indexing Filter (index-basic)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - HTTP Framework (lib-http)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Registered Extension-Points:
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
> 2011-11-03 16:51:54,345 INFO  plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
> 2011-11-03 16:51:54,345 WARN  net.URLNormalizers - URLNormalizers:PluginRuntimeException when initializing url normalizer plugin urlnormalizer-basic instance in getURLNormalizers function: attempting to continue instantiating plugins
> 2011-11-03 16:51:54,360 WARN  net.URLNormalizers - URLNormalizers:PluginRuntimeException when initializing url normalizer plugin urlnormalizer-regex instance in getURLNormalizers function: attempting to continue instantiating plugins
> 2011-11-03 16:51:54,360 WARN  net.URLNormalizers - URLNormalizers:PluginRuntimeException when initializing url normalizer plugin urlnormalizer-pass instance in getURLNormalizers function: attempting to continue instantiating plugins
> 2011-11-03 16:51:54,360 WARN  mapred.LocalJobRunner - job_local_0001
>
> java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 5 more
> Caused by: java.lang.RuntimeException: Error in configuring object
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>     ... 10 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>     ... 13 more
> Caused by: java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.urlfilter.regex.RegexURLFilter
>     at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:77)
>     at org.apache.nutch.crawl.Injector$InjectMapper.configure(Injector.java:72)
>     ... 18 more
> Caused by: org.apache.nutch.plugin.PluginRuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.urlfilter.regex.RegexURLFilter
>     at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:166)
>     at org.apache.nutch.net.URLFilters.<init>(URLFilters.java:57)
>     ... 19 more
> Caused by: java.lang.ClassNotFoundException: org.apache.nutch.urlfilter.regex.RegexURLFilter
>     at java.net.URLClassLoader$1.run(Unknown Source)
>     at java.net.URLClassLoader$1.run(Unknown Source)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(Unknown Source)
>     at java.lang.ClassLoader.loadClass(Unknown Source)
>     at java.lang.ClassLoader.loadClass(Unknown Source)
>     at org.apache.nutch.plugin.Extension.getExtensionInstance(Extension.java:156)
>     ... 20 more
>
> Best wishes!
>
> Skiming_zhang
>



-- 
*Lewis*
