060307 141033 parsing file:/home/hdiwan/nutch-0.7.1/conf/nutch-default.xml
060307 141033 parsing file:/home/hdiwan/nutch-0.7.1/conf/crawl-tool.xml
060307 141033 parsing file:/home/hdiwan/nutch-0.7.1/conf/nutch-site.xml
060307 141033 SEVERE bad conf file: top-level element not <nutch-conf>
060307 141033 No FS indicated, using default:local
060307 141033 crawl started in: ../SpectraSearch/crawl/
060307 141033 rootUrlFile = ../SpectraSearch/urls
060307 141033 threads = 3
060307 141033 depth = 2
060307 141033 Created webdb at LocalFS,/home/hdiwan/SpectraSearch/crawl/db
060307 141033 Starting URL processing
060307 141033 Plugins: looking in: /home/hdiwan/nutch-0.7.1/build/plugins
060307 141033 not including: /home/hdiwan/nutch-0.7.1/build/plugins/protocol-file
060307 141033 not including: /home/hdiwan/nutch-0.7.1/build/plugins/protocol-ftp
060307 141033 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/protocol-http/plugin.xml
060307 141033 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.http.Http
060307 141033 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/protocol-httpclient/plugin.xml
060307 141034 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.httpclient.Http
060307 141034 impl: point=org.apache.nutch.protocol.Protocol class=org.apache.nutch.protocol.httpclient.Http
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/parse-html/plugin.xml
060307 141034 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.html.HtmlParser
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/parse-js/plugin.xml
060307 141034 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.js.JSParseFilter
060307 141034 impl: point=org.apache.nutch.parse.HtmlParseFilter class=org.apache.nutch.parse.js.JSParseFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/parse-text/plugin.xml
060307 141034 impl: point=org.apache.nutch.parse.Parser class=org.apache.nutch.parse.text.TextParser
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/parse-pdf
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/parse-rss
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/parse-msword
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/parse-ext
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/index-basic/plugin.xml
060307 141034 impl: point=org.apache.nutch.indexer.IndexingFilter class=org.apache.nutch.indexer.basic.BasicIndexingFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/index-more/plugin.xml
060307 141034 impl: point=org.apache.nutch.indexer.IndexingFilter class=org.apache.nutch.indexer.more.MoreIndexingFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/query-basic/plugin.xml
060307 141034 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.basic.BasicQueryFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/query-more/plugin.xml
060307 141034 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.more.TypeQueryFilter
060307 141034 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.more.DateQueryFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/query-site/plugin.xml
060307 141034 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.site.SiteQueryFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/query-url/plugin.xml
060307 141034 impl: point=org.apache.nutch.searcher.QueryFilter class=org.apache.nutch.searcher.url.URLQueryFilter
060307 141034 parsing: /home/hdiwan/nutch-0.7.1/build/plugins/urlfilter-regex/plugin.xml
060307 141034 impl: point=org.apache.nutch.net.URLFilter class=org.apache.nutch.net.RegexURLFilter
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/urlfilter-prefix
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/creativecommons
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/language-identifier
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/clustering-carrot2
060307 141034 not including: /home/hdiwan/nutch-0.7.1/build/plugins/ontology
060307 141034 SEVERE org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.protocol.Protocol does not exist.
Exception in thread "main" java.lang.ExceptionInInitializerError
        at org.apache.nutch.db.WebDBInjector.addPage(WebDBInjector.java:437)
        at org.apache.nutch.db.WebDBInjector.injectURLFile(WebDBInjector.java:378)
        at org.apache.nutch.db.WebDBInjector.main(WebDBInjector.java:535)
        at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:134)
Caused by: java.lang.RuntimeException: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.protocol.Protocol does not exist.
        at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:147)
        at org.apache.nutch.net.URLFilters.<clinit>(URLFilters.java:40)
        ... 4 more
Caused by: org.apache.nutch.plugin.PluginRuntimeException: extension point: org.apache.nutch.protocol.Protocol does not exist.
        at org.apache.nutch.plugin.PluginRepository.installExtensions(PluginRepository.java:78)
        at org.apache.nutch.plugin.PluginRepository.<init>(PluginRepository.java:61)
        at org.apache.nutch.plugin.PluginRepository.getInstance(PluginRepository.java:144)
        ... 5 more

That's from my log. A preliminary investigation follows, with steps and
results pasted:

1. Check the nutch-0.7.1 jar file for the relevant class:

% jar tvf ./nutch-0.7.1.jar | grep Protocol.class

server: 2:14pm % jar tvf ./nutch-0.7.1.jar | grep Protocol.class
   756 Tue Mar 07 13:17:04 PST 2006 org/apache/nutch/mapReduce/InterTrackerProtocol.class
   491 Tue Mar 07 13:17:04 PST 2006 org/apache/nutch/mapReduce/JobSubmissionProtocol.class
   324 Tue Mar 07 13:17:04 PST 2006 org/apache/nutch/mapReduce/MapOutputProtocol.class
   409 Tue Mar 07 13:17:04 PST 2006 org/apache/nutch/mapReduce/TaskUmbilicalProtocol.class
   517 Tue Mar 07 13:17:04 PST 2006 org/apache/nutch/protocol/Protocol.class
   469 Tue Mar 07 13:17:04 PST 2006 org/apache/nutch/searcher/DistributedSearch$Protocol.class

So it indeed exists.
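
The grep pattern above is an unanchored regex (the dot matches any character), which is why the other *Protocol classes show up too. A minimal sketch of an exact-match check, assuming only that `jar tf` prints one entry path per line; `has_entry` is a hypothetical helper name, not part of the JDK or Nutch:

```shell
# Sketch: exact-match lookup of one entry in a jar listing read from stdin.
# has_entry is a hypothetical helper, not part of Nutch or the JDK.
has_entry() {
    # -F: fixed string, -x: match the whole line, -q: exit status only
    grep -Fxq -- "$1"
}

# Usage against the jar from step 1:
#   jar tf ./nutch-0.7.1.jar \
#     | has_entry org/apache/nutch/protocol/Protocol.class && echo found
```

The helper exits 0 only when the listing contains exactly that path, so a class such as MapOutputProtocol.class cannot produce a false positive.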

2. Perhaps the class is missing from the source tree, so check there:

find ./src/java -name 'Protocol.java' -print

server: 2:14pm % find ./src -name 'Protocol.java' -print        [~/nutch-0.7.1]
./src/java/org/apache/nutch/protocol/Protocol.java
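
Since the error names an extension point rather than a class, one further check in the same spirit would be to look for a plugin.xml that declares it. This is only a sketch under an assumption: that a declaration carries the point's name in an `id="..."` attribute inside a plugin.xml under the plugins directory the log mentions. `find_extension_point` is a hypothetical helper name.

```shell
# Sketch: list plugin.xml files under a plugin directory that declare a given
# extension point. find_extension_point is a hypothetical helper; matching on
# id="<point>" is an assumption about how plugin.xml declares a point.
find_extension_point() {
    plugin_dir=$1
    point_id=$2
    # grep -l prints only the names of matching files; -F treats the
    # pattern as a fixed string, so the dots in the id are literal.
    find "$plugin_dir" -name plugin.xml -exec \
        grep -lF "id=\"$point_id\"" {} +
}

# Usage with the paths from the log:
#   find_extension_point ~/nutch-0.7.1/build/plugins \
#       org.apache.nutch.protocol.Protocol
```

No output would mean that no plugin.xml under that directory declares the point under this assumption. Lines of the form point="..." (the implementations) are deliberately not matched.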

Now I'm stumped... Help!

--
Cheers,
Hasan Diwan <[EMAIL PROTECTED]>
