Hi,

<value>/opt/nutch-0.9/crawl/segments</value>

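Try pointing searcher.dir at just your crawl directory (/opt/nutch-0.9/crawl).
Your log shows NutchBean looking for indexes in
/opt/nutch-0.9/crawl/segments/indexes, which is one level too deep.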
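
A sketch of the corrected property in the webapp's nutch-site.xml, assuming
the crawl output really does live in /opt/nutch-0.9/crawl (with indexes,
segments, and linkdb directly underneath it):

```xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
       <name>searcher.dir</name>
       <!-- Assumed location: the top-level crawl directory, not segments -->
       <value>/opt/nutch-0.9/crawl</value>
</property>
</configuration>
```

You will probably need to restart Tomcat for the change to take effect.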

Jason

On Fri, Jun 13, 2008 at 12:32 PM, nutch_newbie <[EMAIL PROTECTED]> wrote:
>
> I apologize in advance for the lengthy e-mail, but I tried to provide as much
> info as I could...
> Everything, including the crawler, works fine right up until I go to my
> localhost, enter something, and click search. It always says "Hits 0-0
> (out of about 0 total matching pages):". I have already spent about 18 hours
> total reading everything related to Nutch and trying to find my mistake, but
> no luck. (I spent almost a week before that trying to set it up with the
> tutorials on the Nutch wiki.) I have to have Nutch working by Sunday afternoon,
> and right now I'm very stressed out about it. So if anyone would please help,
> I would be very, very thankful.
>
>
> OK, my computer runs Fedora Core 5, and I have jdk1.6.0_06 and
> apache-tomcat-5.5.16, and I am currently trying to get Nutch 0.9 running (after
> having been extremely unsuccessful with versions 0.7 and 0.8).
>
> Here is my nutch-site.xml, from
> /opt/apache-tomcat-5.5.16/webapps/nutch-0.9/WEB-INF/classes
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
> <property>
>        <name>searcher.dir</name>
>        <value>/opt/nutch-0.9/crawl/segments</value>
> </property>
> </configuration>
>
> Here is my nutch-site.xml from /opt/nutch-0.9/conf
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <!-- Put site-specific property overrides in this file. -->
>
> <configuration>
>        <property>
>                <name>http.agent.name</name>
>                <value>User</value>
>                <description>User</description>
>        </property>
>
>        <property>
>                <name>http.agent.description</name>
>                <value>Nutch spiderman</value>
>                <description>Nutch spiderman</description>
>        </property>
> </configuration>
>
>
>
> Here is my shoppinglist.txt from /opt/nutch-0.9/urls
>
> http://www.google.com/
>
> Here is the part of crawl-urlfilter.txt from /opt/nutch-0.9/conf that I
> modified.
>
> # accept hosts in MY.DOMAIN.NAME
> +^http://([a-z0-9]*\.)*google.com
>
> And here is a recent piece from my catalina.out
>
>
> Jun 13, 2008 2:13:49 PM org.apache.catalina.startup.Catalina start
> INFO: Server startup in 1376 ms
> 2008-06-13 14:14:09,617 INFO  PluginRepository - Plugins: looking in:
> /opt/apache-tomcat-5.5.16/webapps/nutch-0.9/WEB-INF/classes/plugins
> 2008-06-13 14:14:09,762 INFO  PluginRepository - Plugin Auto-activation
> mode: [true]
> 2008-06-13 14:14:09,762 INFO  PluginRepository - Registered Plugins:
> 2008-06-13 14:14:09,762 INFO  PluginRepository -        the nutch core extension
> points (nutch-extensionpoints)
> 2008-06-13 14:14:09,762 INFO  PluginRepository -        Basic Query Filter
> (query-basic)
> 2008-06-13 14:14:09,762 INFO  PluginRepository -        Basic URL Normalizer
> (urlnormalizer-basic)
> 2008-06-13 14:14:09,762 INFO  PluginRepository -        Html Parse Plug-in
> (parse-html)
> 2008-06-13 14:14:09,762 INFO  PluginRepository -        Basic Indexing Filter
> (index-basic)
> 2008-06-13 14:14:09,762 INFO  PluginRepository -        Basic Summarizer Plug-in
> (summary-basic)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        Site Query Filter
> (query-site)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        HTTP Framework 
> (lib-http)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        Text Parse Plug-in
> (parse-text)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        Regex URL Filter
> (urlfilter-regex)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        Pass-through URL
> Normalizer (urlnormalizer-pass)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        Http Protocol Plug-in
> (protocol-http)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        Regex URL Normalizer
> (urlnormalizer-regex)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        OPIC Scoring Plug-in
> (scoring-opic)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        CyberNeko HTML Parser
> (lib-nekohtml)
> 2008-06-13 14:14:09,763 INFO  PluginRepository -        JavaScript Parser
> (parse-js)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        URL Query Filter
> (query-url)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Regex URL Filter Framework
> (lib-regex-filter)
> 2008-06-13 14:14:09,764 INFO  PluginRepository - Registered
> Extension-Points:
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch Summarizer
> (org.apache.nutch.searcher.Summarizer)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch URL Normalizer
> (org.apache.nutch.net.URLNormalizer)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch Protocol
> (org.apache.nutch.protocol.Protocol)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch Analysis
> (org.apache.nutch.analysis.NutchAnalyzer)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch URL Filter
> (org.apache.nutch.net.URLFilter)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch Indexing Filter
> (org.apache.nutch.indexer.IndexingFilter)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        Nutch Online Search
> Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
> 2008-06-13 14:14:09,764 INFO  PluginRepository -        HTML Parse Filter
> (org.apache.nutch.parse.HtmlParseFilter)
> 2008-06-13 14:14:09,765 INFO  PluginRepository -        Nutch Content Parser
> (org.apache.nutch.parse.Parser)
> 2008-06-13 14:14:09,765 INFO  PluginRepository -        Nutch Scoring
> (org.apache.nutch.scoring.ScoringFilter)
> 2008-06-13 14:14:09,765 INFO  PluginRepository -        Nutch Query Filter
> (org.apache.nutch.searcher.QueryFilter)
> 2008-06-13 14:14:09,765 INFO  PluginRepository -        Ontology Model Loader
> (org.apache.nutch.ontology.Ontology)
> 2008-06-13 14:14:09,774 INFO  NutchBean - creating new bean
> 2008-06-13 14:14:09,791 INFO  NutchBean - opening indexes in
> /opt/nutch-0.9/crawl/segments/indexes
> 2008-06-13 14:14:09,833 INFO  Configuration - found resource
> common-terms.utf8 at
> file:/opt/apache-tomcat-5.5.16/webapps/nutch-0.9/WEB-INF/classes/common-terms.utf8
> 2008-06-13 14:14:09,839 INFO  NutchBean - opening segments in
> /opt/nutch-0.9/crawl/segments/segments
> 2008-06-13 14:14:09,851 INFO  SummarizerFactory - Using the first summarizer
> extension found: Basic Summarizer
> 2008-06-13 14:14:09,851 INFO  NutchBean - opening linkdb in
> /opt/nutch-0.9/crawl/segments/linkdb
> 2008-06-13 14:14:09,858 INFO  NutchBean - query request from 127.0.0.1
> 2008-06-13 14:14:09,870 INFO  NutchBean - query: bananas
> 2008-06-13 14:14:09,871 INFO  NutchBean - lang: en
> 2008-06-13 14:14:09,901 INFO  NutchBean - searching for 20 raw hits
> 2008-06-13 14:14:09,942 INFO  NutchBean - total hits: 0
> 2008-06-13 14:15:16,336 INFO  NutchBean - query request from 127.0.0.1
> 2008-06-13 14:15:16,337 INFO  NutchBean - query: HTTP Status 500 root cause
> 2008-06-13 14:15:16,338 INFO  NutchBean - lang: en
> 2008-06-13 14:15:16,341 INFO  NutchBean - searching for 20 raw hits
> 2008-06-13 14:15:16,353 INFO  NutchBean - total hits: 0
>
>
>
>
