All of my config settings sit in nutch-default.xml (nutch-site.xml
doesn't change anything, but I assume this should be fine).

The output in my log file from what you advised below is:

Query->+(url:test^4.0 anchor:test^2.0 content:test title:test^1.5
host:test^2.0)

I'm not too sure why all the carets are appearing?


-----Original Message-----
From: Alvaro Cabrerizo [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 8 February 2007 1:25 AM
To: [email protected]
Subject: Re: n00b question follow up

Hi:

First, check that the query plugins (query-basic, query-more, etc.) appear
in your nutch-site.xml. If everything is OK, you can add a LOG line to
the "search" method of the class org.apache.nutch.searcher.IndexSearcher
to see how the Lucene query is built. If I'm not wrong, you have
to add LOG.info("query -> " + luceneQuery.toString()); at line 99.
The method should then look like this:

public Hits search(Query query...)
...
try {
  org.apache.lucene.search.BooleanQuery luceneQuery =
    this.queryFilters.filter(query);
  LOG.info("query -> " + luceneQuery.toString());
  return ..

Recompile, and make a new query.
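
When reading the resulting log line, note that Lucene prints a clause's boost as a caret followed by a number and omits it when the boost is the default 1.0. The sketch below is only an illustration, assuming a logged string made of field:term^boost clauses; the class BoostedClauses and its method are made up for this example and are not part of Nutch or Lucene.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading a logged query string; not Nutch or Lucene code.
final class BoostedClauses {
    // Matches field:term, optionally followed by ^boost,
    // e.g. "url:test^4.0" or "content:test".
    private static final Pattern CLAUSE =
        Pattern.compile("(\\w+):(\\w+)(?:\\^(\\d+(?:\\.\\d+)?))?");

    // Maps each field to its boost; a clause without a caret gets 1.0.
    static Map<String, Double> boostsByField(String loggedQuery) {
        Map<String, Double> boosts = new LinkedHashMap<>();
        Matcher m = CLAUSE.matcher(loggedQuery);
        while (m.find()) {
            String boost = m.group(3);
            boosts.put(m.group(1), boost == null ? 1.0 : Double.parseDouble(boost));
        }
        return boosts;
    }

    public static void main(String[] args) {
        String logged =
            "+(url:test^4.0 anchor:test^2.0 content:test title:test^1.5 host:test^2.0)";
        // Print one "field -> boost" line per clause, in the order logged.
        boostsByField(logged).forEach((field, boost) ->
            System.out.println(field + " -> " + boost));
    }
}
```

So a clause such as url:test^4.0 simply means that a match in the url field is weighted four times as heavily as an unboosted match.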

Hope it helps.





2007/2/7, Patrick Simon <[EMAIL PROTECTED]>:
>
> Hi All,
>
> This is an older post I made, with more details from the logs, in the
> hope that it will be painfully obvious to someone out there why it's
> not working.
>
> It appears that I have successfully created a Nutch index via the 
> command "nutch/bin :>./nutch crawl ../urls -dir ../crawl.test -depth 5".
>
> I say it is successful because when I use Luke (a Lucene GUI tool that 
> interrogates Lucene indexes) to view the index, a valid index and 
> search results come up.
>
> The directory I point Luke to is
> /home/simonp/nutch-0.8/crawl.test/indexes/part-00000 (the value I give
> for searcher.dir in nutch-default.xml is
> "/home/simonp/nutch-0.8/crawl.test").
>
> The problem is that I cannot see any results via the command 
> "bin/nutch org.apache.nutch.searcher.NutchBean apache" or when I 
> search for the string apache within the nutch servlet.
>
> I don't run any fetching or indexing as the tutorial says not to for 
> simple intranet searching.
>
> I am using Tomcat 5.5 and Nutch 0.8.
>
> Can anybody help with this one, please?
>
> The output from catalina.out is
>
> 2007-02-06 09:01:27,990 INFO  NutchBean - opening indexes in 
> /home/simonp/nutch-8.0/crawl.test/indexes
> 2007-02-06 09:01:28,032 INFO  Configuration - found resource
> common-terms.utf8 at
> file:/usr/local/tomcat/webapps/nutch-0.8/WEB-INF/classes/common-terms.utf8
> 2007-02-06 09:01:28,037 INFO  NutchBean - opening segments in 
> /home/simonp/nutch-8.0/crawl.test/segments
> 2007-02-06 09:01:28,056 INFO  SummarizerFactory - Using the first 
> summarizer extension found: Basic Summarizer
> 2007-02-06 09:01:28,056 INFO  NutchBean - opening linkdb in 
> /home/simonp/nutch-8.0/crawl.test/linkdb
> 2007-02-06 09:01:28,062 INFO  NutchBean - query request from
> 192.168.5.173
> 2007-02-06 09:01:28,072 INFO  NutchBean - query: ubuntu
> 2007-02-06 09:01:28,072 INFO  NutchBean - lang: en
> 2007-02-06 09:01:28,101 INFO  NutchBean - searching for 20 raw hits
> 2007-02-06 09:01:28,142 INFO  NutchBean - total hits: 0
> 2007-02-06 09:01:30,506 INFO  NutchBean - query request from
> 192.168.5.173
> 2007-02-06 09:01:30,506 INFO  NutchBean - query: apache
> 2007-02-06 09:01:30,506 INFO  NutchBean - lang: en
> 2007-02-06 09:01:30,507 INFO  NutchBean - searching for 20 raw hits
> 2007-02-06 09:01:30,507 INFO  NutchBean - total hits: 0
> 2007-02-06 09:01:51,191 INFO  NutchBean - query request from
> 192.168.5.173
> 2007-02-06 09:01:51,191 INFO  NutchBean - query: test
> 2007-02-06 09:01:51,191 INFO  NutchBean - lang: en
> 2007-02-06 09:01:51,193 INFO  NutchBean - searching for 20 raw hits
> 2007-02-06 09:01:51,193 INFO  NutchBean - total hits: 0
> 2007-02-06 10:22:51,068 INFO  NutchBean - query request from
> 192.168.5.173
> 2007-02-06 10:22:51,070 INFO  NutchBean - query: test
> 2007-02-06 10:22:51,070 INFO  NutchBean - lang: en
> 2007-02-06 10:22:51,073 INFO  NutchBean - searching for 20 raw hits
> 2007-02-06 10:22:51,076 INFO  NutchBean - total hits: 0
>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
