JC>> we would like to perform Lucene searches over 
JC>> several Nutch-generated indexes residing on 
JC>> separate servers ... [including] Lucene queries
JC>> such as span, fuzzy, range, etc.

DC> It should not be hard to implement these as Nutch
DC> QueryFilter plugins.  Thus, one could add 
DC> "fuzzy:foo" or "range:foo-bar" to a Nutch query and
DC> the plugin would translate these into appropriate
DC> Lucene clauses and add them to the generated Lucene
DC> query.  Does this make sense?

Ah, yes, thanks!
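
To check my understanding, here is a minimal sketch of the
translation step. It is not the Nutch QueryFilter API: the class
and method names are hypothetical, and instead of building Lucene
clause objects it only rewrites each "prefix:value" term into the
equivalent Lucene query-parser syntax ("foo~" for fuzzy,
"[foo TO bar]" for an inclusive range).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Nutch or Lucene.
class ClauseTranslatorSketch {

    // Rewrite one term into Lucene query-parser syntax.
    // Plain terms and unrecognized prefixes pass through unchanged.
    static String translate(String term) {
        if (term.startsWith("fuzzy:")) {
            return term.substring("fuzzy:".length()) + "~";
        }
        if (term.startsWith("range:")) {
            String body = term.substring("range:".length());
            int dash = body.indexOf('-');
            // Require a non-empty lower and upper bound.
            if (dash > 0 && dash < body.length() - 1) {
                return "[" + body.substring(0, dash)
                        + " TO " + body.substring(dash + 1) + "]";
            }
        }
        return term;
    }

    // Translate every whitespace-separated term of a query string.
    static String translateQuery(String query) {
        List<String> out = new ArrayList<>();
        for (String term : query.trim().split("\\s+")) {
            out.add(translate(term));
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // prints: nutch foo~ [foo TO bar]
        System.out.println(translateQuery("nutch fuzzy:foo range:foo-bar"));
    }
}
```

A real plugin would presumably add clause objects to the generated
Lucene query rather than produce strings; the string form is used
here only to keep the sketch free of library dependencies.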

JC>> Why does net.nutch.searcher.Query implement the 
JC>> Writable interface?

DC> The translation from Nutch query to Lucene query 
DC> happens locally on each search node, so that it can
DC> utilize index-specific information, so we do not 
DC> need to serialize the Lucene query.

Here a 'search node' is a node running
searcher.DistributedSearch.Server, correct?

DC> Nutch uses its own serialization and IPC 
DC> implementations instead of Java's serialization and
DC> RMI for better scalability, reliability and 
DC> performance. ... Nutch's goal is to scale to 
DC> hundreds or thousands of nodes.

If we never expect to need more than a hundred Nutch
(crawler, indexer) nodes, but may want full Lucene
functionality or more, would we perhaps be better
off sticking with Java serialization and RMI?
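
To make the trade-off concrete, here is a rough sketch comparing a
hand-written, Writable-style encoding with default Java
serialization for the same small query object. Everything in it is
an assumption for illustration: SimpleWritable and SketchQuery are
hypothetical stand-ins, not Nutch classes, and the byte counts say
nothing about IPC versus RMI behavior across many nodes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical illustration only; not part of Nutch.
class SerializationSketch {

    // Stand-in for a Writable-style contract: the object writes and
    // reads its own fields, with no class metadata on the wire.
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    static class SketchQuery implements SimpleWritable, Serializable {
        private static final long serialVersionUID = 1L;
        String[] terms = new String[0];

        public void write(DataOutput out) throws IOException {
            out.writeInt(terms.length);          // 4-byte term count
            for (String t : terms) {
                out.writeUTF(t);                 // 2-byte length + bytes
            }
        }

        public void readFields(DataInput in) throws IOException {
            terms = new String[in.readInt()];
            for (int i = 0; i < terms.length; i++) {
                terms[i] = in.readUTF();
            }
        }
    }

    static byte[] toWritableBytes(SketchQuery q) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            q.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static SketchQuery fromWritableBytes(byte[] bytes) {
        try {
            SketchQuery q = new SketchQuery();
            q.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return q;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte[] toJavaSerializedBytes(SketchQuery q) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(q);              // includes class descriptors
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchQuery q = new SketchQuery();
        q.terms = new String[] {"foo", "bar"};
        System.out.println("writable bytes: " + toWritableBytes(q).length);
        System.out.println("java serialized bytes: " + toJavaSerializedBytes(q).length);
    }
}
```

For the two-term query above the hand-written form is 14 bytes
(4 for the count, 5 per term), while the Java-serialized form also
carries a stream header and class descriptors and so is larger.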

Thank you.

Jeremy

_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers