This relates to something I must have only daydreamed about but never 
actually mentioned on solr-dev.
My feeling is that we are moving Solr in the direction of a more general web 
service that can host various NLP and ML components, and no longer only do 
IR/Lucene.  We see that in a few patches that Grant is cooking, and I think 
we'll see it in the Solr+Mahout marriage down the road, and so on.

Is it time to start thinking about Solr as a server for IR, ML, and NLP tasks, 
and to see how the tightly coupled Lucene can be made more... pluggable?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Grant Ingersoll <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Monday, October 20, 2008 7:56:32 PM
> Subject: Must QueryComponent always be on and other Design Questions
> 
> I've run into this a couple of times now and I feel like it warrants a  
> discussion.
> 
> For both the SpellCheckComponent (SCC) and now for the new  
> ClusteringComponent (SOLR-769) I think there are cases where the  
> QueryComponent (QC) is not required.  In the SpellCheckComponent case  
> it is when building the spelling index.  In the ClusteringComponent,  
> it is possible to ask for document clusters without running any query  
> (it will also be possible to get clusters _with_ a query, and this is  
> distinct from the handling of search results clustering).  Thus, it  
> seems really weird to have to pass in a dummy query, yet that is what  
> one has to do in order to avoid getting an NPE in the QC.
> 
> Now, I suppose these pieces could be modeled as something else or it's  
> possible to split the two functionalities into separate things (1  
> ReqHandler, 1 SearchComp).  In fact, the said functionality is not  
> really "search" functionality, or SearchComponent functionality, yet  
> much of the rest of the functionality in the code in question is  
> "search" functionality and logically belongs as a SearchComponent.  In  
> the case of the SCC build, it's akin to an indexing operation.  In the  
> clustering case, it's a query, albeit a non-traditional one.  In some  
> sense, this kind of document clustering is like non-query based  
> faceting which leads to more navigation/browsing instead of searching.
> 
> The quick fix is to just put null checks into the QC or pass in a  
> dummy query with rows=0, but I'm not sure if there isn't a slightly  
> bigger picture here that needs adjusting in terms of  
> SearchComponents.  Namely, must the QC always be on?  And, should we  
> think a little more about components that don't require a query in  
> order to function and how they play in the scheme of things?
> 
> Thoughts?  Recommendations?
> 
> -Grant
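
For illustration only, here is a rough sketch of the "null check" quick fix Grant describes, where a query component steps aside when no query is given so that a later component can still do its work. This is not Solr code: SketchRequest, SketchComponent, SketchQueryComponent, SketchSpellCheckComponent, and ComponentChainSketch are all hypothetical stand-ins, and the chain is assumed to be a simple ordered list. Only the parameter names "q" and "spellcheck.build" are borrowed from Solr.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a request/response holder (not Solr's ResponseBuilder).
class SketchRequest {
    final Map<String, String> params = new HashMap<>();
    final List<String> log = new ArrayList<>();
}

// Hypothetical stand-in for a pluggable component (not Solr's SearchComponent).
interface SketchComponent {
    void process(SketchRequest req);
}

// Query component that tolerates a missing query instead of failing.
class SketchQueryComponent implements SketchComponent {
    @Override
    public void process(SketchRequest req) {
        String q = req.params.get("q");
        if (q == null) {
            // The null check: no query was supplied, so do nothing and
            // let the remaining components in the chain run.
            req.log.add("query: skipped");
            return;
        }
        req.log.add("query: ran " + q);
    }
}

// Component whose build step does not depend on any query having run.
class SketchSpellCheckComponent implements SketchComponent {
    @Override
    public void process(SketchRequest req) {
        if ("true".equals(req.params.get("spellcheck.build"))) {
            req.log.add("spellcheck: built index");
        }
    }
}

class ComponentChainSketch {
    // Runs every component in order and returns what each one reported.
    static List<String> run(Map<String, String> params) {
        SketchRequest req = new SketchRequest();
        req.params.putAll(params);
        List<SketchComponent> chain =
                List.of(new SketchQueryComponent(), new SketchSpellCheckComponent());
        for (SketchComponent component : chain) {
            component.process(req);
        }
        return req.log;
    }

    public static void main(String[] args) {
        // No "q" parameter: the query step is skipped, the build still happens.
        System.out.println(run(Map.of("spellcheck.build", "true")));
        // With "q": the query step runs as usual.
        System.out.println(run(Map.of("q", "solr")));
    }
}
```

The sketch only covers the quick fix. It does not address the bigger question raised above, namely whether the QC needs to be in the chain at all for requests like these.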
