Re: BM25 & field-configurable similarity ?

Chris Hostetter Mon, 10 Mar 2008 17:44:24 -0700

: Does it make (any) sense to try implementing this within Solr or should I
: just forget about this ?
: As a more general note, does it make sense to try to use Solr as a
: "research" playground for similarities instead of Lucene? Or is this the
: "wrong" level (aka Lucene being a better one)?


If I were going to sit down and really research alternate Similarity
systems, I would use Lucene directly.  Solr adds a lot of nice features
and abstractions, but for experiments like this, those features and
abstractions can get in the way.  In addition, the benchmarking contrib
in Lucene is designed to make it really easy to run lots of repeatable
tests changing small variables.  I believe Grant already did some work
to support evaluating "quality" metrics, so you just have to decide what
"good" is and then you can run lots of tests where you change lots of
variables in your custom similarity to see which combination of
variables gets you the closest to "good".
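
To make the "change variables, measure, repeat" loop concrete, here is a
minimal sketch in plain Java.  It does not use the Lucene benchmark contrib
or any Lucene API: Bm25Sweep, Quality and sweep are hypothetical names
invented for illustration, the scoring function is the textbook BM25 term
weight with its usual k1 and b knobs, and the "quality" function in main is
a toy stand-in for whatever metric you decide "good" means.

```java
/** Hypothetical sketch: grid-search the BM25 knobs k1 and b against a quality metric. */
final class Bm25Sweep {

    /** Textbook BM25 weight of one term in one document. */
    static double bm25(double tf, double docLen, double avgDocLen,
                       long docFreq, long numDocs, double k1, double b) {
        double idf = Math.log(1.0 + (numDocs - docFreq + 0.5) / (docFreq + 0.5));
        // Length normalization: b = 0 ignores document length, b = 1 applies it fully.
        double norm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return idf * (tf * (k1 + 1.0)) / (tf + norm);
    }

    /** Assumption: your own definition of "good"; higher means better. */
    interface Quality {
        double evaluate(double k1, double b);
    }

    /** Tries every (k1, b) pair on the grid and returns {k1, b, quality} of the best one. */
    static double[] sweep(double[] k1Values, double[] bValues, Quality quality) {
        double[] best = {Double.NaN, Double.NaN, Double.NEGATIVE_INFINITY};
        for (double k1 : k1Values) {
            for (double b : bValues) {
                double q = quality.evaluate(k1, b);
                if (q > best[2]) {
                    best = new double[] {k1, b, q};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy metric: score margin between a short relevant document and a
        // long non-relevant one, in a made-up collection of 1000 documents.
        Quality margin = (k1, b) ->
                bm25(3, 50, 120, 40, 1000, k1, b) - bm25(5, 400, 120, 40, 1000, k1, b);

        double[] best = sweep(new double[] {0.5, 1.2, 2.0},
                              new double[] {0.0, 0.5, 0.75, 1.0}, margin);
        System.out.println("best k1=" + best[0] + " b=" + best[1] + " quality=" + best[2]);
    }
}
```

In a real experiment the Quality implementation would run your test queries
against an index and compute the metric from relevance judgments; the loop
itself stays the same.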

If/when you've got a custom similarity and you know which knobs need to
be turned on that similarity each time the index changes, that's when I'd
start to ask "how can I use this in Solr and get Solr to turn those knobs
for me."

The most straightforward way I can think of to make an "adaptive"
Similarity class that works with Solr would be if that Similarity class
shipped with a RequestHandler that knows about it and its knobs ... you
register a single query against that request handler in newSearcher and
firstSearcher event listeners, and that request handler can generate
whatever stats it wants and call whatever methods it wants on
((YourSimilarityClass)searcher.getSimilarity())


hmmm... we might need to change schema.getSimilarity() to 
schema.getSimilarity().clone() in SolrIndexSearcher for that to work 
safely, but hopefully you get the idea.
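
A rough sketch of that idea, in plain Java so it runs without Solr.  Every
type here is a hypothetical stand-in, not a Solr or Lucene class:
TunableSimilarity plays the custom Similarity with a single knob, Searcher
plays a searcher over one snapshot of the index, and tune() plays the
request handler that the newSearcher/firstSearcher listeners would fire.
The clone() call in the Searcher constructor illustrates why a per-searcher
copy matters: without it, tuning for a new searcher would also change the
similarity used by the old one.

```java
/** Hypothetical sketch of an "adaptive" similarity tuned once per new searcher. */
final class AdaptiveSimilaritySketch {

    /** Stand-in for a custom Similarity with one knob. */
    static class TunableSimilarity implements Cloneable {
        private double avgFieldLength = 1.0;

        void setAvgFieldLength(double value) { avgFieldLength = value; }
        double getAvgFieldLength() { return avgFieldLength; }

        @Override
        public TunableSimilarity clone() {
            try {
                return (TunableSimilarity) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: class is Cloneable
            }
        }
    }

    /** Stand-in for a searcher over one snapshot of the index. */
    static class Searcher {
        final int[] fieldLengths;           // toy "index": one field length per document
        final TunableSimilarity similarity; // private copy, safe to mutate

        Searcher(int[] fieldLengths, TunableSimilarity schemaSimilarity) {
            this.fieldLengths = fieldLengths;
            this.similarity = schemaSimilarity.clone();
        }
    }

    /** Stand-in for the request handler: computes stats and turns the knob. */
    static void tune(Searcher searcher) {
        long total = 0;
        for (int len : searcher.fieldLengths) {
            total += len;
        }
        int docs = searcher.fieldLengths.length;
        searcher.similarity.setAvgFieldLength(docs == 0 ? 1.0 : (double) total / docs);
    }

    public static void main(String[] args) {
        TunableSimilarity schemaSimilarity = new TunableSimilarity();

        Searcher first = new Searcher(new int[] {10, 20, 30}, schemaSimilarity);
        tune(first);   // what a firstSearcher listener would trigger

        Searcher next = new Searcher(new int[] {100, 300}, schemaSimilarity);
        tune(next);    // what a newSearcher listener would trigger

        System.out.println("first: " + first.similarity.getAvgFieldLength());
        System.out.println("next:  " + next.similarity.getAvgFieldLength());
    }
}
```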


-Hoss
