: Does it make (any) sense to try implementing this within Solr or should I
: just forget about this ?

: As a more general note, does it make sense to try to use Solr as a
: "research" playground for similarities instead of Lucene? Or is this the
: "wrong" level (aka Lucene being a better one)?

If I were going to sit down and really research alternate Similarity systems -- I would use Lucene directly. Solr adds a lot of nice features and abstractions, but for experiments like this, those features and abstractions can get in the way.

In addition, the benchmarking contrib in Lucene is designed to make it really easy to run lots of repeatable tests changing small variables -- I believe Grant already did some work to support evaluating "quality" metrics, so you just have to decide what "good" is, and then you can run lots of tests where you change lots of variables in your custom similarity to see which combination of variables gets you the closest to "good".

If/when you've got a custom similarity and you know what knobs you need turned on that similarity each time the index changes, that's when I'd start to ask "how can I use this in Solr and get Solr to turn those knobs for me?"

The most straightforward way I can think of to make an "adaptive" Similarity class that works with Solr would be if that Similarity class shipped with a RequestHandler that knows about it and its knobs ... you register a single query against that request handler in newSearcher and firstSearcher event listeners, and that request handler can generate whatever stats it wants and call whatever methods it wants on ((YourSimilarityClass)searcher.getSimilarity())

hmmm... we might need to change schema.getSimilarity() to schema.getSimilarity().clone() in SolrIndexSearcher for that to work safely, but hopefully you get the idea.

-Hoss
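
To make the "turn the knobs until you get closest to good" idea concrete, here is a rough, library-free sketch in plain Java. Everything in it is hypothetical: TunableSimilarity is only a stand-in for a custom Similarity subclass (it does not extend any Lucene class), the two knobs and the scoring formula are made up, the quality function is a toy where a real experiment would plug in a measured metric, and onNewSearcher just mimics what a request handler fired from a newSearcher/firstSearcher listener might do.

```java
import java.util.function.ToDoubleBiFunction;

final class KnobSweep {

    /** Hypothetical stand-in for a custom similarity with two tunable knobs. */
    static final class TunableSimilarity {
        private double tfExponent = 0.5;
        private double lengthNormWeight = 1.0;

        synchronized void setKnobs(double tfExponent, double lengthNormWeight) {
            this.tfExponent = tfExponent;
            this.lengthNormWeight = lengthNormWeight;
        }

        synchronized double getTfExponent() { return tfExponent; }

        synchronized double getLengthNormWeight() { return lengthNormWeight; }

        /** Made-up scoring formula, only here to show the knobs being used. */
        synchronized double score(int termFreq, int docLength) {
            return Math.pow(termFreq, tfExponent) / Math.pow(docLength, lengthNormWeight);
        }
    }

    /** The best knob combination found by a sweep, with its quality value. */
    static final class Best {
        final double tfExponent;
        final double lengthNormWeight;
        final double quality;

        Best(double tfExponent, double lengthNormWeight, double quality) {
            this.tfExponent = tfExponent;
            this.lengthNormWeight = lengthNormWeight;
            this.quality = quality;
        }
    }

    /** Tries every knob combination and keeps the one with the highest quality. */
    static Best sweep(double[] tfExponents, double[] lengthNormWeights,
                      ToDoubleBiFunction<Double, Double> quality) {
        Best best = null;
        for (double tf : tfExponents) {
            for (double ln : lengthNormWeights) {
                double q = quality.applyAsDouble(tf, ln);
                if (best == null || q > best.quality) {
                    best = new Best(tf, ln, q);
                }
            }
        }
        if (best == null) {
            throw new IllegalArgumentException("no knob values to try");
        }
        return best;
    }

    /** Hypothetical hook: re-tune the similarity whenever a new searcher is opened. */
    static void onNewSearcher(TunableSimilarity sim, double[] tfExponents,
                              double[] lengthNormWeights,
                              ToDoubleBiFunction<Double, Double> quality) {
        Best best = sweep(tfExponents, lengthNormWeights, quality);
        sim.setKnobs(best.tfExponent, best.lengthNormWeight);
    }

    public static void main(String[] args) {
        // Toy definition of "good": quality peaks at tfExponent=0.75, lengthNormWeight=0.5.
        ToDoubleBiFunction<Double, Double> quality =
            (tf, ln) -> -(Math.pow(tf - 0.75, 2) + Math.pow(ln - 0.5, 2));

        TunableSimilarity sim = new TunableSimilarity();
        onNewSearcher(sim, new double[] {0.25, 0.5, 0.75, 1.0},
                      new double[] {0.0, 0.5, 1.0}, quality);
        System.out.println(sim.getTfExponent() + " " + sim.getLengthNormWeight());
    }
}
```

The sweep is the part that belongs in a Lucene-level experiment; the onNewSearcher hook is the only part that would need to live in Solr. The synchronized accessors are a nod to the sharing concern mentioned above: if one similarity instance is shared across searchers, changing its knobs affects all of them.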
