May I suggest looking at some of the related issues, such as SOLR-1682?

This issue is related to:
  SOLR-1682 Implement CollapseComponent
  SOLR-1311 pseudo-field-collapsing
  LUCENE-1421 Ability to group search results by field
  SOLR-1773 Field Collapsing (lightweight version)
  SOLR-237 Field collapsing

 

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Bharat Jain <bharat.j...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Fri, July 30, 2010 10:40:19 AM
> Subject: Re: question about relevance
> 
> Hi,
>    Thanks a lot for the info and your time. I think field collapsing will work
> for us. I looked at https://issues.apache.org/jira/browse/SOLR-236, but
> which file should I use for the patch? We use Solr 1.3.
> 
> Thanks
> Bharat Jain
> 
> 
> On Fri, Jul 30, 2010 at 12:53 AM, Chris Hostetter
> <hossman_luc...@fucit.org> wrote:
> 
> >
> > : 1. There are user records of type A, B, C etc. (userId field in index is
> > : common to all records)
> > : 2. A user can have any number of A, B, C etc (e.g. think of A being a
> > : language then user can know many languages like french, english, german
> > : etc)
> > : 3. Records are currently stored as a document in index.
> > : 4. A given query can match multiple records for the user
> > : 5. If for a user more records are matched (e.g. if he knows both french
> > : and german) then he is more relevant and should come top in UI. This is
> > : the reason I wanted to add lucene scores assuming the greater score
> > : means more relevance.
> >
> > if your goal is to get back "users" from each search, then you should
> > probably change your indexing strategy so that each "user" has a single
> > document -- fields like "language" can be multivalued, etc...
> >
> > then a search for "language:en language:fr" will return users who speak
> > english or french, and the ones that speak both will score higher.
> >
> > if you really can't change the index structure, then essentially what you
> > are looking for is a "field collapsing" solution on the userId field,
> > where you want each collapsed group to get a cumulative score.  i don't
> > know if the existing field collapsing patches support this -- if you are
> > already willing/capable to do it in the client then that may be the
> > simplest thing to support moving forward.
> >
> > Adding the scores is certainly one metric you could use -- it's generally
> > suspicious to try and imply too much meaning to scores in lucene/solr but
> > that's because people typically try to imply broader absolute meaning.  in
> > the case of a single query the scores are relative to each other, and adding
> > up all the scores for a given userId is approximately what would happen in
> > my example above -- except that there is also a "coord" factor that would
> > penalize documents that only match one clause ... it's complicated, but
> > as an approximation adding the scores might give you what you are looking
> > for -- only you can know for sure based on your specific data.
> >
> >
> >
> > -Hoss
> >
> >
> 
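
For reference, the client-side approach discussed in the quoted message (adding up the scores of all matching records per userId) could be sketched roughly as follows. This is only an illustration under assumptions: the hits are assumed to have already been fetched from Solr as plain dictionaries, the `score` key and the sample values are hypothetical, and the function name is made up for this example. It does not reproduce the "coord" factor mentioned above, so the resulting order is an approximation, and it is only meaningful if all matching records for the query were retrieved rather than just the first page.

```python
from collections import defaultdict


def rank_users_by_summed_score(hits):
    """Sum per-record scores by userId and return (userId, total) pairs,
    highest total first.

    `hits` is assumed to be an iterable of dicts with a "userId" and a
    "score" key (hypothetical shape, not a Solr client API).
    """
    totals = defaultdict(float)
    for hit in hits:
        totals[hit["userId"]] += hit["score"]
    # Sort users by cumulative score, descending.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hits for a single query: user u1 matches two records
# (e.g. one per language), user u2 matches only one.
hits = [
    {"userId": "u1", "score": 1.5},
    {"userId": "u2", "score": 2.0},
    {"userId": "u1", "score": 1.0},
]

ranked = rank_users_by_summed_score(hits)
for user_id, total in ranked:
    print(user_id, total)
```

In this made-up data u2 has the single best-scoring record, but u1 ranks first once the two matching records are added together, which is the behaviour described in point 5 of the original question.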
