Re: unique result
FWIW... We run a hash of the content and other bits of our docs, and then remove duplicates according to specific algorithms. (Exactly the same page content can clearly be hosted on many different URLs and domains.) Then the chosen ones are indexed. We toss the synonyms in the index too, though, so we know all its other "names."

cheers
gene

Gene Campbell
http://www.picante.co.nz
gene at picante point co point nz
http://www.travelbeen.com - "the social search engine for travel"

On Fri, Feb 27, 2009 at 5:53 AM, Cheng Zhang wrote:
> It's exactly what I'm looking for. Thank you Grant.
>
> - Original Message
> From: Grant Ingersoll
> To: solr-user@lucene.apache.org
> Sent: Thursday, February 26, 2009 6:56:22 AM
> Subject: Re: unique result
>
> I presume these all have different unique ids?
>
> If you can address it at indexing time, then have a look at
> https://issues.apache.org/jira/browse/SOLR-799
>
> Otherwise, you might look at https://issues.apache.org/jira/browse/SOLR-236
>
> On Feb 25, 2009, at 6:54 PM, Cheng Zhang wrote:
>
>> Is it possible to have Solr to remove duplicated query results?
>>
>> For example, instead of return
>>
>> Wireless
>> Wireless
>> Wireless
>> Video Games
>> Video Games
>>
>> return:
>>
>> Wireless
>> Video Games
>>
>> Thanks a lot,
>> Kevin
>
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
> using Solr/Lucene:
> http://www.lucidimagination.com/search
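
For anyone wanting to try the hash-then-dedupe idea Gene describes, here is a minimal Python sketch. It is not Gene's actual code: the normalization step, the choice of SHA-1, the "first URL seen wins" rule, and the names `content_key` and `dedupe` are all assumptions made for illustration. The "aliases" list plays the role of the synonyms he mentions keeping in the index.

```python
import hashlib


def content_key(text):
    """Hash the page content (hypothetical helper).

    Assumption: collapsing whitespace and lowercasing is enough to make
    trivially different copies of the same page hash to the same key.
    """
    normalized = " ".join(text.split()).lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def dedupe(docs):
    """Keep one document per content hash and remember its other URLs.

    docs is an iterable of (url, content) pairs. The first URL seen for a
    given hash is the chosen one (an assumed rule); later URLs with the
    same content are recorded as aliases instead of being indexed again.
    """
    chosen = {}
    for url, content in docs:
        key = content_key(content)
        if key not in chosen:
            chosen[key] = {"url": url, "content": content, "aliases": []}
        else:
            chosen[key]["aliases"].append(url)
    return list(chosen.values())


if __name__ == "__main__":
    pages = [
        ("http://a.example/page", "Hello  World"),
        ("http://b.example/page", "hello world"),
        ("http://c.example/other", "Something else"),
    ]
    for doc in dedupe(pages):
        print(doc["url"], doc["aliases"])
```

Only the chosen documents would then be sent to the indexer, with the aliases stored in an extra field so the other "names" remain searchable.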
Re: unique result
It's exactly what I'm looking for. Thank you Grant.

- Original Message
From: Grant Ingersoll
To: solr-user@lucene.apache.org
Sent: Thursday, February 26, 2009 6:56:22 AM
Subject: Re: unique result

I presume these all have different unique ids?

If you can address it at indexing time, then have a look at
https://issues.apache.org/jira/browse/SOLR-799

Otherwise, you might look at https://issues.apache.org/jira/browse/SOLR-236

On Feb 25, 2009, at 6:54 PM, Cheng Zhang wrote:

> Is it possible to have Solr to remove duplicated query results?
>
> For example, instead of return
>
> Wireless
> Wireless
> Wireless
> Video Games
> Video Games
>
> return:
>
> Wireless
> Video Games
>
> Thanks a lot,
> Kevin

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search
Re: unique result
I presume these all have different unique ids?

If you can address it at indexing time, then have a look at
https://issues.apache.org/jira/browse/SOLR-799

Otherwise, you might look at https://issues.apache.org/jira/browse/SOLR-236

On Feb 25, 2009, at 6:54 PM, Cheng Zhang wrote:

> Is it possible to have Solr to remove duplicated query results?
>
> For example, instead of return
>
> Wireless
> Wireless
> Wireless
> Video Games
> Video Games
>
> return:
>
> Wireless
> Video Games
>
> Thanks a lot,
> Kevin

--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search
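
As a stopgap while neither issue is available in your Solr build, the behaviour asked for in the original question can be approximated on the client after the results come back. The Python sketch below is only that: it does not use any Solr API, and the field name `category`, the `id` values, and the function name `collapse_by_field` are hypothetical. It keeps the first hit for each distinct field value and preserves the original ranking order.

```python
def collapse_by_field(results, field):
    """Return results with only the first document per distinct field value.

    results is a list of dicts as a client might build from a search
    response (assumed shape). Order is preserved, so the highest-ranked
    document for each value is the one that survives.
    """
    seen = set()
    collapsed = []
    for doc in results:
        value = doc.get(field)
        if value in seen:
            continue
        seen.add(value)
        collapsed.append(doc)
    return collapsed


if __name__ == "__main__":
    hits = [
        {"id": "1", "category": "Wireless"},
        {"id": "2", "category": "Wireless"},
        {"id": "3", "category": "Wireless"},
        {"id": "4", "category": "Video Games"},
        {"id": "5", "category": "Video Games"},
    ]
    for doc in collapse_by_field(hits, "category"):
        print(doc["category"])
```

Note that collapsing on the client interacts badly with paging, since a page of rows may shrink after duplicates are dropped; doing it at indexing time or inside the search engine avoids that.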