Re: unique result

2009-03-13 Thread ristretto.rb
FWIW...  We run a hash of the content and other bits of our docs, and
then remove duplicates according to specific algorithms.  (Exactly the
same page content can clearly be hosted on many different URLs and
domains.)  Then the chosen ones are indexed.  We toss the synonyms
into the index too, though, so we know all of a page's other "names."
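
A minimal sketch of that kind of index-time dedup, for illustration only.
Everything here is an assumption rather than the pipeline described above:
the whitespace/case normalization, the SHA-1 hash, the "first URL seen wins"
selection rule, and the names content_signature and dedupe_pages are all
hypothetical.

```python
import hashlib


def content_signature(text):
    """Hash the page content after a simple normalization.

    Lowercasing and collapsing whitespace is an assumed normalization;
    a real pipeline may hash other fields as well.
    """
    normalized = " ".join(text.lower().split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def dedupe_pages(pages):
    """Collapse (url, content) pairs that share a content signature.

    The first URL seen for a signature is the chosen one (an assumed
    rule); later URLs with the same content are kept as aliases so the
    indexed document still knows its other "names".
    """
    chosen = {}  # signature -> document to index (insertion-ordered)
    for url, content in pages:
        sig = content_signature(content)
        if sig not in chosen:
            chosen[sig] = {
                "url": url,
                "signature": sig,
                "aliases": [],
                "content": content,
            }
        else:
            chosen[sig]["aliases"].append(url)
    return list(chosen.values())
```

Each returned dict is what would be sent to the index: one document per
distinct content hash, with the duplicate URLs carried along in "aliases".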

cheers
gene

Gene Campbell
http://www.picante.co.nz
gene at picante point co point nz

http://www.travelbeen.com - "the social search engine for travel"



Re: unique result

2009-02-26 Thread Cheng Zhang
It's exactly what I'm looking for. Thank you Grant. 




Re: unique result

2009-02-26 Thread Grant Ingersoll

I presume these all have different unique ids?

If you can address it at indexing time, then have a look at 
https://issues.apache.org/jira/browse/SOLR-799

Otherwise, you might look at https://issues.apache.org/jira/browse/SOLR-236
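
If neither issue fits, the same effect can be approximated on the client
after the query returns. A rough sketch, not Solr API: collapse_by_field is
a hypothetical helper, and the "category" field name is an assumption about
what the values in the example below belong to.

```python
def collapse_by_field(docs, field):
    """Keep only the first document for each distinct value of `field`.

    Result order is preserved, so the highest-ranked document of each
    group survives. Documents missing the field share one group (None).
    """
    seen = set()
    collapsed = []
    for doc in docs:
        key = doc.get(field)
        if key not in seen:
            seen.add(key)
            collapsed.append(doc)
    return collapsed


# Usage with the values from the question ("category" is assumed).
results = [
    {"id": "1", "category": "Wireless"},
    {"id": "2", "category": "Wireless"},
    {"id": "3", "category": "Wireless"},
    {"id": "4", "category": "Video Games"},
    {"id": "5", "category": "Video Games"},
]
unique = collapse_by_field(results, "category")
```

Note that collapsing on the client only sees the rows already fetched, so
paging and hit counts will not reflect the collapsed result set.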


On Feb 25, 2009, at 6:54 PM, Cheng Zhang wrote:


> Is it possible to have Solr remove duplicate query results?
>
> For example, instead of returning
>
>   Wireless
>   Wireless
>   Wireless
>   Video Games
>   Video Games
>
> return:
>
>   Wireless
>   Video Games
>
> Thanks a lot,
> Kevin



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search