Hello,
I am sorry to post this message again with the same content under a
different title. The title of my previous post, "how to add a new parameter
to solr request", did not really reflect what I want to do, so I am posting
it again with the title "How to modify solr results".

Hello everybody,

I want to modify Solr's behaviour slightly and would like to know whether
it is possible. Here is my problem:

I give Solr documents to index whose UniqueKey field is based on the URL
and the time at which the crawler downloaded the page, so the UniqueKey is a
number obtained as MyAlgo(Url+Time). The problem occurs at search time: Solr
returns results that contain duplicates. For example, the first 10 results
may all correspond to the same web page with the same content, because in
fact they have the same URL.

I want to remove these duplicates, so I would like to add a parameter to the
Solr request, for example permitdupp, which takes the value true or false.
If permitdupp=true, the default Solr behaviour is kept; if permitdupp=false,
all the duplicate documents are removed and only the most recently indexed
one is kept (my documents contain a date field that can be used to find the
most recent one).

What is the easiest way to do this? Perhaps there are existing Solr
parameters I should use (faceting?), or should I do it programmatically? In
that case, which classes do I have to modify or inherit from to develop this
solution?

Any suggestion is welcome. Thank you in advance.


Hello everybody,
I just want to add this example to make things clearer. I get this result
from Solr:

<result name="response" numFound="7" start="0" maxScore="0.59129626">
  <doc>
    <str name="id">1</str>
    <str name="DocUrl">http://www.sarkozy.fr</str>
    <str name="date">01/01/2008</str>
  </doc>
  <doc>
    <str name="id">2</str>
    <str name="DocUrl">http://www.sarkozy.fr</str>
    <str name="date">31/01/2008</str>
  </doc>
  <doc>
    <str name="id">3</str>
    <str name="DocUrl">http://www.sarkozy.fr</str>
    <str name="date">15/01/2008</str>
  </doc>
     .
     .
     .
</result>

Note that the DocUrl field has the same value (http://www.sarkozy.fr) for
the three documents shown above. I would like to get something like this in
the result instead:

<result name="response" numFound="7" start="0" maxScore="0.59129626">
  <doc>
    <str name="id">2</str>
    <str name="DocUrl">http://www.sarkozy.fr</str>
    <str name="date">31/01/2008</str>
  </doc>
     .
     .
     .
</result>
That is, only the most recent document for each DocUrl is kept.
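
To make the behaviour I am after concrete, here is a minimal client-side
sketch in Python. It is not Solr code and not an existing Solr feature:
filter_results is a hypothetical helper name, permitdupp is the parameter
name proposed earlier in this message, and the sketch assumes the date field
is always written as day/month/year, as in the example.

```python
from datetime import datetime


def parse_date(text):
    # Assumption: dates are formatted day/month/year, e.g. "31/01/2008".
    return datetime.strptime(text, "%d/%m/%Y")


def filter_results(docs, permitdupp=True):
    """Hypothetical helper illustrating the desired permitdupp semantics.

    permitdupp=True  -> return the documents unchanged (default behaviour).
    permitdupp=False -> keep only the most recent document per DocUrl.
    """
    if permitdupp:
        return list(docs)
    latest = {}  # DocUrl -> most recent document seen so far
    for doc in docs:
        url = doc["DocUrl"]
        current = latest.get(url)
        if current is None or parse_date(doc["date"]) > parse_date(current["date"]):
            latest[url] = doc
    return list(latest.values())


# The three documents from the example response.
docs = [
    {"id": "1", "DocUrl": "http://www.sarkozy.fr", "date": "01/01/2008"},
    {"id": "2", "DocUrl": "http://www.sarkozy.fr", "date": "31/01/2008"},
    {"id": "3", "DocUrl": "http://www.sarkozy.fr", "date": "15/01/2008"},
]

print(filter_results(docs, permitdupp=False))
# [{'id': '2', 'DocUrl': 'http://www.sarkozy.fr', 'date': '31/01/2008'}]
```

Doing this on the client only filters the page of results that was fetched,
which is why I would prefer to have it done inside Solr.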

How can I achieve this? Thank you in advance.
-- 
View this message in context: 
http://www.nabble.com/how-to-add-a-new-parameter-to-solr-request-tp17338190p17357687.html
Sent from the Solr - Dev mailing list archive at Nabble.com.
