I agree with your comment about separating the noise from the actually
relevant results.
My approach to separating relevant results from noise is not algorithmic but
an absolute cutoff, i.e. the top 5 or top 10 results are assumed to be
relevant (at least the probability is higher).
But again, that kind of simple sort can be done by the client too.
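
A minimal sketch of that client-side sort, assuming the documents come back
from Solr already ordered by score and each one carries attachment_count.
The helper name resort_top_n and the dict layout are hypothetical, for
illustration only:

```python
from typing import Dict, List


def resort_top_n(docs: List[Dict], n: int = 5,
                 field: str = "attachment_count") -> List[Dict]:
    """Re-sort only the first n docs by `field` (descending).

    `docs` is assumed to be ordered by Solr score already; everything
    after position n keeps its original relevancy order.
    """
    head = sorted(docs[:n], key=lambda d: d.get(field, 0), reverse=True)
    return head + docs[n:]


# Example data taken from the listing quoted further down in this thread.
docs = [
    {"name": "iphone 5 32gb", "attachment_count": 0},
    {"name": "iphone 5 16gb", "attachment_count": 5},
    {"name": "iphone 5 32gb", "attachment_count": 10},
    {"name": "iphone 4gs", "attachment_count": 3},
    {"name": "iphone 4", "attachment_count": 1},
    {"name": "iphone 5 case", "attachment_count": 100},
]
reordered = resort_top_n(docs, n=5)
```

The "iphone 5 case" document stays below the cutoff even though its
attachment_count is the highest, which is the point of the fixed top N.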

The current relevance ranking is based purely on PMIs, which are calculated
from the clickstream data. I am also trying to figure out whether I can add
extra dimensions to the Solr score so that it takes other attributes into
consideration,
i.e. extending the way Solr computes the score with attachment_count (more
attachments, more important), confidence (a stronger source has higher
confidence), etc.

Is there a way to plug in a custom scoring function that extends (rather
than overwrites) Solr's score?
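
One possible direction, sketched under assumptions rather than tested
against this index: the edismax parser has a "boost" parameter whose
function value is multiplied into the relevancy score, and a "bf" parameter
that adds to it, so either one extends the score instead of replacing it.
The snippet below only builds the request parameters. It assumes
attachment_count and confidence are numeric fields usable in function
queries; the shape of the boost function and the helper name
build_boost_params are guesses for illustration.

```python
from urllib.parse import urlencode


def build_boost_params(user_query: str, rows: int = 10) -> dict:
    """Build edismax params that multiply Solr's score by a function
    of attachment_count and confidence (both assumed numeric fields)."""
    # 1 + log10(attachment_count + 1): zero attachments leave the score
    # unchanged, more attachments raise it slowly.
    attachment_boost = "sum(1,log(sum(attachment_count,1)))"
    # 1 + confidence: assumes confidence is a non-negative number.
    confidence_boost = "sum(1,confidence)"
    return {
        "defType": "edismax",
        "q": user_query,
        # "boost" is multiplicative; use "bf" instead for an additive boost.
        "boost": f"product({attachment_boost},{confidence_boost})",
        "fl": "*,score",
        "rows": rows,
    }


params = build_boost_params("iphone 5")
query_string = urlencode(params)  # append to the core's /select URL
```

Adding debugQuery=true to the request shows how the boost contributes to
each document's final score, which helps when tuning the function.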

Thanks,
-Utkarsh


On Wed, Jul 24, 2013 at 7:35 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> You can certainly just include the attachment count in the
> response and have the app apply the secondary sort. But....
> that doesn't separate the "noise" as you say.
>
> How would you identify "noise"? If you don't have an algorithmic
> way to do that, I don't know how you'd manage to separate
> the signal from the noise....
>
> Best
> Erick
>
> On Wed, Jul 24, 2013 at 4:37 PM, Utkarsh Sengar <utkarsh2...@gmail.com>
> wrote:
> > I have a solr query which has a bunch of boost params for relevancy. This
> > search works fine and returns the most relevant documents as per the user
> > query. For example, if user searches for: "iphone 5", keywords like
> > "apple", "wifi" etc are boosted. I get these keywords from external
> > training. The top 10-20 results are iphone 5 phones and then it follows
> > iphone cases and other noise.
> >
> > But I also have a field in the schema called: attachment_count. I need to
> > sort the top N result I get after boost based on this field.
> >
> > Example:
> > I want to sort the top 5 documents based on attachment_count on the boosted
> > result (which are relevant for the user).
> >
> > 1. iphone 5 32gb, attachment_count=0
> > 2. iphone 5 16gb, attachment_count=5
> > 3. iphone 5 32gb, attachment_count=10
> > 4. iphone 4gs, attachment_count=3
> > 5. iphone 4, attachment_count=1
> > ...
> > 11. iphone 5 case, attachment_count=100
> >
> >
> > Expected result:
> > 1. iphone 5 32gb, attachment_count=10
> > 2. iphone 5 16gb, attachment_count=5
> > 3. iphone 4gs, attachment_count=3
> > 4. iphone 4, attachment_count=1
> > 5. iphone 5 32gb, attachment_count=0
> > ...
> > 11. iphone 5 case, attachment_count=100
> >
> >
> > Is this possible using a function query? I am not sure how the results will
> > look like but I want to try it out.
> >
> > --
> > Thanks,
> > -Utkarsh
>



-- 
Thanks,
-Utkarsh
