I think I understand it correctly.

Referring back to my example, I will get 3 groups:
one large group with 8 documents in it and
two other groups with one document each.

If I limit each group to 5 docs, then the first group will have only 5 docs
and the other two will still contain one doc each.
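
A minimal sketch of the grouping request described above, written in Python.
The field name "site" and the query text are assumptions for illustration
only; group, group.field, group.limit and rows are the standard Solr result
grouping parameters (with grouping on, rows counts groups, not documents).

```python
from urllib.parse import urlencode


def grouping_params(query, group_field, per_group_limit, max_groups=10):
    """Build Solr request parameters for result grouping (field collapsing)."""
    return {
        "q": query,
        "group": "true",                 # turn result grouping on
        "group.field": group_field,      # hypothetical field holding the site
        "group.limit": per_group_limit,  # top M docs returned inside each group
        "rows": max_groups,              # top N groups returned
        "fl": "id,score",
    }


params = grouping_params("some query", "site", 5)
query_string = urlencode(params)
print(query_string)
```

With the 10-document example this would return 3 groups holding 5, 1 and 1
docs; raising group.limit to 8 or more would return all 10.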

And the order (based on score) won't be different. Each document in the
first group will have a higher score, won't it? Or is the document score in
each group calculated relatively, so that the top docs have similar scores?

So this approach just limits the number of similar documents. Instead, I
want to keep all documents in the results but shuffle them appropriately.
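
To make "keep everything but shuffle" concrete, here is a rough sketch of
one possible client-side approach: a round-robin merge over the groups. It
is not a Solr feature. It assumes the app has already fetched every group
with a group limit large enough to hold all docs, and the doc ids below are
made up.

```python
from itertools import zip_longest


def interleave_groups(groups):
    """Round-robin merge: take the best remaining doc from each group in turn.

    `groups` is a list of lists; each inner list is assumed to be already
    sorted by score (best first), and the groups themselves are assumed to be
    ordered by their top document's score.
    """
    missing = object()  # sentinel marking groups that have run out of docs
    merged = []
    for round_of_docs in zip_longest(*groups, fillvalue=missing):
        merged.extend(doc for doc in round_of_docs if doc is not missing)
    return merged


# Hypothetical ids: 8 docs from site A, 1 from site B, 1 from site C.
groups = [
    ["a1", "a2", "a3", "a4", "a5", "a6", "a7", "a8"],
    ["b1"],
    ["c1"],
]
result = interleave_groups(groups)
print(result)
```

No document is dropped; the two single-doc sites simply move up to just
after the best doc of the large site. A different interleaving rule (for
example, one foreign doc after every third doc) could place them nearer the
middle instead.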

Best Regards
Alexander Aristov


On 29 October 2012 15:55, Erick Erickson <erickerick...@gmail.com> wrote:

> I don't think you're reading the grouping right. When you use grouping,
> you get the top N groups, and within each group you get the top M
> scoring documents. So you can actually get _more_ documents back than in
> the non-grouping case and your app can then intelligently intersperse them
> however you want.
>
> Best
> Erick
>
> On Mon, Oct 29, 2012 at 5:02 AM, Alexander Aristov
> <alexander.aris...@gmail.com> wrote:
> > Interesting but not exactly what I want to get.
> >
> > If I group items then I will get a small number of docs. I don't want
> > this. I need all of them.
> >
> > Best Regards
> > Alexander Aristov
> >
> >
> > On 29 October 2012 12:05, yunfei wu <yunfei...@gmail.com> wrote:
> >
> >> Besides changing the scoring algorithm, what about "Field Collapsing" -
> >> http://wiki.apache.org/solr/FieldCollapsing - to collapse the results
> >> from the same website URL?
> >>
> >> Yunfei
> >>
> >>
> >> On Mon, Oct 29, 2012 at 12:43 AM, Alexander Aristov <
> >> alexander.aris...@gmail.com> wrote:
> >>
> >> > Hi everybody,
> >> >
> >> > I have a question about scoring calculation algorithms and approaches.
> >> >
> >> > Let's say I have 10 documents. 8 of them come from one web site (I
> >> > have a field in the schema with the URL) and the other 2 from other,
> >> > different web sites. So for this example I have 3 web sites.
> >> >
> >> > For some queries those 8 documents have better term matching and they
> >> > appear at the top of the results. As a result, the 8 docs from one
> >> > source come first and the other two come last.
> >> >
> >> > I want to, perhaps artificially, improve the score of those 2 docs and
> >> > move them up. They don't necessarily have to come first, but if they
> >> > came in the middle of the result set it would be perfect.
> >> >
> >> > One idea is to reduce the score of docs in the result set that come
> >> > from one site, so that if the result set contains too many docs from
> >> > one source, the score of each of those docs is reduced proportionally.
> >> >
> >> > The important thing is that I don't want to reduce the doc score
> >> > permanently, only at query time. Maybe some function queries can help
> >> > me?
> >> >
> >> > How can I do this? Or maybe there are other ideas?
> >> >
> >> > Best Regards
> >> > Alexander Aristov
> >> >
> >>
>
