Hello Mikhail,

Thank you for the reply. In terms of user experience, I want to spread products from the same brand farther apart from each other, *at least* in the first 50-100 results we display. I am thinking of two different approaches as a solution:
1. For the first few results, display only the top-scoring product of each manufacturer (that is, for a given field, display the top-scoring result for each unique field value within the first N matches). N could be either a percentage of the total matches or a configurable absolute value.

2. Enforce a penalty on the score of results that have duplicate field values. The penalty can be applied in such a way that results with higher scores are not affected, unlike those with lower scores.

Both solutions can be implemented while sorting the documents with TopFieldCollector / TopScoreDocCollector.

Does this answer your question? Please let me know if you have any more questions.

Thanks,
Karthick

On Mon, Aug 20, 2012 at 3:26 AM, Mikhail Khludnev <mkhlud...@griddynamics.com> wrote:

> Hello,
>
> I've got the problem description below. Can you explain the expected user
> experience, and/or solution approach before diving into the algorithm
> design?
>
> Thanks
>
>
> On Sat, Aug 18, 2012 at 2:50 AM, Karthick Duraisamy Soundararaj <
> karthick.soundara...@gmail.com> wrote:
>
>> My problem is that when there are a lot of documents representing
>> products, products from the same manufacturer seem to appear in close
>> proximity in the results and therefore, it doesn't provide brand
>> diversity. When you search for sofas, you get sofas from manufacturer A
>> dominating the first page while the sofas from manufacturer B dominate
>> the second page, etc. The issue here is that a manufacturer tends to
>> describe the different sofas he produces the same way and therefore
>> there is very little difference between the documents representing two
>> sofas.
>>
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Tech Lead
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhlud...@griddynamics.com>
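
P.S. To make the two approaches above more concrete, here is a rough sketch in plain Java. It is only an illustration under assumptions: it re-ranks an already collected list of hits instead of hooking into TopFieldCollector / TopScoreDocCollector, and the names `Hit`, `uniqueBrandsFirst`, `penalizeRepeats` and the `decay` parameter are hypothetical, not part of Lucene. The multiplicative decay is just one possible penalty; it leaves the best hit of each brand untouched and pushes down the later ones.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BrandDiversity {

    /** Hypothetical stand-in for a scored search hit; not a Lucene class. */
    public static final class Hit {
        public final String id;
        public final String brand;
        public final float score;

        public Hit(String id, String brand, float score) {
            this.id = id;
            this.brand = brand;
            this.score = score;
        }
    }

    /** Returns a copy of the hits sorted by descending score (stable sort). */
    private static List<Hit> byScoreDesc(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return sorted;
    }

    /**
     * Approach 1: the first n slots hold the best hit of each distinct brand;
     * all remaining hits follow in plain score order.
     */
    public static List<Hit> uniqueBrandsFirst(List<Hit> hits, int n) {
        List<Hit> head = new ArrayList<>();
        List<Hit> tail = new ArrayList<>();
        Set<String> seenBrands = new HashSet<>();
        for (Hit h : byScoreDesc(hits)) {
            // A hit goes into the head only while slots remain and its brand is new.
            if (head.size() < n && seenBrands.add(h.brand)) {
                head.add(h);
            } else {
                tail.add(h);
            }
        }
        head.addAll(tail);
        return head;
    }

    /**
     * Approach 2: multiply the score of the k-th hit of a brand (k = 0, 1, 2, ...)
     * by decay^k, then re-sort. The decay factor is an assumed tuning knob in (0, 1].
     */
    public static List<Hit> penalizeRepeats(List<Hit> hits, float decay) {
        Map<String, Integer> countByBrand = new HashMap<>();
        List<Hit> adjusted = new ArrayList<>();
        for (Hit h : byScoreDesc(hits)) {
            // Number of higher-scored hits of the same brand seen so far.
            int repeats = countByBrand.merge(h.brand, 1, Integer::sum) - 1;
            float penalized = h.score * (float) Math.pow(decay, repeats);
            adjusted.add(new Hit(h.id, h.brand, penalized));
        }
        return byScoreDesc(adjusted);
    }
}
```

For example, with hits a1 (brand A, 0.9), a2 (A, 0.8), a3 (A, 0.7), b1 (B, 0.6) and c1 (C, 0.5), both `uniqueBrandsFirst(hits, 3)` and `penalizeRepeats(hits, 0.5f)` return the order a1, b1, c1, a2, a3, instead of the three brand A sofas filling the top of the page.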