> I have a collection of about 30K documents which pertain to
> pop artists (e.g. Madonna, Michael Jackson). These artist
> names are indexed in the field named "artist_t", which has
> the following properties in the dynamic field declaration:
> <dynamicField name="*_t" type="text" indexed="true"
> stored="true"/>
>
> Most of the documents will have MJ as their artist. I am
> using the EdgeNGram filter factory to get a typeahead
> implementation, i.e.
>
> when I type in "m" I would get "madonna", "michael jackson",
> "miley cyrus" etc. as results. The problem that I have now is
> that all these terms are repeated.
>
> When I search for "m", instead of "madonna", "michael
> jackson"... I am getting MJ repeated many times in the
> initial 10 docs that Solr brings back by default.
>
> I need to make all these artists unique, i.e. if I search
> "m", I should get individual results just once.
>
> How should I change the schema file, and is there any query
> tweaking required?

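The TermsComponent (http://wiki.apache.org/solr/TermsComponent), which can be used for auto-suggest, eliminates the repeated terms: it returns terms from the index rather than documents, so each term is listed only once. With this solution you don't need EdgeNGram anymore, because the terms.prefix parameter does the prefix matching. If you want suggestions that span more than one word (e.g. "michael jackson"), you can add ShingleFilterFactory to your index analyzer chain.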
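
As a rough sketch, a TermsComponent request for the prefix "m" on artist_t could be built like this. The host, port and the /terms handler path are assumptions (they match the stock example configuration, so adjust them to your setup), and build_terms_url is just a hypothetical helper name. Lowercasing the prefix assumes that your "text" field type lowercases tokens at index time.

```python
from urllib.parse import urlencode


def build_terms_url(prefix, field="artist_t", limit=10,
                    base="http://localhost:8983/solr/terms"):
    """Build a TermsComponent URL for prefix-based suggestions.

    `base` is an assumed host and handler path, not something taken
    from the original setup.
    """
    params = {
        "terms": "true",                 # enable the TermsComponent
        "terms.fl": field,               # field to read terms from
        "terms.prefix": prefix.lower(),  # assumes lowercased index terms
        "terms.limit": limit,            # maximum number of suggestions
        "wt": "json",                    # response format
    }
    return base + "?" + urlencode(params)


url = build_terms_url("m")
print(url)
```

The response lists each matching term once together with its document frequency, so an artist who appears in thousands of documents still shows up as a single suggestion.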
