How about a pattern replace char filter that checks for repeating groups? It's
probably not the fastest option, but it should work right away.
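
A minimal sketch of that idea in plain Java, not tested inside Solr itself.
The class and method names are made up for illustration. The first method
uses a back-reference to collapse a term that is repeated back to back; the
same pattern string, with "$1" as the replacement, is what one would try in
solr.PatternReplaceCharFilterFactory, where the flags would presumably have
to be embedded in the pattern. The second method is the preprocessing route
suggested in the quoted message below and also catches non-adjacent repeats,
but it splits naively on whitespace, so it assumes a plain term query
without phrases or operators.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
public class RepeatedTermSketch {

    // A word, followed one or more times by non-word characters and the
    // same word again (compared case-insensitively).
    private static final Pattern ADJACENT_REPEATS = Pattern.compile(
            "\\b(\\w+)(?:\\W+\\1\\b)+",
            Pattern.CASE_INSENSITIVE
                    | Pattern.UNICODE_CASE
                    | Pattern.UNICODE_CHARACTER_CLASS);

    // Regex approach: keeps the first occurrence of each run of repeats.
    public static String collapseAdjacentRepeats(String query) {
        return ADJACENT_REPEATS.matcher(query).replaceAll("$1");
    }

    // Preprocessor approach: keeps the first occurrence of every
    // whitespace-separated term, wherever the repeats appear.
    public static String dropRepeatedTerms(String query) {
        Set<String> seen = new HashSet<>();
        StringBuilder out = new StringBuilder();
        for (String term : query.trim().split("\\s+")) {
            if (term.isEmpty()) {
                continue; // happens only for an empty or blank query
            }
            if (seen.add(term.toLowerCase(Locale.ROOT))) {
                if (out.length() > 0) {
                    out.append(' ');
                }
                out.append(term);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapseAdjacentRepeats("term term term"));
        System.out.println(collapseAdjacentRepeats("Hey, hey, hey ; Shivers of pleasure"));
        System.out.println(dropRepeatedTerms("term other term"));
    }
}
```

With the inputs in main this prints "term", "Hey ; Shivers of pleasure" and
"term other". Note that the regex variant leaves "term other term" alone,
since the repeats there are not adjacent.
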
-----Original message-----
> From:Emir Arnautovic <emir.arnauto...@sematext.com>
> Sent: Thursday 9th February 2017 13:52
> To: solr-user@lucene.apache.org
> Subject: Re: Removing duplicate terms from query
>
> Hi Ere,
>
> I don't think there is such a filter. Implementing one would
> require looking backward, which violates the streaming approach of token
> filters and leads to unpredictable memory usage.
>
> I would do it as part of a query preprocessor, and not necessarily as part
> of Solr.
>
> HTH,
> Emir
>
>
> On 09.02.2017 12:24, Ere Maijala wrote:
> > Hi,
> >
> > I just noticed that while we use RemoveDuplicatesTokenFilter at
> > query time, it takes term positions into account and so does nothing
> > if, e.g., the query is 'term term term'. As far as I can see the term
> > positions make no difference in a simple non-phrase search. Is there a
> > built-in way to deal with this? I know I can write a filter to do
> > this, but I feel like this would be something quite basic to do for
> > the query. And I don't think it's even anything too weird for normal
> > users to do. Just consider e.g. searching for music by title:
> >
> > Hey, hey, hey ; Shivers of pleasure
> >
> > I also verified that at least according to debugQuery=true and
> > anecdotal evidence the search really slows down if you repeat the same
> > term enough.
> >
> > --Ere
>