Unfortunately, that does require a new type of query.  As you probably know, 
you can express "at least" (minimum number should match) with a regular 
BooleanQuery, but SpanQuery does not support it yet.  You might want to look 
at modifying SpanOrQuery to get this functionality.  It would be a great 
capability to have.  Perhaps open an issue and submit a patch?

-----Original Message-----
From: Saar Carmi [mailto:saarca...@gmail.com] 
Sent: Tuesday, August 30, 2016 11:03 PM
To: java-user@lucene.apache.org
Subject: New type of proximity/fuzzy search

Hi,
I would appreciate some guidance on implementing the following type of query.

Given a set of search terms (t1, t2, t3, ..., ti), return all documents in 
which some sequence of x=10 consecutive tokens contains at least c=3 of the 
search terms.

For example, the following document matches the search (expand, discount, 
file, search, lookup):

"Many of us rely on Windows Search to find files and launch programs, but 
searching for text within files is limited to specific filetypes by default. 
Here’s how you can *expand* your *search* to include other text based *files*."

Within the last 10 words of the document, the terms expand, files, and search 
all appear, so there is a match.
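
To make the matching rule above concrete, here is a minimal sketch in plain 
Java. It is not Lucene code and does not use any Lucene API; it only 
illustrates the intended semantics on an already tokenized document. The class 
name WindowTermMatcher and the methods tokenize and matches are hypothetical. 
The tokenizer is a crude stand-in for a real analyzer (it lowercases, splits 
on non-letters, and strips a trailing "s" so that "files" lines up with 
"file"), and the sketch assumes that "at least c terms" means c distinct terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class WindowTermMatcher {

    // Crude stand-in for a real analyzer (assumption, not Lucene's behavior):
    // lowercase, split on non-letters, strip a trailing "s" from longer words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            if (raw.length() > 3 && raw.endsWith("s")) {
                raw = raw.substring(0, raw.length() - 1);
            }
            tokens.add(raw);
        }
        return tokens;
    }

    // Returns true if some run of `window` consecutive tokens contains at
    // least `minTerms` distinct search terms.
    static boolean matches(List<String> tokens, Set<String> terms,
                           int window, int minTerms) {
        // Occurrence count of each search term inside the current window.
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            String entering = tokens.get(i);
            if (terms.contains(entering)) {
                counts.merge(entering, 1, Integer::sum);
            }
            if (i >= window) {
                // The token at i - window has just slid out of the window.
                String leaving = tokens.get(i - window);
                if (terms.contains(leaving)) {
                    counts.computeIfPresent(leaving,
                            (k, v) -> v == 1 ? null : v - 1);
                }
            }
            // counts.size() is the number of distinct terms in the window.
            if (counts.size() >= minTerms) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(
                Arrays.asList("expand", "discount", "file", "search", "lookup"));
        List<String> tokens = tokenize(
                "Here is how you can expand your search to include other "
                + "text based files.");
        System.out.println(matches(tokens, terms, 10, 3));
    }
}
```

The sketch slides a fixed-size window over the token stream and keeps a 
per-term count, so each token is examined a constant number of times. A real 
query would have to work from term positions in the index instead of scanning 
every document's tokens.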

Does any documentation exist on adding new types of queries to the Lucene 
engine?

Saar
