Thanks Allison and Uwe. Yes, indeed the SpanNearQuery is probably a better starting point as it already considers the proximity.
I see you have opened an issue to track it. Are you looking to add the functionality to the existing SpanNearQuery or to subclass it?

Anyway, looking at the issue created, I am not sure the description exactly matches what I had in mind: "Add minNumberShouldMatch parameter to SpanNearQuery".

My understanding is that SpanNearQuery allows the maximum slop between each pair of adjacent elements. So if we have 10 SpanTermQuery clauses with a slop of 3, it could be that the distance between the first and the last occurrence is 30.

What I was looking for is to define the maximum distance between the first and last terms and additionally require the minNumber. That would probably require adding two parameters: minNumberShouldMatch and maxSearchWindow.

Thanks,
Saar

On Thu, Sep 1, 2016 at 9:14 PM, Allison, Timothy B. <talli...@mitre.org> wrote:

> https://issues.apache.org/jira/browse/LUCENE-7434
>
> -----Original Message-----
> From: Allison, Timothy B. [mailto:talli...@mitre.org]
> Sent: Wednesday, August 31, 2016 3:41 PM
> To: java-user@lucene.apache.org
> Subject: RE: New type of proximity/fuzzy search
>
> Doh, sorry, Uwe, didn't see your response first.
>
> Scratch SpanOr, take a look at SpanNear. This would be a great capability
> to have!
>
> -----Original Message-----
> From: Allison, Timothy B.
> Sent: Wednesday, August 31, 2016 3:30 PM
> To: java-user@lucene.apache.org
> Subject: RE: New type of proximity/fuzzy search
>
> Unfortunately, that does require a new type of query. As you probably
> know, you can do the "at least" (minimum number should match) with regular
> BooleanQueries, but you can't yet do the "at least" with SpanQuery. You
> might want to look at modifying the SpanOrQuery to get this functionality.
> It would be a great capability to have. Perhaps open an issue and submit a
> patch?
>
> -----Original Message-----
> From: Saar Carmi [mailto:saarca...@gmail.com]
> Sent: Tuesday, August 30, 2016 11:03 PM
> To: java-user@lucene.apache.org
> Subject: New type of proximity/fuzzy search
>
> Hi
> I would appreciate some guidance on implementing the following type of
> query.
>
> Given a set of search terms (t1, t2, t3, ti), return all documents where,
> in some sequence of x=10 tokens, at least c=3 of the search terms appear.
>
> So, for example, the following document matches the search (expand,
> discount, file, search, lookup):
>
> "Many of us rely on Windows Search to find files and launch programs, but
> searching for text within files is limited to specific filetypes by
> default. Here’s how you can *expand* your *search* to include other text
> based *files*."
>
> Within the sequence of the last 10 words of the document, the terms expand,
> files, and search appear, so there is a match.
>
> Does any documentation exist on adding new types of queries to the
> Lucene engine?
>
> Saar
>
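
P.S. To pin down the semantics I am after (at least c of the search terms within some run of x consecutive tokens), here is a minimal sketch in plain Java, independent of Lucene. It is only an illustration of the matching rule, not a Lucene query and not a proposal for how LUCENE-7434 should be implemented. The class name WindowMatcher and the parameter names window and minMatch are made up for this example. It assumes the tokens have already been analyzed (lowercased, stemmed) and that "at least c terms" means c distinct terms.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: checks the "min terms in a window" rule
// over an in-memory token list.
final class WindowMatcher {

    // Returns true if some run of `window` consecutive tokens contains
    // at least `minMatch` distinct search terms.
    static boolean matches(List<String> tokens, Set<String> terms, int window, int minMatch) {
        // Occurrence count of each search term inside the current window.
        Map<String, Integer> inWindow = new HashMap<>();
        for (int i = 0; i < tokens.size(); i++) {
            // Token entering the window on the right.
            String entering = tokens.get(i);
            if (terms.contains(entering)) {
                inWindow.merge(entering, 1, Integer::sum);
            }
            // Token falling out of the window on the left, if the window is full.
            int leavingIdx = i - window;
            if (leavingIdx >= 0) {
                String leaving = tokens.get(leavingIdx);
                if (terms.contains(leaving)) {
                    // Decrement the count; returning null drops the entry at zero.
                    inWindow.computeIfPresent(leaving, (k, v) -> v == 1 ? null : v - 1);
                }
            }
            // The map size is the number of distinct terms currently in the window.
            if (inWindow.size() >= minMatch) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Pre-analyzed tokens (assumption: "files" has been stemmed to "file").
        List<String> tokens = Arrays.asList(
            "here", "how", "you", "can", "expand", "your", "search",
            "to", "include", "other", "text", "based", "file");
        Set<String> terms = new HashSet<>(
            Arrays.asList("expand", "discount", "file", "search", "lookup"));
        System.out.println(matches(tokens, terms, 10, 3));
        System.out.println(matches(tokens, terms, 8, 3));
    }
}
```

Here window plays the role of maxSearchWindow and minMatch the role of minNumberShouldMatch. A real implementation would have to work on term positions from the index rather than on a token list, so treat this purely as a statement of the expected behaviour.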