On Thu, Sep 5, 2013 at 3:40 PM, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> On Thu, 2013-09-05 at 09:28 +0200, Kristofer Karlsson wrote:
> > For an example, I may have a million documents with just the term "foo" in
> > field A, and one particular document with the term "foo" in both field A
> > and B, or have two terms "foo" in the same field.
> >
> > If I search for "foo foo" I would like to filter out all the documents with
> > only one matching term - is this possible?
>
> A bit of creative querying should do it:
>
> For the "only one foo-field"-case, you could do
> (A:foo NOT B:foo) OR (B:foo NOT A:foo)
>
> To avoid two foo's in the same field, you could do
> NOT field:"foo foo"~1000
>
> Combining those we get
> ((A:foo NOT B:foo) OR (B:foo NOT A:foo)) NOT A:"foo foo"~1000 NOT B:"foo foo"~1000
>
> Or did I misunderstand? Do you want to keep the documents that have at
> least two foo's and discard the ones that only have one? That is simpler:
> (A:foo AND B:foo) OR A:"foo foo"~1000 OR B:"foo foo"~1000
>
> This all works under the assumption that you have fewer than 1000 terms
> in each instance of your fields. Adjust accordingly.
>
> - Toke Eskildsen, State and University Library, Denmark

Yes, I meant the latter part - getting rid of hits that don't actually have
as many occurrences of the term as the search query.

The query generation sort of works if I just have two fields. For more fields
and more search terms it quickly gets more complicated - it would be a
combinatorial explosion.
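
To make the growth concrete, here is a minimal sketch of generating the "at least two occurrences" query for an arbitrary list of fields. It only builds a query string in the classic query parser syntax used in this thread; the class and method names (TwoOccurrenceQuery, build) are made up for illustration, it assumes field names and the term need no escaping, and it takes the sloppy-phrase trick for the same-field case as given rather than verifying it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Lucene: builds a query string only.
final class TwoOccurrenceQuery {

    // Returns a query matching documents where `term` occurs at least twice,
    // either once in each of two different fields or twice within one field.
    // Assumes field names and the term contain no query-syntax characters.
    static String build(List<String> fields, String term, int slop) {
        List<String> clauses = new ArrayList<>();

        // One clause per unordered pair of distinct fields: n*(n-1)/2 clauses.
        for (int i = 0; i < fields.size(); i++) {
            for (int j = i + 1; j < fields.size(); j++) {
                clauses.add("(" + fields.get(i) + ":" + term
                        + " AND " + fields.get(j) + ":" + term + ")");
            }
        }

        // One sloppy phrase clause per field for the "twice in one field" case.
        for (String field : fields) {
            clauses.add(field + ":\"" + term + " " + term + "\"~" + slop);
        }

        return String.join(" OR ", clauses);
    }

    public static void main(String[] args) {
        // With two fields this reproduces the query suggested above.
        System.out.println(build(Arrays.asList("A", "B"), "foo", 1000));
    }
}
```

For n fields this emits n*(n-1)/2 pair clauses plus n phrase clauses, so two occurrences of a single term stay manageable. Requiring k occurrences, or several distinct repeated terms, means enumerating the ways to distribute the occurrences across fields, which is where the combinatorial growth comes from.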