Replying to myself: after an IRC chat, the problem got clarified.
The scale problem is not the amount of data to match against but the
number of Queries registered in the system, against which each new
Document needs to be matched.
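
To make the shape of the problem concrete, here is a minimal plain-Java sketch of matching one incoming Document against a set of registered Queries. Everything in it is an assumption for illustration: `QueryRegistry`, `register` and `match` are hypothetical names, not Lucene or grid APIs, and a Query is reduced to a predicate over the Document's lowercased terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical registry of stored queries; not a Lucene or grid API. */
public class QueryRegistry {

    // Registered queries keyed by id. Each query is modelled (as an assumption)
    // as a predicate over the set of terms found in a document.
    private final Map<String, Predicate<Set<String>>> queries = new LinkedHashMap<>();

    public void register(String id, Predicate<Set<String>> query) {
        queries.put(id, query);
    }

    /** Returns the ids of all registered queries that match the new document. */
    public List<String> match(String documentText) {
        // Naive tokenization: lowercase, split on runs of non-word characters.
        Set<String> terms =
                new HashSet<>(Arrays.asList(documentText.toLowerCase().split("\\W+")));
        List<String> hits = new ArrayList<>();
        // Every registered query is evaluated once per incoming document.
        for (Map.Entry<String, Predicate<Set<String>>> entry : queries.entrySet()) {
            if (entry.getValue().test(terms)) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        QueryRegistry registry = new QueryRegistry();
        registry.register("q1", t -> t.contains("lucene") && t.contains("grid"));
        registry.register("q2", t -> t.contains("hibernate"));
        System.out.println(registry.match("Lucene runs on the grid"));
    }
}
```

The loop in `match` does work proportional to the number of registered Queries for every new Document, which is exactly where the scaling concern comes from.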
Assuming we can store the Queries as Lucene Queries in the grid as
Hi Ales,
there are several strategies; which one works best depends on several
factors, not least the number of queries, the index size, how much memory we
can dedicate to query caches, and the ratio of updates.
A Lucene Query produces a sparse BitSet, you can think of it as an
ordered list of