To quote the guys: "it depends" <G>... At root, a filter is a bitset, so size-wise you are using 1 bit/doc (plus some small overhead). Both the storage required and the time to construct a filter depend on the characteristics of your corpus, so I guess the only way to answer that for your particular situation is to test with your corpus. I can say that I was surprised at how very fast constructing a filter was in my situation. Which has no relevance to your situation <G>....
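To make the "1 bit/doc" point concrete, here is a minimal sketch using plain java.util.BitSet. It is not Lucene's Filter API; the class name FilterBitsSketch and its methods are made up for illustration, and the size estimate ignores the small overhead mentioned above.

```java
import java.util.BitSet;

// Hypothetical illustration: a filter modelled as one bit per document.
// This uses java.util.BitSet directly and is not Lucene's Filter API.
class FilterBitsSketch {

    // Rough storage for a filter over maxDoc documents:
    // 1 bit per doc, rounded up to whole bytes (object overhead ignored).
    static long estimateBytes(int maxDoc) {
        return (maxDoc + 7L) / 8L;
    }

    // Build a filter by setting the bit for every doc ID that matches.
    static BitSet build(int maxDoc, int[] matchingDocIds) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : matchingDocIds) {
            bits.set(docId);
        }
        return bits;
    }

    // Combine two filters: a doc passes only if it passes both.
    // Works on a copy so neither input is modified.
    static BitSet and(BitSet a, BitSet b) {
        BitSet result = (BitSet) a.clone();
        result.and(b);
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 10;
        BitSet first = build(maxDoc, new int[] {0, 2, 4, 6, 8});
        BitSet second = build(maxDoc, new int[] {4, 5, 6, 7});
        System.out.println(and(first, second));        // docs that pass both filters
        System.out.println(estimateBytes(8_000_000));  // bytes for an 8M-doc index
    }
}
```

By this estimate an index of 8 million documents needs roughly 1 MB per filter, which is why keeping a handful of filters in memory is usually cheap.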
More of "it depends" is the fluidity of your index. If you construct it once and don't modify it, you could consider storing your filters permanently, either in files, as "special documents" in your index, or perhaps even in a meta-data index. You can store documents of meta-data just by putting in fields that are in none of your other documents. Deletions/additions and re-optimizations will affect the internal Lucene doc IDs, so you've got to be careful here about synchronization...

You could consider constructing your filters all in a bunch when you open your searcher. Again, whether you want to do this depends on how often your index changes and you have to reopen your searcher.

What I'd really recommend is that you start by constructing your filters on the fly, without even a caching wrapper, and get some timings, mostly for your peace of mind. I'd also do some timings when combining filters, just for yucks. There's no reason not to use a caching wrapper if you expect to reuse these filters. It will load the first user with a delay, but you can warm up your filters by issuing some canned queries upon startup....

Only if constructing filters on the fly and using a caching wrapper proves unsatisfactory would I move on to any kind of permanent storage. Premature optimization and all that....

So, I don't have a good answer since I don't have a detailed knowledge of your problem, but it should be relatively easy for you to get a sense of whether this is a reasonable approach or not.

Hope this helps
Erick
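
P.S. In case it helps with the timing experiment, here is a rough sketch of the "build on the fly, then cache" idea. It is not Lucene's own caching wrapper; CachingFilterSketch, its methods, and the "even docs" filter are all invented for illustration, and plain java.util.BitSet stands in for a real filter. Swap in your own filter construction where buildEvenDocs is called.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of a caching wrapper around filter construction.
// Not Lucene's class; names and structure are assumptions for illustration.
class CachingFilterSketch {
    private final Map<String, BitSet> cache = new HashMap<>();
    private int builds = 0;

    // Return the cached bits for this key, building them only on first use.
    synchronized BitSet get(String key, Supplier<BitSet> builder) {
        BitSet bits = cache.get(key);
        if (bits == null) {
            bits = builder.get();
            builds++;
            cache.put(key, bits);
        }
        return bits;
    }

    // Drop all cached filters, e.g. after the index changes and the
    // searcher is reopened, since internal doc IDs may no longer match.
    synchronized void clear() {
        cache.clear();
    }

    // How many times a filter was actually constructed.
    synchronized int buildCount() {
        return builds;
    }

    // Stand-in for a real filter build: mark every even doc ID.
    static BitSet buildEvenDocs(int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int i = 0; i < maxDoc; i += 2) {
            bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        CachingFilterSketch filters = new CachingFilterSketch();
        int maxDoc = 1_000_000;

        long t0 = System.nanoTime();
        filters.get("even", () -> buildEvenDocs(maxDoc)); // warm-up: builds the filter
        long t1 = System.nanoTime();
        filters.get("even", () -> buildEvenDocs(maxDoc)); // served from the cache
        long t2 = System.nanoTime();

        System.out.println("first use:  " + (t1 - t0) / 1_000 + " us");
        System.out.println("second use: " + (t2 - t1) / 1_000 + " us");
        System.out.println("builds:     " + filters.buildCount());
    }
}
```

Calling get for each of your canned filters at startup is the warm-up step, so the first real user doesn't pay the construction cost. The timings will of course depend entirely on your corpus and hardware.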