If this is true, the DocIdSet would look like this:
public interface DocIdSet
{
    public abstract boolean contains(int docId);
}
And Filter would become:
public interface Filter
{
    public abstract DocIdSet getDocIdSet(IndexReader reader) throws IOException;
}
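To make the idea concrete, here is a rough sketch of two very different representations sitting behind the same contains() call. The class names BitSetBackedDocIdSet and SortedArrayDocIdSet are made up for illustration and are not existing Lucene classes; the interface is the one proposed above, written without the redundant modifiers.

```java
import java.util.Arrays;
import java.util.BitSet;

// The interface as proposed above.
interface DocIdSet {
    boolean contains(int docId);
}

// Hypothetical dense representation: one bit per document.
class BitSetBackedDocIdSet implements DocIdSet {
    private final BitSet bits;

    BitSetBackedDocIdSet(BitSet bits) {
        this.bits = (BitSet) bits.clone(); // defensive copy
    }

    public boolean contains(int docId) {
        return docId >= 0 && bits.get(docId);
    }
}

// Hypothetical sparse representation: sorted ids probed by binary search.
class SortedArrayDocIdSet implements DocIdSet {
    private final int[] sortedIds;

    SortedArrayDocIdSet(int... ids) {
        sortedIds = ids.clone();
        Arrays.sort(sortedIds);
    }

    public boolean contains(int docId) {
        return Arrays.binarySearch(sortedIds, docId) >= 0;
    }
}
```

Callers only ever see contains(), so a Filter could hand back whichever form is cheaper for the number of matching documents.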
As you suggest, DocIdSets would be cached, and the eviction policy would have to balance these factors for each DocIdSet:
1) Cache hit rate on the set
2) Cost of recreating the set (i.e. computational cost and disk access)
3) Memory used by the set
We can compute #1 easily enough. #2 may prove hard to quantify, but we could obtain #3 by requiring that DocIdSet include this method:
public abstract int getCachedSizeInBytes();
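As a rough sketch of how the three factors might be combined, the cache below evicts whichever entry has the lowest "value per byte". Everything here is an assumption for illustration: the names SizedDocIdSet, BitSetDocIdSet and DocIdSetCache are hypothetical, the rebuild cost is an estimate supplied by the caller in arbitrary units (since #2 is hard to measure), and the formula (hits + 1) * rebuildCost / bytes is just one plausible weighting.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical name: the proposed DocIdSet plus the size method.
interface SizedDocIdSet {
    boolean contains(int docId);
    int getCachedSizeInBytes();
}

// Simple BitSet-backed implementation so the sketch can run on its own.
class BitSetDocIdSet implements SizedDocIdSet {
    private final BitSet bits;

    BitSetDocIdSet(int maxDoc) { bits = new BitSet(maxDoc); }

    void add(int docId) { bits.set(docId); }

    public boolean contains(int docId) { return docId >= 0 && bits.get(docId); }

    // BitSet.size() reports the number of bits of storage currently allocated.
    public int getCachedSizeInBytes() { return bits.size() / 8; }
}

class DocIdSetCache {
    private static final class CacheEntry {
        final SizedDocIdSet set;
        final int bytes;         // factor 3, captured when the set is cached
        final long rebuildCost;  // factor 2, a caller-supplied estimate
        long hits;               // factor 1, counted by get()

        CacheEntry(SizedDocIdSet set, long rebuildCost) {
            this.set = set;
            this.bytes = set.getCachedSizeInBytes();
            this.rebuildCost = rebuildCost;
        }

        // Higher score means more worth keeping.
        double score() { return (hits + 1.0) * rebuildCost / Math.max(1, bytes); }
    }

    private final Map<String, CacheEntry> entries = new HashMap<>();
    private final long maxBytes;
    private long usedBytes;

    DocIdSetCache(long maxBytes) { this.maxBytes = maxBytes; }

    SizedDocIdSet get(String key) {
        CacheEntry e = entries.get(key);
        if (e == null) return null;
        e.hits++;
        return e.set;
    }

    void put(String key, SizedDocIdSet set, long rebuildCost) {
        remove(key);
        CacheEntry e = new CacheEntry(set, rebuildCost);
        entries.put(key, e);
        usedBytes += e.bytes;
        // Evict the lowest-scoring entries until the cache fits its budget.
        while (usedBytes > maxBytes && !entries.isEmpty()) {
            remove(lowestScoringKey());
        }
    }

    void remove(String key) {
        CacheEntry e = entries.remove(key);
        if (e != null) usedBytes -= e.bytes;
    }

    private String lowestScoringKey() {
        String worst = null;
        double worstScore = 0;
        for (Map.Entry<String, CacheEntry> me : entries.entrySet()) {
            double s = me.getValue().score();
            if (worst == null || s < worstScore) {
                worst = me.getKey();
                worstScore = s;
            }
        }
        return worst;
    }
}
```

A linear scan for the victim is fine for a handful of cached filters; a larger cache would probably want a priority queue, and the size is captured once at put() time on the assumption that cached sets are not modified afterwards.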
We could also consider allowing DocIdSets to implement Serializable, in which case the cache manager could serialize DocIdSets to temporary storage.
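A minimal sketch of what that spill-to-disk path might look like, using plain Java serialization and a temp file. The names SerializableBitDocIdSet and DocIdSetSpill are hypothetical, the try-with-resources syntax assumes Java 7 or later, and checked exceptions are wrapped only to keep the example short.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.BitSet;

// The interface as proposed earlier in this message.
interface DocIdSet {
    boolean contains(int docId);
}

// Hypothetical DocIdSet that opts in to serialization.
class SerializableBitDocIdSet implements DocIdSet, Serializable {
    private static final long serialVersionUID = 1L;
    private final BitSet bits = new BitSet(); // BitSet is itself Serializable

    void add(int docId) { bits.set(docId); }

    public boolean contains(int docId) { return docId >= 0 && bits.get(docId); }
}

// Hypothetical helper a cache manager might call instead of discarding a set.
class DocIdSetSpill {
    // Writes the set to a new temp file and returns its path.
    static Path spill(DocIdSet set) {
        if (!(set instanceof Serializable)) {
            throw new IllegalArgumentException("not serializable: " + set.getClass().getName());
        }
        try {
            Path file = Files.createTempFile("docidset", ".ser");
            try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
                out.writeObject(set);
            }
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads a previously spilled set back into memory.
    static DocIdSet restore(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (DocIdSet) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Whether this is worthwhile depends on restoring from disk being cheaper than recreating the set, which ties back to factor #2.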
I'm not sure how you would want to handle the versioning issues around a change to the Filter interface, though.