[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12987261#action_12987261 ] Michael McCandless commented on LUCENE-2474: OK, I cutover to MBS(CHM)! Thanks Shay. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Assignee: Michael McCandless Fix For: 3.1, 4.0 Attachments: LUCENE-2474.patch, LUCENE-2474.patch, LUCENE-2474.patch, LUCENE-2574.patch, MapBackedSet.java Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12987267#action_12987267 ] Uwe Schindler commented on LUCENE-2474: --- I would make the map wrapper final, else no comments for now, but I want to look into the code, if for perf reasons more AbstractSet methods should be overridden. Is this map backed set some impl from e.g. Commons Collect or Google Collect? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Assignee: Michael McCandless Fix For: 3.1, 4.0 Attachments: LUCENE-2474.patch, LUCENE-2474.patch, LUCENE-2474.patch, LUCENE-2574.patch, MapBackedSet.java Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12984280#action_12984280 ] Michael McCandless commented on LUCENE-2474: bq. Actually, I am against the last patch you posted, as it clearly has nothing to do with this issue Woops! Heh. bq. A MultiReader is just a wrapper - you don't reopen it, so it could just start off with an empty listener list, the subs could all retain their listener lists and an addListener() could just delegate to the contained readers. Well, it does have a reopen (reopens the subs wraps in a new MR), but I guess delegation would work for MR. And, same for ParallelReader. And I think the NRT case should work fine, since we don't expose IW.getReader anymore (hmm -- this was never backported to 3.x?) -- if you new IndexReader(IW), it creates a single collection holding all listeners, and then shares it w/ all SRs. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Fix For: 3.1, 4.0 Attachments: LUCENE-2474.patch, LUCENE-2474.patch, LUCENE-2574.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983818#action_12983818 ] Yonik Seeley commented on LUCENE-2474: -- bq. eg if you make a MultiReader (or ParallelReader) w/ subs... what should happen to their listeners? This doesn't have to be a super-flexible public API - it's clearly expert level and we can impose limitations that make sense. A MultiReader is just a wrapper - you don't reopen it, so it could just start off with an empty listener list, the subs could all retain their listener lists and an addListener() could just delegate to the contained readers. bq. ReaderContext doesn't really make sense here? Right, it doesn't (which circles around to the fact that the callbacks should be set on the top-reader) bq. Isn't this typically a containsKey against the app level caches...? In a (solr) multi-core environment, you don't know *which* caches to check, or even how to get to those caches. You would either need to add *all* of them and check all of them on a purge (which has other problems, including performance problems for highly multi-tenanted uses where people have thousands of cores), or have one big static cache (which has other problems such as not being configurable per-core). I'm not against this patch, I'm just pointing out that the staticness of it (which I hope we can continue to move away from over time) severely limits it's usefulness, and it doesn't represent incremental progress if we do want to get rid of statics. Actually, I am against the last patch you posted, as it clearly has nothing to do with this issue ;-) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch, LUCENE-2574.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12983159#action_12983159 ] Michael McCandless commented on LUCENE-2474: bq. Still, I think that using CopyOnWriteArrayList is best here. OK I'll switch back to COWAL... it makes me nervous though. I like being defensive and the added cost of CHM iteration really should be negligible here. {quote} I'd like even more for there to be just a single CopyOnWriteArrayList per top-level reader that is then propagated to all sub/segment readers, including new ones on a reopen. But I guess Mike indicated that was currently too hard/hairy. {quote} This did get hairy... eg if you make a MultiReader (or ParallelReader) w/ subs... what should happen to their listeners? Ie what if the subs already have listeners enrolled? It also spooked me that apps may think they have to re-register after re-open (if we stick w/ ArrayList) since then the list'd just grow... it's trappy. And, if you pull an NRT reader from IW (which is what reopen does under the hood for an NRT reader), how to share its listeners? Ie, we'd have to add a setter to IW as well, so it's also single source (propagates on reopen). This is why I fell back to a simple static as the baby step for now. {quote} The static is really non-optimal though - among other problems, it requires systems with multiple readers (and wants to do different things with different readers, such as maintain separate caches) to figure out what top-level reader a segment reader is associated with. And given that we are dealing with IndexReader instances in the callbacks, and not ReaderContext objects, this seems impossible? {quote} ReaderContext doesn't really make sense here? Ie, the listener is invoked when any/all composite readers sharing a given segment have now closed (ie when the RC for that segment's core drops to 0), or when a composite reader is closed. Also, in practice, is it really so hard for the app to figure out which SR goes to which of their caches? Isn't this typically a containsKey against the app level caches...? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982604#action_12982604 ] Michael McCandless commented on LUCENE-2474: Ahh, I get it -- invoking the listeners (on cache evict) is dangerous to do under a global lock since they could conceivably be costly. I had switched to Set to try to prevent silliness in the event that an app adds same listener over over (w/o removing it), and also to not have O(N^2) cost when removing listeners. I mean, it is an expert API, but I still think we should attempt to be defensive against silliness? How about CHM? (There is not builtin CHS, right? And HS just wraps an HM anyway). Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982695#action_12982695 ] Shay Banon commented on LUCENE-2474: Yea, I got the reasoning for Set, we can use that, CHM with PRESENT. If you want, I can attach a simple MapBackedSet that makes any Map a Set. Still, I think that using CopyOnWriteArrayList is best here. I don't think that adding and removing listeners is something that will be done often in an app. But I might be mistaken. In this case, traversal over listeners is much better on CopyOnWriteArrayList compared to CHM. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982761#action_12982761 ] Yonik Seeley commented on LUCENE-2474: -- bq. Still, I think that using CopyOnWriteArrayList is best here. Agree - I think we should optimize for good/correct behavior. I'd like even more for there to be just a single CopyOnWriteArrayList per top-level reader that is then propagated to all sub/segment readers, including new ones on a reopen. But I guess Mike indicated that was currently too hard/hairy. The static is really non-optimal though - among other problems, it requires systems with multiple readers (and wants to do different things with different readers, such as maintain separate caches) to figure out what top-level reader a segment reader is associated with. And given that we are dealing with IndexReader instances in the callbacks, and not ReaderContext objects, this seems impossible? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12982509#action_12982509 ] Shay Banon commented on LUCENE-2474: bq. OK, here's a patch exposing the readerFinishedListeners as static methods on IndexReader. I think we should use a CopyOneWriteArrayList so calling the listeners will not happen under a global synchronize block. If maintaining set behavior is required, then I can patch with a ConcurrentHashSet implementation or we can simply replace it with a CHM with PRESENT, or any other solution that does not require calling the listeners under a global sync block. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979700#action_12979700 ] Michael McCandless commented on LUCENE-2474: I started to implement the forwards to all subs and to all reopened readers and... it's kinda hairy. I mean there are TONS of places where we make new readers (Earwin's working on improving this, I think, under LUCENE-2355). So then I wondered: what if we just make this a static method, eg on IndexReader, add/removeReaderFinishedListener? (Or we could put it on FieldCache). That'd be a tiny change... Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979720#action_12979720 ] Jason Rutherglen commented on LUCENE-2474: -- {quote}make this a static method, eg on IndexReader, add/removeReaderFinishedListener? (Or we could put it on FieldCache). That'd be a tiny change...{quote} This makes the most sense however it feels temporary as should should probably move to a unified IWC/IRC config where all parameters are set and shared for writers and readers? This way we can eventually coordinate things like IO scheduling, eg, LUCENE-2793's IOContext. Also shouldn't there simply be a reader event listener and perhaps even a writer event listener? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979733#action_12979733 ] Michael McCandless commented on LUCENE-2474: I think we can generalize this to any event and to writer in the future... for today, just letting something external be notified when a reader is gone, just as FieldCache is privately notified today, is a good baby step. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12979888#action_12979888 ] Earwin Burrfoot commented on LUCENE-2474: - bq. Earwin's working on improving this, I think, under LUCENE-2355 I stalled, and then there were just so many changes under trunk, so I have to restart now :) Thanks for another kick. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch, LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978719#action_12978719 ] Shay Banon commented on LUCENE-2474: It would be a cache of anything... one element of that cache would be the FieldCache, there could be one for filters, or one entry per-filter. edit: Maybe a better way to think about it is like a ServletContext or something - it's just a way to attach anything arbitrary to a reader. Got you. My personal taste is to try and keep those readers as lightweight as possible, and have the proper constructs in place to allow to externally use them for caching, without having them manage it as well. Not with this current patch, as there is no mechanism to get a callback when you do care about deletes. If I want to cache something that depends on deletions, I want to purge that cache when the actual reader is closed (as opposed to the reader's core cache key that is shared amongst all readers that just have different deletions). So if we go a close event route, we really want two different events... one for the close of a reader (i.e. deleted matter), and one for the close of the segment (deletes don't matter). I think that a cache that is affected by deletes is a problematic cache to begin with, so was thinking that maybe it should be discouraged by not allowing for it. Especially with NRT. My idea was to simply expand the purge capability that the FC gets for free to other external custom components. Also, if we did have a type safe separation between segment readers and compound readers, I would not have added the ability to register a listener on the compound readers, just the segment readers, as this will encourage people to write caches that only work on segment readers (since the registration for the purge event will happen within the cache, and it should work only with segment readers). That was why my patch does not take compound readers into account. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978889#action_12978889 ] Michael McCandless commented on LUCENE-2474: bq. My idea was to simply expand the purge capability that the FC gets for free to other external custom components. I think that's a good baby step, for this issue, today, just for segment readers. Separately, eventually, we can tackle the bigger challenge of allowing caching on any reader. {quote} Also, if we did have a type safe separation between segment readers and compound readers, I would not have added the ability to register a listener on the compound readers, just the segment readers, as this will encourage people to write caches that only work on segment readers (since the registration for the purge event will happen within the cache, and it should work only with segment readers). That was why my patch does not take compound readers into account. {quote} I very much want to get type safe IndexReaders done before 4.0... But: I think we'd want to have composite reader just forward the registration down to the atomic readers? (And, forward on reopen). Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978975#action_12978975 ] Shay Banon commented on LUCENE-2474: bq. But: I think we'd want to have composite reader just forward the registration down to the atomic readers? (And, forward on reopen). I am not sure that you would want to do it. Any caching layer or an external component that is properly written would work on the low level segment readers, it will not even compile against compound readers. This will help direct people to write proper code and dealing only with segment readers. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978982#action_12978982 ] Yonik Seeley commented on LUCENE-2474: -- But: I think we'd want to have composite reader just forward the registration down to the atomic readers? (And, forward on reopen). +1 I am not sure that you would want to do it. So a user doesn't have to walk the tree and re-register it's listeners with every new segment on a reopen. Any caching layer or an external component that is properly written would work on the low level segment readers It's completely proper to cache stuff wrt the top level reader - Solr does so (and will continue to provide that option) . Some people update their index infrequently, and the performance gains of having cached information on a unified view of the index is a win for them. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978371#action_12978371 ] Yonik Seeley commented on LUCENE-2474: -- There's a lot of little issues here I think: - close event doesn't really describe the behavior since an event is not generated on every close of every reader as one might expect. - This implementation is problematic since higher level readers don't propagate the event listeners to subreaders... i.e. I need to walk the tree myself and add add a listener to every reader, and on a reopen() I would need to walk the tree again and add listeners only to the new readers that have a new coreKey. We've talked before about putting caches directly on the readers - that still seems like the most straightforward approach? We really need one cache that doesn't care about deletions, and one cache that does. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978391#action_12978391 ] Michael McCandless commented on LUCENE-2474: bq. close event doesn't really describe the behavior since an event is not generated on every close of every reader as one might expect. Right, it's really more like a segment is unloaded event. Ie a single segment can have many cloned/reopened SegmentReaders, all sharing the same core (= same cache entry eg in FieldCache)... when this event occurs in means all SegmentReaders for a given segment have been closed. But, then we also need to generate this for toplevel readers, since [horribly] such readers are still allowed into eg the FieldCache. bq. This implementation is problematic since higher level readers don't propagate the event listeners to subreaders... i.e. I need to walk the tree myself and add add a listener to every reader, and on a reopen() I would need to walk the tree again and add listeners only to the new readers that have a new coreKey. I think we should just fix that, ie so your listener is propagated to the subs and to reopened readers (and their subs and their reopens, etc.). {quote} We've talked before about putting caches directly on the readers - that still seems like the most straightforward approach? {quote} This would be great, but I'm not sure I'd call it straightforward :) I think a separate baby step (ie this proposed approach) is fine for today? bq. We really need one cache that doesn't care about deletions, and one cache that does. And maybe norms, since they too can change in cloned SegmentReaders that otherwise share the same core. Or, maybe we make the core a first class object, and you interact with it to cache things that don't care about changes to deletions/norms. Or, the core could just the first SegmentReader to be opened on this segment. Or something. I think Earwin has also worked out some sort of caching model w/ IndexReaders... Earwin, how do you handle timely eviction? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978480#action_12978480 ] Shay Banon commented on LUCENE-2474: Right, I was thinking that its a low level API that you can just add it to the low level readers, but I agree, it will be nicer to have it on the high level as well. Regarding the close method name, I guess we can name it similar to the FieldCache one, maybe purge? We've talked before about putting caches directly on the readers - that still seems like the most straightforward approach? not sure I understand that. Do you mean getting FieldCache into the readers? And then what about cached filters? And other custom caching constructs that rely on the same mechanism as the CachingWrapperFilter? I think that if one implements such caching, its an advance enough feature where you should know how to handle deletes and other tidbits (if you need to). We really need one cache that doesn't care about deletions, and one cache that does. Isn't that up to the cache to decide? That cache can be anything (internally implemented in Lucene or externally) that follows the mechanism of caching based on (segment) readers. As long as there are constructs to get the deleted docs to handle deletes (for example), then the implementation can use it. Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978493#action_12978493 ] Yonik Seeley commented on LUCENE-2474: -- We've talked before about putting caches directly on the readers - that still seems like the most straightforward approach? not sure I understand that. Do you mean getting FieldCache into the readers? And then what about cached filters? It would be a cache of anything... one element of that cache would be the FieldCache, there could be one for filters, or one entry per-filter. We really need one cache that doesn't care about deletions, and one cache that does. Isn't that up to the cache to decide? Not with this current patch, as there is no mechanism to get a callback when you do care about deletes. If I want to cache something that depends on deletions, I want to purge that cache when the actual reader is closed (as opposed to the reader's core cache key that is shared amongst all readers that just have different deletions). So if we go a close event route, we really want two different events... one for the close of a reader (i.e. deleted matter), and one for the close of the segment (deletes don't matter). Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] Commented: (LUCENE-2474) Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey)
[ https://issues.apache.org/jira/browse/LUCENE-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12870559#action_12870559 ] Michael McCandless commented on LUCENE-2474: Should we rename this to CloseEventListener? Ie, when an IR is closed it'll notify those subscribers who asked to find out? Also, shouldn't the FieldCache's listener be created/installed from FieldCache, not from IR? Ie, when FieldCache creates an entry it should at that point ask the reader to notify it when that reader is closed? Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey) Key: LUCENE-2474 URL: https://issues.apache.org/jira/browse/LUCENE-2474 Project: Lucene - Java Issue Type: Improvement Components: Search Reporter: Shay Banon Attachments: LUCENE-2474.patch Allow to plug in a Cache Eviction Listener to IndexReader to eagerly clean custom caches that use the IndexReader (getFieldCacheKey). A spin of: https://issues.apache.org/jira/browse/LUCENE-2468. Basically, its make a lot of sense to cache things based on IndexReader#getFieldCacheKey, even Lucene itself uses it, for example, with the CachingWrapperFilter. FieldCache enjoys being called explicitly to purge its cache when possible (which is tricky to know from the outside, especially when using NRT - reader attack of the clones). The provided patch allows to plug a CacheEvictionListener which will be called when the cache should be purged for an IndexReader. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org