On Thu, Jul 23, 2015 at 3:37 PM, William Burns <mudokon...@gmail.com> wrote:
> I actually found another hiccup with cache stores. It seems currently we only allow for a callback when an entry is expired from a cache store when using the reaper thread [1]. However we don't allow for such a callback on a read which finds an expired entry and wants to remove it [2].
>
> Interestingly our cache stores in general don't even expire entries on load, with the few exceptions below:
>
> 1. SingleCacheStore returns true for an expired entry on contains
> 2. SingleCacheStore removes expired entries on load
> 3. RemoteStore does not need to worry about expiration since it is handled by another remote server.
>
> All of the other stores I have looked at properly return false for expired entries and only purge elements from within the reaper thread.
>
> I propose we change SingleCacheStore to behave as the other cache stores do. This doesn't require any API changes. We would then rely on the store expiring elements only during the reaper thread, or when the element expires in memory. We should also guarantee that when a cache store is used the reaper thread is enabled (throw an exception at init if a store is present and the reaper is not enabled). Should I worry about the case where only a RemoteStore is used (this seems a bit fragile)?
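To make the store contract proposed above concrete, here is a minimal, self-contained sketch: expired entries are reported as absent on load/contains but are only physically removed (with a callback) during the reaper-driven purge. The types are simplified stand-ins for illustration, not the actual Infinispan persistence SPI.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Simplified stand-in for a cache store; illustrative only.
final class SketchStore<K, V> {

    static final class Entry<V> {
        final V value;
        final long expiryTime; // absolute millis, -1 = never expires
        Entry(V value, long expiryTime) { this.value = value; this.expiryTime = expiryTime; }
        boolean isExpired(long now) { return expiryTime >= 0 && expiryTime <= now; }
    }

    private final Map<K, Entry<V>> data = new ConcurrentHashMap<>();

    void write(K key, V value, long lifespanMillis) {
        long expiry = lifespanMillis < 0 ? -1 : System.currentTimeMillis() + lifespanMillis;
        data.put(key, new Entry<>(value, expiry));
    }

    // Expired entries are reported as absent, but NOT removed here:
    // physical removal (and the expiration callback) is left to purge().
    V load(K key) {
        Entry<V> e = data.get(key);
        return (e == null || e.isExpired(System.currentTimeMillis())) ? null : e.value;
    }

    boolean contains(K key) {
        Entry<V> e = data.get(key);
        return e != null && !e.isExpired(System.currentTimeMillis());
    }

    // Called from the reaper thread; the purged-key callback is the hook that
    // would let a single expiration event be raised for each removed entry.
    void purge(Consumer<K> purgedListener) {
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<K, Entry<V>>> it = data.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, Entry<V>> me = it.next();
            if (me.getValue().isExpired(now)) {
                it.remove();
                purgedListener.accept(me.getKey());
            }
        }
    }
}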
+1, I wouldn't add a special case for RemoteStore.

> To be honest we would need to revamp the CacheLoader/Writer API at a later point to allow for values to be optionally provided for expiration anyway, so I would say to do that in addition to allowing loaders/stores to expire on access.

Sounds good.

> [1] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/AdvancedCacheWriter.java#L29
> [2] https://github.com/infinispan/infinispan/blob/master/core/src/main/java/org/infinispan/persistence/spi/CacheLoader.java#L34
>
>
> ---------- Forwarded message ---------
> From: William Burns <mudokon...@gmail.com>
> Date: Wed, Jul 22, 2015 at 11:06 AM
> Subject: Re: [infinispan-dev] Strict Expiration
> To: infinispan -Dev List <infinispan-dev@lists.jboss.org>
>
>
> On Wed, Jul 22, 2015 at 10:53 AM Dan Berindei <dan.berin...@gmail.com> wrote:
>>
>> Is it possible/feasible to skip the notification from the backups to the primary (and back) when there is no clustered expiration listener?
>
>
> Unfortunately there is no way to distinguish whether a listener is for create, modify, remove or expiration events. So this would only work if there are no clustered listeners at all.
>
> This however should be feasible.

This shouldn't be hard to add. It should be good enough.

>
> The only thing I would have to figure out is what happens in the case of a rehash, when the node that removed the value is now the primary owner, some nodes still have the old value, and someone registers an expiration listener. I am thinking I should only raise the event if the primary owner still has the value.

+1

>
>>
>> Dan
>>
>>
>> On Tue, Jul 21, 2015 at 5:25 PM, William Burns <mudokon...@gmail.com> wrote:
>> > So I wanted to sum up what the plan looks like for cluster expiration in ISPN 8.
>> >
>> > First off, to not make it ambiguous: maxIdle used with a clustered cache will provide undefined and unsupported behavior. This can and will expire entries on a single node without notifying other cluster members (essentially it will operate as it does today, unchanged).
>> >
>> > This leaves me to talk solely about lifespan cluster expiration.
>> >
>> > Lifespan expiration events are fired by the primary owner of an expired key:
>> >
>> > - when accessing an expired entry.
>> >
>> > - by the reaper thread.
>> >
>> > If the expiration is detected by a node other than the primary owner, an expiration command is sent to the primary owner and null is returned immediately, without waiting for a response.
>> >
>> > Expiration event listeners follow the usual rules for sync/async: in the case of a sync listener, the handler is invoked while holding the lock, whereas an async listener will not hold locks.
>> >
>> > It is desirable for expiration events to contain both the key and value. However, currently cache stores do not provide the value when they expire entries. Thus we can only guarantee the value is present when an in-memory expiration event occurs. We could plan on adding this later.
>> >
>> > Also, as you may have guessed, this doesn't touch strict expiration, which I think we have come to the conclusion should only work with maxIdle, and as such it is not explored in this iteration.
>> >
>> > Let me know if you guys think this approach is okay.
>> >
>> > Cheers,
>> >
>> > - Will
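To make the listener-facing side of that summary concrete, a short sketch of registering a clustered expiration listener follows. It assumes the @CacheEntryExpired annotation and CacheEntryExpiredEvent type introduced by this work end up under these names and packages, so treat it as illustrative rather than final API.

import org.infinispan.Cache;
import org.infinispan.notifications.Listener;
import org.infinispan.notifications.cachelistener.annotation.CacheEntryExpired;
import org.infinispan.notifications.cachelistener.event.CacheEntryExpiredEvent;

@Listener(clustered = true)
public class ExpirationLogger {

   @CacheEntryExpired
   public void onExpired(CacheEntryExpiredEvent<String, String> event) {
      // With the proposal above this fires once per expired key, driven by the
      // primary owner. The value may be null when the entry expired from a
      // cache store, since stores do not currently provide it.
      System.out.printf("Key %s expired (value=%s)%n", event.getKey(), event.getValue());
   }

   public static void register(Cache<String, String> cache) {
      cache.addListener(new ExpirationLogger());
   }
}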
>> >
>> > On Tue, Jul 14, 2015 at 1:51 PM Radim Vansa <rva...@redhat.com> wrote:
>> >>
>> >> Yes, I know about [1]. I've worked around that by storing a timestamp in the entry as well, and when a new record is added, the 'expired' invalidations are purged. But I can't purge that if I don't access it - Infinispan needs to handle that internally.
>> >>
>> >> Radim
>> >>
>> >> [1] https://hibernate.atlassian.net/browse/HHH-6219
>> >>
>> >> On 07/14/2015 05:45 PM, Dennis Reed wrote:
>> >> > On 07/14/2015 11:08 AM, Radim Vansa wrote:
>> >> >> On 07/14/2015 04:19 PM, William Burns wrote:
>> >> >>>
>> >> >>> On Tue, Jul 14, 2015 at 9:37 AM William Burns <mudokon...@gmail.com> wrote:
>> >> >>>
>> >> >>>     On Tue, Jul 14, 2015 at 4:41 AM Dan Berindei <dan.berin...@gmail.com> wrote:
>> >> >>>
>> >> >>>         Processing expiration only on the reaper thread sounds nice, but I have one reservation: processing 1 million entries to see that 1 of them is expired is a lot of work, and in the general case we will not be able to ensure an expiration precision of less than 1 minute (maybe more, with a huge SingleFileStore attached).
>> >> >>>
>> >> >>>     This isn't much different than before. The only difference is that if a user touched a value after it expired it wouldn't show up (which is unlikely with maxIdle especially).
>> >> >>>
>> >> >>>         What happens to users who need better precision? In particular, I know some JCache tests were failing because HotRod was only supporting 1-second resolution instead of the 1-millisecond resolution they were expecting.
>> >> >>>
>> >> >>>     JCache is an interesting piece. The thing about JCache is that the spec is only defined for local caches. However I wouldn't want to muddy up the waters in regards to it behaving differently for local/remote. In the JCache scenario we could add an interceptor to prevent it returning such values (we do something similar already for events). JCache behavior vs ISPN behavior seems a bit easier to differentiate. But like you are getting at, either way is not very appealing.
>> >> >>>
>> >> >>>         I'm even less convinced about the need to guarantee that a clustered expiration listener will only be triggered once, and that the entry must be null everywhere after that listener was invoked. What's the use case?
>> >> >>>
>> >> >>>     Maybe Tristan would know more to answer. To be honest this work seems fruitless unless we know what our end users want here. Spending time on something for it to be thrown out is never fun :(
>> >> >>>
>> >> >>>     And the more I thought about this the more I question the validity of maxIdle even. It seems like a very poor way to prevent memory exhaustion, which eviction does in a much better way and has much more flexible algorithms. Does anyone know what maxIdle would be used for that wouldn't be covered by eviction? The only thing I can think of is cleaning up the cache store as well.
>> >> >>>
>> >> >>> Actually I guess for session/authentication related information this would be important. However maxIdle isn't really as usable in that case since most likely you would have a sticky session to go back to that node, which means you would never refresh the last used date on the copies (current implementation). Without cluster expiration you could lose that session information on a failover very easily.
>> >> >>
>> >> >> I would say that maxIdle can be used for memory management as a kind of WeakHashMap - e.g. in 2LC maxIdle is used to store some record for a short while (regular transaction lifespan ~ seconds to minutes), and regularly the record is removed. However, to make sure that we don't leak records in this cache (if something goes wrong and the remove does not occur), maxIdle removes it eventually.
>> >> >
>> >> > Note that just relying on maxIdle doesn't guarantee you won't leak records in this use case (specifically with the way the current hibernate-infinispan 2LC implementation uses it).
>> >> >
>> >> > Hibernate-infinispan adds entries to its own Map stored in Infinispan, and expects maxIdle to remove the map if it skips a remove. But in a current case we found that, due to frequent accesses to that same map, the entries never idle out and it ends up in an OOME.
>> >> >
>> >> > -Dennis
>> >> >
>> >> >> I can guess how long the transaction takes, but not how many parallel transactions there are. With eviction algorithms (where I am not sure about the exact guarantees) I can set the cache to not hold more than N entries, but I can't know for sure that my record does not suddenly get evicted after a shorter period, possibly causing some inconsistency. So this is similar to WeakHashMap by removing the key "when it can't be used anymore", because I know that the transaction will finish before the deadline. I don't care about the exact size, I don't want to tune that, I just don't want to leak.
>> >> >>
>> >> >> From my POV the non-strict maxIdle and strict expiration would be a nice compromise.
>> >> >>
>> >> >> Radim
>> >> >>>
>> >> >>>         Note that this would make the reaper thread less efficient: with numOwners=2 (best case), half of the entries that the reaper touches cannot be expired, because the node isn't the primary node. And to make matters worse, the same reaper thread would have to perform a (synchronous?) RPC for each entry to ensure it expires everywhere.
>> >> >>>
>> >> >>>     I have debated about this; it could be something like a sync removeAll with a special marker to indicate it is due to expiration (which would raise listeners there), while also sending a cluster expiration event to the other non-owners.
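A rough, hedged sketch of that "sync removeAll with a special expiration marker" idea: the primary owner removes the expired entry (conceptually under the key's lock), raises the expiration event once, and propagates a removal flagged as expiration so backups drop their copies without firing again. All names below are hypothetical stand-ins, not the actual Infinispan commands.

import java.util.concurrent.CompletableFuture;

// Hypothetical types; illustrative only, not the real Infinispan internals.
final class LifespanExpirationSketch {

    interface Rpc {
        // Flagged removal so backups drop the entry without firing their own event.
        CompletableFuture<Void> invokeOnBackups(Object key, boolean dueToExpiration);
    }

    interface DataContainer {
        CachedEntry peekIgnoringExpiration(Object key);
        boolean removeIfSame(Object key, CachedEntry expected);
    }

    interface CachedEntry {
        Object value();
        boolean isExpired(long nowMillis);
    }

    interface Notifier {
        void notifyExpired(Object key, Object value); // single, cluster-wide event
    }

    private final Rpc rpc;
    private final DataContainer container;
    private final Notifier notifier;

    LifespanExpirationSketch(Rpc rpc, DataContainer container, Notifier notifier) {
        this.rpc = rpc; this.container = container; this.notifier = notifier;
    }

    /** Called by the reaper (or on access) on the primary owner only. */
    void expireIfNeeded(Object key) {
        CachedEntry entry = container.peekIgnoringExpiration(key);
        if (entry == null || !entry.isExpired(System.currentTimeMillis())) {
            return; // nothing to do
        }
        // Remove locally first, then raise the event exactly once, on the primary owner.
        if (container.removeIfSame(key, entry)) {
            notifier.notifyExpired(key, entry.value());
            // Propagate as an expiration-flagged removal so backups just drop it.
            rpc.invokeOnBackups(key, true);
        }
    }
}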
>> >> >>> >> >> >>> >> >> >>> For maxIdle I'd like to know more information about how >> >> >>> exactly the >> >> >>> owners would coordinate to expire an entry. I'm pretty >> >> >>> sure >> >> >>> we >> >> >>> cannot >> >> >>> avoid ignoring some reads (expiring an entry immediately >> >> >>> after >> >> >>> it was >> >> >>> read), and ensuring that we don't accidentally extend an >> >> >>> entry's life >> >> >>> (like the current code does, when we transfer an entry to >> >> >>> a >> >> >>> new owner) >> >> >>> also sounds problematic. >> >> >>> >> >> >>> >> >> >>> For lifespan it is simple, the primary owner just expires it >> >> >>> when >> >> >>> it expires there. There is no coordination needed in this >> >> >>> case >> >> >>> it >> >> >>> just sends the expired remove to owners etc. >> >> >>> >> >> >>> Max idle is more complicated as we all know. The primary >> >> >>> owner >> >> >>> would send a request for the last used time for a given key >> >> >>> or >> >> >>> set >> >> >>> of keys. Then the owner would take those times and check for >> >> >>> a >> >> >>> new access it isn't aware of. If there isn't then it would >> >> >>> send >> >> >>> a >> >> >>> remove command for the key(s). If there is a new access the >> >> >>> owner >> >> >>> would instead send the last used time to all of the owners. >> >> >>> The >> >> >>> expiration obviously would have a window that if a read >> >> >>> occurred >> >> >>> after sending a response that could be ignored. This could >> >> >>> be >> >> >>> resolved by using some sort of 2PC and blocking reads during >> >> >>> that >> >> >>> period but I would say it isn't worth it. >> >> >>> >> >> >>> The issue with transferring to a new node refreshing the last >> >> >>> update/lifespan seems like just a bug we need to fix >> >> >>> irrespective >> >> >>> of this issue IMO. >> >> >>> >> >> >>> >> >> >>> I'm not saying expiring entries on each node >> >> >>> independently >> >> >>> is >> >> >>> perfect, >> >> >>> far from it. But I wouldn't want us to provide new >> >> >>> guarantees that >> >> >>> could hurt performance without a really good use case. >> >> >>> >> >> >>> >> >> >>> I would guess that user perceived performance should be a >> >> >>> little >> >> >>> faster with this. But this also depends on an alternative >> >> >>> that >> >> >>> we >> >> >>> decided on :) >> >> >>> >> >> >>> Also the expiration thread pool is set to min priority atm so >> >> >>> it >> >> >>> may delay removal of said objects but hopefully (if the jvm >> >> >>> supports) it wouldn't overrun a CPU while processing unless >> >> >>> it >> >> >>> has >> >> >>> availability. >> >> >>> >> >> >>> >> >> >>> Cheers >> >> >>> Dan >> >> >>> >> >> >>> >> >> >>> On Mon, Jul 13, 2015 at 9:25 PM, Tristan Tarrant >> >> >>> <ttarr...@redhat.com <mailto:ttarr...@redhat.com>> wrote: >> >> >>> > After re-reading the whole original thread, I agree >> >> >>> with >> >> >>> the >> >> >>> proposal >> >> >>> > with two caveats: >> >> >>> > >> >> >>> > - ensure that we don't break JCache compatibility >> >> >>> > - ensure that we document this properly >> >> >>> > >> >> >>> > Tristan >> >> >>> > >> >> >>> > On 13/07/2015 18:41, Sanne Grinovero wrote: >> >> >>> >> +1 >> >> >>> >> You had me convinced at the first line, although "A >> >> >>> lot >> >> >>> of >> >> >>> code can now >> >> >>> >> be removed and made simpler" makes it look extremely >> >> >>> nice. 
>> >> >>> >> >> >> >>> >> On 13 Jul 2015 18:14, "William Burns" >> >> >>> <mudokon...@gmail.com >> >> >>> <mailto:mudokon...@gmail.com> >> >> >>> >> <mailto:mudokon...@gmail.com >> >> >> >> >>> <mailto:mudokon...@gmail.com>>> wrote: >> >> >>> >> >> >> >>> >> This is a necro of [1]. >> >> >>> >> >> >> >>> >> With Infinispan 8.0 we are adding in clustered >> >> >>> expiration. That >> >> >>> >> includes an expiration event raised that is >> >> >>> clustered >> >> >>> as well. >> >> >>> >> Unfortunately expiration events currently occur >> >> >>> multiple times (if >> >> >>> >> numOwners > 1) at different times across nodes in >> >> >>> a >> >> >>> cluster. This >> >> >>> >> makes coordinating a single cluster expiration >> >> >>> event >> >> >>> quite difficult. >> >> >>> >> >> >> >>> >> To work around this I am proposing that the >> >> >>> expiration >> >> >>> of an event >> >> >>> >> is done solely by the owner of the given key that >> >> >>> is >> >> >>> now expired. >> >> >>> >> This would fix the issue of having multiple events >> >> >>> and >> >> >>> the event can >> >> >>> >> be raised while holding the lock for the given key >> >> >>> so >> >> >>> concurrent >> >> >>> >> modifications would not be an issue. >> >> >>> >> >> >> >>> >> The problem arises when you have other nodes that >> >> >>> have >> >> >>> expiration >> >> >>> >> set but expire at different times. Max idle is >> >> >>> the >> >> >>> biggest offender >> >> >>> >> with this as a read on an owner only refreshes the >> >> >>> owners timestamp, >> >> >>> >> meaning other owners would not be updated and >> >> >>> expire >> >> >>> preemptively. >> >> >>> >> To have expiration work properly in this case you >> >> >>> would >> >> >>> need >> >> >>> >> coordination between the owners to see if anyone >> >> >>> has >> >> >>> a >> >> >>> higher >> >> >>> >> value. This requires blocking and would have to >> >> >>> be >> >> >>> done while >> >> >>> >> accessing a key that is expired to be sure if >> >> >>> expiration happened or >> >> >>> >> not. >> >> >>> >> >> >> >>> >> The linked dev listing proposed instead to only >> >> >>> expire >> >> >>> an entry by >> >> >>> >> the reaper thread and not on access. In this case >> >> >>> a >> >> >>> read will >> >> >>> >> return a non null value until it is fully expired, >> >> >>> increasing hit >> >> >>> >> ratios possibly. >> >> >>> >> >> >> >>> >> Their are quire a bit of real benefits for this: >> >> >>> >> >> >> >>> >> 1. Cluster cache reads would be much simpler and >> >> >>> wouldn't have to >> >> >>> >> block to verify the object exists or not since >> >> >>> this >> >> >>> would only be >> >> >>> >> done by the reaper thread (note this would have >> >> >>> only >> >> >>> happened if the >> >> >>> >> entry was expired locally). An access would just >> >> >>> return the value >> >> >>> >> immediately. >> >> >>> >> 2. Each node only expires entries it owns in the >> >> >>> reaper >> >> >>> thread >> >> >>> >> reducing how many entries they must check or >> >> >>> remove. >> >> >>> This also >> >> >>> >> provides a single point where events would be >> >> >>> raised >> >> >>> as >> >> >>> we need. >> >> >>> >> 3. A lot of code can now be removed and made >> >> >>> simpler >> >> >>> as >> >> >>> it no longer >> >> >>> >> has to check for expiration. The expiration check >> >> >>> would only be >> >> >>> >> done in 1 place, the expiration reaper thread. 
>> >> >>> >> >> >> >>> >> The main issue with this proposal is as the other >> >> >>> listing mentions >> >> >>> >> is if user code expects the value to be gone after >> >> >>> expiration for >> >> >>> >> correctness. I would say this use case is not as >> >> >>> compelling for >> >> >>> >> maxIdle, especially since we never supported it >> >> >>> properly. And in >> >> >>> >> the case of lifespan the user could very easily >> >> >>> store >> >> >>> the expiration >> >> >>> >> time in the object that they can check after a get >> >> >>> as >> >> >>> pointed out in >> >> >>> >> the other thread. >> >> >>> >> >> >> >>> >> [1] >> >> >>> >> >> >> >>> >> >> >>> >> >> >>> http://infinispan-developer-list.980875.n3.nabble.com/infinispan-dev-strictly-not-returning-expired-values-td3428763.html >> >> >>> >> >> >> >>> >> _______________________________________________ >> >> >>> >> infinispan-dev mailing list >> >> >>> >> infinispan-dev@lists.jboss.org >> >> >>> <mailto:infinispan-dev@lists.jboss.org> >> >> >>> <mailto:infinispan-dev@lists.jboss.org >> >> >>> <mailto:infinispan-dev@lists.jboss.org>> >> >> >>> >> >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> >> >> >> >>> >> >> >> >>> >> >> >> >>> >> _______________________________________________ >> >> >>> >> infinispan-dev mailing list >> >> >>> >> infinispan-dev@lists.jboss.org >> >> >>> <mailto:infinispan-dev@lists.jboss.org> >> >> >>> >> >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> >> >> >> >>> > >> >> >>> > -- >> >> >>> > Tristan Tarrant >> >> >>> > Infinispan Lead >> >> >>> > JBoss, a division of Red Hat >> >> >>> > _______________________________________________ >> >> >>> > infinispan-dev mailing list >> >> >>> > infinispan-dev@lists.jboss.org >> >> >>> <mailto:infinispan-dev@lists.jboss.org> >> >> >>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> _______________________________________________ >> >> >>> infinispan-dev mailing list >> >> >>> infinispan-dev@lists.jboss.org >> >> >>> <mailto:infinispan-dev@lists.jboss.org> >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >>> >> >> >>> >> >> >>> >> >> >>> _______________________________________________ >> >> >>> infinispan-dev mailing list >> >> >>> infinispan-dev@lists.jboss.org >> >> >>> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> > _______________________________________________ >> >> > infinispan-dev mailing list >> >> > infinispan-dev@lists.jboss.org >> >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> >> >> >> >> >> -- >> >> Radim Vansa <rva...@redhat.com> >> >> JBoss Performance Team >> >> >> >> _______________________________________________ >> >> infinispan-dev mailing list >> >> infinispan-dev@lists.jboss.org >> >> https://lists.jboss.org/mailman/listinfo/infinispan-dev >> > >> > >> > _______________________________________________ >> > infinispan-dev mailing list >> > infinispan-dev@lists.jboss.org >> > https://lists.jboss.org/mailman/listinfo/infinispan-dev >> _______________________________________________ >> infinispan-dev mailing list >> infinispan-dev@lists.jboss.org >> https://lists.jboss.org/mailman/listinfo/infinispan-dev > > > _______________________________________________ > infinispan-dev mailing list > infinispan-dev@lists.jboss.org > https://lists.jboss.org/mailman/listinfo/infinispan-dev _______________________________________________ infinispan-dev mailing list infinispan-dev@lists.jboss.org 
https://lists.jboss.org/mailman/listinfo/infinispan-dev
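As a footnote to the workaround mentioned in the quoted proposal (storing the expiration time inside the stored object so callers can check it after a get), here is a minimal sketch of such a wrapper. ExpiringValue is a hypothetical application-side type, not an Infinispan API.

import java.io.Serializable;
import java.util.concurrent.TimeUnit;
import org.infinispan.Cache;

// Application-side wrapper carrying its own expiry deadline.
final class ExpiringValue<V> implements Serializable {
    final V value;
    final long expiryTimeMillis; // absolute wall-clock deadline

    ExpiringValue(V value, long lifespanMillis) {
        this.value = value;
        this.expiryTimeMillis = System.currentTimeMillis() + lifespanMillis;
    }

    boolean isExpired() {
        return System.currentTimeMillis() >= expiryTimeMillis;
    }

    static <K, V> void put(Cache<K, ExpiringValue<V>> cache, K key, V value, long lifespanMillis) {
        // Store with the same lifespan so the cache still cleans it up eventually.
        cache.put(key, new ExpiringValue<>(value, lifespanMillis), lifespanMillis, TimeUnit.MILLISECONDS);
    }

    static <K, V> V get(Cache<K, ExpiringValue<V>> cache, K key) {
        ExpiringValue<V> wrapper = cache.get(key);
        // Treat a not-yet-reaped but logically expired entry as absent.
        return (wrapper == null || wrapper.isExpired()) ? null : wrapper.value;
    }
}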