[ 
https://issues.apache.org/jira/browse/IGNITE-12124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vyacheslav Koptilin updated IGNITE-12124:
-----------------------------------------
    Description: 
Stopping a cache with configured TTL may lead to errors. For instance,
{noformat}
java.lang.NullPointerException
        at org.apache.ignite.internal.processors.cache.GridCacheContext.onDeferredDelete(GridCacheContext.java:1702)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:4040)
        at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:75)
        at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:66)
        at org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:37)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2501)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2427)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:989)
        at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:233)
        at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:150)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
        at java.lang.Thread.run(Thread.java:748)
{noformat}
The apparent reason for this {{NullPointerException}} is that unregistering 
{{GridCacheTtlManager}} (see {{GridCacheSharedTtlCleanupManager#unregister}}) 
does not wait for a running expiration pass to finish; in this particular 
case, {{GridCacheContext}} had already been cleaned up.

So, unregistering {{GridCacheTtlManager}} during a cache stop must wait for 
expiration if it is running for the cache being stopped. On the other hand, 
it does not seem correct to wait for expiration while holding the 
{{checkpointReadLock}} (see 
{{GridCacheProcessor#processCacheStopRequestOnExchangeDone}}):

{code:java}
    private void processCacheStopRequestOnExchangeDone(ExchangeActions exchActions) {
        ...
        try {
            doInParallel(
                    parallelismLvl,
                    sharedCtx.kernalContext().getSystemExecutorService(),
                    cachesToStop.entrySet(),
                    cachesToStopByGrp -> {
                            ...
                            for (ExchangeActions.CacheActionData action : cachesToStopByGrp.getValue()) {
                                ...
                                sharedCtx.database().checkpointReadLock();

                                try {
                                    // GridCacheTtlManager is unregistered inside this call.
                                    prepareCacheStop(action.request().cacheName(), action.request().destroy());
                                }
                                finally {
                                    sharedCtx.database().checkpointReadUnlock();
                                }
                            }
        ...
    }
{code}
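
One possible shape of the requirement described above, as a minimal sketch and not the actual Ignite fix: expiration passes share a read lock, unregistering takes the write lock (so it blocks until an in-flight pass ends), and the potentially blocking wait happens before the checkpoint read lock is acquired. {{TtlUnregisterSketch}}, {{TtlManager}}, {{unregisterAndAwait}}, {{stopCache}} and {{busyLock}} are hypothetical names invented for illustration; none of them are Ignite APIs, and the checkpoint lock is modelled by a plain {{ReentrantReadWriteLock}}.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Simplified model only; none of these types are Ignite's real classes. */
final class TtlUnregisterSketch {
    /** Hypothetical stand-in for a per-cache TTL manager. */
    static final class TtlManager {
        /** Expiration passes share the read lock; unregistering takes the write lock. */
        private final ReentrantReadWriteLock busyLock = new ReentrantReadWriteLock();

        /** Stands in for the cache context that is cleaned up on stop (guarded by busyLock). */
        private Object cacheCtx = new Object();

        /** Runs one expiration pass; returns false if the manager is already unregistered. */
        boolean expire(Runnable purge) {
            busyLock.readLock().lock();
            try {
                if (cacheCtx == null)
                    return false; // Cache is stopping: skip instead of dereferencing null.

                purge.run(); // The context stays valid for the whole pass.

                return true;
            }
            finally {
                busyLock.readLock().unlock();
            }
        }

        /** Blocks until any in-flight pass finishes, then drops the context. */
        void unregisterAndAwait() {
            busyLock.writeLock().lock();
            try {
                cacheCtx = null;
            }
            finally {
                busyLock.writeLock().unlock();
            }
        }
    }

    /** Hypothetical stop sequence: wait for expiration first, then take the checkpoint read lock. */
    static void stopCache(TtlManager mgr, ReentrantReadWriteLock checkpointLock, Runnable cleanup) {
        mgr.unregisterAndAwait(); // May block, so it runs outside the checkpoint read lock.

        checkpointLock.readLock().lock();
        try {
            cleanup.run();
        }
        finally {
            checkpointLock.readLock().unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        TtlManager mgr = new TtlManager();
        CountDownLatch passStarted = new CountDownLatch(1);
        AtomicBoolean passFinished = new AtomicBoolean();

        // Worker thread simulating a slow expiration pass.
        Thread worker = new Thread(() -> mgr.expire(() -> {
            passStarted.countDown();
            try {
                Thread.sleep(100);
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            passFinished.set(true);
        }));
        worker.start();
        passStarted.await(); // The pass is now in flight.

        AtomicBoolean finishedBeforeCleanup = new AtomicBoolean();
        stopCache(mgr, new ReentrantReadWriteLock(), () -> finishedBeforeCleanup.set(passFinished.get()));
        worker.join();

        System.out.println("pass finished before cleanup: " + finishedBeforeCleanup.get());
        System.out.println("expire after stop ran: " + mgr.expire(() -> { }));
    }
}
```

In this model a pass that starts after the stop returns immediately without touching the context, and the stop path never waits for expiration while holding the (modelled) checkpoint read lock. Whether the real code can reorder these steps this way depends on what else {{prepareCacheStop}} does under the lock, which this sketch does not cover.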


  was:
Stopping a cache with configured TTL may lead to errors. For instance,
{noformat}
java.lang.NullPointerException
        at org.apache.ignite.internal.processors.cache.GridCacheContext.onDeferredDelete(GridCacheContext.java:1702)
        at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.onTtlExpired(GridCacheMapEntry.java:4040)
        at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:75)
        at org.apache.ignite.internal.processors.cache.GridCacheTtlManager$1.applyx(GridCacheTtlManager.java:66)
        at org.apache.ignite.internal.util.lang.IgniteInClosure2X.apply(IgniteInClosure2X.java:37)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2501)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2427)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:989)
        at org.apache.ignite.internal.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:233)
        at org.apache.ignite.internal.processors.cache.GridCacheSharedTtlCleanupManager$CleanupWorker.body(GridCacheSharedTtlCleanupManager.java:150)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:119)
        at java.lang.Thread.run(Thread.java:748)
{noformat}
The obvious reason for this {{NullPointerException}} is that unregistering of 
{{GridCacheTtlManager}} (see {{GridCacheSharedTtlCleanupManager#unregister}} 
does not wait for the finish of expiration (in that particular case, 
{{GridCacheContext}} is already cleaned up).

 

So, it seems to me, unregistering of {{GridCacheTtlManager, caused by cache 
stopping, must wait for expiration if it is running for the cache to be 
stopped.}}


> Stopping the cache does not wait for expiration process, which may be started 
> and may lead to errors
> ----------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-12124
>                 URL: https://issues.apache.org/jira/browse/IGNITE-12124
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.7
>            Reporter: Vyacheslav Koptilin
>            Assignee: Vyacheslav Koptilin
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 10m
>  Remaining Estimate: 0h



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
