[ 
https://issues.apache.org/jira/browse/IGNITE-20299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17760136#comment-17760136
 ] 

Raymond Wilson commented on IGNITE-20299:
-----------------------------------------

I think persistence is an important for this issue as it is on restart that the 
grid complains that it cannot (a) start the incorrectly created cache (which 
raises the question as to why it is still known about if creation of it was 
unsuccessful) and (b) fails to initialise the persisted caches.

The cache folder for the incorrectly create cache is also constructed which 
indicates that the grid has somehow accepted the cache as a valid new cache 
while at the same time throwing the exchange process exception, all of which 
indicates the validation of the parameters for the new cache is not enforcing 
the requirement for the data region to be known.


> Creating a cache with an unknown data region name causes total unrecoverable 
> failure of the grid
> ------------------------------------------------------------------------------------------------
>
>                 Key: IGNITE-20299
>                 URL: https://issues.apache.org/jira/browse/IGNITE-20299
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.15
>         Environment: Observed in:
> C# client and grid running on Linux in a container
> C# client and grid running on Windows
>  
>            Reporter: Raymond Wilson
>            Priority: Major
>
> Using the Ignite C# client.
>  
> Given a running grid, having a client (and perhaps server) node in the grid 
> attempt to create a cache using a DataRegionName that does not exist in the 
> grid causes immediate failure in the client node with the following log 
> output. 
>  
> 2023-08-27 17:08:48,520 [44] INF [ImmutableClientServer]   Completed 
> partition exchange [localNode=15122bd7-bf81-44e6-a548-e70dbd9334c0, 
> exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion 
> [topVer=15, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode 
> [id=9d5ed68d-38bb-447d-aed5-189f52660716, 
> consistentId=9d5ed68d-38bb-447d-aed5-189f52660716, addrs=ArrayList 
> [127.0.0.1], sockAddrs=null, discPort=0, order=8, intOrder=8, 
> lastExchangeTime=1693112858024, loc=false, ver=2.15.0#20230425-sha1:f98f7f35, 
> isClient=true], rebalanced=false, done=true, newCrdFut=null], 
> topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0]]
> 2023-08-27 17:08:48,520 [44] INF [ImmutableClientServer]   Exchange timings 
> [startVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], 
> resVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], stage="Waiting in 
> exchange queue" (14850 ms), stage="Exchange parameters initialization" (2 
> ms), stage="Determine exchange type" (3 ms), stage="Exchange done" (4 ms), 
> stage="Total time" (14859 ms)]
> 2023-08-27 17:08:48,522 [44] INF [ImmutableClientServer]   Exchange longest 
> local stages [startVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], 
> resVer=AffinityTopologyVersion [topVer=15, minorTopVer=0]]
> 2023-08-27 17:08:48,524 [44] INF [ImmutableClientServer]   Finished exchange 
> init [topVer=AffinityTopologyVersion [topVer=15, minorTopVer=0], crd=false]
> 2023-08-27 17:08:48,525 [44] INF [ImmutableClientServer]   
> AffinityTopologyVersion [topVer=15, minorTopVer=0], evt=NODE_FAILED, 
> evtNode=9d5ed68d-38bb-447d-aed5-189f52660716, client=true]
> Unhandled exception: Apache.Ignite.Core.Cache.CacheException: class 
> org.apache.ignite.IgniteCheckedException: Failed to complete exchange process.
>  ---> Apache.Ignite.Core.Common.IgniteException: Failed to complete exchange 
> process.
>  ---> Apache.Ignite.Core.Common.JavaException: javax.cache.CacheException: 
> class org.apache.ignite.IgniteCheckedException: Failed to complete exchange 
> process.
>         at 
> org.apache.ignite.internal.processors.cache.GridCacheUtils.convertToCacheException(GridCacheUtils.java:1272)
>         at 
> org.apache.ignite.internal.IgniteKernal.getOrCreateCache0(IgniteKernal.java:2278)
>         at 
> org.apache.ignite.internal.IgniteKernal.getOrCreateCache(IgniteKernal.java:2242)
>         at 
> org.apache.ignite.internal.processors.platform.PlatformProcessorImpl.processInStreamOutObject(PlatformProcessorImpl.java:643)
>         at 
> org.apache.ignite.internal.processors.platform.PlatformTargetProxyImpl.inStreamOutObject(PlatformTargetProxyImpl.java:79)
> Caused by: class org.apache.ignite.IgniteCheckedException: Failed to complete 
> exchange process.
>         at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.createExchangeException(GridDhtPartitionsExchangeFuture.java:3709)
>         at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.sendExchangeFailureMessage(GridDhtPartitionsExchangeFuture.java:3737)
>         at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.finishExchangeOnCoordinator(GridDhtPartitionsExchangeFuture.java:3832)
>         at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onAllReceived(GridDhtPartitionsExchangeFuture.java:3813)
>         at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:1796)
>         at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:1053)
>         at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:3348)
>         at 
> org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:3182)
>         at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
>         at java.base/java.lang.Thread.run(Thread.java:829)
>         Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to 
> initialize exchange locally [locNodeId=e9325b04-00fa-452e-9796-989b47b860ea]
>                 at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onCacheChangeRequest(GridDhtPartitionsExchangeFuture.java:1483)
>                 at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:979)
>                 ... 4 more
>         Caused by: class org.apache.ignite.IgniteCheckedException: Requested 
> DataRegion is not configured: Default-Mutable
>                 at 
> org.apache.ignite.internal.processors.cache.persistence.IgniteCacheDatabaseSharedManager.dataRegion(IgniteCacheDatabaseSharedManager.java:896)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheGroup(GridCacheProcessor.java:2463)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.getOrCreateCacheGroupContext(GridCacheProcessor.java:2181)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheContext(GridCacheProcessor.java:1991)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheStart(GridCacheProcessor.java:1926)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$55a0e703$1(GridCacheProcessor.java:1801)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCachesIfPossible$16(GridCacheProcessor.java:1771)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareStartCaches(GridCacheProcessor.java:1798)
>                 at 
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareStartCachesIfPossible(GridCacheProcessor.java:1769)
>                 at 
> org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processCacheStartRequests(CacheAffinitySharedManager.java:1000)
>                 at 
> org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.onCacheChangeRequest(CacheAffinitySharedManager.java:886)
>                 at 
> org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.onCacheChangeRequest(GridDhtPartitionsExchangeFuture.java:1472)
>                 ... 5 more
>    at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.ExceptionCheck()
>    at Apache.Ignite.Core.Impl.Unmanaged.Jni.Env.CallObjectMethod(GlobalRef 
> obj, IntPtr methodId, Int64* argsPtr)
>    at 
> Apache.Ignite.Core.Impl.Unmanaged.UnmanagedUtils.TargetInStreamOutObject(GlobalRef
>  target, Int32 opType, Int64 inMemPtr)
>    at Apache.Ignite.Core.Impl.PlatformJniTarget.InStreamOutObject(Int32 type, 
> Action`1 writeAction)
>    --- End of inner exception stack trace —
>    --- End of inner exception stack trace —
>    at Apache.Ignite.Core.Impl.PlatformJniTarget.InStreamOutObject(Int32 type, 
> Action`1 writeAction)
>    at Apache.Ignite.Core.Impl.PlatformTargetAdapter.DoOutOpObject(Int32 type, 
> Action`1 action)
>    at 
> Apache.Ignite.Core.Impl.Ignite.GetOrCreateCache[TK,TV](CacheConfiguration 
> configuration, NearCacheConfiguration nearConfiguration, 
> PlatformCacheConfiguration platformCacheConfiguration, Op op)
>    at 
> Apache.Ignite.Core.Impl.Ignite.GetOrCreateCache[TK,TV](CacheConfiguration 
> configuration, NearCacheConfiguration nearConfiguration, 
> PlatformCacheConfiguration platformCacheConfiguration)
>    at 
> Apache.Ignite.Core.Impl.Ignite.GetOrCreateCache[TK,TV](CacheConfiguration 
> configuration, NearCacheConfiguration nearConfiguration)
>    at 
> Apache.Ignite.Core.Impl.Ignite.GetOrCreateCache[TK,TV](CacheConfiguration 
> configuration)
>  
>  
> This failure causes issues in the server nodes in the grid which now fail to 
> restart with errors such as below (for the incorrectly create cache) but 
> which are repeated for every defined cache in the grid:
>  
> 2023-08-27 17:11:36,882 [42] INF [ImmutableCacheComputeServer]   Can not 
> finish proxy initialization because proxy does not exist, 
> cacheName=SiteModelMetadata, localNodeId=3d4a75e8-174d-4947-877e-e45784d8d08d
> 2
>  
> At this point the grid is now unusable.
>  
> In summary: Attempted creation of a cache with an unknown DataRegionName 
> causes immediate and unrecoverable failure in the entire grid. 
>  
>  On attempted restarted Ignite notes all caches (including system caches) as 
> being "Started cache in recovery mode" in the log and then scans the WAL. The 
> incorrectly created cache is not mentioned at this point.
> At the point the cluster is activated, (ie: this appears in the log "Started 
> state transition: activate cluster"), it states the incorrectly created cache 
> ("SiteModelMetadata") can not be started, log entry is: "Cache can not be 
> started : cache=SiteModelMetadata"
> This is followed by multiple messages like this: "Finished recovery for cache 
> [cache=ignite-sys-cache, grp=ignite-sys-cache, 
> startVer=AffinityTopologyVersion [topVer=12, minorTopVer=1]] "
>  
> This is followed by errors relating to proxy initialisation, eg: "Can not 
> finish proxy initialization because proxy does not exist, 
> cacheName=ignite-sys-cache, localNodeId=4d44108f-cd96-4953-94db-6365f998a91b"
>  
> All caches are then stopped, eg: "Stopped cache 
> [cacheName=ignite-sys-cache]", and the grid enters a relatively dormant 
> inactivated state  where it only emits Ignite heartbeat messages.
>  
> This bug appears to be trivially easy to reproduce by creating a cache with 
> an unknown data region.
>  
> Attempting to destroy the bad cache in the grid with the Control.sh tool 
> results in this output:
>  
> {noformat}
> # ./control.sh --cache destroy --caches SiteModelMetadata
> WARNING: An illegal reflective access operation has occurred
> WARNING: Illegal reflective access by 
> org.apache.ignite.internal.util.GridUnsafe$2 
> (file:/trex/libs/ignite-core-2.15.0.jar) to field java.nio.Buffer.address
> WARNING: Please consider reporting this to the maintainers of 
> org.apache.ignite.internal.util.GridUnsafe$2
> WARNING: Use --illegal-access=warn to enable warnings of further illegal 
> reflective access operations
> WARNING: All illegal access operations will be denied in a future 
> releaseWarning! 
> The command will destroy 1 caches: SiteModelMetadata.
> If you continue, the cache data will be impossible to recover.
> Press 'y' to continue . . . y
> [01:53:38,925][SEVERE][session=24175683][CommandHandlerLog] Connection to 
> cluster failed. Latest topology update failed.Control utility [ver. 
> 2.15.0#20230425-sha1:f98f7f35]
> 2023 Copyright(C) Apache Software FoundationUser: rootTime: 
> 2023-08-29T01:53:26.048
> Command [CACHE] startedArguments: --cache destroy --caches 
> SiteModelMetadata--------------------------------------------------------------------------------
> Connection to cluster failed. 
> Latest topology update failed.Command [CACHE] finished with code: 
> Control utility has completed execution at: 2023-08-29T01:53:38.926
> {noformat}
>  
>  The Control.sh comment to list caches also fails.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to