[
https://issues.apache.org/jira/browse/GEODE-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259786#comment-17259786
]
Anilkumar Gingade commented on GEODE-8248:
------------------------------------------
The product is behaving as expected, given the action performed by gfsh.
When shutdown is executed from gfsh, it shuts down each member individually
instead of performing a shutdown-all.
To get the behavior described in this issue, gfsh has to call shutdown-all.
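The difference can be illustrated with a toy model. This is only a sketch, not
Geode code: the Member class, the peers_online_at_stop field and the three
functions are hypothetical names, and the model assumes that a member stopped
while a peer was still running must wait for that peer on restart.

```python
class Member:
    """Hypothetical stand-in for a server with a persistent region."""

    def __init__(self, name):
        self.name = name
        self.online = True
        # Peers this member saw running at the moment it stopped.
        self.peers_online_at_stop = set()


def shutdown_one_by_one(members):
    """Stop members in sequence; each records the peers still running."""
    for m in members:
        m.peers_online_at_stop = {
            p.name for p in members if p.online and p is not m
        }
        m.online = False


def shutdown_all(members):
    """Coordinated stop: no member records a peer as still running."""
    for m in members:
        m.peers_online_at_stop = set()
    for m in members:
        m.online = False


def can_start_alone(member):
    """A member may start alone only if no peer outlived it (assumption)."""
    return not member.peers_online_at_stop


if __name__ == "__main__":
    s1, s2 = Member("server1"), Member("server2")
    shutdown_one_by_one([s1, s2])
    print(can_start_alone(s1), can_start_alone(s2))

    s1, s2 = Member("server1"), Member("server2")
    shutdown_all([s1, s2])
    print(can_start_alone(s1), can_start_alone(s2))
```

In this model, after a one-by-one shutdown only the last member stopped
(server2) can start alone, which is consistent with server1 waiting for
server2 in the report below. After a coordinated shutdown either can.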
> Member hangs waiting for missing disk-stores after gfsh shutdown
> ----------------------------------------------------------------
>
> Key: GEODE-8248
> URL: https://issues.apache.org/jira/browse/GEODE-8248
> Project: Geode
> Issue Type: Bug
> Components: gfsh, persistence
> Reporter: Juan Ramos
> Priority: Major
> Attachments: temporal.zip
>
>
> Let’s say I have 2 servers with a simple {{REPLICATE_PERSISTENT}} region and
> I stop both using the {{gfsh shutdown}} command.
> According to the
> [documentation|https://geode.apache.org/docs/guide/112/managing/disk_storage/starting_system_with_disk_stores.html],
> I should be able to start either of the servers without any problems as both
> host the most up-to-date data. However, what happens in reality is that the
> startup hangs with the following:
> {noformat}
> (1) Executing - start server --name=server1 --locators=localhost[10334] --server-port=40401 --cache-xml-file=/temporal/cache.xml
> .........
> Region /TestRegion has potentially stale data. It is waiting for another member to recover the latest data.
> My persistent id:
> DiskStore ID: 4d1abaf3-677d-4c52-b3f8-681e051f143c
> Name: server1
> Location: /temporal/server1/dataStore
> Members with potentially new data:
> [
> DiskStore ID: 163dfaf7-a680-4154-a278-8cec40d57d80
> Name: server2
> Location: /temporal/server2/dataStore
> ]
> "main" #1 prio=5 os_prio=31 tid=0x00007f9b28809000 nid=0x1003 in Object.wait() [0x000070000ab04000]
> java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at org.apache.geode.internal.cache.persistence.MembershipChangeListener.waitForChange(MembershipChangeListener.java:62)
> - locked <0x0000000719df55e0> (a org.apache.geode.internal.cache.persistence.MembershipChangeListener)
> at org.apache.geode.internal.cache.persistence.PersistenceInitialImageAdvisor.waitForMembershipChangeForMissingDiskStores(PersistenceInitialImageAdvisor.java:218)
> at org.apache.geode.internal.cache.persistence.PersistenceInitialImageAdvisor.getAdvice(PersistenceInitialImageAdvisor.java:118)
> at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.getInitialImageAdvice(PersistenceAdvisorImpl.java:835)
> at org.apache.geode.internal.cache.persistence.CreatePersistentRegionProcessor.getInitialImageAdvice(CreatePersistentRegionProcessor.java:52)
> at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1196)
> at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1076)
> at org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3043)
> at org.apache.geode.pdx.internal.PeerTypeRegistration.initialize(PeerTypeRegistration.java:198)
> at org.apache.geode.pdx.internal.TypeRegistry.initialize(TypeRegistry.java:116)
> at org.apache.geode.internal.cache.GemFireCacheImpl.initializePdxRegistry(GemFireCacheImpl.java:1449)
> - locked <0x00000005c0593168> (a org.apache.geode.internal.cache.GemFireCacheImpl)
> at org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:511)
> at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:337)
> at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4272)
> at org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1388)
> at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1208)
> at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:207)
> - locked <0x00000005c016a108> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
> - locked <0x00000005c0043de0> (a java.lang.Class for org.apache.geode.internal.cache.InternalCacheBuilder)
> at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:164)
> - locked <0x00000005c0043de0> (a java.lang.Class for org.apache.geode.internal.cache.InternalCacheBuilder)
> at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:139)
> at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
> at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:869)
> at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:786)
> at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:716)
> at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:236)
> {noformat}
> We should either fix the problem and make sure the members fully synchronise
> their data during the {{shutdown}} process so they don't have to wait on each
> other or, if this is the expected behaviour, update the documentation
> accordingly.
> The attached {{zip}} file contains a simple script to reproduce the issue.
> The only thing that needs to be changed after downloading and uncompressing
> the file is the {{GEMFIRE}} environment variable.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)