[ https://issues.apache.org/jira/browse/GEODE-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Juan Ramos updated GEODE-8248:
------------------------------
    Attachment: temporal.zip

> Member hangs waiting for missing disk-stores after gfsh shutdown
> ----------------------------------------------------------------
>
>                 Key: GEODE-8248
>                 URL: https://issues.apache.org/jira/browse/GEODE-8248
>             Project: Geode
>          Issue Type: Bug
>          Components: gfsh, persistence
>            Reporter: Juan Ramos
>            Priority: Major
>         Attachments: temporal.zip
>
>
> Let’s say I have 2 servers with a simple {{REPLICATE_PERSISTENT}} region and
> I stop both using the {{gfsh shutdown}} command.
> According to the
> [documentation|https://geode.apache.org/docs/guide/112/managing/disk_storage/starting_system_with_disk_stores.html],
> I should be able to start either of the servers without any problems as both
> host the most up to date data. However, what happens in reality is that the
> startup hangs with the following:
> {noformat}
> (1) Executing - start server --name=server1 --locators=localhost[10334] --server-port=40401 --cache-xml-file=/temporal/cache.xml
> .........
> Region /TestRegion has potentially stale data. It is waiting for another member to recover the latest data.
> My persistent id:
>   DiskStore ID: 4d1abaf3-677d-4c52-b3f8-681e051f143c
>   Name: server1
>   Location: /temporal/server1/dataStore
> Members with potentially new data:
> [
>   DiskStore ID: 163dfaf7-a680-4154-a278-8cec40d57d80
>   Name: server2
>   Location: /temporal/server2/dataStore
> ]
> "main" #1 prio=5 os_prio=31 tid=0x00007f9b28809000 nid=0x1003 in Object.wait() [0x000070000ab04000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>     at java.lang.Object.wait(Native Method)
>     at org.apache.geode.internal.cache.persistence.MembershipChangeListener.waitForChange(MembershipChangeListener.java:62)
>     - locked <0x0000000719df55e0> (a org.apache.geode.internal.cache.persistence.MembershipChangeListener)
>     at org.apache.geode.internal.cache.persistence.PersistenceInitialImageAdvisor.waitForMembershipChangeForMissingDiskStores(PersistenceInitialImageAdvisor.java:218)
>     at org.apache.geode.internal.cache.persistence.PersistenceInitialImageAdvisor.getAdvice(PersistenceInitialImageAdvisor.java:118)
>     at org.apache.geode.internal.cache.persistence.PersistenceAdvisorImpl.getInitialImageAdvice(PersistenceAdvisorImpl.java:835)
>     at org.apache.geode.internal.cache.persistence.CreatePersistentRegionProcessor.getInitialImageAdvice(CreatePersistentRegionProcessor.java:52)
>     at org.apache.geode.internal.cache.DistributedRegion.getInitialImageAndRecovery(DistributedRegion.java:1196)
>     at org.apache.geode.internal.cache.DistributedRegion.initialize(DistributedRegion.java:1076)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.createVMRegion(GemFireCacheImpl.java:3043)
>     at org.apache.geode.pdx.internal.PeerTypeRegistration.initialize(PeerTypeRegistration.java:198)
>     at org.apache.geode.pdx.internal.TypeRegistry.initialize(TypeRegistry.java:116)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.initializePdxRegistry(GemFireCacheImpl.java:1449)
>     - locked <0x00000005c0593168> (a org.apache.geode.internal.cache.GemFireCacheImpl)
>     at org.apache.geode.internal.cache.xmlcache.CacheCreation.create(CacheCreation.java:511)
>     at org.apache.geode.internal.cache.xmlcache.CacheXmlParser.create(CacheXmlParser.java:337)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.loadCacheXml(GemFireCacheImpl.java:4272)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.initializeDeclarativeCache(GemFireCacheImpl.java:1388)
>     at org.apache.geode.internal.cache.GemFireCacheImpl.initialize(GemFireCacheImpl.java:1208)
>     at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:207)
>     - locked <0x00000005c016a108> (a java.lang.Class for org.apache.geode.internal.cache.GemFireCacheImpl)
>     - locked <0x00000005c0043de0> (a java.lang.Class for org.apache.geode.internal.cache.InternalCacheBuilder)
>     at org.apache.geode.internal.cache.InternalCacheBuilder.create(InternalCacheBuilder.java:164)
>     - locked <0x00000005c0043de0> (a java.lang.Class for org.apache.geode.internal.cache.InternalCacheBuilder)
>     at org.apache.geode.cache.CacheFactory.create(CacheFactory.java:139)
>     at org.apache.geode.distributed.internal.DefaultServerLauncherCacheProvider.createCache(DefaultServerLauncherCacheProvider.java:52)
>     at org.apache.geode.distributed.ServerLauncher.createCache(ServerLauncher.java:869)
>     at org.apache.geode.distributed.ServerLauncher.start(ServerLauncher.java:786)
>     at org.apache.geode.distributed.ServerLauncher.run(ServerLauncher.java:716)
>     at org.apache.geode.distributed.ServerLauncher.main(ServerLauncher.java:236)
> {noformat}
> We should either fix the problem and make sure the members fully synchronise
> their data during the {{shutdown}} process so they don't have to wait on each
> other or, if this is the expected behaviour, update the documentation
> accordingly.
> The attached {{zip}} file contains a simple script to reproduce the issue.
> After downloading and uncompressing the file, the only thing that needs to
> be changed is the {{GEMFIRE}} environment variable.
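
For readers unfamiliar with the thread state in the dump above, the following is a minimal Java sketch of a timed wait on an object monitor, which is the generic pattern that produces "TIMED_WAITING (on object monitor)" inside Object.wait(). It is not Geode's implementation: the class MembershipWaitSketch, the nested ChangeListener, and the methods waitForChange and memberChanged are hypothetical names chosen only to mirror the frames in the trace. The sketch assumes the waiter is released only when another thread reports a change, so if no other member ever starts, every wait simply times out.

```java
import java.util.concurrent.TimeUnit;

public class MembershipWaitSketch {

    /** Hypothetical stand-in for a listener that is told when another member appears. */
    static final class ChangeListener {
        private boolean changed; // guarded by this object's monitor

        /** Waits up to timeoutMillis for a change; returns true only if one was observed. */
        synchronized boolean waitForChange(long timeoutMillis) {
            long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
            try {
                while (!changed) {
                    long remainingMillis =
                            TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                    if (remainingMillis <= 0) {
                        return false; // timed out; also avoids wait(0), which waits forever
                    }
                    // A thread dump taken here reports TIMED_WAITING (on object monitor).
                    wait(remainingMillis);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            changed = false; // consume the notification
            return true;
        }

        /** Called by another thread to report that membership changed. */
        synchronized void memberChanged() {
            changed = true;
            notifyAll();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ChangeListener listener = new ChangeListener();

        // Nobody else starts: the wait times out and reports no change.
        System.out.println("change seen: " + listener.waitForChange(200));

        // A second "member" shows up shortly after the wait begins.
        Thread other = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            listener.memberChanged();
        });
        other.start();
        System.out.println("change seen: " + listener.waitForChange(2000));
        other.join();
    }
}
```

In this sketch, a caller that loops on waitForChange until it returns true would block for as long as no other thread calls memberChanged, which is analogous to the startup behaviour described in the report.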