[ https://issues.apache.org/jira/browse/GEODE-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645618#comment-16645618 ]

xiaojian zhou commented on GEODE-5252:
--------------------------------------

The previous fix has a potential problem. The thread running handleCacheRemoval holds the writeLock, and while holding it may send distribution operations, such as destroying a region or executing a function, and then wait for replies from remote members. While it waits, the membership view can change; for example, some members may have left the distributed system. But arriving system notifications must acquire the readLock before they can be processed. So handleCacheRemoval never learns of the membership change and waits for a reply forever.

The fix is: system notifications should not be blocked by the readLock.

> Race in management adapter could fail to create MXBeans.
> --------------------------------------------------------
>
>                 Key: GEODE-5252
>                 URL: https://issues.apache.org/jira/browse/GEODE-5252
>             Project: Geode
>          Issue Type: Bug
>          Components: management
>            Reporter: Sai Boorlagadda
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Handling a DiskStore creation event to create the DiskStoreMXBean could result in
> a null pointer exception due to a race in ManagementAdapter.java.
> {noformat}
> java.lang.NullPointerException
>     at org.apache.geode.management.internal.beans.ManagementAdapter.handleDiskCreation(ManagementAdapter.java:380)
>     at org.apache.geode.management.internal.beans.ManagementListener.handleEvent(ManagementListener.java:122)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.notifyResourceEventListeners(InternalDistributedSystem.java:2201)
>     at org.apache.geode.distributed.internal.InternalDistributedSystem.handleResourceEvent(InternalDistributedSystem.java:590)
>     at org.apache.geode.internal.cache.DiskStoreFactoryImpl.create(DiskStoreFactoryImpl.java:143)
> {noformat}
> ManagementAdapter.handleDiskCreation throws a NullPointerException on line 380,
> which means the thread invoking handleDiskCreation is seeing null for the
> field service.
> {noformat}
> service.federate(changedMBeanName, DiskStoreMXBean.class, true);
> {noformat}
> It looks like service is a SystemManagementService and it is set to a non-null
> value in ManagementAdapter.handleCacheCreation:
> {noformat}
> this.service =
>     (SystemManagementService) ManagementService.getManagementService(internalCache);
> {noformat}
> The field is not volatile and it is not protected by any synchronization:
> {noformat}
> /** Internal ManagementService Instance **/
> private SystemManagementService service;
> {noformat}
> Lots of other fields in ManagementAdapter also appear NOT to be thread-safe.
> It looks like ManagementAdapter concurrency in general needs to be fixed up to
> fix this bug.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
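
For illustration only, the blocking described in the comment above can be reproduced in isolation with a plain ReentrantReadWriteLock. This is a minimal sketch, not Geode code: the class NotificationLockSketch, its methods, and the latch standing in for "a reply arrived or the member departed" are all hypothetical, and the waits are bounded with timeouts so the sketch terminates, whereas the comment describes an unbounded wait.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the adapter; none of these names come from Geode.
class NotificationLockSketch {
    // Stand-in for the lock introduced by the previous fix.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Counted down once a "member departed" notification has been processed.
    private final CountDownLatch memberDeparted = new CountDownLatch(1);

    // Notification path that must take the read lock before it is processed.
    private void notifyWithReadLock(long timeoutMs) throws InterruptedException {
        if (!lock.readLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return; // the writer still holds the lock, so the notification is lost here
        }
        try {
            memberDeparted.countDown();
        } finally {
            lock.readLock().unlock();
        }
    }

    // Notification path that is not blocked by the read lock.
    private void notifyWithoutLock() {
        memberDeparted.countDown();
    }

    // Stand-in for handleCacheRemoval: holds the write lock while waiting.
    // Returns true if the waiter learned about the membership change in time.
    boolean removalSeesDeparture(boolean notifierTakesReadLock) {
        lock.writeLock().lock();
        try {
            Thread notifier = new Thread(() -> {
                try {
                    if (notifierTakesReadLock) {
                        notifyWithReadLock(200);
                    } else {
                        notifyWithoutLock();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            notifier.start();
            // Bounded wait; the scenario in the comment would wait forever.
            boolean woke = memberDeparted.await(500, TimeUnit.MILLISECONDS);
            notifier.join();
            return woke;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Runs one scenario on a fresh instance.
    static boolean run(boolean notifierTakesReadLock) {
        return new NotificationLockSketch().removalSeesDeparture(notifierTakesReadLock);
    }

    public static void main(String[] args) {
        System.out.println("notification behind readLock seen: " + run(true));
        System.out.println("notification without lock seen: " + run(false));
    }
}
```

In the first scenario the notifier cannot obtain the read lock while the write lock is held by the waiting thread, so the waiter times out without seeing the departure. In the second, the notification is delivered immediately, which is the behavior the proposed fix aims for.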