[
https://issues.apache.org/jira/browse/GEODE-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522937#comment-17522937
]
Barrett Oglesby commented on GEODE-10148:
-----------------------------------------
I think this is where the problem is:
{{LocalManager.startLocalManagement}} runs the {{ManagementTask}} once right
when it starts.
With logging added, the call to {{managementTask.get().run()}} returns right
away. Even though the comment says it's a synchronous call, it isn't.
{noformat}
[vm3] [warn 2022/03/23 16:16:02.173 PDT server-3 <RMI TCP
Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.startLocalManagement about
to run managementTask
[vm3] [warn 2022/03/23 16:16:02.173 PDT server-3 <RMI TCP
Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.startLocalManagement done
managementTask
{noformat}
Then, {{LocalManager.markForFederation}} adds the mbeans to the
{{federatedComponentMap}}:
{noformat}
[vm3] [warn 2022/03/23 16:16:02.209 PDT server-3 <RMI TCP
Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.markForFederation about to
add to federatedComponentMap objName=GemFire:type=Member,member=server-3
[vm3] [warn 2022/03/23 16:16:02.364 PDT server-3 <RMI TCP
Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.markForFederation about to
add to federatedComponentMap
objName=GemFire:service=Region,name="/test-region-1",type=Member,member=server-3
[vm3] [warn 2022/03/23 16:16:02.437 PDT server-3 <RMI TCP
Connection(1)-127.0.0.1> tid=0x12] XXX LocalManager.markForFederation about to
add to federatedComponentMap
objName=GemFire:service=CacheServer,port=20017,type=Member,member=server-3
{noformat}
The CacheServer mbean above is the one that is missing in the failed run.
Then, the {{Management Task}} thread runs the {{ManagementTask}} started above
to put the mbeans into the region:
{noformat}
[vm3] [warn 2022/03/23 16:16:04.177 PDT server-3 <Management Task1> tid=0x46]
XXX LocalManager.doManagementTask about to putAll
replicaMap={GemFire:service=CacheServer,port=20017,type=Member,member=server-3=ObjectName
= GemFire:service=CacheServer,port=20017,type=Member,member=server-3,
GemFire:service=Region,name="/test-region-1",type=Member,member=server-3=ObjectName
= GemFire:service=Region,name="/test-region-1",type=Member,member=server-3,
GemFire:type=Member,member=server-3=ObjectName =
GemFire:type=Member,member=server-3}
[vm3] [warn 2022/03/23 16:16:04.211 PDT server-3 <Management Task1> tid=0x46]
XXX LocalManager.doManagementTask done putAll
replicaMap={GemFire:service=CacheServer,port=20017,type=Member,member=server-3=ObjectName
= GemFire:service=CacheServer,port=20017,type=Member,member=server-3,
GemFire:service=Region,name="/test-region-1",type=Member,member=server-3=ObjectName
= GemFire:service=Region,name="/test-region-1",type=Member,member=server-3,
GemFire:type=Member,member=server-3=ObjectName =
GemFire:type=Member,member=server-3}
{noformat}
If the {{Management Task}} thread runs after the Region mbean has been added
to the {{federatedComponentMap}} but before the CacheServer mbean has been
added, this issue would reproduce.
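To make that interleaving concrete, here is a minimal sketch. It is not Geode code: {{FederationRaceSketch}}, {{snapshotAfter}} and the {{federated}} map are hypothetical stand-ins, and the task thread is forced to run at a chosen point instead of on a timer. The only things taken from the logs above are the three mbean names and the order they are added in.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FederationRaceSketch {
    // MBean names in the order the markForFederation log lines show them being added.
    static final List<String> MBEANS = List.of(
        "GemFire:type=Member,member=server-3",
        "GemFire:service=Region,name=\"/test-region-1\",type=Member,member=server-3",
        "GemFire:service=CacheServer,port=20017,type=Member,member=server-3");

    /**
     * Adds the names one at a time and lets a separate "task" thread copy the map
     * once registeredBeforeTask names have been added. Returns what the task saw.
     */
    static List<String> snapshotAfter(int registeredBeforeTask) throws InterruptedException {
        // Hypothetical stand-in for federatedComponentMap (assumption, not the real type).
        Map<String, String> federated = Collections.synchronizedMap(new LinkedHashMap<>());
        List<String> snapshot = new ArrayList<>();

        for (int i = 0; i <= MBEANS.size(); i++) {
            if (i == registeredBeforeTask) {
                // Force the interleaving: the task thread runs exactly here.
                Thread task = new Thread(() -> {
                    synchronized (federated) {
                        snapshot.addAll(federated.keySet());
                    }
                }, "Management Task (sketch)");
                task.start();
                task.join(); // wait so the result is deterministic
            }
            if (i < MBEANS.size()) {
                federated.put(MBEANS.get(i), MBEANS.get(i));
            }
        }
        return snapshot;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("task after 2 adds sees: " + snapshotAfter(2));
        System.out.println("task after 3 adds sees: " + snapshotAfter(3));
    }
}
```

With {{snapshotAfter(2)}} the task sees the Member and Region names but not the CacheServer one, which matches the shape of the failed run. With {{snapshotAfter(3)}} it sees all three, like the passing run logged above.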
> [CI Failure] : JMXMBeanFederationDUnitTest > MBeanFederationAddRemoveServer
> FAILED
> ----------------------------------------------------------------------------------
>
> Key: GEODE-10148
> URL: https://issues.apache.org/jira/browse/GEODE-10148
> Project: Geode
> Issue Type: Bug
> Components: jmx
> Affects Versions: 1.15.0
> Reporter: Nabarun Nag
> Priority: Major
> Labels: test-stability
>
> JMXMBeanFederationDUnitTest > MBeanFederationAddRemoveServer FAILED
> java.lang.AssertionError:
> Expecting actual:
> ["GemFire:service=AccessControl,type=Distributed",
> "GemFire:service=CacheServer,port=20842,type=Member,member=server-1",
> "GemFire:service=CacheServer,port=20846,type=Member,member=server-2",
> "GemFire:service=DiskStore,name=cluster_config,type=Member,member=locator-one",
> "GemFire:service=FileUploader,type=Distributed",
> "GemFire:service=Locator,type=Member,member=locator-one",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Distributed",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Member,member=locator-one",
> "GemFire:service=Manager,type=Member,member=locator-one",
> "GemFire:service=Region,name="/test-region-1",type=Distributed",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-1",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-2",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-3",
> "GemFire:service=System,type=Distributed",
> "GemFire:type=Member,member=locator-one",
> "GemFire:type=Member,member=server-1",
> "GemFire:type=Member,member=server-2",
> "GemFire:type=Member,member=server-3"]
> to contain exactly (and in same order):
> ["GemFire:service=AccessControl,type=Distributed",
> "GemFire:service=CacheServer,port=20842,type=Member,member=server-1",
> "GemFire:service=CacheServer,port=20846,type=Member,member=server-2",
> "GemFire:service=CacheServer,port=20850,type=Member,member=server-3",
> "GemFire:service=DiskStore,name=cluster_config,type=Member,member=locator-one",
> "GemFire:service=FileUploader,type=Distributed",
> "GemFire:service=Locator,type=Member,member=locator-one",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Distributed",
> "GemFire:service=LockService,name=__CLUSTER_CONFIG_LS,type=Member,member=locator-one",
> "GemFire:service=Manager,type=Member,member=locator-one",
> "GemFire:service=Region,name="/test-region-1",type=Distributed",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-1",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-2",
> "GemFire:service=Region,name="/test-region-1",type=Member,member=server-3",
> "GemFire:service=System,type=Distributed",
> "GemFire:type=Member,member=locator-one",
> "GemFire:type=Member,member=server-1",
> "GemFire:type=Member,member=server-2",
> "GemFire:type=Member,member=server-3"]
> but could not find the following elements:
> ["GemFire:service=CacheServer,port=20850,type=Member,member=server-3"]
> at org.apache.geode.management.internal.JMXMBeanFederationDUnitTest.MBeanFederationAddRemoveServer(JMXMBeanFederationDUnitTest.java:130)
> 8352 tests completed, 1 failed, 414 skipped