Barrett Oglesby created GEODE-9664:
--------------------------------------

             Summary: Two different clients with the same durable id will both 
connect to the servers and receive messages
                 Key: GEODE-9664
                 URL: https://issues.apache.org/jira/browse/GEODE-9664
             Project: Geode
          Issue Type: Bug
          Components: client queues
            Reporter: Barrett Oglesby


There are two cases:
 # The number of queues is the same as the number of servers (e.g. client with 
subscription-redundancy=1 and 2 servers)
 # The number of queues is less than the number of servers (e.g. client with 
subscription-redundancy=0 and 2 servers)
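
Which case applies depends on how many queues the client keeps relative to the number of servers. A minimal sketch, assuming the client keeps one primary queue plus subscription-redundancy secondaries, and that a redundancy of -1 means a queue on every server. QueueCaseSketch and its methods are hypothetical names for illustration, not Geode APIs:

```java
// Hypothetical helper (not part of Geode): decides which of the two cases applies.
final class QueueCaseSketch {

  /** Queues the client keeps: one primary plus the secondaries; -1 means every server. */
  static int queueCount(int subscriptionRedundancy, int serverCount) {
    if (subscriptionRedundancy < 0) {
      return serverCount;
    }
    return Math.min(subscriptionRedundancy + 1, serverCount);
  }

  /** True when at least one server holds no queue for the durable id (Case 2). */
  static boolean hasServerWithoutQueue(int subscriptionRedundancy, int serverCount) {
    return queueCount(subscriptionRedundancy, serverCount) < serverCount;
  }

  public static void main(String[] args) {
    System.out.println(hasServerWithoutQueue(1, 2)); // false: Case 1, every server has a queue
    System.out.println(hasServerWithoutQueue(0, 2)); // true: Case 2, one server has no queue
  }
}
```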

h2. Case 1
 In this case, the client first attempts to connect to the primary and fails.
{noformat}
[warn 2021/10/01 14:37:56.209 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal about to register 
clientProxyMembershipID=identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300])

[warn 2021/10/01 14:37:56.209 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal existing 
proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300]); port=61581; primary=true; version=GEODE 1.15.0]

[warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x4b] XXX CacheClientNotifier.registerClientInternal existing proxy 
isPaused=false

[warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x4b] The requested durable client has the same identifier ( client-a ) 
as an existing durable client ( 
CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300]); port=61581; primary=true; version=GEODE 1.15.0] ). Duplicate 
durable clients are not allowed.

[warn 2021/10/01 14:37:56.210 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x4b] CacheClientNotifier: Unsuccessfully registered client with 
identifier 
identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300]) and response code 64
{noformat}
It then attempts to connect to the secondary and succeeds.
{noformat}
[warn 2021/10/01 14:37:56.215 PDT server-2 <Client Queue Initialization Thread 
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal about to register 
clientProxyMembershipID=identity(127.0.0.1(client-a-2:89832:loner):61596:fad3ca3d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300])

[warn 2021/10/01 14:37:56.215 PDT server-2 <Client Queue Initialization Thread 
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal existing 
proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300]); port=61578; primary=false; version=GEODE 1.15.0]

[warn 2021/10/01 14:37:56.216 PDT server-2 <Client Queue Initialization Thread 
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal existing proxy 
isPaused=true

[warn 2021/10/01 14:37:56.217 PDT server-2 <Client Queue Initialization Thread 
1> tid=0x47] XXX CacheClientNotifier.registerClientInternal reinitialized 
existing 
proxy=CacheClientProxy[identity(127.0.0.1(client-a-1:89806:loner):61573:10a9ca3d:client-a-1,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300]); port=61578; primary=true; version=GEODE 1.15.0]
{noformat}
The previous secondary is reinitialized and made into a primary. Both queues 
will dispatch events.

The CacheClientNotifier.registerClientInternal method invoked when a client 
connects does:
{noformat}
if (cacheClientProxy.isPaused()) {
  ...
  cacheClientProxy.reinitialize(...);
} else {
  unsuccessfulMsg = String.format("The requested durable client has the same identifier ( %s ) as an existing durable client...", ...);
  logger.warn(unsuccessfulMsg);
}
{noformat}
A CacheClientProxy is paused when the durable client it represents has 
disconnected. Unfortunately, a secondary CacheClientProxy is also paused, even 
while its client is still connected (see isPaused=true in the server-2 log 
above). So, the isPaused check alone cannot prevent a duplicate durable client 
from connecting.

There are a few things that can also be checked. One of them is:
{noformat}
cacheClientProxy.getCommBuffer() == null
{noformat}
With that check added, when the client attempts to connect to the secondary, it 
fails just as it does with the primary.

The client then exits with this exception:
{noformat}
geode.cache.NoSubscriptionServersAvailableException: 
org.apache.geode.cache.NoSubscriptionServersAvailableException: Could not 
initialize a primary queue on startup. No queue servers available.
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:191)
        at 
org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:428)
        at 
org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:875)
        at 
org.apache.geode.cache.client.internal.RegisterInterestOp.execute(RegisterInterestOp.java:58)
        at 
org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterest(ServerRegionProxy.java:364)
        at 
org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3815)
        at 
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3911)
        at 
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3890)
        at 
org.apache.geode.internal.cache.LocalRegion.registerInterestRegex(LocalRegion.java:3885)
        at 
org.apache.geode.cache.Region.registerInterestForAllKeys(Region.java:1709)
Caused by: org.apache.geode.cache.NoSubscriptionServersAvailableException: 
Could not initialize a primary queue on startup. No queue servers available.
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.initializeConnections(QueueManagerImpl.java:575)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.start(QueueManagerImpl.java:293)
        at 
org.apache.geode.cache.client.internal.PoolImpl.start(PoolImpl.java:359)
        at 
org.apache.geode.cache.client.internal.PoolImpl.finishCreate(PoolImpl.java:183)
        at 
org.apache.geode.cache.client.internal.PoolImpl.create(PoolImpl.java:169)
        at 
org.apache.geode.internal.cache.PoolFactoryImpl.create(PoolFactoryImpl.java:378)
{noformat}
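
The difference between the existing check and the stricter one suggested for Case 1 can be sketched with a small model. This is only an illustration under stated assumptions: ProxyState and both methods are hypothetical names, not Geode classes, and the model assumes the comm buffer is null only once the durable client has disconnected, which is what the suggested check relies on.

```java
// Hypothetical model of the registration decision; not Geode code.
final class DurableRegistrationSketch {

  /** Hypothetical stand-in for the two pieces of proxy state the issue looks at. */
  static final class ProxyState {
    final boolean paused;
    final Object commBuffer; // assumption: null only when no client connection is attached

    ProxyState(boolean paused, Object commBuffer) {
      this.paused = paused;
      this.commBuffer = commBuffer;
    }
  }

  /** Check as quoted above: any paused proxy is reinitialized for the new client. */
  static boolean pausedOnlyCheckAllows(ProxyState existing) {
    return existing == null || existing.paused;
  }

  /** Suggested check: the proxy must be paused and have no comm buffer. */
  static boolean stricterCheckAllows(ProxyState existing) {
    return existing == null || (existing.paused && existing.commBuffer == null);
  }

  public static void main(String[] args) {
    ProxyState livePrimary = new ProxyState(false, new Object());
    ProxyState liveSecondary = new ProxyState(true, new Object());
    ProxyState disconnected = new ProxyState(true, null);

    System.out.println(pausedOnlyCheckAllows(livePrimary));   // false: refused on the primary
    System.out.println(pausedOnlyCheckAllows(liveSecondary)); // true: the duplicate gets in
    System.out.println(stricterCheckAllows(liveSecondary));   // false: refused on the secondary too
    System.out.println(stricterCheckAllows(disconnected));    // true: a real reconnect still works
  }
}
```

Both checks accept the registration when no proxy exists at all (existing == null), so neither one helps with a server that holds no queue for the durable id.
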
h2. Case 2
 In this case, the client first attempts to connect to the primary and fails, 
just as in Case 1.

It will then attempt to connect to a server with no existing queue and succeed.
{noformat}
[warn 2021/10/01 15:02:50.798 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x54] XXX CacheClientNotifier.registerClientInternal about to register 
clientProxyMembershipID=identity(127.0.0.1(client-a-2:91683:loner):62089:24a2e13d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300])

[warn 2021/10/01 15:02:50.799 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x54] XXX CacheClientNotifier.registerClientInternal existing proxy=null

[warn 2021/10/01 15:02:50.810 PDT server-1 <Client Queue Initialization Thread 
1> tid=0x54] XXX CacheClientNotifier.registerClientInternal created 
proxy=CacheClientProxy[identity(127.0.0.1(client-a-2:91683:loner):62089:24a2e13d:client-a-2,connection=2,durableAttributes=DurableClientAttributes[id=client-a;
 timeout=300]); port=62094; primary=true; version=GEODE 1.15.0]
{noformat}
One way to address this case would be to prevent the durable client from 
retrying against another server if it can't connect to the primary.

That would have to be addressed in QueueManagerImpl.initializeConnections. That 
method would have to know that the server refused the connection 
(ServerRefusedConnectionException) and then return out of that method.

That's a bit more work since that method currently doesn't get any exceptions 
from initializeQueueConnection, which does:
{noformat}
} catch (Exception e) {
  if (logger.isDebugEnabled()) {
    logger.debug("error creating subscription connection to server {}",
        connection.getEndpoint(), e);
  }
}
{noformat}
An exception handler like this would need to be added ahead of that catch block:
{noformat}
} catch (ServerRefusedConnectionException e) {
  throw e;
}
{noformat}
QueueManagerImpl.initializeConnections would have to handle that exception in a 
few places. I'm not sure exactly what should be done in that method.
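
The control flow proposed for Case 2 can be sketched as follows. This is one possible shape under stated assumptions, not the actual QueueManagerImpl code: RefusedException stands in for ServerRefusedConnectionException, Server is a hypothetical test double, and cleanup of any queue created before the refusal is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the proposed change; not Geode code.
final class QueueInitSketch {

  /** Hypothetical stand-in for Geode's ServerRefusedConnectionException. */
  static final class RefusedException extends RuntimeException {
    RefusedException(String message) {
      super(message);
    }
  }

  /** Hypothetical server that either refuses the durable id or accepts a queue. */
  static final class Server {
    final String name;
    final boolean refusesDurableId;

    Server(String name, boolean refusesDurableId) {
      this.name = name;
      this.refusesDurableId = refusesDurableId;
    }

    void connect() {
      if (refusesDurableId) {
        throw new RefusedException(name + " refused a duplicate durable client");
      }
    }
  }

  /** Rethrows a refusal so the caller can see it; swallows every other failure. */
  static boolean initializeQueueConnection(Server server) {
    try {
      server.connect();
      return true;
    } catch (RefusedException e) {
      throw e; // proposed handler: must come before the general catch
    } catch (Exception e) {
      return false; // existing behavior: log at debug level and move on
    }
  }

  /** Returns the names of servers hosting a queue; stops trying after a refusal. */
  static List<String> initializeConnections(List<Server> servers) {
    List<String> queues = new ArrayList<>();
    for (Server server : servers) {
      try {
        if (initializeQueueConnection(server)) {
          queues.add(server.name);
        }
      } catch (RefusedException e) {
        return queues; // do not retry against a server with no existing queue
      }
    }
    return queues;
  }

  public static void main(String[] args) {
    List<Server> servers =
        Arrays.asList(new Server("server-2", true), new Server("server-1", false));
    System.out.println(initializeConnections(servers)); // prints []
  }
}
```

In this sketch the refusal from the primary ends initialization, so the duplicate client never creates a queue on the server that had none.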



