I think Vahram is pointing out that this problem does not exist in Geode 1.2. If the fix for GEODE-1183 was in 1.1 then he is not suspecting that it is the cause for his issue.
Vahram, do you have a reproducible test case someone could use? Thanks, -Jason On Thu, Oct 25, 2018 at 10:07 AM Xiaojian Zhou <[email protected]> wrote: > I have checked, the fix GEM-1183 exists since 9.0 (i.e. some time in geode > 1.1.0, your 1.1.0 might not contain this fix). If you send me your revision > number, I will confirm for you. But I'm pretty sure it's due to that > behavior changes. > > I discussed with Bruce. There's chance that if client did not shutdown > gracefully, it might cause the server still have the proxy. There should be > a timeout for server to recognize the client's gone, then close the proxy. > If in that case, you shutdown your client, then wait for a while, Then > restart the client, it should be ok. If you restart immediately, then it > was rejected. This is my guess for your scenario. You can try this > workaround. > > Again, restart servers one by one, then restart clients. This will be > another workaround. > > I think we can improve the product to handle your case in more elegant > way. > > Regards > Xiaojian Zhou > > > On Thu, Oct 25, 2018 at 7:10 AM Vahram Aharonyan <[email protected]> > wrote: > >> Hi Xiaojian, >> >> >> >> Thank you for your response. >> >> >> >> Is this only about changes done in scope of GEODE-1183? Because it seems >> that they are performed in 1.0.0-incubating.M3. Currently we reside on >> Geode 1.1.0 and everything works fine. Problem occurs if we try to move to >> Geode 1.6.0. Moreover, we have not examined this issue neither with 1.2.0 >> version of Geode. If behavior change was introduced in GEODE-1183, then it >> should be visible in 1.1.0 and onwards, is not it ? >> >> >> >> Another thing to understand for me is whether this is expected behavior >> and is by design or it is a missing in the code flow that should be fixed, >> could you please comment on this as well? >> >> >> >> Thanks, >> >> Vahram. >> >> >> >> >> >> *From:* Xiaojian Zhou <[email protected]> >> *Sent:* Wednesday, October 24, 2018 8:07 PM >> *To:* [email protected] >> *Subject:* Re: Register Interest fails after clientCache creation >> >> >> >> The workaround is: restart the server one by one. Then they will >> recognize the clients again. >> >> >> >> On Tue, Oct 23, 2018 at 5:07 PM Xiaojian Zhou <[email protected]> wrote: >> >> Hi, Bruce and Vahram, >> >> >> >> The difference between the versions is: >> >> >> >> In old version's server, if the new connection request comes, it will >> close current one and create a new one. >> >> But the new version (with fix of GEODE-1183), it will check if the >> current one is alive. If yes, then reject the incoming request. >> >> >> >> That's the root cause of the error msg. >> >> >> >> What I wonder is: after the 9.0.0 client restarted, why the server still >> treats the connection (proxy) alive. The client shutdown should trigger the >> connection (proxy) to be closed. >> >> >> >> I am still looking at the code. >> >> >> >> Regards >> >> Gester Zhou >> >> >> >> On Tue, Oct 23, 2018 at 8:43 AM Vahram Aharonyan <[email protected]> >> wrote: >> >> Hi Bruce, >> >> >> >> Thanks for your feedback. >> >> >> >> Actually yes, on client restart we are ensuring that old process does not >> exist anymore and only after that start the new one. >> >> >> >> Could it be that something has changed in 1.6.0 from 1.1.0 in client >> cache or pool creation flows that I need to consider (Newly added or >> deprecated configuration parameters, etc.)? Or this does not seem to be a >> configuration mismatch? >> >> >> >> Thanks, >> >> Vahram. >> >> >> >> *From:* Bruce Schuchardt <[email protected]> >> *Sent:* Tuesday, October 23, 2018 12:34 AM >> *To:* [email protected] >> *Subject:* Re: Register Interest fails after clientCache creation >> >> >> >> Hi Vahram, >> >> Are you sure your old client process is gone when you try to connect the >> new one? It looks like the server should be doing a liveness check on the >> old "proxy" for the client and disallows the new connection if the old one >> is still around and its socket is still connected to the old client process. >> >> The work for JIRA ticket GEODE-1183 introduced the server-side behavior >> you're seeing. >> >> https://issues.apache.org/jira/browse/GEODE-1183 >> <https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fissues.apache.org%2Fjira%2Fbrowse%2FGEODE-1183&data=02%7C01%7Cvaharonyan%40vmware.com%7C3312049973d9427add1008d639cac742%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636759940439474921&sdata=awP32tUDdUR1qQmB9OUoQniwFYNfsujkPs%2B%2B3SC2xCk%3D&reserved=0> >> >> >> >> On 10/22/18 8:09 AM, Vahram Aharonyan wrote: >> >> Hi All, >> >> >> >> We are trying to move from Geode 1.1.0 to 1.6.0 and examining problems >> with registering interest on regions after client restart. Please note that >> issue occurs only if server nodes and locator of the cluster are powered on >> client is been restarted. On the first boot of servers, client is >> successful in registering interest without any problem. >> >> >> >> Snippet from client log: >> >> >> >> [warning 2018/10/22 19:01:21.299 AMT >> Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <queueTimer-ControllerPool> >> tid=0x34] Cache Client Updater Thread on >> localhost(Server-b90ea8f2-8864-4c19-b4fb-b93f8499c674:22375)<ec><v3>:10002 >> port 10000 (10.27.72.245:10000 >> <https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.27.72.245%3A10000&data=02%7C01%7Cvaharonyan%40vmware.com%7C3312049973d9427add1008d639cac742%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636759940439484935&sdata=sLxD3mQ8I9%2FvihPI6iBRRJ2fGOZTE%2FbSycib5I2OL2A%3D&reserved=0>): >> Caught following exception while attempting to create a server-to-client >> communication socket and will exit: >> org.apache.geode.cache.client.ServerRefusedConnectionException: <null >> inet_addr hostname><ec>:10000 refused connection: A previous connection >> attempt from this client is still being processed: >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> >> >> >> [warning 2018/10/22 19:01:21.520 AMT >> Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <queueTimer-ControllerPool> >> tid=0x34] Cache Client Updater Thread on >> localhost(Server-17488aac-6fbd-43d1-8f91-c6d06c3947c0:30157)<ec><v1>:10002 >> port 10000 (10.27.72.247:10000 >> <https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.27.72.247%3A10000&data=02%7C01%7Cvaharonyan%40vmware.com%7C3312049973d9427add1008d639cac742%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636759940439494954&sdata=TinfLgPZCNGmAgBFMM9riVwtR1w9mKZbplKcvys%2BLtM%3D&reserved=0>): >> Caught following exception while attempting to create a server-to-client >> communication socket and will exit: >> org.apache.geode.cache.client.ServerRefusedConnectionException: <null >> inet_addr hostname><ec>:10000 refused connection: A previous connection >> attempt from this client is still being processed: >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> >> >> >> [warning 2018/10/22 19:01:41.134 AMT >> Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <queueTimer-ControllerPool> >> tid=0x34] Cache Client Updater Thread on >> localhost(Server-0145f530-79a8-449e-9ca9-e0487cfe2934:31321)<ec><v2>:10002 >> port 10000 (10.27.72.252:10000 >> <https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.27.72.252%3A10000&data=02%7C01%7Cvaharonyan%40vmware.com%7C3312049973d9427add1008d639cac742%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636759940439504951&sdata=5UryzIkbBDl6D8%2BbXJYLbpW5xDSGwuKcRco4WeEZwvc%3D&reserved=0>): >> Caught following exception while attempting to create a server-to-client >> communication socket and will exit: >> org.apache.geode.cache.client.ServerRefusedConnectionException: <null >> inet_addr hostname><ec>:10000 refused connection: A previous connection >> attempt from this client is still being processed: >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> >> >> >> [warning 2018/10/22 19:01:41.350 AMT >> Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <queueTimer-ControllerPool> >> tid=0x34] Cache Client Updater Thread on >> localhost(Server-b90ea8f2-8864-4c19-b4fb-b93f8499c674:22375)<ec><v3>:10002 >> port 10000 (10.27.72.245:10000 >> <https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.27.72.245%3A10000&data=02%7C01%7Cvaharonyan%40vmware.com%7C3312049973d9427add1008d639cac742%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636759940439514970&sdata=qjgNMmYKr%2Frt4ezpEBHlq9nrsOrd1fDQZ4%2Fvflx%2FIGY%3D&reserved=0>): >> Caught following exception while attempting to create a server-to-client >> communication socket and will exit: >> org.apache.geode.cache.client.ServerRefusedConnectionException: <null >> inet_addr hostname><ec>:10000 refused connection: A previous connection >> attempt from this client is still being processed: >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> >> >> >> And on the server side: >> >> >> >> [warning 2018/10/22 19:04:41.992 AMT >> Server-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <ServerConnection on port 10000 >> Thread 80> tid=0x5e0] Server connection from >> [identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1; >> port=50484]: Unexpected IOException: >> >> java.net.SocketException: Socket Closed >> >> at >> java.net.AbstractPlainSocketImpl.setOption(AbstractPlainSocketImpl.java:212) >> >> at java.net.Socket.setSoTimeout(Socket.java:1141) >> >> at >> sun.security.ssl.BaseSSLSocketImpl.setSoTimeout(BaseSSLSocketImpl.java:631) >> >> at >> sun.security.ssl.SSLSocketImpl.setSoTimeout(SSLSocketImpl.java:2565) >> >> at >> org.apache.geode.internal.cache.tier.sockets.Message.readHeaderAndBody(Message.java:661) >> >> at >> org.apache.geode.internal.cache.tier.sockets.Message.receiveWithHeaderReadTimeout(Message.java:1125) >> >> at >> org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1136) >> >> at >> org.apache.geode.internal.cache.tier.sockets.Message.receive(Message.java:1145) >> >> at >> org.apache.geode.internal.cache.tier.sockets.BaseCommand.readRequest(BaseCommand.java:851) >> >> at >> org.apache.geode.internal.cache.tier.sockets.ServerConnection.doNormalMsg(ServerConnection.java:792) >> >> at >> org.apache.geode.internal.cache.tier.sockets.OriginalServerConnection.doOneMessage(OriginalServerConnection.java:85) >> >> at >> org.apache.geode.internal.cache.tier.sockets.ServerConnection.run(ServerConnection.java:1248) >> >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) >> >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) >> >> at >> org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$4$1.run(AcceptorImpl.java:644) >> >> at java.lang.Thread.run(Thread.java:748) >> >> >> >> [warning 2018/10/22 19:04:41.992 AMT >> Server-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <ServerConnection on port 10000 >> Thread 80> tid=0x5e0] ClientHealthMonitor: Unregistering client with member >> id >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> due to: Socket Closed >> >> >> >> [info 2018/10/22 19:05:02.034 AMT >> Server-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <Client Queue Initialization >> Thread 1> tid=0x4b8] :Cache server: Initializing primary server-to-client >> communication socket: 62ca4c8d[TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256: >> Socket[addr=/10.27.72.247 >> <https://na01.safelinks.protection.outlook.com/?url=http%3A%2F%2F10.27.72.247&data=02%7C01%7Cvaharonyan%40vmware.com%7C3312049973d9427add1008d639cac742%7Cb39138ca3cee4b4aa4d6cd83d9dd62f0%7C1%7C0%7C636759940439524978&sdata=1qFyS%2BHWd7g4b7cDJ71Tbls6H5%2B8LxSip8tVRqPmyVU%3D&reserved=0> >> ,port=50516,localport=10000]] >> >> >> >> [warning 2018/10/22 19:05:02.034 AMT >> Server-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <Client Queue Initialization >> Thread 1> tid=0x4b8] A previous connection attempt from this client is >> still being processed: >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> >> >> >> [warning 2018/10/22 19:05:02.034 AMT >> Server-17488aac-6fbd-43d1-8f91-c6d06c3947c0 <Client Queue Initialization >> Thread 1> tid=0x4b8] CacheClientNotifier: Unsuccessfully registered client >> with identifier >> identity(localhost(Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0:21127:loner):57327:6cb13d9c:Client-17488aac-6fbd-43d1-8f91-c6d06c3947c0,connection=1 >> and response code 60 >> >> >> >> Could it be we are missing something on PoolCreation or client/server >> configuration for Geode 1.6.0? Things were working fine with 1.1.0 version. >> >> >> >> Thanks, >> >> Vahram. >> >> >> >>
