[ https://issues.apache.org/jira/browse/GEODE-8004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17088822#comment-17088822 ]

Juan Ramos commented on GEODE-8004:
-----------------------------------

Hello [~alberto.bustamante.reyes],

The change certainly improved things (not that many failures in 100 runs), but I still see some exceptions; the stack trace is below:
{noformat}
org.apache.geode.InternalGemFireError: Unexpected message type REPLY
    at org.apache.geode.cache.client.internal.AbstractOp.processObjResponse(AbstractOp.java:292)
    at org.apache.geode.cache.client.internal.ContainsKeyOp$ContainsKeyOpImpl.processResponse(ContainsKeyOp.java:66)
    at org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:222)
    at org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:207)
    at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:382)
    at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:268)
    at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:352)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:753)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:141)
    at org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:111)
    at org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:796)
    at org.apache.geode.cache.client.internal.ContainsKeyOp.execute(ContainsKeyOp.java:37)
    at org.apache.geode.cache.client.internal.ServerRegionProxy.containsKey(ServerRegionProxy.java:284)
    at org.apache.geode.internal.cache.LocalRegion.containsKeyOnServer(LocalRegion.java:4043)
    at parReg.ParRegTest.replace(ParRegTest.java:4487)
    at parReg.ParRegTest.doEntryOperations(ParRegTest.java:2846)
    at parReg.ParRegTest.HADoEntryOps(ParRegTest.java:2156)
    at parReg.ParRegTest.HydraTask_HADoEntryOps(ParRegTest.java:1056)
    at hydra.TestTask.execute(TestTask.java:197)
{noformat}
On my end, I'm still trying to write a {{DistributedTest}} to reproduce the problem more reliably.


> Regression Introduced Through GEODE-7565
> ----------------------------------------
>
>                 Key: GEODE-8004
>                 URL: https://issues.apache.org/jira/browse/GEODE-8004
>             Project: Geode
>          Issue Type: Bug
>          Components: client/server
>            Reporter: Juan Ramos
>            Assignee: Juan Ramos
>            Priority: Major
>              Labels: GeodeCommons
>
> Intermittent errors were observed while executing some internal tests, and commit [dd23ee8|https://github.com/apache/geode/commit/dd23ee8200cba67cea82e57e2e4ccedcdf9e8266] was determined to be responsible. As of yet, no local reproduction of the issue is available, but work is ongoing to provide a test that can be used to debug the issue. A [PR|https://github.com/apache/geode/pull/4974] to revert the original commit has been opened and will be merged shortly; this ticket is to investigate the root cause so the original commit can be merged again into {{develop}}.
> ---
> It seems that a server is trying to read an {{ack}} response and, instead, it receives a {{PING}} message:
> {noformat}
> [error 2020/04/18 23:44:22.758 PDT <poolTimer-edgeDescript-31> tid=0x165] Unexpected error in pool task <org.apache.geode.cache.client.internal.LiveServerPinger$PingTask@3483b110>
> org.apache.geode.InternalGemFireError: Unexpected message type PING
>     at org.apache.geode.cache.client.internal.AbstractOp.processAck(AbstractOp.java:264)
>     at org.apache.geode.cache.client.internal.PingOp$PingOpImpl.processResponse(PingOp.java:82)
>     at org.apache.geode.cache.client.internal.AbstractOp.processResponse(AbstractOp.java:222)
>     at org.apache.geode.cache.client.internal.AbstractOp.attemptReadResponse(AbstractOp.java:207)
>     at org.apache.geode.cache.client.internal.AbstractOp.attempt(AbstractOp.java:382)
>     at org.apache.geode.cache.client.internal.ConnectionImpl.execute(ConnectionImpl.java:268)
>     at org.apache.geode.cache.client.internal.pooling.PooledConnection.execute(PooledConnection.java:352)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.executeWithPossibleReAuthentication(OpExecutorImpl.java:753)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnServer(OpExecutorImpl.java:332)
>     at org.apache.geode.cache.client.internal.OpExecutorImpl.executeOn(OpExecutorImpl.java:303)
>     at org.apache.geode.cache.client.internal.PoolImpl.executeOn(PoolImpl.java:839)
>     at org.apache.geode.cache.client.internal.PingOp.execute(PingOp.java:38)
>     at org.apache.geode.cache.client.internal.LiveServerPinger$PingTask.run2(LiveServerPinger.java:90)
>     at org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1329)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>     at org.apache.geode.internal.ScheduledThreadPoolExecutorWithKeepAlive$DelegatingScheduledFuture.run(ScheduledThreadPoolExecutorWithKeepAlive.java:276)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
> Around the same time, another member of the distributed system logs the following warning, which seems to be related to the original changes as well:
> {noformat}
> [warn 2020/04/18 23:44:22.757 PDT <ServerConnection on port 29019 Thread 1> tid=0x298] Unable to ping non-member rs-FullRegression19040559a2i32xlarge-hydra-client-63(bridgegemfire1_host1_4749:4749)<ec><v39>:41003 for client identity(rs-FullRegression19040559a2i32xlarge-hydra-client-63(edgegemfire3_host1_1071:1071:loner):50046:5a182991:edgegemfire3_host1_1071,connection=2
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)