[ 
https://issues.apache.org/jira/browse/GEODE-9075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17325895#comment-17325895
 ] 

Mario Ivanac commented on GEODE-9075:
-------------------------------------

These are the reproduction steps.

 

In the properties file gemfire1.properties, set:

membership-port-range=2025-2030

 

1. In gfsh execute:

start locator --name=locator1
start server --name=server1 --server-port=0 --properties-file=gemfire1.properties
start server --name=server2 --server-port=0 --properties-file=gemfire2.properties
create region --name=regionA --type=REPLICATE

put --region=regionA --key="1" --value="one"

 

2. In a second terminal, add the iptables rule:

sudo iptables -I INPUT -p tcp --match multiport --destination-port=2025,2026,2027,2028,2029,2030 -j REJECT --reject-with tcp-reset

 

3. In gfsh execute:

put --region=regionA --key="1" --value="onev2"

 

4. Then, in the second terminal, remove the iptables rule:

sudo iptables -D INPUT -p tcp --match multiport --destination-port=2025,2026,2027,2028,2029,2030 -j REJECT --reject-with tcp-reset

 

After all these steps, gfsh is stuck.
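
The hang can be pictured with a small standalone Java sketch. It is not Geode code: LostReplySketch and awaitReply are hypothetical names, and the plain CountDownLatch only stands in for the reply wait visible in the stack trace quoted further down (StoppableCountDownLatch.await under ReplyProcessor21.basicWait). The assumption, taken from the issue description, is that the reply is lost when the connection is reset, so nothing ever counts the latch down.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Illustrative only: NOT Geode's ReplyProcessor21. All names here are hypothetical.
final class LostReplySketch {

  // Waits for a reply for at most timeoutMillis.
  // Returns true only if the latch was counted down (the reply arrived) in time.
  static boolean awaitReply(CountDownLatch replyLatch, long timeoutMillis)
      throws InterruptedException {
    return replyLatch.await(timeoutMillis, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws InterruptedException {
    // Reply delivered: another thread counts the latch down and the sender proceeds.
    CountDownLatch delivered = new CountDownLatch(1);
    new Thread(delivered::countDown).start();
    System.out.println("reply delivered: " + awaitReply(delivered, 1000));

    // Reply lost (assumed: connection reset after the send): nobody counts down,
    // so the bounded wait gives up and reports that no reply arrived.
    CountDownLatch lost = new CountDownLatch(1);
    System.out.println("reply lost, got reply: " + awaitReply(lost, 200));
  }
}
```

In the lost-reply case the bounded wait returns false once the timeout expires. A caller that simply repeats such a wait, with no other exit condition, never makes progress, which would be consistent with a thread that is reported as stuck while in state TIMED_WAITING.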

 

> Thread stuck indefinitely when using Istio/Sidecar
> --------------------------------------------------
>
>                 Key: GEODE-9075
>                 URL: https://issues.apache.org/jira/browse/GEODE-9075
>             Project: Geode
>          Issue Type: Bug
>            Reporter: Mario Ivanac
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: pull-request-available
>
> A Geode cluster is deployed in a Kubernetes environment, and Istio/SideCars are 
> injected between cluster members. While running traffic, if any Istio/SideCar 
> is restarted, a thread will get stuck indefinitely while waiting for a reply to 
> a sent message.
> After detailed analysis, it seems that, due to the restart of the proxy, in some 
> cases a message is lost and the sending side waits indefinitely for the reply. 
> What can be seen on the sending side is the reception of "reset connection" or 
> "EOF" on the sending socket after the message is sent.
>  
> [warn 2021/03/25 21:04:47.282 CET server2 <ThreadsMonitor> tid=0x12] Thread <64> (0x40) that was executed at <25 Mar 2021 21:03:53 CET> has been stuck for <53.897 seconds> and number of thread monitor iteration <1> 
>  Thread Name <Function Execution Processor2> state <TIMED_WAITING>
>  Waiting on <java.util.concurrent.CountDownLatch$Sync@7c7f9898>
>  Executor Group <FunctionExecutionPooledExecutor>
>  Monitored metric <ResourceManagerStats.numThreadsStuck>
>  Thread stack:
>  sun.misc.Unsafe.park(Native Method)
>  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>  java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>  java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>  java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
>  org.apache.geode.internal.util.concurrent.StoppableCountDownLatch.await(StoppableCountDownLatch.java:72)
>  org.apache.geode.distributed.internal.ReplyProcessor21.basicWait(ReplyProcessor21.java:736)
>  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:811)
>  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:784)
>  org.apache.geode.distributed.internal.ReplyProcessor21.waitForRepliesUninterruptibly(ReplyProcessor21.java:874)
>  org.apache.geode.internal.cache.DistributedCacheOperation.waitForAckIfNeeded(DistributedCacheOperation.java:811)
>  org.apache.geode.internal.cache.DistributedCacheOperation._distribute(DistributedCacheOperation.java:699)
>  org.apache.geode.internal.cache.DistributedCacheOperation.startOperation(DistributedCacheOperation.java:277)
>  org.apache.geode.internal.cache.DistributedCacheOperation.distribute(DistributedCacheOperation.java:318)
>  org.apache.geode.internal.cache.DistributedRegion.distributeUpdate(DistributedRegion.java:520)
> ...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
