Suresh Perumal created KARAF-4878:
-------------------------------------

             Summary: Cellar Hazelcast unresponsive when ETH Down
                 Key: KARAF-4878
                 URL: https://issues.apache.org/jira/browse/KARAF-4878
             Project: Karaf
          Issue Type: Bug
          Components: cellar-hazelcast
    Affects Versions: 4.0.5
         Environment: Redhat Linux 7.2, CentOS 7.2
            Reporter: Suresh Perumal
            Priority: Blocker

The cluster is configured with 2 nodes, both up and running.

As part of a fail-over scenario simulation, we are testing an "Ethernet down" scenario by running the "/etc/sysconfig/network-scripts/ifdown eth0" command on the first node. During this scenario, our in-house monitoring scripts shut down the first node, where the ETH is down. The second of the two nodes is kept alive.

Hazelcast on the second node is not accessible for more than 15 minutes. We get the exception below, and no Hazelcast operation works. Applications that use Hazelcast stay frozen.

Invocation | 52 - com.hazelcast - 3.5.2 | [10.249.50.80]:5701 [cellar] [3.5.2] While asking 'is-executing': Invocation{ serviceName='hz:impl:mapService', op=PutOperation{unacknowledged-alarm}, partitionId=165, replicaIndex=0, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeout=60000, target=Address[10.249.50.79]:5701, backupsExpected=0, backupsCompleted=0}
java.util.concurrent.TimeoutException: Call Invocation{ serviceName='hz:impl:mapService', op=com.hazelcast.spi.impl.operationservice.impl.operations.IsStillExecutingOperation{serviceName='hz:impl:mapService', partitionId=-1, callId=2114, invocationTime=1480511190143, waitTimeout=-1, callTimeout=5000}, partitionId=-1, replicaIndex=0, tryCount=0, tryPauseMillis=0, invokeCount=1, callTimeout=5000, target=Address[10.249.50.79]:5701, backupsExpected=0, backupsCompleted=0} encountered a timeout
	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponse(InvocationFuture.java:366)[52:com.hazelcast:3.5.2]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.resolveApplicationResponseOrThrowException(InvocationFuture.java:334)[52:com.hazelcast:3.5.2]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:225)[52:com.hazelcast:3.5.2]
	at com.hazelcast.spi.impl.operationservice.impl.IsStillRunningService.isOperationExecuting(IsStillRunningService.java:85)[52:com.hazelcast:3.5.2]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.waitForResponse(InvocationFuture.java:275)[52:com.hazelcast:3.5.2]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:224)[52:com.hazelcast:3.5.2]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationFuture.get(InvocationFuture.java:204)[52:com.hazelcast:3.5.2]
	at com.hazelcast.map.impl.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:456)[52:com.hazelcast:3.5.2]
	at com.hazelcast.map.impl.proxy.MapProxySupport.putInternal(MapProxySupport.java:417)[52:com.hazelcast:3.5.2]
	at com.hazelcast.map.impl.proxy.MapProxyImpl.put(MapProxyImpl.java:97)[52:com.hazelcast:3.5.2]
	at com.hazelcast.map.impl.proxy.MapProxyImpl.put(MapProxyImpl.java:87)[52:com.hazelcast:3.5.2]
	at com.fujitsu.fnc.emf.fpmplatform.cachemanager.HazelcastCacheManagerMapServiceImpl.addToMap(HazelcastCacheManagerMapServiceImpl.java:87)[209:FPMHazelcastCache:4.1.0.SNAPSHOT]
	at Proxy1897a82c_c032_4a5c_9839_e71cb2af452a.addToMap(Unknown Source)[:]
	at com.fujitsu.fnc.ngemf.fm.server.impl.FpmConsumerTask.prepareJSON(FpmConsumerTask.java:151)[235:com.fujitsu.fnc.ngemf.fm.server.impl:4.1.0.SNAPSHOT]
	at com.fujitsu.fnc.ngemf.fm.server.impl.FpmConsumerTask.run(FpmConsumerTask.java:244)[235:com.fujitsu.fnc.ngemf.fm.server.impl:4.1.0.SNAPSHOT]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_66]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)[:1.8.0_66]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_66]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_66]
	at java.lang.Thread.run(Thread.java:745)[:1.8.0_66]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)