[ 
https://issues.apache.org/jira/browse/IGNITE-13298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17164776#comment-17164776
 ] 

Vipul Thakur commented on IGNITE-13298:
---------------------------------------

The cluster memory/persistence configuration is in the Environment section at the top.

> Found long running cache at client end 
> ---------------------------------------
>
>                 Key: IGNITE-13298
>                 URL: https://issues.apache.org/jira/browse/IGNITE-13298
>             Project: Ignite
>          Issue Type: Task
>    Affects Versions: 2.7.6
>         Environment: ========cluster memory config/persistence================
> <property name="gridLogger">
>     <bean class="org.apache.ignite.logger.log4j2.Log4J2Logger">
>         <constructor-arg type="java.lang.String" value="${IGNITE_SCRIPT}/ignite-log4j2.xml" />
>     </bean>
> </property>
> <property name="dataStorageConfiguration">
>     <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
>         <property name="defaultDataRegionConfiguration">
>             <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
>                 <property name="metricsEnabled" value="true"/>
>                 <property name="persistenceEnabled" value="true" />
>                 <!--<property name="maxSize" value="#{10L * 1024 * 1024 * 1024}"/> -->
>                 <property name="maxSize" value="400Gb" />
>                 <!-- Increasing the buffer size to 4 GB. -->
>                 <property name="checkpointPageBufferSize" value="${checkpointPageBufferSize}" />
>             </bean>
>         </property>
>         <property name="storagePath" value="${storagePath}" />
>         <property name="walPath" value="${walPath}" />
>         <property name="walArchivePath" value="${walArchivePath}" />
>         <property name="walMode" value="LOG_ONLY" />
>         <property name="pageSize" value="${pageSize}" />
>         <!-- Enable write throttling. -->
>         <property name="writeThrottlingEnabled" value="true" />
>         <property name="walHistorySize" value="1" />
>         <property name="metricsEnabled" value="true"/>
>     </bean>
> </property>
> ==================Client thread dump ===========================
> 2020-07-20 12:14:43
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode):
>
> "Attach Listener" #788 daemon prio=9 os_prio=0 tid=0x00007fe7f4001000 nid=0x32d waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>
>    Locked ownable synchronizers:
>         - None
>
> "Context_6_jms_314_ConsumerDispatcher" #787 daemon prio=5 os_prio=0 tid=0x00007fe6e805e000 nid=0x31a waiting on condition [0x00007fe2e5bdd000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000000cb87d9d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
>         at com.solacesystems.jcsmp.protocol.nio.impl.ConsumerNotificationDispatcher.eventLoop(ConsumerNotificationDispatcher.java:110)
>         at com.solacesystems.jcsmp.protocol.nio.impl.ConsumerNotificationDispatcher.run(ConsumerNotificationDispatcher.java:130)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> "DefaultMessageListenerContainer-35" #786 prio=5 os_prio=0 tid=0x00007fe460013800 nid=0x319 in Object.wait() [0x00007fe2e5cde000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at com.solacesystems.jcsmp.impl.XMLMessageQueue.dequeue(XMLMessageQueue.java:130)
>         at com.solacesystems.jcsmp.impl.flow.FlowHandleImpl.receive(FlowHandleImpl.java:845)
>         - locked <0x00000000cb8cce50> (a com.solacesystems.jcsmp.impl.XMLMessageQueueList)
>         at com.solacesystems.jms.SolMessageConsumer.receive(SolMessageConsumer.java:253)
>         at org.springframework.jms.connection.CachedMessageConsumer.receive(CachedMessageConsumer.java:86)
>         at org.springframework.jms.support.destination.JmsDestinationAccessor.receiveFromConsumer(JmsDestinationAccessor.java:132)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveMessage(AbstractPollingMessageListenerContainer.java:418)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:303)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:257)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1189)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1179)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1076)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> "Context_4_jms_313_ConsumerDispatcher" #785 daemon prio=5 os_prio=0 tid=0x00007fe6f8028000 nid=0x318 waiting on condition [0x00007fe2e5ddf000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000000cb8cf8d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
>         at com.solacesystems.jcsmp.protocol.nio.impl.ConsumerNotificationDispatcher.eventLoop(ConsumerNotificationDispatcher.java:110)
>         at com.solacesystems.jcsmp.protocol.nio.impl.ConsumerNotificationDispatcher.run(ConsumerNotificationDispatcher.java:130)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> "DefaultMessageListenerContainer-27" #784 prio=5 os_prio=0 tid=0x00007fe45800f800 nid=0x317 in Object.wait() [0x00007fe2e5ee0000]
>    java.lang.Thread.State: TIMED_WAITING (on object monitor)
>         at java.lang.Object.wait(Native Method)
>         at com.solacesystems.jcsmp.impl.XMLMessageQueue.dequeue(XMLMessageQueue.java:130)
>         at com.solacesystems.jcsmp.impl.flow.FlowHandleImpl.receive(FlowHandleImpl.java:845)
>         - locked <0x00000000cb8cffc8> (a com.solacesystems.jcsmp.impl.XMLMessageQueueList)
>         at com.solacesystems.jms.SolMessageConsumer.receive(SolMessageConsumer.java:253)
>         at org.springframework.jms.connection.CachedMessageConsumer.receive(CachedMessageConsumer.java:86)
>         at org.springframework.jms.support.destination.JmsDestinationAccessor.receiveFromConsumer(JmsDestinationAccessor.java:132)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveMessage(AbstractPollingMessageListenerContainer.java:418)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:303)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:257)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1189)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1179)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1076)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> "Context_6_jms_312_ConsumerDispatcher" #780 daemon prio=5 os_prio=0 tid=0x00007fe6e805c800 nid=0x313 waiting on condition [0x00007fe2e62e4000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000000cb751ad0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
>         at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)
>         at com.solacesystems.jcsmp.protocol.nio.impl.ConsumerNotificationDispatcher.eventLoop(ConsumerNotificationDispatcher.java:110)
>         at com.solacesystems.jcsmp.protocol.nio.impl.ConsumerNotificationDispatcher.run(ConsumerNotificationDispatcher.java:130)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
>
> "DefaultMessageListenerContainer-34" #779 prio=5 os_prio=0 tid=0x00007fe450003800 nid=0x312 waiting on condition [0x00007fe2e63e5000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
>         at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
>         at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get0(GridCacheAdapter.java:4723)
>         at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4697)
>         at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1415)
>         at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.get(IgniteCacheProxyImpl.java:928)
>         at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.get(GatewayProtectedCacheProxy.java:640)
>         at com.jio.digitalapi.cacheservice.client.impl.DigitalApiIgniteCache.get(DigitalApiIgniteCache.java:87)
>         at com.jio.digitalapi.eventprocessing.service.dataservice.EventManagementApacheIgniteDataService.getCustomerEntity(EventManagementApacheIgniteDataService.java:101)
>         at com.jio.digitalapi.ep.dataservice.impl.AbstractEventManagementDataService.getCustomer(AbstractEventManagementDataService.java:38)
>         at com.jio.digitalapi.eventprocessing.service.base.AbstractMessageEventActionProcessor.getCustomerEntity(AbstractMessageEventActionProcessor.java:154)
>         at com.jio.digitalapi.eventprocessing.service.event.action.processor.PrimeMemberUpdateEventActionProcessor.processEvent(PrimeMemberUpdateEventActionProcessor.java:31)
>         at com.jio.digitalapi.eventprocessing.service.base.AbstractMessageEventActionProcessor.processMessageEvent(AbstractMessageEventActionProcessor.java:112)
>         at com.jio.digitalapi.eventprocessing.service.base.AbstractMessageEventHandler.processMessage(AbstractMessageEventHandler.java:66)
>         at com.jio.digitalapi.platform.core.messaging.jms.receiver.DigitalApiAsyncJmsMessageReceiver.onMessage(DigitalApiAsyncJmsMessageReceiver.java:106)
>         at org.springframework.jms.listener.AbstractMessageListenerContainer.doInvokeListener(AbstractMessageListenerContainer.java:761)
>         at org.springframework.jms.listener.AbstractMessageListenerContainer.invokeListener(AbstractMessageListenerContainer.java:699)
>         at org.springframework.jms.listener.AbstractMessageListenerContainer.doExecuteListener(AbstractMessageListenerContainer.java:674)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.doReceiveAndExecute(AbstractPollingMessageListenerContainer.java:318)
>         at org.springframework.jms.listener.AbstractPollingMessageListenerContainer.receiveAndExecute(AbstractPollingMessageListenerContainer.java:257)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.invokeListener(DefaultMessageListenerContainer.java:1189)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.executeOngoingLoop(DefaultMessageListenerContainer.java:1179)
>         at org.springframework.jms.listener.DefaultMessageListenerContainer$AsyncMessageListenerInvoker.run(DefaultMessageListenerContainer.java:1076)
>         at java.lang.Thread.run(Thread.java:748)
>
>    Locked ownable synchronizers:
>         - None
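
In the quoted dump, thread "DefaultMessageListenerContainer-34" is parked inside `GridFutureAdapter.get0` under a cache `get`. With five attached dump files, it may help to list every thread that has this frame on its stack. The following is only a rough sketch using the Java standard library; the class name `StuckThreadFinder` and its method are made up for illustration, and it assumes the usual jstack layout in which each thread section starts with a line beginning with a double quote (as in the attached files, not the `>`-quoted email text).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: lists threads in a jstack-style dump whose stack mentions a frame. */
final class StuckThreadFinder {

    /**
     * Returns the names of threads whose section of the dump contains {@code frame}.
     * Assumption: a thread section starts at a line beginning with a double quote.
     */
    static List<String> threadsContaining(String dump, String frame) {
        List<String> hits = new ArrayList<>();
        String current = null;   // name of the thread section being read
        boolean matched = false; // true once the current section has been reported
        for (String line : dump.split("\\R")) {
            String s = line.trim();
            if (s.startsWith("\"")) {
                // Header line: the thread name is the text between the first two quotes.
                int end = s.indexOf('"', 1);
                current = end > 0 ? s.substring(1, end) : s;
                matched = false;
            } else if (current != null && !matched && s.contains(frame)) {
                hits.add(current);
                matched = true;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String dump = String.join("\n",
            "\"Context_6_jms_312_ConsumerDispatcher\" #780 daemon prio=5",
            "   java.lang.Thread.State: WAITING (parking)",
            "        at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:403)",
            "\"DefaultMessageListenerContainer-34\" #779 prio=5",
            "   java.lang.Thread.State: WAITING (parking)",
            "        at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)");
        // prints [DefaultMessageListenerContainer-34]
        System.out.println(threadsContaining(dump, "GridFutureAdapter.get0"));
    }
}
```

Running the same search over each attached dump and comparing the results would show whether the same threads stay parked on the cache `get` from one dump to the next.
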
>  
>  
>  
>  
>            Reporter: Vipul Thakur
>            Priority: Blocker
>         Attachments: Ignite_10.143.75.24_threaddump.txt, 
> Ignite_10.143.75.24_threaddump_1.txt, Ignite_10.143.75.24_threaddump_2.txt, 
> Ignite_10.143.75.24_threaddump_3.txt, Ignite_10.143.75.24_threaddump_4.txt
>
>
> Hi,
> We have an Ignite cluster with four nodes (each with 400 GB of memory).
> About two months after deploying the cluster and clients in an environment,
> the cluster hangs: initially a few clients get stuck with some processing
> pending, and after some time everything gets stuck.
> After restarting everything (both clients and cluster), it works fine,
> processing around 1 crore (10 million) records in 10-15 minutes, including
> creating and updating data.
>  
> We use transactions for create and update operations on all caches, and no
> transactions for get calls.
> We have already faced this issue twice in 4-5 months of deployment.
> I am attaching the cluster thread dump and the client thread dump.
> I have seen that there is already a Jira ticket for *Found long running caches*
> that was moved to 2.8.1. Is that the fix? Please confirm.
>     2020-06-04 20:05:55.889 WARN 1 --- [c7fd8b84-d8sdl%] 
> org.apache.ignite.internal.diagnostic : Found long running cache future 
> [startTime=19:59:30.288, curTime=20:05:55.882, 
> fut=GridPartitionedSingleGetFuture [topVer=AffinityTopologyVersion 
> [topVer=10, minorTopVer=0], key=UserKeyCacheObjectImpl [part=105, 
> val=7701112105, hasValBytes=true], readThrough=true, forcePrimary=false, 
> futId=681978f7271-55010ba8-d8d5-475f-97be-ff1c1916cea1, trackable=true, 
> subjId=59d5e3cf-d09c-44d3-82d6-84dd35b64e10, taskName=null, 
> deserializeBinary=true, skipVals=false, expiryPlc=null, canRemap=true, 
> needVer=false, keepCacheObjects=false, recovery=false, node=TcpDiscoveryNode 
> [id=14718baa-35e7-4d61-bde8-1e9c61978e8f, addrs=[10.135.34.67, 127.0.0.1], 
> sockAddrs=[/10.135.34.67:47500, /127.0.0.1:47500], discPort=47500, order=1, 
> intOrder=1, lastExchangeTime=1591277613188, loc=false, 
> ver=2.7.6#20190911-sha1:21f7ca41, isClient=false], postProcessingClos=null]]
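
The warning above carries both `startTime` and `curTime`, so the age of the pending get can be read straight off the line: 20:05:55.882 minus 19:59:30.288 is 6 minutes 25.594 seconds. A small sketch of that arithmetic using only `java.time` follows; the class name `LongRunningFutureAge` is hypothetical, and the midnight handling is an assumption, since the log prints the time of day without a date.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: how long a "long running cache future" has been pending. */
final class LongRunningFutureAge {

    // Matches the startTime/curTime pair as printed in the diagnostic warning.
    private static final Pattern TIMES = Pattern.compile(
        "startTime=(\\d{2}:\\d{2}:\\d{2}\\.\\d{3}),\\s*curTime=(\\d{2}:\\d{2}:\\d{2}\\.\\d{3})");

    static Duration pendingFor(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find())
            throw new IllegalArgumentException("no startTime/curTime pair found");
        LocalTime start = LocalTime.parse(m.group(1));
        LocalTime cur = LocalTime.parse(m.group(2));
        Duration d = Duration.between(start, cur);
        // Assumption: only the time of day is logged, so allow one midnight rollover.
        return d.isNegative() ? d.plusDays(1) : d;
    }

    public static void main(String[] args) {
        String line = "Found long running cache future "
            + "[startTime=19:59:30.288, curTime=20:05:55.882, fut=GridPartitionedSingleGetFuture [...]]";
        // prints 385594
        System.out.println(pendingFor(line).toMillis());
    }
}
```
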
>  
> However, I have also observed this issue in our environment without any load
> or heavy traffic (the cluster held only 100 MB of data).
>  
> We don't have any transaction timeout set as of now. Should we set one?
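
On the timeout question, the general effect of a timeout is that a caller stops waiting after a bound instead of parking indefinitely. In Ignite, a default transaction timeout can be set through `TransactionConfiguration` (`setDefaultTxTimeout`), though the get calls described above run outside a transaction, so that setting may not by itself bound them. The sketch below does not use Ignite at all: it is a plain-JDK illustration of a bounded wait, and `BoundedGet`, `getOrFallback`, and the fallback value are names assumed for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: waits for a result for a bounded time instead of indefinitely. */
final class BoundedGet {

    /** Waits at most {@code timeoutMs} for the future; returns {@code fallback} on timeout. */
    static <T> T getOrFallback(CompletableFuture<T> fut, long timeoutMs, T fallback)
            throws InterruptedException, ExecutionException {
        try {
            return fut.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Give up on the pending call so the caller thread is released.
            fut.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stands in for a reply that never arrives.
        CompletableFuture<String> never = new CompletableFuture<>();
        System.out.println(getOrFallback(never, 200, "timed out"));

        // A reply that is already available is returned immediately.
        System.out.println(getOrFallback(CompletableFuture.completedFuture("customer-1"), 200, "timed out"));
    }
}
```

With a bound like this, a listener thread would surface a timeout that can be logged and retried rather than staying parked, which at least makes the hang visible on the client side.
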
>  
>  
> Thanks 
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
