Hi,

We were stress testing ActiveMQ 5.4.2 with a few persistent queues, a few non-persistent queues
with a TimeToLive (TTL) of 2 minutes, and a few non-persistent queues without any TTL. We ran into
two issues from which we could not recover to a normal state without cleaning the temp store and
restarting ActiveMQ.
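
(For reference, the non-persistent senders with TTL do roughly what the sketch below shows; the
broker URL, queue name, and payload are placeholders, not our actual sender code.)

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Session;
import org.apache.activemq.ActiveMQConnectionFactory;

public class NonPersistentTtlSender {
    public static void main(String[] args) throws Exception {
        // Broker URL and queue name are placeholders for illustration.
        ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://broker-host:61616");
        Connection connection = factory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        MessageProducer producer = session.createProducer(session.createQueue("masterEventQueue"));

        // Non-persistent delivery with a 2-minute TTL, as on the TTL queues described above.
        producer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
        producer.setTimeToLive(2 * 60 * 1000);

        producer.send(session.createTextMessage("event payload"));
        connection.close();
    }
}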

Stress Details:
1.      Use case
a.      Senders send messages to one master queue.
b.      The consumer of the master queue is integrated with ActiveMQ through Camel over the VM
protocol. This consumer, called the router, routes each message to other sub queues based on
routing information derived from the event type; the same message can therefore be dropped into
more than one sub queue. We had 4 sub queues, and on average one message to the master queue
resulted in 2.3 messages across the sub queues. (A sketch of such a route follows this list.)
c.      Consumers of the sub queues run in a different JVM and drain the messages.
2.      Stats
a.      2000-4000 msgs per second sent to the master queue
b.      2000-3300 msgs per second processed at the router
c.      30% CPU on the ActiveMQ node
3.      Hardware
a.      ActiveMQ: one node with 6 CPUs (24 virtual CPUs due to multiple cores and hyper-threading).
b.      4 senders: each node with 2 dual-core CPUs (4 virtual CPUs). Each sender was set up to
send 1000 msgs per second.
c.      4 receivers: each node with 2 dual-core CPUs (4 virtual CPUs). 1 receiver per sub queue.
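
The route sketch referred to in 1b, roughly (the "targetQueues" header, sub queue names, and
broker URL are illustrative placeholders, not our actual routing logic):

import org.apache.activemq.camel.component.ActiveMQComponent;
import org.apache.camel.CamelContext;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class EventRouter {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();
        // The router runs alongside the broker, so it attaches over the VM transport.
        context.addComponent("activemq",
                ActiveMQComponent.activeMQComponent("vm://localhost?create=false"));

        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                from("activemq:queue:masterEventQueue")
                    // The real router derives the target sub queues from the event type;
                    // here a hypothetical "targetQueues" header stands in for that result,
                    // e.g. "activemq:queue:subQueue1,activemq:queue:subQueue3". Because a
                    // message can match several sub queues, one master-queue message
                    // becomes ~2.3 sub-queue messages on average.
                    .recipientList(header("targetQueues"), ",");
            }
        });
        context.start();
    }
}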


Issue 1: Temp store usage goes above 100%
After temp store usage went above 100%, all senders and consumers were blocked. There were pending
messages in both the non-persistent queues with TTL and the ones without TTL. We tried purging a
non-persistent queue, thinking that purging its messages might free up the temp store, but all
purge requests from the ActiveMQ console and JMX were also blocked with no response.

The only way to recover was to clean the temp store and restart ActiveMQ. We had the temp store
limit configured to 2 GB.
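
(The 2 GB limit is the broker's tempUsage limit; a minimal programmatic sketch of that kind of
setting follows, with broker name and connector as placeholders. The commented-out
sendFailIfNoSpace line is only a knob we have not tried, not something we use.)

import org.apache.activemq.broker.BrokerService;

public class TempLimitBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("localhost");           // placeholder broker name
        broker.addConnector("tcp://0.0.0.0:61616");  // placeholder connector

        // 2 GB temp store limit, as in our setup; non-persistent messages that
        // overflow the memory limit are paged into this temp store.
        broker.getSystemUsage().getTempUsage().setLimit(2L * 1024 * 1024 * 1024);

        // Untried option: make sends fail instead of blocking producers when a
        // usage limit is reached.
        // broker.getSystemUsage().setSendFailIfNoSpace(true);

        broker.start();
    }
}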

Issue2: Leak in temp store
As Issue 1 was hit, and we couldn’t proceed with stress, so stopped the
stress when temp store reached 96% (timeframe-1), after all messages where
drained in 4-5 mins and there no pending messages in any queue, but temp
store didn’t drop to 0, it dropped only to till 79% (time-frame2). We
started stress again messages piled and temp store again went to 90, we
again stopped stress, 

Below is the chain of events with the temp store stats, queue stats, and the temp store file names
and sizes.
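
(The MEMU/STOREU/TEMPU columns are the broker's memory, store, and temp store usage percentages,
and the queue columns come from the queue MBean counters; the sketch below shows how such values
can be read over JMX. The host, port, and object names are placeholders for illustration.)

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class BrokerUsageProbe {
    public static void main(String[] args) throws Exception {
        // Host and port are placeholders; the broker must expose a JMX connector.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker-host:1099/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        MBeanServerConnection conn = jmxc.getMBeanServerConnection();

        // Broker-level usage percentages (5.4.x-style object name).
        ObjectName broker = new ObjectName(
                "org.apache.activemq:BrokerName=localhost,Type=Broker");
        System.out.println("MemoryPercentUsage = " + conn.getAttribute(broker, "MemoryPercentUsage"));
        System.out.println("StorePercentUsage  = " + conn.getAttribute(broker, "StorePercentUsage"));
        System.out.println("TempPercentUsage   = " + conn.getAttribute(broker, "TempPercentUsage"));

        // Per-queue counters for the master queue.
        ObjectName queue = new ObjectName(
                "org.apache.activemq:BrokerName=localhost,Type=Queue,Destination=masterEventQueue");
        System.out.println("QueueSize     = " + conn.getAttribute(queue, "QueueSize"));
        System.out.println("EnqueueCount  = " + conn.getAttribute(queue, "EnqueueCount"));
        System.out.println("DequeueCount  = " + conn.getAttribute(queue, "DequeueCount"));
        System.out.println("DispatchCount = " + conn.getAttribute(queue, "DispatchCount"));
        System.out.println("ExpiredCount  = " + conn.getAttribute(queue, "ExpiredCount"));

        jmxc.close();
    }
}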

Please let me know any pointers on how to recover from this kind of issue without cleaning the
temp store and restarting ActiveMQ. Is there any configuration through which we can avoid this
memory leak in the temp store? Also let me know if you need any other details on the stress use
case.

Thanks
siva


TempStore stats
Timestamp                MEMU  STOREU  TEMPU  ENQ#      DEQ#      TOT_MSG#  Notes
2011-04-21 07:18:08,779  0     0       0      6934      6480      454       Started one sender
2011-04-21 07:52:10,539  0     1       0      13370361  13370361  0
2011-04-21 07:54:10,737  0     0       11     14155760  14115898  39861     Started three more senders
2011-04-21 07:56:10,829  0     1       26     15484214  15375457  108759
2011-04-21 07:58:10,902  0     1       41     16822317  16653849  168468
2011-04-21 08:00:10,999  0     2       59     18097270  17851082  246192
2011-04-21 08:02:11,111  0     2       75     19411640  19097534  314109
2011-04-21 08:04:11,188  0     0       96     20630853  20223204  407652    Stopped stress; consumers still draining ~400k msgs
2011-04-21 08:06:11,244  0     1       87     21618914  21436362  182551
2011-04-21 08:08:11,301  0     0       79     22002217  22002217  0         Dropped to 79 in 6 mins; 0 pending msgs
2011-04-21 08:18:11,734  0     1       79     25445840  25445574  269       Started only one sender
2011-04-21 08:20:11,806  0     1       81     26328252  26311965  16285
2011-04-21 08:22:11,901  0     2       82     27674442  27597605  76846     Started three more senders
2011-04-21 08:24:11,974  0     0       85     29008172  28864086  144093
2011-04-21 08:26:12,034  0     1       88     30325245  30124132  201114
2011-04-21 08:28:12,097  0     1       91     31664800  31398125  266677
2011-04-21 08:30:12,173  0     2       93     32999706  32671816  327890
2011-04-21 08:32:12,232  0     2       95     34307528  33946200  361331    Stopped again
2011-04-21 08:34:12,296  0     0       79     35082703  35082702  1         Dropped to 79 within 2 mins

Master queue stats (all rows are for masterEventQueue)
Timestamp                PEND#   ENQ#      Chng_ENQ#  DEQ#      Chng_DEQ#  EXP#  MEM%  DSPTCH#   Notes
2011-04-21 07:18:08,872  0       2237      18         2237      18         0     0     2237      Stress start; started one sender
2011-04-21 07:52:10,564  0       4313020   2185       4313020   2185       0     0     4313020
2011-04-21 07:54:10,767  39769   4593381   2336       4553610   2004       0     0     4555338
2011-04-21 07:56:10,845  108697  5068561   3959       4959863   3385       0     0     4961116   Started 3 more senders; enqueue rate spiked from 2k to 4k msgs/sec, then stayed at 4k
2011-04-21 07:58:10,926  168461  5540674   3934       5372213   3436       0     0     5373512
2011-04-21 08:00:11,028  246185  6004699   3866       5758517   3219       0     0     5760118
2011-04-21 08:02:11,137  314148  6474728   3916       6160580   3350       0     0     6162105
2011-04-21 08:04:11,205  407616  6931316   3804       6523701   3026       0     0     6525303   Stopped stress; senders at 4k msgs/sec but router processing only 3.3-3.4k msgs/sec
2011-04-21 08:06:11,261  182463  7097490   1384       6915027   3261       0     0     6920115
2011-04-21 08:08:11,368  0       7097490   0          7097490   1520       0     0     7097490   Pending msgs is 0, but temp store dropped only to 79%
2011-04-21 08:18:11,758  16      8208393   2084       8208378   2084       0     0     8208397   Started only one sender
2011-04-21 08:20:11,836  15958   8503901   2462       8487948   2329       0     0     8490179
2011-04-21 08:22:11,929  76455   8979141   3960       8902691   3456       0     0     8903985   Started three more senders
2011-04-21 08:24:11,993  143969  9455091   3966       9311128   3403       0     0     9312780
2011-04-21 08:26:12,053  200591  9918299   3860       9717712   3388       0     0     9719987
2011-04-21 08:28:12,124  266382  10395026  3972       10128647  3424       0     0     10130777
2011-04-21 08:30:12,192  327391  10866887  3932       10539495  3423       0     0     10541609
2011-04-21 08:32:12,252  360928  11311497  3705       10950569  3425       0     0     10952774  Stopped senders again
2011-04-21 08:34:12,320  0       11317002  45         11317002  3053       0     0     11317002  Pending dropped again to zero

Content of temp store
ls -la tmp-data/

33 MB    Apr 21 08:56  db-253.log
33 MB    Apr 21 08:57  db-254.log
33 MB    Apr 21 08:57  db-255.log
33 MB    Apr 21 08:57  db-256.log
0        Apr 21 07:53  lock
1.67 GB  Apr 21 08:57  tmpDB.data
3 MB     Apr 21 08:57  tmpDB.redo

