[ 
https://issues.apache.org/jira/browse/ARTEMIS-1700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16375205#comment-16375205
 ] 

ASF GitHub Bot commented on ARTEMIS-1700:
-----------------------------------------

Github user shoukunhuai commented on a diff in the pull request:

    https://github.com/apache/activemq-artemis/pull/1894#discussion_r170405082
  
    --- Diff: artemis-server/src/main/java/org/apache/activemq/artemis/core/persistence/impl/journal/AbstractJournalStorageManager.java ---
    @@ -1488,7 +1494,13 @@ public synchronized void start() throws Exception {
     
           beforeStart();
     
    -      singleThreadExecutor = executorFactory.getExecutor();
    +      ThreadFactory tFactory = AccessController.doPrivileged(new PrivilegedAction<ThreadFactory>() {
    +         @Override
    +         public ThreadFactory run() {
    +            return new ActiveMQThreadFactory("ActiveMQ-journal-server-" + this.toString(), true, ClientSessionFactoryImpl.class.getClassLoader());
    +         }
    +      });
    +      singleThreadExecutor = Executors.newSingleThreadExecutor(tFactory);
    --- End diff --
    
    What if the pool is full?
    In our case, the pool is a fixed pool of 60 threads.
    One of the threads is doing page cleanup and trying to exit paging state; it holds the lock in the paging store. The other 59 threads are all blocked on that lock, trying to page.
    During cleanup, we need to store a bookmark in the journal for each page subscription, then wait until the store completes.
    In the log, stored equals storeLineUp, but there are still pending tasks (the ones that would count down the latch the cleanup thread is waiting on), so the deadlock happened.
    ```
    16:44:28,930 AMQ222024: Could not complete operations on IO context OperationContextImpl [1251391301] [minimalStore=1, storeLineUp=2, stored=2, minimalReplicated=0, replicationLineUp=0, replicated=0, paged=0, minimalPage=0, pageLineUp=0, errorCode=-1, errorMessage=null, executorsPending=3, executor=OrderedExecutor(tasks=[org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@4d09259, org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@54b73dc4, org.apache.activemq.artemis.core.persistence.impl.journal.OperationContextImpl$1@640495d4])]
    ```
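
The starvation pattern described above can be reproduced in isolation. What follows is only a minimal sketch, not Artemis code: the class `PoolStarvationSketch`, the method `cleanupCompletes`, and all variable names are hypothetical, the pool has 4 threads instead of 60, and a 500 ms timeout stands in for the point where the broker gives up waiting. One task holds a lock and waits on a latch. The remaining pool threads block on that lock, so the task that would count the latch down can never run if it is queued on the same pool.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class PoolStarvationSketch {

    // Returns true if the "cleanup" task saw its completion callback run
    // before the timeout, false if the callback was starved.
    static boolean cleanupCompletes(ExecutorService pool, int poolSize,
                                    ExecutorService completionExecutor) throws Exception {
        ReentrantLock storeLock = new ReentrantLock();          // stands in for the paging store lock
        CountDownLatch lockHeld = new CountDownLatch(1);
        CountDownLatch writersRunning = new CountDownLatch(poolSize - 1);
        CountDownLatch stored = new CountDownLatch(1);           // the latch cleanup waits on

        Future<Boolean> cleanup = pool.submit(() -> {
            storeLock.lock();
            try {
                lockHeld.countDown();
                // Wait until every other pool thread is occupied by a writer.
                writersRunning.await();
                // The callback that releases us is handed to an executor.
                completionExecutor.execute(stored::countDown);
                return stored.await(500, TimeUnit.MILLISECONDS);
            } finally {
                storeLock.unlock();
            }
        });

        lockHeld.await();
        for (int i = 0; i < poolSize - 1; i++) {
            pool.execute(() -> {
                writersRunning.countDown();
                storeLock.lock();                                // blocks while cleanup holds the lock
                storeLock.unlock();
            });
        }
        return cleanup.get();
    }

    public static void main(String[] args) throws Exception {
        int poolSize = 4;
        ExecutorService shared = Executors.newFixedThreadPool(poolSize);
        ExecutorService dedicated = Executors.newSingleThreadExecutor();
        try {
            // Callback queued on the full shared pool: it is starved.
            System.out.println("shared pool: " + cleanupCompletes(shared, poolSize, shared));
            // Callback run on its own thread: cleanup completes.
            System.out.println("dedicated:   " + cleanupCompletes(shared, poolSize, dedicated));
        } finally {
            shared.shutdown();
            dedicated.shutdown();
        }
    }
}
```

In this sketch the first call returns false and the second returns true. The only difference is which executor runs the completion callback. Whether a dedicated executor fully resolves the situation in the broker depends on which executor its pending completion tasks actually run on.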


> Server stopped responding and killed itself while exiting paging state
> ----------------------------------------------------------------------
>
>                 Key: ARTEMIS-1700
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1700
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.4.0
>            Reporter: Qihong Xu
>            Priority: Major
>         Attachments: artemis.log
>
>
> We are currently experiencing this error while running a stress test on Artemis.
>  
> Basic configuration:
> 1 broker, 1 topic, pub-sub mode.
> Journal type = MAPPED.
> Thread pool max size = 60.
>  
> In order to test the throughput of Artemis we use 300 producers and 300
> consumers. However, we found that sometimes when Artemis exits paging state,
> it stops responding and kills itself. This situation happened on some
> specific servers.
>  
> Details can be found in attached dump file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
