[ 
https://issues.apache.org/jira/browse/AMQ-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299963#comment-14299963
 ] 

Tamas Cserveny commented on AMQ-2798:
-------------------------------------

Actually, there is a more general problem behind the scenes:

The ResponseCorrelator component is written in an optimistic manner. It 
assumes that once a command has been sent to the broker, a response is 
guaranteed to reach us.

This does not hold in failover situations. We may manage to serialize the 
last bit of the command just before the broker crashes or is killed, in which 
case no response will ever arrive. This is true for the failover transport as 
well.

This means ResponseCorrelator can hang forever in several different places.
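
To illustrate the difference between the optimistic wait and a bounded one, 
here is a minimal Java sketch. PendingResponse and its methods are 
hypothetical names used for illustration, not the ActiveMQ classes; the only 
detail taken from the issue is that the blocked thread sits in 
ArrayBlockingQueue.take().

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a pending response slot; not the ActiveMQ class.
class PendingResponse {
    private final BlockingQueue<String> slot = new ArrayBlockingQueue<>(1);

    // Called by the transport reader thread when the broker answers.
    void set(String response) {
        slot.offer(response);
    }

    // Optimistic wait: blocks until a response arrives, which may be never
    // if the broker died right after the command was written.
    String awaitForever() throws InterruptedException {
        return slot.take();
    }

    // Bounded wait: returns null if nothing arrives within the timeout,
    // so the caller gets a chance to react.
    String await(long timeoutMillis) throws InterruptedException {
        return slot.poll(timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

With the bounded variant, a lost response shows up as a null result after the 
timeout instead of a thread that is parked indefinitely.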

The solution would be to define a time-to-live (TTL) for commands. Some 
commands already have a timeout, and for those the calling code handles the 
timeout itself. Callers without a timeout do not cope with the situation 
well: in my case I interrupted the thread from the stack trace in the issue, 
and JBoss then started to process the messages using two different processors 
at the same time. ResponseCorrelator should therefore resend the command to 
the server once the TTL expires, using the same command ID(?). 
If the command ID is already known to the failover transport (ResponseMap), 
it can ignore the resend, because it is most likely still attempting to send 
the original. The TTL could be fairly large, around 30-40 seconds, with 
infinite as the default.
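
A rough Java sketch of this idea follows. It is only an illustration under my 
own assumptions: TtlCorrelator, onResponse and request are hypothetical names, 
the transport is reduced to a callback that receives the command ID, and the 
number of attempts is capped, which the proposal above does not specify. 
Duplicate suppression inside the failover transport is not shown.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.IntConsumer;

// Hypothetical sketch; none of these names come from ActiveMQ.
class TtlCorrelator {
    private final Map<Integer, CompletableFuture<String>> pending =
            new ConcurrentHashMap<>();
    private final IntConsumer transport; // sends the command with this ID
    private final long ttlMillis;        // per-attempt TTL; <= 0 means infinite
    private final int maxAttempts;       // assumption: bounded number of resends

    TtlCorrelator(IntConsumer transport, long ttlMillis, int maxAttempts) {
        this.transport = transport;
        this.ttlMillis = ttlMillis;
        this.maxAttempts = maxAttempts;
    }

    // Called by the reader thread when a response for commandId arrives.
    void onResponse(int commandId, String response) {
        CompletableFuture<String> future = pending.remove(commandId);
        if (future != null) {
            future.complete(response);
        }
    }

    // Sends the command and waits; on TTL expiry resends with the same ID.
    String request(int commandId) throws InterruptedException, TimeoutException {
        CompletableFuture<String> future = new CompletableFuture<>();
        pending.put(commandId, future);
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                transport.accept(commandId);
                try {
                    if (ttlMillis <= 0) {
                        return future.get(); // infinite TTL: the old behaviour
                    }
                    return future.get(ttlMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException expired) {
                    // TTL elapsed without an answer; loop around and resend.
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
            throw new TimeoutException("no response for command " + commandId);
        } finally {
            pending.remove(commandId);
        }
    }
}
```

Because every resend carries the same command ID, a late response to an 
earlier attempt still completes the same pending entry, and a transport that 
tracks in-flight IDs could drop the duplicate.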






> Occasional hangs on ensureConnectionInfoSent
> --------------------------------------------
>
>                 Key: AMQ-2798
>                 URL: https://issues.apache.org/jira/browse/AMQ-2798
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: JMS client
>    Affects Versions: 5.3.2
>            Reporter: Mark Chaimungkalanont
>            Assignee: Timothy Bish
>             Fix For: 5.5.0
>
>         Attachments: blocked-connection-patch3
>
>
> When connecting to the broker, the client occasionally starts to hang. A 
> thread dump reveals:
> {noformat}
> "QuartzScheduler_Worker-7" prio=5 tid=0x0116f190 nid=0x1ce2400 waiting on condition [0xf1fae000..0xf1fafb30]
>       at sun.misc.Unsafe.park(Native Method)
>       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:118)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1767)
>       at java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:341)
>       at org.apache.activemq.transport.FutureResponse.getResult(FutureResponse.java:40)
>       at org.apache.activemq.transport.ResponseCorrelator.request(ResponseCorrelator.java:80)
>       at org.apache.activemq.ActiveMQConnection.syncSendPacket(ActiveMQConnection.java:1233)
>       at org.apache.activemq.ActiveMQConnection.ensureConnectionInfoSent(ActiveMQConnection.java:1339)
>       - locked <0x10b9bdf8> (a java.lang.Object)
>       at org.apache.activemq.ActiveMQConnection.createSession(ActiveMQConnection.java:298)
>       at org.jencks.amqpool.SessionPool.createSession(SessionPool.java:110)
>       at org.jencks.amqpool.SessionPool.makeObject(SessionPool.java:78)
>       at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:974)
>       at org.jencks.amqpool.SessionPool.borrowSession(SessionPool.java:53)
>       at org.jencks.amqpool.ConnectionPool.createSession(ConnectionPool.java:89)
>       at org.jencks.amqpool.XaConnectionPool.createSession(XaConnectionPool.java:51)
>       at org.jencks.amqpool.PooledConnection.createSession(PooledConnection.java:132)
>       at org.springframework.jms.support.JmsAccessor.createSession(JmsAccessor.java:200)
> {noformat}
> Looking closer at the code of {{ensureConnectionInfoSent}} in 
> {{ActiveMQConnection}}, it uses the method:
> {code}
> public Response syncSendPacket(Command command) throws JMSException {
> {code}
> which never times out, so a lost response can leave the caller hanging 
> indefinitely. There is an otherwise identical overload that accepts a 
> timeout:
> {code}
> public Response syncSendPacket(Command command, int timeout) throws JMSException {
> {code}
> Should ensureConnectionInfoSent use the one with the timeout instead?
> We're using the failover transport:
> failover:(tcp://<someIP>:54663?wireFormat.maxInactivityDuration=300000)?maxReconnectAttempts=10&initialReconnectDelay=15000



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
