You should check out the failover transport to handle reconnecting.
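
As a rough illustration of what that looks like, here is a minimal sketch that assembles a failover broker URL. The host name `broker-host`, the port, the option values, and the `FailoverUri` helper class are all assumptions for the example, not taken from your setup; `initialReconnectDelay` and `maxReconnectAttempts` are failover transport options, but check the defaults for your ActiveMQ version.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds a failover transport URL from broker URLs and options.
public class FailoverUri {

    public static String build(List<String> brokers, Map<String, String> options) {
        // failover:(uri1,uri2,...) wraps one or more underlying transport URLs
        StringBuilder uri = new StringBuilder("failover:(")
                .append(String.join(",", brokers))
                .append(")");
        if (!options.isEmpty()) {
            // Transport options are appended as an ordinary query string
            StringJoiner query = new StringJoiner("&", "?", "");
            options.forEach((name, value) -> query.add(name + "=" + value));
            uri.append(query.toString());
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        // Assumed values: wait 1 second before the first reconnect attempt
        options.put("initialReconnectDelay", "1000");
        // In ActiveMQ 5.6 and later, -1 should mean "retry indefinitely"
        options.put("maxReconnectAttempts", "-1");

        // broker-host:61616 is a placeholder for the real broker address
        System.out.println(build(List.of("tcp://broker-host:61616"), options));
    }
}
```

The resulting string would be passed to the client in place of the plain `tcp://` URL, for example as the argument to `new ActiveMQConnectionFactory(...)`, so that producers and consumers reconnect on their own after a broker restart.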

On Sunday, June 2, 2013, fenbers wrote:

>
>     I don't know how to determine the NFS version but we are running on
>     RHEL 5.5.
>
>     I have not checked the syslog.  Thanks for the tip.  I will do that
>     after our morning Operations.
>
>     We are also very inclined to believe this is an NFS issue, based on
>     behaviors network-wide which have nothing to do with ActiveMQ, e.g.,
>     often taking 10 seconds to list just 5 files in an NFS-mounted
>     directory.
>
>     So, we are creating an action plan this weekend to eliminate as many
>     NFS mount points as possible, and seeing how that helps the
>     situation.  The plan needs approval/buy-in from key people to be
>     implemented, so it may be a couple of weeks to implement the plan.
>     In the meantime, ActiveMQ either shuts itself down or behaves in
>     rather despondent ways, so we find we are having to restart ActiveMQ
>     every 3 or 4 hours (and this frequency is slowly increasing).
>
>     Once ActiveMQ is rebooted, we find that our producers and our
>     consumers have to be shut down and relaunched in order to
>     reestablish the connection with ActiveMQ.  This is a royal pain!
>     However, a producer will throw an exception whenever it tries to
>     send a message through a lost connection, and so I catch the
>     exception where I close the connection and reopen it.  Thus, my
>     producers are able to reconnect automatically in the event ActiveMQ
>     is restarted.
>
>     But with the consumers, no exception is thrown as it waits for
>     notifications.  It simply waits for a notification that never
>     happens after the connection with ActiveMQ is lost.  So what is your
>     recommended method for a consumer to check for a disconnection?
>     (Maybe I should post this question as a separate thread...)
>
>     Mark
>
>
>     On 5/29/2013 3:21 AM, rajdavies [via ActiveMQ] wrote:
>
>      Ultimately I'm pretty confident this problem is an NFS problem - and
>      as Johan has already let the cat out of the bag ;) - let me ask the
>      following:
>
>
>        Which version of NFS 4 are you using and which environment?
>
>        Have you checked the system logs for NFS errors on all the
>       machines running ActiveMQ brokers ?
>
>
>       thanks,
>
>
>       Rob
>
>
>       On 29 May 2013, at 00:46, Christian Posta <[hidden email]> wrote:
>
>
>         > I can make two recommendations.
>         >
>         > #1, being the preferred, create a test case that shows this... that will
>         > give us the best chance of finding out what's going on... take a look at
>         > the following test cases in the activemq source code to give you an idea
>         > about how to go about doing it...
>         >
>         > http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/usecases/
>         >
>         > http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/bugs/
>         >
>         > http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/test/JmsTopicSendReceiveTest.java?view=markup
>         >
>         > #2, if creating a test case doesn't sound like something you want to get
>         > into.. i guess, give us the exact configs of broker, clients, number of
>         > consumers, number of topics, message sizes, etc, etc all details and if one
>         > of us gets the urge we can try it out on our boxes. this will not be nearly
>         > as good as #1, and will provide a higher barrier to entry because we spend
>         > our spare time doing this and like to spend that time debugging and fixing,
>         > and not setting up environments and usecases which may not even show a bug
>         > :)
>         >
>         > On Tue, May 28, 2013 at 4:34 PM, fenbers <[hidden email]> wrote:
>
>         >>    I'm getting the Sync exception on both, local and NFS.  Originally,
>         >>    I was only using a local disk, but there wasn't much disk space for
>         >>    the ever growing list of 33MB enumerated .log files that weren't
>         >>    cleaned up.  So I reconfigured ActiveMQ to put these db files on an
>         >>    NFS mount.  But the sync exceptions occurred either way.
>         >>
>         >>    I've changed *all* my consumers to AUTO_ACKNOWLEDGE, thinking that
>         >>    maybe an ACKNOWLEDGEment leak was causing the undeleted files.  That
>         >>    didn't help...  The TRACE level logging points to only two of my 5
>         >>    topics that accumulate these undeleted db files.  So I've
>         >>    concentrated my scrutiny over consumers of these two topics.  But
>         >>    have not found anything out of the ordinary.
>         >>
>         >>    What is puzzling me still, is that the frequency of the log file
>         >>    build-up and the frequency of exceptions continues to increase even
>         >>    though the amount of messages sent per day by the producers remains
>         >>    nearly constant...
>         >>
>         >>    Mark
>         >>
>         >>    On 5/28/2013 6:06 PM, ceposta [via ActiveMQ] wrote:
>
>         >>      Sounds like there's multiple issues...
>         >>
>         >>      Your journal files aren't being cleaned up, AND you're getting
>         >>      the Sync exception?
>         >>
>         >>      You get the sync exception on local disk mount? Or just NFS?
>         >>
>         >>      If the journals aren't being cleaned up, are your consumers
>         >>      properly ack'ing messages?
>         >>
>         >>      On Tue, May 28, 2013 at 2:42 PM, fenbers <[hidden email]> wrote:
>
>         >>        > I would LOVE to help you help me!  But I have no idea how to go
>         >>        > about making a test case.  If you could drop some hints in this
>         >>        > regard, I might be able to produce one.
>         >>        >
>         >>        > My ActiveMQ issues seem to be related to network slowness, which we
>         >>        > are diagnosing separately.  Or maybe it is the other way around,
>         >>        > where ActiveMQ problems are causing network sluggishness.  Either
>         >>        > way, there seems to be a correlation, except that when network
>         >>        > responsiveness improves, ActiveMQ does not.
>         >>        >
>         >>        > The problem I'm having with AMQ is progressive, which is even more
>         >>        > puzzling, because we are not adding to the number of messages that
>         >>        > AMQ has to handle.  Today, we were up to 191 undeleted db-NNN.log
>         >>        > files in the database directory before I stopped AMQ and deleted
>         >>        > them.   NNN was up to 451, so 260 files had been cleaned up by AMQ's
>         >>        > automatic processes...
>         >>        >
>         >>        > Will log files assist you in helping me?  I have TRACE level
>         >>        > messages turned on, so they are quite large.



-- 
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
