You should check out the failover transport to handle reconnecting.

On Sunday, June 2, 2013, fenbers wrote:
> I don't know how to determine the NFS version, but we are running on
> RHEL 5.5.
>
> I have not checked the syslog. Thanks for the tip. I will do that
> after our morning Operations.
>
> We are also very inclined to believe this is an NFS issue, based on
> behaviors network-wide which have nothing to do with ActiveMQ, e.g.,
> often taking 10 seconds to list just 5 files in an NFS-mounted
> directory.
>
> So, we are creating an action plan this weekend to eliminate as many
> NFS mount points as possible, and seeing how that helps the
> situation. The plan needs approval/buy-in from key people to be
> implemented, so it may be a couple of weeks to implement the plan.
>
> In the meantime, ActiveMQ either shuts itself down or behaves in
> rather despondent ways, so we find we are having to restart ActiveMQ
> every 3 or 4 hours (and this frequency is slowly increasing).
>
> Once ActiveMQ is rebooted, we find that our producers and our
> consumers have to be shut down and relaunched in order to
> reestablish the connection with ActiveMQ. This is a royal pain!
>
> However, a producer will throw an exception whenever it tries to
> send a message through a lost connection, so I catch the exception,
> close the connection, and reopen it. Thus, my producers are able to
> reconnect automatically in the event ActiveMQ is restarted.
>
> But with the consumers, no exception is thrown while they wait for
> notifications. A consumer simply waits for a notification that never
> happens after the connection with ActiveMQ is lost. So what is your
> recommended method for a consumer to check for a disconnection?
> (Maybe I should post this question as a separate thread...)
>
> Mark
>
> On 5/29/2013 3:21 AM, rajdavies [via ActiveMQ] wrote:
> > Ultimately I'm pretty confident this problem is an NFS problem - and
> > as Johan has already let the cat out of the bag ;) - let me ask the
> > following:
> >
> > Which version of NFS 4 are you using and which environment?
> >
> > Have you checked the system logs for NFS errors on all the machines
> > running ActiveMQ brokers?
> >
> > thanks,
> >
> > Rob
> >
> > On 29 May 2013, at 00:46, Christian Posta <[hidden email]> wrote:
> > > I can make two recommendations.
> > >
> > > #1, being the preferred, create a test case that shows this... that
> > > will give us the best chance of finding out what's going on... take
> > > a look at the following test cases in the activemq source code to
> > > give you an idea about how to go about doing it...
> > >
> > > http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/usecases/
> > >
> > > http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/bugs/
> > >
> > > http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/test/JmsTopicSendReceiveTest.java?view=markup
> > >
> > > #2, if creating a test case doesn't sound like something you want to
> > > get into.. I guess, give us the exact configs of broker, clients,
> > > number of consumers, number of topics, message sizes, etc, etc all
> > > details and if one of us gets the urge we can try it out on our
> > > boxes.
> > > This will not be nearly as good as #1, and will provide a higher
> > > barrier to entry because we spend our spare time doing this and like
> > > to spend that time debugging and fixing, and not setting up
> > > environments and usecases which may not even show a bug :)
> > >
> > > On Tue, May 28, 2013 at 4:34 PM, fenbers <[hidden email]> wrote:
> > > > I'm getting the Sync exception on both local and NFS.
> > > > Originally, I was only using a local disk, but there wasn't much
> > > > disk space for the ever growing list of 33MB enumerated .log files
> > > > that weren't cleaned up. So I reconfigured ActiveMQ to put these
> > > > db files on an NFS mount. But the sync exceptions occurred either
> > > > way.
> > > >
> > > > I've changed *all* my consumers to AUTO_ACKNOWLEDGE, thinking that
> > > > maybe an ACKNOWLEDGEment leak was causing the undeleted files.
> > > > That didn't help... The TRACE level logging points to only two of
> > > > my 5 topics that accumulate these undeleted db files. So I've
> > > > concentrated my scrutiny on the consumers of these two topics. But
> > > > I have not found anything out of the ordinary.
> > > >
> > > > What is puzzling me still is that the frequency of the log file
> > > > build-up and the frequency of exceptions continue to increase
> > > > even though the amount of messages sent per day by the producers
> > > > remains nearly constant...
> > > >
> > > > Mark
> > > >
> > > > On 5/28/2013 6:06 PM, ceposta [via ActiveMQ] wrote:
> > > > > Sounds like there's multiple issues...
> > > > >
> > > > > Your journal files aren't being cleaned up, AND you're getting the
> > > > > Sync exception?
> > > > >
> > > > > You get the sync exception on local disk mount? Or just NFS?
> > > > >
> > > > > If the journals aren't being cleaned up, are your consumers
> > > > > properly ack'ing messages?
> > > > >
> > > > > On Tue, May 28, 2013 at 2:42 PM, fenbers <[hidden email]> wrote:
> > > > > > I would LOVE to help you help me! But I have no idea how to go
> > > > > > about making a test case. If you could drop some hints in this
> > > > > > regard, I might be able to produce one.
> > > > > >
> > > > > > My ActiveMQ issues seem to be related to network slowness, which
> > > > > > we are diagnosing separately. Or maybe it is the other way
> > > > > > around, where ActiveMQ problems are causing network
> > > > > > sluggishness. Either way, there seems to be a correlation,
> > > > > > except that when network responsiveness improves, ActiveMQ does
> > > > > > not.
> > > > > >
> > > > > > The problem I'm having with AMQ is progressive, which is even
> > > > > > more puzzling, because we are not adding to the number of
> > > > > > messages that AMQ has to handle. Today, we were up to 191
> > > > > > undeleted db-NNN.log files in the database directory before I
> > > > > > stopped AMQ and deleted them. NNN was up to 451, so 260 files
> > > > > > had been cleaned up by AMQ's automatic processes...
> > > > > >
> > > > > > Will log files assist you in helping me? I have TRACE level
> > > > > > messages turned on, so they are quite large.

--
*Christian Posta*
http://www.christianposta.com/blog
twitter: @christianposta
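
To make the failover suggestion at the top of this thread a bit more concrete, here is a minimal sketch of assembling a failover broker URL on the client side. The helper class FailoverUrl is hypothetical (it is not part of ActiveMQ), and the host name, port and option values are placeholders, not settings taken from this thread. The option names initialReconnectDelay, maxReconnectDelay and maxReconnectAttempts are failover transport options from the ActiveMQ documentation; their defaults and exact semantics vary between releases, so please check them against the version you run.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of ActiveMQ) that assembles a failover: broker URL. */
class FailoverUrl {

    /**
     * Builds "failover:(uri1,uri2,...)?key=value&..." from the given broker
     * URIs and failover options. Options are appended in iteration order.
     */
    static String build(Iterable<String> brokerUris, Map<String, String> options) {
        StringJoiner brokers = new StringJoiner(",", "failover:(", ")");
        for (String uri : brokerUris) {
            brokers.add(uri);
        }
        if (options.isEmpty()) {
            return brokers.toString();
        }
        StringJoiner query = new StringJoiner("&", "?", "");
        for (Map.Entry<String, String> option : options.entrySet()) {
            query.add(option.getKey() + "=" + option.getValue());
        }
        return brokers.toString() + query.toString();
    }

    public static void main(String[] args) {
        // All values below are placeholders chosen for illustration only.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("initialReconnectDelay", "1000"); // ms to wait before the first retry
        options.put("maxReconnectDelay", "30000");    // upper bound on the wait between retries
        options.put("maxReconnectAttempts", "-1");    // assumed: -1 keeps retrying (check your version)

        String url = build(Arrays.asList("tcp://broker-host:61616"), options);
        System.out.println(url);

        // With the ActiveMQ client on the classpath, the URL would be passed on like this:
        //   ConnectionFactory factory = new ActiveMQConnectionFactory(url);
    }
}
```

Producers and consumers would both create their connections from a factory configured with this URL instead of a plain tcp:// URL. As far as I know, the failover transport then reconnects after a broker restart and restores the sessions and consumers on that connection, which is what should address the silent consumer case described above. If you also want to be told about connection failures, the standard JMS hook is an ExceptionListener registered with Connection.setExceptionListener.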