Re: Out of memory running ZK unit tests against trunk
I tracked this down to a low ulimit setting on the particular Jenkins host where this was failing (max processes).

Specifically, the following test was failing on trunk but not on branch 3_3, which concerns me:
./src/java/test/org/apache/zookeeper/test/QuorumZxidSyncTest.java

There haven't been any real changes to this test between versions. Any insight into why the server is using more threads in trunk vs branch 3_3?

Patrick

On Fri, Jul 22, 2011 at 10:58 AM, Patrick Hunt wrote:
> I've never seen this before, but in my CI environment (sun jdk
> 1.6.0_20) I'm seeing some intermittent failures such as the following.
>
> Has anyone added/modified tests for 3.4.0 that might be using more
> threads/memory than previously? Creating ZK clients but not closing
> them, etc...
>
> java.lang.OutOfMemoryError: unable to create new native thread
>     at java.lang.Thread.start0(Native Method)
>     at java.lang.Thread.start(Thread.java:597)
>     at org.apache.zookeeper.server.NIOServerCnxnFactory.start(NIOServerCnxnFactory.java:114)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:406)
>     at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:186)
>     at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
>     at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
>     at org.apache.zookeeper.test.QuorumZxidSyncTest.setUp(QuorumZxidSyncTest.java:39)
>
> Patrick
Re: Out of memory running ZK unit tests against trunk
Nice find, Pat. I can't see a reason why that should happen. Can we just do a stack dump and compare?

thanks
mahadev

On Thu, Jul 28, 2011 at 1:54 PM, Patrick Hunt wrote:
> I tracked this down to a low ulimit setting on the particular jenkins
> host where this was failing (max processes).
>
> Specifically the following test was failing on trunk, but not on
> branch 3_3, which concerns me
> ./src/java/test/org/apache/zookeeper/test/QuorumZxidSyncTest.java
>
> there haven't been any real changes to this test between versions, any
> insight into why the server is using more threads in trunk vs
> branch33?
>
> Patrick
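A stack dump comparison of this kind could be sketched in plain Java roughly as follows. This is only an illustration, not code from the ZooKeeper tree: `ThreadCensus`, `family`, `snapshot` and `growth` are hypothetical names, and the only JDK facility relied on is `Thread.getAllStackTraces()`. The idea is to group live threads by name (with digit runs collapsed so numbered threads fall into one bucket), take one snapshot before the suspect code and one after, and report which groups grew.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not part of ZooKeeper): counts live JVM threads grouped by name. */
final class ThreadCensus {
    private ThreadCensus() {}

    /** Collapses every run of digits to '#', so "worker-1" and "worker-12" share one bucket. */
    static String family(String threadName) {
        return threadName.replaceAll("\\d+", "#");
    }

    /** Returns the number of currently live threads per name family. */
    static Map<String, Integer> snapshot() {
        Map<String, Integer> counts = new TreeMap<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            counts.merge(family(t.getName()), 1, Integer::sum);
        }
        return counts;
    }

    /** Returns only the families that have more threads in 'after' than in 'before'. */
    static Map<String, Integer> growth(Map<String, Integer> before, Map<String, Integer> after) {
        Map<String, Integer> grown = new TreeMap<>();
        for (Map.Entry<String, Integer> e : after.entrySet()) {
            int delta = e.getValue() - before.getOrDefault(e.getKey(), 0);
            if (delta > 0) {
                grown.put(e.getKey(), delta);
            }
        }
        return grown;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> before = snapshot();

        // Stand-in for the code under suspicion: three daemon threads that never exit.
        for (int i = 0; i < 3; i++) {
            Thread t = new Thread(() -> {
                try {
                    Thread.sleep(Long.MAX_VALUE);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "leaky-worker-" + i);
            t.setDaemon(true); // daemon, so the demo JVM can still exit
            t.start();
        }

        System.out.println("thread families that grew: " + growth(before, snapshot()));
    }
}
```

Taking the two snapshots around the same test on trunk and on branch 3_3 would give directly comparable lists of which thread families accumulate.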
Re: Out of memory running ZK unit tests against trunk
Near the end of this test (QuorumZxidSyncTest) there are tons of threads running: 115 "ProcessThread" threads and similar numbers of SessionTracker threads.

Also, I see ~100 ReadOnlyRequestProcessor threads. Why is this running as a separate thread? (henry/flavio?)

Regardless, I'll enter a 3.4.0 blocker to clean this up. I suspect that the server shutdown is not shutting down fully for some reason.

Patrick

On Thu, Jul 28, 2011 at 5:28 PM, Mahadev Konar wrote:
> Nice find Pat. I cant see a reason on why that should happen. Can we
> just do a stack dump and compare?
>
> thanks
> mahadev
RE: Out of memory running ZK unit tests against trunk
In QuorumPeer, when the peer is in the LOOKING state we start ReadOnlyZooKeeperServer in a separate thread, and we shut this server down even before startup, which has no effect. Also, as startup is not a blocking call, QP keeps on spawning new servers.

1) ReadOnlyZooKeeperServer.startup() need not be called in a separate thread.
2) ReadOnlyZooKeeperServer.startup() is not a blocking call. We need to introduce a blocking method like Leader.lead() or Follower.followLeader().
3) Shutdown should be called only after the above-mentioned blocking call has returned.

-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org]
Sent: Friday, July 29, 2011 6:24 AM
To: dev@zookeeper.apache.org
Subject: Re: Out of memory running ZK unit tests against trunk

Near the end of this test (QuorumZxidSyncTest) there are tons of
threads running - 115 "ProcessThread" threads, similar numbers of
SessionTracker.

Also I see ~100 ReadOnlyRequestProcessor - why is this running as a
separate thread? (henry/flavio?)

Regardless, I'll enter a 3.4.0 blocker to clean this up - I suspect
that the server shutdown is not shutting down fully for some reason.

Patrick
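The failure mode described in this message, and the proposed ordering, can be illustrated with a toy model. Everything here is hypothetical: `ToyServer`, `serveWhile` and `LifecycleDemo` are made-up names standing in for a server that owns one worker thread, and none of it is the actual QuorumPeer or ReadOnlyZooKeeperServer code. The first loop calls `shutdown()` before a `startup()` that happens on another thread, so each round leaves a worker behind; the second loop starts the server, runs the blocking work, and only then shuts down.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy stand-in for a server that owns one worker thread. Hypothetical; not ZooKeeper code. */
final class ToyServer {
    /** Worker threads that have been started and have not yet exited. */
    static final AtomicInteger LIVE_WORKERS = new AtomicInteger();

    private final CountDownLatch stop = new CountDownLatch(1);
    private Thread worker;

    /** Non-blocking: starts the worker thread and returns immediately. */
    synchronized void startup() {
        LIVE_WORKERS.incrementAndGet();
        worker = new Thread(() -> {
            try {
                stop.await(); // park until shutdown() releases us
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                LIVE_WORKERS.decrementAndGet();
            }
        }, "toy-worker");
        worker.setDaemon(true); // daemon, so leaked workers do not keep the demo JVM alive
        worker.start();
    }

    /** Stops the worker if one exists; a call made before startup() has no effect. */
    synchronized void shutdown() throws InterruptedException {
        if (worker == null) {
            return; // nothing has been started yet, so there is nothing to stop
        }
        stop.countDown();
        worker.join();
    }

    /** Blocking variant: the worker lives only while blockingWork runs. */
    void serveWhile(Runnable blockingWork) throws InterruptedException {
        startup();
        try {
            blockingWork.run();
        } finally {
            shutdown(); // reached only after the blocking call has returned
        }
    }
}

final class LifecycleDemo {
    /** Shutdown runs before a startup performed on another thread; returns workers left behind. */
    static int leakyRounds(int rounds) throws InterruptedException {
        int before = ToyServer.LIVE_WORKERS.get();
        for (int i = 0; i < rounds; i++) {
            ToyServer server = new ToyServer();
            Thread starter = new Thread(server::startup, "toy-starter");
            server.shutdown(); // runs first: no effect
            starter.start();   // startup happens afterwards and is never undone
            starter.join();
        }
        return ToyServer.LIVE_WORKERS.get() - before;
    }

    /** Startup, blocking work, then shutdown, all on the calling thread; returns workers left behind. */
    static int blockingRounds(int rounds) throws InterruptedException {
        int before = ToyServer.LIVE_WORKERS.get();
        for (int i = 0; i < rounds; i++) {
            new ToyServer().serveWhile(() -> { /* stand-in for the blocking work */ });
        }
        return ToyServer.LIVE_WORKERS.get() - before;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("left behind (shutdown before startup): " + leakyRounds(5));
        System.out.println("left behind (blocking, then shutdown): " + blockingRounds(5));
    }
}
```

In this model the first loop leaves one worker behind per round and the second leaves none, which matches the ordering argued for in points 1) to 3); whether the real fix should take exactly this shape is a separate question for the JIRA.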
Re: Out of memory running ZK unit tests against trunk
Hi Laxman, do you want to take a stab at it?
https://issues.apache.org/jira/browse/ZOOKEEPER-1140

Can you follow up with Flavio/Henry about the read-only issue? Shouldn't such a feature only be active when R/O support is enabled? (My assumption is that it should be off by default and turned on via a configuration option.)

Patrick

On Fri, Jul 29, 2011 at 7:00 AM, Laxman wrote:
> In QuorumPeer, when the peer is in LOOKING state we are starting
> ReadOnlyZooKeeperServer in a separate thread. And we are shutting down this
> server even before startup which has no effect. Also, as this is not a
> blocking call QP keeps on spawning new servers.
>
> 1) ReadOnlyZooKeeperServer.startup() need not be called in separate a
> thread.
> 2) ReadOnlyZooKeeperServer.startup() is not a blocking call. Need to
> introduce a method like Leader.lead(), Follower.followLeader()
> 3) Shutdown should be called only after the a/m blocking call is returned.