Re: Out of memory running ZK unit tests against trunk

2011-07-28 Thread Patrick Hunt
I tracked this down to a low ulimit setting (max processes) on the
particular Jenkins host where this was failing.

Specifically, the following test was failing on trunk but not on
branch 3_3, which concerns me:
./src/java/test/org/apache/zookeeper/test/QuorumZxidSyncTest.java

There haven't been any real changes to this test between versions. Any
insight into why the server is using more threads in trunk than in
branch 3_3?

Patrick

On Fri, Jul 22, 2011 at 10:58 AM, Patrick Hunt wrote:
> I've never seen this before, but in my CI environment (sun jdk
> 1.6.0_20) I'm seeing some intermittent failures such as the following.
>
> Has anyone added/modified tests for 3.4.0 that might be using more
> threads/memory than previously? Creating ZK clients but not closing
> them, etc...
>
> java.lang.OutOfMemoryError: unable to create new native thread
>       at java.lang.Thread.start0(Native Method)
>       at java.lang.Thread.start(Thread.java:597)
>       at org.apache.zookeeper.server.NIOServerCnxnFactory.start(NIOServerCnxnFactory.java:114)
>       at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:406)
>       at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:186)
>       at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103)
>       at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67)
>       at org.apache.zookeeper.test.QuorumZxidSyncTest.setUp(QuorumZxidSyncTest.java:39)
>
> Patrick
>


Re: Out of memory running ZK unit tests against trunk

2011-07-28 Thread Mahadev Konar
Nice find, Pat. I can't see a reason why that should happen. Can we
just do a stack dump and compare?

thanks
mahadev

On Thu, Jul 28, 2011 at 1:54 PM, Patrick Hunt wrote:
> I tracked this down to a low ulimit setting (max processes) on the
> particular Jenkins host where this was failing.
>
> Specifically, the following test was failing on trunk but not on
> branch 3_3, which concerns me:
>    ./src/java/test/org/apache/zookeeper/test/QuorumZxidSyncTest.java
>
> There haven't been any real changes to this test between versions. Any
> insight into why the server is using more threads in trunk than in
> branch 3_3?
>
> Patrick


Re: Out of memory running ZK unit tests against trunk

2011-07-28 Thread Patrick Hunt
Near the end of this test (QuorumZxidSyncTest) there are tons of
threads running: 115 "ProcessThread" threads and a similar number of
SessionTracker threads.

I also see ~100 ReadOnlyRequestProcessor threads. Why is this running
as a separate thread? (Henry/Flavio?)

Regardless, I'll enter a 3.4.0 blocker to clean this up. I suspect
that the server is not shutting down fully for some reason.
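
For anyone who wants to reproduce per-type counts like these from inside
a test JVM, here is a minimal sketch that groups live threads by the
leading letters of their names. It assumes thread names begin with their
type (for example a name starting with "ProcessThread"); the ThreadCensus
class and its methods are hypothetical helpers, not ZooKeeper code, and
the sketch needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of ZooKeeper.
final class ThreadCensus {

    // Reduce a thread name to its leading run of letters, so that
    // "ProcessThread(sid:0)" and "ProcessThread(sid:1)" share one key.
    // Names that do not start with a letter are kept unchanged.
    static String family(String threadName) {
        int end = 0;
        while (end < threadName.length()
                && Character.isLetter(threadName.charAt(end))) {
            end++;
        }
        return end == 0 ? threadName : threadName.substring(0, end);
    }

    // Count how many of the given names fall into each family.
    static Map<String, Integer> countByFamily(Iterable<String> names) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : names) {
            counts.merge(family(name), 1, Integer::sum);
        }
        return counts;
    }

    // Snapshot the names of all live threads in this JVM.
    static List<String> liveThreadNames() {
        List<String> names = new ArrayList<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Print one "count<TAB>family" line per thread family.
        countByFamily(liveThreadNames())
                .forEach((fam, n) -> System.out.println(n + "\t" + fam));
    }
}
```

Calling countByFamily(liveThreadNames()) at the end of the same test on
trunk and on branch 3_3 should make it easy to see which thread family
grows between the two.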

Patrick

On Thu, Jul 28, 2011 at 5:28 PM, Mahadev Konar wrote:
> Nice find, Pat. I can't see a reason why that should happen. Can we
> just do a stack dump and compare?
>
> thanks
> mahadev


RE: Out of memory running ZK unit tests against trunk

2011-07-29 Thread Laxman
In QuorumPeer, when the peer is in the LOOKING state, we start
ReadOnlyZooKeeperServer in a separate thread. We then shut this server
down even before it has started, which has no effect. Also, because
startup is not a blocking call, QP keeps on spawning new servers.

1) ReadOnlyZooKeeperServer.startup() need not be called in a separate
thread.
2) ReadOnlyZooKeeperServer.startup() is not a blocking call. We need to
introduce a blocking method like Leader.lead() or
Follower.followLeader().
3) Shutdown should be called only after the above-mentioned blocking
call has returned.
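
The three points above can be sketched roughly as follows. This is only
an illustration of the start / block / shutdown ordering, under the
assumption that some event ends the LOOKING state; StubServer,
awaitStop(), requestStop() and runUntilStopped() are hypothetical names
and none of this is the actual ZooKeeper code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these types are ZooKeeper classes.
final class BlockingLifecycleSketch {

    /** Stand-in for a server whose startup() returns immediately. */
    static final class StubServer {
        // Number of servers started but not yet shut down.
        static final AtomicInteger LIVE = new AtomicInteger();
        private final CountDownLatch stopped = new CountDownLatch(1);

        void startup() { LIVE.incrementAndGet(); }
        // Hypothetical blocking call, in the spirit of a lead()-style loop.
        void awaitStop() throws InterruptedException { stopped.await(); }
        void requestStop() { stopped.countDown(); }
        void shutdown() { LIVE.decrementAndGet(); }
    }

    /**
     * Start the server in the caller's thread, block until it is asked
     * to stop, and shut it down only after the blocking call returns.
     */
    static void runUntilStopped(StubServer server) {
        server.startup();
        try {
            server.awaitStop();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        } finally {
            server.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int round = 1; round <= 3; round++) {
            final StubServer server = new StubServer();
            // Simulates the event that ends the waiting state.
            Thread election = new Thread(server::requestStop, "election-" + round);
            election.start();
            runUntilStopped(server);
            System.out.println("round " + round + ": live servers = "
                    + StubServer.LIVE.get());
        }
    }
}
```

Because the caller blocks until the server is told to stop, each pass
through the loop leaves no server behind before the next one is created.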


-----Original Message-----
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Friday, July 29, 2011 6:24 AM
To: dev@zookeeper.apache.org
Subject: Re: Out of memory running ZK unit tests against trunk

Near the end of this test (QuorumZxidSyncTest) there are tons of
threads running - 115 "ProcessThread" threads, similar numbers of
SessionTracker.

Also I see ~100 ReadOnlyRequestProcessor - why is this running as a
separate thread? (henry/flavio?)

Regardless, I'll enter a 3.4.0 blocker to clean this up - I suspect
that the server shutdown is not shutting down fully for some reason.

Patrick




Re: Out of memory running ZK unit tests against trunk

2011-07-29 Thread Patrick Hunt
Hi Laxman, do you want to take a stab at it?
https://issues.apache.org/jira/browse/ZOOKEEPER-1140

Can you follow up with Flavio/Henry about the read-only issue? Shouldn't
such a feature only be active when R/O support is enabled? (My
assumption is that it should be off by default and turned on via a
configuration option.)

Patrick

On Fri, Jul 29, 2011 at 7:00 AM, Laxman wrote:
> In QuorumPeer, when the peer is in the LOOKING state, we start
> ReadOnlyZooKeeperServer in a separate thread. We then shut this server
> down even before it has started, which has no effect. Also, because
> startup is not a blocking call, QP keeps on spawning new servers.
>
> 1) ReadOnlyZooKeeperServer.startup() need not be called in a separate
> thread.
> 2) ReadOnlyZooKeeperServer.startup() is not a blocking call. We need to
> introduce a blocking method like Leader.lead() or
> Follower.followLeader().
> 3) Shutdown should be called only after the above-mentioned blocking
> call has returned.