Re: New strategy for Netty (ZOOKEEPER-823) was: What's the QA strategy of ZooKeeper?

2010-10-17 Thread Patrick Hunt
On Sat, Oct 16, 2010 at 1:56 AM, Thomas Koch  wrote:

> Benjamin Reed:
> >   actually, the other way of doing the netty patch (since i'm scared of
> > merges) would be to do a refactor cleanup patch with an eye toward
> > netty, and then another patch to actually add netty. [...]
>

Ben you really need to give git a try and stop fearing the branch/merge. ;-)

Seriously though, having a branch is not a big deal. In the end you can
create one or more patches if you like and apply them, but this is
essentially just a merge.

My main concern personally is that a branch not go on for too long or get
too big, ie incorporate too many changes, not focused. I believe that's not
the case here though. Thomas would focus on 1) refactoring the client code
to enable netty integration, 2) integrate netty changes. He'd also be adding
3) significant tests (potentially refactoring some code to better allow
"design for test") to ensure that the code changes (incl refactoring) don't
break anything.

For the record I'll add that this is pretty much what I did when creating
this patch in the first place. Because it was not done on a svn branch, and
it's just a big "patch ball" you can't see that. Also my goals were a bit
different from Thomas's (which I'm fine with in principle).


> I've had exactly the same thought last evening. Instead of trying to find
> the
> bug(s) in the current patch, I'd like to start it over again and do small
> incremental changes from the current trunk towards the current
> ZOOKEEPER-823
> patch.
> Maybe I could do this in ZOOKEEPER-823 patch, this would mean to revert the
> already applied ZOOKEEPER-823 patch.
>

Thomas, did you mean to say "do this in ZOOKEEPER-823 *branch*"?


> Then I want to test each incremental step at least 5 times to find the
> step(s)
> that breaks ZK.
> This approach should take me another two weeks, I believe, mostly because
> each
> Test run takes ~15-25 minutes.
>

This sounds like a reasonable plan to me if you want to try your hand at it.
I also appreciate you stepping up on this effort.

Unfortunately only committers can commit to apache SVN. Which means that one
of us (ben/f/m/h/myself) will have to apply your change to the branch.
You'll have to bug one of us when you're ready to apply a new patch to the
branch. If you can create a new patch (rather than changing the original)
that would be a good idea (easier for us to apply). Shouldn't be much of an
issue I assume if you're using git personally. Notice that I've already
set up a hudson job that pulls from the branch.
https://hudson.apache.org/hudson/view/ZooKeeper/job/ZooKeeper_branch_823/

Regards,

Patrick


Re: Restarting discussion on ZooKeeper as a TLP

2010-10-17 Thread 明珠刘
+1

2010/10/14 Patrick Hunt 

> In March of this year we discussed a request from the Apache Board, and
> Hadoop PMC, that we become a TLP rather than a subproject of Hadoop:
>
> Original discussion
> http://markmail.org/thread/42cobkpzlgotcbin
>
> I originally voted against this move, my primary concern being that we were
> not "ready" to move to tlp status given our small contributor base and
> limited contributor diversity. However I'd now like to revisit that
> discussion/decision. Since that time the team has been working hard to
> attract new contributors, and we've seen significant new contributions come
> in. There has also been feedback from board/pmc addressing many of these
> concerns (both on the list and in private). I am now less concerned about
> this issue and don't see it as a blocker for us to move to TLP status.
>
> A second concern was that by becoming a TLP the project would lose its
> connection with Hadoop, a big source of new users for us. I've been assured
> (and you can see with the other projects that have moved to tlp status;
> pig/hive/hbase/etc...) that this connection will be maintained. The Hadoop
> ZooKeeper tab for example will redirect to our new homepage.
>
> Other Apache members also pointed out to me that we are essentially
> operating as a TLP within the Hadoop PMC. Most of the other PMC members
> have
> little or no experience with ZooKeeper and this makes it difficult for them
> to monitor and advise us. By moving to TLP status we'll be able to govern
> ourselves and better set our direction.
>
> I believe we are ready to become a TLP. Please respond to this email with
> your thoughts and any issues. I will call a vote in a few days, once
> discussion settles.
>
> Regards,
>
> Patrick
>


Re: Running a single unit test

2010-10-17 Thread Henry Robinson
You need to use -Dtestcase, not -Dtest, as per below:

ant test -Dtestcase=YourTestHere

HTH,

Henry

On 17 October 2010 17:34, Michi Mutsuzaki  wrote:

> Hello,
>
> How do I run a single unit test? I tried this:
>
> $ ant test -Dtest=SessionTest
>
> but it still runs all the tests.
>
> Thanks!
> --Michi
>
>


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679


[jira] Commented: (ZOOKEEPER-794) Callbacks are not invoked when the client is closed

2010-10-17 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921939#action_12921939
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-794:
---

ZOOKEEPER-794_5.patch.txt doesn't compile. I'm getting these errors:

[javac] branch-3.3/src/java/test/org/apache/zookeeper/test/SessionTest.java:201: cannot find symbol
[javac] symbol : variable Assert
[javac] location: class org.apache.zookeeper.test.SessionTest
[javac] Assert.fail("Should have received a SessionExpiredException");
[javac] ^
[javac] branch-3.3/src/java/test/org/apache/zookeeper/test/SessionTest.java:217: cannot find symbol
[javac] symbol : variable Assert
[javac] location: class org.apache.zookeeper.test.SessionTest
[javac] Assert.assertEquals(KeeperException.Code.SESSIONEXPIRED.toString(), cb.toString());
[javac] ^

We need to either:

a. import org.junit.Assert, or
b. Use fail/assertEquals instead of Assert.fail/Assert.assertEquals.

--Michi

> Callbacks are not invoked when the client is closed
> ---
>
> Key: ZOOKEEPER-794
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-794
> Project: Zookeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.3.1
>Reporter: Alexis Midon
>Assignee: Alexis Midon
>Priority: Blocker
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-794.patch.txt, ZOOKEEPER-794.txt, 
> ZOOKEEPER-794_2.patch, ZOOKEEPER-794_3.patch, ZOOKEEPER-794_4.patch.txt, 
> ZOOKEEPER-794_5.patch.txt
>
>
> I noticed that ZooKeeper has different behaviors when calling synchronous or 
> asynchronous actions on a closed ZooKeeper client.
> Actually a synchronous call will throw a "session expired" exception while an 
> asynchronous call will do nothing. No exception, no callback invocation.
> Actually, even if the EventThread receives the Packet with the session 
> expired err code, the packet is never processed since the thread has been 
> killed by the eventOfDeath. So the callback is not invoked.
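
[Editor's sketch] The behavior the report asks for can be illustrated with a
toy stand-in. This is not ZooKeeper client code: ToyClient, DataCallback and
the two-value Code enum are hypothetical names invented for the example, and
there is no networking or event thread. It only shows the contract at issue:
once the client is closed, the synchronous call throws and the asynchronous
call still invokes its callback, with a session-expired code.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ClosedClientCallbackSketch {
    /** Stand-in result codes; the real client reports KeeperException.Code values. */
    enum Code { OK, SESSIONEXPIRED }

    /** Hypothetical callback type, loosely modelled on an async data callback. */
    interface DataCallback {
        void processResult(Code rc, String path);
    }

    /** Toy client: no networking, just the open/closed state that matters here. */
    static class ToyClient {
        private final AtomicBoolean closed = new AtomicBoolean(false);

        void close() {
            closed.set(true);
        }

        /** Synchronous call: fails loudly once the client is closed. */
        String getData(String path) {
            if (closed.get()) {
                throw new IllegalStateException("session expired");
            }
            return "data@" + path;
        }

        /** Asynchronous call: the callback fires on both the open and the closed path. */
        void getData(String path, DataCallback cb) {
            cb.processResult(closed.get() ? Code.SESSIONEXPIRED : Code.OK, path);
        }
    }

    public static void main(String[] args) {
        ToyClient client = new ToyClient();
        client.close();
        // The callback is still invoked after close(), carrying the error code.
        client.getData("/node", (rc, path) -> System.out.println(path + " -> " + rc));
    }
}
```

In the real client the callback would presumably be delivered through the
event thread rather than inline; the sketch skips that detail on purpose.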

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Running a single unit test

2010-10-17 Thread Michi Mutsuzaki
Hello,

How do I run a single unit test? I tried this:

$ ant test -Dtest=SessionTest

but it still runs all the tests.

Thanks!
--Michi



[jira] Updated: (ZOOKEEPER-820) update c unit tests to ensure "zombie" java server processes don't cause failure

2010-10-17 Thread Michi Mutsuzaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michi Mutsuzaki updated ZOOKEEPER-820:
--

Status: Patch Available  (was: Open)

> update c unit tests to ensure "zombie" java server processes don't cause 
> failure
> 
>
> Key: ZOOKEEPER-820
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-820
> Project: Zookeeper
>  Issue Type: Bug
>Affects Versions: 3.3.1
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-820-1.patch, ZOOKEEPER-820.patch, 
> ZOOKEEPER-820.patch, ZOOKEEPER-820.patch
>
>
> When the c unit tests are run, sometimes the server doesn't shut down at the 
> end of the test; this causes subsequent tests (hudson esp) to fail.
> 1) we should try harder to make the server shut down at the end of the test, 
> I suspect this is related to test failing/cleanup
> 2) before the tests are run we should see if the old server is still running 
> and try to shut it down




[jira] Updated: (ZOOKEEPER-820) update c unit tests to ensure "zombie" java server processes don't cause failure

2010-10-17 Thread Michi Mutsuzaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michi Mutsuzaki updated ZOOKEEPER-820:
--

Attachment: ZOOKEEPER-820.patch

Uses which to check if lsof command is present. If it is, use it to see if 
there is a process listening on port 22181 and kill it. 

--Michi
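
[Editor's sketch] The check described above could look roughly like the
following. The patch itself is not shown in this thread and is presumably a
shell change; this is a Java rendering of the same steps, under assumptions.
The class and method names are hypothetical. The lsof options used (-t to
print only PIDs, -iTCP:<port> and -sTCP:LISTEN to match listeners) are an
editorial choice, not taken from the patch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ZombieServerCleanupSketch {
    /** Command that prints only the PIDs (-t) of processes listening on the TCP port. */
    static List<String> lsofCommand(int port) {
        return Arrays.asList("lsof", "-t", "-iTCP:" + port, "-sTCP:LISTEN");
    }

    /** Extracts PIDs from the command output, ignoring anything that is not a number. */
    static List<Long> parsePids(String output) {
        List<Long> pids = new ArrayList<>();
        for (String token : output.trim().split("\\s+")) {
            if (token.matches("\\d+")) {
                pids.add(Long.parseLong(token));
            }
        }
        return pids;
    }

    /** Runs a command; returns its output, or null if it cannot start or exits non-zero. */
    static String run(List<String> command) throws InterruptedException {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = r.readLine()) != null) {
                    out.append(line).append('\n');
                }
            }
            return p.waitFor() == 0 ? out.toString() : null;
        } catch (IOException e) {
            return null;
        }
    }

    /** Signals whatever is listening on the port; returns how many PIDs were signalled. */
    static int killListeners(int port) throws InterruptedException {
        if (run(Arrays.asList("which", "lsof")) == null) {
            return 0; // lsof is not installed, so skip the check
        }
        String output = run(lsofCommand(port));
        if (output == null) {
            return 0; // lsof exits non-zero when nothing matches
        }
        List<Long> pids = parsePids(output);
        for (long pid : pids) {
            run(Arrays.asList("kill", Long.toString(pid)));
        }
        return pids.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // 22181 is the test server port mentioned in the comment above.
        System.out.println("signalled " + killListeners(22181) + " process(es)");
    }
}
```

Restricting the match to listening sockets avoids signalling a client that
merely has a connection open to the port.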




Re: What's the QA strategy of ZooKeeper?

2010-10-17 Thread Patrick Hunt
Hi Vishal, thanks for the list. As you can see when we do find issues we do
our best to address them and increase testing in that area. Unfortunately
our testing regime, while extensive, is not exhaustive. You can see the
clover coverage reports here btw:
https://hudson.apache.org/hudson/view/ZooKeeper/job/ZooKeeper-trunk/clover/

We'd love to see further contributions around testing. Thomas has opened
some discussion around code refactoring, and I'm hopeful that will increase
the coverage and enable "design for test" which we lack in some cases.

Patrick

On Fri, Oct 15, 2010 at 12:24 PM, Vishal K  wrote:

> Hi Patrick,
>
> On Fri, Oct 15, 2010 at 2:22 PM, Patrick Hunt  wrote:
>
> > > Recently, we have run into issues in ZK that I believe should have been
> > > caught by some basic testing before the release
> >
> > Vishal, can you be more specific? Pointing out specific JIRAs that you
> > entered would be very valuable. Don't worry about hurting our feelings or
> > anything; without this type of feedback we can't address the specific
> > issues and their underlying problems.
> >
> >
> Here's a list of a few issues:
> Leader election taking a long time to complete -
> https://issues.apache.org/jira/browse/ZOOKEEPER-822
> Last processed zxid set prematurely while establishing leadership -
> https://issues.apache.org/jira/browse/ZOOKEEPER-790
> FLE implementation should be improved to use non-blocking sockets
> ZOOKEEPER-900
> ZK lets any node to become an observer -
> https://issues.apache.org/jira/browse/ZOOKEEPER-851
>
>
> > Regards,
> >
> > Patrick
> >
> > On Fri, Oct 15, 2010 at 11:14 AM, Mahadev Konar wrote:
> >
> > > Well said Vishal.
> > >
> > > I really like the points you put forth!!!
> > >
> > > Agree on all the points, but again, all the points you mention require
> > > commitment from folks like you. It's a pretty hard task to test all the
> > > corner cases of ZooKeeper. I'd expect everyone to pitch in for testing
> a
> > > release. We should definitely work towards a plan. You should go ahead
> > and
> > > create a jira for the QA plan. We should all pitch in with what all
> > should
> > > be tested.
> > >
> > > Thanks
> > > mahadev
> > >
> > > On 10/15/10 7:32 AM, "Vishal K"  wrote:
> > >
> > > > Hi,
> > > >
> > > > I would like to add my few cents here.
> > > >
> > > > I would suggest to stay away from code cleanup unless it is
> absolutely
> > > > necessary.
> > > >
> > > > I would also like to extend this discussion to understand the amount
> of
> > > > testing/QA to be performed before a release. How do we currently
> > qualify
> > > a
> > > > release?
> > > >
> > > > > Recently, we have run into issues in ZK that I believe should have been
> > > > > caught by some basic testing before the release. I will be honest in
> > > > > saying that,
> > > > unfortunately, these bugs have resulted in questions being raised by
> > > several
> > > > people in our organization about our choice of using ZooKeeper.
> > > > Nevertheless, our product group really thinks that ZK is a cool
> > > technology,
> > > > but we need to focus on making it robust before adding major new
> > features
> > > to
> > > > it.
> > > >
> > > > I would suggest to:
> > > > 1. Look at current bugs and see why existing test did not uncover
> these
> > > bugs
> > > > and improve those tests.
> > > > 2. Look at places that need more tests and broadcast it to the
> > community.
> > > > Follow-up with test development.
> > > > 3. Have a crisp release QA strategy for each release.
> > > > 4. Improve API documentation as well as code documentation so that
> the
> > > API
> > > > usage is clear and debugging is made easier.
> > > >
> > > > Comments?
> > > >
> > > > Thanks.
> > > > -Vishal
> > > >
> > > > On Fri, Oct 15, 2010 at 9:44 AM, Thomas Koch  wrote:
> > > >
> > > >> Hi Benjamin,
> > > >>
> > > >> thank you for your response. Please find some comments inline.
> > > >>
> > > >> Benjamin Reed:
> > > >>>   code quality is important, and there are things we should keep in
> > > >>> mind, but in general i really don't like the idea of risking code
> > > >>> breakage because of a gratuitous code cleanup. we should be
> watching
> > > out
> > > >>> for these things when patches get submitted or when new things go
> in.
> > > >> I didn't want to say it that clear, but especially the new Netty
> code,
> > > both
> > > >> on
> > > >> client and server side is IMHO an example of new code in very bad
> > shape.
> > > >> The
> > > >> client code patch even changes the FindBugs configuration to exclude
> > the
> > > >> new
> > > >> code from the FindBugs checks.
> > > >>
> > > >>> i think this is inline with what pat was saying. just to expand a
> > bit.
> > > >>> in my opinion clean up refactorings have the following problems:
> > > >>>
> > > >>> 1) you risk breaking things in production for a potential future
> > > >>> maintenance advantage.
> > > >> If your code is already in such a bad shape, that every change
> > includes
> > > >> considerable

[jira] Commented: (ZOOKEEPER-804) c unit tests failing due to "assertion cptr failed"

2010-10-17 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921923#action_12921923
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-804:
---

+1.

> I can open a new bug and submit a patch that way if it's preferred. 

No worry, it's not a big deal since this is a one line change. 

Thanks again, Jared!
--Michi

> c unit tests failing due to "assertion cptr failed"
> ---
>
> Key: ZOOKEEPER-804
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-804
> Project: Zookeeper
>  Issue Type: Bug
>  Components: c client
>Affects Versions: 3.4.0
> Environment: gcc 4.4.3, ubuntu lucid lynx, dual core laptop (intel)
>Reporter: Patrick Hunt
>Assignee: Michi Mutsuzaki
>Priority: Critical
> Fix For: 3.3.2, 3.4.0
>
> Attachments: ZOOKEEPER-804-1.patch, ZOOKEEPER-804.patch
>
>
> I'm seeing this frequently:
>  [exec] Zookeeper_simpleSystem::testPing : elapsed 18006 : OK
>  [exec] Zookeeper_simpleSystem::testAcl : elapsed 1022 : OK
>  [exec] Zookeeper_simpleSystem::testChroot : elapsed 3145 : OK
>  [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started : elapsed 25687 : OK
>  [exec] zktest-mt: /home/phunt/dev/workspace/gitzk/src/c/src/zookeeper.c:1952: zookeeper_process: Assertion `cptr' failed.
>  [exec] make: *** [run-check] Aborted
>  [exec] Zookeeper_simpleSystem::testHangingClient
> Mahadev can you take a look?




[jira] Issue Comment Edited: (ZOOKEEPER-901) Redesign of QuorumCnxManager

2010-10-17 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921921#action_12921921
 ] 

Patrick Hunt edited comment on ZOOKEEPER-901 at 10/17/10 7:21 PM:
--

Thoughts regarding netty support? We've been adding netty support to the client 
< - > server connection mechanisms. My intent was to eventually modify the 
server < - > server connections (quorum/election) similarly. You might want to 
consider this when refactoring -- either adding directly or just making sure it 
will be easy(ier) to add netty eventually.

  was (Author: phunt):
Thoughts regarding netty support? We've been adding netty support to the 
client<->server connection mechanisms. My intent was to eventually modify the 
server<->server connections (quorum/election) similarly. You might want to 
consider this when refactoring -- either adding directly or just making sure it 
will be easy(ier) to add netty eventually.
  
> Redesign of QuorumCnxManager
> 
>
> Key: ZOOKEEPER-901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-901
> Project: Zookeeper
>  Issue Type: Improvement
>  Components: leaderElection
>Affects Versions: 3.3.1
>Reporter: Flavio Junqueira
>Assignee: Flavio Junqueira
> Fix For: 3.4.0
>
>
> QuorumCnxManager manages TCP connections between ZooKeeper servers for leader 
> election in replicated mode. We have identified over time a couple of 
> deficiencies that we would like to fix. Unfortunately, fixing these issues 
> requires a little more than just generating a couple of small patches. More 
> specifically, I propose, based on previous discussions with the community, 
> that we reimplement QuorumCnxManager so that we achieve the following:
> # Establishing connections should not be a blocking operation, and perhaps 
> even more important, it shouldn't prevent the establishment of connections 
> with other servers;
> # Using a pair of threads per connection is a little messy, and we have seen 
> issues over time due to the creation and destruction of such threads. A more 
> reasonable approach is to have a single thread and a selector.  




[jira] Commented: (ZOOKEEPER-901) Redesign of QuorumCnxManager

2010-10-17 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921921#action_12921921
 ] 

Patrick Hunt commented on ZOOKEEPER-901:


Thoughts regarding netty support? We've been adding netty support to the 
client<->server connection mechanisms. My intent was to eventually modify the 
server<->server connections (quorum/election) similarly. You might want to 
consider this when refactoring -- either adding directly or just making sure it 
will be easy(ier) to add netty eventually.




Re: Restarting discussion on ZooKeeper as a TLP

2010-10-17 Thread Patrick Hunt
Hmm, on second thought ;-)

In looking into this a bit more (for example avro
http://www.mail-archive.com/avro-...@hadoop.apache.org/msg03603.html and
http://www.mail-archive.com/avro-...@hadoop.apache.org/msg03688.html) we
should also call an official vote on this. All the better if we can put
together a proposal similar to what Doug has done in the provided "vote"
link. We'll need to provide this to the PMC & Board anyway, so we should get
our cards in order. A couple issues I notice from the links (please read
them):

1) site: we'll be moving from hadoop.apache.org to zookeeper.apache.org

2) initial PMC members, notice doug says:

"I suggest that initial Avro PMC consist of all active Avro committers at
the time we make the formal proposal. This is typical for new TLPs.
(Subsequently PMCs tend to promote committers to the PMC. The Hadoop PMC
generally promotes committers to the PMC after a year of consistent
activity, while some projects immediately add new committers to their PMC.
But we don't need to decide our policy for new PMC membership now, only the
makeup of the initial PMC.)"

I suggest that we make our current committers PMC members similar to this.

3) on a zookeeper chair, again from avro:

"I nominate myself as the initial chair of the Avro PMC, with the proviso
that we adopt a policy of regular chair replacement. I suggest that Avro PMC
chairs serve a one or two-year term. A PMC chair has no more power than other
PMC members, but rather has a few more duties. In particular, the chair must
submit written quarterly reports to the board describing the health of the
project's developer community. The chair also maintains subversion
permissions and committer account creation."

I nominate myself as the initial chair. I also like this idea of regular
chair replacement for a number of reasons.


What do you folks think? Should I put this into a draft proposal and call an
official vote on our zookeeper-dev list?

Patrick

On Sun, Oct 17, 2010 at 3:12 PM, Patrick Hunt  wrote:

> Good to see we are in agreement on this. Thanks everyone who voted. Looks
> like this is unanimous at this point. I will
> start the proceedings in the Hadoop PMC to make ZooKeeper a TLP.
>
> Patrick
>
>
> On Thu, Oct 14, 2010 at 5:37 PM, Flavio Junqueira wrote:
>
>> +1. Frankly, I don't see concrete benefits for the community with
>> ZooKeeper becoming a TLP, but perhaps it will become clear over time. Now it
>> is certainly cool to have our own top-level domain:
>> http://zookeeper.apache.org/ rocks!
>>
>> -Flavio
>>
>> On Oct 14, 2010, at 1:00 PM, Benjamin Reed wrote:
>>
>>  +1
>>
>> ben
>>
>> On 10/14/2010 11:47 AM, Henry Robinson wrote:
>>
>> +1,
>>
>>
>> I agree that we've addressed most outstanding concerns, we're ready for
>>
>> TLP.
>>
>>
>> Henry
>>
>>
>> On 14 October 2010 13:29, Mahadev Konar  wrote:
>>
>>
>> +1 for moving to TLP.
>>
>>
>> Thanks for starting the vote Pat.
>>
>>
>> mahadev
>>
>>
>>
>>   *flavio*
>> *junqueira*
>>
>> research scientist
>>
>> f...@yahoo-inc.com
>> direct +34 93-183-8828
>

Re: Restarting discussion on ZooKeeper as a TLP

2010-10-17 Thread Patrick Hunt
Good to see we are in agreement on this. Thanks everyone who voted. Looks
like this is unanimous at this point. I will start the proceedings in the
Hadoop PMC to make ZooKeeper a TLP.

Patrick

On Thu, Oct 14, 2010 at 5:37 PM, Flavio Junqueira  wrote:

> +1. Frankly, I don't see concrete benefits for the community with
> ZooKeeper becoming a TLP, but perhaps it will become clear over time. Now it
> is certainly cool to have our own top-level domain:
> http://zookeeper.apache.org/ rocks!
>
> -Flavio
>
> On Oct 14, 2010, at 1:00 PM, Benjamin Reed wrote:
>
>  +1
>
> ben
>
> On 10/14/2010 11:47 AM, Henry Robinson wrote:
>
> +1,
>
>
> I agree that we've addressed most outstanding concerns, we're ready for
>
> TLP.
>
>
> Henry
>
>
> On 14 October 2010 13:29, Mahadev Konar  wrote:
>
>
> +1 for moving to TLP.
>
>
> Thanks for starting the vote Pat.
>
>
> mahadev
>
>
>
> *flavio*
> *junqueira*
>
> research scientist
>
> f...@yahoo-inc.com
> direct +34 93-183-8828
>
> avinguda diagonal 177, 8th floor, barcelona, 08018, es
> phone (408) 349 3300 fax (408) 349 3301
>
>
>


"Testing for Failure in the Cloud: FATE and DESTINI"

2010-10-17 Thread Patrick Hunt
"research that produces real tools, which help developers find (and
then fix) real failure-handling bugs, including 16 new bug reports to
HDFS (7 design bugs and 9 implementation bugs). Pretty nice, given the
intricacies of failure-recovery protocols."

Has anyone heard of this? First time for me, he mentions some
preliminary results with ZK, but I've yet to hear anything:

http://databeta.wordpress.com/2010/10/15/testing-for-failure-in-the-cloud-fate-and-destini/

Patrick


[jira] Created: (ZOOKEEPER-901) Redesign of QuorumCnxManager

2010-10-17 Thread Flavio Junqueira (JIRA)
Redesign of QuorumCnxManager


 Key: ZOOKEEPER-901
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-901
 Project: Zookeeper
  Issue Type: Improvement
  Components: leaderElection
Affects Versions: 3.3.1
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
 Fix For: 3.4.0


QuorumCnxManager manages TCP connections between ZooKeeper servers for leader 
election in replicated mode. We have identified over time a couple of 
deficiencies that we would like to fix. Unfortunately, fixing these issues 
requires a little more than just generating a couple of small patches. More 
specifically, I propose, based on previous discussions with the community, that 
we reimplement QuorumCnxManager so that we achieve the following:

# Establishing connections should not be a blocking operation, and perhaps even 
more important, it shouldn't prevent the establishment of connections with 
other servers;
# Using a pair of threads per connection is a little messy, and we have seen 
issues over time due to the creation and destruction of such threads. A more 
reasonable approach is to have a single thread and a selector.  
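
[Editor's sketch] The two goals above (non-blocking connection establishment,
and one thread with a selector instead of a pair of threads per connection)
can be sketched with plain java.nio. This is not the proposed QuorumCnxManager
design, which is not described in the issue; SelectorConnectSketch and
connectAll are hypothetical names, and the sketch covers only the connect
phase, with no election messages.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class SelectorConnectSketch {
    /** Starts a non-blocking connect to every peer and completes them all on one thread. */
    static int connectAll(List<InetSocketAddress> peers, long timeoutMs) throws IOException {
        int connected = 0;
        int pending = 0;
        try (Selector selector = Selector.open()) {
            for (InetSocketAddress peer : peers) {
                SocketChannel ch = SocketChannel.open();
                ch.configureBlocking(false);
                try {
                    if (ch.connect(peer)) {
                        connected++; // completed immediately (possible on loopback)
                        ch.close();
                    } else {
                        ch.register(selector, SelectionKey.OP_CONNECT);
                        pending++;
                    }
                } catch (IOException e) {
                    ch.close(); // this peer failed at once; the others are unaffected
                }
            }
            long deadline = System.currentTimeMillis() + timeoutMs;
            while (pending > 0) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    break; // slow peers are abandoned instead of blocking the thread
                }
                selector.select(remaining);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    SocketChannel ch = (SocketChannel) key.channel();
                    try {
                        if (!ch.finishConnect()) {
                            continue; // not finished yet; wait for the next select
                        }
                        connected++;
                    } catch (IOException e) {
                        // unreachable peer: counted as finished, not as connected
                    }
                    ch.close(); // a real manager would keep the channel for I/O
                    pending--;
                }
            }
            // Close any channels that were still pending when the deadline passed.
            for (SelectionKey key : new ArrayList<>(selector.keys())) {
                key.channel().close();
            }
        }
        return connected;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral local port
            InetSocketAddress addr = (InetSocketAddress) server.getLocalAddress();
            System.out.println(connectAll(Collections.nCopies(3, addr), 2000) + " connected");
        }
    }
}
```

Because each connect is only started and later completed when the selector
reports it ready, a slow or dead peer delays nothing but its own connection.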
