RE: closing session on socket close vs waiting for timeout

2010-11-08 Thread Fournier, Camille F. [Tech]
I'm reopening this question for the group. I have attached some sample code 
(3.3 branch) to a JIRA ticket that seems to do what I propose, namely, lower 
the session timeout when an error causes the socket to close. 
https://issues.apache.org/jira/browse/ZOOKEEPER-922

I am very interested in any feedback about what might fail here. I have this 
running in a dev ensemble and it seems to work, but I haven't done any sort of 
extensive testing or considered the effects of this on observers, etc. Even if 
the community doesn't want the change in ZK for reasons of false positives I 
may need to use it internally and could use any insights the experts have on 
unintended side effects.

Thanks,
Camille

-Original Message-
From: Benjamin Reed [mailto:br...@yahoo-inc.com] 
Sent: Friday, September 10, 2010 4:11 PM
To: zookeeper-u...@hadoop.apache.org
Subject: Re: closing session on socket close vs waiting for timeout

  ah dang, i should have said generate a close request for the session 
and push that through the system.

ben

On 09/10/2010 01:01 PM, Benjamin Reed wrote:
the problem is that followers don't track session timeouts. they track
 when they last heard from the sessions that are connected to them and
 they periodically propagate this information to the leader. the leader
 is the one that expires the session. your technique only works when the
 client is connected to the leader.

 one thing you can do is generate a close request for the socket and push
 that through the system. that will cause it to get propagated through
 the followers and processed at the leader. it would also allow you to
 get your functionality without touching the processing pipeline.
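
For concreteness, here is a rough sketch of what pushing a close request through
the system might look like, following Ben's correction above (a close request for
the session, not the socket). This is only an illustration against the 3.3-era
server classes (Request, ZooDefs.OpCode, ZooKeeperServer.submitRequest), not the
code attached to ZOOKEEPER-922, and the helper class and method name are invented:

    import java.util.List;
    import org.apache.zookeeper.ZooDefs.OpCode;
    import org.apache.zookeeper.data.Id;
    import org.apache.zookeeper.server.Request;
    import org.apache.zookeeper.server.ServerCnxn;
    import org.apache.zookeeper.server.ZooKeeperServer;

    class SessionCloser {
        // Submit a closeSession request into the normal pipeline; on a follower
        // it is forwarded to the leader and committed like any other transaction,
        // so the request processors themselves need no changes.
        static void closeSessionNow(ZooKeeperServer zks, ServerCnxn cnxn,
                                    long sessionId, List<Id> authInfo) {
            // Constructor arguments follow the 3.3-era Request class:
            // (cnxn, sessionId, cxid, type, request payload, authInfo)
            Request closeReq = new Request(cnxn, sessionId, 0,
                    OpCode.closeSession, null, authInfo);
            zks.submitRequest(closeReq);
        }
    }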

 the thing that worries me about this functionality in general is that
 network anomalies can cause a whole raft of sessions to get expired in
 this way. for example, you have 3 servers with load spread well; there
 is a networking glitch that causes clients to abandon a server; suddenly
 1/3 of your clients will get expired sessions.

 ben

 On 09/10/2010 12:17 PM, Fournier, Camille F. [Tech] wrote:
 Ben, could you explain a bit more why you think this won't work? I'm trying 
 to decide if I should put in the work to take the POC I wrote and complete 
 it, but I don't really want to waste my time if there's a fundamental reason 
 it's a bad idea.

 Thanks,
 Camille

 -Original Message-
 From: Benjamin Reed [mailto:br...@yahoo-inc.com]
 Sent: Wednesday, September 08, 2010 4:03 PM
 To: zookeeper-u...@hadoop.apache.org
 Subject: Re: closing session on socket close vs waiting for timeout

 unfortunately, that only works on the standalone server.

 ben

 On 09/08/2010 12:52 PM, Fournier, Camille F. [Tech] wrote:
 This would be the ideal solution to this problem I think.
 Poking around the (3.3) code to figure out how hard it would be to
 implement, I figure one way to do it would be to modify the session timeout
 to the min session timeout and touch the connection before calling close
 when you get certain exceptions in NIOServerCnxn.doIO. I did this (removing
 the code in touchSession that returns if the tickTime is greater than the
 expire time) and it worked (in the standalone server anyway). Interesting
 solution, or total hack that will not work beyond the most basic test case?

 C

 (forgive lack of actual code in this email)
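
In the same spirit, a minimal sketch of the approach described above, assuming
the 3.3 SessionTracker interface (touchSession(sessionId, timeout)). The helper
is hypothetical, and as noted it only has the intended effect once the guard in
touchSession that ignores earlier expire times is removed:

    import org.apache.zookeeper.server.SessionTracker;

    class FastExpire {
        // Called from the error path in NIOServerCnxn.doIO before close():
        // re-register the session with the minimum session timeout so the
        // expirer collects it on the next tick instead of waiting out the
        // full negotiated timeout.
        static void markForQuickExpiry(SessionTracker tracker, long sessionId,
                                       int minSessionTimeout) {
            tracker.touchSession(sessionId, minSessionTimeout);
        }
    }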

 -Original Message-
 From: Ted Dunning [mailto:ted.dunn...@gmail.com]
 Sent: Tuesday, September 07, 2010 1:11 PM
 To: zookeeper-u...@hadoop.apache.org
 Cc: Benjamin Reed
 Subject: Re: closing session on socket close vs waiting for timeout

 This really is, just as Ben says, a problem of false positives and false
 negatives in detecting session expiration.

 On the other hand, the current algorithm isn't really using all the
 information available.  The current algorithm uses the time since the last
 client-initiated heartbeat.  The new proposal is somewhat worse in that it
 proposes to use just the boolean has-TCP-disconnect-happened.

 Perhaps it would be better to use multiple features in order to decrease
 both false positives and false negatives.

 For instance, I could imagine that we use the following features:

 - time since last client heartbeat or disconnect or reconnect

 - what was the last event? (a heartbeat or a disconnect or a reconnect)

 Then the expiration algorithm could use a relatively long time since last
 heartbeat and a relatively short time since last disconnect to mark a
 session as disconnected.

 Wouldn't this avoid expiration during GC and cluster partition and cause
 expiration quickly after a client disconnect?
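
As an illustration of the combination Ted describes (this is not existing
ZooKeeper code), an expiration check might weigh both features. The class,
event names, and thresholds below are invented for the example:

    class SessionLiveness {
        enum LastEvent { HEARTBEAT, DISCONNECT, RECONNECT }

        static final long LONG_LIMIT_MS = 30000;  // e.g. the negotiated session timeout
        static final long SHORT_LIMIT_MS = 2000;  // e.g. the minimum session timeout

        // Expire only after a long silence following a heartbeat or reconnect
        // (tolerating GC pauses and partitions), but after a short silence
        // following a disconnect (reacting quickly to a clean TCP close).
        static boolean shouldExpire(LastEvent last, long millisSinceLastEvent) {
            long limit = (last == LastEvent.DISCONNECT) ? SHORT_LIMIT_MS : LONG_LIMIT_MS;
            return millisSinceLastEvent > limit;
        }
    }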


 On Mon, Sep 6, 2010 at 11:26 PM, Patrick Hunt <ph...@apache.org> wrote:


 That's a good point, however with suitable documentation, warnings and such
 it seems like a reasonable feature to provide for those users who require
 it. Used in moderation it seems fine to me. Perhaps we also make it
 configurable at the server level for those

Windows port of ZK C api

2010-11-03 Thread Fournier, Camille F. [Tech]
Hi everyone,

We have a requirement for a native windows-compatible version of the ZK C api. 
We're currently working on various ways to do this port, but would very much 
like to submit this back to you all when we are finished so that we don't have 
to maintain the code ourselves through future releases. Is there interest in 
having this? What would you need with this patch (build scripts, etc) to accept 
it?

Thanks,
Camille



RE: Windows port of ZK C api

2010-11-03 Thread Fournier, Camille F. [Tech]
Thanks Mahadev. We are using those C# bindings but also need native windows 
C/C++. Every language all the time!

C

-Original Message-
From: Mahadev Konar [mailto:maha...@yahoo-inc.com] 
Sent: Wednesday, November 03, 2010 11:06 AM
To: zookeeper-dev@hadoop.apache.org
Subject: Re: Windows port of ZK C api

Hi Camille,
 I think there definitely is. I think a build script with a set of
requirements and a nice set of docs on how to start using it would be great.
BTW, there is a C# binding which someone wrote earlier

http://wiki.apache.org/hadoop/ZooKeeper/ZKClientBindings

You can take a look at that and see if you want to extend that or write your
own.

Thanks
mahadev


On 11/3/10 7:18 AM, Fournier, Camille F. [Tech] camille.fourn...@gs.com
wrote:

 Hi everyone,
 
 We have a requirement for a native windows-compatible version of the ZK C api.
 We're currently working on various ways to do this port, but would very much
 like to submit this back to you all when we are finished so that we don't have
 to maintain the code ourselves through future releases. Is there interest in
 having this? What would you need with this patch (build scripts, etc) to
 accept it?
 
 Thanks,
 Camille
 
 



RE: implications of netty on client connections

2010-10-22 Thread Fournier, Camille F. [Tech]
Yes, that's correct.

C

-Original Message-
From: Mahadev Konar [mailto:maha...@yahoo-inc.com] 
Sent: Friday, October 22, 2010 1:39 PM
To: zookeeper-dev@hadoop.apache.org
Subject: Re: implications of netty on client connections

Hi Camille,
   I am a little curious here. Does this mean you tried a single zookeeper
server with 16K clients?

Thanks
mahadev

On 10/20/10 1:07 PM, Fournier, Camille F. [Tech] camille.fourn...@gs.com
wrote:

 Thanks Patrick, I'll look and see if I can figure out a clean change for this.
 It was the kernel limit on the max number of open fds for the process where
 the problem showed up (not a ZK limit). FWIW, we tested with a process fd
 limit of 16K, and ZK performed reasonably well until the fd limit was reached,
 at which point it choked. There was a throughput degradation, but mostly going
 from 0 to 4000 connections; 4000 to 16000 was mostly flat until the sharp
 drop. For our use case it is fine to have a bit of performance loss with huge
 numbers of connections, so long as we can handle the choke, which for initial
 rollout I'm planning on just monitoring for.
 
 C
 
 -Original Message-
 From: Patrick Hunt [mailto:ph...@apache.org]
 Sent: Wednesday, October 20, 2010 2:06 PM
 To: zookeeper-dev@hadoop.apache.org
 Subject: Re: implications of netty on client connections
 
 It may just be the case that we haven't tested sufficiently for this case
 (running out of fds) and we need to handle this better even in nio, probably
 by cutting off op_connect in the selector. We should be able to do something
 similar in netty.
 
 Btw, on unix one can access the open/max fd count using this:
 http://download.oracle.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html
 
 
 Secondly, are you running into a kernel limit or a zk limit? Take a look at
 this post describing 1million concurrent connections to a box:
 http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
 
 specifically:
 --
 
 During various tests with lots of connections, I ended up making some
 additional changes to my sysctl.conf. This was part trial-and-error, I don't
 really know enough about the internals to make especially informed decisions
 about which values to change. My policy was to wait for things to break,
 check /var/log/kern.log and see what mysterious error was reported, then
 increase stuff that sounded sensible after a spot of googling. Here are the
 settings in place during the above test:
 
 net.core.rmem_max = 33554432
 net.core.wmem_max = 33554432
 net.ipv4.tcp_rmem = 4096 16384 33554432
 net.ipv4.tcp_wmem = 4096 16384 33554432
 net.ipv4.tcp_mem = 786432 1048576 26777216
 net.ipv4.tcp_max_tw_buckets = 36
 net.core.netdev_max_backlog = 2500
 vm.min_free_kbytes = 65536
 vm.swappiness = 0
 net.ipv4.ip_local_port_range = 1024 65535
 
 --
 
 
 I'm guessing that even with this, at some point you'll run into a limit in
 our server implementation. In particular I suspect that we may start to
 respond more slowly to pings, eventually getting so bad it would time out.
 We'd have to debug that and address (optimize).
 
 http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
 Patrick
 
 On Tue, Oct 19, 2010 at 7:16 AM, Fournier, Camille F. [Tech] 
 camille.fourn...@gs.com wrote:
 
 Hi everyone,
 
 I'm curious what the implications of using netty are going to be for the
 case where a server gets close to its max available file descriptors. Right
 now our somewhat limited testing has shown that a ZK server performs fine up
 to the point when it runs out of available fds, at which point performance
 degrades sharply and new connections get into a somewhat bad state. Is netty
 going to enable the server to handle this situation more gracefully (or is
 there a way to do this already that I haven't found)? Limiting connections
 from the same client is not enough since we can potentially have far more
 clients wanting to connect than available fds for certain use cases we might
 consider.
 
 Thanks,
 Camille
 
 
 



RE: implications of netty on client connections

2010-10-20 Thread Fournier, Camille F. [Tech]
Thanks Patrick, I'll look and see if I can figure out a clean change for this.
It was the kernel limit on the max number of open fds for the process where 
the problem showed up (not a ZK limit). FWIW, we tested with a process fd 
limit of 16K, and ZK performed reasonably well until the fd limit was reached, 
at which point it choked. There was a throughput degradation, but mostly going 
from 0 to 4000 connections; 4000 to 16000 was mostly flat until the sharp drop. 
For our use case it is fine to have a bit of performance loss with huge numbers 
of connections, so long as we can handle the choke, which for initial rollout 
I'm planning on just monitoring for.

C

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Wednesday, October 20, 2010 2:06 PM
To: zookeeper-dev@hadoop.apache.org
Subject: Re: implications of netty on client connections

It may just be the case that we haven't tested sufficiently for this case
(running out of fds) and we need to handle this better even in nio, probably
by cutting off op_connect in the selector. We should be able to do something
similar in netty.
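
A rough sketch of that idea, assuming an NIO selector loop that owns the listen
socket's SelectionKey and using the fd-count MXBean linked just below. The class
name and the 95% threshold are invented for illustration:

    import java.lang.management.ManagementFactory;
    import java.nio.channels.SelectionKey;
    import com.sun.management.UnixOperatingSystemMXBean;

    class AcceptThrottle {
        private final SelectionKey acceptKey; // key registered for the listening ServerSocketChannel
        private final UnixOperatingSystemMXBean os =
                (UnixOperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();

        AcceptThrottle(SelectionKey acceptKey) {
            this.acceptKey = acceptKey;
        }

        // Call periodically from the selector loop: stop accepting new connections
        // when the process nears its fd limit, resume once descriptors are freed.
        void adjust() {
            boolean nearLimit = os.getOpenFileDescriptorCount()
                    > 0.95 * os.getMaxFileDescriptorCount();
            acceptKey.interestOps(nearLimit ? 0 : SelectionKey.OP_ACCEPT);
        }
    }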

Btw, on unix one can access the open/max fd count using this:
http://download.oracle.com/javase/6/docs/jre/api/management/extension/com/sun/management/UnixOperatingSystemMXBean.html


Secondly, are you running into a kernel limit or a zk limit? Take a look at
this post describing 1million concurrent connections to a box:
http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3

specifically:
--

During various tests with lots of connections, I ended up making some
additional changes to my sysctl.conf. This was part trial-and-error, I don't
really know enough about the internals to make especially informed decisions
about which values to change. My policy was to wait for things to break,
check /var/log/kern.log and see what mysterious error was reported, then
increase stuff that sounded sensible after a spot of googling. Here are the
settings in place during the above test:

net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 16384 33554432
net.ipv4.tcp_wmem = 4096 16384 33554432
net.ipv4.tcp_mem = 786432 1048576 26777216
net.ipv4.tcp_max_tw_buckets = 36
net.core.netdev_max_backlog = 2500
vm.min_free_kbytes = 65536
vm.swappiness = 0
net.ipv4.ip_local_port_range = 1024 65535

--


I'm guessing that even with this, at some point you'll run into a limit in
our server implementation. In particular I suspect that we may start to
respond more slowly to pings, eventually getting so bad it would time out.
We'd have to debug that and address (optimize).

http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-3
Patrick

On Tue, Oct 19, 2010 at 7:16 AM, Fournier, Camille F. [Tech] 
camille.fourn...@gs.com wrote:

 Hi everyone,

 I'm curious what the implications of using netty are going to be for the
 case where a server gets close to its max available file descriptors. Right
 now our somewhat limited testing has shown that a ZK server performs fine up
 to the point when it runs out of available fds, at which point performance
 degrades sharply and new connections get into a somewhat bad state. Is netty
 going to enable the server to handle this situation more gracefully (or is
 there a way to do this already that I haven't found)? Limiting connections
 from the same client is not enough since we can potentially have far more
 clients wanting to connect than available fds for certain use cases we might
 consider.

 Thanks,
 Camille




RE: Fix release 3.3.2 planning, status.

2010-10-18 Thread Fournier, Camille F. [Tech]
Hi guys,

Any updates on the 3.3.2 release schedule? Trying to plan a release myself and 
wondering if I'll have to go to production with patched 3.3.1 or have time to 
QA with the 3.3.2 release.

Thanks,
Camille

-Original Message-
From: Patrick Hunt [mailto:ph...@apache.org] 
Sent: Thursday, September 23, 2010 12:45 PM
To: zookeeper-dev@hadoop.apache.org
Subject: Fix release 3.3.2 planning, status.

Looking at the JIRA queue for 3.3.2 I see that there are two blockers: one
is currently PA (patch available) and the other is pretty close (it has a
patch that should go in soon).

There are a few JIRAs that already went into the branch that are important
to get out there ASAP, esp ZOOKEEPER-846 (fix close issue found by hbase).

One issue that's been slowing us down is hudson. The trunk was not passing
its hudson validation, which was causing a slowdown in patch review.
Mahadev and I fixed this. However, with recent changes to the hudson
hw/security environment the patch testing process (automated) is broken.
Giri is working on this. In the meantime we'll have to test ourselves.
Committers -- be sure to verify RAT, Findbugs, etc... in addition to
verifying via test. I've setup an additional Hudson environment inside
Cloudera that also verifies the trunk/branch. If issues are found I will
report them (unfortunately I can't provide access to cloudera's hudson env
to non-cloudera employees at this time).

I'd like to clear out the PAs asap and get a release candidate built. Anyone
see a problem with shooting for an RC mid next week?

Patrick


RE: [jira] Updated: (ZOOKEEPER-844) handle auth failure in java client

2010-09-16 Thread Fournier, Camille F. [Tech]
Hi everyone,
Can someone explain what I should do for this? I have a patch for both 3.4 and 
3.3, and I think the 3.3 patch caused issues in the automated patch applier. 
What do I need to do to submit both of these patches to the different branches? 

Thanks,
Camille

-Original Message-
From: Camille Fournier (JIRA) [mailto:j...@apache.org] 
Sent: Thursday, September 16, 2010 2:25 PM
To: zookeeper-dev@hadoop.apache.org
Subject: [jira] Updated: (ZOOKEEPER-844) handle auth failure in java client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier updated ZOOKEEPER-844:
---

Attachment: (was: ZOOKEEPER332-844)

 handle auth failure in java client
 --

 Key: ZOOKEEPER-844
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-844
 Project: Zookeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.3.1
Reporter: Camille Fournier
Assignee: Camille Fournier
 Fix For: 3.3.2, 3.4.0

 Attachments: ZOOKEEPER-844.patch


 ClientCnxn.java currently has the following code:
     if (replyHdr.getXid() == -4) {
         // -2 is the xid for AuthPacket
         // TODO: process AuthPacket here
         if (LOG.isDebugEnabled()) {
             LOG.debug("Got auth sessionid:0x"
                     + Long.toHexString(sessionId));
         }
         return;
     }
 Auth failures appear to cause the server to disconnect but the client never 
 gets a proper state change or notification that auth has failed, which makes 
 handling this scenario very difficult as it causes the client to go into a 
 loop of sending bad auth, getting disconnected, trying to reconnect, sending 
 bad auth again, over and over. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: (ZOOKEEPER-844) handle auth failure in java client

2010-09-02 Thread Fournier, Camille F. [Tech]
Hi all,

I would like to submit this patch into the 3.3 branch as well, since we are 
probably going to go into production with 3.3 and I'd rather not do a 
production release with a patched version of ZK if possible. I added a patch 
for this fix against the 3.3 branch to this ticket. Any idea of the odds of 
getting this into the 3.3.2 release?

Thanks,
Camille

-Original Message-
From: Giridharan Kesavan (JIRA) [mailto:j...@apache.org] 
Sent: Tuesday, August 31, 2010 7:25 PM
To: Fournier, Camille F. [Tech]
Subject: [jira] Updated: (ZOOKEEPER-844) handle auth failure in java client


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Giridharan Kesavan updated ZOOKEEPER-844:
-

Status: Patch Available  (was: Open)

 handle auth failure in java client
 --

 Key: ZOOKEEPER-844
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-844
 Project: Zookeeper
  Issue Type: Improvement
  Components: java client
Affects Versions: 3.3.1
Reporter: Camille Fournier
Assignee: Camille Fournier
 Fix For: 3.4.0

 Attachments: ZOOKEEPER-844.patch


 ClientCnxn.java currently has the following code:
     if (replyHdr.getXid() == -4) {
         // -2 is the xid for AuthPacket
         // TODO: process AuthPacket here
         if (LOG.isDebugEnabled()) {
             LOG.debug("Got auth sessionid:0x"
                     + Long.toHexString(sessionId));
         }
         return;
     }
 Auth failures appear to cause the server to disconnect but the client never 
 gets a proper state change or notification that auth has failed, which makes 
 handling this scenario very difficult as it causes the client to go into a 
 loop of sending bad auth, getting disconnected, trying to reconnect, sending 
 bad auth again, over and over. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



RE: windows port of C API

2010-08-31 Thread Fournier, Camille F. [Tech]
I would be very interested to see any work already done and to provide feedback; 
we need such a port and were planning on writing one ourselves.

C

-Original Message-
From: Ben Collins [mailto:ben.coll...@foundationdb.com] 
Sent: Monday, August 30, 2010 5:01 PM
To: zookeeper-dev@hadoop.apache.org
Subject: windows port of C API

I have a working win32 port of the C API, not depending on Cygwin, that
supports the single-threaded model of network interaction. It compiles in
Visual Studio 2010 and works on 64-bit Windows 7. There are known issues,
and it is in its initial stages, but it has been successfully used against
the java server.

I am happy to provide patches, but would like any pointers to efforts
already undertaken in this area, or folks to communicate with about this.

Thanks,
-- 
Ben


handling auth failure in java client

2010-08-19 Thread Fournier, Camille F. [Tech]
Hi all,

I filed this ticket last week:
https://issues.apache.org/jira/browse/ZOOKEEPER-844

Currently, the Java client ignores auth failures, which is extremely problematic 
for the deployment I am preparing. I have written a patch to correct the problem 
by adding an AuthFailed KeeperState and checking the auth responses for the 
AUTHFAILED error code (the patch is now attached to the ticket). I checked the 
flow against the C client and it seems to basically match. Is there anything I 
should be aware of beyond this simple fix? All the testing I've done seems fine.
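
For context, a rough illustration of the direction described above, not the
attached patch: it assumes the new AuthFailed KeeperState the patch introduces
and glosses over how ClientCnxn actually queues the event to watchers. The
helper class is invented for the example:

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher.Event.EventType;
    import org.apache.zookeeper.Watcher.Event.KeeperState;
    import org.apache.zookeeper.proto.ReplyHeader;

    class AuthReplyCheck {
        // Map an AuthPacket reply (xid -4) that carries AUTHFAILED to a
        // client-visible event instead of silently returning, as the code
        // quoted in the ticket currently does.
        static WatchedEvent toEvent(ReplyHeader replyHdr) {
            if (replyHdr.getXid() == -4
                    && replyHdr.getErr() == KeeperException.Code.AUTHFAILED.intValue()) {
                return new WatchedEvent(EventType.None, KeeperState.AuthFailed, null);
            }
            return null; // not an auth failure: nothing to surface
        }
    }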

Thanks,
Camille