from:"Jay Shrauner"

[jira] [Commented] (ZOOKEEPER-1620) NIOServerCnxnFactory (new code introduced in ZK-1504) opens selectors but never closes them

2013-01-18 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557405#comment-13557405
 ] 

Jay Shrauner commented on ZOOKEEPER-1620:
-

You shouldn't need to do that (explicitly call closeSelector inside 
factory.stop()), and it's preferable if you don't. The reason it's preferable 
not to do it is to allow the system to do a graceful shutdown. The reason you 
shouldn't need to do that is that the shutdown call joins on the accept and 
selector threads, so it's not going to return until those threads exit.

The unit tests create and destroy the factories/threads/selectors in a pretty 
tight loop. I'm not sure how long it takes the system to close fd's associated 
with the selector, but maybe this is something like how closed sockets can 
linger. I might try to put a lengthier sleep after the factory.shutdown() call.

There have been known issues with file descriptor leaks when calling 
selector.close(), so we might check JDK versions we're each running. I found 
the following bug affecting JDK5u28/6u30/7u5
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7118373


> NIOServerCnxnFactory (new code introduced in ZK-1504) opens selectors but 
> never closes them
> ---
>
> Key: ZOOKEEPER-1620
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1620
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0
>Reporter: Alexander Shraer
>Assignee: Thawan Kooburat
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1620.patch, ZOOKEEPER-1620.patch
>
>
> New code (committed in ZK-1504) opens selectors but doesn't close them.
> Specifically AbstractSelectThread in its constructor does 
> this.selector = Selector.open();
> But possibly also elsewhere. Tests fail for me with the following message:
> java.io.IOException: Too many open files
>   at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
>   at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:69)
>   at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)
>   at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
>   at java.nio.channels.Selector.open(Selector.java:209)
>   at org.apache.zookeeper.server.NIOServerCnxnFactory$AbstractSelectThread.<init>(NIOServerCnxnFactory.java:128)
>   at org.apache.zookeeper.server.NIOServerCnxnFactory$AcceptThread.<init>(NIOServerCnxnFactory.java:177)
>   at org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:663)
>   at org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:127)
>   at org.apache.zookeeper.server.quorum.QuorumPeer.<init>(QuorumPeer.java:709)
>   at org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:177)
>   at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:113)
>   at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:71)
>   at org.apache.zookeeper.test.ReconfigTest.setUp(ReconfigTest.java:56)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (ZOOKEEPER-1620) NIOServerCnxnFactory (new code introduced in ZK-1504) opens selectors but never closes them

2013-01-16 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555763#comment-13555763
 ] 

Jay Shrauner commented on ZOOKEEPER-1620:
-

Have you tried running the tests in an unmodified branch? I think you are 
encountering this because of changes in your branch, because in the upstream 
branch we don't see the tests fail like this.

That being said, you are right that we could do a better job of cleaning up 
the selector. I'd suggest adding a closeSelector() method to 
AbstractSelectThread:

protected void closeSelector() {
    try {
        selector.close();
    } catch (IOException e) {
        // Ignore
    }
}

and then adding calls to closeSelector() right before the 
NIOServerCnxnFactory.this.stop() calls in the finally block of the run methods 
for the AcceptThread and SelectorThread.
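
As a rough illustration of that placement, here is a minimal self-contained 
sketch. SelectThreadSketch, requestStop() and selectorOpen() are hypothetical 
stand-ins invented for this example, not the actual ZooKeeper classes; only 
java.nio.channels.Selector is real. The point is simply that the selector is 
closed in the finally block of run(), so it is released however the loop exits.

```java
import java.io.IOException;
import java.nio.channels.Selector;

// Hypothetical stand-in for AbstractSelectThread (assumption, not ZooKeeper code).
class SelectThreadSketch extends Thread {
    protected final Selector selector;
    private volatile boolean stopped = false;

    SelectThreadSketch(String name) throws IOException {
        super(name);
        setDaemon(true);
        this.selector = Selector.open();
    }

    // Ask the loop to exit and wake the selector so select() returns.
    void requestStop() {
        stopped = true;
        selector.wakeup();
    }

    protected void closeSelector() {
        try {
            selector.close();
        } catch (IOException e) {
            // Ignore
        }
    }

    @Override
    public void run() {
        try {
            while (!stopped) {
                selector.select();
                // Real code would process the ready keys here.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            // Real code would log this; the sketch just falls through.
        } finally {
            closeSelector();
            // In the real threads, NIOServerCnxnFactory.this.stop() would follow.
        }
    }

    boolean selectorOpen() {
        return selector.isOpen();
    }
}
```

Because the close happens on the selector's own thread, a caller that joins on 
the thread knows the selector has been closed once the join returns.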

> NIOServerCnxnFactory (new code introduced in ZK-1504) opens selectors but 
> never closes them
> ---
>
> Key: ZOOKEEPER-1620
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1620
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0
>Reporter: Alexander Shraer
>


[jira] [Commented] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2013-01-14 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13553535#comment-13553535
 ] 

Jay Shrauner commented on ZOOKEEPER-1504:
-

If you want to reconfigure the running AcceptThread, add a new API to the 
AcceptThread so you can signal it to close the old listen socket and open a new 
one on a new port. Add a new routine "requestReconfig(int port)" that sets a 
boolean ("reconfig=true") and calls wakeup() on the AcceptThread. Its main loop 
currently only checks for pending new connections. Extend it so that it also 
checks whether it's been asked to reconfigure itself. If it has, call a new 
routine "reconfigure()" that closes the old socket and opens a new one, using 
basically the same code you had before for opening the new one. If you look at 
the selector threads, they do something similar, but much more complicated: 
they have a pair of message queues as well as the selector. You don't need an 
entire queue here.
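
To make the shape of this concrete, here is a minimal sketch of the 
flag-plus-wakeup idea. It is not the ZooKeeper AcceptThread: the class name, 
the fields, getBoundPort() and getReconfigCount() are all assumptions made up 
for the example, and accepted connections are simply closed rather than handed 
to a selector thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical accept thread that can be asked to rebind to a new port.
class ReconfigurableAcceptThread extends Thread {
    private final Selector selector;
    private ServerSocketChannel listenSocket; // touched only by constructor and run()
    private volatile boolean stopped = false;
    private volatile boolean reconfig = false;
    private volatile int requestedPort;
    private volatile int boundPort;
    private volatile int reconfigCount = 0;

    ReconfigurableAcceptThread(int port) throws IOException {
        super("AcceptThreadSketch");
        setDaemon(true);
        selector = Selector.open();
        listenSocket = openListenSocket(port);
    }

    private ServerSocketChannel openListenSocket(int port) throws IOException {
        ServerSocketChannel ss = ServerSocketChannel.open();
        ss.configureBlocking(false);
        ss.socket().setReuseAddress(true);
        ss.socket().bind(new InetSocketAddress(port));
        ss.register(selector, SelectionKey.OP_ACCEPT);
        boundPort = ss.socket().getLocalPort();
        return ss;
    }

    // Called from any thread: record the port, raise the flag, wake the loop.
    void requestReconfig(int port) {
        requestedPort = port;
        reconfig = true;
        selector.wakeup();
    }

    void requestStop() {
        stopped = true;
        selector.wakeup();
    }

    int getBoundPort() { return boundPort; }
    int getReconfigCount() { return reconfigCount; }

    // Runs on the accept thread only: close the old socket, open the new one.
    private void reconfigure() throws IOException {
        listenSocket.close(); // closing the channel also cancels its key
        listenSocket = openListenSocket(requestedPort);
        reconfigCount = reconfigCount + 1;
    }

    @Override
    public void run() {
        try {
            while (!stopped) {
                selector.select();
                for (SelectionKey key : selector.selectedKeys()) {
                    if (key.isValid() && key.isAcceptable()) {
                        SocketChannel sc = ((ServerSocketChannel) key.channel()).accept();
                        if (sc != null) {
                            sc.close(); // a real server would hand this off
                        }
                    }
                }
                selector.selectedKeys().clear();
                if (reconfig) {
                    reconfig = false;
                    reconfigure();
                }
            }
        } catch (IOException e) {
            // A real implementation would log this; the sketch just exits.
        } finally {
            try {
                listenSocket.close();
                selector.close();
            } catch (IOException e) {
                // Ignore
            }
        }
    }
}
```

All socket manipulation stays on the accept thread; other threads only set the 
flag and call wakeup(), which is why no queue is needed.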

The other option is to toggle a boolean that tells the AcceptThread not to take 
the rest of the socket factory down with it when it shuts down. Then you could 
tell one thread to shut down and fire up another, as Thawan says. The code 
should, I believe, with a little modification support multiple simultaneous 
listen sockets by creating an AcceptThread per listen socket but using the same 
set of selector threads across all of them. Maybe instead of a boolean, add an 
AtomicInteger member variable to the factory that counts the number of active 
AcceptThreads, and trigger the factory shutdown only when that count hits 0? 
Then for your reconfigure, spin up the new AcceptThread/port first before 
shutting down the old one. I think I lean towards this latter solution simply 
because the work to support your reconfigure would be the same as for 
supporting multiple listen sockets, which might be kind of nice to have.
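
A minimal sketch of the counting idea follows. FactorySketch and its methods 
are hypothetical names for illustration only; the accept loop is passed in as 
a Runnable so the example stays self-contained.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical factory that shuts down only when its last accept thread exits.
class FactorySketch {
    private final AtomicInteger activeAcceptThreads = new AtomicInteger(0);
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    // Count the thread before it starts so the counter cannot reach 0
    // while a replacement thread is still being spun up.
    Thread startAcceptThread(Runnable acceptLoop) {
        activeAcceptThreads.incrementAndGet();
        Thread t = new Thread(() -> {
            try {
                acceptLoop.run();
            } finally {
                if (activeAcceptThreads.decrementAndGet() == 0) {
                    stop();
                }
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Stand-in for tearing down the selector and worker threads.
    void stop() {
        stopped.set(true);
    }

    boolean isStopped() {
        return stopped.get();
    }
}
```

For a reconfigure, the caller would start the AcceptThread for the new port 
first and only then stop the old one, so the count goes 1, 2, 1 and never 
touches 0.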

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>    Affects Versions: 3.4.3, 3.4.4, 3.5.0
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, 
> ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, 
> ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).


[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-12-12 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530222#comment-13530222
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

Alex, the race condition is within FinalRequestProcessor on any node: Leader, 
Follower, or Observer. This has nothing to do with the serialization order on 
the leader. Setting and firing watches is a write operation, but only on state 
maintained locally on the node. What happens is, say client A is toggling the 
value of node /X from 1 to 2, and client B is reading and setting a watch on 
node /X. Client B will always see a consistent view; it may however not receive 
a watch firing, so it may never know to read value 2. If client B is relying on 
timely watch firing to keep its data fresh, this is a problem.

1. It is possible for thread C1 to process client B reading value 1 and setting 
the watch; thread C2 to process client A writing 2 to /X, firing the watch, and 
writing the watch firing out to client B's network stack; and finally thread C1 
to push the read of value 1 onto client B's network stack. Because the return 
value of a getData-and-setWatch call came after the watch fired, the client 
will possibly ignore the watch firing. For example, say client B was originally 
responding to a watch firing on /X. In its view, it sees the /X watch fire, it 
sends a getData request, it sees the /X watch fire again (which it ignores, 
because it already has a getData outstanding), and finally it gets the response 
to its getData request.

2. It is also possible for client B to read value 1, client A to write value 2 
and check for watch firing, and then for client B to reset the watch. There is 
no locking guarding the atomicity of client B reading /X and setting the watch 
on /X.

It is relatively straightforward to add locking preventing case (2), but for 
case (1) I think we need to restrict parallelism in FinalRequestProcessor.
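
For case (2), the kind of locking I mean can be sketched as below. This is an 
illustration only, not the DataTree or FinalRequestProcessor code: 
WatchedNodeSketch and its methods are hypothetical, and watches are modeled as 
one-shot Runnables. It does nothing for case (1), which is about the order in 
which responses reach the client's network stack.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single znode with a value and one-shot watches.
class WatchedNodeSketch {
    private final Object lock = new Object();
    private int value;
    private final List<Runnable> watchers = new ArrayList<>();

    WatchedNodeSketch(int initialValue) {
        this.value = initialValue;
    }

    // Read the value and register the watch as one atomic step, so a
    // concurrent write cannot slip in between the read and the registration.
    int getDataAndSetWatch(Runnable watcher) {
        synchronized (lock) {
            watchers.add(watcher);
            return value;
        }
    }

    // Update the value and collect the watches to fire under the same lock;
    // fire them after releasing it so callbacks cannot block readers.
    void setData(int newValue) {
        List<Runnable> toFire;
        synchronized (lock) {
            value = newValue;
            toFire = new ArrayList<>(watchers);
            watchers.clear(); // watches are one-shot
        }
        for (Runnable w : toFire) {
            w.run();
        }
    }
}
```

With this, any reader that saw the old value is guaranteed to have its watch 
registered before the write checks for watches to fire.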

We can improve the parallelism here, but I had reached the point where I wanted 
to leave that for a future Jira. If we could identify which read requests set 
watches, and treat those as a third type, we could then allow pure read 
requests from client B to be processed simultaneously with write requests from 
client A. The current code only fully parses getData and other read request 
blocks in FinalRequestProcessor, so we would need to move this parsing earlier 
in the pipeline, which might however have performance implications.


> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch
>
>

[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-12-07 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526918#comment-13526918
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

That's correct. We initially thought we could allow reads in client B to happen 
concurrently with writes in client A, but there are race conditions if the read 
in client B resets a watch that would be fired by the write in client A. 
Essentially the problem is that we would really prefer to categorize 
read-with-watch-reset requests as write requests. This would require changing 
where in the pipeline we parse read requests.

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).
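
The session-to-worker assignment described in the quoted proposal ("a simple 
sessionId mod number of worker threads") can be sketched as follows. The class 
and method names are hypothetical, and using Math.floorMod so that a negative 
session id still maps to a valid index is an assumption of this sketch, not 
something the description specifies.

```java
// Hypothetical helper: map a session to a fixed worker index.
class WorkerAssignmentSketch {
    // The same sessionId always yields the same index, so all requests for a
    // session land on one worker's queue and are processed in order.
    static int workerFor(long sessionId, int numWorkers) {
        return (int) Math.floorMod(sessionId, (long) numWorkers);
    }
}
```

For example, with 32 workers, session id 42 maps to worker 10 every time it is 
looked up.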


Re: Review Request: Multi-thread CommitProcessor

2012-10-12 Thread Jay Shrauner


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6260/
---

(Updated Oct. 12, 2012, 11:47 p.m.)


Review request for zookeeper and Patrick Hunt.


Changes
---

Address feedback from review: shut down CommitProcessor if a downstream 
processor throws an exception (preserves previous behavior)


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1505


This addresses bug ZOOKEEPER-1505.
https://issues.apache.org/jira/browse/ZOOKEEPER-1505


Diffs (updated)
-

  /src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java 1391526 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1391526 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java 
1391526 
  /src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1391526 
  /src/java/test/org/apache/zookeeper/server/quorum/CommitProcessorTest.java 
PRE-CREATION 

Diff: https://reviews.apache.org/r/6260/diff/


Testing
---


Thanks,

Jay Shrauner

Re: Review Request: Multi-thread NIOServerCnxn

2012-10-12 Thread Jay Shrauner


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6256/
---

(Updated Oct. 12, 2012, 11:45 p.m.)


Review request for zookeeper and Patrick Hunt.


Changes
---

Rebase


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1504


This addresses bug ZOOKEEPER-1504.
https://issues.apache.org/jira/browse/ZOOKEEPER-1504


Diffs (updated)
-

  /src/java/main/org/apache/zookeeper/server/ExpiryQueue.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java 1391526 
  /src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java 1391526 
  /src/java/main/org/apache/zookeeper/server/RateLogger.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/ServerCnxn.java 1391526 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1391526 
  /src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 1391526 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 1391526 
  /src/java/test/org/apache/zookeeper/test/ServerCnxnTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/6256/diff/


Testing
---


Thanks,

Jay Shrauner

[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-10-12 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475444#comment-13475444
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

The FindBugs warning ("naked notify") is bogus; this is a helper routine that 
wakes up the main thread, and the state change happens in the routines that 
call it.

From the FindBugs description: "This bug does not necessarily indicate an 
error, since the change to mutable object state may have taken place in a 
method which then called the method containing the notification."

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch
>
>


[jira] [Assigned] (ZOOKEEPER-1147) Add support for local sessions

2012-10-12 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner reassigned ZOOKEEPER-1147:
---

Assignee: Thawan Kooburat  (was: Jay Shrauner)

> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>Assignee: Thawan Kooburat
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. The majority of these clients are 
> read only, i.e. they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have the full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration are handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at connect time which kind of session it wants.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.


[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-10-12 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Attachment: ZOOKEEPER-1505.patch

Address feedback from review: shut down CommitProcessor if a downstream 
processor throws an exception (preserves previous behavior)

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>    Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch, 
> ZOOKEEPER-1505.patch
>
>


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-10-12 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

Rebase

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>    Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch, 
> ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).
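
The handoff between the acceptor thread and a selector thread described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's code: the class name SelectorThreadSketch, the round-robin choice in selectorIndexFor, and the stopAndWait helper are all hypothetical. The point it shows is that the acceptor only enqueues the channel and wakes the selector, while all registration work happens on the selector thread itself.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch of one selector thread; not the patch's code. */
final class SelectorThreadSketch extends Thread {
    private final Selector selector;
    private final LinkedBlockingQueue<SocketChannel> accepted = new LinkedBlockingQueue<>();
    private volatile boolean running = true;

    SelectorThreadSketch(String name) {
        super(name);
        setDaemon(true);
        try {
            selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called by the acceptor thread: hand off the channel, then wake the selector. */
    void addAcceptedConnection(SocketChannel channel) {
        accepted.add(channel);
        selector.wakeup();
    }

    /** Picks a selector for the n-th accepted connection (round-robin is an assumption). */
    static int selectorIndexFor(long connectionNumber, int selectorCount) {
        return (int) (connectionNumber % selectorCount);
    }

    @Override
    public void run() {
        try {
            while (running) {
                selector.select();
                SocketChannel channel;
                // Registration happens only on this thread.
                while ((channel = accepted.poll()) != null) {
                    channel.configureBlocking(false);
                    channel.register(selector, SelectionKey.OP_READ);
                }
                // A real server would hand ready keys to I/O workers here.
                selector.selectedKeys().clear();
            }
        } catch (IOException e) {
            // Sketch only: a real server would log and clean up connections.
        } finally {
            try {
                selector.close(); // the thread closes its own selector on exit
            } catch (IOException ignored) {
                // nothing more to do in this sketch
            }
        }
    }

    /** Asks the thread to exit and joins on it; returns true if it exited in time. */
    boolean stopAndWait(long timeoutMillis) {
        running = false;
        selector.wakeup();
        try {
            join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !isAlive();
    }
}
```

Because a connection stays with the selector thread it was first handed to, no other thread ever needs to touch its SelectionKey.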

[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-08-22 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13439981#comment-13439981
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

javadoc warning is spurious and unrelated:

 [exec]   [javadoc] javadoc: warning - Error fetching URL: 
http://docs.oracle.com/javase/6/docs/api/package-list

> Multi-thread CommitProcessor
> ----------------------------
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch, ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads:
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
>   - WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).
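
The "sessionId mod number of worker threads" assignment described above can be sketched as below. This is a simplified illustration, not the patch's WorkerService: the class SessionAffineWorkers and its methods are hypothetical, and it assumes one single-threaded executor per worker so that each worker drains its queue in FIFO order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: pins every session to one worker thread. */
final class SessionAffineWorkers {
    private final List<ExecutorService> workers = new ArrayList<>();

    SessionAffineWorkers(int numWorkers) {
        for (int i = 0; i < numWorkers; i++) {
            // One thread per executor, so each queue is processed in order.
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** sessionId mod worker count; floorMod keeps the index non-negative. */
    int workerIndexFor(long sessionId) {
        return (int) Math.floorMod(sessionId, (long) workers.size());
    }

    void schedule(long sessionId, Runnable request) {
        if (workers.isEmpty()) {
            request.run(); // zero workers: run inline on the calling thread
            return;
        }
        workers.get(workerIndexFor(sessionId)).execute(request);
    }

    /** Stops accepting work and waits for queued requests to finish. */
    boolean shutdownAndWait(long timeoutMillis) {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        try {
            for (ExecutorService w : workers) {
                if (!w.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                    return false;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return true;
    }
}
```

Requests for the same session always land on the same queue, which is what gives the per-session ordering guarantee; requests for different sessions may still run in parallel on different workers.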

[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-08-22 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Attachment: ZOOKEEPER-1505.patch

- Addressed reviewboard comments.
- Added unit test.
- Fixed a bug Thawan found: a read request that resets a watch in one session 
could race a write request affecting that watch in another session. The 
solution taken here is to prevent any read request from running concurrently 
with a write request. There is room for further improvement by parsing the 
request earlier in the pipeline and identifying read requests with watch 
resets.
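
The tightened rule (no read runs concurrently with a write) can be sketched as a small gate that the dispatching thread consults before handing out work. This is a hypothetical illustration, not the CommitProcessor code: the class CommitGate and its method names are assumptions.

```java
/** Hypothetical gate: any number of reads, or exactly one write, in flight. */
final class CommitGate {
    private int readsInFlight = 0;
    private boolean writeInFlight = false;

    /** A read may start only when no write is in flight. */
    synchronized boolean tryStartRead() {
        if (writeInFlight) {
            return false;
        }
        readsInFlight++;
        return true;
    }

    /** A write may start only when nothing else is in flight. */
    synchronized boolean tryStartWrite() {
        if (writeInFlight || readsInFlight > 0) {
            return false;
        }
        writeInFlight = true;
        return true;
    }

    synchronized void finishRead() {
        readsInFlight--;
    }

    synchronized void finishWrite() {
        writeInFlight = false;
    }
}
```

A dispatcher that gets false back would hold the request and retry once an in-flight request finishes. The finer-grained improvement mentioned above would only block the reads that actually reset watches.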

Re: Review Request: Multi-thread CommitProcessor

2012-08-22 Thread Jay Shrauner


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6260/
---

(Updated Aug. 22, 2012, 9:20 p.m.)


Review request for zookeeper and Patrick Hunt.


Changes
---

Addressed comments, added unit test.

Fixed a bug discovered by Thawan. Tightened the allowed concurrency: a write 
transaction may no longer run concurrently with reads from other sessions, 
which prevents a race condition with watch resets.


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1505


This addresses bug ZOOKEEPER-1505.
https://issues.apache.org/jira/browse/ZOOKEEPER-1505


Diffs (updated)
-

  /src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java 1373156 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1373156 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java 1373156 
  /src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1373156 
  /src/java/test/org/apache/zookeeper/server/quorum/CommitProcessorTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/6260/diff/


Testing
---


Thanks,

Jay Shrauner

Re: Review Request: Multi-thread CommitProcessor

2012-08-22 Thread Jay Shrauner



> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > This looks very nice Jay! I've looked through most of the code but not yet 
> > the CP logic itself. My thoughts so far:
> > 
> > afaict the approach seems sound.
> > 
> > Needs updates to the documentation.
> > 
> > Needs unit tests to verify the new cases.
> >

Added unit test that tests the different configuration scenarios (0, 1, or many 
worker threads).

Tightened restrictions on concurrency to prevent bug Thawan discovered (reads 
that reset watch in one session could race a write affecting the same node in 
another session). Updated related comments.


> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > /src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java, lines 
> > 460-474
> > <https://reviews.apache.org/r/6260/diff/1/?file=131594#file131594line460>
> >
> > Is this a bug fix? If so it should be separated out to another jira and 
> > a test should be added for it. (likely we'd want to fix it in 3.3/3.4/trunk)

You're right, this is an unrelated bug fix, I pulled it out.


> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > /src/java/main/org/apache/zookeeper/server/WorkerService.java, line 56
> > <https://reviews.apache.org/r/6260/diff/1/?file=131596#file131596line56>
> >
> > make this configurable. how did you come to 5 seconds as the default?

Made it configurable. 5s was picked somewhat arbitrarily; I'm open to changing 
the default if you think some other value sounds more reasonable.
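
One common way to make such a timeout configurable while keeping 5 seconds as the default is a JVM system property. The sketch below assumes that approach; the property name example.workerService.shutdownTimeoutMs and the class ShutdownTimeoutConfig are hypothetical and are not taken from the patch.

```java
/** Hypothetical config holder; the property name is made up for illustration. */
final class ShutdownTimeoutConfig {
    static final String PROPERTY = "example.workerService.shutdownTimeoutMs";
    static final long DEFAULT_MILLIS = 5000L;

    /** Reads the timeout from a system property, falling back to 5 seconds. */
    static long shutdownTimeoutMillis() {
        return Long.getLong(PROPERTY, DEFAULT_MILLIS);
    }
}
```

With this shape, passing -Dexample.workerService.shutdownTimeoutMs=1500 on the command line would override the default without a code change.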


> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java, 
> > line 60
> > <https://reviews.apache.org/r/6260/diff/1/?file=131597#file131597line60>
> >
> > convert these to javadoc so they show up in eclipse tools tips

Done


> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > /src/java/main/org/apache/zookeeper/server/WorkerService.java, lines 76-78
> > <https://reviews.apache.org/r/6260/diff/1/?file=131596#file131596line76>
> >
> > move the method specific docs to the javadoc of the methods themselves.

Done


> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > /src/java/main/org/apache/zookeeper/server/WorkerService.java, line 203
> > <https://reviews.apache.org/r/6260/diff/1/?file=131596#file131596line203>
> >
> > seems you need to be a bit careful when calling stop, if schedule() is 
> > already past the "stoped" check you could end up with a 
> > RejectedExecutionException being thrown? However in this case it seems only 
> > CommitProcessor shutdown is calling this..

Line 126 catches any RejectedExecutionException that is thrown and does the cleanup.
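
The race Patrick describes, and the catch that covers it, can be sketched as follows. This is an illustration only: the class StoppableScheduler, the WorkRequest interface shape, and the constructor taking an ExecutorService are assumptions rather than the WorkerService code under review.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical sketch of a scheduler that tolerates a concurrent stop(). */
final class StoppableScheduler {
    interface WorkRequest {
        void doWork();
        void cleanup();
    }

    private final ExecutorService pool;
    private volatile boolean stopped = false;

    StoppableScheduler(ExecutorService pool) {
        this.pool = pool;
    }

    void schedule(WorkRequest request) {
        if (stopped) {
            request.cleanup();
            return;
        }
        try {
            // stop() may run between the check above and this call.
            pool.execute(request::doWork);
        } catch (RejectedExecutionException e) {
            // The pool was shut down in that window: release the request.
            request.cleanup();
        }
    }

    void stop() {
        stopped = true;
        pool.shutdown();
    }
}
```

The flag check is only a fast path. The catch block is what makes the late-stop case safe, because a shut-down executor rejects new tasks instead of silently dropping them.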


> On Aug. 2, 2012, 12:27 a.m., Patrick Hunt wrote:
> > /src/java/main/org/apache/zookeeper/server/WorkerService.java, line 62
> > <https://reviews.apache.org/r/6260/diff/1/?file=131596#file131596line62>
> >
> > doc this can be 0, and any other implications?

Added


- Jay


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6260/#review9710
---


On July 31, 2012, 10:05 p.m., Jay Shrauner wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6260/
> ---
> 
> (Updated July 31, 2012, 10:05 p.m.)
> 
> 
> Review request for zookeeper and Patrick Hunt.
> 
> 
> Description
> ---
> 
> See https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> 
> 
> This addresses bug ZOOKEEPER-1505.
> https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> 
> 
> Diffs
> -
> 
>   /src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java 1366784 
>   /src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java 1366784 
>   /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1366784 
>   /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
>   /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java 1366784 
>   /src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1366784 
> 
> Diff: https://reviews.apache.org/r/6260/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Jay Shrauner
> 
>

[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-31 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426170#comment-13426170
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

Posted to reviewboard

https://reviews.apache.org/r/6260/

Review Request: Multi-thread CommitProcessor

2012-07-31 Thread Jay Shrauner


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6260/
---

Review request for zookeeper and Patrick Hunt.


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1505


This addresses bug ZOOKEEPER-1505.
https://issues.apache.org/jira/browse/ZOOKEEPER-1505


Diffs
-

  /src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java 1366784 
  /src/java/main/org/apache/zookeeper/server/PrepRequestProcessor.java 1366784 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1366784 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/quorum/CommitProcessor.java 1366784 
  /src/java/main/org/apache/zookeeper/server/quorum/Leader.java 1366784 

Diff: https://reviews.apache.org/r/6260/diff/


Testing
---


Thanks,

Jay Shrauner

[jira] [Commented] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-31 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426163#comment-13426163
 ] 

Jay Shrauner commented on ZOOKEEPER-1504:
-

Posted to reviewboard

https://reviews.apache.org/r/6256/

Review Request: Multi-thread NIOServerCnxn

2012-07-31 Thread Jay Shrauner


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6256/
---

Review request for zookeeper and Patrick Hunt.


Description
---

See https://issues.apache.org/jira/browse/ZOOKEEPER-1504


This addresses bug ZOOKEEPER-1504.
https://issues.apache.org/jira/browse/ZOOKEEPER-1504


Diffs
-

  /src/java/main/org/apache/zookeeper/server/ExpiryQueue.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java 1366784 
  /src/java/main/org/apache/zookeeper/server/NIOServerCnxnFactory.java 1366784 
  /src/java/main/org/apache/zookeeper/server/RateLogger.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/ServerCnxn.java 1366784 
  /src/java/main/org/apache/zookeeper/server/ServerCnxnFactory.java 1366784 
  /src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 1366784 
  /src/java/main/org/apache/zookeeper/server/WorkerService.java PRE-CREATION 
  /src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java 1366784 
  /src/java/test/org/apache/zookeeper/test/ServerCnxnTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/6256/diff/


Testing
---


Thanks,

Jay Shrauner

[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-31 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

Split connection expiration out into separate thread.

[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-10 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

Address findbugs warnings

[jira] [Commented] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-10 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410458#comment-13410458
 ] 

Jay Shrauner commented on ZOOKEEPER-1505:
-

The FindBugs warning is "Naked notify in 
org.apache.zookeeper.server.quorum.CommitProcessor.wakeup()". The explanation 
of the warning states "This bug does not necessarily indicate an error, since 
the change to mutable object state may have taken place in a method which then 
called the method containing the notification.", which is exactly the situation 
here.
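
To make the situation concrete, here is a small sketch of the pattern the warning flags: the state change happens in the caller, and the method containing the notification changes nothing itself. The class WakeupSketch and its methods are hypothetical and only mirror the shape of the code, not CommitProcessor itself.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a "naked notify" that is nevertheless correct. */
final class WakeupSketch {
    private final Queue<String> queued = new ConcurrentLinkedQueue<>();

    void submit(String request) {
        queued.add(request); // the state change happens here, in the caller
        wakeup();
    }

    /** Only notifies; this is the method a static checker would flag. */
    private synchronized void wakeup() {
        notifyAll();
    }

    /** Waits up to timeoutMillis for a request; returns null on timeout. */
    synchronized String awaitNext(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (queued.isEmpty()) {
                long remainingMs = TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    return null;
                }
                wait(remainingMs);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
        return queued.poll();
    }
}
```

No wakeup is lost here: the waiter checks the queue while holding the monitor, and the notifier cannot enter wakeup() until the waiter has released the monitor inside wait().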

Testing: I haven't always found unit tests to be the best way to find 
multi-threading issues (even the hammer ones, although they're helpful). I 
tested and debugged by running on an ensemble and driving test load, and then 
by running on our production system.

[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-09 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: (was: ZOOKEEPER-1504.patch)

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.4.3, 3.4.4, 3.5.0
>Reporter: Jay Shrauner
>    Assignee: Jay Shrauner
>  Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-09 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch, ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).


[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-09 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Attachment: (was: ZOOKEEPER-1505.patch)

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).


[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-09 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Attachment: ZOOKEEPER-1505.patch

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-09 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-09 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: (was: ZOOKEEPER-1504.patch)

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).


[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-06 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Fix Version/s: 3.5.0

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first constraint-- 
> requests we push onto the thread queue are processed in order. The way we 
> guarantee the second constraint is we only allow a single commit transaction 
> to be in flight at a time--the queue servicing thread blocks while a commit 
> transaction is in flight, and when the transaction completes it clears the 
> flag.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-06 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Fix Version/s: 3.5.0

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Affects Versions: 3.4.3, 3.4.4, 3.5.0
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used by CommitProcessor) and 
> non-assignable threads (as used here).


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-06 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Description: 
NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
numbers of clients. This is particularly noticeable when thousands of clients 
connect. I propose multi-threading this code as follows:

- 1   acceptor thread, for accepting new connections
- 1-N selector threads
- 0-M I/O worker threads

Numbers of threads are configurable, with defaults scaling according to number 
of cores. Communication with the selector threads is handled via 
LinkedBlockingQueues, and connections are permanently assigned to a particular 
selector thread so that all potentially blocking SelectionKey operations can be 
performed solely by the selector thread. An ExecutorService is used for the 
worker threads.
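
The hand-off described above can be sketched roughly as follows. This is a minimal illustration, not the code in the attached patch: the class and method names are hypothetical, connections are represented by plain strings rather than SocketChannels, and round-robin selection is an assumption (the description only says the assignment is permanent).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: an acceptor hands each new connection to exactly one
// selector thread through that selector's own LinkedBlockingQueue.
final class SelectorHandoffSketch {
    /** Stands in for one selector thread and the queue it drains. */
    static final class SelectorSlot {
        final LinkedBlockingQueue<String> accepted = new LinkedBlockingQueue<>();
    }

    private final List<SelectorSlot> slots = new ArrayList<>();
    private int next = 0; // only the single acceptor thread touches this

    SelectorHandoffSketch(int numSelectors) {
        for (int i = 0; i < numSelectors; i++) {
            slots.add(new SelectorSlot());
        }
    }

    /** Acceptor side: choose a selector (round-robin, an assumption) and enqueue. */
    int handOff(String connectionId) {
        int index = next;
        next = (next + 1) % slots.size();
        slots.get(index).accepted.add(connectionId);
        return index; // the connection stays with this selector from now on
    }

    SelectorSlot slot(int index) {
        return slots.get(index);
    }
}
```

In a real server the selector thread would drain its queue, register each channel with its own Selector, and perform every SelectionKey operation itself; the acceptor would presumably also need to wake the selector after enqueueing.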

On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
selector threads and 64 worker threads for a 70% +/- 5% improvement in 
throughput.

This patch incorporates and supersedes the patches for

https://issues.apache.org/jira/browse/ZOOKEEPER-517
https://issues.apache.org/jira/browse/ZOOKEEPER-1444

New classes introduced in this patch are:

  - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
SessionTrackerImpl used to expire sessions so that the same logic can be used 
to expire connections
  - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
currently only used to throttle rate of logging "out of file descriptors" errors
  - WorkerService (also in ZOOKEEPER-1505): ExecutorService wrapper that makes 
worker threads daemon threads and names them in an easily debuggable manner. 
Supports assignable threads (as used by CommitProcessor) and non-assignable 
threads (as used here).
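
As a rough illustration of the "daemon threads with debuggable names" idea behind WorkerService, a ThreadFactory along these lines would do it. The class name, the naming scheme, and the helper method are all hypothetical and are not taken from the patch; only the java.util.concurrent calls are standard.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: every thread created is a daemon and carries a
// prefix plus a sequence number so it is easy to spot in a thread dump.
final class DaemonThreadFactorySketch implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger(1);

    DaemonThreadFactorySketch(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.getAndIncrement());
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        return t;
    }

    /** Convenience: a fixed pool (the non-assignable case) using this factory. */
    static ExecutorService newPool(String prefix, int threads) {
        return Executors.newFixedThreadPool(threads, new DaemonThreadFactorySketch(prefix));
    }
}
```

For the assignable case one could instead keep an array of single-thread executors built with the same factory and pick one by index.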

  was:
NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
numbers of clients. This is particularly noticeable when thousands of clients 
connect. I propose multi-threading this code as follows:

- 1   acceptor thread, for accepting new connections
- 1-N selector threads
- 0-M I/O worker threads

Numbers of threads are configurable, with defaults scaling according to number 
of cores. Communication with the selector threads is handled via 
LinkedBlockingQueues, and connections are permanently assigned to a particular 
selector thread so that all potentially blocking SelectionKey operations can be 
performed solely by the selector thread. An ExecutorService is used for the 
worker threads.

On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
selector threads and 64 worker threads for a 70% +/- 5% improvement in 
throughput.

This patch incorporates and supersedes the patches for

https://issues.apache.org/jira/browse/ZOOKEEPER-517
https://issues.apache.org/jira/browse/ZOOKEEPER-1444

New classes introduced in this patch are:

  - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
SessionTrackerImpl used to expire sessions so that the same logic can be used 
to expire connections
  - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
currently only used to throttle rate of logging "out of file descriptors" errors
  - WorkerService: ExecutorService wrapper that makes worker threads daemon 
threads and names them in an easily debuggable manner


> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Attachments: ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, achieved best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache

[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-06 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Description: 
CommitProcessor has a single thread that both pulls requests off its queues and 
runs all downstream processors. This is noticeably inefficient for 
read-intensive workloads, which could be run concurrently. The trick is 
handling write transactions. I propose multi-threading this code according to 
the following two constraints

  - each session must see its requests responded to in order
  - all committed transactions must be handled in zxid order, across all 
sessions

I believe these cover the only constraints we need to honor. In particular, I 
believe we can relax the following:

  - it does not matter if the read request in one session happens before or 
after the write request in another session

With these constraints, I propose the following threads

  - 1    primary queue servicing/work dispatching thread
  - 0-N  assignable worker threads, where a given session is always assigned to 
the same worker thread
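
A sketch of the session-to-worker assignment, under stated assumptions: the class and method names are hypothetical, each worker is modelled as a single-thread executor so that its queue is processed in order, and Math.floorMod is used instead of a plain % so that a negative session ID still maps to a valid index (the description only says "mod").

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: requests from one session always run on the same
// worker, so they are processed in the order they were submitted.
final class SessionAffinitySketch {
    private final List<ExecutorService> workers = new ArrayList<>();

    SessionAffinitySketch(int numWorkers) {
        for (int i = 0; i < numWorkers; i++) {
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    /** sessionId mod number of workers; floorMod keeps the result non-negative. */
    static int workerFor(long sessionId, int numWorkers) {
        return (int) Math.floorMod(sessionId, (long) numWorkers);
    }

    /** Queue a request on the worker that owns this session. */
    void submit(long sessionId, Runnable request) {
        workers.get(workerFor(sessionId, workers.size())).execute(request);
    }

    void shutdown() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }
}
```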

By assigning sessions always to the same worker thread (using a simple 
sessionId mod number of worker threads), we guarantee the first constraint-- 
requests we push onto the thread queue are processed in order. The way we 
guarantee the second constraint is we only allow a single commit transaction to 
be in flight at a time--the queue servicing thread blocks while a commit 
transaction is in flight, and when the transaction completes it clears the flag.
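
The "single commit in flight" rule could be sketched with a flag guarded by a monitor, as below. This is an illustration only: the names are hypothetical and the patch may well use a different synchronization mechanism.

```java
// Hypothetical sketch: the dispatching thread sets the flag when it hands a
// commit to a worker and blocks in awaitNoCommit() before dispatching anything
// else; the worker clears the flag when the commit has been processed.
final class CommitGateSketch {
    private boolean commitInFlight = false;

    /** Dispatcher: mark that a commit has been handed to a worker. */
    synchronized void startCommit() {
        commitInFlight = true;
    }

    /** Worker: the commit is done, so clear the flag and wake the dispatcher. */
    synchronized void finishCommit() {
        commitInFlight = false;
        notifyAll();
    }

    /** Dispatcher: block while a commit is in flight; false if interrupted. */
    synchronized boolean awaitNoCommit() {
        try {
            while (commitInFlight) {
                wait();
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    synchronized boolean isCommitInFlight() {
        return commitInFlight;
    }
}
```

Because nothing else is dispatched until the flag clears, committed transactions cannot overtake one another, which is how the zxid ordering constraint is kept in this sketch.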

On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
worker threads for a 56% +/- 5% improvement in throughput (this improvement was 
measured on top of that for ZOOKEEPER-1504, not in isolation).

New classes introduced in this patch are:

WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that makes 
worker threads daemon threads and names them in an easily debuggable manner. 
Supports assignable threads (as used here) and non-assignable threads (as used 
by NIOServerCnxnFactory).


  was:
CommitProcessor has a single thread that both pulls requests off its queues and 
runs all downstream processors. This is noticeably inefficient for 
read-intensive workloads, which could be run concurrently. The trick is 
handling write transactions. I propose multi-threading this code according to 
the following two constraints

  - each session must see its requests responded to in order
  - all committed transactions must be handled in zxid order, across all 
sessions

I believe these cover the only constraints we need to honor. In particular, I 
believe we can relax the following:

  - it does not matter if the read request in one session happens before or 
after the write request in another session

With these constraints, I propose the following threads

  - 1    primary queue servicing/work dispatching thread
  - 0-N  assignable worker threads, where a given session is always assigned to 
the same worker thread

By assigning sessions always to the same worker thread (using a simple 
sessionId mod number of worker threads), we guarantee the first 
constraint--requests we push onto the thread queue are processed in order. The 
way we guarantee the second constraint is we only allow a single commit 
transaction to be in flight at a time--the queue servicing thread blocks while 
a commit transaction is in flight, and when the transaction completes it clears 
the flag.

On a 32 core machine running Linux 2.6.38, achieved best performance with 32 
worker threads for a 56% +/- 5% improvement in throughput (this improvement was 
measured on top of that for ZOOKEEPER-1504, not in isolation).

New classes introduced in this patch are:

WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that makes 
worker threads daemon threads and names them in an easily debuggable manner. 
Supports assignable threads (as used here) and non-assignable threads (as used 
by NIOServerCnxnFactory).



> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Attachments: ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>

[jira] [Updated] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-05 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1505:


Attachment: ZOOKEEPER-1505.patch

> Multi-thread CommitProcessor
> 
>
> Key: ZOOKEEPER-1505
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
> Project: ZooKeeper
> Issue Type: Improvement
> Components: server
> Reporter: Jay Shrauner
> Assignee: Jay Shrauner
> Labels: performance, scaling
> Attachments: ZOOKEEPER-1505.patch
>
>
> CommitProcessor has a single thread that both pulls requests off its queues 
> and runs all downstream processors. This is noticeably inefficient for 
> read-intensive workloads, which could be run concurrently. The trick is 
> handling write transactions. I propose multi-threading this code according to 
> the following two constraints
>   - each session must see its requests responded to in order
>   - all committed transactions must be handled in zxid order, across all 
> sessions
> I believe these cover the only constraints we need to honor. In particular, I 
> believe we can relax the following:
>   - it does not matter if the read request in one session happens before or 
> after the write request in another session
> With these constraints, I propose the following threads
>   - 1    primary queue servicing/work dispatching thread
>   - 0-N  assignable worker threads, where a given session is always assigned 
> to the same worker thread
> By assigning sessions always to the same worker thread (using a simple 
> sessionId mod number of worker threads), we guarantee the first 
> constraint--requests we push onto the thread queue are processed in order. 
> The way we guarantee the second constraint is we only allow a single commit 
> transaction to be in flight at a time--the queue servicing thread blocks 
> while a commit transaction is in flight, and when the transaction completes 
> it clears the flag.
> On a 32 core machine running Linux 2.6.38, we achieved the best performance with 32 
> worker threads for a 56% +/- 5% improvement in throughput (this improvement 
> was measured on top of that for ZOOKEEPER-1504, not in isolation).
> New classes introduced in this patch are:
> WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that 
> makes worker threads daemon threads and names them in an easily debuggable 
> manner. Supports assignable threads (as used here) and non-assignable threads 
> (as used by NIOServerCnxnFactory).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Created] (ZOOKEEPER-1505) Multi-thread CommitProcessor

2012-07-05 Thread Jay Shrauner (JIRA)

Jay Shrauner created ZOOKEEPER-1505:
---

 Summary: Multi-thread CommitProcessor
 Key: ZOOKEEPER-1505
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1505
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Jay Shrauner
Assignee: Jay Shrauner
 Attachments: ZOOKEEPER-1505.patch

CommitProcessor has a single thread that both pulls requests off its queues and 
runs all downstream processors. This is noticeably inefficient for 
read-intensive workloads, which could be run concurrently. The trick is 
handling write transactions. I propose multi-threading this code according to 
the following two constraints

  - each session must see its requests responded to in order
  - all committed transactions must be handled in zxid order, across all 
sessions

I believe these cover the only constraints we need to honor. In particular, I 
believe we can relax the following:

  - it does not matter if the read request in one session happens before or 
after the write request in another session

With these constraints, I propose the following threads

  - 1 primary queue servicing/work dispatching thread
  - 0-N  assignable worker threads, where a given session is always assigned to 
the same worker thread

By assigning sessions always to the same worker thread (using a simple 
sessionId mod number of worker threads), we guarantee the first 
constraint--requests we push onto the thread queue are processed in order. The 
way we guarantee the second constraint is we only allow a single commit 
transaction to be in flight at a time--the queue servicing thread blocks while 
a commit transaction is in flight, and when the transaction completes it clears 
the flag.
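
The session-to-worker mapping described above can be sketched roughly as 
follows. This is only an illustration, not code from the patch: the class and 
method names are hypothetical, and it assumes at least one worker thread (the 
zero-worker case is not covered).

```java
// Hypothetical sketch: names are illustrative, not taken from the patch.
final class SessionWorkerAssignment {
    private SessionWorkerAssignment() {}

    /** Maps a session to a fixed worker index in [0, numWorkers). */
    static int workerFor(long sessionId, int numWorkers) {
        if (numWorkers <= 0) {
            throw new IllegalArgumentException("numWorkers must be positive");
        }
        // floorMod keeps the result non-negative even for negative session ids.
        return (int) Math.floorMod(sessionId, (long) numWorkers);
    }
}
```

Because the index depends only on the session id, every request from a given 
session lands on the same worker queue, which is what preserves per-session 
ordering.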

On a 32 core machine running Linux 2.6.38, we achieved the best performance with 
32 worker threads, for a 56% +/- 5% improvement in throughput (this improvement 
was measured on top of that for ZOOKEEPER-1504, not in isolation).

New classes introduced in this patch are:

WorkerService (also in ZOOKEEPER-1504): ExecutorService wrapper that makes 
worker threads daemon threads and names them in an easily debuggable manner. 
Supports assignable threads (as used here) and non-assignable threads (as used 
by NIOServerCnxnFactory).
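
As a rough idea of what "daemon threads with debuggable names" can look like 
on top of a standard ExecutorService, here is a minimal sketch. It is not the 
WorkerService from the patch; NamedDaemonThreadFactory and newPool are 
hypothetical names, and the "<prefix>-N" naming scheme is an assumption.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the real WorkerService in the patch may differ.
final class NamedDaemonThreadFactory implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger nextId = new AtomicInteger(1);

    NamedDaemonThreadFactory(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        Thread t = new Thread(task, prefix + "-" + nextId.getAndIncrement());
        t.setDaemon(true); // daemon threads do not block JVM exit
        return t;
    }

    /** Builds a fixed-size pool whose threads are daemons named "<prefix>-N". */
    static ExecutorService newPool(String prefix, int threads) {
        return Executors.newFixedThreadPool(threads, new NamedDaemonThreadFactory(prefix));
    }
}
```

Assignable threads could be layered on top of this by keeping one 
single-threaded pool per worker index, though that part is not shown here.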



[jira] [Assigned] (ZOOKEEPER-1147) Add support for local sessions

2012-07-05 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner reassigned ZOOKEEPER-1147:
---

Assignee: Jay Shrauner

> Add support for local sessions
> --
>
> Key: ZOOKEEPER-1147
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1147
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>Affects Versions: 3.3.3
>Reporter: Vishal Kathuria
>    Assignee: Jay Shrauner
>  Labels: api-change, scaling
> Fix For: 3.5.0
>
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> This improvement is in the bucket of making ZooKeeper work at a large scale. 
> We are planning on having about 1 million clients connect to a ZooKeeper 
> ensemble through a set of 50-100 observers. The majority of these clients are 
> read only, i.e. they do not do any updates or create ephemeral nodes.
> In ZooKeeper today, the client creates a session and the session creation is 
> handled like any other update. In the above use case, the session create/drop 
> workload can easily overwhelm an ensemble. The following is a proposal for a 
> "local session", to support a larger number of connections.
> 1.   The idea is to introduce a new type of session - "local" session. A 
> "local" session doesn't have the full functionality of a normal session.
> 2.   Local sessions cannot create ephemeral nodes.
> 3.   Once a local session is lost, you cannot re-establish it using the 
> session-id/password. The session and its watches are gone for good.
> 4.   When a local session connects, the session info is only maintained 
> on the zookeeper server (in this case, an observer) that it is connected to. 
> The leader is not aware of the creation of such a session and there is no 
> state written to disk.
> 5.   The pings and expiration are handled by the server that the session 
> is connected to.
> With the above changes, we can make ZooKeeper scale to a much larger number 
> of clients without making the core ensemble a bottleneck.
> In terms of API, there are two options that are being considered
> 1. Let the client specify at connect time which kind of session they 
> want.
> 2. All sessions connect as local sessions and automatically get promoted to 
> global sessions when they do an operation that requires a global session 
> (e.g. creating an ephemeral node)
> Chubby took the approach of lazily promoting all sessions to global, but I 
> don't think that would work in our case, where we want to keep sessions which 
> never create ephemeral nodes as always local. Option 2 would make it more 
> broadly usable but option 1 would be easier to implement.
> We are thinking of implementing option 1 as the first cut. There would be a 
> client flag, IsLocalSession (much like the current readOnly flag) that would 
> be used to determine whether to create a local session or a global session.


[jira] [Commented] (ZOOKEEPER-517) NIO factory fails to close connections when the number of file handles run out.

2012-07-05 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407571#comment-13407571
 ] 

Jay Shrauner commented on ZOOKEEPER-517:


Superseded and made obsolete by ZOOKEEPER-1504

> NIO factory fails to close connections when the number of file handles run 
> out.
> ---
>
> Key: ZOOKEEPER-517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-517
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.3, 3.5.0
>Reporter: Mahadev konar
>Assignee: Jay Shrauner
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-517.patch
>
>
> The code in NIO factory is such that if we fail to accept a connection for 
> some reason (running out of file handles may be one of them) we do not close 
> the connections that are in CLOSE_WAIT. We need to call an explicit close on 
> these sockets and then close them. One of the solutions might be to move doIO 
> before accept so that we can still close connections even if we cannot accept 
> new ones.


[jira] [Commented] (ZOOKEEPER-1444) Idle session-less connections never time out

2012-07-05 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13407572#comment-13407572
 ] 

Jay Shrauner commented on ZOOKEEPER-1444:
-

Superseded and made obsolete by ZOOKEEPER-1504

> Idle session-less connections never time out
> 
>
> Key: ZOOKEEPER-1444
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.2, 3.4.3, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1444.patch, ZOOKEEPER-1444.patch
>
>
> A socket connection to the server on which a session is not created will 
> never time out. A misbehaving client that opens and leaks connections without 
> creating sessions will hold open file descriptors on the server.
> The existing timeout code is implemented at the session level, but the 
> servers also should track and expire connections at the connection level. 
> Proposed solution is to pull the timeout data structure handling code 
> (hashmap of expiry time to sets of objects, simple monotonically incrementing 
> nextExpirationTime) from SessionTrackerImpl into its own class in order to 
> share it with connection level timeouts to be implemented in 
> NIOServerCnxnFactory. Connections can be assigned a small initial timeout 
> (proposing something small, like 3s) until a session is created, at which 
> point the ServerCnxn session timeout can be used instead.


[jira] [Updated] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-05 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1504:


Attachment: ZOOKEEPER-1504.patch

> Multi-thread NIOServerCnxn
> --
>
> Key: ZOOKEEPER-1504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>  Labels: perfomance
> Attachments: ZOOKEEPER-1504.patch
>
>
> NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
> numbers of clients. This is particularly noticeable when thousands of clients 
> connect. I propose multi-threading this code as follows:
> - 1   acceptor thread, for accepting new connections
> - 1-N selector threads
> - 0-M I/O worker threads
> Numbers of threads are configurable, with defaults scaling according to 
> number of cores. Communication with the selector threads is handled via 
> LinkedBlockingQueues, and connections are permanently assigned to a 
> particular selector thread so that all potentially blocking SelectionKey 
> operations can be performed solely by the selector thread. An ExecutorService 
> is used for the worker threads.
> On a 32 core machine running Linux 2.6.38, we achieved the best performance with 4 
> selector threads and 64 worker threads for a 70% +/- 5% improvement in 
> throughput.
> This patch incorporates and supersedes the patches for
> https://issues.apache.org/jira/browse/ZOOKEEPER-517
> https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> New classes introduced in this patch are:
>   - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
> SessionTrackerImpl used to expire sessions so that the same logic can be used 
> to expire connections
>   - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
> currently only used to throttle rate of logging "out of file descriptors" 
> errors
>   - WorkerService: ExecutorService wrapper that makes worker threads daemon 
> threads and names them in an easily debuggable manner


[jira] [Created] (ZOOKEEPER-1504) Multi-thread NIOServerCnxn

2012-07-05 Thread Jay Shrauner (JIRA)

Jay Shrauner created ZOOKEEPER-1504:
---

 Summary: Multi-thread NIOServerCnxn
 Key: ZOOKEEPER-1504
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1504
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Reporter: Jay Shrauner
Assignee: Jay Shrauner


NIOServerCnxnFactory is single threaded, which doesn't scale well to large 
numbers of clients. This is particularly noticeable when thousands of clients 
connect. I propose multi-threading this code as follows:

- 1   acceptor thread, for accepting new connections
- 1-N selector threads
- 0-M I/O worker threads

Numbers of threads are configurable, with defaults scaling according to number 
of cores. Communication with the selector threads is handled via 
LinkedBlockingQueues, and connections are permanently assigned to a particular 
selector thread so that all potentially blocking SelectionKey operations can be 
performed solely by the selector thread. An ExecutorService is used for the 
worker threads.
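
The hand-off between the acceptor and the selector threads could look roughly 
like the sketch below. It is an illustration only: SelectorHandoff is a 
hypothetical name, the type parameter C stands in for an accepted connection, 
and round-robin selection is an assumption (the description above only says 
that a connection stays with one selector thread).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: C stands in for an accepted connection (e.g. a SocketChannel).
final class SelectorHandoff<C> {
    private final List<LinkedBlockingQueue<C>> queues = new ArrayList<>();
    private int next = 0; // touched only by the single acceptor thread

    SelectorHandoff(int numSelectors) {
        if (numSelectors < 1) {
            throw new IllegalArgumentException("need at least one selector");
        }
        for (int i = 0; i < numSelectors; i++) {
            queues.add(new LinkedBlockingQueue<>());
        }
    }

    /** Called by the acceptor: picks a selector round-robin and enqueues the connection. */
    int assign(C connection) {
        int index = next;
        next = (next + 1) % queues.size();
        queues.get(index).add(connection);
        return index;
    }

    /** Called by selector thread `index`: returns a pending connection, or null if none. */
    C poll(int index) {
        return queues.get(index).poll();
    }
}
```

Since only the owning selector thread drains its queue, it is the only thread 
that ever registers or modifies keys on its selector.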

On a 32 core machine running Linux 2.6.38, we achieved the best performance with 
4 selector threads and 64 worker threads, for a 70% +/- 5% improvement in 
throughput.

This patch incorporates and supersedes the patches for

https://issues.apache.org/jira/browse/ZOOKEEPER-517
https://issues.apache.org/jira/browse/ZOOKEEPER-1444

New classes introduced in this patch are:

  - ExpiryQueue (from ZOOKEEPER-1444): factor out the logic from 
SessionTrackerImpl used to expire sessions so that the same logic can be used 
to expire connections
  - RateLogger (from ZOOKEEPER-517): rate limit error message logging, 
currently only used to throttle rate of logging "out of file descriptors" errors
  - WorkerService: ExecutorService wrapper that makes worker threads daemon 
threads and names them in an easily debuggable manner
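
To give a feel for the ExpiryQueue idea (a map from expiry time to sets of 
objects, plus a monotonically increasing next expiration time, as described in 
ZOOKEEPER-1444), here is a small sketch. It is not the class from the patch: 
BucketedExpiry and its method names are hypothetical, it is not thread-safe, 
and it assumes the caller passes a non-decreasing clock value.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a bucketed expiry queue; not the patch's ExpiryQueue.
final class BucketedExpiry<E> {
    private final Map<E, Long> expiryOf = new HashMap<>();
    private final Map<Long, Set<E>> buckets = new HashMap<>();
    private final long intervalMs;
    private long nextExpirationTime;

    BucketedExpiry(long intervalMs, long nowMs) {
        this.intervalMs = intervalMs;
        this.nextExpirationTime = roundUp(nowMs);
    }

    /** Rounds a time up to the next bucket boundary. */
    private long roundUp(long timeMs) {
        return (timeMs / intervalMs + 1) * intervalMs;
    }

    /** Adds elem, or moves it to the bucket for nowMs + timeoutMs. */
    void touch(E elem, long timeoutMs, long nowMs) {
        long newExpiry = roundUp(nowMs + timeoutMs);
        Long old = expiryOf.put(elem, newExpiry);
        if (old != null && old != newExpiry) {
            Set<E> oldBucket = buckets.get(old);
            if (oldBucket != null) {
                oldBucket.remove(elem);
            }
        }
        buckets.computeIfAbsent(newExpiry, k -> new HashSet<>()).add(elem);
    }

    /** Returns the next bucket's elements if its deadline has passed, then advances. */
    Set<E> poll(long nowMs) {
        if (nowMs < nextExpirationTime) {
            return Collections.emptySet();
        }
        Set<E> expired = buckets.remove(nextExpirationTime);
        nextExpirationTime += intervalMs;
        if (expired == null) {
            return Collections.emptySet();
        }
        for (E e : expired) {
            expiryOf.remove(e);
        }
        return expired;
    }
}
```

The same structure works for sessions and for connections; only the element 
type and the timeout passed to touch differ.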


[jira] [Commented] (ZOOKEEPER-1444) Idle session-less connections never time out

2012-05-02 Thread Jay Shrauner (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267170#comment-13267170
 ] 

Jay Shrauner commented on ZOOKEEPER-1444:
-

The automated test runs are looking pretty flaky. Is this typical? They all 
pass in my client.

> Idle session-less connections never time out
> 
>
> Key: ZOOKEEPER-1444
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.2, 3.4.3, 3.5.0
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1444.patch, ZOOKEEPER-1444.patch
>
>
> A socket connection to the server on which a session is not created will 
> never time out. A misbehaving client that opens and leaks connections without 
> creating sessions will hold open file descriptors on the server.
> The existing timeout code is implemented at the session level, but the 
> servers also should track and expire connections at the connection level. 
> Proposed solution is to pull the timeout data structure handling code 
> (hashmap of expiry time to sets of objects, simple monotonically incrementing 
> nextExpirationTime) from SessionTrackerImpl into its own class in order to 
> share it with connection level timeouts to be implemented in 
> NIOServerCnxnFactory. Connections can be assigned a small initial timeout 
> (proposing something small, like 3s) until a session is created, at which 
> point the ServerCnxn session timeout can be used instead.


[jira] [Updated] (ZOOKEEPER-1444) Idle session-less connections never time out

2012-05-02 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1444:


Attachment: ZOOKEEPER-1444.patch

> Idle session-less connections never time out
> 
>
> Key: ZOOKEEPER-1444
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.2, 3.4.3, 3.5.0
>Reporter: Jay Shrauner
>    Assignee: Jay Shrauner
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1444.patch, ZOOKEEPER-1444.patch
>
>
> A socket connection to the server on which a session is not created will 
> never time out. A misbehaving client that opens and leaks connections without 
> creating sessions will hold open file descriptors on the server.
> The existing timeout code is implemented at the session level, but the 
> servers also should track and expire connections at the connection level. 
> Proposed solution is to pull the timeout data structure handling code 
> (hashmap of expiry time to sets of objects, simple monotonically incrementing 
> nextExpirationTime) from SessionTrackerImpl into its own class in order to 
> share it with connection level timeouts to be implemented in 
> NIOServerCnxnFactory. Connections can be assigned a small initial timeout 
> (proposing something small, like 3s) until a session is created, at which 
> point the ServerCnxn session timeout can be used instead.


[jira] [Updated] (ZOOKEEPER-1444) Idle session-less connections never time out

2012-05-02 Thread Jay Shrauner (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1444:


Attachment: ZOOKEEPER-1444.patch

> Idle session-less connections never time out
> 
>
> Key: ZOOKEEPER-1444
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1444
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.3.2, 3.4.3, 3.5.0
>Reporter: Jay Shrauner
>    Assignee: Jay Shrauner
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1444.patch
>
>
> A socket connection to the server on which a session is not created will 
> never time out. A misbehaving client that opens and leaks connections without 
> creating sessions will hold open file descriptors on the server.
> The existing timeout code is implemented at the session level, but the 
> servers also should track and expire connections at the connection level. 
> Proposed solution is to pull the timeout data structure handling code 
> (hashmap of expiry time to sets of objects, simple monotonically incrementing 
> nextExpirationTime) from SessionTrackerImpl into its own class in order to 
> share it with connection level timeouts to be implemented in 
> NIOServerCnxnFactory. Connections can be assigned a small initial timeout 
> (proposing something small, like 3s) until a session is created, at which 
> point the ServerCnxn session timeout can be used instead.


[jira] [Commented] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-10 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250850#comment-13250850
 ] 

Jay Shrauner commented on ZOOKEEPER-1383:
-

Updated patch to remove extra copies of createQuota and delQuota that were left 
behind in ZooKeeperMain when it was refactored into separate classes in cli/*

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-10 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Attachment: ZOOKEEPER-1383.patch

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Commented] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-10 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250838#comment-13250838
 ] 

Jay Shrauner commented on ZOOKEEPER-1383:
-

To clarify a bit: stats and quota nodes are serialized as strings, and the 
current code in StatsTrack expects them to serialize and parse identically with 
only 2 fields. This makes adding new fields problematic. We need new fields in 
both stats and quota nodes for throughput quotas, and new fields in quota nodes 
for hard limit quotas.

Fortunately, the parsing code in the StatsTrack constructor leaves an opening 
to add new fields if we don't use commas and string them after an extra = sign 
after the 'bytes=4' portion of the string. It's definitely a bit hacky, but it 
works, and it seems the easiest way to allow us to add new fields. I've added 
new checks in the unit tests to verify that the current StatsTrack parser 
continues to parse correctly the new stats/quota nodes.

I've also changed the parser so it can handle unrecognized fields, which should 
make it easier to add new quota types in the future without these backwards 
compatibility issues.
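
For illustration, a parser that tolerates unrecognized fields might look 
something like the sketch below. This is not the StatsTrack code from the 
patch and it does not reproduce the extended format used there; it only 
handles plain comma-separated key=value strings such as "count=5,bytes=4". 
TolerantStatsParser and parse are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the StatsTrack code from the patch.
final class TolerantStatsParser {
    private TolerantStatsParser() {}

    /** Parses "key=value,key=value", keeping every field with a numeric value. */
    static Map<String, Long> parse(String stats) {
        Map<String, Long> fields = new LinkedHashMap<>();
        for (String pair : stats.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // malformed token: skip rather than fail
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            try {
                fields.put(key, Long.parseLong(value));
            } catch (NumberFormatException e) {
                // Non-numeric value: ignore this field and keep going.
            }
        }
        return fields;
    }
}
```

A caller reads only the keys it knows about (for example "count" and "bytes") 
and simply ignores anything else in the map, so a newer server can add fields 
without breaking it.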

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-09 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Release Note: Adds support for throughput quotas (soft and hard) and hard 
node count and hard size quotas. Parses quota nodes from older versions of the 
server and preserves behavior of existing quotas (soft node count and soft 
size).  (was: Adds support for throughput quotas (soft and hard) and hard node 
count and hard size quotas. Parses quota nodes from older versions of the 
server and preserves behavior of existing quotas (soft node count and soft 
size). Stat and quota nodes created with this version of the server are not 
parseable by older versions and thus downgrading is not supported.)

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-09 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Attachment: ZOOKEEPER-1383.patch

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-09 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Attachment: (was: ZOOKEEPER-1383.patch)

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-09 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Attachment: ZOOKEEPER-1383.patch

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch, 
> ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Commented] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-04-09 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250375#comment-13250375
 ] 

Jay Shrauner commented on ZOOKEEPER-1383:
-

I looked at the old StatsNode deserialization code and it is possible to make 
this backwards compatible by exploiting some quirks in the old parsing code. 
The old stats nodes were serialized as

"count=5,bytes=4"

We can add new quota/stats types using the following format

"count=5,bytes=4=;bytesPerSec=100;countHardLimit=10"

The old parsing code will assert if there isn't exactly 1 ',', and it expects 
(but doesn't verify) that the first element is count and the second is bytes. 
So we put those two fields first and use semi-colons to separate all the new 
fields. The extra '=' after the bytes field is what tricks the old parsing into 
ignoring all the new fields.
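
A minimal sketch of why the extra '=' matters, assuming the old parser behaves as described above (exactly one ',' required, then the token after the first '=' in each half is taken as the number). The class and method names here are hypothetical stand-ins, not the actual ZooKeeper source, and the stand-in throws an exception where the comment above says the old code asserts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StatsFormatSketch {
    // Simplified stand-in for the OLD parser (assumption, not the real code):
    // requires exactly one ',' and reads the token after the first '='.
    static long[] parseOld(String stats) {
        String[] split = stats.split(",");
        if (split.length != 2) {
            throw new IllegalArgumentException("invalid string " + stats);
        }
        long count = Long.parseLong(split[0].split("=")[1]);
        long bytes = Long.parseLong(split[1].split("=")[1]);
        return new long[] {count, bytes};
    }

    // Hypothetical NEW parser: the first ';'-separated section holds the
    // legacy "count=..,bytes=..=" pair, later sections hold one field each.
    static Map<String, Long> parseNew(String stats) {
        Map<String, Long> fields = new LinkedHashMap<>();
        String[] sections = stats.split(";");
        for (String kv : sections[0].split(",")) {
            // "bytes=4=".split("=") drops the trailing empty token
            String[] parts = kv.split("=");
            fields.put(parts[0], Long.parseLong(parts[1]));
        }
        for (int i = 1; i < sections.length; i++) {
            String[] parts = sections[i].split("=");
            fields.put(parts[0], Long.parseLong(parts[1]));
        }
        return fields;
    }
}
```

With this stand-in, the old parser reads count and bytes from the new string and never looks at the rest. Without the extra '=', the token after "bytes=" would be "4;bytesPerSec" and the numeric parse would fail.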

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Created] (ZOOKEEPER-1444) Idle session-less connections never time out

2012-04-09 Thread Jay Shrauner (Created) (JIRA)

Idle session-less connections never time out


 Key: ZOOKEEPER-1444
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1444
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3, 3.3.2, 3.5.0
Reporter: Jay Shrauner
Assignee: Jay Shrauner
 Fix For: 3.5.0


A socket connection to the server on which a session is not created will never 
time out. A misbehaving client that opens and leaks connections without 
creating sessions will hold open file descriptors on the server.

The existing timeout code is implemented at the session level, but the servers 
also should track and expire connections at the connection level. Proposed 
solution is to pull the timeout data structure handling code (hashmap of expiry 
time to sets of objects, simple monotonically incrementing nextExpirationTime) 
from SessionTrackerImpl into its own class in order to share it with connection 
level timeouts to be implemented in NIOServerCnxnFactory. Connections can be 
assigned a short initial timeout (proposing something like 3s) until a 
session is created, at which point the ServerCnxn session timeout can be used 
instead.
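
A rough sketch of the shared timeout structure described above: a hashmap from expiry time to sets of objects, plus a monotonically increasing nextExpirationTime. The class name, method names, and the rounding of expiry times to a fixed interval are assumptions for illustration, not the code in SessionTrackerImpl.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class ExpiryBucketsSketch<E> {
    private final long intervalMs;
    // expiry time (rounded to the interval) -> objects expiring at that time
    private final Map<Long, Set<E>> buckets = new HashMap<>();
    private final Map<E, Long> expiryOf = new HashMap<>();
    private long nextExpirationTime;

    ExpiryBucketsSketch(long intervalMs, long nowMs) {
        this.intervalMs = intervalMs;
        this.nextExpirationTime = roundUp(nowMs);
    }

    private long roundUp(long timeMs) {
        return (timeMs / intervalMs + 1) * intervalMs;
    }

    // Add the object or push its expiry out, e.g. on activity or when a
    // session is created and a longer timeout applies.
    void touch(E elem, long nowMs, long timeoutMs) {
        long newExpiry = Math.max(roundUp(nowMs + timeoutMs), nextExpirationTime);
        Long old = expiryOf.put(elem, newExpiry);
        if (old != null) {
            if (old == newExpiry) {
                return; // already in the right bucket
            }
            Set<E> prev = buckets.get(old);
            if (prev != null) {
                prev.remove(elem);
            }
        }
        buckets.computeIfAbsent(newExpiry, k -> new HashSet<>()).add(elem);
    }

    // Return everything whose bucket time has passed and advance the clock.
    Set<E> poll(long nowMs) {
        Set<E> expired = new HashSet<>();
        while (nextExpirationTime <= nowMs) {
            Set<E> bucket = buckets.remove(nextExpirationTime);
            if (bucket != null) {
                for (E e : bucket) {
                    expiryOf.remove(e);
                }
                expired.addAll(bucket);
            }
            nextExpirationTime += intervalMs;
        }
        return expired;
    }
}
```

In this sketch a new connection would be added with the short initial timeout and touched again with the session timeout once a session exists; whatever poll returns would be closed by the caller.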


[jira] [Commented] (ZOOKEEPER-517) NIO factory fails to close connections when the number of file handles run out.

2012-04-09 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250347#comment-13250347
 ] 

Jay Shrauner commented on ZOOKEEPER-517:


How tested:
  - set file descriptor limit low on server, started server
  - opened connections until no longer able to establish session
  - opened additional connections to set up large pool of pending accepts
  - closed connected session
  - verified pending accepts made it through

For max connections per client testing, set a low value and kept opening more 
than that number of connections from the same IP; verified that sockets are no 
longer parked in CLOSE_WAIT.

> NIO factory fails to close connections when the number of file handles run 
> out.
> ---
>
> Key: ZOOKEEPER-517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-517
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.3, 3.5.0
>Reporter: Mahadev konar
>Assignee: Jay Shrauner
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-517.patch
>
>
> The code in NIO factory is such that if we fail to accept a connection due to 
> some reasons (too many file handles maybe one of them) we do not close the 
> connections that are in CLOSE_WAIT. We need to call an explicit close on 
> these sockets and then close them. One of the solutions might be to move doIO 
> before accept so that we can still close connection even if we cannot accept 
> connections.


[jira] [Commented] (ZOOKEEPER-517) NIO factory fails to close connections when the number of file handles run out.

2012-04-09 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250278#comment-13250278
 ] 

Jay Shrauner commented on ZOOKEEPER-517:


- Fixed accept() code to trap errors so the select handler loop doesn't skip 
handling any other IO requests (for example, closing a socket)
- Added rate limiting to error logging. Noticed when stressing the server with 
too many connections that the server was pegged hammering the error logs
- Reproduced the issue Patrick noted and fixed closing of excess connections 
from a single client so they wouldn't park in CLOSE_WAIT
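
For illustration only, a small sketch of one way to rate limit error logging as mentioned in the second item. This is not the code in the attached patch; the class name, the offer method, and the minimum-interval policy are all assumptions.

```java
class RateLimitedLogSketch {
    private final long minIntervalMs;
    private long lastLogMs;
    private boolean loggedOnce = false;
    private long suppressed = 0;

    RateLimitedLogSketch(long minIntervalMs) {
        this.minIntervalMs = minIntervalMs;
    }

    // Returns the text to log now, or null if this message should be dropped
    // because the previous one was logged less than minIntervalMs ago.
    String offer(String msg, long nowMs) {
        if (loggedOnce && nowMs - lastLogMs < minIntervalMs) {
            suppressed++;
            return null;
        }
        String out = suppressed > 0
                ? msg + " (" + suppressed + " similar messages suppressed)"
                : msg;
        loggedOnce = true;
        lastLogMs = nowMs;
        suppressed = 0;
        return out;
    }
}
```

The caller would pass the returned string to its logger when it is non-null, so a burst of accept failures produces one log line per interval instead of one per failure.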

> NIO factory fails to close connections when the number of file handles run 
> out.
> ---
>
> Key: ZOOKEEPER-517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-517
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Mahadev konar
>Assignee: Benjamin Reed
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-517.patch
>
>
> The code in NIO factory is such that if we fail to accept a connection due to 
> some reasons (too many file handles maybe one of them) we do not close the 
> connections that are in CLOSE_WAIT. We need to call an explicit close on 
> these sockets and then close them. One of the solutions might be to move doIO 
> before accept so that we can still close connection even if we cannot accept 
> connections.


[jira] [Updated] (ZOOKEEPER-517) NIO factory fails to close connections when the number of file handles run out.

2012-04-09 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-517:
---

Attachment: ZOOKEEPER-517.patch

> NIO factory fails to close connections when the number of file handles run 
> out.
> ---
>
> Key: ZOOKEEPER-517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-517
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Reporter: Mahadev konar
>Assignee: Benjamin Reed
>Priority: Critical
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-517.patch
>
>
> The code in NIO factory is such that if we fail to accept a connection due to 
> some reasons (too many file handles maybe one of them) we do not close the 
> connections that are in CLOSE_WAIT. We need to call an explicit close on 
> these sockets and then close them. One of the solutions might be to move doIO 
> before accept so that we can still close connection even if we cannot accept 
> connections.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-02-07 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Attachment: ZOOKEEPER-1383.patch

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch, ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Commented] (ZOOKEEPER-1384) test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard location

2012-02-01 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198011#comment-13198011
 ] 

Jay Shrauner commented on ZOOKEEPER-1384:
-

This is a change solely to build.xml, so a test isn't appropriate.

I ran ant test using two dev setups, one in which the path pointed to a gcc in 
/usr/bin and with no LD_LIBRARY_PATH set, and the other using a path pointing 
to a gcc in /usr/local/bin with LD_LIBRARY_PATH set to 
/usr/local/lib64:/usr/local/lib. Both compiled and ran the tests fine.

> test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard 
> location
> ---
>
> Key: ZOOKEEPER-1384
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1384
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, tests
>Affects Versions: 3.4.2
> Environment: Linux
>Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Attachments: ZOOKEEPER-1384.patch
>
>
> On Linux with gcc installed in /usr/local and the libs in /usr/local/lib64, 
> test-core-cppunit fails because zktest-st is unable to find the right 
> libstdc++.
> build.xml is overriding the environment LD_LIBRARY_PATH instead of appending 
> to it. This should be changed to match the treatment of PATH by appending the 
> desired extra path.


[jira] [Assigned] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-01-31 Thread Jay Shrauner (Assigned) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner reassigned ZOOKEEPER-1383:
---

Assignee: Jay Shrauner

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Commented] (ZOOKEEPER-1384) test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard location

2012-01-31 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197523#comment-13197523
 ] 

Jay Shrauner commented on ZOOKEEPER-1384:
-

Thanks--I was confused and thought I used "submit patch" after people had 
approved it.

> test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard 
> location
> ---
>
> Key: ZOOKEEPER-1384
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1384
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, tests
>Affects Versions: 3.4.2
>     Environment: Linux
>    Reporter: Jay Shrauner
>Assignee: Jay Shrauner
>Priority: Minor
> Attachments: ZOOKEEPER-1384.patch
>
>
> On Linux with gcc installed in /usr/local and the libs in /usr/local/lib64, 
> test-core-cppunit fails because zktest-st is unable to find the right 
> libstdc++.
> build.xml is overriding the environment LD_LIBRARY_PATH instead of appending 
> to it. This should be changed to match the treatment of PATH by appending the 
> desired extra path.


[jira] [Commented] (ZOOKEEPER-1384) test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard location

2012-01-31 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197449#comment-13197449
 ] 

Jay Shrauner commented on ZOOKEEPER-1384:
-

Please review this change.

> test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard 
> location
> ---
>
> Key: ZOOKEEPER-1384
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1384
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, tests
>Affects Versions: 3.4.2
> Environment: Linux
>Reporter: Jay Shrauner
>Priority: Minor
> Attachments: ZOOKEEPER-1384.patch
>
>
> On Linux with gcc installed in /usr/local and the libs in /usr/local/lib64, 
> test-core-cppunit fails because zktest-st is unable to find the right 
> libstdc++.
> build.xml is overriding the environment LD_LIBRARY_PATH instead of appending 
> to it. This should be changed to match the treatment of PATH by appending the 
> desired extra path.


[jira] [Updated] (ZOOKEEPER-1384) test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard location

2012-01-31 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1384:


Attachment: ZOOKEEPER-1384.patch

> test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard 
> location
> ---
>
> Key: ZOOKEEPER-1384
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1384
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, tests
>Affects Versions: 3.4.2
> Environment: Linux
>Reporter: Jay Shrauner
>Priority: Minor
> Attachments: ZOOKEEPER-1384.patch
>
>
> On Linux with gcc installed in /usr/local and the libs in /usr/local/lib64, 
> test-core-cppunit fails because zktest-st is unable to find the right 
> libstdc++.
> build.xml is overriding the environment LD_LIBRARY_PATH instead of appending 
> to it. This should be changed to match the treatment of PATH by appending the 
> desired extra path.


[jira] [Created] (ZOOKEEPER-1384) test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard location

2012-01-31 Thread Jay Shrauner (Created) (JIRA)

test-cppunit overrides LD_LIBRARY_PATH and fails if gcc is in non-standard 
location
---

 Key: ZOOKEEPER-1384
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1384
 Project: ZooKeeper
  Issue Type: Bug
  Components: build, tests
Affects Versions: 3.4.2
 Environment: Linux
Reporter: Jay Shrauner
Priority: Minor
 Attachments: ZOOKEEPER-1384.patch

On Linux with gcc installed in /usr/local and the libs in /usr/local/lib64, 
test-core-cppunit fails because zktest-st is unable to find the right libstdc++.

build.xml is overriding the environment LD_LIBRARY_PATH instead of appending to 
it. This should be changed to match the treatment of PATH by appending the 
desired extra path.


[jira] [Commented] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-01-31 Thread Jay Shrauner (Commented) (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13197424#comment-13197424
 ] 

Jay Shrauner commented on ZOOKEEPER-1383:
-

Provided the idea sounds good, please review the attached change.

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Updated] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-01-31 Thread Jay Shrauner (Updated) (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jay Shrauner updated ZOOKEEPER-1383:


Attachment: ZOOKEEPER-1383.patch

> Create update throughput quotas and add hard quota limits
> -
>
> Key: ZOOKEEPER-1383
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
> Project: ZooKeeper
>  Issue Type: New Feature
>  Components: server
>    Reporter: Jay Shrauner
>Priority: Minor
> Fix For: 3.5.0
>
> Attachments: ZOOKEEPER-1383.patch
>
>
> Quotas exist for size (node count and size in bytes); it would be useful to 
> track and support quotas on update throughput (bytes per second) as well. 
> This can be tracked on both a node/subtree level for quota support as well as 
> on the server level for monitoring.
> In addition, the existing quotas log a warning when they are exceeded but 
> allow the transaction to proceed (soft quotas). It would also be useful to 
> support a corresponding set of hard quota limits that fail the transaction.


[jira] [Created] (ZOOKEEPER-1383) Create update throughput quotas and add hard quota limits

2012-01-31 Thread Jay Shrauner (Created) (JIRA)

Create update throughput quotas and add hard quota limits
-

 Key: ZOOKEEPER-1383
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1383
 Project: ZooKeeper
  Issue Type: New Feature
  Components: server
Reporter: Jay Shrauner
Priority: Minor
 Fix For: 3.5.0


Quotas exist for size (node count and size in bytes); it would be useful to 
track and support quotas on update throughput (bytes per second) as well. This 
can be tracked on both a node/subtree level for quota support as well as on the 
server level for monitoring.

In addition, the existing quotas log a warning when they are exceeded but allow 
the transaction to proceed (soft quotas). It would also be useful to support a 
corresponding set of hard quota limits that fail the transaction.
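
To make the soft/hard distinction concrete, here is a hypothetical sketch of a quota check. The names, the Result enum, and the use of -1 to mean "no limit set" are assumptions for illustration, not part of the attached patch.

```java
class QuotaCheckSketch {
    enum Result { OK, SOFT_EXCEEDED, HARD_EXCEEDED }

    // value could be a node count, a byte size, or a bytes-per-second rate.
    // A negative limit is treated as "no limit" (assumption).
    static Result check(long value, long softLimit, long hardLimit) {
        if (hardLimit >= 0 && value > hardLimit) {
            return Result.HARD_EXCEEDED; // caller would fail the transaction
        }
        if (softLimit >= 0 && value > softLimit) {
            return Result.SOFT_EXCEEDED; // caller would log a warning and proceed
        }
        return Result.OK;
    }
}
```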


