Re: Some thoughts on Zookeeper after using it for a while in the CXF/DOSGi subproject

2009-06-02 Thread davidb
Hi Ben, ... inline ... 2009/5/29 Benjamin Reed : > this is great to hear. it's great to see siblings playing together ;) > >> * In CXF we use Maven to build everything. To depend on Zookeeper we >> need to pull it in from a Maven repository. I couldn't find Zookeeper >> in any main Maven repos, s

ZooKeeper heavy CPU utilisation

2009-06-02 Thread Satish Bhatti
I am running a 5 node ZooKeeper cluster and I noticed that one of them has very high CPU usage: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 6883 infact 22 0 725m 41m 4188 S 95 0.5 5671:54 java It is not "doing anything" application-wise at this

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Ted Dunning
I have seen this when I was over-loading our zookeeper instance. When total data or total number of files gets large, the system can wind up in GC almost permanently. Zookeeper being Zookeeper, it does an amazing job of keeping on, but eventually things go bad. To test if this is your problem, y

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Benjamin Reed
can you attach the jstack output? it seems to be missing from your email. ben Satish Bhatti wrote: I am running a 5 node ZooKeeper cluster and I noticed that one of them has very high CPU usage: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 6883 infact 22

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Satish Bhatti
Hey Ben, Strange you didn't get the attachment, my gmail is showing the paper clip thingy for that message. Anyway, I have pasted the whole jstack output into this email, since it's pretty small. Satish 2009-06-02 11:56:26 Full thread dump Java HotSpot(TM) 64-Bit Server VM (1.6.0_03-b05 mixed mo

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Ted Dunning
Mailing lists like this often strip attachments. On Tue, Jun 2, 2009 at 1:29 PM, Satish Bhatti wrote: > Hey Ben, > Strange you didn't get the attachment, my gmail is showing the paper clip > thingy for that message. Anyway, I have pasted the whole jstack output > into > this email, since it's p

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Satish Bhatti
Hey Ted, I have a grand total of 3 files, each holding a single long! Could it really be in gc hell because of this? I passed in -Xmx256m on the command line. Satish On Tue, Jun 2, 2009 at 12:52 PM, Ted Dunning wrote: > I have seen this when I was over-loading our zookeeper instance. When >

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Ted Dunning
I am sooo glad I put in the fourth item. This is clearly not overload in that sense. Do you have logs from the ZK? What does the stat command return? (Telnet to the ZK in question and type "stat", without the quotes.) On Tue, Jun 2, 2009 at 1:35 PM, Satish Bhatti wrote: > I have a grand total of

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Satish Bhatti
stat Zookeeper version: 3.1.1-755636, built on 03/18/2009 16:52 GMT Clients: /127.0.0.1:42460[1](queued=0,recved=0,sent=0) /172.16.0.178:34283[1](queued=0,recved=925109,sent=925109) Latency min/avg/max: 0/0/220 Received: 925109 Sent: 925109 Outstanding: 0 Zxid: 0x50051c7b3 Mode: follower Node co
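The stat exchange above can also be scripted rather than typed into telnet. The following is a minimal sketch, assuming the server accepts the four-letter command on its client port (2181 in the nc example elsewhere in this thread) and closes the connection after replying. The function names `four_letter_word` and `parse_stat` are illustrative and not part of any ZooKeeper client library.

```python
import socket


def four_letter_word(host, port, command="stat", timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return the reply text.

    Assumes the server closes the connection once the reply is written.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(text):
    """Collect the 'Key: value' summary lines of a stat reply into a dict.

    Per-client lines (they start with '/') and bare headings such as
    'Clients:' are skipped. Values are left as strings.
    """
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("/"):
            continue
        key, sep, value = line.partition(":")
        value = value.strip()
        if sep and value:
            fields[key.strip()] = value
    return fields
```

Against a live server this might be used as `parse_stat(four_letter_word("localhost", 2181))`. Polling it twice a few seconds apart and comparing the `Received` and `Sent` values is one way to see whether the counters are climbing quickly.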

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Mahadev Konar
Hi Satish, Can you attach this trace to a jira? Please open one for this. Also, can you do the following - For all the threads for the zookeeper server you are seeing the problem on, Can you do an strace on all the threads and see which thread is spinning? Also, can you upload the configs of t
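Once a busy OS thread has been identified, it can be tied back to the jstack dump through the `nid` field, which HotSpot prints as the native thread ID in hexadecimal. The sketch below assumes the decimal thread ID was obtained separately (for example from a per-thread process listing, which is not something the thread specifies) and that each thread header line in the dump looks like `"name" ... nid=0x...`. The helper names are hypothetical.

```python
import re

# Matches a jstack thread header line and captures the thread name and hex nid.
NID_PATTERN = re.compile(
    r'^"(?P<name>[^"]+)".*\bnid=0x(?P<nid>[0-9a-fA-F]+)',
    re.MULTILINE,
)


def threads_by_native_id(jstack_text):
    """Map each decimal native thread ID found in a jstack dump to its thread name."""
    return {
        int(match.group("nid"), 16): match.group("name")
        for match in NID_PATTERN.finditer(jstack_text)
    }


def name_of_thread(jstack_text, native_tid):
    """Return the Java thread name for a decimal OS thread ID, or None if absent."""
    return threads_by_native_id(jstack_text).get(native_tid)
```

For example, `name_of_thread(open("jstack.txt").read(), 6884)` would look for a header line containing `nid=0x1ae4`; the file name here is only a placeholder.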

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Ted Dunning
Are the sent and received numbers going up quickly? What is the second client doing? On Tue, Jun 2, 2009 at 1:49 PM, Satish Bhatti wrote: > stat > Zookeeper version: 3.1.1-755636, built on 03/18/2009 16:52 GMT > Clients: > /127.0.0.1:42460[1](queued=0,recved=0,sent=0) > /172.16.0.178:34283[1]

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Satish Bhatti
Created a Jira and attached logfile + jstack file to it. https://issues.apache.org/jira/secure/ManageAttachments.jspa?id=12426974 On Tue, Jun 2, 2009 at 1:50 PM, Mahadev Konar wrote: > Hi Satish, > Can you attach this trace to a jira? Please open one for this. Also, can > you do the following -

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Patrick Hunt
According to your trace I see you are using jvm 1.6.0_03-b05 One of the bugs fixed in: http://java.sun.com/javase/6/webnotes/6u4.html specifically: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933 seems to have a description very close to what you are seeing. Perhaps you can try runnin

Re: ZooKeeper heavy CPU utilisation

2009-06-02 Thread Satish Bhatti
[inf...@df3-8 infact]$ echo stats | nc localhost 2181 Zookeeper version: 3.1.1-755636, built on 03/18/2009 16:52 GMT Clients: /127.0.0.1:33139[1](queued=0,recved=0,sent=0) /172.16.0.178:34283[1](queued=0,recved=925371,sent=925371) Latency min/avg/max: 0/0/220 Received: 925371 Sent: 925382 Outsta

Errors during shutdown/startup of ZooKeeper

2009-06-02 Thread Nitay
Hey guys, We are getting a lot of messages like this in HBase: [junit] 2009-06-02 11:57:23,658 ERROR [NIOServerCxn.Factory:21810] server.NIOServerCnxn(514): Client has seen zxid 0xe our last zxid is 0xd For more context, the block it usually appears in is: [junit] 2009-06-02 13:27:54,083 IN

Re: Errors during shutdown/startup of ZooKeeper

2009-06-02 Thread Mahadev Konar
Hi Nitay, This is not an error but should be a warning. I have opened up a jira for it. http://issues.apache.org/jira/browse/ZOOKEEPER-428 The message just says that a client is connecting to a server that is behind the server it was connected to earlier. The log should be warn and not erro

Re: Errors during shutdown/startup of ZooKeeper

2009-06-02 Thread Nitay
I see. That helps. However, even as warnings, these go on seemingly endlessly. Why do they not get fixed by themselves? What are we doing wrong here? On Tue, Jun 2, 2009 at 2:24 PM, Mahadev Konar wrote: > Hi Nitay, > This is not an error but should be a warning. I have opened up a jira for > it

Re: Errors during shutdown/startup of ZooKeeper

2009-06-02 Thread Mahadev Konar
I think my last message got bounced... They should get fixed automatically. Are you shutting down servers often in your unit test? A client should be able to connect to some other server which is more recent. What's the reason behind your question that it isn't getting fixed by itself? mahade

Re: Errors during shutdown/startup of ZooKeeper

2009-06-02 Thread Patrick Hunt
This log message appears if the client is running ahead of the server. Say you have: 1) client connects to server A and sees some changes 2) client gets disconnected from A and attempts to connect to B 3) B can be running behind A by some number of changes (it will eventually catch up) 4) client will
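The comparison behind that log line can be sketched as follows. This is an illustration of the idea only, not the server's actual code: the function names are hypothetical, and the only detail taken from the thread is the message text and the rule that a client which has seen a higher zxid than the server is "ahead" of it.

```python
def check_client_zxid(client_last_zxid, server_last_zxid):
    """Return the log message if the client is ahead of the server, else None."""
    if client_last_zxid > server_last_zxid:
        return "Client has seen zxid 0x%x our last zxid is 0x%x" % (
            client_last_zxid,
            server_last_zxid,
        )
    return None


def first_acceptable_server(client_last_zxid, server_zxids):
    """Pick the first server that is not behind the client.

    server_zxids maps a server name to its last zxid. Returns None when
    every server is behind, i.e. the client would have to keep retrying
    until one of them catches up.
    """
    for name, zxid in server_zxids.items():
        if check_client_zxid(client_last_zxid, zxid) is None:
            return name
    return None
```

With the values from the HBase log, `check_client_zxid(0xe, 0xd)` reproduces the reported message, while a server that has reached `0xe` or later would be accepted.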