[jira] [Commented] (ZOOKEEPER-1653) zookeeper fails to start because of inconsistent epoch

2013-11-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823388#comment-13823388
 ] 

Flavio Junqueira commented on ZOOKEEPER-1653:
-

If we can't do any of the operations related to the updating file, then we 
shouldn't keep going, right? Say we fail to create the file and the server 
keeps executing. In this case we can fall into the same problem we are 
discussing here. I think we should either throw an exception or exit the 
server. 

 zookeeper fails to start because of inconsistent epoch
 --

 Key: ZOOKEEPER-1653
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1653
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.4.5
Reporter: Michi Mutsuzaki
Assignee: Michi Mutsuzaki
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1653.3.4.patch, ZOOKEEPER-1653.patch, 
 ZOOKEEPER-1653.patch


 It looks like QuorumPeer.loadDataBase() could fail if the server was 
 restarted after zk.takeSnapshot() but before finishing 
 self.setCurrentEpoch(newEpoch) in Learner.java.
 {code:java}
 case Leader.NEWLEADER: // it will be NEWLEADER in v1.0
     zk.takeSnapshot();
     self.setCurrentEpoch(newEpoch); // server got restarted here
     snapshotTaken = true;
     writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), true);
     break;
 {code}
 The server fails to start because currentEpoch is still 1 but the last 
 processed zxid from the snapshot has been updated.
 {noformat}
 2013-02-20 13:45:02,733 5543 [pool-1-thread-1] ERROR org.apache.zookeeper.server.quorum.QuorumPeer  - Unable to load database on disk
 java.io.IOException: The current epoch, 1, is older than the last zxid, 8589934592
     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:439)
     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:413)
     ...
 {noformat}
 {noformat}
 $ find datadir 
 datadir
 datadir/version-2
 datadir/version-2/currentEpoch.tmp
 datadir/version-2/acceptedEpoch
 datadir/version-2/snapshot.0
 datadir/version-2/currentEpoch
 datadir/version-2/snapshot.2
 $ cat datadir/version-2/currentEpoch.tmp
 2%
 $ cat datadir/version-2/acceptedEpoch
 2%
 $ cat datadir/version-2/currentEpoch
 1%
 {noformat}
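
The numbers in the log above can be checked by hand. This is a minimal sketch, assuming the usual zxid layout (epoch in the high 32 bits, counter in the low 32 bits). The class and method names are hypothetical and are not taken from the ZooKeeper source; isConsistent only mirrors the condition described by the error message.

```java
final class ZxidEpochSketch {
    /** Epoch part of a zxid: the high 32 bits. */
    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    /** Counter part of a zxid: the low 32 bits. */
    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    /**
     * Hypothetical stand-in for the startup check: the epoch encoded in the
     * last processed zxid must not be newer than the stored currentEpoch.
     */
    static boolean isConsistent(long currentEpoch, long lastProcessedZxid) {
        return epochOf(lastProcessedZxid) <= currentEpoch;
    }

    public static void main(String[] args) {
        long lastZxid = 8589934592L; // value reported in the log
        System.out.println("epoch=" + epochOf(lastZxid)
                + " counter=" + counterOf(lastZxid));
        System.out.println("consistent with currentEpoch=1: "
                + isConsistent(1L, lastZxid));
        System.out.println("consistent with currentEpoch=2: "
                + isConsistent(2L, lastZxid));
    }
}
```

Since 8589934592 is 2 shifted left by 32 bits, the snapshot already belongs to epoch 2 while the currentEpoch file still holds 1, which is the mismatch the exception reports.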



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (ZOOKEEPER-1813) Zookeeper restart fails due to missing node from snapshot

2013-11-15 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-1813:


Priority: Major  (was: Blocker)

 Zookeeper restart fails due to missing node from snapshot
 -

 Key: ZOOKEEPER-1813
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1813
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.4.5, 3.5.0
Reporter: Vinay
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1813-test.patch


 ZooKeeper restart fails with the following exception:
 {noformat}
 java.io.IOException: Failed to process transaction type: 1 error: KeeperErrorCode = NoNode for /test/subdir2/subdir2/subdir
   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:183)
   at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:222)
   at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:255)
   at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:380)
   at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:748)
   at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:111)
   at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:90)
   at org.apache.zookeeper.server.ZooKeeperServerMainTest$2.run(ZooKeeperServerMainTest.java:218)
 Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /test/subdir2/subdir2/subdir
   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:268)
   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:181)
   ... 7 more
 {noformat}





[jira] [Commented] (ZOOKEEPER-1653) zookeeper fails to start because of inconsistent epoch

2013-11-15 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823477#comment-13823477
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1653:


Ok that sounds reasonable. I'll change the patch so that it throws an 
IOException if any of the operations on the updating file fails.
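
A minimal sketch of the behaviour agreed on in this comment, assuming the "updating file" acts as a marker that is created before the snapshot and epoch update and deleted afterwards. The class name, the CriticalSection interface, and the marker file name are hypothetical and are not taken from the attached patches.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

final class UpdatingMarkerSketch {
    /** Work guarded by the marker; it may itself fail with IOException. */
    interface CriticalSection {
        void run() throws IOException;
    }

    /**
     * Creates the marker, runs the guarded work, then deletes the marker.
     * Any failed file operation is reported as an IOException instead of
     * being ignored, so the caller never continues in a half-updated state.
     */
    static void runWithMarker(File marker, CriticalSection section)
            throws IOException {
        // A marker left over from an earlier crash is tolerated.
        if (!marker.exists() && !marker.createNewFile()) {
            throw new IOException("Failed to create " + marker);
        }
        // If this throws, the marker stays on disk and signals that the
        // update was interrupted.
        section.run();
        if (!marker.delete()) {
            throw new IOException("Failed to delete " + marker);
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = Files.createTempDirectory("epoch-sketch").toFile();
        File marker = new File(dir, "updatingEpoch"); // hypothetical name
        runWithMarker(marker, () ->
                System.out.println("snapshot and epoch update would run here"));
        System.out.println("marker still present: " + marker.exists());
    }
}
```

The design choice in this sketch is that the marker is deliberately not cleaned up when the guarded work fails, so that a later startup could tell an interrupted update apart from genuine corruption.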



[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2013-11-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823483#comment-13823483
 ] 

Germán Blanco commented on ZOOKEEPER-832:
-

There are 10 people watching this issue, surely there must be opinions.
I am sure that [~randgalt], for example, can tell us whether this is OK from 
the client's point of view.
Please speak up.

 Invalid session id causes infinite loop during automatic reconnect
 --

 Key: ZOOKEEPER-832
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1, 3.4.5, 3.5.0
 Environment: All
Reporter: Ryan Holmes
Assignee: Germán Blanco
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
 ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch


 Steps to reproduce:
 1.) Connect to a standalone server using the Java client.
 2.) Stop the server.
 3.) Delete the contents of the data directory (i.e. the persisted session 
 data).
 4.) Start the server.
 The client now automatically tries to reconnect but the server refuses the 
 connection because the session id is invalid. The client and server are now 
 in an infinite loop of attempted and rejected connections. While this 
 situation represents a catastrophic failure and the current behavior is not 
 incorrect, it appears that there is no way to detect this situation on the 
 client and therefore no way to recover.
 The suggested improvement is to send an event to the default watcher 
 indicating that the current state is session invalid, similar to how the 
 session expired state is handled.
 Server log output (repeats indefinitely):
 2010-08-05 11:48:08,283 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@250] - Accepted socket connection from /127.0.0.1:63292
 2010-08-05 11:48:08,284 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@751] - Refusing session request for client /127.0.0.1:63292 as it has seen zxid 0x44 our last zxid is 0x0 client must try another server
 2010-08-05 11:48:08,284 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1434] - Closed socket connection for client /127.0.0.1:63292 (no session established for client)
 Client log output (repeats indefinitely):
 11:47:17 org.apache.zookeeper.ClientCnxn startConnect INFO line 1000 - Opening socket connection to server localhost/127.0.0.1:2181
 11:47:17 org.apache.zookeeper.ClientCnxn run WARN line 1120 - Session 0x12a3ae4e893000a for server null, unexpected error, closing socket connection and attempting reconnect
 java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1167 - Ignoring exception during shutdown input
 java.nio.channels.ClosedChannelException
   at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
   at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
   at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1164)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1174 - Ignoring exception during shutdown output
 java.nio.channels.ClosedChannelException
   at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
   at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
   at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1171)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)





ZooKeeper_branch34 - Build # 794 - Failure

2013-11-15 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34/794/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 177888 lines...]
[junit] 2013-11-15 08:51:25,191 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-15 08:51:25,192 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-15 08:51:25,193 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-15 08:51:25,193 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-15 08:51:25,193 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-15 08:51:25,193 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-15 08:51:25,193 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@224] - 
NIOServerCnxn factory exited run method
[junit] 2013-11-15 08:51:25,194 [myid:] - INFO  [main:ZooKeeperServer@441] 
- shutting down
[junit] 2013-11-15 08:51:25,194 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2013-11-15 08:51:25,194 [myid:] - INFO  
[main:PrepRequestProcessor@761] - Shutting down
[junit] 2013-11-15 08:51:25,194 [myid:] - INFO  
[main:SyncRequestProcessor@209] - Shutting down
[junit] 2013-11-15 08:51:25,194 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited loop!
[junit] 2013-11-15 08:51:25,194 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited!
[junit] 2013-11-15 08:51:25,195 [myid:] - INFO  
[main:FinalRequestProcessor@415] - shutdown of request processor complete
[junit] 2013-11-15 08:51:25,195 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-15 08:51:25,196 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[]
[junit] 2013-11-15 08:51:25,197 [myid:] - INFO  [main:ClientBase@414] - 
STARTING server
[junit] 2013-11-15 08:51:25,197 [myid:] - INFO  [main:ZooKeeperServer@162] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/branch-3.4/build/test/tmp/test4101136614563758108.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34/branch-3.4/build/test/tmp/test4101136614563758108.junit.dir/version-2
[junit] 2013-11-15 08:51:25,198 [myid:] - INFO  
[main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-15 08:51:25,201 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-15 08:51:25,202 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - 
Accepted socket connection from /127.0.0.1:40758
[junit] 2013-11-15 08:51:25,202 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@817] - Processing 
stat command from /127.0.0.1:40758
[junit] 2013-11-15 08:51:25,202 [myid:] - INFO  
[Thread-5:NIOServerCnxn$StatCommand@653] - Stat command output
[junit] 2013-11-15 08:51:25,203 [myid:] - INFO  
[Thread-5:NIOServerCnxn@997] - Closed socket connection for client 
/127.0.0.1:40758 (no session established for client)
[junit] 2013-11-15 08:51:25,203 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-15 08:51:25,204 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-15 08:51:25,204 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-15 08:51:25,204 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-15 08:51:25,205 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-15 08:51:25,205 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
[junit] 2013-11-15 08:51:25,205 [myid:] - INFO  [main:ClientBase@451] - 
tearDown starting
[junit] 2013-11-15 08:51:25,280 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x1425af526d5 closed
[junit] 2013-11-15 08:51:25,280 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-15 08:51:25,280 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down
[junit] 2013-11-15 08:51:25,281 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@224] - 
NIOServerCnxn factory exited run method
[junit] 2013-11-15 08:51:25,281 

[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2013-11-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823502#comment-13823502
 ] 

Flavio Junqueira commented on ZOOKEEPER-832:


bq. still I think that closing the session is a better behaviour than the 
infinite loop

I'm not arguing whether it is better or worse. My problem with it is that it 
changes the semantics of closing a session. The contract is that the server 
side terminates a session only when it expires, but this is proposing to 
terminate it earlier. Could it cause any problem at the client because we are 
terminating the lease earlier?

bq. Closing the session means that the application will receive a CONNECTION 
LOST event.

I think you meant to say SESSION EXPIRED here, no?

bq. There are 10 people watching this issue, surely there must be opinions.

It would be great if the folks who discussed this issue earlier could chime in 
here.



[jira] [Resolved] (ZOOKEEPER-1813) Zookeeper restart fails due to missing node from snapshot

2013-11-15 Thread Vinay (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay resolved ZOOKEEPER-1813.
--

Resolution: Duplicate

Resolving as a duplicate of ZOOKEEPER-1573. 
A patch has been submitted to ZOOKEEPER-1573.



Failed: ZOOKEEPER-1810 PreCommit Build #1769

2013-11-15 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1810
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1769/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 295023 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12614027/ZOOKEEPER-1810.patch
 [exec]   against trunk revision 1541810.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 33 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1769//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1769//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1769//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 96e055574e38868fbb669aec32509688a5b444eb logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 29 minutes 21 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1810
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
2 tests failed.
FAILED:  init.org.apache.zookeeper.test.FLEBackwardElectionRoundTest

Error Message:
org.apache.zookeeper.test.FLEBackwardElectionRoundTest

Stack Trace:
java.lang.ClassNotFoundException: 
org.apache.zookeeper.test.FLEBackwardElectionRoundTest
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)


FAILED:  init.org.apache.zookeeper.test.FLELostMessageTest

Error Message:
org.apache.zookeeper.test.FLELostMessageTest

Stack Trace:
java.lang.ClassNotFoundException: org.apache.zookeeper.test.FLELostMessageTest
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:169)




ZooKeeper-trunk-solaris - Build # 731 - Still Failing

2013-11-15 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/731/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 216118 lines...]
[junit] 2013-11-15 09:41:44,909 [myid:] - INFO  
[main:PrepRequestProcessor@972] - Shutting down
[junit] 2013-11-15 09:41:44,909 [myid:] - INFO  
[main:SyncRequestProcessor@190] - Shutting down
[junit] 2013-11-15 09:41:44,909 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@156] - PrepRequestProcessor exited loop!
[junit] 2013-11-15 09:41:44,909 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@168] - SyncRequestProcessor exited!
[junit] 2013-11-15 09:41:44,910 [myid:] - INFO  
[main:FinalRequestProcessor@442] - shutdown of request processor complete
[junit] 2013-11-15 09:41:44,910 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-15 09:41:44,911 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[]
[junit] 2013-11-15 09:41:44,912 [myid:] - INFO  [main:ClientBase@414] - 
STARTING server
[junit] 2013-11-15 09:41:44,912 [myid:] - INFO  [main:ZooKeeperServer@149] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3183266030717086104.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3183266030717086104.junit.dir/version-2
[junit] 2013-11-15 09:41:44,912 [myid:] - INFO  
[main:NIOServerCnxnFactory@670] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2013-11-15 09:41:44,913 [myid:] - INFO  
[main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-15 09:41:44,914 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3183266030717086104.junit.dir/version-2/snapshot.b
[junit] 2013-11-15 09:41:44,916 [myid:] - INFO  [main:FileTxnSnapLog@297] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test3183266030717086104.junit.dir/version-2/snapshot.b
[junit] 2013-11-15 09:41:44,917 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-11-15 09:41:44,918 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:33193
[junit] 2013-11-15 09:41:44,919 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@828] - Processing stat command from 
/127.0.0.1:33193
[junit] 2013-11-15 09:41:44,919 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@677] - Stat command output
[junit] 2013-11-15 09:41:44,919 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@999] - Closed socket connection for client 
/127.0.0.1:33193 (no session established for client)
[junit] 2013-11-15 09:41:44,919 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-11-15 09:41:44,921 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-11-15 09:41:44,921 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-11-15 09:41:44,921 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-11-15 09:41:44,921 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-15 09:41:44,921 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
[junit] 2013-11-15 09:41:44,922 [myid:] - INFO  [main:ClientBase@451] - 
tearDown starting
[junit] 2013-11-15 09:41:45,001 [myid:] - INFO  [main:ZooKeeper@777] - 
Session: 0x1425b2339ff closed
[junit] 2013-11-15 09:41:45,001 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down
[junit] 2013-11-15 09:41:45,001 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-11-15 09:41:45,002 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2013-11-15 09:41:45,002 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2013-11-15 09:41:45,002 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@219]
 - 

[jira] [Commented] (ZOOKEEPER-1573) Unable to load database due to missing parent node

2013-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823518#comment-13823518
 ] 

Hadoop QA commented on ZOOKEEPER-1573:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12614040/ZOOKEEPER-1573.patch
  against trunk revision 1541810.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770//console

This message is automatically generated.

 Unable to load database due to missing parent node
 --

 Key: ZOOKEEPER-1573
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1573
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3, 3.5.0
Reporter: Thawan Kooburat
 Attachments: ZOOKEEPER-1573.patch


 While replaying the txnlog onto the data tree, the server has code to detect a 
 missing parent node. This code block was last modified as part of 
 ZOOKEEPER-1333. In production, we found a case where this check returns a 
 false positive.
 The sequence of txns is as follows:
 zxid 1:  create /prefix/a
 zxid 2:  create /prefix/a/b
 zxid 3:  delete /prefix/a/b
 zxid 4:  delete /prefix/a
 The server starts capturing a snapshot at zxid 1. However, by the time it has 
 traversed the data tree down to /prefix, txn 4 has already been applied and 
 /prefix has no children. 
 When the server restores from the snapshot, it processes the txnlog starting 
 from zxid 2. This txn generates a missing-parent error and the server refuses 
 to start up.
 The same check allowed me to discover the bug in ZOOKEEPER-1551, but I don't 
 know if we have any option besides removing this check to solve this issue.
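
The scenario in the description can be replayed with a small model. This is
only a sketch under stated assumptions: the class FuzzySnapshotReplay, its
methods, and the use of a TreeSet of paths are hypothetical and are not
ZooKeeper's DataTree or txnlog code. It shows that a strict missing-parent
check flags zxid 2 even though the tree ends up in a consistent state once
zxid 3 and zxid 4 are replayed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical model of the scenario above; not ZooKeeper's DataTree.
final class FuzzySnapshotReplay {
    /** Parent path of a node, e.g. "/prefix/a/b" -> "/prefix/a". */
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash == 0 ? "/" : path.substring(0, slash);
    }

    /**
     * Replays "create"/"delete" txns onto the tree and returns the paths
     * whose create failed the strict missing-parent check.
     */
    static List<String> replay(TreeSet<String> tree, String[][] txns) {
        List<String> missingParent = new ArrayList<>();
        for (String[] txn : txns) {
            String op = txn[0];
            String path = txn[1];
            if (op.equals("create")) {
                if (tree.contains(parentOf(path))) {
                    tree.add(path);
                } else {
                    missingParent.add(path);
                }
            } else {
                tree.remove(path); // deleting an absent node is a no-op in this model
            }
        }
        return missingParent;
    }

    /** Snapshot as serialized in the scenario: /prefix was reached after txn 4. */
    static TreeSet<String> fuzzySnapshot() {
        TreeSet<String> tree = new TreeSet<>();
        tree.add("/");
        tree.add("/prefix");
        return tree;
    }

    /** The txns replayed after restoring the snapshot (zxid 2 to 4). */
    static String[][] txnsAfterSnapshot() {
        return new String[][] {
            {"create", "/prefix/a/b"},
            {"delete", "/prefix/a/b"},
            {"delete", "/prefix/a"},
        };
    }

    public static void main(String[] args) {
        TreeSet<String> tree = fuzzySnapshot();
        List<String> flagged = replay(tree, txnsAfterSnapshot());
        System.out.println("flagged by strict check: " + flagged);
        System.out.println("tree after replay: " + tree);
    }
}
```

In this model the only flagged txn is the create of /prefix/a/b, and the tree
after replay still holds just / and /prefix, which matches the state the
server had after zxid 4.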



--
This message was sent by Atlassian JIRA
(v6.1#6144)


Success: ZOOKEEPER-1573 PreCommit Build #1770

2013-11-15 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1573
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 296386 lines...]
 [exec] BUILD SUCCESSFUL
 [exec] Total time: 0 seconds
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] +1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12614040/ZOOKEEPER-1573.patch
 [exec]   against trunk revision 1541810.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1770//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 65ebf37a767c49e6a79d5a6b19fb4739ad07bf19 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 34 minutes 14 seconds
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1573
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-832) Invalid session id causes infinite loop during automatic reconnect

2013-11-15 Thread JIRA

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823521#comment-13823521
 ] 

Germán Blanco commented on ZOOKEEPER-832:
-

I guess you are right about the SESSION EXPIRED event, I don't know very well 
how this is mapped.

 Invalid session id causes infinite loop during automatic reconnect
 --

 Key: ZOOKEEPER-832
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-832
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.3.1, 3.4.5, 3.5.0
 Environment: All
Reporter: Ryan Holmes
Assignee: Germán Blanco
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, 
 ZOOKEEPER-832.patch, ZOOKEEPER-832.patch, ZOOKEEPER-832.patch


 Steps to reproduce:
 1.) Connect to a standalone server using the Java client.
 2.) Stop the server.
 3.) Delete the contents of the data directory (i.e. the persisted session 
 data).
 4.) Start the server.
 The client now automatically tries to reconnect but the server refuses the 
 connection because the session id is invalid. The client and server are now 
 in an infinite loop of attempted and rejected connections. While this 
 situation represents a catastrophic failure and the current behavior is not 
 incorrect, it appears that there is no way to detect this situation on the 
 client and therefore no way to recover.
 The suggested improvement is to send an event to the default watcher 
 indicating that the current state is session invalid, similar to how the 
 session expired state is handled.
 Server log output (repeats indefinitely):
 2010-08-05 11:48:08,283 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@250] - 
 Accepted socket connection from /127.0.0.1:63292
 2010-08-05 11:48:08,284 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@751] - Refusing 
 session request for client /127.0.0.1:63292 as it has seen zxid 0x44 our last 
 zxid is 0x0 client must try another server
 2010-08-05 11:48:08,284 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1434] - Closed 
 socket connection for client /127.0.0.1:63292 (no session established for 
 client)
 Client log output (repeats indefinitely):
 11:47:17 org.apache.zookeeper.ClientCnxn startConnect INFO line 1000 - 
 Opening socket connection to server localhost/127.0.0.1:2181
 11:47:17 org.apache.zookeeper.ClientCnxn run WARN line 1120 - Session 
 0x12a3ae4e893000a for server null, unexpected error, closing socket 
 connection and attempting reconnect
 java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at 
 sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1078)
 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1167 - Ignoring 
 exception during shutdown input
 java.nio.channels.ClosedChannelException
   at 
 sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
   at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
   at 
 org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1164)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)
 11:47:17 org.apache.zookeeper.ClientCnxn cleanup DEBUG line 1174 - Ignoring 
 exception during shutdown output
 java.nio.channels.ClosedChannelException
   at 
 sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
   at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
   at 
 org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1171)
   at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1129)



[jira] [Commented] (ZOOKEEPER-1814) Reduction of time during Leader election

2013-11-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823538#comment-13823538
 ] 

Flavio Junqueira commented on ZOOKEEPER-1814:
-

Thanks for reporting this issue. The 60 seconds you're referring to is the 
worst case. Making the top interval configurable is not a bug, so it should go 
into 3.5.0. If you want to bump the test case timeout value because it causes 
failures, then please do it in a different jira. In any case, it is odd that 
the test even hits the 60s limit. I think it is premature to blame the 60s 
limit for the test failure.

 Reduction of time during Leader election
 

 Key: ZOOKEEPER-1814
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1814
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.4.5, 3.5.0
Reporter: Daniel Peon
 Fix For: 3.5.0

   Original Estimate: 24h
  Remaining Estimate: 24h

 FastLeaderElection can take a long time because of the exponential backoff. 
 Currently the maximum wait is 60 seconds.
 It would be useful to make this parameter configurable, as is done, for 
 example, for a server shutdown.
 Otherwise it sometimes takes too long, and a test failure has been detected 
 when executing org.apache.zookeeper.server.quorum.QuorumPeerMainTest.
 This test case waits up to 30 seconds, which is less than the 60 seconds 
 that the leader election may be waiting at the moment of shutting down.
 Given the failure in the test case, this issue was considered a 
 possible bug.
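
A capped exponential backoff with a configurable upper bound could look
roughly like the sketch below. The class ElectionBackoff, its method names,
and the 200 ms starting value are assumptions made for illustration; they are
not the fields or methods of FastLeaderElection.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and the starting value are assumptions.
final class ElectionBackoff {
    /** Doubles the wait after each empty poll, never exceeding maxMs. */
    static int next(int currentMs, int maxMs) {
        long doubled = 2L * currentMs; // long arithmetic avoids int overflow
        return (int) Math.min(doubled, (long) maxMs);
    }

    /** Successive waits starting at initialMs, up to and including the first capped one. */
    static List<Integer> schedule(int initialMs, int maxMs) {
        if (initialMs <= 0) {
            throw new IllegalArgumentException("initialMs must be positive");
        }
        List<Integer> waits = new ArrayList<>();
        int wait = initialMs;
        waits.add(wait);
        while (wait < maxMs) {
            wait = next(wait, maxMs);
            waits.add(wait);
        }
        return waits;
    }

    public static void main(String[] args) {
        System.out.println(schedule(200, 60_000)); // cap as described in this issue
        System.out.println(schedule(200, 5_000));  // a lower, configured cap
    }
}
```

With a 60 second cap the wait only reaches the limit after several doublings,
which is consistent with the comment that 60 seconds is the worst case.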



[jira] [Assigned] (ZOOKEEPER-1814) Reduction of time during Leader election

2013-11-15 Thread Daniel Peon (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Peon reassigned ZOOKEEPER-1814:
--

Assignee: Daniel Peon



[jira] [Updated] (ZOOKEEPER-1814) Reduction of waiting time during Fast Leader Election

2013-11-15 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Germán Blanco updated ZOOKEEPER-1814:
-

Summary: Reduction of waiting time during Fast Leader Election  (was: 
Reduction of time during Leader election)



[jira] [Commented] (ZOOKEEPER-1635) Support x64 architecture for Windows

2013-11-15 Thread Joe Gamache (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823696#comment-13823696
 ] 

Joe Gamache commented on ZOOKEEPER-1635:


As for why it matters, read 

here: 
http://mail-archives.apache.org/mod_mbox/zookeeper-user/201307.mbox/%3CCANLc_9LTYw7Q2Zte1vXdJMG_VTuXXK1Un2kvtny1LY=ah3k...@mail.gmail.com%3E

and here: http://osdir.com/ml/java-hadoop-zookeeper-user/2010-03/msg00116.html

So for some people when the Admin Guide says it is not production ready there 
is understandable recalcitrance to usage in a production mode...

 Support x64 architecture for Windows
 

 Key: ZOOKEEPER-1635
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1635
 Project: ZooKeeper
  Issue Type: Improvement
 Environment: Windows x64 systems.
Reporter: Tomas Gutierrez
 Fix For: 3.5.0


 The x64 target does not support inline _asm (see: 
 http://msdn.microsoft.com/en-us/library/4ks26t93(v=vs.80).aspx).
 The proposal is to use a native Windows function, which is valid for both the 
 i386 and x64 architectures.
 To avoid any potential breakage, a compilation directive has been added, but 
 the best option would be to remove the asm part altogether.
 ---
 sample code
 ---
 int32_t fetch_and_add(volatile int32_t* operand, int incr)
 {
 #ifndef WIN32
     int32_t result;
     /* GCC inline asm: atomically add incr to *operand; result gets the old value */
     asm __volatile__(
          "lock xaddl %0,%1\n"
          : "=r"(result), "=m"(*(int *)operand)
          : "0"(incr)
          : "memory");
     return result;
 #else
 #ifdef WIN32_NOASM
     /* InterlockedExchangeAdd returns the value before the add, like the asm branches */
     return InterlockedExchangeAdd((volatile LONG *)operand, incr);
 #else
     volatile int32_t result;
     _asm
     {
         mov eax, operand; // eax = operand
         mov ebx, incr; // ebx = incr
         mov ecx, 0x0; // ecx = 0
         lock xadd dword ptr [eax], ecx; // ecx = old *operand (adds 0)
         lock xadd dword ptr [eax], ebx; // *operand += incr
         mov result, ecx; // result = old value
     }
     return result;
 #endif
 #endif
 }
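
For readers more at home in Java, the contract that fetch_and_add is expected
to honour (add atomically, return the value held before the add) can be
illustrated with java.util.concurrent.atomic.AtomicInteger. This is only an
analogy for the semantics; the class FetchAndAddDemo is hypothetical and has
nothing to do with the C client build.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; illustrates fetch-and-add semantics only.
final class FetchAndAddDemo {
    /** Atomically adds incr and returns the value held before the add. */
    static int fetchAndAdd(AtomicInteger operand, int incr) {
        return operand.getAndAdd(incr);
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(5);
        int before = fetchAndAdd(counter, 3);
        System.out.println(before + " -> " + counter.get());
    }
}
```

The Windows InterlockedExchangeAdd function follows the same convention and
returns the initial value of the operand, so re-reading *operand after the
call yields the new value and may also observe updates from other threads.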



[jira] [Commented] (BOOKKEEPER-223) PendingReadOp tries to read all entries at once

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823703#comment-13823703
 ] 

Ivan Kelly commented on BOOKKEEPER-223:
---

I don't think it needs a new interface as such, but in any case, it can be 
moved to 4.4.0. 

 PendingReadOp tries to read all entries at once
 ---

 Key: BOOKKEEPER-223
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-223
 Project: Bookkeeper
  Issue Type: Bug
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Fix For: 4.4.0


 PendingReadOp tries to read all entries from the bookie ensemble at once, and 
 fills an enumeration with what comes back. This is bad. If we have a ledger 
 with millions of entries, and you try to read the whole thing, your client 
 will crap out. Of course you can get around this by only requesting a little 
 bit at a time, but why doesn't the client do this for you, as we are 
 effectively exposing an iterator interface anyhow?
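
One way to picture "requesting a little bit at a time" behind an iterator is
the sketch below. Everything in it is hypothetical: BatchedEntryIterator and
its RangeReader interface stand in for a ranged ledger read and are not part
of the BookKeeper client API. The sketch also assumes the reader returns a
non-empty list for every valid range.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch; RangeReader stands in for a ranged ledger read.
final class BatchedEntryIterator implements Iterator<Long> {
    /** Reads entries first..last inclusive. */
    interface RangeReader {
        List<Long> read(long first, long last);
    }

    private final RangeReader reader;
    private final long lastEntry;
    private final int batchSize;
    private long nextToFetch;
    private Iterator<Long> current = new ArrayList<Long>().iterator();

    BatchedEntryIterator(RangeReader reader, long firstEntry, long lastEntry, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.reader = reader;
        this.nextToFetch = firstEntry;
        this.lastEntry = lastEntry;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        return current.hasNext() || nextToFetch <= lastEntry;
    }

    @Override
    public Long next() {
        if (!current.hasNext()) {
            if (nextToFetch > lastEntry) {
                throw new NoSuchElementException();
            }
            // Fetch only the next batch, so at most batchSize entries are held at once.
            long end = Math.min(nextToFetch + batchSize - 1, lastEntry);
            current = reader.read(nextToFetch, end).iterator();
            nextToFetch = end + 1;
        }
        return current.next();
    }
}
```

The caller still sees a plain iterator, while memory use is bounded by the
batch size rather than by the length of the ledger.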



[jira] [Commented] (BOOKKEEPER-674) Tooling wishlist

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823706#comment-13823706
 ] 

Ivan Kelly commented on BOOKKEEPER-674:
---

It can be bumped to 4.4. I've no immediate plans to work on this.

 Tooling wishlist
 

 Key: BOOKKEEPER-674
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-674
 Project: Bookkeeper
  Issue Type: Wish
Reporter: Ivan Kelly
 Fix For: 4.3.0


 One of the issues brought up when I was in California was the lack of tooling 
 for bookkeeper. As such, I'm creating this wishlist as a place to discuss 
 tooling and to create a list of the tools missing. Before 4.3.0 we should go 
 through the suggestions and implement the most useful stuff.



[jira] [Resolved] (BOOKKEEPER-277) Commandline administration tool for bookkeeper

2013-11-15 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/BOOKKEEPER-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly resolved BOOKKEEPER-277.
---

   Resolution: Fixed
Fix Version/s: (was: 4.3.0)

 Commandline administration tool for bookkeeper
 --

 Key: BOOKKEEPER-277
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-277
 Project: Bookkeeper
  Issue Type: New Feature
Reporter: Ivan Kelly

 Similar to hedwig console, this would be a tool from which various admin 
 tools could be run, such as bkck, recovery. It could also be used to extract 
 info from the cluster, such as listing ledgers, listing bookies etc. 



[jira] [Commented] (BOOKKEEPER-546) Too many SyncThread and GC thread logs in DEBUG logging level

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823710#comment-13823710
 ] 

Ivan Kelly commented on BOOKKEEPER-546:
---

I'm not sure this is still an issue. I think I fixed most of this with the 
logging cleanup for 4.2.2

 Too many SyncThread and GC thread logs in DEBUG logging level
 -

 Key: BOOKKEEPER-546
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-546
 Project: Bookkeeper
  Issue Type: Improvement
  Components: bookkeeper-server, hedwig-server
Reporter: Jiannan Wang
Assignee: Jiannan Wang

 I find there are too many SyncThread and GC thread logs at the DEBUG logging 
 level, most of which are rarely of interest. These logs make it take longer 
 to find the log lines one is actually interested in, so I suggest disabling 
 logging for these two threads in src/test/resources/log4j.properties by 
 default.



[jira] [Commented] (ZOOKEEPER-1635) Support x64 architecture for Windows

2013-11-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823708#comment-13823708
 ] 

Flavio Junqueira commented on ZOOKEEPER-1635:
-

bq.  when the Admin Guide says it is not production ready there is 
understandable recalcitrance to usage in a production mode

I'm not sure which admin guide you're referring to, I can only see e-mail 
pointers. In any case, we have worked on a couple of patches to fix the windows 
build on both trunk and 3.4 branch.



[jira] [Reopened] (BOOKKEEPER-277) Commandline administration tool for bookkeeper

2013-11-15 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/BOOKKEEPER-277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly reopened BOOKKEEPER-277:
---




[jira] [Resolved] (BOOKKEEPER-572) Make the journal a write ahead log

2013-11-15 Thread Ivan Kelly (JIRA)

 [ 
https://issues.apache.org/jira/browse/BOOKKEEPER-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Kelly resolved BOOKKEEPER-572.
---

   Resolution: Won't Fix
Fix Version/s: (was: 4.3.0)

 Make the journal a write ahead log
 --

 Key: BOOKKEEPER-572
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-572
 Project: Bookkeeper
  Issue Type: Bug
Reporter: Ivan Kelly
Assignee: Ivan Kelly
 Attachments: 
 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 
 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 
 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 
 0001-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 
 0003-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 
 0003-BOOKKEEPER-572-Write-to-the-journal-before-writing-t.patch, 
 BookieServer-2013-02-22.snapshot


 A bookie adds to the LedgerStorage before writing to the journal. This is the 
 fundamental problem behind BOOKKEEPER-447 and blocks a nice solution to 
 BOOKKEEPER-530. By writing to the memory state before the journal, we expose 
 ourselves to bugs if the bookie crashes before we write to the journal. The 
 entry may exist in the index, but not in the entrylog, a situation which 
 cannot be distinguished from an I/O error. The comments on BOOKKEEPER-447 go 
 into more detail. 
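
The journal-first ordering asked for in the title can be sketched as follows.
This is a toy model under stated assumptions: WriteAheadSketch, its in-memory
list and map, and the crashAfterJournal flag are hypothetical and do not
correspond to the bookie's Journal or LedgerStorage classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of journal-first ordering; not bookie code.
final class WriteAheadSketch {
    final List<String> journal = new ArrayList<>();          // stands in for the durable journal
    final Map<Long, String> ledgerStorage = new HashMap<>(); // stands in for index + entry log

    /** Appends to the journal first; storage changes only after the append. */
    void addEntry(long entryId, String data, boolean crashAfterJournal) {
        journal.add(entryId + ":" + data);
        if (crashAfterJournal) {
            throw new IllegalStateException("simulated crash");
        }
        ledgerStorage.put(entryId, data);
    }

    /** Recovery rebuilds storage from the journal, so journaled entries survive a crash. */
    void recover() {
        ledgerStorage.clear();
        for (String record : journal) {
            int colon = record.indexOf(':');
            long entryId = Long.parseLong(record.substring(0, colon));
            ledgerStorage.put(entryId, record.substring(colon + 1));
        }
    }
}
```

With this ordering a crash can leave an entry in the journal that is not yet
in storage, which recovery repairs, but never an entry in storage that the
journal does not know about.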



[jira] [Commented] (BOOKKEEPER-220) Managed Ledger proposal

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823713#comment-13823713
 ] 

Ivan Kelly commented on BOOKKEEPER-220:
---

I think this is out of scope for 4.3.0 so bump it up. It is something I'd like 
to get in next year though as part of an API revamp. I've implemented something 
like managed ledger 4 or 5 times now. I'm sure you have too. It's hard to have 
a master-slave system without something similar. As such, we should extract the 
pattern and offer it to users (as managed ledger does).

 Managed Ledger proposal
 ---

 Key: BOOKKEEPER-220
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-220
 Project: Bookkeeper
  Issue Type: New Feature
  Components: bookkeeper-client
Reporter: Matteo Merli
Assignee: Matteo Merli
 Attachments: 0001-BOOKKEEPER-220-Managed-Ledger-proposal.patch, 
 0001-BOOKKEEPER-220-Managed-Ledger-proposal.patch, 
 0001-BOOKKEEPER-220-Managed-Ledger-proposal.patch, 
 0001-BOOKKEEPER-220-Managed-Ledger-proposal.patch


 The ManagedLedger design is based on our need to manage a set of ledgers, 
 with a single writer (at any point in time) and a set of consumers that read 
 entries from it. 
 The ManagedLedger also takes care of periodically closing ledgers to keep 
 reasonably sized sets of ledgers that can be individually deleted when no 
 longer needed.
 I've put on github the interface proposal (along with an early WIP 
 implementation)
 http://github.com/merlimat/managed-ledger



[jira] [Commented] (BOOKKEEPER-578) LedgerCacheImpl is reserving 1/3 of Heap size but allocates NonHeap memory

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823715#comment-13823715
 ] 

Ivan Kelly commented on BOOKKEEPER-578:
---

I think this is still something that merits more investigation. Not for 4.3.0 
though.

 LedgerCacheImpl is reserving 1/3 of Heap size but allocates NonHeap memory
 --

 Key: BOOKKEEPER-578
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-578
 Project: Bookkeeper
  Issue Type: Bug
  Components: bookkeeper-server
Reporter: Matteo Merli
Priority: Minor
 Fix For: 4.4.0


 By default the page limit parameter is set to -1, which means to assign 1/3 
 of Heap space to the LedgerCache. Each LedgerEntryPage is then allocating the 
 memory outside the heap (ByteBuffer.allocateDirect()).
 This makes BK use more memory than the configured -XmxNN setting. Is there 
 any particular reason for the LedgerEntryPage buffer to be allocated outside 
 the java heap? Could that be changed?
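
The difference between the two allocation styles can be seen with plain
java.nio calls. The class DirectBufferDemo and the 8192-byte page size are
assumptions made for illustration and are not taken from LedgerEntryPage.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; the page size is an assumed value.
final class DirectBufferDemo {
    static final int PAGE_SIZE = 8192;

    /** Allocated outside the Java heap, so it does not count against -Xmx. */
    static ByteBuffer offHeapPage() {
        return ByteBuffer.allocateDirect(PAGE_SIZE);
    }

    /** Backed by a byte[] on the Java heap, so it counts against -Xmx. */
    static ByteBuffer onHeapPage() {
        return ByteBuffer.allocate(PAGE_SIZE);
    }

    public static void main(String[] args) {
        System.out.println("allocateDirect isDirect: " + offHeapPage().isDirect());
        System.out.println("allocate isDirect: " + onHeapPage().isDirect());
    }
}
```

On HotSpot JVMs the total size of direct buffers is bounded separately, by the
-XX:MaxDirectMemorySize option, which is why a page limit derived from the
heap size does not translate into a cap on the process's actual memory use.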



[jira] [Commented] (ZOOKEEPER-1808) Add version to FLE notifications for 3.4 branch

2013-11-15 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823732#comment-13823732
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1808:
---

(Meant FLETestUtils.createMsg() is called again and again with the same 
params).

 Add version to FLE notifications for 3.4 branch
 ---

 Key: ZOOKEEPER-1808
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1808
 Project: ZooKeeper
  Issue Type: Sub-task
Reporter: Flavio Junqueira
Assignee: Flavio Junqueira
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1808.patch, ZOOKEEPER-1808.patch, 
 ZOOKEEPER-1808.patch, ZOOKEEPER-1808.patch, ZOOKEEPER-1808.patch, 
 ZOOKEEPER-1808.patch, ZOOKEEPER-1808.patch, ZOOKEEPER-1808.patch


 Add version to notification messages so that we can differentiate messages 
 during rolling upgrades. This task is for the 3.4 branch only. 



Re: Review Request 15568: See ZOOKEEPER-1810

2013-11-15 Thread Raul Gutierrez Segales

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15568/#review28971
---



./src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
https://reviews.apache.org/r/15568/#comment56020

space: synchronized (self) {



./src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
https://reviews.apache.org/r/15568/#comment56021

Use {} instead of concatenating?



./src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
https://reviews.apache.org/r/15568/#comment56022

Even though this is a critical path the params for LOG.debug aren't 
computed so could probably drop the if(LOG.isDebugEnabled) - your call.

Also - could you change it to use {} instead of concatenating since you are 
already fixing the line :-). 



./src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
https://reviews.apache.org/r/15568/#comment56023

Even though this is a critical path the params for LOG.debug aren't 
computed so could probably drop the if(LOG.isDebugEnabled) - your call.

Also - could you change it to use {} instead of concatenating since you are 
already fixing the line :-). 



./src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java
https://reviews.apache.org/r/15568/#comment56024

Maybe convert to {} instead of concat while we are at it?



./src/java/test/org/apache/zookeeper/server/quorum/FLELostMessageTest.java
https://reviews.apache.org/r/15568/#comment56025

extra newline



./src/java/test/org/apache/zookeeper/server/quorum/FLETestUtils.java
https://reviews.apache.org/r/15568/#comment56026

Let's start pushing towards {} instead of +. 



./src/java/test/org/apache/zookeeper/server/quorum/FLETestUtils.java
https://reviews.apache.org/r/15568/#comment56027

ditto


- Raul Gutierrez Segales


On Nov. 15, 2013, 7 a.m., German Blanco wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/15568/
 ---
 
 (Updated Nov. 15, 2013, 7 a.m.)
 
 
 Review request for zookeeper, fpj and Raul Gutierrez Segales.
 
 
 Bugs: ZOOKEEPER-1810
 https://issues.apache.org/jira/browse/ZOOKEEPER-1810
 
 
 Repository: zookeeper
 
 
 Description
 ---
 
 See ZOOKEEPER-1810
 
 
 Diffs
 -
 
   ./src/java/main/org/apache/zookeeper/server/quorum/FastLeaderElection.java 
 1542171 
   ./src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java 
 1542171 
   ./src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java 1542171 
   ./src/java/main/org/apache/zookeeper/server/quorum/Vote.java 1542171 
   
 ./src/java/test/org/apache/zookeeper/server/quorum/FLEBackwardElectionRoundTest.java
  PRE-CREATION 
   ./src/java/test/org/apache/zookeeper/server/quorum/FLELostMessageTest.java 
 PRE-CREATION 
   ./src/java/test/org/apache/zookeeper/server/quorum/FLETestUtils.java 
 PRE-CREATION 
   ./src/java/test/org/apache/zookeeper/test/FLEBackwardElectionRoundTest.java 
 1542171 
   ./src/java/test/org/apache/zookeeper/test/FLELostMessageTest.java 1542171 
   ./src/java/test/org/apache/zookeeper/test/FLENewEpochTest.java 1542171 
   ./src/java/test/org/apache/zookeeper/test/FLEPredicateTest.java 1542171 
   ./src/java/test/org/apache/zookeeper/test/FLETest.java 1542171 
   ./src/java/test/org/apache/zookeeper/test/FLETestUtils.java 1542171 
   ./src/java/test/org/apache/zookeeper/test/FLEZeroWeightTest.java 1542171 
   ./src/java/test/org/apache/zookeeper/test/LENonTerminateTest.java 1542171 
 
 Diff: https://reviews.apache.org/r/15568/diff/
 
 
 Testing
 ---
 
 Test included.
 
 
 Thanks,
 
 German Blanco
 




[jira] [Updated] (ZOOKEEPER-1786) ZooKeeper data model documentation is incorrect

2013-11-15 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-1786:


Assignee: Niraj Tolia

 ZooKeeper data model documentation is incorrect
 ---

 Key: ZOOKEEPER-1786
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1786
 Project: ZooKeeper
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.4.6
Reporter: Niraj Tolia
Assignee: Niraj Tolia
Priority: Minor
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1786.patch


 When I look at 
 https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkDataModel,
  I see two things that seem wrong in terms of restricted characters:
 * \uXFFFE - \uXFFFF (where X is a digit 1 - E)
 * \uF0000 - \uFFFFF
 These definitions are invalid characters in Java and aren't reflected in 
 PathUtils either (or PathUtilsTest). In fact the code in PathUtils states:
 {code:borderStyle=solid}
 } else if (c > '\u0000' && c <= '\u001f'
 || c >= '\u007f' && c <= '\u009F'
 || c >= '\ud800' && c <= '\uf8ff'
 || c >= '\ufff0' && c <= '\uffff') {
 reason = "invalid charater @" + i;
 break;
 }
 {code}
 Unless I am missing something, this simple patch should fix the documentation 
 problem:
 {code}
 Index: src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml
 ===
 --- src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml (revision 1530514)
 +++ src/docs/src/documentation/content/xdocs/zookeeperProgrammers.xml (working copy)
 @@ -139,8 +139,7 @@
        <listitem>
          <para>The following characters are not allowed: \ud800 - uF8FF,
 -        \uFFF0 - uFFFF, \uXFFFE - \uXFFFF (where X is a digit 1 - E), \uF0000 -
 -        \uFFFFF.</para>
 +        \uFFF0 - uFFFF.</para>
        </listitem>
        <listitem>
 {code}
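
The point about Java can be checked with a short sketch. A Java char is a
16-bit UTF-16 code unit, so a code point above U+FFFF never reaches a per-char
check as a single value; it arrives as a surrogate pair. The class
PathCharCheck and its methods are hypothetical, and only the range test is
modelled on the PathUtils branch quoted above.

```java
// Hypothetical sketch; only the range test mirrors the quoted branch.
final class PathCharCheck {
    /** True if c is in one of the ranges rejected by the quoted branch. */
    static boolean isRejected(char c) {
        return (c > '\u0000' && c <= '\u001f')
                || (c >= '\u007f' && c <= '\u009f')
                || (c >= '\ud800' && c <= '\uf8ff')
                || (c >= '\ufff0' && c <= '\uffff');
    }

    /** True if every UTF-16 code unit of the given code point is rejected. */
    static boolean allUnitsRejected(int codePoint) {
        for (char unit : Character.toChars(codePoint)) {
            if (!isRejected(unit)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // U+1FFFE and U+F0000 lie outside the 16-bit range, so each is a surrogate pair.
        System.out.println("code units for U+1FFFE: " + Character.toChars(0x1FFFE).length);
        System.out.println("U+1FFFE rejected: " + allUnitsRejected(0x1FFFE));
        System.out.println("U+F0000 rejected: " + allUnitsRejected(0xF0000));
        System.out.println("'a' rejected: " + isRejected('a'));
    }
}
```

Both halves of such a pair fall inside the surrogate range that the check
already rejects, so listing the supplementary ranges separately in the
documentation adds nothing for a Java implementation.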



[jira] [Commented] (ZOOKEEPER-1786) ZooKeeper data model documentation is incorrect

2013-11-15 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823891#comment-13823891
 ] 

Flavio Junqueira commented on ZOOKEEPER-1786:
-

B3.4 Committed revision 1542356.



[jira] [Commented] (ZOOKEEPER-1807) Observers spam each other creating connections to the election addr

2013-11-15 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13823924#comment-13823924
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1807:
---

I am happy to give the RB a shipit but I would prefer to have more 
feedback/reviews from [~thawan] and [~fpj] since they are more familiar with 
the internals of FLE. 

 Observers spam each other creating connections to the election addr
 ---

 Key: ZOOKEEPER-1807
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1807
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Raul Gutierrez Segales
Assignee: Alexander Shraer
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1807-alex.patch, ZOOKEEPER-1807-ver2.patch, 
 ZOOKEEPER-1807-ver3.patch, ZOOKEEPER-1807-ver4.patch, 
 ZOOKEEPER-1807-ver5.patch, ZOOKEEPER-1807.patch, notifications-loop.png


 Hey [~shralex],
 I noticed today that my Observers are spamming each other trying to open 
 connections to the election port. I've got tons of these:
 {noformat}
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 9
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 10
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 6
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 12
 2013-11-01 22:19:45,819 - DEBUG [WorkerSender[myid=13]] - There is a 
 connection already for server 14
 {noformat}
 and so on and so on, ad nauseam. 
 Now, looking around I found this inside FastLeaderElection.java from when you 
 committed ZOOKEEPER-107:
 {noformat}
  private void sendNotifications() {
 -for (QuorumServer server : self.getVotingView().values()) {
 -long sid = server.id;
 -
 +for (long sid : self.getAllKnownServerIds()) {
 +QuorumVerifier qv = self.getQuorumVerifier();
 {noformat}
 Is that really desired? I suspect that is what's causing Observers to try to 
 connect to each other (as opposed to just connecting to participants). I'll 
 give it a try now and let you know. (Also, we use observer ids that are  0, 
 and I saw some parts of the code that might not deal with that assumption - 
 so it could be that too..). 
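
To make the difference between the two loops in the diff concrete, here is a small sketch. It does not use any ZooKeeper classes: NotificationTargets, Role, votingOnly and allKnown are hypothetical names, and whether restricting notifications to the voting view is the right fix is exactly the open question in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical model of "who gets an election notification". */
final class NotificationTargets {
    enum Role { PARTICIPANT, OBSERVER }

    /** Targets when iterating only over voting members (the removed loop). */
    static SortedSet<Long> votingOnly(Map<Long, Role> servers) {
        SortedSet<Long> targets = new TreeSet<>();
        for (Map.Entry<Long, Role> e : servers.entrySet()) {
            if (e.getValue() == Role.PARTICIPANT) {
                targets.add(e.getKey());
            }
        }
        return targets;
    }

    /** Targets when iterating over every known server id (the added loop). */
    static SortedSet<Long> allKnown(Map<Long, Role> servers) {
        return new TreeSet<>(servers.keySet());
    }

    public static void main(String[] args) {
        Map<Long, Role> servers = new LinkedHashMap<>();
        servers.put(1L, Role.PARTICIPANT);
        servers.put(2L, Role.PARTICIPANT);
        servers.put(3L, Role.PARTICIPANT);
        servers.put(13L, Role.OBSERVER);
        servers.put(14L, Role.OBSERVER);
        System.out.println(votingOnly(servers)); // [1, 2, 3]
        System.out.println(allKnown(servers));   // [1, 2, 3, 13, 14]
    }
}
```

With the second variant, an observer such as 13 would also try to reach observer 14, which matches the log lines quoted above.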





[jira] [Commented] (ZOOKEEPER-1635) Support x64 architecture for Windows

2013-11-15 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824063#comment-13824063
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1635:


We don't use _asm any more. The current fetch_and_add code looks like this:

{code}
int32_t fetch_and_add(volatile int32_t* operand, int incr)
{
#ifndef WIN32
return __sync_fetch_and_add(operand, incr);
#else
return InterlockedExchangeAdd(operand, incr);
#endif
}
{code}

This should work on 64 bit windows, no?
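
For anyone less familiar with the contract: both __sync_fetch_and_add and InterlockedExchangeAdd return the value the operand held before the addition. As a rough analogue only (in Java rather than C, with a hypothetical wrapper name FetchAndAddDemo), the same semantics look like this:

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Java analogue of fetch-and-add; not part of the ZooKeeper C client. */
final class FetchAndAddDemo {

    /** Atomically adds incr and returns the value held before the addition. */
    static int fetchAndAdd(AtomicInteger operand, int incr) {
        return operand.getAndAdd(incr);
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(5);
        System.out.println(fetchAndAdd(counter, 3)); // 5 (the old value)
        System.out.println(counter.get());           // 8
    }
}
```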

 Support x64 architecture for Windows
 

 Key: ZOOKEEPER-1635
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1635
 Project: ZooKeeper
  Issue Type: Improvement
 Environment: Windows x64 systems.
Reporter: Tomas Gutierrez
 Fix For: 3.5.0


 The x64 target does not support inline _asm (see: 
 http://msdn.microsoft.com/en-us/library/4ks26t93(v=vs.80).aspx).
 The proposal is to use the native Windows function, which is valid for both 
 the i386 and x64 architectures.
 To avoid any potential breakage, a compilation directive has been 
 added, but the best option would be to remove the asm part entirely.
 ---
 sample code
 ---
 int32_t fetch_and_add(volatile int32_t* operand, int incr)
 {
 #ifndef WIN32
     int32_t result;
     asm __volatile__(
          "lock xaddl %0,%1\n"
          : "=r"(result), "=m"(*(int *)operand)
          : "0"(incr)
          : "memory");
     return result;
 #else
 #ifdef WIN32_NOASM
     InterlockedExchangeAdd(operand, incr);
     return *operand;
 #else
     volatile int32_t result;
     _asm
     {
         mov eax, operand; //eax = v;
         mov ebx, incr; // ebx = i;
         mov ecx, 0x0; // ecx = 0;
         lock xadd dword ptr [eax], ecx; 
         lock xadd dword ptr [eax], ebx; 
         mov result, ecx; // result = ebx;
     }
     return result;*/
 #endif
 #endif
 }





[jira] [Created] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread some one (JIRA)
some one created ZOOKEEPER-1815:
---

 Summary: Tolerate incorrectly set system hostname in tests
 Key: ZOOKEEPER-1815
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1815
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Reporter: some one
Priority: Trivial
 Fix For: 3.5.0


A bunch of tests will fail with UnknownHostException errors when the hostname 
is incorrectly set on the system that you are running tests on.
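
One common way to tolerate this kind of failure is to fall back to the loopback address when the local hostname cannot be resolved. The sketch below is only an illustration of that idea, not a description of what the attached patch does; LocalHostResolver, HostLookup and both method names are hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical test helper: resolve the local host, or fall back to loopback. */
final class LocalHostResolver {

    /** Small seam so the lookup can be replaced in tests. */
    interface HostLookup {
        InetAddress lookup() throws UnknownHostException;
    }

    /** Returns the looked-up address, or the loopback address if lookup fails. */
    static InetAddress resolveOrLoopback(HostLookup lookup) {
        try {
            return lookup.lookup();
        } catch (UnknownHostException e) {
            // Hostname is not resolvable on this machine; use loopback instead.
            return InetAddress.getLoopbackAddress();
        }
    }

    /** Convenience wrapper around InetAddress.getLocalHost(). */
    static InetAddress localHostOrLoopback() {
        return resolveOrLoopback(InetAddress::getLocalHost);
    }

    public static void main(String[] args) {
        System.out.println(localHostOrLoopback());
    }
}
```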





[jira] [Updated] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread some one (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

some one updated ZOOKEEPER-1815:


Attachment: ZOOKEEPER-1815.patch

 Tolerate incorrectly set system hostname in tests
 -

 Key: ZOOKEEPER-1815
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1815
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Reporter: some one
Priority: Trivial
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1815.patch


 A bunch of tests will fail with UnknownHostException errors when the hostname 
 is incorrectly set on the system that you are running tests on.





[jira] [Commented] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824109#comment-13824109
 ] 

Hadoop QA commented on ZOOKEEPER-1815:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12614130/ZOOKEEPER-1815.patch
  against trunk revision 1542355.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1771//console

This message is automatically generated.



Failed: ZOOKEEPER-1815 PreCommit Build #1771

2013-11-15 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1815
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1771/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 99 lines...]
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
b/src/java/test/org/apache/zookeeper/test/FLERestartTest.java.rej
 [exec] patching file b/src/java/test/org/apache/zookeeper/test/FLETest.java
 [exec] Hunk #1 FAILED at 468.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
b/src/java/test/org/apache/zookeeper/test/FLETest.java.rej
 [exec] patching file b/src/java/test/org/apache/zookeeper/test/JMXEnv.java
 [exec] Hunk #1 FAILED at 49.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
b/src/java/test/org/apache/zookeeper/test/JMXEnv.java.rej
 [exec] patching file 
b/src/java/test/org/apache/zookeeper/test/NIOConnectionFactoryFdLeakTest.java
 [exec] Hunk #1 FAILED at 51.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
b/src/java/test/org/apache/zookeeper/test/NIOConnectionFactoryFdLeakTest.java.rej
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12614130/ZOOKEEPER-1815.patch
 [exec]   against trunk revision 1542355.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 33 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1771//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 6fa5fa8a75e3244eb57fba1a3b2818e91d39259f logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 1 minute 17 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1815
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Updated] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread some one (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

some one updated ZOOKEEPER-1815:


Attachment: ZOOKEEPER-1815.patch

 Tolerate incorrectly set system hostname in tests
 -

 Key: ZOOKEEPER-1815
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1815
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Reporter: some one
Priority: Trivial
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1815.patch, ZOOKEEPER-1815.patch


 A bunch of tests will fail with UnknownHostException errors when the hostname 
 is incorrectly set on the system that you are running tests on.





[jira] [Created] (ZOOKEEPER-1816) ClientCnxn.close() should block until threads have died

2013-11-15 Thread Jared Winick (JIRA)
Jared Winick created ZOOKEEPER-1816:
---

 Summary: ClientCnxn.close() should block until threads have died
 Key: ZOOKEEPER-1816
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1816
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.5, 3.3.6
Reporter: Jared Winick
Priority: Minor


In the testing of ACCUMULO-1379 and ACCUMULO-1858 it was seen that the 
non-blocking behavior of ClientCnxn.close(), and therefore ZooKeeper.close(), 
can cause a race condition when undeploying an application running in a Java 
container such as JBoss or Tomcat. Because close() returns without 
joining on the sendThread and eventThread, those threads continue to 
execute and clean up while the container is cleaning up the application's 
resources. If the container has unloaded classes by the time this code runs:

{code}
ZooTrace.logTraceMessage(LOG, ZooTrace.getTextTraceLevel(), 
"SendThread exitedloop.");
{code}

then a java.lang.NoClassDefFoundError: org/apache/zookeeper/server/ZooTrace can 
be seen. 
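
As a rough illustration of what a blocking close could look like, here is a self-contained sketch. It is not ClientCnxn: JoiningClient, its two worker threads and the timeout parameter are all assumptions made for the example. The point is only that close() signals the workers and then joins them, so no worker code runs after close() has returned true.

```java
/** Hypothetical client with two worker threads and a blocking close(). */
final class JoiningClient {
    private volatile boolean closing;
    private final Thread sendThread = new Thread(this::runLoop, "send-sketch");
    private final Thread eventThread = new Thread(this::runLoop, "event-sketch");

    JoiningClient() {
        sendThread.start();
        eventThread.start();
    }

    /** Stand-in for the real work loops; exits once closing is set. */
    private void runLoop() {
        while (!closing) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Woken by close(); the loop condition re-checks the flag.
            }
        }
    }

    /**
     * Signals both workers and waits up to timeoutMillis (must be > 0) for
     * each of them. Returns true only if both threads have terminated.
     */
    boolean close(long timeoutMillis) {
        closing = true;
        sendThread.interrupt();
        eventThread.interrupt();
        try {
            sendThread.join(timeoutMillis);
            eventThread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt
        }
        return !sendThread.isAlive() && !eventThread.isAlive();
    }

    public static void main(String[] args) {
        JoiningClient client = new JoiningClient();
        System.out.println(client.close(2000));
    }
}
```

A bounded join is used here so that a stuck worker cannot hang an undeploy forever; whether the real fix should use a timeout is a separate design question.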





[jira] [Commented] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824155#comment-13824155
 ] 

Hadoop QA commented on ZOOKEEPER-1815:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12614132/ZOOKEEPER-1815.patch
  against trunk revision 1542355.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1772//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1772//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1772//console

This message is automatically generated.



[jira] [Commented] (ZOOKEEPER-1573) Unable to load database due to missing parent node

2013-11-15 Thread Thawan Kooburat (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824156#comment-13824156
 ] 

Thawan Kooburat commented on ZOOKEEPER-1573:


We probably need comments from other people as well.  We disable this check in 
our prod system because we have some other way of detecting data inconsistency. 
 This check has been shown to catch a real bug, but it can also raise false 
positives under certain usage patterns.

 Unable to load database due to missing parent node
 --

 Key: ZOOKEEPER-1573
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1573
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3, 3.5.0
Reporter: Thawan Kooburat
 Attachments: ZOOKEEPER-1573.patch


 While replaying the txnlog on the data tree, the server has code to detect a 
 missing parent node. This code block was last modified as part of 
 ZOOKEEPER-1333. In production, we found a case where this check returns a 
 false positive.
 The sequence of txns is as follows:
 zxid 1:  create /prefix/a
 zxid 2:  create /prefix/a/b
 zxid 3:  delete /prefix/a/b
 zxid 4:  delete /prefix/a
 The server starts capturing a snapshot at zxid 1. However, by the time it 
 has traversed the data tree down to /prefix, txn 4 has already been applied 
 and /prefix has no children. 
 When the server restores from the snapshot, it processes the txnlog starting 
 from zxid 2. This txn generates a missing parent error and the server refuses 
 to start up.
 The same check allowed me to discover the bug in ZOOKEEPER-1551, but I don't 
 know if we have any option besides removing this check to solve this issue.  
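
The sequence above can be replayed in a toy model to see why the error is a false positive. Everything here is a simplification made up for illustration (FuzzySnapshotReplay, Txn, a tree modelled as a set of paths); it is not the DataTree code. The fuzzy snapshot contains only "/" and "/prefix", and the replay starts at zxid 2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of replaying a txn log on top of a fuzzy snapshot. */
final class FuzzySnapshotReplay {

    /** A create (create == true) or delete of a single path. */
    static final class Txn {
        final long zxid;
        final boolean create;
        final String path;

        Txn(long zxid, boolean create, String path) {
            this.zxid = zxid;
            this.create = create;
            this.path = path;
        }
    }

    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return slash <= 0 ? "/" : path.substring(0, slash);
    }

    /**
     * Applies every txn with zxid >= fromZxid to tree and returns the zxids of
     * creates whose parent was missing. Replay continues after such a txn.
     */
    static List<Long> replay(Set<String> tree, List<Txn> log, long fromZxid) {
        List<Long> missingParent = new ArrayList<>();
        for (Txn t : log) {
            if (t.zxid < fromZxid) {
                continue;
            }
            if (!t.create) {
                tree.remove(t.path);
            } else if (tree.contains(parentOf(t.path))) {
                tree.add(t.path);
            } else {
                missingParent.add(t.zxid);
            }
        }
        return missingParent;
    }

    /** The four txns from the report. */
    static List<Txn> exampleLog() {
        return Arrays.asList(
            new Txn(1, true, "/prefix/a"),
            new Txn(2, true, "/prefix/a/b"),
            new Txn(3, false, "/prefix/a/b"),
            new Txn(4, false, "/prefix/a"));
    }

    /** What the fuzzy snapshot saw: txn 4 was already applied under /prefix. */
    static Set<String> exampleSnapshot() {
        return new TreeSet<>(Arrays.asList("/", "/prefix"));
    }

    public static void main(String[] args) {
        Set<String> tree = exampleSnapshot();
        System.out.println(replay(tree, exampleLog(), 2)); // [2]
        System.out.println(tree);                          // [/, /prefix]
    }
}
```

In this model the only complaint is at zxid 2, and the tree after the full replay is the same as the correct final state, which is why treating the missing parent as fatal is too strict for this pattern.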





[jira] [Updated] (ZOOKEEPER-1653) zookeeper fails to start because of inconsistent epoch

2013-11-15 Thread Michi Mutsuzaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michi Mutsuzaki updated ZOOKEEPER-1653:
---

Attachment: ZOOKEEPER-1653.3.4.patch

Throw an IOException if any of the operations on the updating file fails.
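
For context, the general pattern being discussed (write to a temporary file, then rename it over the real one, and fail loudly if any step fails) can be sketched as follows. This is not the code in the attached patch; AtomicLongFile and its methods are hypothetical names, and the ".tmp" suffix is borrowed from the currentEpoch.tmp file seen in the issue description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: persist a long via write-to-temp-then-rename. */
final class AtomicLongFile {

    /** Writes value to target; any failing step surfaces as an IOException. */
    static void writeLong(Path target, long value) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        byte[] bytes = Long.toString(value).getBytes(StandardCharsets.UTF_8);
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(bytes);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            ch.force(true); // flush to disk before the rename makes it visible
        }
        // Whether an existing target is replaced is platform-specific with
        // ATOMIC_MOVE; on POSIX file systems this is a rename that replaces it.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    static long readLong(Path target) throws IOException {
        String text = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        return Long.parseLong(text.trim());
    }

    /** Demo helper: writes first, then second, and reads the result back. */
    static long roundTrip(long first, long second) {
        try {
            Path dir = Files.createTempDirectory("epoch-sketch");
            Path file = dir.resolve("currentEpoch");
            writeLong(file, first);
            writeLong(file, second);
            return readLong(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1L, 2L));
    }
}
```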

 zookeeper fails to start because of inconsistent epoch
 --

 Key: ZOOKEEPER-1653
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1653
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.4.5
Reporter: Michi Mutsuzaki
Assignee: Michi Mutsuzaki
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1653.3.4.patch, ZOOKEEPER-1653.3.4.patch, 
 ZOOKEEPER-1653.patch, ZOOKEEPER-1653.patch


 It looks like QuorumPeer.loadDataBase() could fail if the server was 
 restarted after zk.takeSnapshot() but before finishing 
 self.setCurrentEpoch(newEpoch) in Learner.java.
 {code:java}
 case Leader.NEWLEADER: // it will be NEWLEADER in v1.0
 zk.takeSnapshot();
 self.setCurrentEpoch(newEpoch); //  got restarted here
 snapshotTaken = true;
 writePacket(new QuorumPacket(Leader.ACK, newLeaderZxid, null, null), 
 true);
 break;
 {code}
 The server fails to start because currentEpoch is still 1 but the last 
 processed zxid from the snapshot has been updated.
 {noformat}
 2013-02-20 13:45:02,733 5543 [pool-1-thread-1] ERROR 
 org.apache.zookeeper.server.quorum.QuorumPeer  - Unable to load database on 
 disk
 java.io.IOException: The current epoch, 1, is older than the last zxid, 
 8589934592
 at 
 org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:439)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:413)
 ...
 {noformat}
 {noformat}
 $ find datadir 
 datadir
 datadir/version-2
 datadir/version-2/currentEpoch.tmp
 datadir/version-2/acceptedEpoch
 datadir/version-2/snapshot.0
 datadir/version-2/currentEpoch
 datadir/version-2/snapshot.2
 $ cat datadir/version-2/currentEpoch.tmp
 2%
 $ cat datadir/version-2/acceptedEpoch
 2%
 $ cat datadir/version-2/currentEpoch
 1%
 {noformat}
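
The numbers in the exception line up once the zxid is split into its two halves: the high 32 bits are the epoch and the low 32 bits are a counter. A small sketch of that arithmetic follows; EpochCheck and its method names are illustrative, not the QuorumPeer code.

```java
/** Illustrative arithmetic for the "current epoch is older than the last zxid" check. */
final class EpochCheck {

    /** The epoch is stored in the high 32 bits of a zxid. */
    static long epochOf(long zxid) {
        return zxid >> 32L;
    }

    /** True if the stored currentEpoch is at least the epoch of the last zxid. */
    static boolean isConsistent(long currentEpoch, long lastZxid) {
        return currentEpoch >= epochOf(lastZxid);
    }

    public static void main(String[] args) {
        long lastZxid = 8589934592L;                     // value from the log above
        System.out.println(epochOf(lastZxid));           // 2
        System.out.println(isConsistent(1L, lastZxid));  // false
        System.out.println(isConsistent(2L, lastZxid));  // true
    }
}
```

So the snapshot already belongs to epoch 2 while the currentEpoch file still says 1, which is the state left behind when the server stops between the snapshot and the epoch update.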





[jira] [Commented] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824352#comment-13824352
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1815:


Please wrap lines to 80 characters. There are lines longer than 80 characters 
already, but I'd like to avoid introducing new ones.

The patch looks good to me otherwise.



Failed: ZOOKEEPER-1653 PreCommit Build #1773

2013-11-15 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1653
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1773/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 78 lines...]
 [exec] Hunk #3 FAILED at 50.
 [exec] Hunk #4 FAILED at 62.
 [exec] Hunk #5 succeeded at 677 (offset 2 lines).
 [exec] 2 out of 5 hunks FAILED -- saving rejects to file 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java.rej
 [exec] patching file 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerTestBase.java
 [exec] Hunk #1 succeeded at 24 with fuzz 2.
 [exec] Hunk #2 succeeded at 76 with fuzz 2 (offset 18 lines).
 [exec] Hunk #3 succeeded at 85 with fuzz 2 (offset 12 lines).
 [exec] Hunk #4 succeeded at 134 (offset 30 lines).
 [exec] Hunk #5 succeeded at 145 (offset 30 lines).
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12614162/ZOOKEEPER-1653.3.4.patch
 [exec]   against trunk revision 1542355.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 6 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1773//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 2b8b0e1a62c13588a679137bb6b77b3733b48a67 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 1 minute 35 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1653
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

Success: ZOOKEEPER-1815 PreCommit Build #1774

2013-11-15 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1815
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 257018 lines...]
 [exec] BUILD SUCCESSFUL
 [exec] Total time: 0 seconds
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] +1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12614196/ZOOKEEPER-1815.patch
 [exec]   against trunk revision 1542355.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 33 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] a15bc17965bcde5244cb1f9c469581a46d915876 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD SUCCESSFUL
Total time: 34 minutes 32 seconds
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1815
Email was triggered for: Success
Sending email for trigger: Success



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-1815) Tolerate incorrectly set system hostname in tests

2013-11-15 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13824390#comment-13824390
 ] 

Hadoop QA commented on ZOOKEEPER-1815:
--

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12614196/ZOOKEEPER-1815.patch
  against trunk revision 1542355.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 33 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1774//console

This message is automatically generated.

 Tolerate incorrectly set system hostname in tests
 -

 Key: ZOOKEEPER-1815
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1815
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Reporter: some one
Priority: Trivial
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1815.patch, ZOOKEEPER-1815.patch, 
 ZOOKEEPER-1815.patch


 A bunch of tests will fail with UnknownHostException errors when the hostname 
 is incorrectly set on the system that you are running tests on.





ZooKeeper_branch33_solaris - Build # 708 - Still Failing

2013-11-15 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch33_solaris/708/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 100875 lines...]
[junit] 2013-11-16 07:08:06,015 - INFO  [main:ZooKeeperServer@154] - 
Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test8444094255255524841.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test8444094255255524841.junit.dir/version-2
[junit] 2013-11-16 07:08:06,016 - INFO  [main:NIOServerCnxn$Factory@143] - 
binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-16 07:08:06,018 - INFO  [main:FileSnap@82] - Reading 
snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test8444094255255524841.junit.dir/version-2/snapshot.0
[junit] 2013-11-16 07:08:06,022 - INFO  [main:FileTxnSnapLog@256] - 
Snapshotting: b
[junit] 2013-11-16 07:08:06,024 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] 2013-11-16 07:08:06,025 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /127.0.0.1:43997
[junit] 2013-11-16 07:08:06,026 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@1237] - Processing 
stat command from /127.0.0.1:43997
[junit] 2013-11-16 07:08:06,027 - INFO  
[Thread-4:NIOServerCnxn$StatCommand@1153] - Stat command output
[junit] 2013-11-16 07:08:06,028 - INFO  [Thread-4:NIOServerCnxn@1435] - 
Closed socket connection for client /127.0.0.1:43997 (no session established 
for client)
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-16 07:08:06,029 - INFO  [main:ClientBase@408] - STOPPING 
server
[junit] 2013-11-16 07:08:06,031 - INFO  
[ProcessThread:-1:PrepRequestProcessor@128] - PrepRequestProcessor exited loop!
[junit] 2013-11-16 07:08:06,031 - INFO  
[SyncThread:0:SyncRequestProcessor@151] - SyncRequestProcessor exited!
[junit] 2013-11-16 07:08:06,032 - INFO  [main:FinalRequestProcessor@370] - 
shutdown of request processor complete
[junit] 2013-11-16 07:08:06,033 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] ensureOnly:[]
[junit] 2013-11-16 07:08:06,034 - INFO  [main:ClientBase@401] - STARTING 
server
[junit] 2013-11-16 07:08:06,035 - INFO  [main:ZooKeeperServer@154] - 
Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test8444094255255524841.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test8444094255255524841.junit.dir/version-2
[junit] 2013-11-16 07:08:06,036 - INFO  [main:NIOServerCnxn$Factory@143] - 
binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-11-16 07:08:06,037 - INFO  [main:FileSnap@82] - Reading 
snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test8444094255255524841.junit.dir/version-2/snapshot.b
[junit] 2013-11-16 07:08:06,040 - INFO  [main:FileTxnSnapLog@256] - 
Snapshotting: b
[junit] 2013-11-16 07:08:06,042 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] 2013-11-16 07:08:06,043 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /127.0.0.1:43999
[junit] 2013-11-16 07:08:06,043 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@1237] - Processing 
stat command from /127.0.0.1:43999
[junit] 2013-11-16 07:08:06,044 - INFO  
[Thread-5:NIOServerCnxn$StatCommand@1153] - Stat command output
[junit] 2013-11-16 07:08:06,045 - INFO  [Thread-5:NIOServerCnxn@1435] - 
Closed socket connection for client /127.0.0.1:43999 (no session established 
for client)
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-11-16 07:08:06,047 - INFO  

[jira] [Commented] (BOOKKEEPER-708) Shade protobuf library to avoid incompatible versions

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823688#comment-13823688
 ] 

Ivan Kelly commented on BOOKKEEPER-708:
---

As HDFS-5518 points out, simply mandating that people use a newer guava is not 
without risks. I still think we should shade guava, but do so with the reduced 
jar option to avoid polluting the namespace as [~hustlmsp] says.

 Shade protobuf library to avoid incompatible versions
 -

 Key: BOOKKEEPER-708
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-708
 Project: Bookkeeper
  Issue Type: Bug
  Components: bookkeeper-server
Reporter: Sijie Guo
Assignee: Rakesh R
 Fix For: 4.3.0, 4.2.3

 Attachments: 0001-BOOKKEEPER-708.patch, 0002-BOOKKEEPER-708.patch


 As discussed offline, we need to shade the protobuf library for BKJM, since 
 Hadoop uses protobuf 2.5.
 This is planned for versions 4.2.3 and 4.3.0.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (BOOKKEEPER-702) Upgrade protobuf-java to 2.5.0 version

2013-11-15 Thread Ivan Kelly (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13823691#comment-13823691
 ] 

Ivan Kelly commented on BOOKKEEPER-702:
---

[~ste...@apache.org] shading solves this without requiring us to stay in lock 
step. BookKeeper is a library, to be used by many different systems. I don't 
want to mandate which version of protobuf other people use. Likewise, I don't 
want the version other people use to affect BookKeeper, because that affects 
not just BookKeeper but also any internal systems we have that use it.
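
For illustration only, a minimal sketch of how protobuf could be shaded with the 
Maven Shade plugin. The relocated package prefix org.apache.bookkeeper.shaded is 
a hypothetical name chosen for this example; it is not taken from the attached 
patches, and the actual build may need different settings.

{code:xml}
<!-- Sketch only: goes inside <build><plugins> of the module's pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <!-- Bundle only protobuf into the shaded jar. -->
        <artifactSet>
          <includes>
            <include>com.google.protobuf:protobuf-java</include>
          </includes>
        </artifactSet>
        <relocations>
          <relocation>
            <!-- Rewrite protobuf classes and all references to them. -->
            <pattern>com.google.protobuf</pattern>
            <!-- Hypothetical target package, assumed for illustration. -->
            <shadedPattern>org.apache.bookkeeper.shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
{code}

With a relocation like this, the bundled protobuf classes live under a private 
package, so whichever protobuf version the embedding application puts on its 
classpath should no longer clash with the one BookKeeper was built against.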

 Upgrade protobuf-java to 2.5.0 version
 --

 Key: BOOKKEEPER-702
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-702
 Project: Bookkeeper
  Issue Type: Improvement
Reporter: Rakesh R
Assignee: Rakesh R
Priority: Blocker
 Fix For: 4.4.0

 Attachments: 0001-BOOKKEEPER-702.patch, 0002-BOOKKEEPER-702.patch


 HDFS uses BK for the shared memory approach through the BKJM plugin. 
 Presently HDFS uses BookKeeper 4.0.0, and when it tries to upgrade to the 
 latest 4.2.2 version, there is a conflict in protobuf versions between the 
 components. The latest HDFS 2.1 branch uses protobuf-java-2.5.0, but 
 BK uses protobuf-java-2.4.1.


