[jira] [Commented] (ZOOKEEPER-1333) NPE in FileTxnSnapLog when restarting a cluster

Andrew McNair (Commented) (JIRA) Mon, 19 Dec 2011 18:09:56 -0800

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13172851#comment-13172851 ]


Andrew McNair commented on ZOOKEEPER-1333:
------------------------------------------

Added a test case in LoadFromLogTest that fails:

    /**
     * Test we can restore a snapshot that has errors and data ahead of the zxid
     * of the snapshot file. 
     */
    @Test
    public void testRestoreWithTransactionErrors() throws Exception {
        // setup a single server cluster
        File tmpDir = ClientBase.createTmpDir();
        ClientBase.setupTestEnv();
        ZooKeeperServer zks = new ZooKeeperServer(tmpDir, tmpDir, 3000);
        SyncRequestProcessor.setSnapCount(10000);
        final int PORT = Integer.parseInt(HOSTPORT.split(":")[1]);
        ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1);
        f.startup(zks);
        Assert.assertTrue("waiting for server being up ", ClientBase
                .waitForServerUp(HOSTPORT, CONNECTION_TIMEOUT));
        ZooKeeper zk = new ZooKeeper(HOSTPORT, CONNECTION_TIMEOUT, this);

        long start = System.currentTimeMillis();
        while (!connected) {
            long end = System.currentTimeMillis();
            if (end - start > 5000) {
                Assert.fail("Could not connect with server in 5 seconds");
            }
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                LOG.warn("Interrupted");
            }

        }
        // generate some transactions
        try {
            for (int i = 0; i < NUM_MESSAGES; i++) {
                try {
                    zk.create("/invaliddir/test-", new byte[0],
                            Ids.OPEN_ACL_UNSAFE,
                            CreateMode.PERSISTENT_SEQUENTIAL);
                } catch (NoNodeException e) {
                    // Expected: the parent node /invaliddir does not exist
                }
            }
        } finally {
            zk.close();
        }

        // force the zxid to be behind the content
        zks.getZKDatabase().setlastProcessedZxid(
                zks.getZKDatabase().getDataTreeLastProcessedZxid() - 10);
        LOG.info("Set lastProcessedZxid to "
                + zks.getZKDatabase().getDataTreeLastProcessedZxid());
        
        // Force snapshot and restore
        zks.takeSnapshot();
        zks.shutdown();
        f.shutdown();

        zks = new ZooKeeperServer(tmpDir, tmpDir, 3000);
        SyncRequestProcessor.setSnapCount(10000);
        f = ServerCnxnFactory.createFactory(PORT, -1);
        f.startup(zks);
        Assert.assertTrue("waiting for server being up ", ClientBase
                .waitForServerUp(HOSTPORT, CONNECTION_TIMEOUT));
        
        f.shutdown();
    }
                
> NPE in FileTxnSnapLog when restarting a cluster
> -----------------------------------------------
>
>                 Key: ZOOKEEPER-1333
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1333
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.0
>            Reporter: Andrew McNair
>            Priority: Blocker
>             Fix For: 3.4.2
>
>
> I think an NPE was introduced by the fix for 
> https://issues.apache.org/jira/browse/ZOOKEEPER-1269
> Looking at DataTree.processTxn(TxnHeader header, Record txn), it seems likely 
> that if rc.err != Code.OK then rc.path will be null. 
> I'm currently working on a minimal test case for the bug; I'll attach it to 
> this issue when it's ready.
> java.lang.NullPointerException
>       at org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:203)
>       at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:150)
>       at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
>       at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:418)
>       at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:410)
>       at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:151)
>       at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
>       at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
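
The suspicion in the quoted report (a failed transaction leaves rc.path null, and later code dereferences it) can be illustrated in isolation. The sketch below is not the FileTxnSnapLog code: TxnResultSketch, NullPathGuardSketch, parentPath, and the numeric error value are hypothetical stand-ins chosen for illustration, with only the field names err and path taken from the report. It shows one possible way to guard a path computation against a result whose path was never set.

```java
import java.util.Optional;

// Hypothetical stand-in for the result object that the report calls "rc".
// Only the field names err and path come from the report.
final class TxnResultSketch {
    final int err;      // 0 stands in for Code.OK in this sketch
    final String path;  // assumed to stay null when the transaction failed

    TxnResultSketch(int err, String path) {
        this.err = err;
        this.path = path;
    }
}

final class NullPathGuardSketch {
    static final int OK = 0; // assumed value, used only inside this sketch

    // Returns the parent of rc.path, or an empty Optional when the
    // transaction failed or no path was recorded.
    static Optional<String> parentPath(TxnResultSketch rc) {
        if (rc.err != OK || rc.path == null) {
            // Calling rc.path.lastIndexOf('/') here without the null check
            // would throw a NullPointerException.
            return Optional.empty();
        }
        int lastSlash = rc.path.lastIndexOf('/');
        // Top-level nodes (and the root itself) map to "/".
        return Optional.of(lastSlash <= 0 ? "/" : rc.path.substring(0, lastSlash));
    }

    public static void main(String[] args) {
        // A failed transaction with no path: the guard yields an empty result.
        System.out.println(parentPath(new TxnResultSketch(-1, null)).isPresent());
        // A successful transaction: the parent path is computed normally.
        System.out.println(parentPath(new TxnResultSketch(OK, "/a/b")).orElse("none"));
    }
}
```

Whether a guard like this is the right fix depends on what the restore path needs from failed transactions; the sketch only shows why an unconditional dereference of rc.path is unsafe when rc.err != Code.OK.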

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
