[ https://issues.apache.org/jira/browse/HBASE-10000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13834365#comment-13834365 ]
Ted Yu commented on HBASE-10000:
--------------------------------

There are a lot of datanode exceptions in the test output (https://builds.apache.org/job/PreCommit-HBASE-Build/8017//testReport/org.apache.hadoop.hbase.regionserver.wal/TestHLog/testAppendClose/):
{code}
2013-11-27 22:42:49,500 ERROR [org.apache.hadoop.hdfs.server.datanode.DataXceiver@c4bc5a] datanode.DataXceiver(136): DatanodeRegistration(127.0.0.1:34488, storageID=DS-1139617915-67.195.138.30-34488-1385592147680, infoPort=49296, ipcPort=60679):DataXceiver
java.io.EOFException: while trying to read 640 bytes
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readToBuf(BlockReceiver.java:296)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.readNextPacket(BlockReceiver.java:340)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:404)
{code}
I ran the following tests on hadoop-2 before patch submission and they passed:
{code}
TestSerialization,TestFSHDFSUtils,TestSplitLogWorker,TestDistributedLogSplitting,TestSplitLogManager,TestHLogSplit,TestLogRolling,TestHLog
{code}
Will dig deeper.

> Initiate lease recovery for outstanding WAL files at the very beginning of recovery
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-10000
>                 URL: https://issues.apache.org/jira/browse/HBASE-10000
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.98.0
>
>         Attachments: 10000-recover-ts-with-pb-2.txt, 10000-recover-ts-with-pb.txt, 10000-v1.txt, 10000-v4.txt, 10000-v5.txt, 10000-v6.txt
>
>
> At the beginning of recovery, master can send lease recovery requests concurrently for outstanding WAL files using a thread pool.
> Each split worker would first check whether the WAL file it processes is closed.
> Thanks to Nicolas Liochon and Jeffery, discussion with whom gave rise to this idea.
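
For readers following the issue description, here is a minimal sketch of the idea: lease recovery requests are submitted to a thread pool for all outstanding WAL files at once, and each result records whether the file was reported closed. This is not the attached patch. The class name, the LeaseRecoverer interface, and the recoverAll method are hypothetical names made up for illustration. In a real deployment the callback would presumably wrap something like DistributedFileSystem.recoverLease(Path), which returns true when the file is already closed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class WalLeaseRecoverySketch {

    /** Hypothetical stand-in for the file system call that starts lease recovery. */
    interface LeaseRecoverer {
        /** Returns true if the WAL file is closed after the request. */
        boolean recoverLease(String walPath) throws Exception;
    }

    /**
     * Submits one lease recovery request per WAL file to a fixed-size pool,
     * then waits for all of them. The returned map tells a split worker
     * whether each file was reported closed (false also covers failed requests).
     */
    static Map<String, Boolean> recoverAll(List<String> walPaths,
                                           LeaseRecoverer recoverer,
                                           int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Fire off all requests first so they run concurrently.
            Map<String, Future<Boolean>> pending = new LinkedHashMap<>();
            for (String wal : walPaths) {
                pending.put(wal, pool.submit(() -> recoverer.recoverLease(wal)));
            }
            // Collect results in submission order.
            Map<String, Boolean> closed = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Boolean>> e : pending.entrySet()) {
                try {
                    closed.put(e.getKey(), e.getValue().get());
                } catch (ExecutionException ex) {
                    // Treat a failed request as "not closed yet".
                    closed.put(e.getKey(), false);
                }
            }
            return closed;
        } finally {
            pool.shutdown();
        }
    }
}
```

A split worker in this sketch would consult the map (or re-issue the check) before reading its WAL file, and retry or wait when the entry is false.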