I got this issue in a build taken from the latest append trunk itself. These are the steps to reproduce:

1. Write a file and do some syncs, but do not close it.
2. Restart the NN.
3. Run the while loop quoted below against that file.

_____

From: Ryan Rawson [mailto:ryano...@gmail.com]
Sent: Monday, March 07, 2011 3:26 PM
To: gok...@huawei.com; user@hbase.apache.org
Subject: Re: will HBase detect NN failure?

There are a series of patches that address this; check the recent commit history of the append branch.

On Mar 7, 2011 1:52 AM, "Gokulakannan M" <gok...@huawei.com> wrote:
> Hi All,
>
> In HBase 0.90 I have seen that it has a fault-tolerant behavior of
> triggering lease recovery and closing the file when the writer dies in
> the middle. Does HBase have any workaround/recovery when the Namenode is
> restarted in the middle of a file write (possibly the HLog file, after
> some syncs)?
>
> I faced a problem in the above scenario. When the NN is restarted (but
> not the DN), the following code goes into an infinite loop because lease
> recovery never happens. But once the DN is restarted, the file can be
> recovered successfully (I think the DN is not sending those partial
> blocks in blocksBeingWritten to the NN when only the NN is restarted).
>
>     // Recover the file's lease if necessary
>     boolean recovered = false;
>     while (!recovered) {
>       try {
>         FSDataOutputStream out = fs.append(logfiles[i].getPath());
>         out.close();
>         recovered = true;
>       } catch (IOException e) {
>         if (LOG.isDebugEnabled()) {
>           LOG.debug("Triggering lease recovery.");
>         }
>         try {
>           Thread.sleep(leaseRecoveryPeriod);
>         } catch (InterruptedException ex) {
>           // ignore it and try again
>         }
>       }
>     }
>
> Thanks,
> Gokul
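
For anyone reproducing this, the quoted loop has no exit condition, so it spins forever when the NN never completes lease recovery. Below is a minimal sketch of a bounded variant, assuming a cap on attempts is acceptable. It does not fix the underlying NN/DN behavior. `BoundedLeaseRecovery`, `AppendProbe`, and `recover` are hypothetical names, not HBase or HDFS APIs; `AppendProbe` stands in for the `fs.append(...)` plus `out.close()` pair so the example runs without a cluster.

```java
import java.io.IOException;

/** Hypothetical helper for illustration; not part of HBase or HDFS. */
final class BoundedLeaseRecovery {

    /** Hypothetical stand-in for the fs.append(path) + out.close() pair in the quoted loop. */
    interface AppendProbe {
        void appendAndClose() throws IOException;
    }

    /**
     * Retries the probe up to maxAttempts times, sleeping between failures.
     * Returns true once the probe succeeds, false if attempts run out
     * or the thread is interrupted while waiting.
     */
    static boolean recover(AppendProbe probe, int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                probe.appendAndClose();
                return true; // append succeeded, so the lease was recovered
            } catch (IOException e) {
                if (attempt == maxAttempts) {
                    break; // give up instead of looping forever
                }
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException ex) {
                    // Preserve the interrupt for the caller rather than swallowing it.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated file whose lease is released on the third attempt.
        int[] calls = {0};
        AppendProbe flaky = () -> {
            if (++calls[0] < 3) {
                throw new IOException("lease still held");
            }
        };
        System.out.println("flaky recovered: " + recover(flaky, 5, 10L));

        // Simulated file whose lease is never released (the NN-restart case described above).
        AppendProbe stuck = () -> {
            throw new IOException("lease never released");
        };
        System.out.println("stuck recovered: " + recover(stuck, 3, 10L));
    }
}
```

In a real deployment the probe body would wrap the append/close calls from the quoted snippet, and the caller would decide what to do when `recover` returns false (for example, log the file and move on).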