That sounds like a good idea. If you don't mind, please file an issue and make a patch. Thank you, St.Ack
On Sun, Apr 24, 2011 at 5:53 PM, bijieshan <[email protected]> wrote:
> Under the current HDFS version, there is no method to check whether
> the NameNode is in safe mode.
> Maybe we can handle the SafeModeException at the top layer, where the
> checkFileSystem() method is called: wait for a while and then retry the
> operation. Does that sound reasonable? I hope someone can offer some
> advice.
>
> Thanks!
> Jieshan Bean
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Stack
> Sent: April 24, 2011 5:55
> To: [email protected]
> Cc: Chenjian
> Subject: Re: Splitlog() executed while the namenode was in safemode may cause data-loss
>
> Sorry, what did you change?
> Thanks,
> St.Ack
>
> On Fri, Apr 22, 2011 at 9:00 PM, bijieshan <[email protected]> wrote:
>> Hi,
>> I found this problem when the NameNode went into safe mode for some
>> unclear reason.
>> Here is one patch related to this problem:
>>
>>   try {
>>     HLogSplitter splitter = HLogSplitter.createLogSplitter(
>>         conf, rootdir, logDir, oldLogDir, this.fs);
>>     try {
>>       splitter.splitLog();
>>     } catch (OrphanHLogAfterSplitException e) {
>>       LOG.warn("Retrying splitting because of:", e);
>>       // An HLogSplitter instance can only be used once. Get new instance.
>>       splitter = HLogSplitter.createLogSplitter(conf, rootdir, logDir,
>>           oldLogDir, this.fs);
>>       splitter.splitLog();
>>     }
>>     splitTime = splitter.getTime();
>>     splitLogSize = splitter.getSize();
>>   } catch (IOException e) {
>>     checkFileSystem();
>>     LOG.error("Failed splitting " + logDir.toString(), e);
>>     master.abort("Shutting down HBase cluster: Failed splitting hlog files...", e);
>>   } finally {
>>     this.splitLogLock.unlock();
>>   }
>>
>> It really helps to some extent when the NameNode process has exited or
>> been killed, but it does not consider the NameNode safe-mode exception.
>> I think the root cause is in the method checkFileSystem().
>> It is meant to check whether HDFS works normally (reads and writes can
>> succeed), and that was probably the original purpose of the method.
>> This is how the method is implemented:
>>
>>   DistributedFileSystem dfs = (DistributedFileSystem) fs;
>>   try {
>>     if (dfs.exists(new Path("/"))) {
>>       return;
>>     }
>>   } catch (IOException e) {
>>     exception = RemoteExceptionHandler.checkIOException(e);
>>   }
>>
>> I have checked the HDFS code and learned that while the NameNode is in
>> safe mode, dfs.exists(new Path("/")) returns true, because the file
>> system can still provide read-only service. So this method only checks
>> whether the DFS can be read, which I think is not sufficient.
>>
>> Regards,
>> Jieshan Bean
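The retry idea proposed above (catch the safe-mode error at the caller, wait, and try again) could be sketched roughly as follows. The names here (SafeModeException, FsOperation, retryOnSafeMode) are illustrative stand-ins, not the real HBase/HDFS classes:

```java
// Sketch of "wait and retry while the NameNode is in safe mode".
// SafeModeException here is a stand-in for the HDFS exception of the
// same name; in real code you would catch the Hadoop class instead.
public class SafeModeRetry {

    static class SafeModeException extends java.io.IOException {
        SafeModeException(String msg) { super(msg); }
    }

    interface FsOperation {
        void run() throws java.io.IOException;
    }

    // Retry the operation while it keeps failing with SafeModeException,
    // sleeping between attempts; rethrow once maxRetries is exhausted.
    static void retryOnSafeMode(FsOperation op, int maxRetries, long waitMs)
            throws java.io.IOException, InterruptedException {
        for (int attempt = 0; ; attempt++) {
            try {
                op.run();
                return;
            } catch (SafeModeException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up: NameNode stayed in safe mode
                }
                Thread.sleep(waitMs);
            }
        }
    }
}
```

Bounding the retries keeps the master from hanging forever if the NameNode never leaves safe mode; the actual wait interval and limit would need to be configurable.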
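Since dfs.exists(new Path("/")) succeeds even in safe mode (reads stay available), a health check that is meant to guard writes needs an explicit safe-mode probe as well; DistributedFileSystem of that era exposes one via setSafeMode with the SAFEMODE_GET action, which reports the state without changing it. A minimal sketch of the combined check, with the HDFS calls abstracted behind a small interface so the logic is self-contained (NameNodeProbe and fsHealthyForWrites are hypothetical names):

```java
// Sketch: "healthy" for log splitting should mean reachable AND writable,
// not just readable. The NameNodeProbe interface stands in for the two
// DistributedFileSystem calls a real implementation would delegate to.
public class SafeModeCheck {

    interface NameNodeProbe {
        boolean exists(String path) throws java.io.IOException;   // read-only op
        boolean isInSafeMode() throws java.io.IOException;        // SAFEMODE_GET
    }

    // True only if a read succeeds AND the NameNode is not in safe mode;
    // any IOException is treated as unhealthy.
    static boolean fsHealthyForWrites(NameNodeProbe nn) {
        try {
            return nn.exists("/") && !nn.isInSafeMode();
        } catch (java.io.IOException e) {
            return false;
        }
    }
}
```

With a check like this, checkFileSystem() would report the filesystem as unhealthy during safe mode instead of passing, so the caller could wait and retry rather than proceed with log splitting.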
