Tianyin Xu created HDFS-7726:
--------------------------------

             Summary: Parse and check the configuration settings of edit log to prevent runtime errors
                 Key: HDFS-7726
                 URL: https://issues.apache.org/jira/browse/HDFS-7726
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode
    Affects Versions: 2.6.0
            Reporter: Tianyin Xu
            Priority: Minor

============================
Problem
-------------------------------------------------

Similar to the following two issues addressed in 2.7.0,
https://issues.apache.org/jira/browse/YARN-2165
https://issues.apache.org/jira/browse/YARN-2166

the edit log related configuration settings should be checked in the constructor rather than being applied directly at runtime. Currently they are not checked, so a wrong value causes a failure at runtime.

Take "dfs.ha.tail-edits.period" as an example. Currently in EditLogTailer.java, its value is not checked but used directly in doWork(), as in the following code snippet. Any negative value causes an IllegalArgumentException (which is not caught) and impairs the component.

{code:title=EditLogTailer.java|borderStyle=solid}
private void doWork() {
  .....
  Thread.sleep(sleepTimeMs);
  ....
}
{code}

Another example is "dfs.ha.log-roll.rpc.timeout". Right now, we use getInt() to parse the value at runtime in the getActiveNodeProxy() function, which is called by doWork(), as shown below. Any erroneous setting (e.g., an ill-formatted integer) causes an exception.

{code:title=EditLogTailer.java|borderStyle=solid}
private NamenodeProtocol getActiveNodeProxy() throws IOException {
  .....
  int rpcTimeout = conf.getInt(
      DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_KEY,
      DFSConfigKeys.DFS_HA_LOGROLL_RPC_TIMEOUT_DEFAULT);
  ....
}
{code}

============================
Solution (the attached patch)
-------------------------------------------------

Basically, the idea of the attached patch is to move the parsing and checking logic into the constructor, so that errors are exposed at initialization instead of staying latent until runtime (same as YARN-2165 and YARN-2166).

I'm not familiar with the 2.7.0 implementation. It seems there are checking utilities, such as the validatePositiveNonZero function in YARN-2165. If so, we can use it to make the checking more systematic.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
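
For illustration only, here is a minimal sketch of the idea in the Solution section: parse and check both settings once in the constructor, so that a bad value fails at initialization. This is not the attached patch. TailerSettings and its parseInt helper are hypothetical names, a plain Map stands in for the Hadoop Configuration object, the default values are assumptions, and rejecting only negative values is also an assumption about what should count as valid.

```java
import java.util.Map;

/** Hypothetical stand-in for the tailer's settings; not the real EditLogTailer. */
final class TailerSettings {
    // Key names are taken from the issue text; the defaults are illustrative assumptions.
    static final String TAIL_PERIOD_KEY = "dfs.ha.tail-edits.period";
    static final String LOGROLL_RPC_TIMEOUT_KEY = "dfs.ha.log-roll.rpc.timeout";
    static final int TAIL_PERIOD_DEFAULT_SEC = 60;
    static final int LOGROLL_RPC_TIMEOUT_DEFAULT_MS = 20000;

    final long sleepTimeMs;
    final int rpcTimeoutMs;

    /** Parses and validates every setting up front, so bad values fail here. */
    TailerSettings(Map<String, String> conf) {
        int periodSec = parseInt(conf, TAIL_PERIOD_KEY, TAIL_PERIOD_DEFAULT_SEC);
        if (periodSec < 0) {
            // Thread.sleep() would throw on a negative value much later, at runtime.
            throw new IllegalArgumentException(
                TAIL_PERIOD_KEY + " must not be negative: " + periodSec);
        }
        this.sleepTimeMs = periodSec * 1000L;

        int timeoutMs = parseInt(conf, LOGROLL_RPC_TIMEOUT_KEY, LOGROLL_RPC_TIMEOUT_DEFAULT_MS);
        if (timeoutMs < 0) {
            throw new IllegalArgumentException(
                LOGROLL_RPC_TIMEOUT_KEY + " must not be negative: " + timeoutMs);
        }
        this.rpcTimeoutMs = timeoutMs;
    }

    /** Returns the default if the key is absent; names the key if the value is malformed. */
    private static int parseInt(Map<String, String> conf, String key, int defaultValue) {
        String raw = conf.get(key);
        if (raw == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Invalid integer for " + key + ": '" + raw + "'", e);
        }
    }
}
```

With this shape, doWork() and getActiveNodeProxy() would only read the already validated fields, so neither Thread.sleep() nor the RPC proxy setup sees an unchecked value. Naming the offending key in the exception message is a design choice to make a misconfiguration easy to locate.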