[ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240215#comment-13240215 ]
Todd Lipcon commented on HDFS-3004:
-----------------------------------

{code}
+    } catch (IOException e) {
+      return null;
{code}
This shows up in a few places and makes me nervous. We should always at least LOG.warn an exception.
----
{code}
+      if (op.getTransactionId() > expectedTxId) {
+        askOperator("There appears to be a gap in the edit log. " +
+            "We expected txid " + expectedTxId + ", but got txid " +
+            op.getTransactionId() + ".", recovery, "skipping bad edit");
+      } else if (op.getTransactionId() < expectedTxId) {
+        askOperator("There appears to be an out-of-order edit in " +
+            "the edit log. We expected txid " + expectedTxId +
+            ", but got txid " + op.getTransactionId() + ".", recovery,
+            "applying edits");
+        continue;
{code}
Am I reading this wrong, or is the logic backwards? It seems like the first prompt allows you to say "continue, skipping bad edit", but doesn't call {{continue}} (thus applying the bad edit). The second one gives you the choice "continue, applying edits" but *does* skip the bad one.
----
{code}
+      catch (Throwable e) {
+        askOperator("Failed to apply edit log operation " +
+            op.getTransactionIdStr() + ": error " + e.getMessage(),
+            recovery, "applying edits");
{code}
Here we no longer output the stack trace of the exception anywhere. That's often a big problem - we really need the exception trace to understand the root cause and to know whether we might be able to skip it.

I would also recommend printing op.toString() itself in all of these cases, not just the txids. That gives the operator something to base their decision on. I think we have a reasonable stringification of all ops nowadays.
----
{code}
-      LOG.info(String.format("Log begins at txid %d, but requested start "
-          + "txid is %d. Skipping %d edits.", elf.getFirstTxId(), fromTxId,
-          transactionsToSkip));
-      elfis.skipTransactions(transactionsToSkip);
+      // TODO: add recovery mode override here as part of edit log failover
+      if (elfis.skipUntil(fromTxId) == false) {
{code}
Please leave the log message in place here. Also, in the failure case, I think you should (a) throw an EditLogInputException, and (b) include the file path.
----
{code}
+    /* Now that the operation has been successfully decoded and
+     * applied, update our bookkeeping. */
{code}
Style: use // comments inline in code, not /* */, generally. There are a couple of other places where you should make the same fix.
----
There are still a few spurious whitespace changes unrelated to your code here. Not a big deal, but best to remove those hunks from your patch.
----
{code}
+   * Get the next valid operation from the stream storage
+   *
+   * This is exactly like nextOp, except that we attempt to skip over damaged
+   * parts of the edit log
{code}
Please add '.'s at the ends of the sentences in the docs. It's annoying, but the way JavaDoc gets formatted, all the sentences will run together without line breaks in many cases, so the punctuation's actually important for readability.
----
{code}
+   * @return Returns true if we found a transaction ID greater than
+   *         'txid' in the log.
+   */
+  public boolean skipUntil(long txid) throws IOException {
{code}
The javadoc should say "greater than or equal to", right? Also, I think the doc should specify explicitly: "the next call to readOp() will usually return the transaction with the specified txid, unless the log contains a gap, in which case it may return a higher one."
----
{code}
+  private ThreadLocal <OpInstanceCache> cache = new ThreadLocal<OpInstanceCache>() {
{code}
Style: no space between "ThreadLocal" and '<'. Also, the line's a bit long (we try to keep to 80 characters unless it's really hard not to).
----
{code}
+      if (op == null) break;
{code}
Style: we use {} braces even for one-line if statements. There are a few other examples of this too.
{code}
+    }
+    catch (Throwable e) {
{code}
The catch clause goes on the same line as '}'.
----
{code}
+    if (txid == HdfsConstants.INVALID_TXID) {
+      throw new RuntimeException("operation has no txid");
{code}
Use {{Preconditions.checkState()}} here for better readability.
----
{code}
+      LOG.error("automatically choosing " + firstChoice);
{code}
This should probably be INFO or WARN.
----
Style: please add a blank line between functions in RecoveryContext.java.
----
TestNameNodeRecovery:
- Do we need data nodes in these tests? If it's just metadata ops, 0 DNs should be fine.
- Move the EditLogTestSetup interface down lower in the file, just above where you define the implementations.
- Extra space between {{interface}} and {{EditLogTestSetup}}.
- Please add blank lines between functions in that interface, and expand the javadoc out so the {{/**}} line is on its own, like the rest of the codebase.
- removeFromArray -- really this is {{markTxnIdFound}} where {{-1}} is used as a mark, right? I think it would be much clearer to just use Set<Integer> here and remove the ints from the set -- that's essentially what you're doing?
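To make that last suggestion concrete, here is a minimal, self-contained sketch of the Set-based bookkeeping. The names {{expectedTxIds}} and the txid range 1..5 are hypothetical stand-ins for the test's actual state, not code from the patch:

```java
import java.util.HashSet;
import java.util.Set;

public class TxnIdTracking {
  public static void main(String[] args) {
    // Instead of an int[] where -1 marks a txid as "found", keep the
    // txids we still expect in a Set and remove each one as it is seen.
    Set<Integer> expectedTxIds = new HashSet<Integer>();
    for (int txid = 1; txid <= 5; txid++) {
      expectedTxIds.add(txid);
    }

    // Simulate replaying edits 1..5; remove() returns false if the
    // txid was never expected (or was already seen).
    for (int txid = 1; txid <= 5; txid++) {
      if (!expectedTxIds.remove(txid)) {
        throw new AssertionError("unexpected txid " + txid);
      }
    }

    // Whatever remains in the set was expected but never found.
    System.out.println("missing txids: " + expectedTxIds);
  }
}
```

The leftover-set check at the end replaces the final scan for non-{{-1}} entries in the array version.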
- Some funky indentation in a few places.
- Rather than catching, logging, and calling fail(), you can just let the Throwable pass out of the test case. That will also fail the test case, with the advantage that the stack trace of the failure becomes clickable in IDEs.
- Tip: you can use StringUtils.stringifyException to get an exception's stack trace into a String without StringWriter.
- Instead of the if (stream != null) checks, you can use IOUtils.closeStream, or IOUtils.cleanup, which takes care of that for you.

> Implement Recovery Mode
> -----------------------
>
>        Key: HDFS-3004
>        URL: https://issues.apache.org/jira/browse/HDFS-3004
>    Project: Hadoop HDFS
> Issue Type: New Feature
> Components: tools
>   Reporter: Colin Patrick McCabe
>   Assignee: Colin Patrick McCabe
> Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch,
> HDFS-3004.012.patch, HDFS-3004.013.patch, HDFS-3004.015.patch,
> HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch,
> HDFS-3004.019.patch, HDFS-3004.020.patch, HDFS-3004.022.patch,
> HDFS-3004.023.patch, HDFS-3004.024.patch, HDFS-3004.026.patch,
> HDFS-3004.027.patch, HDFS-3004.029.patch, HDFS-3004.030.patch,
> HDFS-3004.031.patch, HDFS-3004.032.patch, HDFS-3004.033.patch,
> HDFS-3004__namenode_recovery_tool.txt
>
> When the NameNode metadata is corrupt for some reason, we want to be able to
> fix it. Obviously, we would prefer never to get in this case. In a perfect
> world, we never would. However, bad data on disk can happen from time to
> time, because of hardware errors or misconfigurations. In the past we have
> had to correct it manually, which is time-consuming and which can result in
> downtime.
> Recovery mode is initialized by the system administrator. When the NameNode
> starts up in Recovery Mode, it will try to load the FSImage file, apply all
> the edits from the edits log, and then write out a new image. Then it will
> shut down.
> Unlike in the normal startup process, the recovery mode startup process will
> be interactive. When the NameNode finds something that is inconsistent, it
> will prompt the operator as to what it should do. The operator can also
> choose to take the first option for all prompts by starting up with the '-f'
> flag, or typing 'a' at one of the prompts.
> I have reused as much code as possible from the NameNode in this tool.
> Hopefully, the effort that was spent developing this will also make the
> NameNode editLog and image processing even more robust than it already is.