[ 
https://issues.apache.org/jira/browse/HDFS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010242#comment-13010242
 ] 

Ivan Kelly commented on HDFS-1521:
----------------------------------

I think this BackupNode failure is the result of a preexisting problem in 
BackupNode: OP_JSPOOL_START arrives at the BackupNode too late. The sequence 
is:

||    || NameNode             || BackupNode                      ||
|  1. |                       | doCheckpoint                     |
|  2. | startCheckpoint       |                                  |
|  3. | log(OP_JSPOOL_START)  |                                  |
|  4. |                       | download images                  |
|  5. |                       | merge                            |
|  6. |                       | upload new image                 |
|  7. |                       | convergeJournalSpool             |
|  8. | flush editlog buffers |                                  |
|  9. |                       | startJournalSpool                |
|     |                       |                                  |
| ... | ...                   | ...                              |
|     |                       |                                  |
| 10. |                       | doCheckpoint                     |
| 11. | startCheckpoint       |                                  |
| 12. | log(OP_JSPOOL_START)  |                                  |
| 13. |                       | download images                  |
| 14. |                       | merge                            |
| 15. |                       | upload new image                 |
| 16. |                       | convergeJournalSpool (EXCEPTION) |

In short, OP_JSPOOL_START doesn't reach the BackupNode before the first 
checkpoint finishes. When it does arrive (step 9), a spool is created that is 
not converged until the next checkpoint (step 16), by which point it contains 
every transaction from the first checkpoint onwards. 
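
The sequence above can be reproduced with a toy model. This is only a sketch to 
illustrate the ordering, not the actual BackupNode code: the classes 
ToyNameNode and ToyBackupNode, the StaleSpoolError exception and the 
flush_before_converge switch are all hypothetical. In particular, the check 
that raises the exception is an assumption; the real convergeJournalSpool may 
fail for a different reason.

```python
OP_JSPOOL_START = "OP_JSPOOL_START"


class StaleSpoolError(Exception):
    """Hypothetical stand-in for the exception seen in step 16."""


class ToyNameNode:
    def __init__(self, backup):
        self.backup = backup
        self.buffer = []  # edits logged but not yet sent to the BackupNode
        self.txid = 0

    def log(self, op):
        self.txid += 1
        self.buffer.append((self.txid, op))

    def start_checkpoint(self):
        # Steps 2-3: the op is only buffered here, not delivered.
        self.log(OP_JSPOOL_START)
        return self.txid

    def flush(self):
        # Step 8: buffered edits finally reach the BackupNode.
        for txid, op in self.buffer:
            self.backup.journal(txid, op)
        self.buffer.clear()


class ToyBackupNode:
    def __init__(self):
        self.spool = None  # None means "not spooling"
        self.image_txid = 0
        self.applied_txid = 0

    def journal(self, txid, op):
        if op == OP_JSPOOL_START:
            self.spool = []  # step 9: startJournalSpool
        elif self.spool is not None:
            self.spool.append(txid)
        else:
            self.applied_txid = txid

    def do_checkpoint(self, nn, flush_before_converge):
        start_txid = nn.start_checkpoint()
        if flush_before_converge:
            nn.flush()  # OP_JSPOOL_START arrives in time
        # Download, merge, upload: the new image covers up to start_txid.
        self.image_txid = start_txid
        self.converge_journal_spool()

    def converge_journal_spool(self):
        if self.spool is None:
            return  # nothing to converge (step 7 in the table)
        # Assumed failure mode: the spool holds edits already in the image.
        if any(txid <= self.image_txid for txid in self.spool):
            raise StaleSpoolError(self.spool)
        if self.spool:
            self.applied_txid = self.spool[-1]
        self.spool = None


def run_two_checkpoints(flush_before_converge):
    bn = ToyBackupNode()
    nn = ToyNameNode(bn)
    try:
        for path in ("/a", "/b"):
            nn.log("mkdir " + path)
            nn.flush()
            bn.do_checkpoint(nn, flush_before_converge)
            nn.flush()  # the late flush from step 8
    except StaleSpoolError:
        return "exception"
    return "ok"
```

With run_two_checkpoints(False) the first checkpoint converges nothing, the 
late flush opens a spool, and the second checkpoint trips over it. With 
run_two_checkpoints(True) the spool is opened and converged within the same 
checkpoint, so neither checkpoint fails in this model.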

> Persist transaction ID on disk between NN restarts
> --------------------------------------------------
>
>                 Key: HDFS-1521
>                 URL: https://issues.apache.org/jira/browse/HDFS-1521
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.23.0
>
>         Attachments: FSImageFormat.patch, HDFS-1521.diff, HDFS-1521.diff, 
> HDFS-1521.diff, HDFS-1521.diff, HDFS-1521.diff, hdfs-1521.3.txt, 
> hdfs-1521.4.txt, hdfs-1521.5.txt, hdfs-1521.txt, hdfs-1521.txt, hdfs-1521.txt
>
>
> For HDFS-1073 and other future work, we'd like to have the concept of a 
> transaction ID that is persisted on disk with the image/edits. We already 
> have this concept in the NameNode but it resets to 0 on restart. We can also 
> use this txid to replace the _checkpointTime_ field, I believe.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
