[ https://issues.apache.org/jira/browse/HADOOP-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Konstantin Shvachko updated HADOOP-702:
---------------------------------------
Attachment: FSStateTransition6.htm
            FSStateTransition.patch
This is the patch that fully implements the design in the updated document.
I updated three versions: ClientProtocol, DatanodeProtocol, and LAYOUT_VERSION,
which was previously called DFS_CURRENT_VERSION.
The new code enforces stricter version checking: if a data-node's build version
differs from the name-node's, the data-node fails, even if the layout and
protocol versions are the same.
The build version is checked during the handshake, a new RPC call that happens
before registration.
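
As a rough illustration of the check described above, here is a minimal sketch
in Python rather than the patch's Java. The names NodeVersions, handshake and
VersionMismatchError, and all version values, are hypothetical and are not
taken from the patch. Comparing the layout and protocol versions inside the
same call is also an assumption; the patch may check them elsewhere.

```python
from dataclasses import dataclass


class VersionMismatchError(Exception):
    """Raised when a data-node must not register with the name-node."""


@dataclass(frozen=True)
class NodeVersions:
    # Field names and values are illustrative, not Hadoop's own.
    layout_version: int
    protocol_version: int
    build_version: str


def handshake(name_node: NodeVersions, data_node: NodeVersions) -> None:
    """Compare versions before registration; raise on the first mismatch."""
    for field in ("layout_version", "protocol_version", "build_version"):
        expected = getattr(name_node, field)
        actual = getattr(data_node, field)
        if expected != actual:
            raise VersionMismatchError(
                f"{field}: name-node has {expected!r}, data-node has {actual!r}"
            )


name_node = NodeVersions(layout_version=-3, protocol_version=7, build_version="r1001")
same_build = NodeVersions(layout_version=-3, protocol_version=7, build_version="r1001")
other_build = NodeVersions(layout_version=-3, protocol_version=7, build_version="r1002")

handshake(name_node, same_build)  # identical versions: passes silently

try:
    # Same layout and protocol versions, different build: still refused.
    handshake(name_node, other_build)
except VersionMismatchError as err:
    print("data-node refused:", err)
```

The point of the sketch is only that a build mismatch alone is enough to stop
the data-node before it registers.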
The -upgrade feature can be used immediately, although it is not mandatory.
The expected behavior is that the old fs layout is first converted into the
new layout and then saved in the directory "previous". The "current" directory
will contain the new file system state.
All old files (in "previous") remain unmodified and can be restored in case of
failure.
As pointed out in the design doc, rollback will not restore the pre-upgrade
layout.
After applying the upgrade patch I recommend actually performing the upgrade:
- start the cluster with the -upgrade option
- run fsck and some tests
- bin/hadoop dfsadmin -finalizeUpgrade
If something fails during conversion or later on, I do not recommend using
rollback as a recovery procedure.
To recover the pre-upgrade state and layout from the "previous" directory,
one should manually rename files, namely:
for NameNode:
    mv previous/edits ../
    rm previous/VERSION
    mv previous image
    rm -r current
for DataNode:
    mv previous/storage ../
    rm previous/VERSION
    mv previous data
    rm -r current
Other changes and future work.
- The name-node image file format has not been changed, and it still contains
the layout version and the namespace ID, which are now redundant. The reason is
that changing the format would make a failure during the conversion
unrecoverable: if the image is converted but the name-node fails before writing
the version file, the namespace ID and the LV will be lost.
The image file format should be changed sometime later.
- I deprecated some methods. Most of them will need to be removed in a
subsequent patch.
- The name-node now locks its storage directory, the same as data-nodes do, so
two name-nodes can no longer be started in the same directory.
- I removed unused code in FSEditLog and SecondaryNameNode. This is related to
HADOOP-1076 (2).
- In FSEditLog I replaced 4 arrays by one and eliminated duplicate code.
- I changed MiniDFSCluster to sleep for 2 seconds before starting each
data-node. Otherwise many tests were failing, because data-nodes were rolling
ports. This is not a good fix; we will need to find out why this is happening.
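
The storage-directory locking mentioned in the list above can be pictured with
a small sketch. It is written in Python, not the patch's Java, and it assumes
an exclusively created lock file; this message does not describe the mechanism
the patch actually uses. The file name storage.lock and the names
lock_storage_dir, unlock_storage_dir and StorageLockedError are hypothetical.

```python
import os
import tempfile


class StorageLockedError(Exception):
    """Raised when the storage directory is already held by another node."""


LOCK_NAME = "storage.lock"  # hypothetical file name chosen for this sketch


def lock_storage_dir(storage_dir: str) -> str:
    """Create the lock file atomically; fail if it already exists."""
    lock_path = os.path.join(storage_dir, LOCK_NAME)
    try:
        # O_EXCL makes creation fail when the file is already there.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise StorageLockedError(f"{storage_dir} is already in use") from None
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(str(os.getpid()))  # record the owner for diagnostics
    return lock_path


def unlock_storage_dir(lock_path: str) -> None:
    """Release the directory by removing the lock file."""
    os.remove(lock_path)


with tempfile.TemporaryDirectory() as storage_dir:
    lock = lock_storage_dir(storage_dir)   # first name-node starts
    try:
        lock_storage_dir(storage_dir)      # second name-node, same directory
    except StorageLockedError as err:
        print("second start refused:", err)
    unlock_storage_dir(lock)               # first name-node shuts down
```

A lock file created this way would survive a crash of its owner, so a real
implementation would need some way to detect stale locks; the sketch leaves
that out.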
Thanks Raghu for reviewing the code and helping with testing.
Thanks Nigel for testing and for creating a comprehensive junit test that
covers at least 134 test cases related to the new functionality.
> DFS Upgrade Proposal
> --------------------
>
> Key: HADOOP-702
> URL: https://issues.apache.org/jira/browse/HADOOP-702
> Project: Hadoop
> Issue Type: New Feature
> Components: dfs
> Reporter: Konstantin Shvachko
> Assigned To: Konstantin Shvachko
> Fix For: 0.13.0
>
> Attachments: DFSUpgradeProposal.html, DFSUpgradeProposal2.html,
> DFSUpgradeProposal3.html, FSStateTransition.html, FSStateTransition.patch,
> FSStateTransition5.htm, FSStateTransition6.htm, TestPlan-HdfsUpgrade.html,
> TestPlan-HdfsUpgrade.html, TestPlan-HdfsUpgrade.html
>
>
> Currently the DFS cluster upgrade procedure is manual.
> http://wiki.apache.org/lucene-hadoop/Hadoop_Upgrade
> It is rather complicated and does not guarantee data recoverability in case
> of software errors or administrator mistakes.
> This is a description of utilities that make the upgrade process almost
> automatic and minimize the chance of losing or corrupting data.
> Please see the attached html file for details.