Repository: hadoop Updated Branches: refs/heads/branch-2 b539bb41b -> c260a8aa9
HDFS-8974. Convert docs in xdoc format to markdown. Contributed by Masatake Iwasaki. Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/c260a8aa Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/c260a8aa Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/c260a8aa Branch: refs/heads/branch-2 Commit: c260a8aa93f6aec7725119c3b108fa74c5e2b739 Parents: b539bb4 Author: Akira Ajisaka <aajis...@apache.org> Authored: Thu Sep 10 16:50:57 2015 +0900 Committer: Akira Ajisaka <aajis...@apache.org> Committed: Thu Sep 10 16:50:57 2015 +0900 ---------------------------------------------------------------------- hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt | 3 + .../src/site/markdown/HdfsRollingUpgrade.md | 311 ++++++++++++++++ .../src/site/markdown/HdfsSnapshots.md | 301 ++++++++++++++++ .../src/site/xdoc/HdfsRollingUpgrade.xml | 350 ------------------- .../hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml | 303 ---------------- 5 files changed, 615 insertions(+), 653 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/hadoop/blob/c260a8aa/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt ---------------------------------------------------------------------- diff --git a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt index f5e9ea1..618352c 100644 --- a/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt +++ b/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt @@ -563,6 +563,9 @@ Release 2.8.0 - UNRELEASED HDFS-7116. Add a command to get the balancer bandwidth (Rakesh R via vinayakumarb) + HDFS-8974. Convert docs in xdoc format to markdown. + (Masatake Iwasaki via aajisaka) + OPTIMIZATIONS HDFS-8026. 
Trace FSOutputSummer#writeChecksumChunks rather than http://git-wip-us.apache.org/repos/asf/hadoop/blob/c260a8aa/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md ---------------------------------------------------------------------- diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md new file mode 100644 index 0000000..334925d --- /dev/null +++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsRollingUpgrade.md @@ -0,0 +1,311 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> + +HDFS Rolling Upgrade +==================== + +* [Introduction](#Introduction) +* [Upgrade](#Upgrade) + * [Upgrade without Downtime](#Upgrade_without_Downtime) + * [Upgrading Non-Federated Clusters](#Upgrading_Non-Federated_Clusters) + * [Upgrading Federated Clusters](#Upgrading_Federated_Clusters) + * [Upgrade with Downtime](#Upgrade_with_Downtime) + * [Upgrading Non-HA Clusters](#Upgrading_Non-HA_Clusters) +* [Downgrade and Rollback](#Downgrade_and_Rollback) +* [Downgrade](#Downgrade) + * [Downgrade without Downtime](#Downgrade_without_Downtime) + * [Downgrade with Downtime](#Downgrade_with_Downtime) +* [Rollback](#Rollback) +* [Commands and Startup Options for Rolling Upgrade](#Commands_and_Startup_Options_for_Rolling_Upgrade) + * [DFSAdmin Commands](#DFSAdmin_Commands) + * [dfsadmin -rollingUpgrade](#dfsadmin_-rollingUpgrade) + * [dfsadmin -getDatanodeInfo](#dfsadmin_-getDatanodeInfo) + * [dfsadmin -shutdownDatanode](#dfsadmin_-shutdownDatanode) + * [NameNode Startup Options](#NameNode_Startup_Options) + * [namenode -rollingUpgrade](#namenode_-rollingUpgrade) + + +Introduction +------------ + +*HDFS rolling upgrade* allows upgrading individual HDFS daemons. +For example, the datanodes can be upgraded independent of the namenodes. +A namenode can be upgraded independent of the other namenodes. +The namenodes can be upgraded independent of datanodes and journal nodes. + + +Upgrade +------- + +In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility. +These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime. +In order to upgrade a HDFS cluster without downtime, the cluster must be set up with HA. + +A new feature that is enabled in the new software release may not work with the old software release after the upgrade. +In such cases, the upgrade should be done in the following steps. + +1. Disable the new feature. +2. Upgrade the cluster. +3. Enable the new feature.
+ +Note that rolling upgrade is supported only from Hadoop-2.4.0 onwards. + + +### Upgrade without Downtime + +In a HA cluster, there are two or more *NameNodes (NNs)*, many *DataNodes (DNs)*, +a few *JournalNodes (JNs)* and a few *ZooKeeperNodes (ZKNs)*. +*JNs* are relatively stable and, in most cases, do not need to be upgraded when upgrading HDFS. +In the rolling upgrade procedure described here, +only *NNs* and *DNs* are considered but *JNs* and *ZKNs* are not. +Upgrading *JNs* and *ZKNs* may incur cluster downtime. + +#### Upgrading Non-Federated Clusters + +Suppose there are two namenodes *NN1* and *NN2*, +where *NN1* and *NN2* are respectively in active and standby states. +The following are the steps for upgrading a HA cluster: + +1. Prepare Rolling Upgrade + 1. Run "[`hdfs dfsadmin -rollingUpgrade prepare`](#dfsadmin_-rollingUpgrade)" + to create a fsimage for rollback. + 1. Run "[`hdfs dfsadmin -rollingUpgrade query`](#dfsadmin_-rollingUpgrade)" + to check the status of the rollback image. + Wait and re-run the command until + the "`Proceed with rolling upgrade`" message is shown. +1. Upgrade Active and Standby *NNs* + 1. Shutdown and upgrade *NN2*. + 1. Start *NN2* as standby with the + "[`-rollingUpgrade started`](#namenode_-rollingUpgrade)" option. + 1. Failover from *NN1* to *NN2* + so that *NN2* becomes active and *NN1* becomes standby. + 1. Shutdown and upgrade *NN1*. + 1. Start *NN1* as standby with the + "[`-rollingUpgrade started`](#namenode_-rollingUpgrade)" option. +1. Upgrade *DNs* + 1. Choose a small subset of datanodes (e.g. all datanodes under a particular rack). + 1. Run "[`hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade`](#dfsadmin_-shutdownDatanode)" + to shutdown one of the chosen datanodes. + 1. Run "[`hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>`](#dfsadmin_-getDatanodeInfo)" + to check and wait for the datanode to shutdown. + 1. Upgrade and restart the datanode. + 1. 
Perform the above steps for all the chosen datanodes in the subset in parallel. + 1. Repeat the above steps until all datanodes in the cluster are upgraded. +1. Finalize Rolling Upgrade + 1. Run "[`hdfs dfsadmin -rollingUpgrade finalize`](#dfsadmin_-rollingUpgrade)" + to finalize the rolling upgrade. + + +#### Upgrading Federated Clusters + +In a federated cluster, there are multiple namespaces +and a pair of active and standby *NNs* for each namespace. +The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster +except that Step 1 and Step 4 are performed on each namespace +and Step 2 is performed on each pair of active and standby *NNs*, i.e. + +1. Prepare Rolling Upgrade for Each Namespace +1. Upgrade Active and Standby *NN* pairs for Each Namespace +1. Upgrade *DNs* +1. Finalize Rolling Upgrade for Each Namespace + + +### Upgrade with Downtime + +For non-HA clusters, +it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes. +However, datanodes can still be upgraded in a rolling manner. + + +#### Upgrading Non-HA Clusters + +In a non-HA cluster, there are a *NameNode (NN)*, a *SecondaryNameNode (SNN)* +and many *DataNodes (DNs)*. +The procedure for upgrading a non-HA cluster is similar to upgrading a HA cluster +except that Step 2 "Upgrade Active and Standby *NNs*" is changed to below: + +* Upgrade *NN* and *SNN* + 1. Shutdown *SNN* + 1. Shutdown and upgrade *NN*. + 1. Start *NN* with the + "[`-rollingUpgrade started`](#namenode_-rollingUpgrade)" option. + 1. Upgrade and restart *SNN* + + +Downgrade and Rollback +---------------------- + +When the upgraded release is undesirable +or, in some unlikely case, the upgrade fails (due to bugs in the newer release), +administrators may choose to downgrade HDFS back to the pre-upgrade release, +or rollback HDFS to the pre-upgrade release and the pre-upgrade state. + +Note that downgrade can be done in a rolling fashion but rollback cannot. 
+Rollback requires cluster downtime. + +Note also that downgrade and rollback are possible only after a rolling upgrade is started and +before the upgrade is terminated. +An upgrade can be terminated by either finalize, downgrade or rollback. +Therefore, it may not be possible to perform rollback after finalize or downgrade, +or to perform downgrade after finalize. + + +Downgrade +--------- + +*Downgrade* restores the software back to the pre-upgrade release +and preserves the user data. +Suppose time *T* is the rolling upgrade start time and the upgrade is terminated by downgrade. +Then, the files created before or after *T* remain available in HDFS. +The files deleted before or after *T* remain deleted in HDFS. + +A newer release is downgradable to the pre-upgrade release +only if both the namenode layout version and the datanode layout version +are not changed between these two releases. + + +### Downgrade without Downtime + +In a HA cluster, +when a rolling upgrade from an old software release to a new software release is in progress, +it is possible to downgrade, in a rolling fashion, the upgraded machines back to the old software release. +Same as before, suppose *NN1* and *NN2* are respectively in active and standby states. +Below are the steps for rolling downgrade: + +1. Downgrade *DNs* + 1. Choose a small subset of datanodes (e.g. all datanodes under a particular rack). + 1. Run "[`hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade`](#dfsadmin_-shutdownDatanode)" + to shutdown one of the chosen datanodes. + 1. Run "[`hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>`](#dfsadmin_-getDatanodeInfo)" + to check and wait for the datanode to shutdown. + 1. Downgrade and restart the datanode. + 1. Perform the above steps for all the chosen datanodes in the subset in parallel. + 1. Repeat the above steps until all upgraded datanodes in the cluster are downgraded. +1. Downgrade Active and Standby *NNs* + 1. Shutdown and downgrade *NN2*. + 1. 
Start *NN2* as standby normally. (Note that it is incorrect to use the + "[`-rollingUpgrade downgrade`](#namenode_-rollingUpgrade)" option here.) + 1. Failover from *NN1* to *NN2* + so that *NN2* becomes active and *NN1* becomes standby. + 1. Shutdown and downgrade *NN1*. + 1. Start *NN1* as standby normally. (Note that it is incorrect to use the + "[`-rollingUpgrade downgrade`](#namenode_-rollingUpgrade)" option here.) +1. Finalize Rolling Downgrade + 1. Run "[`hdfs dfsadmin -rollingUpgrade finalize`](#dfsadmin_-rollingUpgrade)" + to finalize the rolling downgrade. + +Note that the datanodes must be downgraded before downgrading the namenodes +since protocols may be changed in a backward compatible manner but not forward compatible, +i.e. old datanodes can talk to the new namenodes but not vice versa. + + +### Downgrade with Downtime + +Administrators may choose to first shutdown the cluster and then downgrade it. +The following are the steps: + +1. Shutdown all *NNs* and *DNs*. +1. Restore the pre-upgrade release in all machines. +1. Start *NNs* with the + "[`-rollingUpgrade downgrade`](#namenode_-rollingUpgrade)" option. +1. Start *DNs* normally. + + + +Rollback +-------- + +*Rollback* restores the software back to the pre-upgrade release +but also reverts the user data back to the pre-upgrade state. +Suppose time *T* is the rolling upgrade start time and the upgrade is terminated by rollback. +The files created before *T* remain available in HDFS but the files created after *T* become unavailable. +The files deleted before *T* remain deleted in HDFS but the files deleted after *T* are restored. + +Rollback from a newer release to the pre-upgrade release is always supported. +However, it cannot be done in a rolling fashion. It requires cluster downtime. +Suppose *NN1* and *NN2* are respectively in active and standby states. +Below are the steps for rollback: + +* Rollback HDFS + 1. Shutdown all *NNs* and *DNs*. + 1. Restore the pre-upgrade release in all machines. 
+ 1. Start *NN1* as Active with the + "[`-rollingUpgrade rollback`](#namenode_-rollingUpgrade)" option. + 1. Run `-bootstrapStandby` on *NN2* and start it normally as standby. + 1. Start *DNs* with the "`-rollback`" option. + + +Commands and Startup Options for Rolling Upgrade +------------------------------------------------ + +### DFSAdmin Commands + +#### `dfsadmin -rollingUpgrade` + + hdfs dfsadmin -rollingUpgrade <query|prepare|finalize> + +Execute a rolling upgrade action. + +* Options: + + | --- | --- | + | `query` | Query the current rolling upgrade status. | + | `prepare` | Prepare a new rolling upgrade. | + | `finalize` | Finalize the current rolling upgrade. | + + +#### `dfsadmin -getDatanodeInfo` + + hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT> + +Get the information about the given datanode. +This command can be used for checking if a datanode is alive +like the Unix `ping` command. + + +#### `dfsadmin -shutdownDatanode` + + hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> [upgrade] + +Submit a shutdown request for the given datanode. +If the optional `upgrade` argument is specified, +clients accessing the datanode will be advised to wait for it to restart +and the fast start-up mode will be enabled. +When the restart does not happen in time, clients will time out and ignore the datanode. +In such a case, the fast start-up mode will also be disabled. + +Note that the command does not wait for the datanode shutdown to complete. +The "[`dfsadmin -getDatanodeInfo`](#dfsadmin_-getDatanodeInfo)" +command can be used for checking if the datanode shutdown is completed. + + +### NameNode Startup Options + +#### `namenode -rollingUpgrade` + + hdfs namenode -rollingUpgrade <downgrade|rollback|started> + +When a rolling upgrade is in progress, +the `-rollingUpgrade` namenode startup option is used to specify +various rolling upgrade options. 
+ +* Options: + + | --- | --- | + | `downgrade` | Restores the namenode back to the pre-upgrade release and preserves the user data. | + | `rollback` | Restores the namenode back to the pre-upgrade release but also reverts the user data back to the pre-upgrade state. | + | `started` | Specifies a rolling upgrade already started so that the namenode should allow image directories with different layout versions during startup. | http://git-wip-us.apache.org/repos/asf/hadoop/blob/c260a8aa/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md ---------------------------------------------------------------------- diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md new file mode 100644 index 0000000..94a37cd --- /dev/null +++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsSnapshots.md @@ -0,0 +1,301 @@ +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--> + +HDFS Snapshots +============== + +* [HDFS Snapshots](#HDFS_Snapshots) + * [Overview](#Overview) + * [Snapshottable Directories](#Snapshottable_Directories) + * [Snapshot Paths](#Snapshot_Paths) + * [Upgrading to a version of HDFS with snapshots](#Upgrading_to_a_version_of_HDFS_with_snapshots) + * [Snapshot Operations](#Snapshot_Operations) + * [Administrator Operations](#Administrator_Operations) + * [Allow Snapshots](#Allow_Snapshots) + * [Disallow Snapshots](#Disallow_Snapshots) + * [User Operations](#User_Operations) + * [Create Snapshots](#Create_Snapshots) + * [Delete Snapshots](#Delete_Snapshots) + * [Rename Snapshots](#Rename_Snapshots) + * [Get Snapshottable Directory Listing](#Get_Snapshottable_Directory_Listing) + * [Get Snapshots Difference Report](#Get_Snapshots_Difference_Report) + + +Overview +-------- + +HDFS Snapshots are read-only point-in-time copies of the file system. +Snapshots can be taken on a subtree of the file system or the entire file system. +Some common use cases of snapshots are data backup, protection against user errors +and disaster recovery. + +The implementation of HDFS Snapshots is efficient: + + +* Snapshot creation is instantaneous: + the cost is *O(1)* excluding the inode lookup time. + +* Additional memory is used only when modifications are made relative to a snapshot: + memory usage is *O(M)*, + where *M* is the number of modified files/directories. + +* Blocks in datanodes are not copied: + the snapshot files record the block list and the file size. + There is no data copying. + +* Snapshots do not adversely affect regular HDFS operations: + modifications are recorded in reverse chronological order + so that the current data can be accessed directly. + The snapshot data is computed by subtracting the modifications + from the current data. + + +### Snapshottable Directories + +Snapshots can be taken on any directory once the directory has been set as +*snapshottable*. 
+A snapshottable directory is able to accommodate 65,536 simultaneous snapshots. +There is no limit on the number of snapshottable directories. +Administrators may set any directory to be snapshottable. +If there are snapshots in a snapshottable directory, +the directory can be neither deleted nor renamed +before all the snapshots are deleted. + +Nested snapshottable directories are currently not allowed. +In other words, a directory cannot be set to snapshottable +if one of its ancestors/descendants is a snapshottable directory. + + +### Snapshot Paths + +For a snapshottable directory, +the path component *".snapshot"* is used for accessing its snapshots. +Suppose `/foo` is a snapshottable directory, +`/foo/bar` is a file/directory in `/foo`, +and `/foo` has a snapshot `s0`. +Then, the path `/foo/.snapshot/s0/bar` +refers to the snapshot copy of `/foo/bar`. +The usual API and CLI can work with the ".snapshot" paths. +The following are some examples. + +* Listing all the snapshots under a snapshottable directory: + + hdfs dfs -ls /foo/.snapshot + +* Listing the files in snapshot `s0`: + + hdfs dfs -ls /foo/.snapshot/s0 + +* Copying a file from snapshot `s0`: + + hdfs dfs -cp -ptopax /foo/.snapshot/s0/bar /tmp + + Note that this example uses the preserve option to preserve + timestamps, ownership, permission, ACLs and XAttrs. + + +Upgrading to a version of HDFS with snapshots +--------------------------------------------- + +The HDFS snapshot feature introduces a new reserved path name used to +interact with snapshots: `.snapshot`. When upgrading from an +older version of HDFS, existing paths named `.snapshot` need +to first be renamed or deleted to avoid conflicting with the reserved path. +See the upgrade section in +[the HDFS user guide](HdfsUserGuide.html#Upgrade_and_Rollback) +for more information. + + +Snapshot Operations +------------------- + + +### Administrator Operations + +The operations described in this section require superuser privilege. 
+ + +#### Allow Snapshots + + +Allowing snapshots of a directory to be created. +If the operation completes successfully, the directory becomes snapshottable. + +* Command: + + hdfs dfsadmin -allowSnapshot <path> + +* Arguments: + + | --- | --- | + | path | The path of the snapshottable directory. | + +See also the corresponding Java API +`void allowSnapshot(Path path)` in `HdfsAdmin`. + + +#### Disallow Snapshots + +Disallowing snapshots of a directory to be created. +All snapshots of the directory must be deleted before disallowing snapshots. + +* Command: + + hdfs dfsadmin -disallowSnapshot <path> + +* Arguments: + + | --- | --- | + | path | The path of the snapshottable directory. | + +See also the corresponding Java API +`void disallowSnapshot(Path path)` in `HdfsAdmin`. + + +### User Operations + +This section describes user operations. +Note that the HDFS superuser can perform all the operations +without satisfying the permission requirement in the individual operations. + + +#### Create Snapshots + +Create a snapshot of a snapshottable directory. +This operation requires owner privilege of the snapshottable directory. + +* Command: + + hdfs dfs -createSnapshot <path> [<snapshotName>] + +* Arguments: + + | --- | --- | + | path | The path of the snapshottable directory. | + | snapshotName | The snapshot name, which is an optional argument. When it is omitted, a default name is generated using a timestamp with the format `"'s'yyyyMMdd-HHmmss.SSS"`, e.g. `"s20130412-151029.033"`. | + +See also the corresponding Java API +`Path createSnapshot(Path path)` and +`Path createSnapshot(Path path, String snapshotName)` +in [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html). +The snapshot path is returned in these methods. + + +#### Delete Snapshots + +Delete a snapshot from a snapshottable directory. +This operation requires owner privilege of the snapshottable directory. 
+ +* Command: + + hdfs dfs -deleteSnapshot <path> <snapshotName> + +* Arguments: + + | --- | --- | + | path | The path of the snapshottable directory. | + | snapshotName | The snapshot name. | + +See also the corresponding Java API +`void deleteSnapshot(Path path, String snapshotName)` +in [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html). + + +#### Rename Snapshots + +Rename a snapshot. +This operation requires owner privilege of the snapshottable directory. + +* Command: + + hdfs dfs -renameSnapshot <path> <oldName> <newName> + +* Arguments: + + | --- | --- | + | path | The path of the snapshottable directory. | + | oldName | The old snapshot name. | + | newName | The new snapshot name. | + +See also the corresponding Java API +`void renameSnapshot(Path path, String oldName, String newName)` +in [FileSystem](../../api/org/apache/hadoop/fs/FileSystem.html). + + +#### Get Snapshottable Directory Listing + +Get all the snapshottable directories where the current user has permission to take snapshots. + +* Command: + + hdfs lsSnapshottableDir + +* Arguments: none + +See also the corresponding Java API +`SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()` +in `DistributedFileSystem`. + + +#### Get Snapshots Difference Report + +Get the differences between two snapshots. +This operation requires read access privilege for all files/directories in both snapshots. + +* Command: + + hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot> + +* Arguments: + + | --- | --- | + | path | The path of the snapshottable directory. | + | fromSnapshot | The name of the starting snapshot. | + | toSnapshot | The name of the ending snapshot. | + + Note that snapshotDiff can be used to get the difference report between two snapshots, or between + a snapshot and the current status of a directory. Users can use "." to represent the current status. + +* Results: + + | --- | --- | + | \+ | The file/directory has been created. 
| + | \- | The file/directory has been deleted. | + | M | The file/directory has been modified. | + | R | The file/directory has been renamed. | + +A *RENAME* entry indicates a file/directory has been renamed but +is still under the same snapshottable directory. A file/directory is +reported as deleted if it was renamed to outside of the snapshottable directory. +A file/directory renamed from outside of the snapshottable directory is +reported as newly created. + +The snapshot difference report does not guarantee the same operation sequence. +For example, if we rename the directory *"/foo"* to *"/foo2"*, and +then append new data to the file *"/foo2/bar"*, the difference report will +be: + + R. /foo -> /foo2 + M. /foo/bar + +I.e., the changes on the files/directories under a renamed directory are +reported using the original path before the rename (*"/foo/bar"* in +the above example). + +See also the corresponding Java API +`SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)` +in `DistributedFileSystem`. http://git-wip-us.apache.org/repos/asf/hadoop/blob/c260a8aa/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml ---------------------------------------------------------------------- diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml deleted file mode 100644 index 8fd4f1c..0000000 --- a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsRollingUpgrade.xml +++ /dev/null @@ -1,350 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!-- - Licensed to the Apache Software Foundation (ASF) under one or more - contributor license agreements. See the NOTICE file distributed with - this work for additional information regarding copyright ownership. - The ASF licenses this file to You under the Apache License, Version 2.0 - (the "License"); you may not use this file except in compliance with - the License. 
You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. ---> -<document xmlns="http://maven.apache.org/XDOC/2.0" - xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" - xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd"> - - <properties> - <title>HDFS Rolling Upgrade</title> - </properties> - - <body> - - <h1>HDFS Rolling Upgrade</h1> - <macro name="toc"> - <param name="section" value="0"/> - <param name="fromDepth" value="0"/> - <param name="toDepth" value="4"/> - </macro> - - <section name="Introduction" id="Introduction"> - <p> - <em>HDFS rolling upgrade</em> allows upgrading individual HDFS daemons. - For examples, the datanodes can be upgraded independent of the namenodes. - A namenode can be upgraded independent of the other namenodes. - The namenodes can be upgraded independent of datanods and journal nodes. - </p> - </section> - - <section name="Upgrade" id="Upgrade"> - <p> - In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility. - These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime. - In order to upgrade a HDFS cluster without downtime, the cluster must be setup with HA. - </p> - <p> - If there is any new feature which is enabled in new software release, may not work with old software release after upgrade. - In such cases upgrade should be done by following steps. - </p> - <ol> - <li>Disable new feature.</li> - <li>Upgrade the cluster.</li> - <li>Enable the new feature.</li> - </ol> - <p> - Note that rolling upgrade is supported only from Hadoop-2.4.0 onwards. 
- </p> - <subsection name="Upgrade without Downtime" id="UpgradeWithoutDowntime"> - <p> - In a HA cluster, there are two or more <em>NameNodes (NNs)</em>, many <em>DataNodes (DNs)</em>, - a few <em>JournalNodes (JNs)</em> and a few <em>ZooKeeperNodes (ZKNs)</em>. - <em>JNs</em> is relatively stable and does not require upgrade when upgrading HDFS in most of the cases. - In the rolling upgrade procedure described here, - only <em>NNs</em> and <em>DNs</em> are considered but <em>JNs</em> and <em>ZKNs</em> are not. - Upgrading <em>JNs</em> and <em>ZKNs</em> may incur cluster downtime. - </p> - - <h4>Upgrading Non-Federated Clusters</h4> - <p> - Suppose there are two namenodes <em>NN1</em> and <em>NN2</em>, - where <em>NN1</em> and <em>NN2</em> are respectively in active and standby states. - The following are the steps for upgrading a HA cluster: - </p> - <ol> - <li>Prepare Rolling Upgrade<ol> - <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade prepare</a></code>" - to create a fsimage for rollback. - </li> - <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade query</a></code>" - to check the status of the rollback image. - Wait and re-run the command until - the "<tt>Proceed with rolling upgrade</tt>" message is shown. - </li> - </ol></li> - <li>Upgrade Active and Standby <em>NNs</em><ol> - <li>Shutdown and upgrade <em>NN2</em>.</li> - <li>Start <em>NN2</em> as standby with the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>" option.</li> - <li>Failover from <em>NN1</em> to <em>NN2</em> - so that <em>NN2</em> becomes active and <em>NN1</em> becomes standby.</li> - <li>Shutdown and upgrade <em>NN1</em>.</li> - <li>Start <em>NN1</em> as standby with the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>" option.</li> - </ol></li> - <li>Upgrade <em>DNs</em><ol> - <li>Choose a small subset of datanodes (e.g. 
all datanodes under a particular rack).</li> - <ol> - <li>Run "<code><a href="#dfsadmin_-shutdownDatanode">hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade</a></code>" - to shutdown one of the chosen datanodes.</li> - <li>Run "<code><a href="#dfsadmin_-getDatanodeInfo">hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT></a></code>" - to check and wait for the datanode to shutdown.</li> - <li>Upgrade and restart the datanode.</li> - <li>Perform the above steps for all the chosen datanodes in the subset in parallel.</li> - </ol> - <li>Repeat the above steps until all datanodes in the cluster are upgraded.</li> - </ol></li> - <li>Finalize Rolling Upgrade<ul> - <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade finalize</a></code>" - to finalize the rolling upgrade.</li> - </ul></li> - </ol> - - <h4>Upgrading Federated Clusters</h4> - <p> - In a federated cluster, there are multiple namespaces - and a pair of active and standby <em>NNs</em> for each namespace. - The procedure for upgrading a federated cluster is similar to upgrading a non-federated cluster - except that Step 1 and Step 4 are performed on each namespace - and Step 2 is performed on each pair of active and standby <em>NNs</em>, i.e. - </p> - <ol> - <li>Prepare Rolling Upgrade for Each Namespace</li> - <li>Upgrade Active and Standby <em>NN</em> pairs for Each Namespace</li> - <li>Upgrade <em>DNs</em></li> - <li>Finalize Rolling Upgrade for Each Namespace</li> - </ol> - - </subsection> - - <subsection name="Upgrade with Downtime" id="UpgradeWithDowntime"> - <p> - For non-HA clusters, - it is impossible to upgrade HDFS without downtime since it requires restarting the namenodes. - However, datanodes can still be upgraded in a rolling manner. - </p> - - <h4>Upgrading Non-HA Clusters</h4> - <p> - In a non-HA cluster, there are a <em>NameNode (NN)</em>, a <em>SecondaryNameNode (SNN)</em> - and many <em>DataNodes (DNs)</em>. 
- The procedure for upgrading a non-HA cluster is similar to upgrading a HA cluster - except that Step 2 "Upgrade Active and Standby <em>NNs</em>" is changed to below: - </p> - <ul> - <li>Upgrade <em>NN</em> and <em>SNN</em><ol> - <li>Shutdown <em>SNN</em></li> - <li>Shutdown and upgrade <em>NN</em>.</li> - <li>Start <em>NN</em> with the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade started</code></a>" option.</li> - <li>Upgrade and restart <em>SNN</em></li> - </ol></li> - </ul> - </subsection> - </section> - - <section name="Downgrade and Rollback" id="DowngradeAndRollback"> - <p> - When the upgraded release is undesirable - or, in some unlikely case, the upgrade fails (due to bugs in the newer release), - administrators may choose to downgrade HDFS back to the pre-upgrade release, - or rollback HDFS to the pre-upgrade release and the pre-upgrade state. - </p> - <p> - Note that downgrade can be done in a rolling fashion but rollback cannot. - Rollback requires cluster downtime. - </p> - <p> - Note also that downgrade and rollback are possible only after a rolling upgrade is started and - before the upgrade is terminated. - An upgrade can be terminated by either finalize, downgrade or rollback. - Therefore, it may not be possible to perform rollback after finalize or downgrade, - or to perform downgrade after finalize. - </p> - </section> - - <section name="Downgrade" id="Downgrade"> - <p> - <em>Downgrade</em> restores the software back to the pre-upgrade release - and preserves the user data. - Suppose time <em>T</em> is the rolling upgrade start time and the upgrade is terminated by downgrade. - Then, the files created before or after <em>T</em> remain available in HDFS. - The files deleted before or after <em>T</em> remain deleted in HDFS. - </p> - <p> - A newer release is downgradable to the pre-upgrade release - only if both the namenode layout version and the datanode layout version - are not changed between these two releases.
- </p> - - <subsection name="Downgrade without Downtime" id="DowngradeWithoutDowntime"> - <p> - In a HA cluster, - when a rolling upgrade from an old software release to a new software release is in progress, - it is possible to downgrade, in a rolling fashion, the upgraded machines back to the old software release. - As before, suppose <em>NN1</em> and <em>NN2</em> are respectively in active and standby states. - Below are the steps for rolling downgrade: - </p> - <ol> - <li>Downgrade <em>DNs</em><ol> - <li>Choose a small subset of datanodes (e.g. all datanodes under a particular rack).</li> - <ol> - <li>Run "<code><a href="#dfsadmin_-shutdownDatanode">hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade</a></code>" - to shutdown one of the chosen datanodes.</li> - <li>Run "<code><a href="#dfsadmin_-getDatanodeInfo">hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT></a></code>" - to check and wait for the datanode to shutdown.</li> - <li>Downgrade and restart the datanode.</li> - <li>Perform the above steps for all the chosen datanodes in the subset in parallel.</li> - </ol> - <li>Repeat the above steps until all upgraded datanodes in the cluster are downgraded.</li> - </ol></li> - <li>Downgrade Active and Standby <em>NNs</em><ol> - <li>Shutdown and downgrade <em>NN2</em>.</li> - <li>Start <em>NN2</em> as standby normally. (Note that it is incorrect to use the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade downgrade</code></a>" - option here.) - </li> - <li>Failover from <em>NN1</em> to <em>NN2</em> - so that <em>NN2</em> becomes active and <em>NN1</em> becomes standby.</li> - <li>Shutdown and downgrade <em>NN1</em>.</li> - <li>Start <em>NN1</em> as standby normally. (Note that it is incorrect to use the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade downgrade</code></a>" - option here.)
- </li> - </ol></li> - <li>Finalize Rolling Downgrade<ul> - <li>Run "<code><a href="#dfsadmin_-rollingUpgrade">hdfs dfsadmin -rollingUpgrade finalize</a></code>" - to finalize the rolling downgrade.</li> - </ul></li> - </ol> - <p> - Note that the datanodes must be downgraded before downgrading the namenodes - since protocols may be changed in a backward compatible manner but not forward compatible, - i.e. old datanodes can talk to the new namenodes but not vice versa. - </p> - </subsection> - <subsection name="Downgrade with Downtime" id="DowngradeWithDowntime"> - <p> - Administrator may choose to first shutdown the cluster and then downgrade it. - The following are the steps: - </p> - <ol> - <li>Shutdown all <em>NNs</em> and <em>DNs</em>.</li> - <li>Restore the pre-upgrade release in all machines.</li> - <li>Start <em>NNs</em> with the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade downgrade</code></a>" option.</li> - <li>Start <em>DNs</em> normally.</li> - </ol> - </subsection> - </section> - - <section name="Rollback" id="Rollback"> - <p> - <em>Rollback</em> restores the software back to the pre-upgrade release - but also reverts the user data back to the pre-upgrade state. - Suppose time <em>T</em> is the rolling upgrade start time and the upgrade is terminated by rollback. - The files created before <em>T</em> remain available in HDFS but the files created after <em>T</em> become unavailable. - The files deleted before <em>T</em> remain deleted in HDFS but the files deleted after <em>T</em> are restored. - </p> - <p> - Rollback from a newer release to the pre-upgrade release is always supported. - However, it cannot be done in a rolling fashion. It requires cluster downtime. - Suppose <em>NN1</em> and <em>NN2</em> are respectively in active and standby states. 
- Below are the steps for rollback: - </p> - <ul> - <li>Rollback HDFS<ol> - <li>Shutdown all <em>NNs</em> and <em>DNs</em>.</li> - <li>Restore the pre-upgrade release in all machines.</li> - <li>Start <em>NN1</em> as Active with the - "<a href="#namenode_-rollingUpgrade"><code>-rollingUpgrade rollback</code></a>" option.</li> - <li>Run `-bootstrapStandby' on NN2 and start it normally as standby.</li> - <li>Start <em>DNs</em> with the "<code>-rollback</code>" option.</li> - </ol></li> - </ul> - - </section> - - <section name="Commands and Startup Options for Rolling Upgrade" id="dfsadminCommands"> - - <subsection name="DFSAdmin Commands" id="dfsadminCommands"> - <h4><code>dfsadmin -rollingUpgrade</code></h4> - <source>hdfs dfsadmin -rollingUpgrade <query|prepare|finalize></source> - <p> - Execute a rolling upgrade action. - <ul><li>Options:<table> - <tr><td><code>query</code></td><td>Query the current rolling upgrade status.</td></tr> - <tr><td><code>prepare</code></td><td>Prepare a new rolling upgrade.</td></tr> - <tr><td><code>finalize</code></td><td>Finalize the current rolling upgrade.</td></tr> - </table></li></ul> - </p> - - <h4><code>dfsadmin -getDatanodeInfo</code></h4> - <source>hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT></source> - <p> - Get the information about the given datanode. - This command can be used for checking if a datanode is alive - like the Unix <code>ping</code> command. - </p> - - <h4><code>dfsadmin -shutdownDatanode</code></h4> - <source>hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> [upgrade]</source> - <p> - Submit a shutdown request for the given datanode. - If the optional <code>upgrade</code> argument is specified, - clients accessing the datanode will be advised to wait for it to restart - and the fast start-up mode will be enabled. - When the restart does not happen in time, clients will timeout and ignore the datanode. - In such case, the fast start-up mode will also be disabled. 
- </p> - <p> - Note that the command does not wait for the datanode shutdown to complete. - The "<a href="#dfsadmin_-getDatanodeInfo">dfsadmin -getDatanodeInfo</a>" - command can be used for checking if the datanode shutdown is completed. - </p> - </subsection> - - <subsection name="NameNode Startup Options" id="dfsadminCommands"> - - <h4><code>namenode -rollingUpgrade</code></h4> - <source>hdfs namenode -rollingUpgrade <downgrade|rollback|started></source> - <p> - When a rolling upgrade is in progress, - the <code>-rollingUpgrade</code> namenode startup option is used to specify - various rolling upgrade options. - </p> - <ul><li>Options:<table> - <tr><td><code>downgrade</code></td> - <td>Restores the namenode back to the pre-upgrade release - and preserves the user data.</td> - </tr> - <tr><td><code>rollback</code></td> - <td>Restores the namenode back to the pre-upgrade release - but also reverts the user data back to the pre-upgrade state.</td> - </tr> - <tr><td><code>started</code></td> - <td>Specifies a rolling upgrade already started - so that the namenode should allow image directories - with different layout versions during startup.</td> - </tr> - </table></li></ul> - - </subsection> - - </section> - </body> -</document> http://git-wip-us.apache.org/repos/asf/hadoop/blob/c260a8aa/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml ---------------------------------------------------------------------- diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml b/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml deleted file mode 100644 index 330d00f..0000000 --- a/hadoop-hdfs-project/hadoop-hdfs/src/site/xdoc/HdfsSnapshots.xml +++ /dev/null @@ -1,303 +0,0 @@ -<?xml version="1.0" encoding="UTF-8"?> -<!-- - Licensed to the Apache Software Foundation (ASF) under one or more - contributor license agreements. See the NOTICE file distributed with - this work for additional information regarding copyright ownership. 
- The ASF licenses this file to You under the Apache License, Version 2.0 - (the "License"); you may not use this file except in compliance with - the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. ---> -<document xmlns="http://maven.apache.org/XDOC/2.0" - xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" - xsi:schemaLocation="http://maven.apache.org/XDOC/2.0 http://maven.apache.org/xsd/xdoc-2.0.xsd"> - - <properties> - <title>HDFS Snapshots</title> - </properties> - - <body> - - <h1>HDFS Snapshots</h1> - <macro name="toc"> - <param name="section" value="0"/> - <param name="fromDepth" value="0"/> - <param name="toDepth" value="4"/> - </macro> - - <section name="Overview" id="Overview"> - <p> - HDFS Snapshots are read-only point-in-time copies of the file system. - Snapshots can be taken on a subtree of the file system or the entire file system. - Some common use cases of snapshots are data backup, protection against user errors - and disaster recovery. - </p> - - <p> - The implementation of HDFS Snapshots is efficient: - </p> - <ul> - <li>Snapshot creation is instantaneous: - the cost is <em>O(1)</em> excluding the inode lookup time.</li> - <li>Additional memory is used only when modifications are made relative to a snapshot: - memory usage is <em>O(M)</em>, - where <em>M</em> is the number of modified files/directories.</li> - <li>Blocks in datanodes are not copied: - the snapshot files record the block list and the file size. 
- There is no data copying.</li> - <li>Snapshots do not adversely affect regular HDFS operations: - modifications are recorded in reverse chronological order - so that the current data can be accessed directly. - The snapshot data is computed by subtracting the modifications - from the current data.</li> - </ul> - - <subsection name="Snapshottable Directories" id="SnapshottableDirectories"> - <p> - Snapshots can be taken on any directory once the directory has been set as - <em>snapshottable</em>. - A snapshottable directory is able to accommodate 65,536 simultaneous snapshots. - There is no limit on the number of snapshottable directories. - Administrators may set any directory to be snapshottable. - If there are snapshots in a snapshottable directory, - the directory can be neither deleted nor renamed - before all the snapshots are deleted. - </p> - - <p> - Nested snapshottable directories are currently not allowed. - In other words, a directory cannot be set to snapshottable - if one of its ancestors/descendants is a snapshottable directory. - </p> - - </subsection> - - <subsection name="Snapshot Paths" id="SnapshotPaths"> - <p> - For a snapshottable directory, - the path component <em>".snapshot"</em> is used for accessing its snapshots. - Suppose <code>/foo</code> is a snapshottable directory, - <code>/foo/bar</code> is a file/directory in <code>/foo</code>, - and <code>/foo</code> has a snapshot <code>s0</code>. - Then, the path <source>/foo/.snapshot/s0/bar</source> - refers to the snapshot copy of <code>/foo/bar</code>. - The usual API and CLI can work with the ".snapshot" paths. - The following are some examples. 
- </p> - <ul> - <li>Listing all the snapshots under a snapshottable directory: - <source>hdfs dfs -ls /foo/.snapshot</source></li> - <li>Listing the files in snapshot <code>s0</code>: - <source>hdfs dfs -ls /foo/.snapshot/s0</source></li> - <li>Copying a file from snapshot <code>s0</code>: - <source>hdfs dfs -cp -ptopax /foo/.snapshot/s0/bar /tmp</source> - <p>Note that this example uses the preserve option to preserve - timestamps, ownership, permission, ACLs and XAttrs.</p></li> - </ul> - </subsection> - </section> - - <section name="Upgrading to a version of HDFS with snapshots" id="Upgrade"> - - <p> - The HDFS snapshot feature introduces a new reserved path name used to - interact with snapshots: <tt>.snapshot</tt>. When upgrading from an - older version of HDFS, existing paths named <tt>.snapshot</tt> need - to first be renamed or deleted to avoid conflicting with the reserved path. - See the upgrade section in - <a href="HdfsUserGuide.html#Upgrade_and_Rollback">the HDFS user guide</a> - for more information. </p> - - </section> - - <section name="Snapshot Operations" id="SnapshotOperations"> - <subsection name="Administrator Operations" id="AdministratorOperations"> - <p> - The operations described in this section require superuser privilege. - </p> - - <h4>Allow Snapshots</h4> - <p> - Allowing snapshots of a directory to be created. - If the operation completes successfully, the directory becomes snapshottable. - </p> - <ul> - <li>Command: - <source>hdfs dfsadmin -allowSnapshot <path></source></li> - <li>Arguments:<table> - <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> - </table></li> - </ul> - <p> - See also the corresponding Java API - <code>void allowSnapshot(Path path)</code> in <code>HdfsAdmin</code>. - </p> - - <h4>Disallow Snapshots</h4> - <p> - Disallowing snapshots of a directory to be created. - All snapshots of the directory must be deleted before disallowing snapshots. 
- </p> - <ul> - <li>Command: - <source>hdfs dfsadmin -disallowSnapshot <path></source></li> - <li>Arguments:<table> - <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> - </table></li> - </ul> - <p> - See also the corresponding Java API - <code>void disallowSnapshot(Path path)</code> in <code>HdfsAdmin</code>. - </p> - </subsection> - - <subsection name="User Operations" id="UserOperations"> - <p> - This section describes user operations. - Note that the HDFS superuser can perform all the operations - without satisfying the permission requirement in the individual operations. - </p> - - <h4>Create Snapshots</h4> - <p> - Create a snapshot of a snapshottable directory. - This operation requires owner privilege of the snapshottable directory. - </p> - <ul> - <li>Command: - <source>hdfs dfs -createSnapshot <path> [<snapshotName>]</source></li> - <li>Arguments:<table> - <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> - <tr><td>snapshotName</td><td> - The snapshot name, which is an optional argument. - When it is omitted, a default name is generated using a timestamp with the format - <code>"'s'yyyyMMdd-HHmmss.SSS"</code>, e.g. "s20130412-151029.033". - </td></tr> - </table></li> - </ul> - <p> - See also the corresponding Java API - <code>Path createSnapshot(Path path)</code> and - <code>Path createSnapshot(Path path, String snapshotName)</code> - in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>. - The snapshot path is returned by these methods. - </p> - - <h4>Delete Snapshots</h4> - <p> - Delete a snapshot from a snapshottable directory. - This operation requires owner privilege of the snapshottable directory.
- </p> - <ul> - <li>Command: - <source>hdfs dfs -deleteSnapshot <path> <snapshotName></source></li> - <li>Arguments:<table> - <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> - <tr><td>snapshotName</td><td>The snapshot name.</td></tr> - </table></li> - </ul> - <p> - See also the corresponding Java API - <code>void deleteSnapshot(Path path, String snapshotName)</code> - in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>. - </p> - - <h4>Rename Snapshots</h4> - <p> - Rename a snapshot. - This operation requires owner privilege of the snapshottable directory. - </p> - <ul> - <li>Command: - <source>hdfs dfs -renameSnapshot <path> <oldName> <newName></source></li> - <li>Arguments:<table> - <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> - <tr><td>oldName</td><td>The old snapshot name.</td></tr> - <tr><td>newName</td><td>The new snapshot name.</td></tr> - </table></li> - </ul> - <p> - See also the corresponding Java API - <code>void renameSnapshot(Path path, String oldName, String newName)</code> - in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><code>FileSystem</code></a>. - </p> - - <h4>Get Snapshottable Directory Listing</h4> - <p> - Get all the snapshottable directories where the current user has permission to take snapshots. - </p> - <ul> - <li>Command: - <source>hdfs lsSnapshottableDir</source></li> - <li>Arguments: none</li> - </ul> - <p> - See also the corresponding Java API - <code>SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()</code> - in <code>DistributedFileSystem</code>. - </p> - - <h4>Get Snapshots Difference Report</h4> - <p> - Get the differences between two snapshots. - This operation requires read access privilege for all files/directories in both snapshots.
- </p> - <ul> - <li>Command: - <source>hdfs snapshotDiff <path> <fromSnapshot> <toSnapshot></source></li> - <li>Arguments:<table> - <tr><td>path</td><td>The path of the snapshottable directory.</td></tr> - <tr><td>fromSnapshot</td><td>The name of the starting snapshot.</td></tr> - <tr><td>toSnapshot</td><td>The name of the ending snapshot.</td></tr> - </table></li> - <p> - Note that snapshotDiff can be used to get the difference report between two snapshots, or between - a snapshot and the current status of a directory. Users can use "." to represent the current status. - </p> - <li>Results: - <table> - <tr><td>+</td><td>The file/directory has been created.</td></tr> - <tr><td>-</td><td>The file/directory has been deleted.</td></tr> - <tr><td>M</td><td>The file/directory has been modified.</td></tr> - <tr><td>R</td><td>The file/directory has been renamed.</td></tr> - </table> - </li> - </ul> - <p> - A <em>RENAME</em> entry indicates a file/directory has been renamed but - is still under the same snapshottable directory. A file/directory is - reported as deleted if it was renamed to a location outside of the snapshottable directory. - A file/directory renamed from outside of the snapshottable directory is - reported as newly created. - </p> - <p> - The snapshot difference report does not guarantee the same operation sequence. - For example, if we rename the directory <em>"/foo"</em> to <em>"/foo2"</em>, and - then append new data to the file <em>"/foo2/bar"</em>, the difference report will - be: - <source> - R. /foo -> /foo2 - M. /foo/bar - </source> - I.e., the changes on the files/directories under a renamed directory are - reported using the original path before the rename (<em>"/foo/bar"</em> in - the above example). - </p> - <p> - See also the corresponding Java API - <code>SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)</code> - in <code>DistributedFileSystem</code>. - </p> - - </subsection> - </section> - - </body> -</document>
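The removed snapshot documentation notes that changes under a renamed directory are reported using the path from before the rename. As a rough illustration only, the Python sketch below parses report lines in the two-column form shown in that example and maps each modified path to its post-rename location. The helper names `parse_report` and `current_path`, and the assumption that every line is either `<type> <path>` or `R <old> -> <new>` (with an optional trailing dot on the type), are hypothetical and not part of HDFS; the real report output may differ in detail.

```python
def parse_report(lines):
    """Parse lines such as 'R. /foo -> /foo2' or 'M. /foo/bar'.

    Returns (kind, path, new_path) tuples; new_path is None unless the
    entry is a rename. The line format is an assumption for this sketch.
    """
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        kind, rest = line.split(None, 1)
        kind = kind.rstrip(".")  # accept both 'R' and 'R.'
        if kind == "R" and " -> " in rest:
            old, new = (part.strip() for part in rest.split(" -> ", 1))
            entries.append((kind, old, new))
        else:
            entries.append((kind, rest.strip(), None))
    return entries


def current_path(path, entries):
    """Map a pre-rename path to its location after the reported renames."""
    for kind, old, new in entries:
        if kind != "R":
            continue
        prefix = old.rstrip("/")
        if path == old:
            return new
        if path.startswith(prefix + "/"):
            # Keep the part of the path below the renamed directory.
            return new.rstrip("/") + path[len(prefix):]
    return path


report = ["R. /foo -> /foo2", "M. /foo/bar"]
entries = parse_report(report)
modified_now = [current_path(p, entries) for k, p, _ in entries if k == "M"]
print(modified_now)
```

With the two report lines from the example, the modified entry `/foo/bar` resolves to `/foo2/bar`, which is where the appended data actually lives after the rename.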