[11/50] [abbrv] hadoop git commit: HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G
HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G

Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/f4bc889b
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/f4bc889b
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/f4bc889b

Branch: refs/heads/HDFS-10285
Commit: f4bc889b042c18d5d760c6329cbfb04c0d0a1c78
Parents: 18c3709
Author: Rakesh Radhakrishnan
Authored: Fri Jul 14 22:36:09 2017 +0530
Committer: Rakesh Radhakrishnan
Committed: Thu Jul 19 22:46:57 2018 +0530

----------------------------------------------------------------------
 .../src/site/markdown/ArchivalStorage.md | 51 ++--
 1 file changed, 48 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/f4bc889b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
index a56cf8b..9098616 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
@@ -97,8 +97,44 @@ The effective storage policy can be retrieved by the "[`storagepolicies -getStor
 
 The default storage type of a datanode storage location will be DISK if it does not have a storage type tagged explicitly.
 
-Mover - A New Data Migration Tool
----------------------------------
+Storage Policy Based Data Movement
+----------------------------------
+
+Setting a new storage policy on already existing file/dir will change the policy in Namespace, but it will not move the blocks physically across storage medias.
+Following 2 options will allow users to move the blocks based on new policy set. So, once user change/set to a new policy on file/directory, user should also perform one of the following options to achieve the desired data movement. Note that both options cannot be allowed to run simultaneously.
+
+### Storage Policy Satisfier (SPS)
+
+When user changes the storage policy on a file/directory, user can call `HdfsAdmin` API `satisfyStoragePolicy()` to move the blocks as per the new policy set.
+The SPS daemon thread runs along with namenode and periodically scans for the storage mismatches between new policy set and the physical blocks placed. This will only track the files/directories for which user invoked satisfyStoragePolicy. If SPS identifies some blocks to be moved for a file, then it will schedule block movement tasks to datanodes. A Coordinator DataNode(C-DN) will track all block movements associated to a file and notify to namenode about movement success/failure. If there are any failures in movement, the SPS will re-attempt by sending new block movement task.
+
+SPS can be activated and deactivated dynamically without restarting the Namenode.
+
+Detailed design documentation can be found at [Storage Policy Satisfier(SPS) (HDFS-10285)](https://issues.apache.org/jira/browse/HDFS-10285)
+
+* **Note**: When user invokes `satisfyStoragePolicy()` API on a directory, SPS will consider the files which are immediate to that directory. Sub-directories won't be considered for satisfying the policy. Its user responsibility to call this API on directories recursively, to track all files under the sub tree.
+
+* HdfsAdmin API :
+`public void satisfyStoragePolicy(final Path path) throws IOException`
+
+* Arguments :
+
+| | |
+|:---- |:---- |
+| `path` | A path which requires blocks storage movement. |
+
+Configurations:
+
+* **dfs.storage.policy.satisfier.activate** - Used to activate or deactivate SPS. Configuring true represents SPS is
+ activated and vice versa.
+
+* **dfs.storage.policy.satisfier.recheck.timeout.millis** - A timeout to re-check the processed block storage movement
+ command results from Co-ordinator Datanode.
+
+* **dfs.storage.policy.satisfier.self.retry.timeout.millis** - A timeout to retry if no block movement results reported from
+ Co-ordinator Datanode in this configured timeout.
+
+### Mover - A New Data Migration Tool
 
 A new data migration tool is added for archiving data. The tool is similar to Balancer. It periodically scans the files in HDFS to check if the block placement satisfies the storage policy. For the blocks violating the storage policy, it moves the replicas to a different storage type in order to fulfill the storage policy requirement. Note that it always tries to move block replicas within the same node whenever possible. If that is not possible (e.g. when a node doesn't have the target storage type) then it will copy the block replicas to another node over the network.
 
@@ -115,6 +151,10 @@ A new data migration tool
[11/50] [abbrv] hadoop git commit: HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G
HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G

Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/afabd3b0
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/afabd3b0
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/afabd3b0

Branch: refs/heads/HDFS-10285
Commit: afabd3b0fdfd73c8c4e58688007c1e3eb77d696b
Parents: e19c79b
Author: Rakesh Radhakrishnan
Authored: Fri Jul 14 22:36:09 2017 +0530
Committer: Rakesh Radhakrishnan
Committed: Sun Jul 15 20:19:11 2018 +0530

----------------------------------------------------------------------
 .../src/site/markdown/ArchivalStorage.md | 51 ++--
 1 file changed, 48 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/afabd3b0/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
index a56cf8b..9098616 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
@@ -97,8 +97,44 @@ The effective storage policy can be retrieved by the "[`storagepolicies -getStor
 
 The default storage type of a datanode storage location will be DISK if it does not have a storage type tagged explicitly.
 
-Mover - A New Data Migration Tool
----------------------------------
+Storage Policy Based Data Movement
+----------------------------------
+
+Setting a new storage policy on already existing file/dir will change the policy in Namespace, but it will not move the blocks physically across storage medias.
+Following 2 options will allow users to move the blocks based on new policy set. So, once user change/set to a new policy on file/directory, user should also perform one of the following options to achieve the desired data movement. Note that both options cannot be allowed to run simultaneously.
+
+### Storage Policy Satisfier (SPS)
+
+When user changes the storage policy on a file/directory, user can call `HdfsAdmin` API `satisfyStoragePolicy()` to move the blocks as per the new policy set.
+The SPS daemon thread runs along with namenode and periodically scans for the storage mismatches between new policy set and the physical blocks placed. This will only track the files/directories for which user invoked satisfyStoragePolicy. If SPS identifies some blocks to be moved for a file, then it will schedule block movement tasks to datanodes. A Coordinator DataNode(C-DN) will track all block movements associated to a file and notify to namenode about movement success/failure. If there are any failures in movement, the SPS will re-attempt by sending new block movement task.
+
+SPS can be activated and deactivated dynamically without restarting the Namenode.
+
+Detailed design documentation can be found at [Storage Policy Satisfier(SPS) (HDFS-10285)](https://issues.apache.org/jira/browse/HDFS-10285)
+
+* **Note**: When user invokes `satisfyStoragePolicy()` API on a directory, SPS will consider the files which are immediate to that directory. Sub-directories won't be considered for satisfying the policy. Its user responsibility to call this API on directories recursively, to track all files under the sub tree.
+
+* HdfsAdmin API :
+`public void satisfyStoragePolicy(final Path path) throws IOException`
+
+* Arguments :
+
+| | |
+|:---- |:---- |
+| `path` | A path which requires blocks storage movement. |
+
+Configurations:
+
+* **dfs.storage.policy.satisfier.activate** - Used to activate or deactivate SPS. Configuring true represents SPS is
+ activated and vice versa.
+
+* **dfs.storage.policy.satisfier.recheck.timeout.millis** - A timeout to re-check the processed block storage movement
+ command results from Co-ordinator Datanode.
+
+* **dfs.storage.policy.satisfier.self.retry.timeout.millis** - A timeout to retry if no block movement results reported from
+ Co-ordinator Datanode in this configured timeout.
+
+### Mover - A New Data Migration Tool
 
 A new data migration tool is added for archiving data. The tool is similar to Balancer. It periodically scans the files in HDFS to check if the block placement satisfies the storage policy. For the blocks violating the storage policy, it moves the replicas to a different storage type in order to fulfill the storage policy requirement. Note that it always tries to move block replicas within the same node whenever possible. If that is not possible (e.g. when a node doesn't have the target storage type) then it will copy the block replicas to another node over the network.
 
@@ -115,6 +151,10 @@ A new data migration tool
[11/50] [abbrv] hadoop git commit: HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G
HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G

Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/ac0e71c3
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/ac0e71c3
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/ac0e71c3

Branch: refs/heads/HDFS-10285
Commit: ac0e71c3ce38a45023eb5d85cdbe0f22c8e7739f
Parents: b43972a
Author: Rakesh Radhakrishnan
Authored: Fri Jul 14 22:36:09 2017 +0530
Committer: Rakesh Radhakrishnan
Committed: Thu Jul 12 17:01:37 2018 +0530

----------------------------------------------------------------------
 .../src/site/markdown/ArchivalStorage.md | 51 ++--
 1 file changed, 48 insertions(+), 3 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/hadoop/blob/ac0e71c3/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
----------------------------------------------------------------------
diff --git a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
index a56cf8b..9098616 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
@@ -97,8 +97,44 @@ The effective storage policy can be retrieved by the "[`storagepolicies -getStor
 
 The default storage type of a datanode storage location will be DISK if it does not have a storage type tagged explicitly.
 
-Mover - A New Data Migration Tool
----------------------------------
+Storage Policy Based Data Movement
+----------------------------------
+
+Setting a new storage policy on already existing file/dir will change the policy in Namespace, but it will not move the blocks physically across storage medias.
+Following 2 options will allow users to move the blocks based on new policy set. So, once user change/set to a new policy on file/directory, user should also perform one of the following options to achieve the desired data movement. Note that both options cannot be allowed to run simultaneously.
+
+### Storage Policy Satisfier (SPS)
+
+When user changes the storage policy on a file/directory, user can call `HdfsAdmin` API `satisfyStoragePolicy()` to move the blocks as per the new policy set.
+The SPS daemon thread runs along with namenode and periodically scans for the storage mismatches between new policy set and the physical blocks placed. This will only track the files/directories for which user invoked satisfyStoragePolicy. If SPS identifies some blocks to be moved for a file, then it will schedule block movement tasks to datanodes. A Coordinator DataNode(C-DN) will track all block movements associated to a file and notify to namenode about movement success/failure. If there are any failures in movement, the SPS will re-attempt by sending new block movement task.
+
+SPS can be activated and deactivated dynamically without restarting the Namenode.
+
+Detailed design documentation can be found at [Storage Policy Satisfier(SPS) (HDFS-10285)](https://issues.apache.org/jira/browse/HDFS-10285)
+
+* **Note**: When user invokes `satisfyStoragePolicy()` API on a directory, SPS will consider the files which are immediate to that directory. Sub-directories won't be considered for satisfying the policy. Its user responsibility to call this API on directories recursively, to track all files under the sub tree.
+
+* HdfsAdmin API :
+`public void satisfyStoragePolicy(final Path path) throws IOException`
+
+* Arguments :
+
+| | |
+|:---- |:---- |
+| `path` | A path which requires blocks storage movement. |
+
+Configurations:
+
+* **dfs.storage.policy.satisfier.activate** - Used to activate or deactivate SPS. Configuring true represents SPS is
+ activated and vice versa.
+
+* **dfs.storage.policy.satisfier.recheck.timeout.millis** - A timeout to re-check the processed block storage movement
+ command results from Co-ordinator Datanode.
+
+* **dfs.storage.policy.satisfier.self.retry.timeout.millis** - A timeout to retry if no block movement results reported from
+ Co-ordinator Datanode in this configured timeout.
+
+### Mover - A New Data Migration Tool
 
 A new data migration tool is added for archiving data. The tool is similar to Balancer. It periodically scans the files in HDFS to check if the block placement satisfies the storage policy. For the blocks violating the storage policy, it moves the replicas to a different storage type in order to fulfill the storage policy requirement. Note that it always tries to move block replicas within the same node whenever possible. If that is not possible (e.g. when a node doesn't have the target storage type) then it will copy the block replicas to another node over the network.
 
@@ -115,6 +151,10 @@ A new data migration tool
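
Reviewer note on the patch: the **Note** in the added SPS section says that `satisfyStoragePolicy()` only considers the files directly inside a directory, so the caller has to invoke it once per directory in the subtree. A minimal sketch of that calling pattern follows. It is an illustration under assumptions, not code from the patch: `RecursiveSatisfy`, `PolicySatisfier` and `satisfyRecursively` are hypothetical names, `PolicySatisfier` merely stands in for the documented `HdfsAdmin.satisfyStoragePolicy(final Path path)` call, and the walk uses `java.nio.file` on a local directory tree rather than an HDFS namespace so that the example runs without a cluster.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RecursiveSatisfy {

    /**
     * Hypothetical stand-in for the documented HdfsAdmin call
     * satisfyStoragePolicy(final Path path) throws IOException.
     * A real caller would pass an HDFS path to HdfsAdmin instead.
     */
    @FunctionalInterface
    public interface PolicySatisfier {
        void satisfyStoragePolicy(Path path) throws IOException;
    }

    /**
     * Invokes the satisfier once for the root and once for every directory
     * below it, because a single call only covers a directory's immediate files.
     * Returns the directories that were submitted, in sorted order.
     */
    public static List<Path> satisfyRecursively(Path root, PolicySatisfier satisfier)
            throws IOException {
        List<Path> dirs;
        // Files.walk yields the root itself plus all descendants.
        try (Stream<Path> walk = Files.walk(root)) {
            dirs = walk.filter(Files::isDirectory).sorted().collect(Collectors.toList());
        }
        for (Path dir : dirs) {
            satisfier.satisfyStoragePolicy(dir);
        }
        return dirs;
    }

    public static void main(String[] args) throws IOException {
        // Build a small local tree purely for demonstration.
        Path root = Files.createTempDirectory("sps-demo");
        Files.createDirectories(root.resolve("cold").resolve("2017"));
        Files.createFile(root.resolve("cold").resolve("part-0"));

        List<Path> visited = satisfyRecursively(root,
                dir -> System.out.println("satisfyStoragePolicy(" + dir + ")"));
        System.out.println(visited.size() + " directories submitted");
    }
}
```

Against a real cluster the same loop would list sub-directories through the HDFS `FileSystem` API and pass each one to `HdfsAdmin`; that wiring is left out here because it depends on the deployment.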