[11/50] [abbrv] hadoop git commit: HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G

2018-07-19 Thread rakeshr
HDFS-11874. [SPS]: Document the SPS feature. Contributed by Uma Maheswara Rao G


Project: http://git-wip-us.apache.org/repos/asf/hadoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/hadoop/commit/f4bc889b
Tree: http://git-wip-us.apache.org/repos/asf/hadoop/tree/f4bc889b
Diff: http://git-wip-us.apache.org/repos/asf/hadoop/diff/f4bc889b

Branch: refs/heads/HDFS-10285
Commit: f4bc889b042c18d5d760c6329cbfb04c0d0a1c78
Parents: 18c3709
Author: Rakesh Radhakrishnan 
Authored: Fri Jul 14 22:36:09 2017 +0530
Committer: Rakesh Radhakrishnan 
Committed: Thu Jul 19 22:46:57 2018 +0530

--
 .../src/site/markdown/ArchivalStorage.md| 51 ++--
 1 file changed, 48 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/hadoop/blob/f4bc889b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
--
diff --git 
a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md 
b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
index a56cf8b..9098616 100644
--- a/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/ArchivalStorage.md
@@ -97,8 +97,44 @@ The effective storage policy can be retrieved by the 
"[`storagepolicies -getStor
 
 The default storage type of a datanode storage location will be DISK if it 
does not have a storage type tagged explicitly.
 
-Mover - A New Data Migration Tool
--
+Storage Policy Based Data Movement
+--
+
+Setting a new storage policy on an existing file/directory changes the policy in the Namespace, but it does not physically move the blocks across storage media.
+The following two options allow users to move the blocks according to the newly set policy. So, once a user sets a new policy on a file/directory, the user should also perform one of the following options to achieve the desired data movement. Note that the two options cannot run simultaneously.
+
+### Storage Policy Satisfier (SPS)
+
+When a user changes the storage policy on a file/directory, the user can call the `HdfsAdmin` API `satisfyStoragePolicy()` to move the blocks according to the newly set policy.
+The SPS daemon thread runs alongside the Namenode and periodically scans for mismatches between the new policy and the physical placement of blocks. It tracks only the files/directories for which the user invoked `satisfyStoragePolicy`. If SPS identifies blocks to be moved for a file, it schedules block movement tasks to datanodes. A Coordinator DataNode (C-DN) tracks all block movements associated with a file and notifies the Namenode of movement success/failure. If any movement fails, SPS re-attempts it by sending a new block movement task.
+
+SPS can be activated and deactivated dynamically without restarting the 
Namenode.
+
+Detailed design documentation can be found at [Storage Policy Satisfier (SPS) (HDFS-10285)](https://issues.apache.org/jira/browse/HDFS-10285).
+
+* **Note**: When a user invokes the `satisfyStoragePolicy()` API on a directory, SPS considers only the files immediately under that directory. Sub-directories are not considered for satisfying the policy. It is the user's responsibility to call this API on directories recursively in order to cover all files under the sub-tree.
+
+* HdfsAdmin API:
+`public void satisfyStoragePolicy(final Path path) throws IOException`
+
+* Arguments:
+
+| Argument | Description |
+|:---- |:---- |
+| `path` | A path which requires blocks storage movement. |
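The API call and the recursion note above can be sketched in Java. This is a minimal illustration, not part of the documented feature: it assumes a reachable HDFS cluster, and the class name `SatisfyRecursively` and the `satisfyTree` helper are hypothetical; only `HdfsAdmin#satisfyStoragePolicy` comes from this document.

```java
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.client.HdfsAdmin;

public class SatisfyRecursively {

  // Hypothetical helper: since SPS only considers files immediately under
  // the given directory, call satisfyStoragePolicy() on the directory and
  // then recurse into each sub-directory to cover the whole sub-tree.
  static void satisfyTree(HdfsAdmin admin, FileSystem fs, Path dir)
      throws IOException {
    admin.satisfyStoragePolicy(dir);
    for (FileStatus status : fs.listStatus(dir)) {
      if (status.isDirectory()) {
        satisfyTree(admin, fs, status.getPath());
      }
    }
  }

  public static void main(String[] args) throws IOException {
    // Assumes fs.defaultFS points at the target HDFS cluster.
    Configuration conf = new Configuration();
    URI uri = FileSystem.getDefaultUri(conf);
    HdfsAdmin admin = new HdfsAdmin(uri, conf);
    try (FileSystem fs = FileSystem.get(uri, conf)) {
      satisfyTree(admin, fs, new Path(args[0]));
    }
  }
}
```

The recursion is needed because, as noted above, a single `satisfyStoragePolicy()` call does not descend into sub-directories.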
+
+Configurations:
+
+*   **dfs.storage.policy.satisfier.activate** - Used to activate or deactivate SPS. Configuring it as true activates SPS, and false deactivates it.
+
+*   **dfs.storage.policy.satisfier.recheck.timeout.millis** - A timeout for re-checking the results of processed block storage movement commands from the Co-ordinator Datanode.
+
+*   **dfs.storage.policy.satisfier.self.retry.timeout.millis** - A timeout after which SPS retries if the Co-ordinator Datanode reports no block movement results within the configured period.
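As a sketch, the three properties above could be set in `hdfs-site.xml` as follows; the timeout values shown are illustrative placeholders, not defaults stated by this document.

```xml
<!-- Illustrative hdfs-site.xml fragment; values are example placeholders. -->
<property>
  <name>dfs.storage.policy.satisfier.activate</name>
  <value>true</value>
</property>
<property>
  <name>dfs.storage.policy.satisfier.recheck.timeout.millis</name>
  <value>300000</value>
</property>
<property>
  <name>dfs.storage.policy.satisfier.self.retry.timeout.millis</name>
  <value>1800000</value>
</property>
```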
+
+### Mover - A New Data Migration Tool
 
 A new data migration tool is added for archiving data. The tool is similar to 
Balancer. It periodically scans the files in HDFS to check if the block 
placement satisfies the storage policy. For the blocks violating the storage 
policy, it moves the replicas to a different storage type in order to fulfill 
the storage policy requirement. Note that it always tries to move block 
replicas within the same node whenever possible. If that is not possible (e.g. 
when a node doesn’t have the target storage type) then it will copy the block 
replicas to another node over the network.
 
@@ -115,6 +151,10 @@ A new data migration tool 
