[ 
https://issues.apache.org/jira/browse/HDFS-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14562108#comment-14562108
 ] 

Allen Wittenauer commented on HDFS-6261:
----------------------------------------

Awesome!  Doc patches are great!  

Now for the review. ;) This isn't comprehensive, but here's a first pass at 
least.

A common problem is a missing 'the' in front of 'following'.  I pointed it out 
in a few places, but there are more. Articles in English are tricky. :(  
'Following' is particularly tricky because, without 'the' or 'a' in front of 
it, it reads as a verb (e.g., "The dog was following the boy" vs. "Hadoop has 
a following" or "See the following list of cool places to eat"). But hey, at 
least we don't have grammatical genders like many European languages! :) 

{code}
However, for
+other cases, like: Hadoop nodes running on virtualized platform, we have
+additional "hypervisor" layer, and its characteristics include:
{code}

I don't know how to parse this phrasing.  It feels awkward.  I'd probably 
rewrite as:

However, for some cases, this is insufficient.  Take, for example, Hadoop 
nodes running on a virtualized platform, where there is an additional 
hypervisor layer.  It has the following characteristics:

{code}
+-   The communication price between VMs within the same hypervisor is lower
+than across hypervisor (physical host) which will have higher throughput,
+lower latency, and not generating physical network traffic.
{code}

Same sort of problem.  I'd probably rephrase a bit:

"The communication price between multiple VMs running on one physical host is 
lower than the communication price between processes on multiple physical 
hosts.  In addition to the multiple VMs having higher throughput and lower 
latency between themselves, they do not generate any network traffic on the 
wire."

{code}
transparent for Hadoop, so
{code}

'for' should be 'to'.  Also, end the sentence after 'Hadoop' and start a new 
sentence with 'So'.

{code}
like following:
{code}

like the following:

{code}
layer, following polices
+in hdfs are refined:
{code}

the following policies in HDFS are refined:

{code}
+-   Replica placement policy
{code}

I have a feeling bullet points in front of all the items listed under this 
section may render better.  I need to play with it though. 

{code}
of writer,
{code}

of the writer

{code}
on other
+    node
{code}

on another node

{code}
if node of writer
{code}

if the node of the writer

{code}
The remaining replicas are placed randomly across rack and node group to
+    meet minimum restriction.
{code}

I'm confused by this since there are missing articles and/or plurals here.  
Does this mean randomly across the remaining racks or randomly across all racks 
including the writer's rack? 

{code}
At node level
{code}

At the node level

{code}
At block level
{code}

At the block level

{code}
Reliability: By never placing more than one replicas on the same node
+group(physical host), in case of node group failure, only one replica is
+lost at maximum.
{code}

Awkward phrasing.  I'd probably rewrite as:

"Reliability: By never placing more than one replica in the same node
group (aka physical host), at most one replica is lost in case of node 
group failure."
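
To make sure we read that constraint the same way, here is a minimal sketch of 
it in Java.  It is not the actual HDFS placement code: the class, the method 
names, and the "/rack/nodegroup/node" path shape are all hypothetical, and it 
deliberately ignores the rack-level and writer-locality rules.  It only shows 
"never more than one replica per node group":

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NodeGroupPlacementSketch {
    // Hypothetical helper: for a path shaped like "/rack/nodegroup/node",
    // returns the "/rack/nodegroup" prefix that identifies the node group.
    static String nodeGroupOf(String path) {
        int lastSlash = path.lastIndexOf('/');
        return path.substring(0, lastSlash);
    }

    // Picks up to `replicas` targets in candidate order, skipping any
    // candidate whose node group already holds a chosen replica.
    static List<String> chooseTargets(List<String> candidates, int replicas) {
        Set<String> usedGroups = new HashSet<>();
        List<String> chosen = new ArrayList<>();
        for (String node : candidates) {
            if (chosen.size() == replicas) {
                break;
            }
            // Set.add returns false when the node group was already used.
            if (usedGroups.add(nodeGroupOf(node))) {
                chosen.add(node);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Assumed example topology: vm1 and vm2 share node group ng1.
        List<String> candidates = List.of(
            "/rack1/ng1/vm1", "/rack1/ng1/vm2", "/rack1/ng2/vm3", "/rack2/ng3/vm4");
        System.out.println(chooseTargets(candidates, 3));
    }
}
```

With this reading, losing one physical host can take out at most one of the 
chosen replicas, because no two of them share a node group.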

{code}
rather than remote node
{code}

than a remote

{code}
+3-layer topology tends to support different failure and locality topologies
+which is primarily driven from the perspective of virtualization, however,
+it is also possible to use the feature support other scenarios, such as
+those relating to failures of power supplies, arbitrary sets of physical
+servers, or collections of servers from same hardware purchase cycle.
{code}

This paragraph feels like it should be up closer to the top of these changes. 


> Add document for enabling node group layer in HDFS
> --------------------------------------------------
>
>                 Key: HDFS-6261
>                 URL: https://issues.apache.org/jira/browse/HDFS-6261
>             Project: Hadoop HDFS
>          Issue Type: Task
>          Components: documentation
>            Reporter: Wenwu Peng
>            Assignee: Binglin Chang
>              Labels: documentation
>         Attachments: 2-layer-topology.png, 3-layer-topology.png, 
> 3layer-topology.png, 4layer-topology.png, HDFS-6261.004.patch, 
> HDFS-6261.005.patch, HDFS-6261.006.patch, HDFS-6261.007.patch, 
> HDFS-6261.v1.patch, HDFS-6261.v1.patch, HDFS-6261.v2.patch, HDFS-6261.v3.patch
>
>
> Most of the patches from the umbrella JIRA HADOOP-8468 have been committed. 
> However, there is no site documentation that introduces NodeGroup awareness 
> (Hadoop Virtualization Extensions) or explains how to configure it, so we 
> need to document it.
> 1.  Document the NodeGroup-aware features in http://hadoop.apache.org/docs/current 
> 2.  Document the NodeGroup-aware properties in core-default.xml.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
