[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2021-03-16 Thread tomscut (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

tomscut updated HDFS-3702:
--
Description: 
This is useful for Write-Ahead-Logs: these files are written for recovery only, 
and are not read when there are no failures.

Taking HBase as an example, these files will be read only if the process that 
wrote them (the 'HBase regionserver') dies. This will likely come from a 
hardware failure, hence the corresponding datanode will be dead as well. So 
we're writing 3 replicas, but in reality only 2 of them are really useful.

  was:
This is useful for Write-Ahead-Logs: these files are written for recovery only, 
and are not read when there are no failures.

Taking HBase as an example, these files will be read only if the process that 
wrote them (the 'HBase regionserver') dies. This will likely come from a 
hardware failure, hence the corresponding datanode will be dead as well. So 
we're writing 3 replicas, but in reality only 2 of them are really useful.



> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)




[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2017-01-06 Thread Junping Du (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Junping Du updated HDFS-3702:
-
Fix Version/s: (was: 2.9.0)

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 2.8.0, 3.0.0-alpha1
>
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-05-02 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Fix Version/s: 2.8.0

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.8.0, 2.9.0
>
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)




[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-04-27 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

   Resolution: Fixed
Fix Version/s: 2.9.0
   3.0.0
   Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2.

Thanks a lot for the detailed suggestions and kind reviews from 
[~andrew.wang], [~nkeywal], [~stack], [~arpitagarwal] and [~szetszwo]! 

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Fix For: 3.0.0, 2.9.0
>
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-04-18 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.012.patch

bq. BTW, we should check if excludedNodes.contains(writer) is already true; 
otherwise, fallback does not help.

Fixed in the newly updated patch. 

[~szetszwo] What do you think about [~cmccabe] and [~stack]'s suggestions? 
Would that work for you?

Thanks!




> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-04-05 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.011.patch

Hey, [~szetszwo] 

Thanks a lot for your good suggestions. 

bq. Please remove the "Fallback to use the default block placement." debug 
message 

Done

bq. When avoidLocalNode == false, ... We should only create the new array once 
in this case.

Done


bq. Please use new AddBlockFlag.NO_LOCAL_WRITE and add a new create method 
DistributedFileSystem but not adding CreateFlag.NO_LOCAL_WRITE.

Should we agree that {{AddBlockFlag}} is an internal flag used within HDFS? 
Meanwhile, {{CreateFlag.NO_LOCAL_WRITE}} is very similar to 
{{CreateFlag.LAZY_PERSIST}} in that:

* Both are hints for block placement: the actual file system implementation 
can choose to support them or ignore them. Neither flag needs to be HDFS 
specific, i.e., other distributed file systems (Lustre, or even Tachyon) 
could support them as well. 

Thanks!

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702.011.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-24 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.010.patch

Updated the patch to:

* Move {{AddBlockFlag}} to {{o.a.h.hdfs}}
* Add an end-to-end test that uses {{CreateFlag.NO_LOCAL_WRITE}} directly.


> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, 
> HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-22 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.009.patch

Updated the patch to mark {{AddBlockFlag}} as {{Private}}, and to mark 
{{CreateFlag.NO_LOCAL_WRITE}} as {{LimitedPrivate({"HBase"})}}.

Thanks a lot for the great suggestions, [~stack], [~andrew.wang], [~nkeywal], 
[~arpitagarwal], [~szetszwo]! 

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.008.patch

Thanks a lot for the suggestions, [~stack].  Updated the patch as you suggested.

bq. there has to be two declarations of the enum NO_LOCAL_WRITE

Yes, {{CreateFlag.NO_LOCAL_WRITE}} is the one visible to users and should be 
used by applications like HBase. {{AddBlockFlag}} should only be used 
within HDFS, I think.

bq.  Add this to release note I'd say.

Done.

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702.008.patch, HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Release Note: This patch attempts to allocate all replicas on remote 
DataNodes by adding the local DataNode to the excluded DataNodes. If 
sufficient replicas cannot be obtained, it falls back to the default block 
placement policy, which writes one replica to the local DataNode.
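
For reference, a minimal usage sketch of the flag described above (the path and sizes are made up; it assumes the {{FileSystem#create}} overload that takes an {{EnumSet}} of {{CreateFlag}}, which is how an application such as HBase would opt in):

{code:java}
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.FsPermission;

public class NoLocalWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // CREATE is required; NO_LOCAL_WRITE is the hint that asks for all
    // replicas to be placed on remote DataNodes when possible.
    EnumSet<CreateFlag> flags =
        EnumSet.of(CreateFlag.CREATE, CreateFlag.NO_LOCAL_WRITE);

    Path wal = new Path("/tmp/example-wal");  // illustrative path
    try (FSDataOutputStream out = fs.create(wal, FsPermission.getFileDefault(),
        flags, 4096, (short) 3, 128 * 1024 * 1024, null)) {
      out.writeUTF("wal entry");
    }
  }
}
{code}

If the cluster cannot satisfy the request, placement falls back to the default policy as described in the release note, so the create call does not fail merely because the local node could not be avoided.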

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702_Design.pdf

Hey, [~arpitagarwal]. Here is the design doc.

The basic idea is simple: {{BlockPlacementPolicy}} first tries to allocate 
replicas with the local node added to the excluded nodes; if it cannot obtain 
sufficient replicas, it falls back to the normal path. The rest of the patch 
changes the signatures of the related functions. 

Would it be OK? Thanks.
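
To make that concrete, here is a simplified, self-contained sketch of the exclude-then-fallback idea. It is not the actual {{BlockPlacementPolicyDefault}} code and the names are illustrative; it also includes the {{excludedNodes.contains(writer)}} check raised during review (skip the retry when the writer is already excluded, since retrying cannot change the outcome):

{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch only: try to place every replica off the writer's node, and fall
 * back to the default policy if that does not yield enough targets.
 */
class NoLocalWritePlacementSketch {

  /** Stand-in for the real block placement policy. */
  interface Placement {
    /** Choose up to {@code n} target hosts, never returning excluded hosts. */
    List<String> chooseTargets(int n, Set<String> excluded);
  }

  static List<String> chooseAvoidingWriter(Placement policy, int replicas,
      String writerHost, Set<String> excluded) {
    // Only retry with the writer excluded if the caller has not excluded it
    // already; otherwise the retry cannot change the outcome.
    if (writerHost != null && !excluded.contains(writerHost)) {
      Set<String> excludedPlusWriter = new HashSet<>(excluded);
      excludedPlusWriter.add(writerHost);
      List<String> remoteOnly = policy.chooseTargets(replicas, excludedPlusWriter);
      if (remoteOnly.size() >= replicas) {
        return remoteOnly;  // every replica placed off the writer's box
      }
    }
    // Fall back to the normal path, which may place one replica locally.
    return policy.chooseTargets(replicas, excluded);
  }
}
{code}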

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, 
> HDFS-3702_Design.pdf
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-10 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.007.patch

Updated the comments for {{ClientProtocol}} and {{AddBlockFlag}}.

The test failures are unrelated; they can also be reproduced on trunk.

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-09 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.006.patch

Thanks much for the quick reviews, [~andrew.wang].

bq. Can we hook into BlockPlacementPolicyDefault the same way as HDFS-4946? 

As we discussed offline, the sampling logic here is different from HDFS-4946. 
HDFS-4946 tries to obtain a local node first and, if that fails, randomly 
picks from the entire DN pool, so the chosen DN can still be the local node. 
This patch, however, requires the chosen DN to come _exclusively_ from the 
rest of the DN pool. So the HDFS-4946 logic does not apply here, mostly 
because of its fallback code. 

Updated the patch to address the rest of the comments.


> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch, HDFS-3702.006.patch
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-09 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.005.patch

Fixed the relevant checkstyle and javadoc warnings.

Thanks a lot for the comments, [~nkeywal].  Glad to help out.

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, 
> HDFS-3702.005.patch
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-03-08 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.004.patch

Hey, [~andrew.wang]

Thanks for the suggestions. I changed the patch to address your comments. The 
main idea is that the client passes a {{NO_LOCAL_WRITE}} flag to 
{{BlockPlacementPolicy}}, which tries excluding the local DN first. If not 
enough DNs can be obtained, {{BlockPlacementPolicy}} falls back to the 
normal procedure. 

Would you take another look?

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2016-02-22 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.003.patch

Reworked the patch against {{trunk}}.

* Also changed {{DFSClient}} to cache the local DN list, so only one NN RPC 
is needed to build the excluded local list when {{DFSClient}} creates the 
first {{DFSOutputStream}} with {{CreateFlag.NO_LOCAL_WRITE}} (see the sketch 
below). 
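
A rough sketch of that caching idea (the names are illustrative; the real change lives inside {{DFSClient}}): resolve the local DataNode list lazily, at most once, and reuse it for every subsequent stream opened with the flag.

{code:java}
import java.util.List;
import java.util.function.Supplier;

/** Sketch only: compute the "local DataNodes to exclude" list at most once. */
class CachedLocalDatanodes {
  private final Supplier<List<String>> resolver;  // e.g. one NameNode RPC
  private volatile List<String> cached;

  CachedLocalDatanodes(Supplier<List<String>> resolver) {
    this.resolver = resolver;
  }

  List<String> get() {
    List<String> result = cached;
    if (result == null) {
      synchronized (this) {
        if (cached == null) {
          cached = resolver.get();  // single round trip to the NameNode
        }
        result = cached;
      }
    }
    return result;
  }
}
{code}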

> Add an option for NOT writing the blocks locally if there is a datanode on 
> the same box as the client
> -
>
> Key: HDFS-3702
> URL: https://issues.apache.org/jira/browse/HDFS-3702
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 2.5.1
>Reporter: Nicolas Liochon
>Assignee: Lei (Eddy) Xu
>Priority: Minor
>  Labels: BB2015-05-TBR
> Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
> HDFS-3702.002.patch, HDFS-3702.003.patch
>
>
> This is useful for Write-Ahead-Logs: these files are written for recovery 
> only, and are not read when there are no failures.
> Taking HBase as an example, these files will be read only if the process that 
> wrote them (the 'HBase regionserver') dies. This will likely come from a 
> hardware failure, hence the corresponding datanode will be dead as well. So 
> we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2015-05-05 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer updated HDFS-3702:
---
Labels: BB2015-05-TBR  (was: )

 Add an option for NOT writing the blocks locally if there is a datanode on 
 the same box as the client
 -

 Key: HDFS-3702
 URL: https://issues.apache.org/jira/browse/HDFS-3702
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.5.1
Reporter: Nicolas Liochon
Assignee: Lei (Eddy) Xu
Priority: Minor
  Labels: BB2015-05-TBR
 Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
 HDFS-3702.002.patch


 This is useful for Write-Ahead-Logs: these files are written for recovery 
 only, and are not read when there are no failures.
 Taking HBase as an example, these files will be read only if the process that 
 wrote them (the 'HBase regionserver') dies. This will likely come from a 
 hardware failure, hence the corresponding datanode will be dead as well. So 
 we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2014-11-24 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.002.patch

[~stack] and [~nkeywal] Thanks for your reviews. Your feedback helps a lot!

I have updated the patch to:

# Addressed comments and added logs.
# bq. Are there other cases in the codebase that you know of where the DN 
NetUtils.getHostname() works as a key for finding the datanode info in the NN?
I changed it to use all local network interface IP addresses as keys for 
finding the datanode (a rough sketch of this follows below).
# bq. we are doing an extra trip to the NN when we open a DFSOS with 
AVOID_LOCAL_COPY set? 
It now only asks the NN when creating the DFSClient, which is presumably a 
relatively rare operation. [~stack] Could you help verify this assumption? Thanks!
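
A rough sketch of that lookup using only public APIs (the actual patch does this inside {{DFSClient}}, so the names here are illustrative): collect the addresses of every local network interface, fetch the live DataNode report once, and keep the DataNodes whose IP matches a local address.

{code:java}
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;

class LocalDatanodeLookupSketch {

  /** Every IP address bound to this machine's network interfaces. */
  static Set<String> localAddresses() throws Exception {
    Set<String> addrs = new HashSet<>();
    for (NetworkInterface nic
        : Collections.list(NetworkInterface.getNetworkInterfaces())) {
      for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
        addrs.add(addr.getHostAddress());
      }
    }
    return addrs;
  }

  /** Live DataNodes that run on this machine, found with one NameNode call. */
  static List<DatanodeInfo> localDatanodes(DistributedFileSystem dfs)
      throws Exception {
    Set<String> local = localAddresses();
    List<DatanodeInfo> result = new ArrayList<>();
    for (DatanodeInfo dn : dfs.getDataNodeStats(DatanodeReportType.LIVE)) {
      if (local.contains(dn.getIpAddr())) {
        result.add(dn);  // candidate for the client's excluded-nodes list
      }
    }
    return result;
  }
}
{code}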

 

 Add an option for NOT writing the blocks locally if there is a datanode on 
 the same box as the client
 -

 Key: HDFS-3702
 URL: https://issues.apache.org/jira/browse/HDFS-3702
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.5.1
Reporter: Nicolas Liochon
Assignee: Lei (Eddy) Xu
Priority: Minor
 Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, 
 HDFS-3702.002.patch


 This is useful for Write-Ahead-Logs: these files are written for recovery 
 only, and are not read when there are no failures.
 Taking HBase as an example, these files will be read only if the process that 
 wrote them (the 'HBase regionserver') dies. This will likely come from a 
 hardware failure, hence the corresponding datanode will be dead as well. So 
 we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2014-11-20 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.001.patch

Updated the patch to fix test failures. 

 Add an option for NOT writing the blocks locally if there is a datanode on 
 the same box as the client
 -

 Key: HDFS-3702
 URL: https://issues.apache.org/jira/browse/HDFS-3702
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.5.1
Reporter: Nicolas Liochon
Assignee: Lei (Eddy) Xu
Priority: Minor
 Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch


 This is useful for Write-Ahead-Logs: these files are written for recovery 
 only, and are not read when there are no failures.
 Taking HBase as an example, these files will be read only if the process that 
 wrote them (the 'HBase regionserver') dies. This will likely come from a 
 hardware failure, hence the corresponding datanode will be dead as well. So 
 we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2014-11-18 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

Attachment: HDFS-3702.000.patch

I have created a patch that provides a new flag, {{CreateFlag#AVOID_LOCAL_COPY}}. 
When a {{DFSOutputStream}} is opened with the {{AVOID_LOCAL_COPY}} flag, it asks 
the {{NameNode}} for the {{DataNodes}} on the same box and puts them into its 
excluded-nodes list. As a result, placement of the written blocks is advised 
to avoid the same box as the client.

 Add an option for NOT writing the blocks locally if there is a datanode on 
 the same box as the client
 -

 Key: HDFS-3702
 URL: https://issues.apache.org/jira/browse/HDFS-3702
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 1.0.3, 2.0.0-alpha
Reporter: Nicolas Liochon
Priority: Minor
 Attachments: HDFS-3702.000.patch


 This is useful for Write-Ahead-Logs: these files are written for recovery 
 only, and are not read when there are no failures.
 Taking HBase as an example, these files will be read only if the process that 
 wrote them (the 'HBase regionserver') dies. This will likely come from a 
 hardware failure, hence the corresponding datanode will be dead as well. So 
 we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client

2014-11-18 Thread Lei (Eddy) Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lei (Eddy) Xu updated HDFS-3702:

 Assignee: Lei (Eddy) Xu
 Target Version/s: 2.7.0
Affects Version/s: (was: 2.0.0-alpha)
   (was: 1.0.3)
   2.5.1
   Status: Patch Available  (was: Open)

 Add an option for NOT writing the blocks locally if there is a datanode on 
 the same box as the client
 -

 Key: HDFS-3702
 URL: https://issues.apache.org/jira/browse/HDFS-3702
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs-client
Affects Versions: 2.5.1
Reporter: Nicolas Liochon
Assignee: Lei (Eddy) Xu
Priority: Minor
 Attachments: HDFS-3702.000.patch


 This is useful for Write-Ahead-Logs: these files are written for recovery 
 only, and are not read when there are no failures.
 Taking HBase as an example, these files will be read only if the process that 
 wrote them (the 'HBase regionserver') dies. This will likely come from a 
 hardware failure, hence the corresponding datanode will be dead as well. So 
 we're writing 3 replicas, but in reality only 2 of them are really useful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)