[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] tomscut updated HDFS-3702: -- Description: This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. was: This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junping Du updated HDFS-3702: - Fix Version/s: (was: 2.9.0) > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 2.8.0, 3.0.0-alpha1 > > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Fix Version/s: 2.8.0 > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.8.0, 2.9.0 > > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Resolution: Fixed Fix Version/s: 2.9.0 3.0.0 Status: Resolved (was: Patch Available) Committed to trunk and branch-2. Thanks a lot for the detailed suggestions and kindly reviews from [~andrew.wang], [~nkeywal], [~stack], [~arpitagarwal] and [~szetszwo]! > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Fix For: 3.0.0, 2.9.0 > > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.012.patch bq. BTW, we should check if excludedNodes.contains(writer) is already true; otherwise, fallback does not help. Fixed in the newly updated patch. [~szetszwo] What do you think about [~cmccabe] and [~stack]'s suggestions? Would that works for you? Thanks! > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702.012.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.011.patch Hey, [~szetszwo] Thanks a lot for your good suggestions. bq. Please remove the "Fallback to use the default block placement." debug message Done bq. When avoidLocalNode == false, ... We should only create the new array once in this case. Done bq. Please use new AddBlockFlag.NO_LOCAL_WRITE and add a new create method DistributedFileSystem but not adding CreateFlag.NO_LOCAL_WRITE. Should we agree that {{AddBlockFlag}} is an internal flag used within HDFS? In the meantime, {{CreateFlag.NO_LOCAL_WRITE}} is very similar to {{CreateFlag.LAZY_PERSIST}} in the way that * Both of them are hint for block placement. The actual file system implement can choose to support it or ignore it. Both flags are not necessary to be HDFS specific, i.e., some other distributed file systems (Lustre, or even Tachyon) can support this as well. Thanks! > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702.011.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.010.patch Updated the patch to: * Move {{AddBlockFlag}} to {{o.a.h.hdfs}} * Add one end-to-end test to use {{CreateFlag.NO_LOCAL_WRITE}} directly. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702.010.patch, > HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.009.patch Updated the patch to mark {{AddBlockFlag}} to be {{Private}}. And it also marks {{CreateFlag.NO_LOCAL_WRITE}} as {{LimitedPrivate({"HBase"})}}. Thanks a lot for the great suggestions, [~stack], [~andrew.wang], [~nkeywal], [~arpitagarwal], [~szetszwo]! > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702.009.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.008.patch Thanks a lot for the suggestions, [~stack]. Updated the patch as you suggested. bq. there has to be two declarations of the enum NO_LOCAL_WRITE Yes, {{CreateFlag.NO_LOCAL_WRITE}} is the one visible to users, which should be used by the applications like HBase. {{AddBlockFlag}} should be only used within HDFS, I think. bq. Add this to release note I'd say. Done. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702.008.patch, HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Release Note: This patch will attempt to allocate all replicas to remote DataNodes, by adding local DataNode to the excluded DataNodes. If no sufficient replicas can be obtained, it will fall back to default block placement policy, which writes one replica to local DataNode. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702_Design.pdf Hey, [~arpitagarwal] . Here is the design doc. The basic idea is very simple: {{BlockPlacementPolicy}} will try to allocate replicas by putting local node into excluded node first, if not able to obtain sufficient replicas, fall back to normal way. The rest of this patch is changing the signatures of the related functions. Would it be OK? Thanks. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch, > HDFS-3702_Design.pdf > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.007.patch Updated the comments for {{ClientProtocol}} and {{AddBlockFlag}}. The test failures are not relevant. They can be reproduced in trunk. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch, HDFS-3702.007.patch > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.006.patch Thanks much for the quick reviews, [~andrew.wang]. bq. Can we hook into BlockPlacementPolicyDefault the same way as HDFS-4946? As we discussed offline, the sampling logic here is different to HDFS-4946. In HDFS-4946, it tries to obtain a local node at first, if not success, it randomly picks from the entire DN pool. So the chosen DN is still possible to be local node. However, this patch requires the chosen DN _exclusively_ from the rest of the DN pool. So HDFS-4946 logic does not apply here, mostly due to its fallback code. Updated the patch to address the rest of the comments. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch, HDFS-3702.006.patch > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.005.patch Fix relevant checkstyles and javadoc warnings. Thanks a lot for the comments, [~nkeywal]. Glad to help out. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch, > HDFS-3702.005.patch > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.004.patch Hey, [~andrew.wang] Thanks for the suggestions. I change the patch to address your comments. The main idea is that the client passes a {{NO_LOCAL_WRITE}} flag to {{BlockPlacementPolicy}}, which tries excluding the local DN first. If no enough DN can be obtained, then {{BlockPlacementPolicy}} falls back to the normal procedure. Would you take another look? > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch, HDFS-3702.004.patch > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.003.patch Re work the patch against the {{trunk}}. * Also changed {{DFSClient}} to cache local DN list, so only one NN rpc for excluded local list when {{DFSClient}} creates the first {{DFSOutputStream}} with {{CreateFlag.NO_LOCAL_WRITE}}. > Add an option for NOT writing the blocks locally if there is a datanode on > the same box as the client > - > > Key: HDFS-3702 > URL: https://issues.apache.org/jira/browse/HDFS-3702 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs-client >Affects Versions: 2.5.1 >Reporter: Nicolas Liochon >Assignee: Lei (Eddy) Xu >Priority: Minor > Labels: BB2015-05-TBR > Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, > HDFS-3702.002.patch, HDFS-3702.003.patch > > > This is useful for Write-Ahead-Logs: these files are writen for recovery > only, and are not read when there are no failures. > Taking HBase as an example, these files will be read only if the process that > wrote them (the 'HBase regionserver') dies. This will likely come from a > hardware failure, hence the corresponding datanode will be dead as well. So > we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen Wittenauer updated HDFS-3702: --- Labels: BB2015-05-TBR (was: ) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client - Key: HDFS-3702 URL: https://issues.apache.org/jira/browse/HDFS-3702 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.5.1 Reporter: Nicolas Liochon Assignee: Lei (Eddy) Xu Priority: Minor Labels: BB2015-05-TBR Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, HDFS-3702.002.patch This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.002.patch [~stack] and [~nkeywal] Thanks for your reviews. Your feedbacks help a lot! I have updated the patch to: # addressed comments and added logs. # bq. Are there other cases in the codebase that you know of where the DN NetUtils.getHostname() works as a key for finding the datanode info in the NN? I changed it to use all local network interface IP addresses as keys to find datanode. # bq. we are doing an extra trip to the NN when we open a DFSOS with AVOID_LOCAL_COPY set? It now only asks NN when creating DFSClient, which is presumably a relatively rare operation. [~stack] Could you help to verify this assumption? Thanks! Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client - Key: HDFS-3702 URL: https://issues.apache.org/jira/browse/HDFS-3702 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.5.1 Reporter: Nicolas Liochon Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch, HDFS-3702.002.patch This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.001.patch Update the patch to fix test failures. Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client - Key: HDFS-3702 URL: https://issues.apache.org/jira/browse/HDFS-3702 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.5.1 Reporter: Nicolas Liochon Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-3702.000.patch, HDFS-3702.001.patch This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Attachment: HDFS-3702.000.patch I have created a patch that provides one flag {{CreateFlag#AVOID_LOCAL_COPY}}. When open {{DFSOutputStream}} with {{AVOID_LOCAL_COPY}} flag, {{DFSOutputStream}} will ask the {{NameNode}} for the {{DataNodes}} on the same box and put it / them into its excluded nodes list. As a result, written blocks will be advised to avoid the same box as the client. Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client - Key: HDFS-3702 URL: https://issues.apache.org/jira/browse/HDFS-3702 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 1.0.3, 2.0.0-alpha Reporter: Nicolas Liochon Priority: Minor Attachments: HDFS-3702.000.patch This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HDFS-3702) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client
[ https://issues.apache.org/jira/browse/HDFS-3702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lei (Eddy) Xu updated HDFS-3702: Assignee: Lei (Eddy) Xu Target Version/s: 2.7.0 Affects Version/s: (was: 2.0.0-alpha) (was: 1.0.3) 2.5.1 Status: Patch Available (was: Open) Add an option for NOT writing the blocks locally if there is a datanode on the same box as the client - Key: HDFS-3702 URL: https://issues.apache.org/jira/browse/HDFS-3702 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs-client Affects Versions: 2.5.1 Reporter: Nicolas Liochon Assignee: Lei (Eddy) Xu Priority: Minor Attachments: HDFS-3702.000.patch This is useful for Write-Ahead-Logs: these files are writen for recovery only, and are not read when there are no failures. Taking HBase as an example, these files will be read only if the process that wrote them (the 'HBase regionserver') dies. This will likely come from a hardware failure, hence the corresponding datanode will be dead as well. So we're writing 3 replicas, but in reality only 2 of them are really useful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)