[jira] [Comment Edited] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-15 Thread Ewan Higgs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010483#comment-16010483
 ] 

Ewan Higgs edited comment on HDFS-11639 at 5/15/17 1:16 PM:


{quote}
Btw, I rebased the HDFS-9806 branch on the most recent version of trunk (hash 
83dd14aa84ad697ad32c51007ac31ad39feb4288).{quote}
Thanks!
{quote}In DataTransfer#run(), the blockAlias should be null unless it is for a 
provided block. I think this will entail adding BlockAlias to transferBlock() 
and also to BlockCommand for the DatanodeProtocol.DNA_TRANSFER command. 
However, this will only be relevant for writing provided blocks (and in 
particular recovery).{quote}
If this entails a protocol change, I think it makes the most sense to make it 
now, so that all the protocol changes land up front in one change in case we 
need this to get in for 3.0. 
Does it make sense to have the BlockAlias in transferBlock? If we know the 
targetStorageTypes and targetStorageIDs, then we can tell when nothing needs to 
be transferred. Or is this an issue if we want to transfer from PROVIDED to 
DISK?
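
To make the transferBlock question concrete, here is a minimal sketch of the 
decision being discussed. It is an illustration only and is not taken from the 
patch: {{TransferPlanSketch}}, its {{Kind}} enum, and both methods are 
hypothetical names, and the alias is modelled as a plain String. It assumes 
that a PROVIDED target needs no bytes copied, while any other target (for 
example DISK) does, in which case the alias of a PROVIDED source has to travel 
with the request.

```java
import java.util.List;

// Hypothetical helper, not part of HDFS or of the HDFS-9806 branch.
final class TransferPlanSketch {
    // Hypothetical storage kinds; HDFS has its own StorageType enum, not used here.
    enum Kind { DISK, SSD, PROVIDED }

    private TransferPlanSketch() {}

    /** True when at least one target must receive a physical copy of the bytes. */
    static boolean needsDataCopy(List<Kind> targets) {
        for (Kind target : targets) {
            if (target != Kind.PROVIDED) {
                return true;
            }
        }
        return false;
    }

    /**
     * Returns the alias to attach to a transfer request, or null if none is needed.
     * sourceAlias is null when the source replica is not a provided block.
     */
    static String aliasForTransfer(String sourceAlias, List<Kind> targets) {
        if (sourceAlias == null) {
            return null; // ordinary block: nothing to encode
        }
        return needsDataCopy(targets) ? sourceAlias : null;
    }
}
```

Under these assumptions, a PROVIDED to PROVIDED transfer carries no alias, 
whereas a PROVIDED to DISK transfer does, which is the case raised above.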

{quote}Looking over this patch, one thing that occurred to me is if it makes 
sense to unify FileRegionProvider with BlockProvider? They both have very close 
functionality.{quote}
I think this makes a lot of sense.
{quote}I like the use of BlockProvider#resolve(). If we unify 
FileRegionProvider with BlockProvider, then resolve can return null if the 
block map is accessible from the Datanodes also. If it is accessible only from 
the Namenode, then a non-null value can be propagated to the Datanode.{quote}
With the pending refactoring of FsDatasetImpl, which won't have replicas a 
priori, I wonder if it makes sense for the Datanode to have a 
FileRegionProvider or BlockProvider at all. The Datanode is given the 
appropriate block ID and block alias in the readBlock or writeBlock message. 
Maybe I'm overlooking what's still being provided.



> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Attachments: HDFS-11639-HDFS-9806.001.patch, 
> HDFS-11639-HDFS-9806.002.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from, i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the PROVIDED storage type.
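
For readers unfamiliar with the idea, the description above can be pictured 
with a minimal sketch. It is an assumption-laden illustration, not the class 
from the attached patches: {{BlockAliasSketch}} and {{LocatedBlockSketch}} are 
hypothetical names, the field types are guesses based only on the four items 
listed (URI, offset, length, nonce), and no protobuf encoding is shown.

```java
import java.net.URI;
import java.util.Arrays;

// Hypothetical value type holding the four fields named in the description.
final class BlockAliasSketch {
    private final URI uri;      // where the data lives outside HDFS
    private final long offset;  // start of the block within that resource
    private final long length;  // number of bytes in the block
    private final byte[] nonce; // opaque value used to detect changed data

    BlockAliasSketch(URI uri, long offset, long length, byte[] nonce) {
        if (uri == null) {
            throw new IllegalArgumentException("uri is required");
        }
        if (offset < 0 || length < 0) {
            throw new IllegalArgumentException("offset and length must be non-negative");
        }
        this.uri = uri;
        this.offset = offset;
        this.length = length;
        this.nonce = nonce == null ? new byte[0] : nonce.clone();
    }

    URI getUri() { return uri; }
    long getOffset() { return offset; }
    long getLength() { return length; }
    byte[] getNonce() { return nonce.clone(); }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof BlockAliasSketch)) {
            return false;
        }
        BlockAliasSketch other = (BlockAliasSketch) o;
        return uri.equals(other.uri) && offset == other.offset
            && length == other.length && Arrays.equals(nonce, other.nonce);
    }

    @Override
    public int hashCode() {
        return 31 * (31 * uri.hashCode() + Long.hashCode(offset)) + Long.hashCode(length);
    }
}

// Hypothetical stand-in for a located block: the alias is set only for provided blocks.
final class LocatedBlockSketch {
    private final long blockId;
    private final BlockAliasSketch alias; // null unless the block is PROVIDED

    LocatedBlockSketch(long blockId, BlockAliasSketch alias) {
        this.blockId = blockId;
        this.alias = alias;
    }

    long getBlockId() { return blockId; }
    BlockAliasSketch getAlias() { return alias; }
    boolean isProvided() { return alias != null; }
}
```

In this sketch a block on ordinary storage carries a null alias, so a reader 
of the message can tell a provided block apart by the presence of the alias 
alone.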



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol

2017-05-10 Thread Ewan Higgs (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004653#comment-16004653
 ] 

Ewan Higgs edited comment on HDFS-11639 at 5/10/17 1:14 PM:


Attached a patch that encodes the {{BlockAlias}} into the read and write 
protocol. This also adds the {{BlockAlias}} to the {{FileRegion}}.

This work is not yet complete as we need to connect the {{BlockSender}} to the 
{{FsDatasetImpl}} and/or {{ProvidedVolumeImpl}}, {{ReplicaMap}}, etc.



> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Attachments: HDFS-11639-HDFS-9806.001.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which 
> encodes information about where the data comes from, i.e. URI, offset, 
> length, and nonce value. This data should be encoded in the protocol 
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is 
> available using the PROVIDED storage type.


