[
https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16010483#comment-16010483
]
Ewan Higgs edited comment on HDFS-11639 at 5/15/17 1:16 PM:
{quote}
Btw, I rebased the HDFS-9806 branch on the most recent version of trunk (hash
83dd14aa84ad697ad32c51007ac31ad39feb4288).{quote}
Thanks!
{quote}In DataTransfer#run(), the blockAlias should be null unless it is for a
provided block. I think this will entail adding BlockAlias to transferBlock()
and also to BlockCommand for the DatanodeProtocol.DNA_TRANSFER command.
However, this will only be relevant for writing provided blocks (and in
particular recovery).{quote}
If this entails a protocol change, I think it makes the most sense to do it
now, so that all the protocol changes land up front in a single change, in case
we need this to get in for 3.0.
Does it make sense to have the BlockAlias in transferBlock? If we know the
targetStorageTypes and targetStorageIDs then we can know that nothing needs to
be transferred. Or is this an issue if we want to transfer from PROVIDED to
DISK?
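To make the question concrete, here is a rough sketch of the check I have in mind. It is not the real transferBlock() code: the StorageType enum is a local stand-in for the Hadoop one, and the class and method names (TransferAliasSketch, isNoOpTransfer, senderNeedsAlias) are hypothetical. It assumes that a transfer whose targets are all PROVIDED moves no bytes, and that the alias only matters when a PROVIDED source has to feed a non-PROVIDED target.

```java
import java.util.Arrays;

/** Hypothetical sketch; not part of the HDFS-9806 branch. */
final class TransferAliasSketch {
    /** Local stand-in for the Hadoop StorageType enum (assumption). */
    enum StorageType { DISK, SSD, ARCHIVE, PROVIDED }

    /** True when there is at least one target and every target is PROVIDED. */
    static boolean isNoOpTransfer(StorageType[] targets) {
        return targets.length > 0
            && Arrays.stream(targets).allMatch(t -> t == StorageType.PROVIDED);
    }

    /** True when the source is PROVIDED and at least one target is not PROVIDED. */
    static boolean senderNeedsAlias(StorageType source, StorageType[] targets) {
        return source == StorageType.PROVIDED
            && Arrays.stream(targets).anyMatch(t -> t != StorageType.PROVIDED);
    }

    public static void main(String[] args) {
        StorageType[] allProvided = { StorageType.PROVIDED, StorageType.PROVIDED };
        StorageType[] toDisk = { StorageType.DISK };
        System.out.println(isNoOpTransfer(allProvided));
        System.out.println(senderNeedsAlias(StorageType.PROVIDED, toDisk));
    }
}
```

Under these assumptions the PROVIDED to DISK case is the only one where transferBlock() would need a BlockAlias; whether that case has to be supported is the open question.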
{quote}Looking over this patch, one thing that occurred to me is if it makes
sense to unify FileRegionProvider with BlockProvider? They both have very close
functionality.{quote}
I think this makes a lot of sense.
{quote}I like the use of BlockProvider#resolve(). If we unify
FileRegionProvider with BlockProvider, then resolve can return null if the
block map is accessible from the Datanodes also. If it is accessible only from
the Namenode, then a non-null value can be propagated to the Datanode.{quote}
With the pending refactoring of FsDatasetImpl, which won't have replicas a
priori, I wonder if it makes sense for the Datanode to have a
FileRegionProvider or BlockProvider at all. Datanodes are given the appropriate
block ID and block alias in the readBlock or writeBlock message. Maybe I'm
overlooking what's still being provided.
> [READ] Encode the BlockAlias in the client protocol
> ---
>
> Key: HDFS-11639
> URL: https://issues.apache.org/jira/browse/HDFS-11639
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: hdfs
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Attachments: HDFS-11639-HDFS-9806.001.patch,
> HDFS-11639-HDFS-9806.002.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which
> encodes information about where the data comes from, i.e. URI, offset,
> length, and nonce value. This data should be encoded in the protocol
> ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}}) when a block is
> available using the {{PROVIDED}} storage type.
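As an illustration of the four fields named in the description, here is a minimal sketch of an alias value and a byte round trip. Everything in it is an assumption: the class name BlockAliasSketch, the field types, and the length-prefixed layout are made up for the example, and the real protocol would carry this data in protobuf messages such as LocatedBlockProto rather than in this ad hoc encoding.

```java
import java.net.URI;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for a BlockAlias; field types are assumptions. */
final class BlockAliasSketch {
    final URI uri;       // where the data lives in the external store
    final long offset;   // start of the block within that object
    final long length;   // number of bytes in the block
    final byte[] nonce;  // opaque value used to detect a changed source

    BlockAliasSketch(URI uri, long offset, long length, byte[] nonce) {
        this.uri = uri;
        this.offset = offset;
        this.length = length;
        this.nonce = nonce.clone();
    }

    /** Ad hoc layout: uriLen, uri bytes, offset, length, nonceLen, nonce bytes. */
    byte[] encode() {
        byte[] uriBytes = uri.toString().getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + uriBytes.length + 8 + 8 + 4 + nonce.length);
        buf.putInt(uriBytes.length).put(uriBytes);
        buf.putLong(offset).putLong(length);
        buf.putInt(nonce.length).put(nonce);
        return buf.array();
    }

    /** Reads the fields back in the same order encode() wrote them. */
    static BlockAliasSketch decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte[] uriBytes = new byte[buf.getInt()];
        buf.get(uriBytes);
        long offset = buf.getLong();
        long length = buf.getLong();
        byte[] nonce = new byte[buf.getInt()];
        buf.get(nonce);
        return new BlockAliasSketch(
            URI.create(new String(uriBytes, StandardCharsets.UTF_8)), offset, length, nonce);
    }
}
```

The point of the sketch is only that the alias is small and self-describing, so it can travel with the block in the client protocol instead of being looked up on the Datanode.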