[ https://issues.apache.org/jira/browse/HDFS-11504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15971870#comment-15971870 ]
Xiaoyu Yao edited comment on HDFS-11504 at 4/18/17 12:11 AM:
-------------------------------------------------------------

Thanks [~anu] for the review. I've updated the patch v3 to address the comments.

bq. Something to think about: should we create a KeyManagerProtocol.proto file for these APIs and move all Block APIs under KSM? StorageContainerLocation.proto – ScmLocatedBlockProto. Would it make sense to return key + pipeline?

Fixed.

bq. Do we need the AllocatedBlock#Builder class? It has only 2 fields and their types are very different. Unused import in BlockManagerImpl: org.apache.hadoop.hdfs.ozone.protocol.proto.ContainerProtos;

Fixed.

bq. allocateBlock
{code}
// TODO: handle block size greater than the container size
// For now, allow only block size <= containerSize
{code}
bq. I don't think we should ever allow a block size larger than the container size. We should throw a proper exception here:
{code}
Preconditions.checkArgument(size <= containerSize, "Unsupported block size");
{code}
bq. {{throw new IOException("Unable to create block while in chill mode"}} Can we replace the IOException with StorageContainerException?

Fixed.

bq. How does the client differentiate whether it needs to create a container or not? Add a flag in the AllocatedBlock so that the client knows when the create has to be done by the client?
{code}
if ((currentContainerName == null) ||
    ((currentContainerName != null) && (currentContainerUsed >= containerSize))) {
  currentContainerName = UUID.randomUUID().toString();
  pipeline = containerManager.allocateContainer(currentContainerName);
  currentContainerUsed = size;
{code}
bq. This prevents us from recovering a block when a node fails. We have to do the TODO mentioned here. The issue is that pipelines get rewritten all the time (by the recovery code when datanodes fail), so we cannot update blocks.db to do that. We need a single source of truth, and blocks.db should not cache this info.
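The size precondition, the chill-mode exception, and the create-container flag discussed above could be combined roughly as follows. This is an illustrative sketch only, not the actual Ozone code: the container size is a made-up constant, and IllegalArgumentException stands in for Guava's Preconditions and for StorageContainerException, which need Hadoop on the classpath.

```java
import java.util.UUID;

// Hypothetical sketch of the allocateBlock changes discussed in the review:
// reject oversized blocks up front and tell the caller whether it must create
// the container before writing. Names (AllocatedBlock, shouldCreateContainer)
// follow the review discussion, not the committed patch.
public class BlockSketch {
    static final long CONTAINER_SIZE = 5L * 1024 * 1024 * 1024; // 5 GB, assumed

    // Result of an allocation: container name plus a flag telling the client
    // whether it has to create the container before writing to it.
    static class AllocatedBlock {
        final String containerName;
        final boolean shouldCreateContainer;
        AllocatedBlock(String name, boolean create) {
            this.containerName = name;
            this.shouldCreateContainer = create;
        }
    }

    private String currentContainerName = null;
    private long currentContainerUsed = 0;

    AllocatedBlock allocateBlock(long size) {
        // A block larger than a container can never fit; fail fast instead of
        // deferring to a TODO (mirrors the Preconditions.checkArgument above).
        if (size <= 0 || size > CONTAINER_SIZE) {
            throw new IllegalArgumentException("Unsupported block size: " + size);
        }
        boolean create = false;
        // Open a fresh container when none exists or the current one is full.
        if (currentContainerName == null
                || currentContainerUsed + size > CONTAINER_SIZE) {
            currentContainerName = UUID.randomUUID().toString();
            currentContainerUsed = 0;
            create = true; // client must create this container first
        }
        currentContainerUsed += size;
        return new AllocatedBlock(currentContainerName, create);
    }
}
```

With this shape, the client checks shouldCreateContainer on the returned AllocatedBlock and issues the container create itself before writing, which answers the "how does the client differentiate" question above.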
{code}
// TODO: block->container mapping or block->pipeline mapping
// block->container fits naturally into the layered design and
// leave the container->pipeline mapping to container manager.
{code}

bq. ScmLocatedBlock.java: {{+ "; locations=" + locations}} locations is a list, so not sure if this print is going to print what you expect.

Fixed.

> Ozone: SCM: Add AllocateBlock/DeleteBlock/GetBlock APIs
> -------------------------------------------------------
>
>                 Key: HDFS-11504
>                 URL: https://issues.apache.org/jira/browse/HDFS-11504
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7240
>            Reporter: Xiaoyu Yao
>            Assignee: Xiaoyu Yao
>      Attachments: HDFS-11504-HDFS-7240.001.patch,
>                   HDFS-11504-HDFS-7240.002.patch
>
> The signatures of the APIs are listed below. This allows SCM to
> 1) allocateBlock for a client and maintain the key->container mapping in
> LevelDB, in addition to the existing container->pipeline mapping in LevelDB.
> 2) return the pipeline of a block based on its key.
> 3) remove a block based on its key.
> {code}
> <Pipeline,key> allocateBlock(long size)
> <Pipeline> getBlock(key);
> void deleteBlock(key);
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
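Read as Java, the three APIs in the issue description might look like the following in-memory sketch. Pipeline is reduced to a String placeholder and the map stands in for the LevelDB store; the real SCM types and persistence are richer, so treat this purely as an illustration of the API shape.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// In-memory sketch of the three SCM block APIs from the description:
// allocateBlock, getBlock, deleteBlock. "Pipeline" is a String placeholder
// and the HashMap stands in for the LevelDB key->container mapping.
public class ScmBlockApiSketch {
    // key -> pipeline mapping (stand-in for the blocks LevelDB store)
    private final Map<String, String> blocks = new HashMap<>();

    // <Pipeline, key> allocateBlock(long size)
    SimpleEntry<String, String> allocateBlock(long size) {
        String key = UUID.randomUUID().toString();
        String pipeline = "pipeline-for-" + key; // placeholder pipeline
        blocks.put(key, pipeline);
        return new SimpleEntry<>(pipeline, key);
    }

    // <Pipeline> getBlock(key): return the pipeline of a block by its key
    String getBlock(String key) {
        return blocks.get(key);
    }

    // void deleteBlock(key): remove the block by its key
    void deleteBlock(String key) {
        blocks.remove(key);
    }
}
```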