[ 
https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387204#comment-14387204
 ] 

Zhe Zhang commented on HDFS-7889:
---------------------------------

Thanks Bo for the revised patch!

I think the main changes to {{DFSOutputStream}} are:
# Change some fields and methods from private to protected
# {{streamer}} field used to be accessed directly; now added a getter 
{{getStreamer()}}
# Refactor: separate out 2 methods in {{writeChunkImpl}}: 
{{adjustChunkBoundary}} and {{encounterBlockBoundary}}
# Refactor: separate out a {{getFileStatus}} method
# Adds logic to {{newStreamForCreate}} and {{newStreamForAppend}} to create 
striped output streams based on file status.

Looks to me 1~4 can go into trunk under HDFS-7888. Regarding 5, since you 
already have an abstract {{getFileStatus}} method, maybe it's better to do that 
checking in {{DFSClient}}?

The main logics in {{DFSStripedOutputStream}} and {{StripedDataStreamer}} look 
pretty good to me overall. Some detailed issues to address:
# It's probably OK to pad zero-bytes now, but I think we should eventually get 
rid of it. Instead of actually writing zero-bytes to the DNs hosting raw data 
blocks, we should just flush the parity data. When we do that we can also get 
rid of {{bytesCurBlockGroup}}.
# 2nd constructor of {{StripedDataStreamer}} is not needed
# {{InterruptedException}} should be properly handled in 
{{locateFollowingBlock}}
# Array of generic objects is against Java best practice I think.
{code}
BlockingQueue<LocatedBlock>[] stripeBlocks = new BlockingQueue[blockGroupSize];
{code}
How about checking type when accessing stripedBlock elements, or just use an 
ArrayList? 
{code}
BlockingQueue<\?>[] stripeBlocks = new BlockingQueue[blockGroupSize];
...
assert stripedBlocks[i] instanceof LocatedBlock
Access stripedBlocks[i]
{code}

Nits:
# Typo in {{DFSOutputStream}}: clazzName
# In {{DFSStripedOutputStream}}: {{MaxBlockInQueue}} should be 
{{MaxBlocksInQueue}}


> Subclass DFSOutputStream to support writing striping layout files
> -----------------------------------------------------------------
>
>                 Key: HDFS-7889
>                 URL: https://issues.apache.org/jira/browse/HDFS-7889
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Li Bo
>         Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, 
> HDFS-7889-003.patch, HDFS-7889-004.patch
>
>
> After HDFS-7888, we can subclass  {{DFSOutputStream}} to support writing 
> striping layout files. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to