[ https://issues.apache.org/jira/browse/HDFS-7889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14387204#comment-14387204 ]
Zhe Zhang commented on HDFS-7889: --------------------------------- Thanks Bo for the revised patch! I think the main changes to {{DFSOutputStream}} are: # Change some fields and methods from private to protected # {{streamer}} field used to be accessed directly; now added a getter {{getStreamer()}} # Refactor: separate out 2 methods in {{writeChunkImpl}}: {{adjustChunkBoundary}} and {{encounterBlockBoundary}} # Refactor: separate out a {{getFileStatus}} method # Adds logic to {{newStreamForCreate}} and {{newStreamForAppend}} to create striped output streams based on file status. Looks to me 1~4 can go into trunk under HDFS-7888. Regarding 5, since you already have an abstract {{getFileStatus}} method, maybe it's better to do that checking in {{DFSClient}}? The main logics in {{DFSStripedOutputStream}} and {{StripedDataStreamer}} look pretty good to me overall. Some detailed issues to address: # It's probably OK to pad zero-bytes now, but I think we should eventually get rid of it. Instead of actually writing zero-bytes to the DNs hosting raw data blocks, we should just flush the parity data. When we do that we can also get rid of {{bytesCurBlockGroup}}. # 2nd constructor of {{StripedDataStreamer}} is not needed # {{InterruptedException}} should be properly handled in {{locateFollowingBlock}} # Array of generic objects is against Java best practice I think. {code} BlockingQueue<LocatedBlock>[] stripeBlocks = new BlockingQueue[blockGroupSize]; {code} How about checking type when accessing stripedBlock elements, or just use an ArrayList? {code} BlockingQueue<\?>[] stripeBlocks = new BlockingQueue[blockGroupSize]; ... assert stripedBlocks[i] instanceof LocatedBlock Access stripedBlocks[i] {code} Nits: # Typo in {{DFSOutputStream}}: clazzName # In {{DFSStripedOutputStream}}: {{MaxBlockInQueue}} should be {{MaxBlocksInQueue}} > Subclass DFSOutputStream to support writing striping layout files > ----------------------------------------------------------------- > > Key: HDFS-7889 > URL: https://issues.apache.org/jira/browse/HDFS-7889 > Project: Hadoop HDFS > Issue Type: Sub-task > Reporter: Li Bo > Assignee: Li Bo > Attachments: HDFS-7889-001.patch, HDFS-7889-002.patch, > HDFS-7889-003.patch, HDFS-7889-004.patch > > > After HDFS-7888, we can subclass {{DFSOutputStream}} to support writing > striping layout files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)