[ 
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9079:
----------------------------
    Attachment: HDFS-9079.00.patch

This is a wip patch to explore the idea of serializing *different types* of 
update events from {{StripedDataStreamer}}.

The current implementation introduces new utilities ({{MultipleBlockingQueue}} and 
{{ConcurrentPoll}}) to serialize all events of the same type. For example, all 
{{updateBlockForPipeline}} calls are synchronized.
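
To make the per-type serialization concrete, below is a rough sketch of the 
per-streamer queue idea; the class shape and method names are illustrative and 
may not match the patch exactly.
{code}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: one blocking queue per streamer, so results of a
// given event type (e.g. updateBlockForPipeline) are handed to each streamer
// in a serialized order.
class MultipleBlockingQueue<T> {
  private final List<BlockingQueue<T>> queues;

  MultipleBlockingQueue(int numStreamers, int capacity) {
    queues = new ArrayList<>(numStreamers);
    for (int i = 0; i < numStreamers; i++) {
      queues.add(new LinkedBlockingQueue<T>(capacity));
    }
  }

  /** Streamer i blocks until its result (e.g. an updated block) is available. */
  T take(int i) throws InterruptedException {
    return queues.get(i).take();
  }

  /** Whoever performed the shared NN call publishes the result to streamer i. */
  void offer(int i, T value) {
    queues.get(i).offer(value);
  }
}
{code}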

But as we discussed under HDFS-9040, it is still possible to have interleaved 
updates from different types of events. [~jingzhao]'s patch there does a great 
job of simplifying the logic by concentrating most updates at the 
{{DFSStripedOutputStream}} level. I'm uploading a patch based on Jing's and 
Walter's work under that JIRA. My earlier 
[comment|https://issues.apache.org/jira/browse/HDFS-9040?focusedCommentId=14741972&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14741972] 
describes part of the algorithm in the patch. It has the following potential 
benefits:
# I think Walter made a similar argument when creating his first WIP patch on 
HDFS-9040: {{DFSOutputStream}} is not a long-running daemon, but a one-off 
thread for specific calls like {{writeChunk}} and {{close}}, so it's not easy 
to insert logic that periodically checks the streamers. 
{{BlockMetadataCoordinator}} in this patch is somewhat similar to 
{{BlockGroupStreamer}} in Walter's patch: it is a daemon that processes incoming 
updates, which avoids having to wait for the next {{DFSOutputStream}} call to 
process an update (see the sketch after this list).
# This patch also limits the lifetime of a {{StripedDataStreamer}} to a single 
block. At {{endBlock}} the streamer will close itself. 
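
For item 1, a minimal sketch of what the coordinator-as-daemon could look like; 
the event types and handler here are hypothetical placeholders, not the exact 
code in the patch:
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative sketch only: a single daemon thread drains update events from
// all streamers, so updates of different types are applied in one serialized
// order instead of interleaving inside each streamer.
class BlockMetadataCoordinator implements Runnable {
  enum EventType { STREAMER_FAILED, BLOCK_ENDED, GS_UPDATE_NEEDED }

  static class UpdateEvent {
    final int streamerIndex;
    final EventType type;
    UpdateEvent(int streamerIndex, EventType type) {
      this.streamerIndex = streamerIndex;
      this.type = type;
    }
  }

  private final BlockingQueue<UpdateEvent> events = new LinkedBlockingQueue<>();

  /** Called by streamers (or DFSStripedOutputStream) to report an update. */
  void post(UpdateEvent e) {
    events.add(e);
  }

  @Override
  public void run() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        UpdateEvent e = events.take();  // one event at a time => serialized updates
        handle(e);                      // e.g. bump GS, update NN, replace streamer
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
    }
  }

  private void handle(UpdateEvent e) {
    // placeholder for the real coordination logic
  }
}
{code}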

A few TODOs that I will add in the next rev:
# I need to add the logic of replacing all streamers at {{writeChunk}}, similar 
to {{replaceFailedStreamers}} in Jing's patch.
# Need to bump the GS of all {{FINISHED}} streamers to the maximum preallocated 
GS. When all streamers are either {{FINISHED}} or {{FAILED}}, we need to update 
the NN (sketched below).
# Need to handle the block token problem Walter pointed out.
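
As a rough illustration of TODO 2, the check could be along these lines (names 
are hypothetical; the real logic would live in the coordinator):
{code}
// Hypothetical sketch of the "all streamers done" check from TODO 2.
class StreamerStates {
  enum StreamerState { RUNNING, FINISHED, FAILED }

  static boolean readyToUpdateNN(StreamerState[] states) {
    for (StreamerState s : states) {
      if (s == StreamerState.RUNNING) {
        return false;  // some streamer is still writing its block
      }
    }
    // All streamers are FINISHED or FAILED: bump the GS of each FINISHED
    // streamer to the maximum preallocated GS, then report the block to the NN.
    return true;
  }
}
{code}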

> Erasure coding: preallocate multiple generation stamps and serialize updates 
> from data streamers
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9079
>                 URL: https://issues.apache.org/jira/browse/HDFS-9079
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9079.00.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error =>
> 2) Asks NN for new GS =>
> 3) Gets new GS from NN =>
> 4) Applies new GS to DN (createBlockOutputStream) =>
> 5) Ack from DN =>
> 6) Updates block on NN
> {code}
> To simplify the above we can preallocate GS when NN creates a new striped 
> block group ({{FSN#createNewBlock}}). For each new striped block group we can 
> reserve {{NUM_PARITY_BLOCKS}} GS's. Then steps 1~3 in the above sequence can 
> be saved. If more than {{NUM_PARITY_BLOCKS}} errors have happened we 
> shouldn't try to further recover anyway.
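
For what it's worth, the preallocation quoted above could be sketched roughly 
as follows (hypothetical helper; the real change would sit inside the NN's 
block allocation path):
{code}
// Hypothetical sketch of reserving NUM_PARITY_BLOCKS generation stamps when a
// new striped block group is created, so a client can bump the GS locally
// (skipping steps 1-3 above) instead of asking the NN for a new one.
class GenerationStampReservation {
  static final int NUM_PARITY_BLOCKS = 3;  // e.g. RS-6-3

  static long[] preallocate(long initialGS) {
    long[] reserved = new long[NUM_PARITY_BLOCKS];
    for (int i = 0; i < reserved.length; i++) {
      reserved[i] = initialGS + 1 + i;  // consecutive stamps after the initial GS
    }
    return reserved;
  }
}
{code}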



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
