[ 
https://issues.apache.org/jira/browse/HDFS-9098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhe Zhang updated HDFS-9098:
----------------------------
    Attachment: HDFS-9098.wip.patch

WIP patch to demonstrate the idea. It borrows from the 
[IMUnit|http://mir.cs.illinois.edu/imunit/] paper and from 
[sync_point testing|https://github.com/cloudera/kudu/blob/master/src/kudu/util/sync_point.cc] 
in Kudu.

The logic of {{syncPoint}} is still hacky because it needs to serve both as a 
synchronization point and a fault injector. I'm working on a better structure.
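
To make the two roles concrete, here is a minimal sketch of one way they could be kept apart inside a single helper. It is not the code in the attached patch: the class name {{SyncPointSketch}}, its methods, and the string event names are all hypothetical, and only standard {{java.util.concurrent}} classes are used.

```java
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration; not the syncPoint from the patch. */
final class SyncPointSketch {
  private static final long TIMEOUT_SECONDS = 10;

  // Successor event -> latch it must wait on before proceeding.
  private final Map<String, CountDownLatch> waitFor = new ConcurrentHashMap<>();
  // Predecessor event -> latch it opens once it has been reached.
  private final Map<String, CountDownLatch> release = new ConcurrentHashMap<>();
  // Event -> one-shot fault thrown when the event is reached.
  private final Map<String, IOException> faults = new ConcurrentHashMap<>();

  /** Synchronization role: 'second' may not proceed until 'first' is reached. */
  void order(String first, String second) {
    CountDownLatch latch = new CountDownLatch(1);
    release.put(first, latch);
    waitFor.put(second, latch);
  }

  /** Fault-injection role: throw 'fault' the next time 'event' is reached. */
  void failAt(String event, IOException fault) {
    faults.put(event, fault);
  }

  /** Called by the code under test at each named point. */
  void process(String event) throws IOException {
    CountDownLatch gate = waitFor.get(event);
    if (gate != null) {
      try {
        if (!gate.await(TIMEOUT_SECONDS, TimeUnit.SECONDS)) {
          throw new IOException("Timed out waiting before " + event);
        }
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        throw new IOException("Interrupted while waiting before " + event, ie);
      }
    }
    IOException fault = faults.remove(event);
    CountDownLatch done = release.get(event);
    if (done != null) {
      done.countDown(); // open the gate first so a fault cannot strand waiters
    }
    if (fault != null) {
      throw fault;
    }
  }
}
```

With this shape a test would declare orderings and faults up front and the streamer code would only call {{process}}. Whether that split fits the real {{syncPoint}} is an open question.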

The added {{TestStripedDataStreamers}} has a very simple test to emulate the 
case where a second failure happens during the {{updateBlockForPipeline}} call 
that handles the first failure. I think ideally we should create paired 
{{BEFORE}} and {{AFTER}} events, like {{BEFORE_UPDATE_BLOCK_FOR_PIPELINE}} and 
{{AFTER_UPDATE_BLOCK_FOR_PIPELINE}}. The {{WRITE_CHUNK}} event is also a little 
tricky, because we need to emulate the {{writeChunk}} call at a specific offset.
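
As a rough illustration of the event idea, the sketch below models a trigger that fires once on a given event, optionally only when a chunk covers a target stream offset. The enum constants reuse the names above; everything else ({{EventTriggerSketch}}, {{hit}}, the {{chunkStart}}/{{len}} parameters, and using -1 for "any offset") is a hypothetical design, not what the patch does.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical one-shot trigger keyed on an event and an optional offset. */
final class EventTriggerSketch {
  enum Event {
    BEFORE_UPDATE_BLOCK_FOR_PIPELINE,
    AFTER_UPDATE_BLOCK_FOR_PIPELINE,
    WRITE_CHUNK
  }

  private final Event event;
  private final long targetOffset; // negative means "match any offset"
  private final Runnable action;   // e.g. block, delay, or kill a DataNode
  private final AtomicBoolean fired = new AtomicBoolean(false);

  EventTriggerSketch(Event event, long targetOffset, Runnable action) {
    this.event = event;
    this.targetOffset = targetOffset;
    this.action = action;
  }

  /**
   * Called at each instrumented point. For WRITE_CHUNK, chunkStart is the
   * stream offset of the chunk's first byte and len is its length; other
   * events can pass 0 for both.
   */
  void hit(Event e, long chunkStart, int len) {
    if (e != event) {
      return;
    }
    boolean offsetMatches = targetOffset < 0
        || (chunkStart <= targetOffset && targetOffset < chunkStart + len);
    // compareAndSet guarantees the action runs at most once across threads,
    // and the action itself runs without holding any lock.
    if (offsetMatches && fired.compareAndSet(false, true)) {
      action.run();
    }
  }
}
```

For example, a trigger built with {{WRITE_CHUNK}} and offset 1000 would ignore a 512-byte chunk starting at 0 and fire on the chunk starting at 512.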

> Erasure coding: emulate race conditions among striped streamers in write 
> pipeline
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-9098
>                 URL: https://issues.apache.org/jira/browse/HDFS-9098
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: 3.0.0
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9098.wip.patch
>
>
> Apparently the interleaving of events among {{StripedDataStreamer}} instances 
> is very tricky to handle. [~walter.k.su] and [~jingzhao] have discussed 
> several race conditions under HDFS-9040.
> Let's use FaultInjector to emulate different combinations of interleaved 
> events.
> In particular, we should consider injecting delays in the following places:
> # {{Streamer#endBlock}}
> # {{Streamer#locateFollowingBlock}}
> # {{Streamer#updateBlockForPipeline}}
> # {{Streamer#updatePipeline}}
> # {{OutputStream#writeChunk}}
> # {{OutputStream#close}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
