[ 
https://issues.apache.org/jira/browse/HDFS-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217583#comment-15217583
 ] 

GAO Rui commented on HDFS-7661:
-------------------------------

Very creative idea, [~zhz]. {{Without any overwriting}} could actually simplify 
{{hflush/hsync}}. Inspired by your idea, I have come up with some new thoughts.

It may be a little strange to store data cells on parity DNs. Instead, maybe we 
could store the IPB (Internal Parity Block) file as two parts (two separate 
files). The first part is the parity data, which would not be modified. The 
second part is the flushed parity cell of the stripe currently being written. 
For the second part, we could keep the latest two versions, for example, 
{{last-flushed-parity-cell-0}} and {{last-flushed-parity-cell-1}}. The 
structure of {{last-flushed-parity-cell-X}} could be: logical block group 
length + parity cell data.
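
To make the proposed record layout concrete, here is a minimal sketch of one 
way to encode and decode a {{last-flushed-parity-cell-X}} record. The 8-byte 
big-endian length header and the function names are assumptions for 
illustration only; the comment above only says the record holds the logical 
block group length followed by the parity cell data.

```python
import struct
from typing import Tuple

# Assumed header: logical block group length as an unsigned 64-bit big-endian
# integer. The field width and byte order are hypothetical choices.
HEADER = struct.Struct(">Q")


def encode_flushed_cell(block_group_length: int, parity_cell: bytes) -> bytes:
    """Build a record: logical block group length + parity cell data."""
    return HEADER.pack(block_group_length) + parity_cell


def decode_flushed_cell(record: bytes) -> Tuple[int, bytes]:
    """Split a record back into (logical block group length, parity cell)."""
    if len(record) < HEADER.size:
        raise ValueError("record is shorter than the length header")
    (block_group_length,) = HEADER.unpack_from(record)
    return block_group_length, record[HEADER.size:]
```

Keeping the length in the same file as the parity cell means a reader can tell 
which flushed state a parity cell belongs to without consulting any other file.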

So, for writing, whenever the stripe being written is hflushed/hsynced, we 
replace the older {{last-flushed-parity-cell-X}} file with the newly flushed 
logical block group length and the new parity cell data. For reading, the 
parity DN locally chooses one of the two {{last-flushed-parity-cell-X}} files 
based on the read client's request. 
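
The write and read paths described above could be sketched roughly as follows. 
This is not DataNode code. The class name, the header format, the rule that 
"older" means the smaller recorded length, and the rule that a read picks the 
file whose recorded length equals the requested length are all assumptions 
made for illustration.

```python
import os
import struct
import tempfile

HEADER = struct.Struct(">Q")  # assumed: 8-byte big-endian block group length
SLOT_NAMES = ("last-flushed-parity-cell-0", "last-flushed-parity-cell-1")


class FlushedParityCellStore:
    """Hypothetical two-slot store for the last flushed parity cell."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, slot):
        return os.path.join(self.directory, SLOT_NAMES[slot])

    def _read_slot(self, slot):
        """Return (length, parity_cell) for a slot, or None if unusable."""
        try:
            with open(self._path(slot), "rb") as f:
                record = f.read()
        except FileNotFoundError:
            return None
        if len(record) < HEADER.size:
            return None
        (length,) = HEADER.unpack_from(record)
        return length, record[HEADER.size:]

    def flush(self, block_group_length, parity_cell):
        """Replace the older slot; the newer slot is never touched."""
        entries = [self._read_slot(0), self._read_slot(1)]
        # Assumption: a missing slot is oldest, else smaller length is older.
        lengths = [-1 if e is None else e[0] for e in entries]
        target = 0 if lengths[0] <= lengths[1] else 1
        tmp = self._path(target) + ".tmp"
        with open(tmp, "wb") as f:
            f.write(HEADER.pack(block_group_length) + parity_cell)
            f.flush()
            os.fsync(f.fileno())  # durability as for hsync; hflush may skip it
        os.replace(tmp, self._path(target))  # swap in the new version
        return target

    def read(self, requested_length):
        """Assumed rule: serve the slot whose length matches the request."""
        for slot in (0, 1):
            entry = self._read_slot(slot)
            if entry is not None and entry[0] == requested_length:
                return entry[1]
        return None


if __name__ == "__main__":
    store = FlushedParityCellStore(tempfile.mkdtemp())
    store.flush(100, b"parity-v1")
    store.flush(200, b"parity-v2")
    print(store.read(200))
```

Because each flush writes a fresh file and only then swaps it in under the 
older name, the most recently flushed version stays readable for the whole 
duration of the next flush.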

With this kind of design we avoid {{overwriting}} the IPB file, which 
simplifies the code implementation as well. We also always keep the last 
flushed data safe by switching between the two file names 
({{last-flushed-parity-cell-0}} and {{last-flushed-parity-cell-1}}): a flush 
only ever replaces the older file, so the most recently flushed version is 
never touched.

> [umbrella] support hflush and hsync for erasure coded files
> -----------------------------------------------------------
>
>                 Key: HDFS-7661
>                 URL: https://issues.apache.org/jira/browse/HDFS-7661
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: erasure-coding
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: GAO Rui
>         Attachments: EC-file-flush-and-sync-steps-plan-2015-12-01.png, 
> HDFS-7661-unitTest-wip-trunk.patch, HDFS-7661-wip.01.patch, 
> HDFS-EC-file-flush-sync-design-v20160323.pdf, 
> HDFS-EC-file-flush-sync-design-version1.1.pdf
>
>
> We also need to support hflush/hsync and visible length. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
