[ 
https://issues.apache.org/jira/browse/HADOOP-4663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12665224#action_12665224
 ] 

Konstantin Shvachko commented on HADOOP-4663:
---------------------------------------------

Hi Dhruba. There were several issues (we actually had data loss) caused by 
promoting blocks from tmp to main storage. One of them is HADOOP-4702, which 
probably came up while you were on vacation.
The problem was that partial blocks were promoted to the main storage even 
though they were transient.
I am arguing that your solution does not eliminate this condition. 
blocksBeingWritten can still contain transient blocks, and they will still be 
promoted to the main storage. A simple example is when you write a block 
(without using sync()) and the DN fails in the middle, leaving an incomplete 
(transient) block in the blocksBeingWritten directory. This block will be moved 
to the main storage upon DN restart, although it should be treated exactly as 
the blocks in the blocksBeingReplicated directory are, because neither the 
block nor any part of it was ever finalized.
Does that make more sense?
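
To make the argument concrete, here is a minimal sketch of the two restart 
policies being contrasted. It is an illustrative model only: the class, enum, 
and method names are hypothetical and do not exist in Hadoop, and the "synced" 
flag is an assumption standing in for "some part of this block was made durable 
via sync()". How the DN would actually learn that is not addressed here.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; none of these types exist in Hadoop.
final class RestartPolicySketch {

    // The two temporary directories named in the discussion.
    enum Dir { BLOCKS_BEING_WRITTEN, BLOCKS_BEING_REPLICATED }

    // A block file found in a temporary directory at DN startup.
    static final class Leftover {
        final String name;
        final Dir dir;
        final boolean synced; // assumption: true if a prefix was made durable via sync()

        Leftover(String name, Dir dir, boolean synced) {
            this.name = name;
            this.dir = dir;
            this.synced = synced;
        }
    }

    // Policy being criticized: everything found in blocksBeingWritten is
    // promoted to main storage, whether or not any part of it was finalized.
    static List<String> promotedByDirectoryOnly(List<Leftover> leftovers) {
        List<String> promoted = new ArrayList<String>();
        for (Leftover b : leftovers) {
            if (b.dir == Dir.BLOCKS_BEING_WRITTEN) {
                promoted.add(b.name);
            }
        }
        return promoted;
    }

    // Policy argued for: a block with nothing finalized is transient and is
    // dropped, exactly like the blocks in blocksBeingReplicated.
    static List<String> promotedIfSynced(List<Leftover> leftovers) {
        List<String> promoted = new ArrayList<String>();
        for (Leftover b : leftovers) {
            if (b.dir == Dir.BLOCKS_BEING_WRITTEN && b.synced) {
                promoted.add(b.name);
            }
        }
        return promoted;
    }
}
```

For a block written without sync() and interrupted by a DN failure, the first 
method promotes it and the second does not, which is the difference at issue.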

> Datanode should delete files under tmp when upgraded from 0.17
> --------------------------------------------------------------
>
>                 Key: HADOOP-4663
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4663
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Raghu Angadi
>            Assignee: dhruba borthakur
>            Priority: Blocker
>             Fix For: 0.19.1
>
>         Attachments: deleteTmp.patch, deleteTmp2.patch, deleteTmp_0.18.patch, 
> handleTmp1.patch
>
>
> Before 0.18, when the Datanode restarts, it deletes files under the 
> data-dir/tmp directory since these files are not valid anymore. But in 0.18 
> it moves these files to the normal directory, incorrectly making them valid 
> blocks. One of the following would work:
> - remove the tmp files during upgrade, or
> - if the files under /tmp are in pre-18 format (i.e. no generation), delete 
> them.
> Currently the effect of this bug is that these files end up failing block 
> verification and eventually get deleted, but they cause incorrect 
> over-replication at the namenode before that.
> Also, it looks like our policy regarding the treatment of files under tmp 
> needs to be defined better. Right now there are probably one or two more bugs 
> with it. Dhruba, please file them if you remember.
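
The second option in the description (delete tmp files that are in pre-18 
format, i.e. with no generation) could be sketched roughly as below. This 
assumes that a metadata file named "blk_<id>_<generation>.meta" carries a 
generation stamp and that one named "blk_<id>.meta" does not. That naming 
scheme, the class name, and the method names are assumptions made for 
illustration, not taken from any of the attached patches.

```java
import java.util.regex.Pattern;

// Hypothetical helper; not part of Hadoop or of the attached patches.
final class TmpFormatSketch {

    // Assumed naming with a generation stamp: blk_<id>_<generation>.meta
    private static final Pattern WITH_GENERATION =
        Pattern.compile("blk_-?\\d+_\\d+\\.meta");

    // Assumed naming without a generation stamp: blk_<id>.meta
    private static final Pattern WITHOUT_GENERATION =
        Pattern.compile("blk_-?\\d+\\.meta");

    // True if the metadata file name has no generation component,
    // which this sketch treats as "pre-18 format".
    static boolean isPreGenerationMeta(String fileName) {
        return WITHOUT_GENERATION.matcher(fileName).matches();
    }

    // True if the metadata file name includes a generation component.
    static boolean hasGeneration(String fileName) {
        return WITH_GENERATION.matcher(fileName).matches();
    }
}
```

A cleanup pass over the tmp directory would then delete every block whose 
metadata file satisfies isPreGenerationMeta and leave the rest to whatever 
policy is chosen for newer files.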

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
