[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191314#comment-13191314
 ] 

Zhihong Yu commented on HBASE-5210:
-----------------------------------

I prefer Lawrence's approach.
The only consideration is that it takes relatively long for the proposed change 
in FileOutputCommitter.moveTaskOutputs() to be published, reviewed and pushed 
upstream.
                
> HFiles are missing from an incremental load
> -------------------------------------------
>
>                 Key: HBASE-5210
>                 URL: https://issues.apache.org/jira/browse/HBASE-5210
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.2
>         Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>            Reporter: Lawrence Simpson
>         Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>
> We run an overnight map/reduce job that loads data from an external source 
> and adds that data to an existing HBase table.  The input files have been 
> loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
> TotalOrderPartitioner) to create HFiles which are subsequently added to the 
> HBase table.  On at least two separate occasions (that we know of), a range 
> of output would be missing for a given day.  The range of keys for the 
> missing values corresponded to those of a particular region.  This implied 
> that a complete HFile somehow went missing from the job.  Further 
> investigation revealed the following:
>  * Two different reducers (running in separate JVMs and thus separate class 
> loaders)
>  * in the same server can end up using the same file names for their
>  * HFiles.  The scenario is as follows:
>  *    1.      Both reducers start near the same time.
>  *    2.      The first reducer reaches the point where it wants to write its 
> first file.
>  *    3.      It uses the StoreFile class which contains a static Random 
> object 
>  *            which is initialized by default using a timestamp.
>  *    4.      The file name is generated using the random number generator.
>  *    5.      The file name is checked against other existing files.
>  *    6.      The file is written into temporary files in a directory named
>  *            after the reducer attempt.
>  *    7.      The second reduce task reaches the same point, but its 
> StoreClass
>  *            (which is now in the file system's cache) gets loaded within the
>  *            time resolution of the OS and thus initializes its Random()
>  *            object with the same seed as the first task.
>  *    8.      The second task also checks for an existing file with the name
>  *            generated by the random number generator and finds no conflict
>  *            because each task is writing files in its own temporary folder.
>  *    9.      The first task finishes and gets its temporary files committed
>  *            to the "real" folder specified for output of the HFiles.
>  *     10.    The second task then reaches its own conclusion and commits its
>  *            files (moveTaskOutputs).  The released Hadoop code just 
> overwrites
>  *            any files with the same name.  No warning messages or anything.
>  *            The first task's HFiles just go missing.
>  * 
>  *  Note:  The reducers here are NOT different attempts at the same 
>  *    reduce task.  They are different reduce tasks so data is
>  *    really lost.
> I am currently testing a fix in which I have added code to the Hadoop 
> FileOutputCommitter.moveTaskOutputs method to check for a conflict with
> an existing file in the final output folder and to rename the HFile if
> needed.  This may not be appropriate for all uses of FileOutputFormat.
> So I have put this into a new class which is then used by a subclass of
> HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
> more of a problem due to private declarations.
> I don't know if my approach is the best fix for the problem.  If someone
> more knowledgeable than myself deems that it is, I will be happy to share
> what I have done and by that time I may have some information on the
> results.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to