[ https://issues.apache.org/jira/browse/HBASE-4552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13133696#comment-13133696 ]
Jonathan Hsieh commented on HBASE-4552:
---------------------------------------

If we have an HDFS failure, we won't be able to record or update information about what failed. This makes me think we need to journal/log the intended atomic actions. Once we have the log, we can act depending on the situation:

* If we complete successfully, we remove/invalidate the log and carry on.
* If we fail (can't write, RS goes down and restarts), we check to see whether everything is in. If it isn't, we roll back the subset of hfile loads that had happened. If rollback fails, we still have the log, so we can try later, or maybe we kill the RS?

How about we make this a subtask/follow-on jira? The first cut will just detect the situation and log error messages (similar to what currently happens). A follow-on task will discuss and add/implement a recovery mechanism.

> multi-CF bulk load is not atomic across column families
> -------------------------------------------------------
>
>                 Key: HBASE-4552
>                 URL: https://issues.apache.org/jira/browse/HBASE-4552
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Todd Lipcon
>            Assignee: Jonathan Hsieh
>             Fix For: 0.92.0
>
>
> Currently the bulk load API simply imports one HFile at a time. With
> multi-column-family support, this is inappropriate, since different CFs show
> up separately. Instead, the IPC endpoint should take a map of CF -> HFiles, so we
> can online them all under a single region-wide lock.
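The journal/rollback protocol proposed in the comment could be sketched roughly as below. This is a minimal illustration only: the class and method names (`BulkLoadJournal`, `logIntent`, `loadAll`) are hypothetical, not actual HBase APIs, and the "loads" are simulated in memory rather than touching HDFS.

```java
import java.util.*;

// Hypothetical sketch of the journal-based atomic bulk load described above.
// Not an HBase API; all names here are illustrative.
public class BulkLoadJournal {
    // Journal of intended HFile loads, keyed by column family.
    private final Map<String, List<String>> intended = new LinkedHashMap<>();
    // HFiles that have actually been "loaded" so far.
    private final Set<String> loaded = new LinkedHashSet<>();
    private boolean journalValid = false;

    // Step 1: record the intended atomic action *before* loading anything,
    // so a crash mid-load leaves enough information to recover.
    public void logIntent(Map<String, List<String>> cfToHFiles) {
        intended.putAll(cfToHFiles);
        journalValid = true;
    }

    // Step 2: attempt every load. A failure (simulated here via failOn)
    // triggers a rollback of the subset already loaded; the journal stays
    // valid so recovery can be retried later.
    public boolean loadAll(Set<String> failOn) {
        for (Map.Entry<String, List<String>> e : intended.entrySet()) {
            for (String hfile : e.getValue()) {
                if (failOn.contains(hfile)) {
                    rollback();
                    return false;
                }
                loaded.add(hfile);
            }
        }
        journalValid = false; // success: invalidate the journal and carry on
        return true;
    }

    // Undo the partial subset of hfile loads that had happened.
    private void rollback() {
        loaded.clear();
    }

    public boolean needsRecovery() { return journalValid; }
    public Set<String> loadedFiles() { return loaded; }
}
```

The key property is that the journal is written first and only invalidated on success, so after any crash the presence of a valid journal tells the restarted RS that recovery (rollback or retry) is still needed.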