[ https://issues.apache.org/jira/browse/HBASE-11484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057805#comment-14057805 ]
deepankar commented on HBASE-11484:
-----------------------------------

There are no cleanup hooks, and I don't think the job cleans up the restoreDir.

I was thinking of another approach: if a snapshot has already been restored into a directory, restoring it again should not create any new files. That would give us most of what we wanted, and we could use one standard restoreDir per exported snapshot. But I went through the code, and unfortunately restoring deletes the old HFileLinks and then creates new ones, even though they are exactly the same links. Is this supposed to happen, or is it something that might have been missed?

> Provide a way in TableSnapshotInputFormat, not to restore the regions to a
> path for running MR every time, rather reuse a already restored path
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-11484
>                 URL: https://issues.apache.org/jira/browse/HBASE-11484
>             Project: HBase
>          Issue Type: New Feature
>          Components: mapreduce
>            Reporter: deepankar
>            Priority: Minor
>
> We are trying to back a Hive table with MapReduce over snapshots, and we
> don't want to restore the snapshot to a restoreDir every time we
> execute a query. It would be nice to have a boolean parameter in
> *TableSnapshotInputFormat.setInput*, exposed through
> *TableMapReduceUtil.initTableSnapshotMapperJob*. When it is set, the job
> would check whether the snapshot and the restore dir are in sync, rather
> than restoring again.
> Does this idea look OK to you, or do you have other suggestions? I will
> put up a patch if the idea is acceptable.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
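
For illustration only, the "check whether the snapshot and the restore dir are in sync" idea from the description could be sketched roughly as below. This is not HBase code and does not use any HBase API: the class RestoreDirReuse, the Restorer interface, the marker file name and the reuse flag are all hypothetical names invented for the sketch, and it works on a local directory through java.nio rather than on HDFS. It assumes a restore can be considered complete once a marker file naming the snapshot has been written into the restoreDir.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; none of these names exist in HBase.
class RestoreDirReuse {
    // Assumed marker file; HBase itself does not write such a file.
    static final String MARKER = ".restored-snapshot";

    /** Stand-in for the real (expensive) snapshot restore step. */
    interface Restorer {
        void restore(String snapshotName, Path restoreDir) throws IOException;
    }

    /** True if restoreDir holds a marker saying snapshotName was fully restored there. */
    static boolean isInSync(String snapshotName, Path restoreDir) throws IOException {
        Path marker = restoreDir.resolve(MARKER);
        if (!Files.isRegularFile(marker)) {
            return false;
        }
        String recorded = new String(Files.readAllBytes(marker), StandardCharsets.UTF_8).trim();
        return recorded.equals(snapshotName);
    }

    /** Restores unless reuse is requested and the dir is in sync; returns true if a restore ran. */
    static boolean restoreIfNeeded(String snapshotName, Path restoreDir, boolean reuse,
                                   Restorer restorer) throws IOException {
        if (reuse && isInSync(snapshotName, restoreDir)) {
            return false; // reuse the already restored path
        }
        Files.createDirectories(restoreDir);
        Path marker = restoreDir.resolve(MARKER);
        // Drop any old marker first so a failed restore is never reported as in sync.
        Files.deleteIfExists(marker);
        restorer.restore(snapshotName, restoreDir);
        // Write the marker last: its presence means the restore finished.
        Files.write(marker, snapshotName.getBytes(StandardCharsets.UTF_8));
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("restore-demo");
        Restorer restorer = (name, target) -> System.out.println("restoring " + name);
        System.out.println(restoreIfNeeded("snap1", dir, true, restorer)); // first call restores
        System.out.println(restoreIfNeeded("snap1", dir, true, restorer)); // second call reuses
    }
}
```

A marker file is only one possible way to define "in sync"; comparing the restored links against the snapshot manifest would be stricter, at the cost of reading the manifest on every job setup.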