[ https://issues.apache.org/jira/browse/HBASE-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741292#action_12741292 ]
Lars George commented on HBASE-1684:
------------------------------------
Stack, regarding your comment on whether both a mapper and a reducer are
needed: for RestoreTable I am using both. The mapper reads from the backup
files and then randomizes the rows by emitting each one under a random
intermediate key. This is along the lines of what Ryan did with his pure
randomizer MR class. That way all the RegionServers are hit equally.
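
To illustrate the random intermediate key idea, here is a minimal plain-Java
sketch. It is not the attached RestoreTable code and uses no Hadoop or HBase
classes; the class name RandomKeyShuffleSketch and the method randomize are
hypothetical. A TreeMap stands in for the MapReduce shuffle, which sorts
records by their intermediate key before they reach the reducer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.TreeMap;

// Illustration only: simulates map -> shuffle -> reduce in memory.
class RandomKeyShuffleSketch {

    // "Map" step: tag every row with a random long as its intermediate key.
    // "Shuffle" step: the TreeMap orders rows by that random key, so the
    // order in which rows come out is unrelated to their row-key order.
    static List<String> randomize(List<String> sortedRows, long seed) {
        Random random = new Random(seed);
        TreeMap<Long, List<String>> shuffled = new TreeMap<>();
        for (String row : sortedRows) {
            // Rows that happen to draw the same key share one bucket.
            shuffled.computeIfAbsent(random.nextLong(), k -> new ArrayList<>()).add(row);
        }
        // "Reduce" step: emit rows in intermediate-key order.
        List<String> out = new ArrayList<>();
        for (List<String> bucket : shuffled.values()) {
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row-001", "row-002", "row-003", "row-004");
        System.out.println(randomize(rows, 42L));
    }
}
```

Because consecutive rows from the backup file no longer arrive in row-key
order, writes are spread over many regions instead of hitting one region at
a time.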
For BackupTable I am using an IdentityTableMapper and encoding the data in
the reducer so that it can be written out through the TextOutputFormat. As we
discussed a while ago with you and Jon, it should also be possible to use only
a Mapper, do the encoding there, and set the number of reducers to 0, which
hands the Mapper's records straight to the TextOutputFormat.
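
As a rough illustration of what encoding binary data for a line-oriented text
output can look like, here is a plain-Java sketch. The line layout (three
Base64 fields separated by tabs) and the names TextLineCodecSketch, encode and
decode are assumptions made for this example; the comment above does not say
which encoding the attached BackupTable uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustration only: a hypothetical text encoding for one cell.
class TextLineCodecSketch {

    // Base64 output never contains a tab or a newline, so arbitrary
    // binary row keys and values fit safely on one text line.
    static String encode(byte[] row, byte[] column, byte[] value) {
        Base64.Encoder enc = Base64.getEncoder();
        return enc.encodeToString(row) + "\t"
             + enc.encodeToString(column) + "\t"
             + enc.encodeToString(value);
    }

    // Inverse of encode(): returns { row, column, value }.
    static byte[][] decode(String line) {
        // Limit -1 keeps trailing empty fields, e.g. an empty value.
        String[] fields = line.split("\t", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException("expected 3 fields, got " + fields.length);
        }
        Base64.Decoder dec = Base64.getDecoder();
        return new byte[][] {
            dec.decode(fields[0]), dec.decode(fields[1]), dec.decode(fields[2])
        };
    }

    public static void main(String[] args) {
        byte[] row = "row\t1".getBytes(StandardCharsets.UTF_8);
        byte[] column = "info:data".getBytes(StandardCharsets.UTF_8);
        byte[] value = new byte[] {0, 10, 9, -1};
        System.out.println(encode(row, column, value));
    }
}
```

The same encode step could run in either a reducer or a mapper; with zero
reducers it would simply run in the mapper.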
Lastly, implementing Tool seems deprecated. The new mapreduce WordCount
sample that comes with Hadoop 0.20 abandons it too. That is also why I changed
RowCounter not to use it when I cleaned up the hbase.mapreduce package. The
generic options are parsed with the GenericOptionsParser directly inside
main(), and the remaining arguments are used for the specific MR job. I have
done the same in the two attached classes.
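
The split between generic options and job-specific arguments can be pictured
with the plain-Java stand-in below. It is a simplified imitation, not Hadoop's
parser: it recognizes only "-D key=value" and "-Dkey=value", and the class
name GenericArgsSketch, its fields, and the sample arguments are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: mimics the "generic options first, the rest goes to
// the job" pattern without depending on Hadoop.
class GenericArgsSketch {
    final Map<String, String> properties = new LinkedHashMap<>();
    final List<String> remaining = new ArrayList<>();

    GenericArgsSketch(String[] args) {
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            String pair = null;
            if (arg.equals("-D") && i + 1 < args.length) {
                pair = args[++i];          // form: -D key=value
            } else if (arg.startsWith("-D") && arg.length() > 2) {
                pair = arg.substring(2);   // form: -Dkey=value
            }
            if (pair == null) {
                remaining.add(arg);        // not a generic option
                continue;
            }
            int eq = pair.indexOf('=');
            if (eq > 0) {
                properties.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
            // A -D argument without "key=value" is silently dropped here.
        }
    }

    public static void main(String[] args) {
        GenericArgsSketch parsed = new GenericArgsSketch(args);
        System.out.println("properties: " + parsed.properties);
        System.out.println("job args:   " + parsed.remaining);
    }
}
```

In a real job's main(), the properties would end up in the job configuration
and the remaining arguments (for example a table name and an output
directory) would be validated by the job itself.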
> Backup (Export/Import) contrib tool for 0.20
> --------------------------------------------
>
> Key: HBASE-1684
> URL: https://issues.apache.org/jira/browse/HBASE-1684
> Project: Hadoop HBase
> Issue Type: New Feature
> Components: contrib
> Affects Versions: 0.20.0
> Reporter: Jonathan Gray
> Assignee: Jonathan Gray
> Fix For: 0.20.1
>
> Attachments: BackupTable.java, HBASE-1684-v1.patch, RestoreTable.java
>
>
> Add a new Result/KeyValue based Export MapReduce job to contrib for 0.20.
> Make it in the hadoop 0.20 and hbase 0.20 MR API, and hbase 0.20 API
> (Result/Put).