[ https://issues.apache.org/jira/browse/HBASE-5604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250450#comment-13250450 ]
Zhihong Yu commented on HBASE-5604:
-----------------------------------

{code}
+      if (tablesToUse == null || tableMap == null || tablesToUse.length != tableMap.length) {
+        // this can only happen when HLogMapper is used directly by a class other than WALPlayer
+        throw new IOException("No tables or incorrect table mapping specified.");
{code}
I think if we provide separate exceptions for the first two checks and the last check, it would be easier for users to understand.

{code}
+    System.err.println(" -D" + HLogInputFormat.START_TIME_KEY + "=ms (only apply edit after this time)");
+    System.err.println(" -D" + HLogInputFormat.END_TIME_KEY + "=ms (only apply edit before this time)");
{code}
Users would have to resort to a conversion tool to find the ms value for the desired date/time. Can we make this more user-friendly? E.g., in TimeStampingFileContext.java we have:
{code}
    this.sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
{code}
See also http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html#parse%28java.lang.String,%20java.text.ParsePosition%29

{code}
+    public String[] getLocations() throws IOException, InterruptedException {
+      // TODO: Find the data node with the most blocks for this HLog?
{code}
Would the above be addressed in a separate JIRA?

{code}
+      if (i>0) LOG.info("Skipped " + i + " entries.");
{code}
Minor: add spaces around '>'

> M/R tools to replay WAL files
> -----------------------------
>
>                 Key: HBASE-5604
>                 URL: https://issues.apache.org/jira/browse/HBASE-5604
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: 5604-v4.txt, 5604-v6.txt, 5604-v7.txt, 5604-v8.txt, HLog-5604-v3.txt
>
>
> Just an idea I had. Might be useful for restore of a backup using the HLogs.
> This could be an M/R job (with a mapper per HLog file).
> The tool would get a timerange and a (set of) table(s).
> We'd pick the right HLogs based on time before the M/R job is started and then have a mapper per HLog file.
> The mapper would then go through the HLog, filter out all WALEdits that don't fit into the time range or don't belong to any of the tables, and then use HFileOutputFormat to generate HFiles.
> Would need to indicate the splits we want, probably from a live table.
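
To make two of the review suggestions above concrete (separate exceptions for the null checks and the length check, and accepting a date/time string as well as ms), here is a rough sketch. The class name WalPlayerReviewSketch and the helpers parseTime and checkTableMapping are hypothetical and not part of the patch. The date pattern is the one quoted from TimeStampingFileContext.java, and the string is interpreted in the JVM's default time zone, which is an assumption the real tool might not want.

```java
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;

public class WalPlayerReviewSketch {

    // Same pattern as the SimpleDateFormat quoted from TimeStampingFileContext.java.
    static final String DATE_PATTERN = "yyyy-MM-dd'T'HH:mm:ss";

    /**
     * Hypothetical helper: accepts either epoch milliseconds or a
     * date/time string in DATE_PATTERN and returns epoch milliseconds.
     */
    public static long parseTime(String value) throws IOException {
        try {
            // Keep the existing behaviour: a plain number is taken as ms.
            return Long.parseLong(value);
        } catch (NumberFormatException notMillis) {
            // Not a number, so try the date/time pattern below.
        }
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat sdf = new SimpleDateFormat(DATE_PATTERN);
        sdf.setLenient(false); // reject out-of-range fields such as month 13
        try {
            return sdf.parse(value).getTime();
        } catch (ParseException e) {
            throw new IOException(
                "Cannot parse time '" + value + "'; expected ms or " + DATE_PATTERN, e);
        }
    }

    /**
     * Hypothetical helper: the single combined check from the patch,
     * split so each failure gets its own message.
     */
    public static void checkTableMapping(String[] tablesToUse, String[] tableMap)
            throws IOException {
        if (tablesToUse == null || tableMap == null) {
            throw new IOException("No tables or no table mapping specified.");
        }
        if (tablesToUse.length != tableMap.length) {
            throw new IOException("Table mapping has " + tableMap.length
                + " entries but " + tablesToUse.length + " tables were specified.");
        }
    }

    public static void main(String[] args) throws IOException {
        // Both forms are accepted; the second depends on the default time zone.
        System.out.println(parseTime("1334000000000"));
        System.out.println(parseTime("2012-04-10T12:00:00"));
        checkTableMapping(new String[] {"t1"}, new String[] {"t1_copy"});
    }
}
```

With something along these lines, the usage text could advertise the start and end time options as "=ms or yyyy-MM-dd'T'HH:mm:ss" while still accepting the existing ms values unchanged.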