[ https://issues.apache.org/jira/browse/HBASE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221371#comment-13221371 ]
Zhihong Yu edited comment on HBASE-5509 at 3/3/12 12:18 AM:
------------------------------------------------------------

SnapshotUtilities.java is missing the license header and the class-level javadoc.

{code}
+  public static boolean sameFile(FileSystem srcfs, FileStatus srcstatus,
+      FileSystem dstfs, Path dstpath, boolean skipCRCCheck) throws IOException {
{code}
Is it possible to make the src and dst parameters use the same data type? Either FileStatus or Path.

For sameFile(), I think false should be returned for the dest file in the following case:
{code}
+    //return true if checksum is not supported
+    //(i.e. some of the checksums is null)
{code}

{code}
+  public static Path getPathInTrash(Path path, String hbaseUser,
+      FileSystem srcFileSys) throws IOException {
{code}
I think the FileSystem parameter should be placed first in the above method.

{code}
+    String trashPrefix = "/user/" + hbaseUser + "/.Trash";
{code}
I think the name of the trash folder should be made configurable.

For getStoreFileList():
{code}
+   * @param families
+   *          a comma separated list of column families for which we need to
{code}
I think List<String> may be a better data type for the families parameter. This would make the method more general, in that it would not be tied to the format of the user input.

{code}
+    long retryTimeInMins =
+      conf.getInt("hbase.backups.region.retryTimeInMins", 5) * 60 * 1000L;
{code}
Please rename the above variable: its value has been converted to milliseconds.

SnapshotMR.java is missing the license header.

was (Author: zhi...@ebaysf.com):
SnapshotUtilities.java misses license and javadoc for the class.

{code}
+  public static boolean sameFile(FileSystem srcfs, FileStatus srcstatus,
+      FileSystem dstfs, Path dstpath, boolean skipCRCCheck) throws IOException {
{code}
Is it possible to make the src and dst comply to same data type ? Either FileStatus or Path.

For sameFile(), I think false should be returned for dest file in the following case:
{code}
+    //return true if checksum is not supported
+    //(i.e. some of the checksums is null)
{code}

{code}
+  public static Path getPathInTrash(Path path, String hbaseUser,
+      FileSystem srcFileSys) throws IOException {
{code}
I think FileSystem parameter should be placed as first parameter for the above method.

> MR based copier for copying HFiles (trunk version)
> --------------------------------------------------
>
>                 Key: HBASE-5509
>                 URL: https://issues.apache.org/jira/browse/HBASE-5509
>             Project: HBase
>          Issue Type: Sub-task
>          Components: documentation, regionserver
>            Reporter: Karthik Ranganathan
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: 5509.txt
>
>
> This copier is a modification of the distcp tool in HDFS. It does the following:
> 1. List out all the regions in the HBase cluster for the required table
> 2. Write the above out to a file
> 3. Each mapper
>    3.1 lists all the HFiles for a given region by querying the regionserver
>    3.2 copies all the HFiles
>    3.3 outputs success if the copy succeeded, failure otherwise. Failed regions are retried in another loop
> 4. Mappers are placed on nodes which have maximum locality for a given region to speed up copying
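
Three of the review points (the unit in the retry variable's name, a List<String> families parameter, and returning false when a checksum is unavailable) could be addressed roughly as follows. This is only a sketch with no Hadoop or HBase dependencies: the class ReviewSketch and its methods are hypothetical names and are not part of the attached patch. Checksums are modelled as plain byte arrays here, whereas the real code would compare the FileChecksum objects returned by FileSystem.getFileChecksum().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the review suggestions; not code from 5509.txt.
public class ReviewSketch {

    // The configured value is in minutes; the returned value is in
    // milliseconds, and the method name says so.
    public static long retryTimeInMillis(int retryTimeInMins) {
        return TimeUnit.MINUTES.toMillis(retryTimeInMins);
    }

    // Parse the user-supplied comma-separated string once, at the edge,
    // so that methods such as getStoreFileList() can accept List<String>.
    public static List<String> parseFamilies(String commaSeparated) {
        List<String> families = new ArrayList<>();
        if (commaSeparated == null) {
            return families;
        }
        for (String name : commaSeparated.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                families.add(trimmed);
            }
        }
        return families;
    }

    // Conservative comparison: if either checksum is missing (null),
    // the files are NOT treated as identical.
    public static boolean sameChecksum(byte[] srcChecksum, byte[] dstChecksum) {
        if (srcChecksum == null || dstChecksum == null) {
            return false;
        }
        return Arrays.equals(srcChecksum, dstChecksum);
    }

    public static void main(String[] args) {
        System.out.println(retryTimeInMillis(5));
        System.out.println(parseFamilies("cf1, cf2,,"));
        System.out.println(sameChecksum(null, new byte[] {1}));
    }
}
```

Returning false for a missing checksum means a file may occasionally be copied again unnecessarily, but a stale destination file is never mistaken for an up-to-date copy.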