[ https://issues.apache.org/jira/browse/HDFS-12594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216518#comment-16216518 ]
Ewan Higgs commented on HDFS-12594:
-----------------------------------

Some minor things on a first pass:

{code}
+    if (getLastIndex() != -1) {
+      setLastIndex(-1);
+    }
{code}

Why not just set it unconditionally?

I think the basic design is a good approach, but it would be nicer to restructure it by acknowledging that we're making a cursor/iterator here. So the report request/response, currently as follows:

{code}
message GetSnapshotDiffReportListingRequestProto {
  required string snapshotRoot = 1;
  required string fromSnapshot = 2;
  required string toSnapshot = 3;
  required string startPath = 4;
  required int32 index = 5 [default = -1];
}
// ...
message SnapshotDiffReportListingProto {
  // full path of the directory where snapshots were taken
  repeated SnapshotDiffReportListingEntryProto modifiedEntries = 1;
  repeated SnapshotDiffReportListingEntryProto createdEntries = 2;
  repeated SnapshotDiffReportListingEntryProto deletedEntries = 3;
  required bytes startPath = 4;
  required int32 index = 5 [default = -1];
  required bool isFromEarlier = 6;
}
{code}

... could be:

{code}
message SnapshotDiffReportCursorProto {
  required string startPath = 4;
  required int32 index = 5 [default = -1];
}
message GetSnapshotDiffReportListingRequestProto {
  required string snapshotRoot = 1;
  required string fromSnapshot = 2;
  required string toSnapshot = 3;
  optional SnapshotDiffReportCursorProto cursor = 4;
}
// ...
message SnapshotDiffReportListingProto {
  // full path of the directory where snapshots were taken
  repeated SnapshotDiffReportListingEntryProto modifiedEntries = 1;
  repeated SnapshotDiffReportListingEntryProto createdEntries = 2;
  repeated SnapshotDiffReportListingEntryProto deletedEntries = 3;
  required bool isFromEarlier = 4;
  optional SnapshotDiffReportCursorProto cursor = 5;
}
{code}

Making a request with no cursor starts at the beginning.
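
To make the cursor idea concrete, a client-side paging loop might look roughly like the sketch below. This is only an illustration under stated assumptions: DiffCursor, DiffPage, getListing and fetchAll are hypothetical stand-ins for the generated protobuf classes and the RPC call, not anything in the patch or in HDFS, and the cursor index is simplified to an offset into a flat list of paths. The only behaviour taken from the proposal is that an absent cursor in the request means "start at the beginning"; treating an absent cursor in the response as "listing finished" is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for SnapshotDiffReportCursorProto.
final class DiffCursor {
    final String startPath;
    final int index;

    DiffCursor(String startPath, int index) {
        this.startPath = startPath;
        this.index = index;
    }
}

// Hypothetical stand-in for one page of SnapshotDiffReportListingProto.
final class DiffPage {
    final List<String> entries;
    final Optional<DiffCursor> next; // empty means there is nothing left to fetch

    DiffPage(List<String> entries, Optional<DiffCursor> next) {
        this.entries = entries;
        this.next = next;
    }
}

final class SnapshotDiffPagingSketch {
    // Fake "server": returns at most pageSize entries starting at the cursor.
    // An absent cursor means "start at the beginning".
    static DiffPage getListing(List<String> all, Optional<DiffCursor> cursor, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        int start = cursor.map(c -> c.index).orElse(0);
        int end = Math.min(start + pageSize, all.size());
        List<String> entries = new ArrayList<>(all.subList(start, end));
        Optional<DiffCursor> next = end < all.size()
            ? Optional.of(new DiffCursor(all.get(end), end))
            : Optional.empty();
        return new DiffPage(entries, next);
    }

    // Client loop: keep requesting pages until the response carries no cursor.
    static List<String> fetchAll(List<String> all, int pageSize) {
        List<String> result = new ArrayList<>();
        Optional<DiffCursor> cursor = Optional.empty(); // first request: no cursor
        do {
            DiffPage page = getListing(all, cursor, pageSize);
            result.addAll(page.entries);
            cursor = page.next;
        } while (cursor.isPresent());
        return result;
    }
}
```

With the cursor as a single optional field, the client never has to know about sentinel values such as index = -1; it just hands back whatever cursor the previous response carried.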
> SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC response limit
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12594
>                 URL: https://issues.apache.org/jira/browse/HDFS-12594
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>         Attachments: HDFS-12594.001.patch, HDFS-12594.002.patch, HDFS-12594.003.patch, SnapshotDiff_Improvemnets .pdf
>
>
> The snapshotDiff command fails if the snapshotDiff report size is larger than the configuration value of ipc.maximum.response.length, which is 128 MB by default.
> In the worst case, where every operation between the snapshots is a rename whose source and target names are both MAX_PATH_LEN (8k characters) long, the limit would be reached at roughly 8192 renames (128 MB / (2 x 8 KB) = 8192).
>
> SnapshotDiff is currently used by distcp to optimize copy operations, and if the diff report exceeds the limit, it fails with the exception below:
> Test set: org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 112.095 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport
> testDiffReportWithMillionFiles(org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport) Time elapsed: 111.906 sec <<< ERROR!
> java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "hw15685.local/10.200.5.230"; destination host is: "localhost":59808;
> Attached is the proposal for the changes required.