Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/13191 )

Change subject: [backup] Support partition alterations between Kudu backups
......................................................................


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/13191/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13191/1//COMMIT_MSG@19
PS1, Line 19: prevously
previously


http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala:

http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala@119
PS1, Line 119: catch {
             :   // Passthrough on NonCoveredRangeException. These are the expected way of
             :   // detecting a partition which was dropped between backups and filtering
             :   // out the rows from that dropped partition.
             :   case ncr: NonCoveredRangeException => Unit
             : }
Throwing and catching an exception isn't cheap, so this seems particularly expensive given it's on each row. If we're already poking holes in KuduPartitioner, perhaps we can create a cheaper way to check if a row is covered or not?


http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala:

http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala@330
PS1, Line 330: def getPartitionSchema(metadata: TableMetadataPB): PartitionSchema = {
Method name is kinda confusing given KuduTable.getPartitionSchema.

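To illustrate the KuduRestore.scala@119 comment about per-row exceptions, here is a minimal sketch of a boolean coverage check sitting next to an exception-based lookup. It assumes a sorted map from range start key to partition index with a sentinel for gaps, loosely modeled on the partitionByStartKey / NON_COVERED_RANGE_INDEX names visible in the patch. The class CoverageSketch, its methods, the int keys, and the IllegalStateException are all hypothetical stand-ins, not the KuduPartitioner API.

```java
import java.util.TreeMap;

/** Hypothetical stand-in for a partitioner; not the real KuduPartitioner. */
final class CoverageSketch {
  /** Hypothetical sentinel marking a gap between range partitions. */
  static final int NON_COVERED = -1;

  // Maps each range's start key to a partition index, or NON_COVERED for a gap.
  // Real partition keys are encoded bytes; ints keep the sketch short.
  private final TreeMap<Integer, Integer> byStartKey = new TreeMap<>();

  CoverageSketch() {
    // The whole key space starts out uncovered.
    byStartKey.put(Integer.MIN_VALUE, NON_COVERED);
  }

  /** Registers [start, end) as a partition; assumes ranges do not overlap. */
  void addRange(int start, int end, int index) {
    byStartKey.put(start, index);
    // A gap begins at 'end' unless another range already starts there.
    byStartKey.putIfAbsent(end, NON_COVERED);
  }

  /** Exception-based lookup: the caller must catch to skip uncovered rows. */
  int partitionOrThrow(int key) {
    int index = byStartKey.floorEntry(key).getValue();
    if (index == NON_COVERED) {
      throw new IllegalStateException("non-covered key: " + key);
    }
    return index;
  }

  /** Same lookup, but reports coverage as a boolean and never throws. */
  boolean isCovered(int key) {
    return byStartKey.floorEntry(key).getValue() != NON_COVERED;
  }
}
```

With something along these lines, the restore loop could filter rows with a plain `if (partitioner.isCovered(key))` and reserve the exception for genuinely unexpected cases.
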

http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-client/src/main/java/org/apache/kudu/client/KuduPartitioner.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduPartitioner.java:

http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-client/src/main/java/org/apache/kudu/client/KuduPartitioner.java@55
PS1, Line 55: Map<String, Partition> tabletIdToPartition) {
Nit: indentation


http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-client/src/main/java/org/apache/kudu/client/KuduPartitioner.java@61
PS1, Line 61: partitionByStartKey.put(EMPTY, NON_COVERED_RANGE_INDEX);
Why did you move this into the constructor? I think it'd be easier to follow if all of the work was done in one place rather than split between build() and the constructor.

Oh, now I get it: it's because the restore code creates a custom KuduPartitioner with its own metadata. That's somewhat icky, but I can't think of a good alternative.


http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-client/src/main/java/org/apache/kudu/client/KuduPartitioner.java@194
PS1, Line 194: String tabletId = new String(tablet.getTabletId(), UTF_8);
Could we avoid the conversion and retain the tablet ID as a byte array? Seems like that might be doable if we treat it as such in the protobuf.


http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-client/src/main/java/org/apache/kudu/client/PartitionSchema.java
File java/kudu-client/src/main/java/org/apache/kudu/client/PartitionSchema.java:

http://gerrit.cloudera.org:8080/#/c/13191/1/java/kudu-client/src/main/java/org/apache/kudu/client/PartitionSchema.java@44
PS1, Line 44: @InterfaceAudience.LimitedPrivate("Impala")
            : @InterfaceStability.Unstable
Do these annotations automatically apply to nested classes too?

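On the nested-class question above, a small reflection sketch may help frame it. As far as the Java reflection API is concerned, an annotation on an enclosing class is not reported on its nested classes, so a tool has to walk outward itself if it wants that behavior. The @Unstable annotation and the class names below are hypothetical stand-ins for the real audience/stability markers, and the sketch says nothing about how the project's documentation tooling actually treats nested classes.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

/** Hypothetical marker standing in for an audience/stability annotation. */
@Retention(RetentionPolicy.RUNTIME)
@interface Unstable {}

@Unstable
class OuterSketch {
  /** Nested class with no annotation of its own. */
  static class Nested {}

  /** Nested class that repeats the annotation explicitly. */
  @Unstable
  static class AnnotatedNested {}
}

final class AnnotationScopeSketch {
  /** True only if the class itself (or a superclass, via @Inherited) is marked. */
  static boolean isMarked(Class<?> c) {
    return c.isAnnotationPresent(Unstable.class);
  }

  /** Walks outward through enclosing classes, which reflection does not do for us. */
  static boolean isMarkedOrEnclosed(Class<?> c) {
    for (Class<?> k = c; k != null; k = k.getEnclosingClass()) {
      if (k.isAnnotationPresent(Unstable.class)) {
        return true;
      }
    }
    return false;
  }
}
```

Here `isMarked(OuterSketch.Nested.class)` is false while `isMarkedOrEnclosed(OuterSketch.Nested.class)` is true, so annotating the nested classes explicitly is the unambiguous option if the tooling only does the former.
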

--
To view, visit http://gerrit.cloudera.org:8080/13191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I31e0eb27f163c38840e5466ff85d0b4a44d4ec0a
Gerrit-Change-Number: 13191
Gerrit-PatchSet: 1
Gerrit-Owner: Grant Henke <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <[email protected]>
Gerrit-Comment-Date: Wed, 01 May 2019 04:26:41 +0000
Gerrit-HasComments: Yes
