Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/10375 )
Change subject: WIP: Kudu Backup/Restore Spark Jobs
......................................................................

Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@65
PS8, Line 65: Serialization.write(metadata, out)
> +1 to defining the schema in protobuf.
I will define this in Protobuf and use the generic PB-to-JSON conversion.

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala:

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala@30
PS8, Line 30: Address
> nit: maybe we should name this "addresses"
Done

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala@37
PS8, Line 37: timestamp
> nit: UNIX timestamp in milliseconds since the epoch
Done

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala:

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala@33
PS8, Line 33: @transient val table
> how does this get reconstructed after deserialization?
It doesn't. It is marked @transient so that it is never serialized or deserialized, which means it can only be used on the driver. That is fine, because it is only needed on the driver.
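
The @transient behavior described in the reply above can be shown in isolation. The following is a minimal sketch, not the code from the patch: DriverOnlyHandle, SketchRDD, and TransientSketch are hypothetical names, and plain Java serialization stands in for whatever serializer Spark is configured to use when it ships a task to an executor.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.io.{ObjectInputStream, ObjectOutputStream}

// Hypothetical stand-in for a driver-only handle (for example a table object).
// It is deliberately NOT Serializable.
class DriverOnlyHandle(val name: String)

// Hypothetical stand-in for an RDD-like class that is shipped to executors.
// The @transient field is skipped when the instance is serialized.
class SketchRDD(@transient val table: DriverOnlyHandle, val tableName: String)
    extends Serializable

object TransientSketch {
  // Round-trips a value through Java serialization, roughly what happens
  // when an object travels from the driver to an executor.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T]
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = new SketchRDD(new DriverOnlyHandle("t"), "t")
    val copy = roundTrip(original)
    println(copy.tableName)     // the ordinary field survives the round trip
    println(copy.table == null) // the transient field is not reconstructed
  }
}
```

In this sketch the non-serializable handle does not cause a NotSerializableException because the field is skipped entirely, and the deserialized copy sees null. Any code that runs on an executor therefore has to avoid touching such a field or rebuild the handle itself.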

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala@65
PS8, Line 65: Array(leaderLocation)
> Hmm, we may want to simply use token.getTablet.getReplicas because the lead
This will change shortly, when I add support for backups from the followers.

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala@89
PS8, Line 89: efficiency
> what kind of efficiency do you mean?
I mean avoiding decoding and re-encoding the rows. We could consider writing out the raw "row" bytes instead of the typed data.

--
To view, visit http://gerrit.cloudera.org:8080/10375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
Gerrit-Change-Number: 10375
Gerrit-PatchSet: 8
Gerrit-Owner: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Tue, 22 May 2018 19:07:44 +0000
Gerrit-HasComments: Yes