Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/10375 )
Change subject: WIP: Kudu Backup/Restore Spark Jobs
......................................................................

Patch Set 8:

(1 comment)

I haven't reviewed the code at all but I wanted to comment on one of Mike's suggestions.

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
File java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala:

http://gerrit.cloudera.org:8080/#/c/10375/8/java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala@65
PS8, Line 65: Serialization.write(metadata, out)
> Rather than use Spark serialization based on POJO (POSO?) definitions for m
+1 to defining the schema in protobuf. However, I don't think we should use SchemaPB/PartitionSchemaPB directly.

I think there's value in this code being as "third party" as possible. Meaning, as much as it makes sense, it shouldn't leverage Kudu internals, and it should use Kudu public APIs. That separation means this could serve as a reference implementation and can be ported to other execution frameworks with a minimum of fuss. Plus actually using internal Kudu protobufs would probably require poking holes in our shading scheme.

Along the same lines, I think a container format is unnecessary, and I certainly wouldn't want us to port PBC to Java. I think we can get by with a simple pb to JSON conversion; protobuf provides utilities that do this.
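To illustrate the pb to JSON suggestion, here is a minimal sketch using `JsonFormat` from the `protobuf-java-util` library. It is an assumption-laden example, not the patch's code: the object name `PbJsonSketch` and the field names are hypothetical, and the well-known `Struct` type stands in for a dedicated backup-metadata message that would still need to be defined in a `.proto` file.

```scala
import com.google.protobuf.{MessageOrBuilder, Struct, Value}
import com.google.protobuf.util.JsonFormat

// Hypothetical sketch: requires protobuf-java and protobuf-java-util on the classpath.
object PbJsonSketch {

  // Stand-in for a backup-metadata message. A real implementation would
  // use a class generated from its own .proto definition instead of Struct.
  def sampleMetadata(): Struct =
    Struct.newBuilder()
      .putFields("tableName", Value.newBuilder().setStringValue("my_table").build())
      .putFields("numReplicas", Value.newBuilder().setNumberValue(3.0).build())
      .build()

  // Serialize any protobuf message to its JSON representation.
  def toJson(msg: MessageOrBuilder): String =
    JsonFormat.printer().print(msg)

  // Parse JSON back into a message by merging it into a fresh builder.
  def fromJson(json: String): Struct = {
    val builder = Struct.newBuilder()
    JsonFormat.parser().merge(json, builder)
    builder.build()
  }

  def main(args: Array[String]): Unit = {
    val original = sampleMetadata()
    val json = toJson(original)
    println(json)
    // The round trip should reproduce an equal message.
    assert(fromJson(json) == original)
  }
}
```

With a generated message class in place of `Struct`, the same `printer()` and `parser()` calls would apply, and the JSON text could be written to the output stream directly without a container format.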
-- 
To view, visit http://gerrit.cloudera.org:8080/10375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
Gerrit-Change-Number: 10375
Gerrit-PatchSet: 8
Gerrit-Owner: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Fri, 18 May 2018 03:59:40 +0000
Gerrit-HasComments: Yes