Grant Henke has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10375 )
Change subject: Kudu Backup/Restore Spark Jobs
......................................................................

Kudu Backup/Restore Spark Jobs

Adds a rough base implementation of Kudu backup and restore Spark jobs. There are many TODOs indicating gaps, and more testing and details remain to be finished. However, these base jobs work and are in a functional state that can be committed and iterated on as we build up and improve our backup functionality. These jobs, as annotated, should be considered private, unstable, and experimental.

The backup job can output the data of one or more tables to any Spark-compatible path in any Spark-compatible format, the defaults being HDFS and Parquet. Each table’s data is written in a subdirectory of the provided path. The subdirectory’s name is the URL-encoded table name. Additionally, in each table’s directory a JSON metadata file is output with the metadata needed to recreate the exported table when restoring.

The restore job can read the generated data and metadata, create “restore” tables with a matching schema, and reload the data.

The job arguments are a work in progress and will likely be enhanced and simplified as we find what is useful and what isn’t through performance and functional testing. More documentation will be generated when the jobs are ready for general use.
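The directory layout described above can be sketched as follows. This is a minimal illustration only, not the code from this change: the class BackupPaths, its method names, and the metadata file name "metadata.json" are hypothetical, since the description does not name the file or the helpers. It assumes the table name is encoded with the JDK's java.net.URLEncoder using UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper; names here are illustrative and not taken from the change.
final class BackupPaths {
    // Assumed file name: the change description does not say what the JSON file is called.
    static final String METADATA_FILE = "metadata.json";

    // URL-encode a table name so it is safe to use as a single directory name.
    static String encodeTableName(String tableName) {
        try {
            return URLEncoder.encode(tableName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reverse the encoding, e.g. when a restore job lists the backup directories.
    static String decodeTableName(String dirName) {
        try {
            return URLDecoder.decode(dirName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // <rootPath>/<url-encoded table name>
    static String tableDir(String rootPath, String tableName) {
        String root = rootPath.endsWith("/")
            ? rootPath.substring(0, rootPath.length() - 1)
            : rootPath;
        return root + "/" + encodeTableName(tableName);
    }

    // <rootPath>/<url-encoded table name>/<metadata file>
    static String metadataPath(String rootPath, String tableName) {
        return tableDir(rootPath, tableName) + "/" + METADATA_FILE;
    }

    public static void main(String[] args) {
        String root = "hdfs:///kudu-backups";       // example path, not from the change
        String table = "impala::default.my table";  // example table name
        System.out.println(tableDir(root, table));
        System.out.println(metadataPath(root, table));
        System.out.println(decodeTableName(encodeTableName(table)));
    }
}
```

Encoding the name keeps characters such as ":" and "/" from being interpreted as path syntax, and decoding the directory name recovers the original table name on the restore side.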

Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
Reviewed-on: http://gerrit.cloudera.org:8080/10375
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mpe...@apache.org>
---
M java/gradle/dependencies.gradle
A java/kudu-backup/build.gradle
A java/kudu-backup/pom.xml
A java/kudu-backup/src/main/protobuf/backup.proto
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackup.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupOptions.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduBackupRDD.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestore.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/KuduRestoreOptions.scala
A java/kudu-backup/src/main/scala/org/apache/kudu/backup/TableMetadata.scala
A java/kudu-backup/src/test/resources/log4j.properties
A java/kudu-backup/src/test/scala/org/apache/kudu/backup/TestKuduBackup.scala
M java/kudu-client/src/main/java/org/apache/kudu/Type.java
M java/kudu-client/src/test/java/org/apache/kudu/client/BaseKuduTest.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestUtils.java
M java/kudu-spark/src/test/scala/org/apache/kudu/spark/kudu/TestContext.scala
M java/pom.xml
M java/settings.gradle
18 files changed, 1,594 insertions(+), 7 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/10375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If02183a2f833ffa0225eb7b0a35fc7531109e6f7
Gerrit-Change-Number: 10375
Gerrit-PatchSet: 21
Gerrit-Owner: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Grant Henke <granthe...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mpe...@apache.org>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>