Hi Dev,

*Please find the initial solution below.*


*CarbonData table backup and recovery*

*Background*

A customer has created a CarbonData table and already loaded a very large
amount of data into it. They are now installing another cluster that should
use the same data, and they do not want to load it again because loading
takes a long time. Instead, they want to back up the table data directly and
recover it in the other cluster. After recovery, the user can use the data
as a normal CarbonData table.

*Requirement Description*

It should be possible to back up a CarbonData table's data and recover it
without loading the data again.

To reuse a CarbonData table from another cluster, a DDL command should be
provided that creates the CarbonData table from the existing carbon table
schema.

*Solution*

Currently CarbonData has the following types of tables:

1.   Normal table

2.   Pre Aggregate table

CarbonData should provide a DDL command to create a table from existing
table data. The following DDL command could be used:

*  REGISTER TABLES FROM $dbPath*



           i.   The database path will be scanned to get all table schemas.

           ii.  Each schema will be read to get the database name, table name
and column details.

           iii. Each table will be registered in the Hive catalog with the
details below:

    CREATE TABLE $tbName USING carbondata OPTIONS (
      tableName "$dbName.$tbName",
      dbName "$dbName",
      tablePath "$tablePath",
      path "$tablePath")
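
The three steps above can be sketched roughly as follows. This is only an illustration, not the proposed implementation. It assumes each table directory under the database path holds its schema at Metadata/schema, and it takes the database and table names from the directory names instead of parsing the schema file, which the actual command would do. The function names are hypothetical.

```python
from pathlib import Path


def find_table_paths(db_path):
    """Step i: return the table directories under db_path that have a schema file.

    Assumption: the schema file lives at <tablePath>/Metadata/schema.
    """
    db = Path(db_path)
    return sorted(p for p in db.iterdir() if (p / "Metadata" / "schema").is_file())


def create_table_statement(db_name, tb_name, table_path):
    """Step iii: build the CREATE TABLE statement used for registration."""
    return (
        f"CREATE TABLE {tb_name} USING carbondata OPTIONS ("
        f'tableName "{db_name}.{tb_name}", '
        f'dbName "{db_name}", '
        f'tablePath "{table_path}", '
        f'path "{table_path}")'
    )


def register_statements(db_path):
    """Steps i to iii for one database path.

    Simplification (assumption): step ii is replaced by using the directory
    names as database and table names; the schema file is not parsed here.
    """
    db = Path(db_path)
    return [
        create_table_statement(db.name, table.name, table.as_posix())
        for table in find_table_paths(db)
    ]
```

Each returned statement would then be run against the Hive catalog; directories without a schema file are skipped.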


*Precondition:*

i.        Before executing this command, the old table's schema and data
must be copied to the new store location.

ii.      If the table has aggregate tables, all of them must also be
copied to the new store location.



*Validation:*


   1.    If the database does not exist, the registration will fail.
   2.   The table will be registered only if a table with the same name is
   not already registered.
   3.   If the table has aggregate tables, all of them should be registered
   in the Hive catalog; if any aggregate table does not exist, the table
   creation operation should fail.
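
The validation rules above could be checked along these lines. This is a minimal sketch that uses plain in-memory sets in place of the real catalog and store. All names are hypothetical, and skipping an already registered table instead of raising an error is an assumption, since the proposal does not say which should happen.

```python
class RegistrationError(Exception):
    """Raised when a table cannot be registered."""


def validate_registration(db_name, tb_name, aggregate_tables,
                          existing_databases, registered_tables, copied_tables):
    """Return True if the table should be registered, False if it is skipped.

    registered_tables holds (db, table) pairs already in the catalog;
    copied_tables holds (db, table) pairs found in the new store location.
    """
    # Rule 1: the database must already exist.
    if db_name not in existing_databases:
        raise RegistrationError(f"database {db_name} does not exist")

    # Rule 2: register only if the same table name is not already registered.
    # Assumption: an existing table is skipped silently, not treated as an error.
    if (db_name, tb_name) in registered_tables:
        return False

    # Rule 3: every aggregate table of this table must be present.
    missing = [agg for agg in aggregate_tables
               if (db_name, agg) not in copied_tables]
    if missing:
        raise RegistrationError(
            f"aggregate tables missing for {db_name}.{tb_name}: {missing}")

    return True
```

A caller would run this check for each table before issuing the CREATE TABLE statement for it and for its aggregate tables.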

Regards,

Shahid
