Re: DDL for CarbonData table backup and recovery (new feature)

2017-11-28 Thread mohdshahidkhan

Hi Dev,
Table-level registration should also be supported.
-- Register a carbon table at table level:

  *REGISTER TABLE $tbName;*

Use case:
If a user has 10 tables but wants to register only 2 or 3 of them, not all.
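
As a minimal sketch, the proposed table-level DDL could be invoked from a
carbon session like below (the session variable and the table names t1 and
t3 are illustrative, not part of the design):

  // Register only the two tables we need from the copied store;
  // the remaining tables stay unregistered.
  carbon.sql("REGISTER TABLE t1")
  carbon.sql("REGISTER TABLE t3")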

 
Regards,
Shahid





Re: DDL for CarbonData table backup and recovery (new feature)

2017-11-26 Thread Mohammad Shahid Khan
Hi Ravindra & Likun,
I am freezing the design and going to start coding.
Please revert if there are any issues.

--Regards,
   Shahid

On Fri, Nov 24, 2017 at 12:40 PM, mohdshahidkhan <
mohdshahidkhan1...@gmail.com> wrote:

> *Updated solution:
> Instead of passing the dbLocation, the database name will be passed in the
> REGISTER DDL.*
> CarbonData table backup and recovery
> Background
> A customer has created a CarbonData table into which a very large amount
> of data has already been loaded. They are now installing another cluster
> that should use the same data, and they do not want to load it again
> because loading takes a long time. They therefore want to back up this
> table's data directly and recover it in the other cluster. After recovery,
> the user can work with the data as a normal CarbonData table.
> Requirement Description
> It should be possible to back up a CarbonData table's data and recover it
> without loading the data again.
> To reuse a CarbonData table from another cluster, a DDL command should be
> provided that creates the CarbonData table from the existing carbon table
> schema.
> Solution
> Currently CarbonData has the below types of tables:
> 1.   Normal table
> 2.   Pre-aggregate table
> CarbonData should provide a DDL command to create a table from existing
> table data. The below DDL command could be used:
>
>   REGISTER TABLES FOR $DBName;
>
>    i.  The database path will be retrieved from the hive catalog, and
>        the database path will be scanned to get all tables.
>   ii.  The table schema will be read to get the column details.
>  iii.  The table will be registered to the hive catalog with the below
> details:
> CREATE TABLE $tbName USING carbondata OPTIONS (tableName "$dbName.$tbName",
> dbName "$dbName",
> tablePath "$tablePath",
> path "$tablePath" )
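>
> As a sketch, step iii could look roughly like the Scala below (a minimal
> sketch only; dbLocation, tableNames and sparkSession are illustrative
> names, not the final implementation):
>
>   // For each table found under the database location, register it
>   // with the hive catalog via the existing CREATE TABLE syntax.
>   tableNames.foreach { tbName =>
>     val tablePath = s"$dbLocation/$tbName"
>     sparkSession.sql(
>       s"""CREATE TABLE $tbName USING carbondata OPTIONS (
>          |  tableName "$dbName.$tbName",
>          |  dbName "$dbName",
>          |  tablePath "$tablePath",
>          |  path "$tablePath")""".stripMargin)
>   }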
>
> Precondition:
>    i. The user has to create the database, and before executing this
>       command the old table schema and data should be copied into the
>       database location.
>   ii. If the table has aggregate tables, then all the aggregate tables
>       should also be copied into the database location.
>
> Validation:
>    1. If the database does not exist, the registration will fail.
>    2. The table will be registered only if the same table name is not
>       already registered.
>    3. If the table has aggregate tables, then all the aggregate tables
>       should be registered to the hive catalog, and if any of the
>       aggregate tables does not exist, the table creation operation
>       should fail.
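>
> A minimal sketch of checks 1 and 2 against the Spark catalog
> (databaseExists and tableExists are existing Catalog methods; the
> surrounding variable names are illustrative):
>
>   // Fail fast before registering anything with the hive catalog.
>   require(sparkSession.catalog.databaseExists(dbName),
>     s"Database $dbName does not exist; create it before registering")
>   require(!sparkSession.catalog.tableExists(dbName, tbName),
>     s"Table $dbName.$tbName is already registered")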


Re: DDL for CarbonData table backup and recovery (new feature)

2017-11-26 Thread mohdshahidkhan
Thanks for the clarification, Naresh.
Please find my answers inline.

Actually, if the export command is on a CarbonData table, we can just zip
the actual table folder & the associated agg table folders into a
user-mentioned location. It doesn't export metadata.
Copying data from one cluster to the other will still remain the same in
your approach also.
A. Agreed, we don't want to export the data; the user simply has the tables
from the previous cluster and wants to use them, so they need to be
registered with hive.

After copying data into the new cluster, how do we synchronize incremental
loads or schema evolution from the old cluster to the new cluster?
Do we need to drop the table in the new cluster, copy the data from the old
cluster to the new cluster & recreate the table again?
A. Sync from the old cluster to the new one is not in scope.

I think creating a carbondata table requires the schema information also to
be passed.
CREATE TABLE $dbName.$tbName (${ fields.map(f => f.rawSchema).mkString(",")
}) USING CARBONDATA OPTIONS (tableName "$tbName", dbName "$dbName",
tablePath "$tablePath")
A. Agreed, we will take the same approach.
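
A rough Scala sketch of the schema-inclusive variant agreed above (fields
and rawSchema follow the snippet quoted from Naresh; the other names are
illustrative):

  // Build the column list from the schema read off disk, then
  // register the table with explicit columns.
  val columns = fields.map(f => f.rawSchema).mkString(",")
  sparkSession.sql(
    s"""CREATE TABLE $dbName.$tbName ($columns)
       |USING CARBONDATA OPTIONS (
       |  tableName "$tbName",
       |  dbName "$dbName",
       |  tablePath "$tablePath")""".stripMargin)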






Re: DDL for CarbonData table backup and recovery (new feature)

2017-11-23 Thread Naresh P R
Thanks for the clarification, Shahid. Much appreciated.

Actually, if the export command is on a CarbonData table, we can just zip
the actual table folder & the associated agg table folders into a
user-mentioned location. It doesn't export metadata.
Copying data from one cluster to the other will still remain the same in
your approach also.

After copying data into the new cluster, how do we synchronize incremental
loads or schema evolution from the old cluster to the new cluster?
Do we need to drop the table in the new cluster, copy the data from the old
cluster to the new cluster & recreate the table again?

I think creating a carbondata table requires the schema information also to
be passed.
CREATE TABLE $dbName.$tbName (${ fields.map(f => f.rawSchema).mkString(",")
}) USING CARBONDATA OPTIONS (tableName "$tbName", dbName "$dbName",
tablePath "$tablePath")
---
Regards,
Naresh P R


On Fri, Nov 24, 2017 at 10:02 AM, mohdshahidkhan <
mohdshahidkhan1...@gmail.com> wrote:

> Hi Naresh,
> Hive export exports the metadata as well as the table data.
> We do not want to export the table data, as that would be tedious for TBs
> of data.
> We have the table and the table data in the store location, but the table
> is not registered with the hive metastore.
>
> Regards,
> Shahid


Re: DDL for CarbonData table backup and recovery (new feature)

2017-11-23 Thread mohdshahidkhan
Hi Naresh,
Hive export exports the metadata as well as the table data.
We do not want to export the table data, as that would be tedious for TBs
of data.
We have the table and the table data in the store location, but the table
is not registered with the hive metastore.

Regards,
Shahid


