[jira] [Updated] (HAWQ-760) Hawq register

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-760:
-
Description: 
Scenarios: 
1. Register a Parquet file generated by another system, such as Hive or Spark.
2. Cluster disaster recovery. Two clusters co-exist, and data is periodically 
imported from Cluster A to Cluster B. The imported data needs to be registered 
in Cluster B.
3. Table rollback. Checkpoints are taken at some point, and the table needs to 
be rolled back to a previous checkpoint. 

Usage 1
Description
Register a file or a folder to an existing table. When registering a file, the 
eof of the file can be specified; if eof is not specified, the actual file size 
is used. When registering a folder, the actual file sizes are always used.
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-f filepath] [-e eof]
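
The eof rule described for Usage 1 can be sketched in Python as follows. This is only an illustration, not the hawq register implementation: the function name resolve_eofs is hypothetical, and the local filesystem stands in for HDFS.

```python
import os
from typing import Dict, Optional


def resolve_eofs(path: str, eof: Optional[int] = None) -> Dict[str, int]:
    """Return {file path: eof to record} following the Usage 1 rules.

    Assumption: the local filesystem stands in for HDFS, and the
    function name is hypothetical.
    """
    if os.path.isdir(path):
        # Folder: every file is registered with its actual size,
        # so an eof passed by the caller is ignored.
        sizes = {}
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            if os.path.isfile(full):
                sizes[full] = os.path.getsize(full)
        return sizes
    # Single file: an explicit eof wins, otherwise use the actual size.
    return {path: eof if eof is not None else os.path.getsize(path)}
```

For a single file, resolve_eofs(path, eof=4) records 4, while resolve_eofs(path) falls back to the size on disk.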


Usage 2
Description
Register according to a .yml configuration file. 
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c config] [--force] [--repair]

Behavior:
1. If the table does not exist, hawq register automatically creates the table 
and registers the files listed in the .yml configuration file. The file sizes 
specified in the .yml file are used to update the catalog table. 

2. If the table already exists and neither --force nor --repair is specified, 
no table is created and the files listed in the .yml file are registered 
directly to the table. Note that if a file is already under the table directory 
in HDFS, an error is thrown stating that to-be-registered files must not be 
under the table path.

3. If the table already exists and --force is specified, all catalog contents 
in pg_aoseg.pg_paqseg_$relid are cleared while the files on HDFS are kept, and 
then all the files are re-registered to the table. This is for scenario 2.

4. If the table already exists and --repair is specified, both the file folder 
and the catalog table pg_aoseg.pg_paqseg_$relid are changed to the state 
described by the .yml file. Note that files generated after the checkpoint may 
be deleted. Also note that all the files listed in the .yml file must be under 
the table folder on HDFS. Limitation: hash table redistribution, table 
truncate, and table drop are not supported. This is for scenario 3.
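
The four behaviors above amount to a small decision table plus a path check. A minimal Python sketch, under stated assumptions: the function names and action labels are hypothetical, and treating --force and --repair as mutually exclusive is an assumption that the description does not state.

```python
import posixpath
from typing import Iterable, List


def plan_registration(table_exists: bool, force: bool = False,
                      repair: bool = False) -> str:
    """Map the table state and flags to one of the four behaviors."""
    if force and repair:
        # Assumption: the description does not say what happens here.
        raise ValueError("--force and --repair are assumed mutually exclusive")
    if not table_exists:
        return "create_table_and_register"      # behavior 1
    if force:
        return "clear_catalog_and_reregister"   # behavior 3
    if repair:
        return "restore_to_yml_state"           # behavior 4
    return "register_only"                      # behavior 2


def is_under(table_dir: str, file_path: str) -> bool:
    """True if file_path lies inside table_dir (HDFS-style POSIX paths)."""
    base = posixpath.normpath(table_dir).rstrip("/") + "/"
    return posixpath.normpath(file_path).startswith(base)


def misplaced_files(action: str, table_dir: str,
                    files: Iterable[str]) -> List[str]:
    """Return the files that violate the path rule for the given action."""
    if action == "register_only":
        # Behavior 2: files must NOT already be under the table path.
        return [f for f in files if is_under(table_dir, f)]
    if action == "restore_to_yml_state":
        # Behavior 4: every file must be under the table folder.
        return [f for f in files if not is_under(table_dir, f)]
    return []
```

A caller would first pick the action with plan_registration and then reject the request if misplaced_files returns a non-empty list.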

Requirements for both cases:
1. The files to be registered must be located in the same HDFS cluster as HAWQ.
2. If the target is a hash table, the number of registered files must be equal 
to, or a multiple of, the hash table bucket number.
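
The hash table requirement reduces to a divisibility check. A minimal sketch in Python, with a hypothetical function name:

```python
def file_count_matches_buckets(file_count: int, bucket_number: int) -> bool:
    """Check that file_count is the bucket number or a multiple of it."""
    if bucket_number <= 0:
        raise ValueError("bucket_number must be positive")
    # At least one file is needed, and the count must divide evenly.
    return file_count > 0 and file_count % bucket_number == 0
```

For example, with 6 buckets, 6 or 12 files pass the check and 8 files do not.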

  was: Users sometimes want to register data files generated by other systems, 
such as Hive, into HAWQ. We should add a register function that supports 
registering file(s) generated by other systems, such as Hive, into HAWQ, so 
that users can conveniently integrate their external file(s) into HAWQ.


> Hawq register
> -
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: Command Line Tools
> Reporter: Yangcheng Luo
> Assignee: Lili Ma
> Fix For: backlog
>
>

[jira] [Updated] (HAWQ-760) Hawq register

2016-07-15 Thread Goden Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Goden Yao updated HAWQ-760:
---
Fix Version/s: backlog (was: 2.0.0.0-incubating)

> Hawq register
> -
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: Command Line Tools
> Reporter: Yangcheng Luo
> Assignee: Lili Ma
> Fix For: backlog
>
>
> Users sometimes want to register data files generated by other systems, such 
> as Hive, into HAWQ. We should add a register function that supports 
> registering file(s) generated by other systems, such as Hive, into HAWQ, so 
> that users can conveniently integrate their external file(s) into HAWQ.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-760) Hawq register

2016-05-27 Thread Yangcheng Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yangcheng Luo updated HAWQ-760:
---
Description: Users sometimes want to register data files generated by other 
systems, such as Hive, into HAWQ. We should add a register function that 
supports registering file(s) generated by other systems, such as Hive, into 
HAWQ, so that users can conveniently integrate their external file(s) into 
HAWQ.  (was: Add a register function that supports registering file(s) 
generated by other systems, such as Hive, into HAWQ, so that users can 
conveniently integrate their external file(s) into HAWQ.)

> Hawq register
> -
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
> Issue Type: New Feature
> Components: Command Line Tools
> Reporter: Yangcheng Luo
> Assignee: Lei Chang
>
> Users sometimes want to register data files generated by other systems, such 
> as Hive, into HAWQ. We should add a register function that supports 
> registering file(s) generated by other systems, such as Hive, into HAWQ, so 
> that users can conveniently integrate their external file(s) into HAWQ.


