[jira] [Updated] (HAWQ-806) Add feature test for subplan with new framework

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-806:
-
Issue Type: Sub-task  (was: Test)
Parent: HAWQ-832

> Add feature test for subplan with new framework
> ---
>
> Key: HAWQ-806
> URL: https://issues.apache.org/jira/browse/HAWQ-806
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: hongwu
>Assignee: hongwu
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-807) Add feature test for gpcopy with new framework

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-807:
-
Issue Type: Sub-task  (was: Test)
Parent: HAWQ-832

> Add feature test for gpcopy with new framework
> --
>
> Key: HAWQ-807
> URL: https://issues.apache.org/jira/browse/HAWQ-807
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: hongwu
>Assignee: hongwu
>






[jira] [Updated] (HAWQ-802) Refactor alter owner installcheck case using new test framework

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-802:
-
Issue Type: Sub-task  (was: New Feature)
Parent: HAWQ-832

> Refactor alter owner installcheck case using new test framework
> ---
>
> Key: HAWQ-802
> URL: https://issues.apache.org/jira/browse/HAWQ-802
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Yi Jin
>Assignee: Yi Jin
> Fix For: 2.0.0
>
>






[jira] [Updated] (HAWQ-818) Add order by for aggregate feature test.

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-818:
-
Issue Type: Sub-task  (was: Bug)
Parent: HAWQ-832

> Add order by for aggregate feature test.
> 
>
> Key: HAWQ-818
> URL: https://issues.apache.org/jira/browse/HAWQ-818
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
>
> The feature test should add ORDER BY to its queries so results can be matched exactly.





[jira] [Updated] (HAWQ-828) Remove deprecated aggregate, groupingsets test cases in installcheck-good

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-828:
-
Issue Type: Sub-task  (was: Test)
Parent: HAWQ-832

> Remove deprecated aggregate, groupingsets test cases in installcheck-good
> -
>
> Key: HAWQ-828
> URL: https://issues.apache.org/jira/browse/HAWQ-828
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
>






[jira] [Closed] (HAWQ-828) Remove deprecated aggregate, groupingsets test cases in installcheck-good

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma closed HAWQ-828.

Resolution: Fixed

> Remove deprecated aggregate, groupingsets test cases in installcheck-good
> -
>
> Key: HAWQ-828
> URL: https://issues.apache.org/jira/browse/HAWQ-828
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
>






[jira] [Updated] (HAWQ-834) Refactor goh_portals checkinstall cases using new test framework.

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-834:
-
Issue Type: Sub-task  (was: Test)
Parent: HAWQ-832

> Refactor goh_portals checkinstall cases using new test framework.
> -
>
> Key: HAWQ-834
> URL: https://issues.apache.org/jira/browse/HAWQ-834
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
>
> Refactor goh_portals with the new test framework.





[jira] [Updated] (HAWQ-808) Add feature test for external_oid with new framework

2016-07-08 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-808:
-
Issue Type: Sub-task  (was: Test)
Parent: HAWQ-832

> Add feature test for external_oid with new framework
> 
>
> Key: HAWQ-808
> URL: https://issues.apache.org/jira/browse/HAWQ-808
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Tests
>Reporter: hongwu
>Assignee: hongwu
>






[jira] [Created] (HAWQ-910) "hawq register": before registration, need check the consistency between the file and HAWQ table

2016-07-08 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-910:


 Summary: "hawq register": before registration, need check the 
consistency between the file and HAWQ table
 Key: HAWQ-910
 URL: https://issues.apache.org/jira/browse/HAWQ-910
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Storage
Reporter: Lili Ma
Assignee: Lei Chang


As a user,
I want to be notified during registration when the file being uploaded is not 
consistent with the table I want to register it to,
so that I can make the corresponding modifications as early as possible.





[jira] [Assigned] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-13 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma reassigned HAWQ-256:


Assignee: Lili Ma  (was: Lei Chang)

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: Wish
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-13 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376213#comment-15376213
 ] 

Lili Ma commented on HAWQ-256:
--

We can extend the work in the following areas:
1. User creation responsibility. Who is responsible for creating users: HAWQ, 
Ranger, or both?
2. Privilege granting. Who does this: HAWQ, Ranger, or both?
3. Authorization. Does HAWQ call the Ranger REST API? What do we do with the 
returned value (true/false)? If true, we permit the user to access the object; 
if false, shall we fall back to HAWQ's own authorization check? And how should 
we handle the case where Ranger is down?
4. User list registration. If we already have a HAWQ user list, shall we 
register it with Ranger? Is registering through LDAP feasible 
(HAWQ-LDAP-Ranger)?
5. HAWQ GRANT SQL functionality and its internal implementation. What kinds of 
objects (table/column?/database?) and what actions (insert/select/drop) do we 
support?

We can investigate further and look for solutions in the following areas:
Other systems' Ranger integrations, for example Hive and HBase.
HAWQ's internal grant function implementation.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: Wish
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lei Chang
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Updated] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-13 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-256:
-
Issue Type: New Feature  (was: Wish)

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376472#comment-15376472
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco] Thanks for your answer :)

1. Yes, it's good for Ranger to import the user list from the component. The 
reason I raised this question is that I noticed Ranger provides an "Add New 
User" function under the "Settings/Users/Groups" tab. Does that mean Ranger 
also supports creating users in Ranger itself?
2. Granting privileges from just one side is relatively easy and clear. What we 
need to discuss is which side we allow to grant privileges: HAWQ or Ranger? As 
you said, the HAWQ side is a good choice since there is no change in admin 
behavior.
3. I also think it would be simpler if we did not have to consider the cases 
where Ranger is down or not installed. What about scenarios where users do not 
intend to install Ranger? Are all users fine with Ranger? Currently the ACL 
information is stored in the HAWQ catalog. Shall we remove that catalog 
information if we provide Ranger support?
4. Yes, LDAP/AD is a potentially good solution for us :)
5. So in Hive and HBase, the grant operations are all done on the database side 
instead of the Ranger side, right? This page seems to show that the Ranger 
admin console also supports creating a new policy from the UI. Please correct 
me if my understanding is wrong.

We are investigating and aiming to draft a design doc, and will attach it to 
this JIRA once done.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376472#comment-15376472
 ] 

Lili Ma edited comment on HAWQ-256 at 7/14/16 7:19 AM:
---

[~bosco] Thanks for your answer :)

1. Yes, it's good for Ranger to import the user list from the component. The 
reason I raised this question is that I noticed Ranger provides an "Add New 
User" function under the "Settings/Users/Groups" tab. Does that mean Ranger 
also supports creating users in Ranger itself?
2. Granting privileges from just one side is relatively easy and clear. What we 
need to discuss is which side we allow to grant privileges: HAWQ or Ranger? As 
you said, the HAWQ side is a good choice since there is no change in admin 
behavior.
3. I also think it would be simpler if we did not have to consider the cases 
where Ranger is down or not installed. What about scenarios where users do not 
intend to install Ranger? Are all users fine with Ranger? Currently the ACL 
information is stored in the HAWQ catalog. Shall we remove that catalog 
information if we provide Ranger support?
4. Yes, LDAP/AD is a potentially good solution for us :)
5. So in Hive and HBase, the grant operations are all done on the database side 
instead of the Ranger side, right? This page seems to show that the Ranger 
admin console also supports creating a new policy from the UI. Please correct 
me if my understanding is wrong.
https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.5+-+User+Guide

We are investigating and aiming to draft a design doc, and will attach it to 
this JIRA once done.


was (Author: lilima):
[~bosco] Thanks for your answer :)

1. Yes, it's good for Ranger to import user list from component. Why I expose 
this question is that I noticed that Ranger has provided a function "Add New 
User" under tab "Settings/Users/Groups". Does it mean Ranger also supports 
creating user in Ranger itself? 
2. Grant privilege from just one side is relatively easy and clear.  What we 
need to discuss is which side we allow granting privilege, HAWQ, or Ranger? As 
you said, HAWQ side is a good choice since there's no change in admin behavior.
3. I also thinks it would be simple if we don't consider Ranger down or Ranger 
not exist problem. What about the scenarios that user don't intend to install 
Ranger?  Are users are all fine with Ranger? Currently the ACL information is 
stored in HAWQ catalog. Shall we remove the catalog information if we provide 
Ranger support?
4. Yes, LDAP/AD is a potential good solution for us :)
5. So In Hive and HBase, the grant operations are all done in the database side 
instead of Ranger side. Right? In this page it seems that Ranger admin console 
also supports creating a new policy from UI? Please correct me if my 
understanding is wrong. 

Actually, we are investigating and aiming at drafting a design doc. Will attach 
the design doc to this JIRA once done.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376472#comment-15376472
 ] 

Lili Ma edited comment on HAWQ-256 at 7/14/16 7:21 AM:
---

[~bosco] Thanks for your answer :)

1. Yes, it's good for Ranger to import the user list from the component. The 
reason I raised this question is that I noticed Ranger provides an "Add New 
User" function under the "Settings/Users/Groups" tab. Does that mean Ranger 
also supports creating users in Ranger itself?
2. Granting privileges from just one side is relatively easy and clear. What we 
need to discuss is which side we allow to grant privileges: HAWQ or Ranger? As 
you said, the HAWQ side is a good choice since there is no change in admin 
behavior.
3. I also think it would be simpler if we did not have to consider the cases 
where Ranger is down or not installed. What about scenarios where users do not 
intend to install Ranger? Are all users fine with Ranger? Currently the ACL 
information is stored in the HAWQ catalog. Shall we remove that catalog 
information if we provide Ranger support?
4. Yes, LDAP/AD is a potentially good solution for us :)
5. So in Hive and HBase, the grant operations are all done on the database side 
instead of the Ranger side, right? This page seems to show that the Ranger 
admin console also supports creating a new policy from the UI. Please correct 
me if my understanding is wrong.
https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.5+-+User+Guide

We are investigating and aiming to draft a design doc, and will attach it to 
this JIRA once done.


was (Author: lilima):
[~bosco] Thanks for your answer :)

1. Yes, it's good for Ranger to import user list from component. Why I expose 
this question is that I noticed that Ranger has provided a function "Add New 
User" under tab "Settings/Users/Groups". Does it mean Ranger also supports 
creating user in Ranger itself? 
2. Grant privilege from just one side is relatively easy and clear.  What we 
need to discuss is which side we allow granting privilege, HAWQ, or Ranger? As 
you said, HAWQ side is a good choice since there's no change in admin behavior.
3. I also thinks it would be simple if we don't consider Ranger down or Ranger 
not exist problem. What about the scenarios that user don't intend to install 
Ranger?  Are users are all fine with Ranger? Currently the ACL information is 
stored in HAWQ catalog. Shall we remove the catalog information if we provide 
Ranger support?
4. Yes, LDAP/AD is a potential good solution for us :)
5. So In Hive and HBase, the grant operations are all done in the database side 
instead of Ranger side. Right? In this page it seems that Ranger admin console 
also supports creating a new policy from UI? Please correct me if my 
understanding is wrong.  
https://cwiki.apache.org/confluence/display/RANGER/Apache+Ranger+0.5+-+User+Guide

Actually, we are investigating and aiming at drafting a design doc. Will attach 
the design doc to this JIRA once done.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378792#comment-15378792
 ] 

Lili Ma edited comment on HAWQ-256 at 7/15/16 3:16 AM:
---

[~bosco] Thanks. Things are getting clearer now.

For the interaction between HAWQ and Ranger, I think there are two main parts:

1. Set policy. When a HAWQ user invokes GRANT SQL in HAWQ, HAWQ needs to pass 
that command to Ranger to set the policy.
2. Check authorization. When a HAWQ user wants to operate on some object, HAWQ 
needs to contact Ranger to check whether the user has the privilege.

Both parts of the interaction rely on the Ranger plugin.

What we need to do next is detail the interaction interface and design the 
HAWQ-side implementation.

Please suggest if I missed something. Thanks


was (Author: lilima):
[~bosco] Thanks. Things are getting more clear now.

So for the interaction between HAWQ and Ranger, I think there are mainly two 
parts:

1. Set policy.  When HAWQ users invoke GRANT SQL in HAWQ, need pass that 
command to Ranger to set the policy.

2.Check Authorization.  When HAWQ user want to operate on some objects, need 
contact Ranger to check whether the user has the privilege. 

Both these two parts of interaction rely on Ranger Plugin. 

What we need do next is detailing down the interface for interaction and 
designing the HAWQ own side implementation.  

Please suggest if I missed something.  Thanks

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378792#comment-15378792
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco] Thanks. Things are getting clearer now.

For the interaction between HAWQ and Ranger, I think there are two main parts:

1. Set policy. When a HAWQ user invokes GRANT SQL in HAWQ, HAWQ needs to pass 
that command to Ranger to set the policy.

2. Check authorization. When a HAWQ user wants to operate on some object, HAWQ 
needs to contact Ranger to check whether the user has the privilege.

Both parts of the interaction rely on the Ranger plugin.

What we need to do next is detail the interaction interface and design the 
HAWQ-side implementation.

Please suggest if I missed something. Thanks
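
The two interaction points described above can be sketched as follows. This is a minimal illustrative model, not the actual HAWQ/Ranger plugin code: the function names, the policy dict shape, and the in-memory policy store are all assumptions made for the sketch.

```python
# Toy model of the two HAWQ-Ranger interaction points discussed above:
# (1) translating a GRANT into a policy, (2) answering an authorization
# check against the stored policies. All names and shapes are illustrative.

def grant_to_policy(grantee, resource, privileges):
    """Translate a HAWQ GRANT statement into a Ranger-style policy dict."""
    return {
        "user": grantee,
        "resource": resource,           # e.g. {"DATABASE": "db1", "TABLE": "t1"}
        "accesses": sorted(privileges), # e.g. ["insert", "select"]
    }

def check_access(policies, user, resource, privilege):
    """Answer an authorization check against the stored policies."""
    for p in policies:
        if (p["user"] == user and p["resource"] == resource
                and privilege in p["accesses"]):
            return True
    return False

policies = [grant_to_policy("u1", {"DATABASE": "db1", "TABLE": "t1"},
                            {"select", "insert"})]
print(check_access(policies, "u1", {"DATABASE": "db1", "TABLE": "t1"}, "select"))
```

In a real deployment both steps would be REST calls into the Ranger plugin rather than an in-process lookup; the sketch only shows the data flow.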

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-760) Hawq register

2016-07-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15378798#comment-15378798
 ] 

Lili Ma commented on HAWQ-760:
--

[~GodenYao] I noticed you have kindly helped close this JIRA.

Actually, this is an umbrella JIRA: some of its sub-tasks have been finished 
and their code delivered, but other sub-tasks have not been developed and are 
postponed.

> Hawq register
> -
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Command Line Tools
>Reporter: Yangcheng Luo
>Assignee: Lili Ma
> Fix For: 2.0.0.0-incubating
>
>
> Users sometimes want to register data files generated by other system like 
> hive into hawq. We should add register function to support registering 
> file(s) generated by other system like hive into hawq. So users could 
> integrate their external file(s) into hawq conveniently.





[jira] [Updated] (HAWQ-256) Integrate Security with Apache Ranger

2016-07-27 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-256:
-
Attachment: HAWQRangerSupportDesign.pdf

Hi all,
We have drafted a design for HAWQ Ranger Support. Any comments are welcome. 

Thanks

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-08 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411509#comment-15411509
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco],
Will you confirm the API with folks on the Ranger team? Thanks.

For the grant/revoke part of the API interface, we just design the API to 
correspond to the SQL GRANT/REVOKE syntax.

For the privilege-check part, I think your advice is sensible: we can provide 
a function that checks multiple resources at one time. What about changing it 
to the following format, allowing one requestor to check multiple resources, 
and, for a single resource, allowing checks of multiple operations?

{code}
{
  "requestor": "u1",
  "access": [
    {
      "resource": {"TABLE": "t1", "DATABASE": "db1"},
      "privileges": ["select", "insert"]
    },
    {
      "resource": {"TABLE": "t2", "DATABASE": "db1"},
      "privileges": ["select"]
    }
  ]
}
{code}

[~hubertzhang], your thoughts?
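
A sketch of how a client might assemble the batched check request proposed above. This is illustrative only: the `access` wrapper key and the `privileges` array name are assumptions, since the proposal in this thread was still a draft rather than a finalized Ranger API.

```python
import json

# Build the batched privilege-check request proposed above: one requestor,
# multiple resources, and multiple privileges per resource. The "access"
# and "privileges" field names are assumptions for this sketch.

def build_check_request(requestor, checks):
    """checks: iterable of (resource_dict, privilege_list) pairs."""
    return {
        "requestor": requestor,
        "access": [
            {"resource": resource, "privileges": list(privileges)}
            for resource, privileges in checks
        ],
    }

req = build_check_request("u1", [
    ({"TABLE": "t1", "DATABASE": "db1"}, ["select", "insert"]),
    ({"TABLE": "t2", "DATABASE": "db1"}, ["select"]),
])
print(json.dumps(req, indent=2))
```

The point of the batching is that one REST round trip covers every resource a query touches, instead of one call per table.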

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-08 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411518#comment-15411518
 ] 

Lili Ma commented on HAWQ-256:
--

Another thing is HAWQ's user sync with LDAP. From our investigation, HAWQ 
needs to run the "create role" command for each user registered in LDAP.
[~teaandcoffee], do you think providing a script for this is acceptable? Or do 
we need to create a backend process that syncs the user information 
automatically?
cc [~wenlin] for this discussion too.

Thanks
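
The "script" option discussed above could look roughly like this sketch, which generates a CREATE ROLE statement for each LDAP user not yet present in HAWQ. The user lists are placeholders; a real script would query LDAP and the HAWQ catalog, and would need robust identifier quoting.

```python
# Minimal sketch of an LDAP-to-HAWQ user sync script: emit CREATE ROLE
# for every LDAP user missing from HAWQ. Inputs here are hard-coded
# placeholders; a real tool would fetch both lists at runtime.

def sync_statements(ldap_users, hawq_roles):
    missing = sorted(set(ldap_users) - set(hawq_roles))
    return ['CREATE ROLE "{}" LOGIN;'.format(u) for u in missing]

stmts = sync_statements(["alice", "bob", "carol"], ["bob"])
for s in stmts:
    print(s)
# CREATE ROLE "alice" LOGIN;
# CREATE ROLE "carol" LOGIN;
```

The backend-process alternative would run the same diff on a schedule instead of on demand.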

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Created] (HAWQ-1001) Implement HAWQ user ACL check through Ranger

2016-08-12 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1001:
-

 Summary: Implement HAWQ user ACL check through Ranger
 Key: HAWQ-1001
 URL: https://issues.apache.org/jira/browse/HAWQ-1001
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Core
Reporter: Lili Ma
Assignee: Lei Chang


When a user runs a query, HAWQ can connect to Ranger to judge whether the user 
has the privilege to do so.





[jira] [Created] (HAWQ-1002) Implement a switch in hawq-site.xml to configure whether use Ranger or not for ACL

2016-08-12 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1002:
-

 Summary: Implement a switch in hawq-site.xml to configure whether 
use Ranger or not for ACL
 Key: HAWQ-1002
 URL: https://issues.apache.org/jira/browse/HAWQ-1002
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Core
Reporter: Lili Ma
Assignee: Lei Chang


Implement a switch in hawq-site.xml to configure whether or not to use Ranger 
for ACL checks.





[jira] [Updated] (HAWQ-1001) Implement HAWQ basic user ACL check through Ranger

2016-08-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1001:
--
Summary: Implement HAWQ basic user ACL check through Ranger  (was: 
Implement HAWQ user ACL check through Ranger)

> Implement HAWQ basic user ACL check through Ranger
> --
>
> Key: HAWQ-1001
> URL: https://issues.apache.org/jira/browse/HAWQ-1001
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Lili Ma
>Assignee: Lei Chang
> Fix For: backlog
>
>
> When a user run some query,  HAWQ can connect to Ranger to judge whether the 
> user has the privilege to do that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1003) Implement enhanced hawq ACL check through Ranger

2016-08-12 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1003:
-

 Summary: Implement enhanced hawq ACL check through Ranger
 Key: HAWQ-1003
 URL: https://issues.apache.org/jira/browse/HAWQ-1003
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Core
Reporter: Lili Ma
Assignee: Lei Chang


Implement an enhanced HAWQ ACL check through Ranger; that is, if a query 
involves several tables, combine the requests for the multiple tables so that 
just one REST request is sent to the Ranger REST API server.





[jira] [Updated] (HAWQ-1001) Implement HAWQ basic user ACL check through Ranger

2016-08-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1001:
--
   Assignee: Hubert Zhang  (was: Lei Chang)
Description: 
When a user runs a query, HAWQ can connect to Ranger to judge whether the user 
has the privilege to do so.
For each object with a unique OID, send one request to Ranger.

  was:When a user run some query,  HAWQ can connect to Ranger to judge whether 
the user has the privilege to do that.


> Implement HAWQ basic user ACL check through Ranger
> --
>
> Key: HAWQ-1001
> URL: https://issues.apache.org/jira/browse/HAWQ-1001
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Lili Ma
>Assignee: Hubert Zhang
> Fix For: backlog
>
>
> When a user run some query,  HAWQ can connect to Ranger to judge whether the 
> user has the privilege to do that. 
> For each object with unique oid, send one request to Ranger





[jira] [Created] (HAWQ-1004) Decide How HAWQ connect Ranger, through which user, how to connect to REST Server

2016-08-12 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1004:
-

 Summary: Decide How HAWQ connect Ranger, through which user, how 
to connect to REST Server
 Key: HAWQ-1004
 URL: https://issues.apache.org/jira/browse/HAWQ-1004
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Core
Reporter: Lili Ma
Assignee: Lei Chang


Decide how HAWQ connects to Ranger: through which user, and how to connect to 
the REST server.
Acceptance criteria: 
Provide an interface for HAWQ to connect to the Ranger REST server.





[jira] [Updated] (HAWQ-1004) Decide How HAWQ connect Ranger, through which user, how to connect to REST Server

2016-08-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1004:
--
Assignee: Lin Wen  (was: Lei Chang)

> Decide How HAWQ connect Ranger, through which user, how to connect to REST 
> Server
> -
>
> Key: HAWQ-1004
> URL: https://issues.apache.org/jira/browse/HAWQ-1004
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Lili Ma
>Assignee: Lin Wen
> Fix For: backlog
>
>
> Decide How HAWQ connect Ranger, through which user, how to connect to REST 
> Server
> Acceptance Criteria: 
> Provide an interface for HAWQ connecting Ranger REST Server.





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420590#comment-15420590
 ] 

Lili Ma commented on HAWQ-256:
--

Below is the detailed list of default behaviors for object operations when no 
privilege is specified:
1. A database allows connecting and creating temp tables by default. That is, if user 
A creates a database, everyone can connect to this database and create temp 
tables in it.
2. A language allows usage by default. That is, if user A creates a language, everyone 
can use this language.
3. A function allows execution by default. That is, if user A defines a function, then 
everyone can execute the function.

[~bosco], I think your suggestion of assigning a group "public" and giving 
each resource the default behavior for this group makes sense. For the 'deny' 
operation, say, "I don't want user A to connect to database 1", in Ranger 0.6 we can 
simply add a "deny" record, i.e. add userA to the blacklist for 
database1. But what can we do in Ranger 0.5? With [~hubertzhang]'s 
solution of removing user A from the 'public' group, we would need to keep 
one 'public' group per object, and I'm afraid there would be too many groups 
then...

[~bosco], could you comment on the feasibility of upgrading from 
Ranger 0.5 to Ranger 0.6? Does the upgrade take much effort? If 
not, I think we can simply target Ranger 0.6. Things will get 
clearer then. Thanks :) 
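The default-allow-plus-deny scheme discussed above can be sketched as follows. This is a toy model of the proposal (default behavior attached to a pseudo-group "public", overridden by Ranger 0.6-style deny records), not Ranger's real policy engine; all names are illustrative:

```python
# Default privileges granted to everyone ("public") per object type,
# mirroring items 1-3 in the comment above.
DEFAULT_ALLOW = {
    "database": {"connect", "create_temp_table"},
    "language": {"usage"},
    "function": {"execute"},
}

def is_allowed(obj_type, action, user, deny_records):
    """deny_records: set of (user, obj_type, action) blacklist entries."""
    if (user, obj_type, action) in deny_records:
        return False  # an explicit deny record overrides the default
    return action in DEFAULT_ALLOW.get(obj_type, set())

# "I don't want user A to connect to database 1" becomes one deny record.
deny = {("userA", "database", "connect")}
print(is_allowed("database", "connect", "userA", deny))  # denied
print(is_allowed("database", "connect", "userB", deny))  # allowed by default
```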

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420590#comment-15420590
 ] 

Lili Ma edited comment on HAWQ-256 at 8/15/16 3:43 AM:
---

Below is the detailed list of default behaviors for object operations when no 
privilege is specified:
1. A database allows connecting and creating temp tables by default. That is, if user 
A creates a database, everyone can connect to this database and create temp 
tables in it.
2. A language allows usage by default. That is, if user A creates a language, everyone 
can use this language.
3. A function allows execution by default. That is, if user A defines a function, then 
everyone can execute the function.

[~bosco], I think your suggestion of assigning a group "public" and giving 
each resource the default behavior for this group makes sense. For the 'deny' 
operation, say, "I don't want user A to connect to database 1", in Ranger 0.6 we can 
simply add a "deny" record, i.e. add userA to the blacklist for 
database1. But what can we do in Ranger 0.5? With [~hubertzhang]'s 
solution of removing user A from the 'public' group, we would need to keep 
one 'public' group per object, and I'm afraid there would be too many groups 
then...

[~bosco], could you comment on the feasibility of upgrading from 
Ranger 0.5 to Ranger 0.6? Does the upgrade take much effort? If 
not, I think we can simply target Ranger 0.6. Things will get 
clearer then. Thanks :) 


was (Author: lilima):
Below is the detailed list of default behaviors for the object operation if no 
privilege is specified. 
1. A database is allowed for connect and create temp table. Which means, user A 
creates a database, everyone can connect to this database, and create temp 
table into it.
2. Language is allowed to use. Which means, user A creates a language, everyone 
can use this language.
3. Function is allowed to execute. Which means, user A defines a function, then 
everyone can execute the function.

[~bosco], I think your suggestion on assigning a group "public" and assigns 
each resource the default behavior for this group makes sense. For the 'deny' 
operation, say, "I don't want user A to connect to database 1", we can simply 
add a "deny" record in Ranger 0.6, say, adding userA to the blacklist for 
database1. But for Ranger 0.5, what we can do for it??  For [~hubertzhang]'s 
solution for removing the user A out of 'public' group, then we need to keep 
one 'public' group for each object, I'm afraid there will be too many groups 
there...

[~bosco], could you share the upgrade possibility for the user to upgrade 
Ranger 0.5 to Ranger 0.6? Does the user need many efforts to do the upgrade? If 
not, I think we can just simply consider supporting Ranger 0.6. Things will get 
more clear then, Thanks:) 

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-14 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15420594#comment-15420594
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco] [~vVineet], could you help confirm whether these API definitions are 
OK? Thanks :)

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-16 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422415#comment-15422415
 ] 

Lili Ma commented on HAWQ-256:
--

Per offline discussion, we think the integration between LDAP and HAWQ is out 
of scope for the HAWQ integration with Apache Ranger. And since HAWQ already 
supports LDAP sync, we decided to lower its priority.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-16 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422417#comment-15422417
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco] Thanks for your suggestion about default behaviors. I think Ranger 0.6 
can help us resolve this problem.  

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-16 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422420#comment-15422420
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco], as [~hubertzhang] mentioned, HAWQ needs to check combinations of 
privileges, either 'ALL' or 'ANY'. Do you think it's feasible to 
implement this inside the Ranger REST API service? Certainly we could do it on the 
HAWQ side, but that would mean multiple round trips to the Ranger REST API, and I'm 
afraid it would increase the time for checking privileges. So it would be better to 
implement this judgement inside the Ranger REST service. Your thoughts?
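The combination check itself is simple to state; here is a client-side sketch (illustrative only; per the comment above, the goal is to push this judgement into the Ranger REST service so HAWQ needs only one round trip instead of one call per privilege):

```python
def check_combined(results, mode):
    """Combine per-privilege decisions into one verdict.

    results: {privilege: bool} as returned per privilege by the policy engine.
    mode: 'ALL' requires every privilege granted; 'ANY' requires at least one.
    """
    if mode == "ALL":
        return all(results.values())
    if mode == "ANY":
        return any(results.values())
    raise ValueError("mode must be 'ALL' or 'ANY'")

granted = {"select": True, "insert": False}
print(check_combined(granted, "ALL"))  # False: insert is missing
print(check_combined(granted, "ANY"))  # True: select suffices
```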

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-16 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422443#comment-15422443
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco][~vineetgoel][~lei_chang][~hubertzhang][~wenlin]
Another thing we need to discuss is whether we support users sending "GRANT" SQL 
besides setting policies in Ranger. If we also support GRANT SQL, there is a 
minor difference between the "with grant option" clause of GRANT SQL and what the 
Ranger UI provides. We need to discuss this clearly.

Ranger has a "Delegate Admin" button when defining a policy, and this is different 
from what HAWQ's GRANT SQL specifies.
That button means a Ranger-internal user has the privileges to 
operate on the given path/object and to assign someone else rights on the 
objects. The button has no effect on Ranger-external users, such as HAWQ-internal 
users. For example, if we add a policy specifying that user A has the 
privilege to select a table T, click the button, and user A is a Ranger-internal 
user, then user A has the right to log into Ranger and assign the 
insert/select privileges on table T to user B.
GRANT SQL with the grant option means that the granted user has the 
privilege to grant certain privileges to other users. If the granted privilege 
is just select, then user A can't grant the insert privilege to user B. So 
this is slightly different from what Ranger already provides.

If we allow GRANT/REVOKE SQL from HAWQ, we need to add "grant" as an action 
option on the resource; that is, each action carries an 
attribute indicating whether the action can be re-granted by the user.
For example, admin grants two privileges:
"grant select on t1 to u1"
"grant insert on t1 to u1 with grant option"
Then u1 grants privileges to u2:
"grant select on t1 to u2" result: failed!
"grant insert on t1 to u2" result: succeeded!
As a result, u2 can insert on t1, but it cannot select on t1.
Correspondingly, in Ranger we have the following policies (* means with grant 
privilege):
t1 u1 insert*, select
t1 u2 insert

So the conclusion is that we need to double the privilege entries to represent "with 
grant option" if we want to support GRANT/REVOKE SQL on the HAWQ side.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-16 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422443#comment-15422443
 ] 

Lili Ma edited comment on HAWQ-256 at 8/16/16 8:51 AM:
---

[~bosco] [~vineetgoel] [~lei_chang] [~hubertzhang] [~wenlin]
Another thing we need to discuss is whether we support users sending "GRANT" SQL 
besides setting policies in Ranger. If we also support GRANT SQL, there is a 
minor difference between the "with grant option" clause of GRANT SQL and what the 
Ranger UI provides. We need to discuss this clearly.

Ranger has a "Delegate Admin" button when defining a policy, and this is different 
from what HAWQ's GRANT SQL specifies.
That button means a Ranger-internal user has the privileges to 
operate on the given path/object and to assign someone else rights on the 
objects. The button has no effect on Ranger-external users, such as HAWQ-internal 
users. For example, if we add a policy specifying that user A has the 
privilege to select a table T, click the button, and user A is a Ranger-internal 
user, then user A has the right to log into Ranger and assign the 
insert/select privileges on table T to user B.
GRANT SQL with the grant option means that the granted user has the 
privilege to grant certain privileges to other users. If the granted privilege 
is just select, then user A can't grant the insert privilege to user B. So 
this is slightly different from what Ranger already provides.

If we allow GRANT/REVOKE SQL from HAWQ, we need to add "grant" as an action 
option on the resource; that is, each action carries an 
attribute indicating whether the action can be re-granted by the user.
For example, admin grants two privileges:
"grant select on t1 to u1"
"grant insert on t1 to u1 with grant option"
Then u1 grants privileges to u2:
"grant select on t1 to u2" result: failed!
"grant insert on t1 to u2" result: succeeded!
As a result, u2 can insert on t1, but it cannot select on t1.
Correspondingly, in Ranger we have the following policies (* means with grant 
privilege):
t1 u1 insert*, select
t1 u2 insert

So the conclusion is that we need to double the privilege entries to represent "with 
grant option" if we want to support GRANT/REVOKE SQL on the HAWQ side.


was (Author: lilima):
[~bosco][~vineetgoel][~lei_chang][~hubertzhang][~wenlin]
Another thing we need to discuss is whether we support user send "GRANT" SQL 
besides setting policy in Ranger.  If we also support Grant SQL, there is a 
minor difference between the "with grant option" of Grant SQL and what inside 
Ranger UI.  We need to discuss it clear.

Ranger has one button "Delegate Admin" when defining policy, this is different 
from what HAWQ grant SQL specifies.
That button in Ranger means the Ranger internal user has the privileges to 
operate the given path/object and assign someone else the rights for the 
objects. That button has no influence on Ranger external user, say, HAWQ 
internal user. For example, if we add a policy specifying user A has the 
privileges to select a table T and click on the button and user A is Ranger 
internal user, then user A has the right to log into Ranger and assign the 
insert/select privileges for table T to user B.
The grant SQL with grant option means that the to-be-granted user has the 
privilege to grant certain privileges to other users. If the grant privilege 
specifies just select, then user A can't grant insert privilege to user B. So 
this is minor different from what Ranger has already provided.

If we allow grant/revoke SQL from HAWQ, we need to add "grant" as an action 
option to the resource. Action option means for each action, it has an 
attribute which indicates whether this action can be granted by the user.
For example, admin grant two privileges:
"grant select on t1 to u1"
"grant insert on t1 to u1 with grant option"
Then u1 grant privilege to u2
"grant select on t1 to u2" result: failed!
grant insert on t1 to u2" result: succeed!
As a result, u2 can insert on t1, but it cannot select on t1.
Correspondingly, in Ranger, we have the following policies(* means with grant 
privilege):
t1 u1 insert*select
t1 u2 insert

So the conclusion is that we need double the privileges for defining "with 
grant option" if we want to support Grant/Revoke SQL from HAWQ side.

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-25 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438530#comment-15438530
 ] 

Lili Ma commented on HAWQ-256:
--

Agreed that Grant/Revoke should be disabled if Ranger is enabled.

In that case, the "with grant option" clause no longer needs to be considered.

Another thing is owner management. Besides the normal ACL, HAWQ has the 
notion of an owner, and the owner of an object can perform any operation on it. For the 
owner part, "grant parent role to member role" and "reassign" are the two SQL 
commands for owner control. I think we should move owner control to Ranger as well, to 
enable fully Ranger-centralized access control. Your thoughts? 
[~vVineet][~bosco]

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-25 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438531#comment-15438531
 ] 

Lili Ma commented on HAWQ-256:
--

[~bosco] Do you have any feedback on the API definitions from the Hortonworks side? 
Thanks 

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1026) HAWQ sync user information from LDAP

2016-08-26 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1026:
-

 Summary: HAWQ sync user information from LDAP
 Key: HAWQ-1026
 URL: https://issues.apache.org/jira/browse/HAWQ-1026
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Security
Reporter: Lili Ma
Assignee: Lei Chang


Sync user information from LDAP into HAWQ, so that HAWQ administrators don't need 
to manually create a role in HAWQ for every user in LDAP.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-26 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15438530#comment-15438530
 ] 

Lili Ma edited comment on HAWQ-256 at 8/26/16 7:56 AM:
---

Agreed that Grant/Revoke should be disabled if Ranger is enabled.

In that case, the "with grant option" clause no longer needs to be considered.

Another thing is owner management. Besides the normal ACL, HAWQ has the 
notion of an owner, and the owner of an object can perform any operation on it. For the 
owner part, "grant parent role to member role" and "reassign" are the two SQL 
commands for owner control. I think we should move owner control to Ranger as well, to 
enable fully Ranger-centralized access control. Your thoughts? 
[~vVineet][~bosco]


was (Author: lilima):
Agree on the Grant/Revoke should be disabled is ranger is enabled.

In this way, the with grant option will not be considered.

Another thing is the owner management. Besides normal ACL, HAWQ has a 
definition of owner. The owner of the object can do any operation.  And for the 
owner part,  "Grant parent role to member role" and "reassign" are two SQL 
commands for owner control. I think we should move owner control to Ranger, to 
enable a fully Ranger-centralized access control. Your thoughts? 
[~vVineet][~bosco]

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HAWQ-1029) hawq register --help information not correct

2016-08-28 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1029:
-

 Summary: hawq register --help information not correct
 Key: HAWQ-1029
 URL: https://issues.apache.org/jira/browse/HAWQ-1029
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang


The example for usage case 1 is not correct and should be fixed:
$ hawq register postgres parquet_table hdfs://localhost:8020/temp/hive.paq



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-256:
-
Attachment: HAWQRangerSupportDesign_v0.2.pdf

Attached a new version of the design doc for HAWQ Ranger integration; the main 
change is in checkPrivilege. Thanks

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-256:
-
Attachment: HAWQRangerSupportDesign_v0.2.pdf

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-256:
-
Attachment: (was: HAWQRangerSupportDesign_v0.2.pdf)

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-256:
-
Attachment: (was: HAWQRangerSupportDesign_v0.2.pdf)

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-991) Add support for "HAWQ register" that could register tables by using "hawq extract" output

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-991:
-
Description: 
Scenario: 
1. Cluster disaster recovery: two clusters co-exist, and data is periodically imported 
from Cluster A to Cluster B, so the data needs to be registered into Cluster B.
2. Table rollback: checkpoints are taken periodically, and the table needs to be rolled 
back to a previous checkpoint. 

Description:
Register according to a .yml configuration file: 
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
config] [--force][--repair]  

Behaviors:
1. If the table doesn't exist, automatically create the table and register the 
files listed in the .yml configuration file, using the file sizes specified in the 
.yml to update the catalog table. 

2. If the table already exists and neither --force nor --repair is specified, do not 
create any table; directly register the files specified in the .yml file into the 
table. Note that if a file is already under the table directory in HDFS, an error is 
thrown: to-be-registered files must not be under the table path.

3. If the table already exists and --force is specified, clear all the catalog 
contents in pg_aoseg.pg_paqseg_$relid while keeping the files on HDFS, then 
re-register all the files to the table.  This is for scenario 1.

4. If the table already exists and --repair is specified, change both the file 
folder and the catalog table pg_aoseg.pg_paqseg_$relid to the state the .yml file 
describes. Note that files generated after the checkpoint may be 
deleted here. Also note that all the files in the .yml file must be under the 
table folder on HDFS. Limitation: hash table 
redistribution, table truncate, and table drop are not supported. This is for 
scenario 2.

Requirements:
1. The to-be-registered file paths must be located in the same HDFS cluster as 
HAWQ.
2. If the table to be registered is a hash table, the number of registered files must 
be a positive integer multiple of the hash table's bucket number.
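Requirement 2 above amounts to a simple divisibility check. A minimal sketch, assuming the bucket number comes from the table's catalog entry or the .yml file:

```python
def valid_file_count(file_count, bucket_num):
    """For a hash-distributed table, the registered file count must be a
    positive integer multiple of the table's bucket number."""
    return file_count > 0 and file_count % bucket_num == 0

print(valid_file_count(6, 6))    # exactly the bucket number: OK
print(valid_file_count(12, 6))   # twice the bucket number: OK
print(valid_file_count(8, 6))    # not a multiple: rejected
```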



  was:
User should be able to use hawq register to register table files into a new 
HAWQ cluster. It is some kind of protecting against corruption from users' 
perspective. Users use the last-known-good metadata to update the portion of 
catalog managing HDFS blocks. The table files or dictionary should be 
backuped(such as using distcp) into the same path in the new HDFS setting. And 
in this case, both AO and Parquet formats are supported.

Usage:
hawq extract -o t1.yml t1; // in HAWQ Cluster A
hawq register -h localhost -p 5432 -u me -d postgres -c t1.yml t1; // in HAWQ 
Cluster B



> Add support for "HAWQ register" that could register tables by using "hawq 
> extract" output
> -
>
> Key: HAWQ-991
> URL: https://issues.apache.org/jira/browse/HAWQ-991
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Command Line Tools, External Tables
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> Scenario: 
> 1. For cluster Disaster Recovery. Two clusters co-exist, periodically import 
> data from Cluster A to Cluster B. Need Register data to Cluster B.
> 2. For the rollback of table. Do checkpoints somewhere, and need to rollback 
> to previous checkpoint. 
> Description:
> Register according to .yml configuration file. 
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
> config] [--force][--repair]  
> Behaviors:
> 1. If table doesn't exist, will automatically create the table and register 
> the files in .yml configuration file. Will use the filesize specified in .yml 
> to update the catalog table. 
> 2. If table already exist, and neither --force nor --repair configured. Do 
> not create any table, and directly register the files specified in .yml file 
> to the table. Note that if the file is under table directory in HDFS, will 
> throw error, say, to-be-registered files should not under the table path.
> 3. If table already exist, and --force is specified. Will clear all the 
> catalog contents in pg_aoseg.pg_paqseg_$relid while keep the files on HDFS, 
> and then re-register all the files to the table.  This is for scenario 2.
> 4. If table already exist, and --repair is specified. Will change both file 
> folder and catalog table pg_aoseg.pg_paqseg_$relid to the state which .yml 
> file configures. Note may some new generated files since the checkpoint may 
> be deleted here. Also note the all the files in .yml file should all under 
> the table folder on HDFS. Limitation: Do not support cases for hash table 
> redistribution, table truncate and table drop. This is for scenario 3.
> Requirements:
> 1. To be registered file path has to colocate with HAWQ in the same HDFS 
> cluster.
> 2. If to be registered is a hash table, the r

[jira] [Updated] (HAWQ-760) Hawq register

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-760:
-
Description: 
Scenario: 
1. Register a parquet file generated by other systems, such as Hive, Spark, etc.
2. For cluster Disaster Recovery. Two clusters co-exist, periodically import 
data from Cluster A to Cluster B. Need Register data to Cluster B.
3. For the rollback of table. Do checkpoints somewhere, and need to rollback to 
previous checkpoint. 

Usage1
Description
Register a file/folder to an existing table. Can register a file or a folder. 
If we register a file, can specify eof of this file. If eof not specified, 
directly use actual file size. If we register a folder, directly use actual 
file size.
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-f 
filepath] [-e eof]


Usage 2
Description
Register according to .yml configuration file. 
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
config] [--force][--repair]  

Behavior:
1. If table doesn't exist, will automatically create the table and register the 
files in .yml configuration file. Will use the filesize specified in .yml to 
update the catalog table. 

2. If table already exist, and neither --force nor --repair configured. Do not 
create any table, and directly register the files specified in .yml file to the 
table. Note that if the file is under table directory in HDFS, will throw 
error, say, to-be-registered files should not under the table path.

3. If table already exist, and --force is specified. Will clear all the catalog 
contents in pg_aoseg.pg_paqseg_$relid while keep the files on HDFS, and then 
re-register all the files to the table.  This is for scenario 2.

4. If table already exist, and --repair is specified. Will change both file 
folder and catalog table pg_aoseg.pg_paqseg_$relid to the state which .yml file 
configures. Note may some new generated files since the checkpoint may be 
deleted here. Also note the all the files in .yml file should all under the 
table folder on HDFS. Limitation: Do not support cases for hash table 
redistribution, table truncate and table drop. This is for scenario 3.

Requirements for both usages:
1. The path of the files to be registered has to be co-located with HAWQ in 
the same HDFS cluster.
2. If the target is a hash-distributed table, the number of registered files 
must be a multiple of the table's bucket number.
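Requirement 2 amounts to a divisibility check, sketched below. This is illustrative only; the real tool performs the equivalent check internally.

```python
def valid_file_count_for_hash_table(num_files, bucket_number):
    """A hash-distributed table can only be registered when the number of
    to-be-registered files is a positive multiple of the table's bucket
    number, so the files map evenly onto hash buckets."""
    if num_files <= 0 or bucket_number <= 0:
        return False
    return num_files % bucket_number == 0
```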

  was:Users sometimes want to register data files generated by other system 
like hive into hawq. We should add register function to support registering 
file(s) generated by other system like hive into hawq. So users could integrate 
their external file(s) into hawq conveniently.


> Hawq register
> -
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Command Line Tools
>Reporter: Yangcheng Luo
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Scenario: 
> 1. Register a Parquet file generated by other systems, such as Hive, Spark, 
> etc.
> 2. Cluster disaster recovery: two clusters co-exist, and data is 
> periodically imported from Cluster A to Cluster B, so the imported data 
> needs to be registered in Cluster B.
> 3. Table rollback: checkpoints are taken periodically, and the table needs 
> to be rolled back to a previous checkpoint.
> Usage 1
> Description
> Register a file or a folder to an existing table. When registering a file, 
> an eof (end-of-file offset) can be specified; if it is not, the actual file 
> size is used. When registering a folder, the actual file sizes are always 
> used.
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-f 
> filepath] [-e eof]
> Usage 2
> Description
> Register according to a .yml configuration file.
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
> config] [--force] [--repair]
> Behavior:
> 1. If the table does not exist, it is created automatically and the files 
> listed in the .yml configuration file are registered to it. The file sizes 
> specified in the .yml file are used to update the catalog table.
> 2. If the table already exists and neither --force nor --repair is given, no 
> table is created; the files specified in the .yml file are registered 
> directly to the table. Note that if a file is already under the table 
> directory in HDFS, an error is thrown: to-be-registered files must not be 
> under the table path.
> 3. If the table already exists and --force is specified, all catalog 
> contents in pg_aoseg.pg_paqseg_$relid are cleared while the files on HDFS 
> are kept, and then all the files are re-registered to the table. This 
> covers scenario 2.
> 4. If the table already exists and --repair is specified, both the file 
> folder and the catalog table pg_aoseg.pg_paqseg_$relid are changed to the 
> state the .yml file describes. Note that files generated since the 
> checkpoint may be deleted here. Also note that all the files in the .yml 
> file must be under the table folder on HDFS. Limitation: hash table 
> redistribution, table truncate and table drop are not supported. This 
> covers scenario 3.

[jira] [Commented] (HAWQ-760) Hawq register

2016-08-29 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447849#comment-15447849
 ] 

Lili Ma commented on HAWQ-760:
--

hawq register has two usages:
1. Register a file or a folder to an existing table. When registering a file, 
an eof (end-of-file offset) can be specified; if it is not, the actual file 
size is used. When registering a folder, the actual file sizes are always 
used.
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-f 
filepath] [-e eof]
2. Register according to a .yml configuration file.
hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
config] [--force] [--repair]

HAWQ-991 covers the second usage: registering according to a .yml 
configuration file.

> Hawq register
> -
>
> Key: HAWQ-760
> URL: https://issues.apache.org/jira/browse/HAWQ-760
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Command Line Tools
>Reporter: Yangcheng Luo
>Assignee: Lili Ma
> Fix For: backlog
>
>
> Users sometimes want to register data files generated by other systems, 
> like Hive, into HAWQ. We should add a register function to support 
> registering files generated by other systems into HAWQ, so users can 
> integrate their external files into HAWQ conveniently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HAWQ-991) "HAWQ register" could register tables according to .yml configuration file

2016-08-29 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-991:
-
Summary: "HAWQ register" could register tables according to .yml 
configuration file  (was: Add support for "HAWQ register" that could register 
tables by using "hawq extract" output)

> "HAWQ register" could register tables according to .yml configuration file
> --
>
> Key: HAWQ-991
> URL: https://issues.apache.org/jira/browse/HAWQ-991
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Command Line Tools, External Tables
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> Scenario: 
> 1. Cluster disaster recovery: two clusters co-exist, and data is 
> periodically imported from Cluster A to Cluster B, so the imported data 
> needs to be registered in Cluster B.
> 2. Table rollback: checkpoints are taken periodically, and the table needs 
> to be rolled back to a previous checkpoint.
> Description:
> Register according to a .yml configuration file.
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
> config] [--force] [--repair]
> Behaviors:
> 1. If the table does not exist, it is created automatically and the files 
> listed in the .yml configuration file are registered to it. The file sizes 
> specified in the .yml file are used to update the catalog table.
> 2. If the table already exists and neither --force nor --repair is given, no 
> table is created; the files specified in the .yml file are registered 
> directly to the table. Note that if a file is already under the table 
> directory in HDFS, an error is thrown: to-be-registered files must not be 
> under the table path.
> 3. If the table already exists and --force is specified, all catalog 
> contents in pg_aoseg.pg_paqseg_$relid are cleared while the files on HDFS 
> are kept, and then all the files are re-registered to the table. This 
> covers scenario 2.
> 4. If the table already exists and --repair is specified, both the file 
> folder and the catalog table pg_aoseg.pg_paqseg_$relid are changed to the 
> state the .yml file describes. Note that files generated since the 
> checkpoint may be deleted here. Also note that all the files in the .yml 
> file must be under the table folder on HDFS. Limitation: hash table 
> redistribution, table truncate and table drop are not supported. This 
> covers scenario 3.
> Requirements:
> 1. The path of the files to be registered has to be co-located with HAWQ in 
> the same HDFS cluster.
> 2. If the target is a hash-distributed table, the number of registered 
> files must be a multiple of the table's bucket number.





[jira] [Assigned] (HAWQ-1025) Check the consistency of AO/Parquet_FileLocations.Files.size attribute in extracted yaml file and the actual file size in HDFS.

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma reassigned HAWQ-1025:
-

Assignee: Lili Ma  (was: hongwu)

> Check the consistency of AO/Parquet_FileLocations.Files.size attribute in 
> extracted yaml file and the actual file size in HDFS.
> ---
>
> Key: HAWQ-1025
> URL: https://issues.apache.org/jira/browse/HAWQ-1025
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: Lili Ma
> Fix For: 2.0.1.0-incubating
>
>






[jira] [Updated] (HAWQ-1025) Modify the content of yml file, and change hawq register implementation for the modification

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1025:
--
Summary: Modify the content of yml file, and change hawq register 
implementation for the modification  (was: Check the consistency of 
AO/Parquet_FileLocations.Files.size attribute in extracted yaml file and the 
actual file size in HDFS.)

> Modify the content of yml file, and change hawq register implementation for 
> the modification
> 
>
> Key: HAWQ-1025
> URL: https://issues.apache.org/jira/browse/HAWQ-1025
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: Lili Ma
> Fix For: 2.0.1.0-incubating
>
>






[jira] [Updated] (HAWQ-1025) Modify the content of yml file, and change hawq register implementation for the modification

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1025:
--
Description: 
1. Add the bucket number of hash-distributed tables to the yml file; during 
hawq register, ensure the number of files is a multiple of the bucket number.
2. hawq register should use the file size information in the yml file to 
update the catalog table pg_aoseg.pg_paqseg_$relid.
3. hawq register processing steps:
   a. create the table
   b. move all the files
   c. change the catalog table once
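The three steps can be sketched as an ordered plan. This is a hypothetical helper, not the actual implementation; the point it illustrates is that the catalog is touched exactly once, at the end.

```python
def build_register_plan(table, files, sizes):
    """Build the ordered list of actions hawq register performs: create the
    table (if needed), move every data file into the table directory, then
    update pg_aoseg.pg_paqseg_$relid in a single final step."""
    plan = [("create_table", table)]
    plan += [("move_file", f) for f in files]
    # A single catalog change at the end means nothing is recorded in the
    # catalog until all files are already in place.
    plan.append(("update_catalog", list(zip(files, sizes))))
    return plan
```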


  was:
1. Add bucket number for hash-distributed table in yml file, when hawq 
register, ensure the number of files be multiple times of the bucket number
2. hawq register should use the file size information in yml file to update the 
catalog table pg_aoseg.pg_paqseg_$relid
3. hawq register processing steps:
   a. create table
   b. mv all the files
   c. change the catalog table once.



> Modify the content of yml file, and change hawq register implementation for 
> the modification
> 
>
> Key: HAWQ-1025
> URL: https://issues.apache.org/jira/browse/HAWQ-1025
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: Lili Ma
> Fix For: 2.0.1.0-incubating
>
>
> 1. Add the bucket number of hash-distributed tables to the yml file; during 
> hawq register, ensure the number of files is a multiple of the bucket 
> number.
> 2. hawq register should use the file size information in the yml file to 
> update the catalog table pg_aoseg.pg_paqseg_$relid.
> 3. hawq register processing steps:
> a. create the table
> b. move all the files
> c. change the catalog table once





[jira] [Updated] (HAWQ-1025) Modify the content of yml file, and change hawq register implementation for the modification

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1025:
--
Description: 
1. Add the bucket number of hash-distributed tables to the yml file; during 
hawq register, ensure the number of files is a multiple of the bucket number.
2. hawq register should use the file size information in the yml file to 
update the catalog table pg_aoseg.pg_paqseg_$relid.
3. hawq register processing steps:
   a. create the table
   b. move all the files
   c. change the catalog table once


> Modify the content of yml file, and change hawq register implementation for 
> the modification
> 
>
> Key: HAWQ-1025
> URL: https://issues.apache.org/jira/browse/HAWQ-1025
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: Lili Ma
> Fix For: 2.0.1.0-incubating
>
>
> 1. Add the bucket number of hash-distributed tables to the yml file; during 
> hawq register, ensure the number of files is a multiple of the bucket 
> number.
> 2. hawq register should use the file size information in the yml file to 
> update the catalog table pg_aoseg.pg_paqseg_$relid.
> 3. hawq register processing steps:
> a. create the table
> b. move all the files
> c. change the catalog table once





[jira] [Closed] (HAWQ-1024) Rollback if hawq register failed in process

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma closed HAWQ-1024.
-
Resolution: Invalid

> Rollback if hawq register failed in process
> ---
>
> Key: HAWQ-1024
> URL: https://issues.apache.org/jira/browse/HAWQ-1024
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>






[jira] [Created] (HAWQ-1033) add --force option for hawq register

2016-08-30 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1033:
-

 Summary: add --force option for hawq register
 Key: HAWQ-1033
 URL: https://issues.apache.org/jira/browse/HAWQ-1033
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang


add --force option for hawq register






[jira] [Created] (HAWQ-1034) add --repair option for hawq register

2016-08-30 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1034:
-

 Summary: add --repair option for hawq register
 Key: HAWQ-1034
 URL: https://issues.apache.org/jira/browse/HAWQ-1034
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang


add --repair option for hawq register





[jira] [Created] (HAWQ-1035) support partition table register

2016-08-30 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1035:
-

 Summary: support partition table register
 Key: HAWQ-1035
 URL: https://issues.apache.org/jira/browse/HAWQ-1035
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang


Support partition table register, limited to 1-level partition tables, since 
hawq extract only supports 1-level partition tables.





[jira] [Updated] (HAWQ-1033) add --force option for hawq register

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1033:
--
Description: 
add --force option for hawq register

Clears all the catalog contents in pg_aoseg.pg_paqseg_$relid while keeping the 
files on HDFS, and then re-registers all the files to the table. This is for 
the cluster disaster recovery scenario: two clusters co-exist, data is 
periodically imported from Cluster A to Cluster B, and the imported data needs 
to be registered in Cluster B.


  was:
add --force option for hawq register

Will clear all the catalog contents in pg_aoseg.pg_paqseg_$relid while keep the 
files on HDFS, and then re-register all the files to the table.  This is for 
scenario 2.



> add --force option for hawq register
> 
>
> Key: HAWQ-1033
> URL: https://issues.apache.org/jira/browse/HAWQ-1033
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> add --force option for hawq register
> Clears all the catalog contents in pg_aoseg.pg_paqseg_$relid while keeping 
> the files on HDFS, and then re-registers all the files to the table. This 
> is for the cluster disaster recovery scenario: two clusters co-exist, data 
> is periodically imported from Cluster A to Cluster B, and the imported data 
> needs to be registered in Cluster B.





[jira] [Updated] (HAWQ-1033) add --force option for hawq register

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1033:
--
Description: 
add --force option for hawq register

Clears all the catalog contents in pg_aoseg.pg_paqseg_$relid while keeping the 
files on HDFS, and then re-registers all the files to the table. This is for 
scenario 2.


  was:
add --force option for hawq register



> add --force option for hawq register
> 
>
> Key: HAWQ-1033
> URL: https://issues.apache.org/jira/browse/HAWQ-1033
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> add --force option for hawq register
> Clears all the catalog contents in pg_aoseg.pg_paqseg_$relid while keeping 
> the files on HDFS, and then re-registers all the files to the table. This 
> is for scenario 2.





[jira] [Updated] (HAWQ-1034) add --repair option for hawq register

2016-08-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1034:
--
Description: 
add --repair option for hawq register

Changes both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
to the state the .yml file describes. Note that files generated since the 
checkpoint may be deleted here. Also note that all the files in the .yml file 
must be under the table folder on HDFS. Limitation: hash table redistribution, 
table truncate and table drop are not supported. This is for the table 
rollback scenario: checkpoints are taken periodically, and the table needs to 
be rolled back to a previous checkpoint.

  was:add --repair option for hawq register


> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Lei Chang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Changes both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
> to the state the .yml file describes. Note that files generated since the 
> checkpoint may be deleted here. Also note that all the files in the .yml 
> file must be under the table folder on HDFS. Limitation: hash table 
> redistribution, table truncate and table drop are not supported. This is 
> for the table rollback scenario: checkpoints are taken periodically, and 
> the table needs to be rolled back to a previous checkpoint.





[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-31 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451558#comment-15451558
 ] 

Lili Ma commented on HAWQ-256:
--

[~thebellhead], quite good questions!

1. In order for tools, syntax checking, etc. to work, everyone (the HAWQ 
public role) requires access to the catalog and some of the toolkit. Will 
Ranger-only access control apply only to user-created tables, views and 
external tables?
Yes, since the catalog tables and toolkits are shared by various users, 
Ranger-only access control applies only to user-defined objects. But the 
objects include not only databases, tables and views, but also functions, 
languages, schemas, tablespaces and protocols. You can find the detailed 
objects and privileges in the design doc.

2. If so, will gpadmin and any other HAWQ-defined roles not have access to the 
data in Ranger-managed tables?
Just as you mentioned, HAWQ uses the gpadmin identity to create files on HDFS; 
that is, when a given userA creates a table in HAWQ, the HDFS files for the 
table are created by gpadmin instead of userA. Since Ranger lies in the Hadoop 
ecosystem and usually needs to control both HAWQ and HDFS, I think we need to 
assign gpadmin full privileges on the HAWQ data file directory on HDFS in the 
Ranger UI beforehand.

About your concern that the superuser can see all the users' data: I think 
it's kind of like the "root" role in an operating system. If users have 
concerns about the DBA/superuser's unlimited access, I totally agree with your 
solution of passing down the user identity to solve this problem :)

3. How would this be extended for the hcatalog virtual database in HAWQ? Could 
the Ranger permissions for the underlying store (for instance Hive) be read 
and enforced/reported at the HAWQ level?
If HAWQ keeps gpadmin for operating HDFS or external storage, I think we just 
need to grant the privilege to the superuser. But if we have implemented 
user-identity passing down, so that the data files on HDFS for a table created 
by userA are owned by userA instead of gpadmin, then we need to connect to 
Ranger twice, from HAWQ and from HDFS respectively. I haven't included the 
underlying store's privilege checks on the HAWQ side; that may need multiple 
code changes. I think keeping the privileges in each component is another 
choice. Your thoughts?

Thanks
Lili


> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Comment Edited] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-31 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451558#comment-15451558
 ] 

Lili Ma edited comment on HAWQ-256 at 8/31/16 8:24 AM:
---

[~thebellhead], quite good questions!

1. In order for tools, syntax checking, etc. to work, everyone (the HAWQ 
public role) requires access to the catalog and some of the toolkit. Will 
Ranger-only access control apply only to user-created tables, views and 
external tables?
Yes, since the catalog tables and toolkits are shared by various users, 
Ranger-only access control applies only to user-defined objects. But the 
objects include not only databases, tables and views, but also functions, 
languages, schemas, tablespaces and protocols. You can find the detailed 
objects and privileges in the design doc. I have reviewed your proposal in 
HAWQ-1036; could you share your handling of the objects which don't lie in 
the HDFS layer, such as functions, schemas, languages, etc.?

2. If so, will gpadmin and any other HAWQ-defined roles not have access to the 
data in Ranger-managed tables?
Just as you mentioned, HAWQ uses the gpadmin identity to create files on HDFS; 
that is, when a given userA creates a table in HAWQ, the HDFS files for the 
table are created by gpadmin instead of userA. Since Ranger lies in the Hadoop 
ecosystem and usually needs to control both HAWQ and HDFS, I think we need to 
assign gpadmin full privileges on the HAWQ data file directory on HDFS in the 
Ranger UI beforehand.

About your concern that the superuser can see all the users' data: I think 
it's kind of like the "root" role in an operating system. If users have 
concerns about the DBA/superuser's unlimited access, I totally agree with your 
solution of passing down the user identity to solve this problem :)

3. How would this be extended for the hcatalog virtual database in HAWQ? Could 
the Ranger permissions for the underlying store (for instance Hive) be read 
and enforced/reported at the HAWQ level?
If HAWQ keeps gpadmin for operating HDFS or external storage, I think we just 
need to grant the privilege to the superuser. But if we have implemented 
user-identity passing down, so that the data files on HDFS for a table created 
by userA are owned by userA instead of gpadmin, then we need to connect to 
Ranger twice, from HAWQ and from HDFS respectively. I haven't included the 
underlying store's privilege checks on the HAWQ side; that may need multiple 
code changes. I think keeping the privileges in each component is another 
choice. Your thoughts?

Thanks
Lili



was (Author: lilima):
[~thebellhead], quit good questions!

1. In order for tools, syntax checking, etc to work everyone (the HAWQ public 
role) requires access to the catalog and some of the toolkit. Will Ranger-only 
access control apply only to user created tables, views and external tables?
Yes, since the catalog tables and toolkits are shared and used by various 
users, Ranger-only access control just applies to user defined objects.  But 
the objects include not only database, table and view, but also include 
function, language, schema, tablespace and protocol. You can find the detailed 
objects and privileges in the design doc.

2. If so - will gpadmin and any other HAWQ-defined roles not have access to the 
data in Ranger managed tables?
Just as you mentioned, HAWQ uses gpadmin identity to create files on HDFS, say, 
when a specified userA creates a table in HAWQ, the HDFS files for the table 
are created by gpadmin instead of userA. Since Ranger lies in Hadoop 
eco-system, it usually needs to control both HAWQ and HDFS, I think we need 
assign gpadmin to the full privileges of hawq data file directory on HDFS in 
Ranger UI previously. 

About your concern about the superuser can see all the users' data, I think 
it's kind of like the "root" role in operation system?  If the users have 
concerns about the DBA/Superuser's unlimited access, I totally agree with you 
about the solution of "passing down user-identifiy" for solving this problem :)

3. How would this be extended for the hcatalog virtual database in HAWQ? Could 
the Ranger permissions for the underlying store (for instance Hive) be read and 
enforced/reported at the HAWQ level?
If HAWQ keeps the gpadmin for operating HDFS or external storage, I think we 
just need grant the privilege to superuser. But if we have implemented the 
user-identity passing down, say, the data files on HDFS for a table created by 
userA are owned by userA instead of gpadmin, in this way we need to double 
connect to Ranger, from HAWQ and HDFS respectively.  I haven't include the 
underlying store privileges check into HAWQ side, that may need multiple code 
changes. I think keeping the privileges in the component is another choice. 
Your thoughts?

Thanks
Lili


> Integrate Security with Apache Ranger
> --

[jira] [Commented] (HAWQ-256) Integrate Security with Apache Ranger

2016-08-31 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15454137#comment-15454137
 ] 

Lili Ma commented on HAWQ-256:
--

[~thebellhead]  From a technical view, we can definitely restrict the HAWQ 
superuser's privileges in Ranger.

But if we restrict that, HAWQ superuser behavior changes. I think this needs 
careful discussion, and it's out of the scope of this JIRA, right? Anyway, if 
everyone agrees to remove the superuser privileges, we can implement that 
function. Thanks

> Integrate Security with Apache Ranger
> -
>
> Key: HAWQ-256
> URL: https://issues.apache.org/jira/browse/HAWQ-256
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Michael Andre Pearce (IG)
>Assignee: Lili Ma
> Fix For: backlog
>
> Attachments: HAWQRangerSupportDesign.pdf, 
> HAWQRangerSupportDesign_v0.2.pdf
>
>
> Integrate security with Apache Ranger for a unified Hadoop security solution. 





[jira] [Commented] (HAWQ-1036) Support user impersonation in PXF for external tables

2016-09-01 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15454926#comment-15454926
 ] 

Lili Ma commented on HAWQ-1036:
---

Hello, I think passing down the user identity is a quite important area. 
I have several questions about this:
1. If we pass the privilege check down to HDFS or Hive, what about the objects 
which don't map to data storage, for example, languages, functions, schemas, 
etc.?
2. For table objects, how can we map the privileges to storage if the 
underlying storage is HDFS? For example, a table may have 
create/select/insert/update/delete privileges (although HAWQ doesn't support 
update/delete now, it may support these features later), while for an HDFS 
file we only have create/read/write/append. How shall we map them?
3. Do the privilege checks in this approach happen during query execution? I 
think HAWQ-256 does this in the planning period.
4. What if the Ranger admin wants to assign a table created by userA to 
userB? Does he need to find out the underlying file folder and assign that 
folder's privileges to userB? If yes, then he has to know the mapping between 
HAWQ tables and HDFS files, right?
5. My current understanding is that the PXF design uses a special user 
identity. What will happen after the change? Will multiple users have access 
to the external storage? What if we support S3 in the future? Does S3 need to 
give privileges to all the users in HAWQ?
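Question 2 is essentially about a lossy mapping like the sketch below. This table is purely illustrative; neither HAWQ nor Ranger defines such a mapping, and several SQL verbs would have to collapse onto a single HDFS permission.

```python
# Hypothetical mapping from SQL-level privileges to the coarser HDFS file
# permissions they would need to be folded into. Note that update/delete
# have no direct HDFS equivalent and collapse onto 'write'.
SQL_TO_HDFS = {
    "select": "read",
    "insert": "write",
    "update": "write",
    "delete": "write",
    "create": "write",
}

def hdfs_perms_for(sql_privs):
    """Collapse a set of SQL privileges into the HDFS permissions needed."""
    return {SQL_TO_HDFS[p] for p in sql_privs}
```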

Thanks
Lili


> Support user impersonation in PXF for external tables
> -
>
> Key: HAWQ-1036
> URL: https://issues.apache.org/jira/browse/HAWQ-1036
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: PXF, Security
>Reporter: Alastair "Bell" Turner
>Assignee: Goden Yao
>Priority: Critical
> Fix For: backlog
>
> Attachments: HAWQ_Impersonation_rationale.txt
>
>
> Currently HAWQ executes all queries as the user running the HAWQ process or 
> the user running the PXF process, not as the user who issued the query via 
> ODBC/JDBC/... This restricts the options available for integrating with 
> existing security defined in HDFS, Hive, etc.
> Impersonation provides an alternative Ranger integration (as discussed in 
> HAWQ-256 ) for consistent security across HAWQ, HDFS, Hive...





[jira] [Updated] (HAWQ-1004) Implement calling Ranger REST Service -- use mock server

2016-09-04 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1004:
--
Summary: Implement calling Ranger REST Service -- use mock server  (was: 
Decide How HAWQ connect Ranger, through which user, how to connect to REST 
Server)

> Implement calling Ranger REST Service -- use mock server
> 
>
> Key: HAWQ-1004
> URL: https://issues.apache.org/jira/browse/HAWQ-1004
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Lili Ma
>Assignee: Lin Wen
> Fix For: backlog
>
>
> Decide how HAWQ connects to Ranger: through which user, and how to connect 
> to the REST server.
> Acceptance Criteria: 
> Provide an interface for HAWQ to connect to the Ranger REST server.





[jira] [Updated] (HAWQ-1032) Bucket number of newly added partition is not consistent with parent table.

2016-09-05 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1032:
--
Description: 
Failure Case
{code}
set default_hash_table_bucket_number = 12;
CREATE TABLE sales3 (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( START (date '2008-01-01') INCLUSIVE
  END (date '2009-01-01') EXCLUSIVE
  EVERY (INTERVAL '1 day') );

set default_hash_table_bucket_number = 16;
ALTER TABLE sales3 ADD PARTITION
  START (date '2009-03-01') INCLUSIVE
  END (date '2009-04-01') EXCLUSIVE;
{code}

The newly added partition with bucket number 16 is not consistent with the 
parent partition.

  was:
Failure Case
{code}
set deafult_hash_table_bucket_number = 12;
CREATE TABLE sales3 (id int, date date, amt decimal(10,2))
DISTRIBUTED BY (id)
PARTITION BY RANGE (date)
( START (date '2008-01-01') INCLUSIVE
  END (date '2009-01-01') EXCLUSIVE
  EVERY (INTERVAL '1 day') );

set deafult_hash_table_bucket_number = 16;
ALTER TABLE sales3 ADD PARTITION
  START (date '2009-03-01') INCLUSIVE
  END (date '2009-04-01') EXCLUSIVE;
{code}

The newly added partition with bucket number 16 is not consistent with the 
parent partition.


> Bucket number of newly added partition is not consistent with parent table.
> ---
>
> Key: HAWQ-1032
> URL: https://issues.apache.org/jira/browse/HAWQ-1032
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Core
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
> Fix For: 2.0.1.0-incubating
>
>
> Failure Case
> {code}
> set default_hash_table_bucket_number = 12;
> CREATE TABLE sales3 (id int, date date, amt decimal(10,2))
> DISTRIBUTED BY (id)
> PARTITION BY RANGE (date)
> ( START (date '2008-01-01') INCLUSIVE
>   END (date '2009-01-01') EXCLUSIVE
>   EVERY (INTERVAL '1 day') );
> set default_hash_table_bucket_number = 16;
> ALTER TABLE sales3 ADD PARTITION
>   START (date '2009-03-01') INCLUSIVE
>   END (date '2009-04-01') EXCLUSIVE;
> {code}
> The newly added partition with bucket number 16 is not consistent with the 
> parent partition.





[jira] [Created] (HAWQ-1044) Verify the correctness of hawq register

2016-09-12 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1044:
-

 Summary: Verify the correctness of hawq register
 Key: HAWQ-1044
 URL: https://issues.apache.org/jira/browse/HAWQ-1044
 Project: Apache HAWQ
  Issue Type: Sub-task
Reporter: Lili Ma
Assignee: Lei Chang


Verify the correctness of hawq register: summarize all the use scenarios and 
design corresponding test cases for them.





[jira] [Updated] (HAWQ-1044) Verify the correctness of hawq register

2016-09-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1044:
--
Assignee: hongwu  (was: Lei Chang)

> Verify the correctness of hawq register
> ---
>
> Key: HAWQ-1044
> URL: https://issues.apache.org/jira/browse/HAWQ-1044
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: hongwu
> Fix For: backlog
>
>
> Verify the correctness of hawq register: summarize all the use scenarios and 
> design corresponding test cases for them.
> I think the following test cases should be added for hawq register.
> 1. Use Case 1: Register a file/folder into HAWQ by specifying the file/folder 
> name
> a) hawq register -d postgres -f a.file tableA
> b) hawq register -d postgres -f a.file -e eof tableA
> c) hawq register -d postgres -f folderA tableA
> d) Register a file to an existing table: normal path.
> e) Register a file to an existing table: error path, where the to-be-registered 
> files are already under the file folder of the existing table on HDFS. Should 
> throw an error.
> f) Verify wrong input file: the file is not in parquet format.
> 2. Use Case 2: Register into a HAWQ table using a .yml configuration file, to 
> a non-existing table
> a) Verify normal input:
> create table a(a int, b int);
> insert into a values(generate_series(1,100), 25);
> hawq extract -d postgres -o a.yml a
> hawq register -d postgres -c a.yml b
> b) Modify the fileSize in the .yml file to a value that differs from the 
> actual size of the data file.
> 3. Use Case 2: Register into a HAWQ table using a .yml configuration file, to 
> an existing table
> a) Verify normal path:
> Call hawq register multiple times to verify it succeeds. Each time, the 
> to-be-registered files are not under the table directory.
> b) Error path: the to-be-registered files are under the file folder of the 
> existing table on HDFS.
> Should throw an error: not supported.
> 4. Use Case 2: Register into a HAWQ table using a .yml configuration file with 
> the --force option
> a) The table does not exist: should create a new table and do the register.
> b) The table already exists but has no data: can directly call hawq register.
> c) The table already exists and has data -- normal path: the .yml 
> configuration file includes exactly the data files under the table directory.
> d) The table already exists and has data -- normal path: the .yml 
> configuration file includes the data files under the table directory, and 
> also includes data files not under the table directory.
> e) The table already exists and has data -- error path: the .yml configuration 
> file doesn't include the data files under that table directory. Should throw 
> an error: "there are already existing files under the table, but not included 
> in .yml configuration file"
> 5. Use Case 2: Register into a HAWQ table using a .yml configuration file with 
> the --repair option
> a) Normal Path 1 (append to existing file):
> create a tableA
> insert some data into tableA
> call hawq extract to extract the metadata to a.yml
> insert new data into tableA
> call hawq register with the --repair option to roll back to the checkpointed 
> state
> b) Normal Path 2 (new files generated):
> Same as Normal Path 1, but during the second insert, use multiple concurrent 
> inserts aiming at producing new files. Then call hawq register --repair; the 
> new files should be discarded.
> c) Error Path: redistributed
> Create a hash-distributed table, distributed by column A
> insert some data into tableA
> call hawq extract to extract the metadata to a.yml
> alter the table to redistribute by column B
> insert new data into tableA
> call hawq register with the --repair option to roll back to the checkpointed 
> state
> --> should throw the error "the table is redistributed"
> d) Error Path: table truncated
> Create a hash-distributed table, distributed by column A
> insert some data into tableA
> call hawq extract to extract the metadata to a.yml
> truncate tableA
> call hawq register with the --repair option to roll back to the checkpointed 
> state
> --> should throw the error "the table becomes smaller than the .yml config 
> file specified."
> e) Error Path: files specified in the .yml configuration are not under the 
> data directory of table A
> --> should throw the error "the files should all be under the table directory 
> when the --repair option is specified for hawq register"
> 6. hawq register partition table support
> a) Normal Path: create a 1-level partition table; calling hawq extract and 
> then hawq register works.
> b) Error Path: create a 2-level partition table, calling hawq extract and 
> then hawq register
> --> should throw the error "only supports 1-level partition table"





[jira] [Updated] (HAWQ-1044) Verify the correctness of hawq register

2016-09-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1044:
--
Description: 
Verify the correctness of hawq register: summarize all the use scenarios and 
design corresponding test cases for them.

I think the following test cases should be added for hawq register.
1. Use Case 1: Register a file/folder into HAWQ by specifying the file/folder 
name
a) hawq register -d postgres -f a.file tableA
b) hawq register -d postgres -f a.file -e eof tableA
c) hawq register -d postgres -f folderA tableA
d) Register a file to an existing table: normal path.
e) Register a file to an existing table: error path, where the to-be-registered 
files are already under the file folder of the existing table on HDFS. Should 
throw an error.
f) Verify wrong input file: the file is not in parquet format.

2. Use Case 2: Register into a HAWQ table using a .yml configuration file, to a 
non-existing table
a) Verify normal input:
create table a(a int, b int);
insert into a values(generate_series(1,100), 25);
hawq extract -d postgres -o a.yml a
hawq register -d postgres -c a.yml b
b) Modify the fileSize in the .yml file to a value that differs from the actual 
size of the data file.

3. Use Case 2: Register into a HAWQ table using a .yml configuration file, to an 
existing table
a) Verify normal path:
Call hawq register multiple times to verify it succeeds. Each time, the 
to-be-registered files are not under the table directory.
b) Error path: the to-be-registered files are under the file folder of the 
existing table on HDFS.
Should throw an error: not supported.

4. Use Case 2: Register into a HAWQ table using a .yml configuration file with 
the --force option
a) The table does not exist: should create a new table and do the register.
b) The table already exists but has no data: can directly call hawq register.
c) The table already exists and has data -- normal path: the .yml configuration 
file includes exactly the data files under the table directory.
d) The table already exists and has data -- normal path: the .yml configuration 
file includes the data files under the table directory, and also includes data 
files not under the table directory.
e) The table already exists and has data -- error path: the .yml configuration 
file doesn't include the data files under that table directory. Should throw an 
error: "there are already existing files under the table, but not included in 
.yml configuration file"

5. Use Case 2: Register into a HAWQ table using a .yml configuration file with 
the --repair option
a) Normal Path 1 (append to existing file):
create a tableA
insert some data into tableA
call hawq extract to extract the metadata to a.yml
insert new data into tableA
call hawq register with the --repair option to roll back to the checkpointed 
state
b) Normal Path 2 (new files generated):
Same as Normal Path 1, but during the second insert, use multiple concurrent 
inserts aiming at producing new files. Then call hawq register --repair; the 
new files should be discarded.
c) Error Path: redistributed
Create a hash-distributed table, distributed by column A
insert some data into tableA
call hawq extract to extract the metadata to a.yml
alter the table to redistribute by column B
insert new data into tableA
call hawq register with the --repair option to roll back to the checkpointed 
state
--> should throw the error "the table is redistributed"
d) Error Path: table truncated
Create a hash-distributed table, distributed by column A
insert some data into tableA
call hawq extract to extract the metadata to a.yml
truncate tableA
call hawq register with the --repair option to roll back to the checkpointed 
state
--> should throw the error "the table becomes smaller than the .yml config file 
specified."
e) Error Path: files specified in the .yml configuration are not under the data 
directory of table A
--> should throw the error "the files should all be under the table directory 
when the --repair option is specified for hawq register"

6. hawq register partition table support
a) Normal Path: create a 1-level partition table; calling hawq extract and then 
hawq register works.
b) Error Path: create a 2-level partition table, calling hawq extract and then 
hawq register
--> should throw the error "only supports 1-level partition table"
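Case 2b above hinges on a fileSize mismatch between the .yml metadata and the 
data file. A minimal sketch of the kind of size validation involved, assuming 
the .yml has already been parsed into a dict (the Files/path/size key names are 
illustrative, not the actual hawq extract schema):

```python
import os
import tempfile

def validate_file_sizes(parsed_yml):
    # Compare each recorded size against the actual on-disk size and
    # collect a human-readable message per mismatch.
    errors = []
    for f in parsed_yml["Files"]:
        actual = os.path.getsize(f["path"])
        if actual != f["size"]:
            errors.append("%s: yml says %d bytes, actual is %d"
                          % (f["path"], f["size"], actual))
    return errors

# usage: a temp file whose recorded size is wrong should be reported
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 10)

config = {"Files": [{"path": tmp.name, "size": 9}]}
mismatches = validate_file_sizes(config)
```

A register tool doing this check would refuse the registration (or warn) on a 
non-empty result rather than silently recording a wrong EOF in the catalog.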

  was:Verify the correctness of hawq register, summary all the use scenarios 
and design corresponding test cases for it.


> Verify the correctness of hawq register
> ---
>
> Key: HAWQ-1044
> URL: https://issues.apache.org/jira/browse/HAWQ-1044
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Lei Chang
> Fix For: backlog
>
>
> Verify the correctness of hawq register: summarize all the use scenarios and 
> design corresponding test cases for them.
> I think the following test cases should be added for hawq register.
> 1. Use Case 1: Register 

[jira] [Updated] (HAWQ-1035) support partition table register

2016-09-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1035:
--
Description: 
Support partition table register, limited to 1 level partition table, since 
hawq extract only supports 1-level partition table.

Expected behavior:
1. Create a partition table in HAWQ, then extract the information out to .yml 
file
2. Call hawq register and specify identified .yml file and a new table name, 
the files should be registered into the new table.

Works can be detailed down to implementation partition table registeration:
1. modify .yml configuration file parsing function, add content for partition 
table.
2. construct partition table DDL regards to .yml configuration file
3. map sub partition table name to the table list in .yml configuration file
4. register the subpartition table one by one

  was:Support partitiont table register, limited to 1 level partition table, 
since hawq extract only supports 1-level partition table


> support partition table register
> 
>
> Key: HAWQ-1035
> URL: https://issues.apache.org/jira/browse/HAWQ-1035
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> Support partition table register, limited to 1 level partition table, since 
> hawq extract only supports 1-level partition table.
> Expected behavior:
> 1. Create a partition table in HAWQ, then extract the information out to .yml 
> file
> 2. Call hawq register and specify identified .yml file and a new table name, 
> the files should be registered into the new table.
> Works can be detailed down to implementation partition table registeration:
> 1. modify .yml configuration file parsing function, add content for partition 
> table.
> 2. construct partition table DDL regards to .yml configuration file
> 3. map sub partition table name to the table list in .yml configuration file
> 4. register the subpartition table one by one





[jira] [Updated] (HAWQ-1035) support partition table register

2016-09-12 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1035:
--
Description: 
Support partition table register, limited to 1 level partition table, since 
hawq extract only supports 1-level partition table.

Expected behavior:
1. Create a partition table in HAWQ, then extract the information out to .yml 
file
2. Call hawq register and specify identified .yml file and a new table name, 
the files should be registered into the new table.

The work can be broken down into the following steps to implement partition 
table registration:
1. modify the .yml configuration file parsing function, adding content for 
partition tables
2. construct the partition table DDL according to the .yml configuration file
3. map each sub-partition table name to the table list in the .yml 
configuration file
4. register the sub-partition tables one by one

  was:
Support partition table register, limited to 1 level partition table, since 
hawq extract only supports 1-level partition table.

Expected behavior:
1. Create a partition table in HAWQ, then extract the information out to .yml 
file
2. Call hawq register and specify identified .yml file and a new table name, 
the files should be registered into the new table.

Works can be detailed down to implementation partition table registeration:
1. modify .yml configuration file parsing function, add content for partition 
table.
2. construct partition table DDL regards to .yml configuration file
3. map sub partition table name to the table list in .yml configuration file
4. register the subpartition table one by one


> support partition table register
> 
>
> Key: HAWQ-1035
> URL: https://issues.apache.org/jira/browse/HAWQ-1035
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> Support partition table register, limited to 1 level partition table, since 
> hawq extract only supports 1-level partition table.
> Expected behavior:
> 1. Create a partition table in HAWQ, then extract the information out to .yml 
> file
> 2. Call hawq register and specify identified .yml file and a new table name, 
> the files should be registered into the new table.
> The work can be broken down into the following steps to implement partition 
> table registration:
> 1. modify the .yml configuration file parsing function, adding content for 
> partition tables
> 2. construct the partition table DDL according to the .yml configuration file
> 3. map each sub-partition table name to the table list in the .yml 
> configuration file
> 4. register the sub-partition tables one by one
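A minimal sketch of steps 3 and 4 above, with the parsed .yml content 
represented as a plain dict. The key names and the register_files helper are 
hypothetical stand-ins, not the actual hawq extract schema or hawq register 
internals:

```python
# Stand-in for the parsed .yml configuration (key names are illustrative).
config = {
    "TableName": "sales",
    "Partitions": [
        {"Name": "sales_1_prt_1", "Files": ["/hawq/16385/1/f1"]},
        {"Name": "sales_1_prt_2", "Files": ["/hawq/16385/2/f1"]},
    ],
}

def register_files(table, files):
    # Placeholder for the real per-partition register call.
    return "registered %d file(s) into %s" % (len(files), table)

# step 3: map each sub-partition table name to its file list
files_by_partition = {p["Name"]: p["Files"] for p in config["Partitions"]}

# step 4: register the sub-partition tables one by one
results = [register_files(name, files)
           for name, files in files_by_partition.items()]
```

The point of the mapping step is that each sub-partition is an independent 
register target; a failure in one can then be reported against that 
sub-partition's name rather than the parent table.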





[jira] [Commented] (HAWQ-1044) Verify the correctness of hawq register

2016-09-12 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15486031#comment-15486031
 ] 

Lili Ma commented on HAWQ-1044:
---

We need to design our test cases to verify hawq register along the following 
dimensions:
1. partition table / non-partition table
2. format: row-oriented / parquet
3. randomly distributed / hash distributed
4. partition policy: range partition or list partition
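A hypothetical sketch of enumerating that test matrix (the labels are 
illustrative, not real test names); note the partition-policy dimension only 
applies to partition tables:

```python
from itertools import product

# Dimensions taken from the comment above; labels are illustrative.
table_kinds = ["non-partition", "partition"]
formats = ["row-oriented", "parquet"]
distributions = ["random", "hash"]
policies = ["range", "list"]  # only meaningful for partition tables

cases = [
    (kind, fmt, dist, policy)
    for kind, fmt, dist, policy in product(table_kinds, formats,
                                           distributions, policies)
    # the policy dimension collapses for non-partition tables
    if kind == "partition" or policy == "range"
]
# 2*2*2 = 8 partitioned combinations plus 2*2 = 4 non-partitioned ones
```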

> Verify the correctness of hawq register
> ---
>
> Key: HAWQ-1044
> URL: https://issues.apache.org/jira/browse/HAWQ-1044
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: hongwu
> Fix For: backlog
>
>
> Verify the correctness of hawq register: summarize all the use scenarios and 
> design corresponding test cases for them.
> I think the following test cases should be added for hawq register.
> 1. Use Case 1: Register a file/folder into HAWQ by specifying the file/folder 
> name
> a) hawq register -d postgres -f a.file tableA
> b) hawq register -d postgres -f a.file -e eof tableA
> c) hawq register -d postgres -f folderA tableA
> d) Register a file to an existing table: normal path.
> e) Register a file to an existing table: error path, where the to-be-registered 
> files are already under the file folder of the existing table on HDFS. Should 
> throw an error.
> f) Verify wrong input file: the file is not in parquet format.
> 2. Use Case 2: Register into a HAWQ table using a .yml configuration file, to 
> a non-existing table
> a) Verify normal input:
> create table a(a int, b int);
> insert into a values(generate_series(1,100), 25);
> hawq extract -d postgres -o a.yml a
> hawq register -d postgres -c a.yml b
> b) Modify the fileSize in the .yml file to a value that differs from the 
> actual size of the data file.
> 3. Use Case 2: Register into a HAWQ table using a .yml configuration file, to 
> an existing table
> a) Verify normal path:
> Call hawq register multiple times to verify it succeeds. Each time, the 
> to-be-registered files are not under the table directory.
> b) Error path: the to-be-registered files are under the file folder of the 
> existing table on HDFS.
> Should throw an error: not supported.
> 4. Use Case 2: Register into a HAWQ table using a .yml configuration file with 
> the --force option
> a) The table does not exist: should create a new table and do the register.
> b) The table already exists but has no data: can directly call hawq register.
> c) The table already exists and has data -- normal path: the .yml 
> configuration file includes exactly the data files under the table directory.
> d) The table already exists and has data -- normal path: the .yml 
> configuration file includes the data files under the table directory, and 
> also includes data files not under the table directory.
> e) The table already exists and has data -- error path: the .yml configuration 
> file doesn't include the data files under that table directory. Should throw 
> an error: "there are already existing files under the table, but not included 
> in .yml configuration file"
> 5. Use Case 2: Register into a HAWQ table using a .yml configuration file with 
> the --repair option
> a) Normal Path 1 (append to existing file):
> create a tableA
> insert some data into tableA
> call hawq extract to extract the metadata to a.yml
> insert new data into tableA
> call hawq register with the --repair option to roll back to the checkpointed 
> state
> b) Normal Path 2 (new files generated):
> Same as Normal Path 1, but during the second insert, use multiple concurrent 
> inserts aiming at producing new files. Then call hawq register --repair; the 
> new files should be discarded.
> c) Error Path: redistributed
> Create a hash-distributed table, distributed by column A
> insert some data into tableA
> call hawq extract to extract the metadata to a.yml
> alter the table to redistribute by column B
> insert new data into tableA
> call hawq register with the --repair option to roll back to the checkpointed 
> state
> --> should throw the error "the table is redistributed"
> d) Error Path: table truncated
> Create a hash-distributed table, distributed by column A
> insert some data into tableA
> call hawq extract to extract the metadata to a.yml
> truncate tableA
> call hawq register with the --repair option to roll back to the checkpointed 
> state
> --> should throw the error "the table becomes smaller than the .yml config 
> file specified."
> e) Error Path: files specified in the .yml configuration are not under the 
> data directory of table A
> --> should throw the error "the files should all be under the table directory 
> when the --repair option is specified for hawq register"
> 6. hawq register partition table support
> a) Normal Path: create a 1-level partition table; calling hawq extract and 
> then hawq register works.
> b) Error Path: create a 2-level partition table, calling hawq extract and 
> then hawq register
> --> should throw the error "only supports 1-level partition table"

[jira] [Created] (HAWQ-1050) hawq register help can not return correct result indicating the help information

2016-09-13 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1050:
-

 Summary: hawq register help can not return correct result 
indicating the help information
 Key: HAWQ-1050
 URL: https://issues.apache.org/jira/browse/HAWQ-1050
 Project: Apache HAWQ
  Issue Type: Bug
Reporter: Lili Ma
Assignee: Lei Chang


hawq register help cannot return the correct result indicating the help 
information.
It should keep help as a keyword and return the same results as hawq register 
--help.

{code}
malilis-MacBook-Pro:~ malili$ hawq register help
20160914:09:56:37:007364 hawqregister:malilis-MacBook-Pro:malili-[INFO]:-Usage: 
hadoop [--config confdir] COMMAND
   where COMMAND is one of:
  fs   run a generic filesystem user client
  version  print the version
  jar run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries 
availability
  distcp   copy file or directories recursively
  archive -archiveName NAME -p  *  create a hadoop 
archive
  classpathprints the class path needed to get the
  credential   interact with credential providers
   Hadoop jar and the required libraries
  daemonlogget/set the log level for each daemon
  traceview and modify Hadoop tracing settings
 or
  CLASSNAMErun the class named CLASSNAME

Most commands print help when invoked w/o parameters.
Traceback (most recent call last):
  File "/usr/local/hawq/bin/hawqregister", line 398, in <module>
check_hash_type(dburl, tablename) # Usage1 only support randomly 
distributed table
  File "/usr/local/hawq/bin/hawqregister", line 197, in check_hash_type
logger.error('Table not found in table gp_distribution_policy.' % tablename)
TypeError: not all arguments converted during string formatting
{code}
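The traceback comes down to a printf-style formatting bug: the format string 
passed to logger.error contains no %s placeholder, so the % operator itself 
raises before any logging happens. A minimal sketch of the failure and one 
possible fix (the message text is taken from the traceback above):

```python
import logging

logger = logging.getLogger("hawqregister")
tablename = "sales3"

# Buggy pattern from the traceback: no %s in the format string, so the %
# operator has an unconsumed argument and raises TypeError before
# logger.error is even called.
try:
    logger.error('Table not found in table gp_distribution_policy.' % tablename)
except TypeError as err:
    message = str(err)

# One possible fix: add the placeholder, or pass the argument to the logger,
# which performs the formatting lazily.
fixed = 'Table %s not found in table gp_distribution_policy.' % tablename
logger.error('Table %s not found in table gp_distribution_policy.', tablename)
```

The lazy-logger form is generally preferred, since the string is only 
interpolated when the record is actually emitted.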





[jira] [Updated] (HAWQ-1050) hawq register help can not return correct result indicating the help information

2016-09-13 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1050:
--
Issue Type: Sub-task  (was: Bug)
Parent: HAWQ-991

> hawq register help can not return correct result indicating the help 
> information
> 
>
> Key: HAWQ-1050
> URL: https://issues.apache.org/jira/browse/HAWQ-1050
> Project: Apache HAWQ
>  Issue Type: Sub-task
>Reporter: Lili Ma
>Assignee: Lei Chang
>
> hawq register help cannot return the correct result indicating the help 
> information.
> It should keep help as a keyword and return the same results as hawq register 
> --help.
> {code}
> malilis-MacBook-Pro:~ malili$ hawq register help
> 20160914:09:56:37:007364 
> hawqregister:malilis-MacBook-Pro:malili-[INFO]:-Usage: hadoop [--config 
> confdir] COMMAND
>where COMMAND is one of:
>   fs   run a generic filesystem user client
>   version  print the version
>   jar run a jar file
>   checknative [-a|-h]  check native hadoop and compression libraries 
> availability
>   distcp   copy file or directories recursively
>   archive -archiveName NAME -p  *  create a hadoop 
> archive
>   classpathprints the class path needed to get the
>   credential   interact with credential providers
>Hadoop jar and the required libraries
>   daemonlogget/set the log level for each daemon
>   traceview and modify Hadoop tracing settings
>  or
>   CLASSNAMErun the class named CLASSNAME
> Most commands print help when invoked w/o parameters.
> Traceback (most recent call last):
>   File "/usr/local/hawq/bin/hawqregister", line 398, in <module>
> check_hash_type(dburl, tablename) # Usage1 only support randomly 
> distributed table
>   File "/usr/local/hawq/bin/hawqregister", line 197, in check_hash_type
> logger.error('Table not found in table gp_distribution_policy.' % 
> tablename)
> TypeError: not all arguments converted during string formatting
> {code}





[jira] [Created] (HAWQ-1061) Improve hawq register for bugs already found

2016-09-18 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1061:
-

 Summary: Improve hawq register for bugs already found
 Key: HAWQ-1061
 URL: https://issues.apache.org/jira/browse/HAWQ-1061
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang


Fix the bugs found by the verification process





[jira] [Reopened] (HAWQ-1034) add --repair option for hawq register

2016-10-09 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma reopened HAWQ-1034:
---

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table 
> pg_aoseg.pg_paqseg_$relid to the state that the .yml file configures. Note 
> that some files newly generated since the checkpoint may be deleted here. 
> Also note that all the files in the .yml file should be under the table 
> folder on HDFS. Limitation: does not support cases of hash table 
> redistribution, table truncation and table drop. This is for the table 
> rollback scenario: take a checkpoint somewhere, then roll back to that 
> previous checkpoint. 





[jira] [Created] (HAWQ-1104) Add tupcount, varblockcount and eofuncompressed value in hawq extract yaml configuration, also add implementation in hawq register to recognize these values

2016-10-14 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1104:
-

 Summary: Add tupcount, varblockcount and eofuncompressed value in 
hawq extract yaml configuration, also add implementation in hawq register to 
recognize these values  
 Key: HAWQ-1104
 URL: https://issues.apache.org/jira/browse/HAWQ-1104
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang
 Fix For: 2.0.1.0-incubating


Add tupcount, varblockcount and eofuncompressed values to the hawq extract yaml 
configuration, and add implementation in hawq register to recognize these 
values so that the information in the catalog table pg_aoseg.pg_aoseg_$relid or 
pg_aoseg.pg_paqseg_$relid becomes correct.  

After this work, the information in the catalog table will be correct if we 
register a table according to the yaml configuration file generated from 
another table.





[jira] [Updated] (HAWQ-1104) Add tupcount, varblockcount and eofuncompressed value in hawq extract yaml configuration, also add implementation in hawq register to recognize these values

2016-10-14 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1104:
--
Assignee: hongwu  (was: Lei Chang)

> Add tupcount, varblockcount and eofuncompressed value in hawq extract yaml 
> configuration, also add implementation in hawq register to recognize these 
> values  
> --
>
> Key: HAWQ-1104
> URL: https://issues.apache.org/jira/browse/HAWQ-1104
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> Add tupcount, varblockcount and eofuncompressed values to the hawq extract 
> yaml configuration, and add implementation in hawq register to recognize 
> these values so that the information in the catalog table 
> pg_aoseg.pg_aoseg_$relid or pg_aoseg.pg_paqseg_$relid becomes correct.  
> After this work, the information in the catalog table will be correct if we 
> register a table according to the yaml configuration file generated from 
> another table.





[jira] [Commented] (HAWQ-1034) add --repair option for hawq register

2016-10-31 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624133#comment-15624133
 ] 

Lili Ma commented on HAWQ-1034:
---

Repair mode can be thought of as a particular case of force mode.
1) Force mode registers the files according to the yaml configuration file, 
erases all the records in the catalog (pg_aoseg.pg_aoseg(paqseg)_$relid) and 
re-does the catalog inserts. It requires that the HDFS files for the table be 
included in the yaml configuration file.
2) Repair mode also registers files according to the yaml configuration file, 
erases the catalog records and re-inserts them. But it doesn't require that all 
the HDFS files for the table be included in the yaml configuration file: it 
will directly delete those files which are under the table directory but not 
included in the yaml configuration file. 
I'm a little concerned about directly deleting HDFS files: if a user uses 
repair mode by mistake, his/her data may be deleted. So, what if we just allow 
them to use force mode, and throw an error for files under the directory but 
not included in the yaml configuration file? If the user does think the files 
are unnecessary, he/she can delete them himself/herself.

The workaround for supporting the repair use case with the --force option:
1) If no files have been added since the last checkpoint, where the yaml 
configuration file was generated, force mode can directly handle it.
2) If some files have been added since the last checkpoint which the user does 
want to delete, we can output that file information in force mode so that users 
can delete those files by themselves and then run register in force mode again. 

Since we can use force mode to implement the repair feature, we will remove the 
existing code for repair mode and close this JIRA.  Thanks

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table 
> pg_aoseg.pg_paqseg_$relid to the state that the .yml file configures. Note 
> that some files newly generated since the checkpoint may be deleted here. 
> Also note that all the files in the .yml file should be under the table 
> folder on HDFS. Limitation: does not support cases of hash table 
> redistribution, table truncation and table drop. This is for the table 
> rollback scenario: take a checkpoint somewhere, then roll back to that 
> previous checkpoint. 





[jira] [Resolved] (HAWQ-1034) add --repair option for hawq register

2016-10-31 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma resolved HAWQ-1034.
---
Resolution: Done

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
> to the state the .yml file configures. Note that some files newly generated 
> since the checkpoint may be deleted here. Also note that all the files in the 
> .yml file should be under the table folder on HDFS. Limitation: does not 
> support cases of hash table redistribution, table truncate, and table drop. 
> This is for the table rollback scenario: do checkpoints somewhere, and roll 
> back to a previous checkpoint when needed. 





[jira] [Comment Edited] (HAWQ-1034) add --repair option for hawq register

2016-10-31 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15624133#comment-15624133
 ] 

Lili Ma edited comment on HAWQ-1034 at 11/1/16 2:44 AM:


Repair mode can be thought of as a particular case of force mode.
1) Force mode registers the files according to the yaml configuration file: it 
erases all the records in the catalog (pg_aoseg.pg_aoseg(paqseg)_$relid) and 
re-inserts them. It requires that all HDFS files for the table be included in the 
yaml configuration file.
2) Repair mode also registers files according to the yaml configuration file, 
erasing the catalog records and re-inserting them. But it doesn't require that 
all the HDFS files for the table be included in the yaml configuration file: it 
directly deletes any files that are under the table directory but not included 
in the yaml configuration file.
Since repair mode may directly delete HDFS files (say, if a user uses repair 
mode by mistake, his/her data may be deleted), it brings some risk. We can allow 
force mode only, and throw an error for files under the directory but not 
included in the yaml configuration file. If users do think the files are 
unnecessary, they can delete the files themselves.

The workaround for supporting repair mode using the --force option:
1) If no files have been added since the last checkpoint at which the yaml 
configuration file was generated, force mode can directly handle it.
2) If some files have been added since the last checkpoint and the user does 
want to delete them, we can output that file information in force mode so that 
users can delete those files themselves and then run register in force mode 
again.

Since we can use force mode to implement the repair feature, we will remove the 
existing code for repair mode and close this JIRA.  Thanks


was (Author: lilima):
Repair mode can be thought of as a particular case of force mode.
1) Force mode registers the files according to the yaml configuration file: it 
erases all the records in the catalog (pg_aoseg.pg_aoseg(paqseg)_$relid) and 
re-inserts them. It requires that all HDFS files for the table be included in the 
yaml configuration file.
2) Repair mode also registers files according to the yaml configuration file, 
erasing the catalog records and re-inserting them. But it doesn't require that 
all the HDFS files for the table be included in the yaml configuration file: it 
directly deletes any files that are under the table directory but not included 
in the yaml configuration file.
I'm a little concerned about directly deleting HDFS files; say, if a user uses 
repair mode by mistake, his/her data may be deleted. So, what if we just allow 
force mode, and throw an error for files under the directory but not included in 
the yaml configuration file? If users do think the files are unnecessary, they 
can delete the files themselves.

The workaround for supporting repair mode using the --force option:
1) If no files have been added since the last checkpoint at which the yaml 
configuration file was generated, force mode can directly handle it.
2) If some files have been added since the last checkpoint and the user does 
want to delete them, we can output that file information in force mode so that 
users can delete those files themselves and then run register in force mode 
again.

Since we can use force mode to implement the repair feature, we will remove the 
existing code for repair mode and close this JIRA.  Thanks

> add --repair option for hawq register
> -
>
> Key: HAWQ-1034
> URL: https://issues.apache.org/jira/browse/HAWQ-1034
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> add --repair option for hawq register
> Will change both the file folder and the catalog table pg_aoseg.pg_paqseg_$relid 
> to the state the .yml file configures. Note that some files newly generated 
> since the checkpoint may be deleted here. Also note that all the files in the 
> .yml file should be under the table folder on HDFS. Limitation: does not 
> support cases of hash table redistribution, table truncate, and table drop. 
> This is for the table rollback scenario: do checkpoints somewhere, and roll 
> back to a previous checkpoint when needed. 





[jira] [Updated] (HAWQ-1035) support partition table register

2016-10-31 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1035:
--
Assignee: Chunling Wang  (was: Hubert Zhang)

> support partition table register
> 
>
> Key: HAWQ-1035
> URL: https://issues.apache.org/jira/browse/HAWQ-1035
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> Support partition table register, limited to 1-level partition tables, since 
> hawq extract only supports 1-level partition tables.
> Expected behavior:
> 1. Create a partition table in HAWQ, then extract its information out to a 
> .yml file.
> 2. Call hawq register, specifying the extracted .yml file and a new table 
> name; the files should be registered into the new table.
> The work to implement partition table register can be broken down as follows:
> 1. Modify the .yml configuration file parsing function, adding content for 
> partition tables.
> 2. Construct the partition table DDL according to the .yml configuration file.
> 3. Map each sub-partition table name to the table list in the .yml 
> configuration file.
> 4. Register the sub-partition tables one by one.
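The mapping step in the work breakdown above can be sketched as follows. This is a hypothetical illustration: the `_1_prt_` naming convention and the dict shape of the parsed yaml entries are assumptions, not the actual hawq register implementation:

```python
def match_partitions(parent, subpart_suffixes, yaml_tables):
    """Map each sub-partition table name to its entry in the yaml table
    list; raise if a sub-partition has no matching yaml entry."""
    by_name = {t["name"]: t for t in yaml_tables}
    mapping = {}
    for suffix in subpart_suffixes:
        # "_1_prt_" is the partition naming convention assumed here.
        name = "%s_1_prt_%s" % (parent, suffix)
        if name not in by_name:
            raise KeyError("no yaml entry for sub-partition %s" % name)
        mapping[name] = by_name[name]
    return mapping
```

Each matched entry can then be registered one by one, as step 4 describes.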





[jira] [Created] (HAWQ-1144) Register into a 2-level partition table, hawq register didn't throw error, and indicates that hawq register succeed, but no data can be selected out.

2016-11-03 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1144:
-

 Summary: Register into a 2-level partition table, hawq register 
didn't throw error, and indicates that hawq register succeed, but no data can 
be selected out.
 Key: HAWQ-1144
 URL: https://issues.apache.org/jira/browse/HAWQ-1144
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang
 Fix For: 2.0.1.0-incubating


When registering into a 2-level partition table, hawq register didn't throw an 
error and indicated that registration succeeded, but no data could be selected 
out.

Reproduce Steps:
1. Create a one-level partition table
```
 create table parquet_wt (id SERIAL,a1 int,a2 char(5),a3 numeric,a4 boolean 
DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 timestamp,a8 character 
varying(705),a9 bigint,a10 date,a11 varchar(600),a12 text,a13 decimal,a14 
real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with time zone,a19 timetz,a20 
path,a21 box,a22 macaddr,a23 interval,a24 character varying(800),a25 lseg,a26 
point,a27 double precision,a28 circle,a29 int4,a30 numeric(8),a31 polygon,a32 
date,a33 real,a34 money,a35 cidr,a36 inet,a37 time,a38 text,a39 bit,a40 bit 
varying(5),a41 smallint,a42 int )   WITH (appendonly=true, orientation=parquet) 
distributed randomly  Partition by range(a1) (start(1)  end(5000) every(1000) );
```
2. insert some data into this table
```
insert into parquet_wt 
(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38,a39,a40,a41,a42)
 values(generate_series(1,20),'M',2011,'t','a','This is news of today: Deadlock 
between Republicans and Democrats over how best to reduce the U.S. deficit, and 
over what period, has blocked an agreement to allow the raising of the $14.3 
trillion debt ceiling','2001-12-24 02:26:11','U.S. House of Representatives 
Speaker John Boehner, the top Republican in Congress who has put forward a 
deficit reduction plan to be voted on later on Thursday said he had no control 
over whether his bill would avert a credit 
downgrade.',generate_series(2490,2505),'2011-10-11','The Republican-controlled 
House is tentatively scheduled to vote on Boehner proposal this afternoon at 
around 6 p.m. EDT (2200 GMT). The main Republican vote counter in the House, 
Kevin McCarthy, would not say if there were enough votes to pass the 
bill.','WASHINGTON:House Speaker John Boehner says his plan mixing spending 
cuts in exchange for raising the nations $14.3 trillion debt limit is not 
perfect but is as large a step that a divided government can take that is 
doable and signable by President Barack Obama.The Ohio Republican says the 
measure is an honest and sincere attempt at compromise and was negotiated with 
Democrats last weekend and that passing it would end the ongoing debt crisis. 
The plan blends $900 billion-plus in spending cuts with a companion increase in 
the nations borrowing 
cap.','1234.56',323453,generate_series(3452,3462),7845,'0011','2005-07-16 
01:51:15+1359','2001-12-13 
01:51:15','((1,2),(0,3),(2,1))','((2,3)(4,5))','08:00:2b:01:02:03','1-2','Republicans
 had been working throughout the day Thursday to lock down support for their 
plan to raise the nations debt ceiling, even as Senate Democrats vowed to 
swiftly kill it if 
passed.','((2,3)(4,5))','(6,7)',11.222,'((4,5),7)',32,3214,'(1,0,2,3)','2010-02-21',43564,'$1,000.00','192.168.1','126.1.3.4','12:30:45','Johnson
 & Johnsons McNeil Consumer Healthcare announced the voluntary dosage reduction 
today. Labels will carry new dosing instructions this fall.The company says it 
will cut the maximum dosage of Regular Strength Tylenol and other 
acetaminophen-containing products in 2012.Acetaminophen is safe when used as 
directed, says Edwin Kuffner, MD, McNeil vice president of over-the-counter 
medical affairs. But, when too much is taken, it can cause liver damage.The 
action is intended to cut the risk of such accidental overdoses, the company 
says in a news release.','1','0',12,23);
```
3. extract the metadata out for the table
```
hawq extract -d postgres -o ~/parquet.yaml parquet_wt
```
4. create a two-level partition table
```
CREATE TABLE parquet_wt_subpartgzip2
  (id SERIAL,a1 int,a2 
char(5),a3 numeric,a4 boolean DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 
timestamp,a8 character varying(705),a9 bigint,a10 date,a11 varchar(600),a12 
text,a13 decimal,a14 real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with 
time zone,a19 timetz,a20 path,a21 box,a22 macaddr,a23 interval,a24 character 
varying(800),a25 lseg,a26 point,a27 double precision,a28 circle,a29 int4,a30 
numeric(8),a31 polygon,a32 date,a33 real,a34 money,a35 cidr,a36 inet,a37 
time,a38 text,a39 bit,a40 bit varying(5),a41 smallint,a42 int )

[jira] [Updated] (HAWQ-1144) Register into a 2-level partition table, hawq register didn't throw error, and indicates that hawq register succeed, but no data can be selected out.

2016-11-03 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1144:
--
   Assignee: Lin Wen  (was: Lei Chang)
Description: 
When registering into a 2-level partition table, hawq register didn't throw an 
error and indicated that registration succeeded, but no data could be selected 
out.

Reproduce Steps:
1. Create a one-level partition table
{code}
 create table parquet_wt (id SERIAL,a1 int,a2 char(5),a3 numeric,a4 boolean 
DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 timestamp,a8 character 
varying(705),a9 bigint,a10 date,a11 varchar(600),a12 text,a13 decimal,a14 
real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with time zone,a19 timetz,a20 
path,a21 box,a22 macaddr,a23 interval,a24 character varying(800),a25 lseg,a26 
point,a27 double precision,a28 circle,a29 int4,a30 numeric(8),a31 polygon,a32 
date,a33 real,a34 money,a35 cidr,a36 inet,a37 time,a38 text,a39 bit,a40 bit 
varying(5),a41 smallint,a42 int )   WITH (appendonly=true, orientation=parquet) 
distributed randomly  Partition by range(a1) (start(1)  end(5000) every(1000) );
{code}
2. insert some data into this table
```
insert into parquet_wt 
(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38,a39,a40,a41,a42)
 values(generate_series(1,20),'M',2011,'t','a','This is news of today: Deadlock 
between Republicans and Democrats over how best to reduce the U.S. deficit, and 
over what period, has blocked an agreement to allow the raising of the $14.3 
trillion debt ceiling','2001-12-24 02:26:11','U.S. House of Representatives 
Speaker John Boehner, the top Republican in Congress who has put forward a 
deficit reduction plan to be voted on later on Thursday said he had no control 
over whether his bill would avert a credit 
downgrade.',generate_series(2490,2505),'2011-10-11','The Republican-controlled 
House is tentatively scheduled to vote on Boehner proposal this afternoon at 
around 6 p.m. EDT (2200 GMT). The main Republican vote counter in the House, 
Kevin McCarthy, would not say if there were enough votes to pass the 
bill.','WASHINGTON:House Speaker John Boehner says his plan mixing spending 
cuts in exchange for raising the nations $14.3 trillion debt limit is not 
perfect but is as large a step that a divided government can take that is 
doable and signable by President Barack Obama.The Ohio Republican says the 
measure is an honest and sincere attempt at compromise and was negotiated with 
Democrats last weekend and that passing it would end the ongoing debt crisis. 
The plan blends $900 billion-plus in spending cuts with a companion increase in 
the nations borrowing 
cap.','1234.56',323453,generate_series(3452,3462),7845,'0011','2005-07-16 
01:51:15+1359','2001-12-13 
01:51:15','((1,2),(0,3),(2,1))','((2,3)(4,5))','08:00:2b:01:02:03','1-2','Republicans
 had been working throughout the day Thursday to lock down support for their 
plan to raise the nations debt ceiling, even as Senate Democrats vowed to 
swiftly kill it if 
passed.','((2,3)(4,5))','(6,7)',11.222,'((4,5),7)',32,3214,'(1,0,2,3)','2010-02-21',43564,'$1,000.00','192.168.1','126.1.3.4','12:30:45','Johnson
 & Johnsons McNeil Consumer Healthcare announced the voluntary dosage reduction 
today. Labels will carry new dosing instructions this fall.The company says it 
will cut the maximum dosage of Regular Strength Tylenol and other 
acetaminophen-containing products in 2012.Acetaminophen is safe when used as 
directed, says Edwin Kuffner, MD, McNeil vice president of over-the-counter 
medical affairs. But, when too much is taken, it can cause liver damage.The 
action is intended to cut the risk of such accidental overdoses, the company 
says in a news release.','1','0',12,23);
```
3. extract the metadata out for the table
```
hawq extract -d postgres -o ~/parquet.yaml parquet_wt
```
4. create a two-level partition table
```
CREATE TABLE parquet_wt_subpartgzip2
  (id SERIAL,a1 int,a2 
char(5),a3 numeric,a4 boolean DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 
timestamp,a8 character varying(705),a9 bigint,a10 date,a11 varchar(600),a12 
text,a13 decimal,a14 real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with 
time zone,a19 timetz,a20 path,a21 box,a22 macaddr,a23 interval,a24 character 
varying(800),a25 lseg,a26 point,a27 double precision,a28 circle,a29 int4,a30 
numeric(8),a31 polygon,a32 date,a33 real,a34 money,a35 cidr,a36 inet,a37 
time,a38 text,a39 bit,a40 bit varying(5),a41 smallint,a42 int ) 
WITH (appendonly=true, orientation=parquet) distributed 
randomly  Partition by range(a1) Subpartition by list(a2) subpartition template 
( default subpartition df_sp, subpartition sp1 values('M') , subpartition sp2 
values('F')  

[jira] [Created] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.

2016-11-03 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1145:
-

 Summary: After registering a partition table, if we want to insert 
some data into the table, it fails.
 Key: HAWQ-1145
 URL: https://issues.apache.org/jira/browse/HAWQ-1145
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: Lili Ma
Assignee: Lei Chang
 Fix For: 2.0.1.0-incubating


Reproduce Steps:
1. Create a partition table
CREATE TABLE parquet_LINEITEM_uncompressed(
 L_ORDERKEY INT8,
 L_PARTKEY BIGINT,
 L_SUPPKEY BIGINT,
 L_LINENUMBER BIGINT,
 L_QUANTITY decimal,
 L_EXTENDEDPRICE decimal,
 L_DISCOUNT decimal,
 L_TAX decimal,
 L_RETURNFLAG CHAR(1),
 L_LINESTATUS CHAR(1),
 L_SHIPDATE date,
 L_COMMITDATE date,
 L_RECEIPTDATE date,

[jira] [Updated] (HAWQ-1144) Register into a 2-level partition table, hawq register didn't throw error, and indicates that hawq register succeed, but no data can be selected out.

2016-11-03 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1144:
--
Description: 
When registering into a 2-level partition table, hawq register didn't throw an 
error and indicated that registration succeeded, but no data could be selected 
out.

Reproduce Steps:
1. Create a one-level partition table
{code}
 create table parquet_wt (id SERIAL,a1 int,a2 char(5),a3 numeric,a4 boolean 
DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 timestamp,a8 character 
varying(705),a9 bigint,a10 date,a11 varchar(600),a12 text,a13 decimal,a14 
real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with time zone,a19 timetz,a20 
path,a21 box,a22 macaddr,a23 interval,a24 character varying(800),a25 lseg,a26 
point,a27 double precision,a28 circle,a29 int4,a30 numeric(8),a31 polygon,a32 
date,a33 real,a34 money,a35 cidr,a36 inet,a37 time,a38 text,a39 bit,a40 bit 
varying(5),a41 smallint,a42 int )   WITH (appendonly=true, orientation=parquet) 
distributed randomly  Partition by range(a1) (start(1)  end(5000) every(1000) );
{code}
2. insert some data into this table
{code}
insert into parquet_wt 
(a1,a2,a3,a4,a5,a6,a7,a8,a9,a10,a11,a12,a13,a14,a15,a16,a17,a18,a19,a20,a21,a22,a23,a24,a25,a26,a27,a28,a29,a30,a31,a32,a33,a34,a35,a36,a37,a38,a39,a40,a41,a42)
 values(generate_series(1,20),'M',2011,'t','a','This is news of today: Deadlock 
between Republicans and Democrats over how best to reduce the U.S. deficit, and 
over what period, has blocked an agreement to allow the raising of the $14.3 
trillion debt ceiling','2001-12-24 02:26:11','U.S. House of Representatives 
Speaker John Boehner, the top Republican in Congress who has put forward a 
deficit reduction plan to be voted on later on Thursday said he had no control 
over whether his bill would avert a credit 
downgrade.',generate_series(2490,2505),'2011-10-11','The Republican-controlled 
House is tentatively scheduled to vote on Boehner proposal this afternoon at 
around 6 p.m. EDT (2200 GMT). The main Republican vote counter in the House, 
Kevin McCarthy, would not say if there were enough votes to pass the 
bill.','WASHINGTON:House Speaker John Boehner says his plan mixing spending 
cuts in exchange for raising the nations $14.3 trillion debt limit is not 
perfect but is as large a step that a divided government can take that is 
doable and signable by President Barack Obama.The Ohio Republican says the 
measure is an honest and sincere attempt at compromise and was negotiated with 
Democrats last weekend and that passing it would end the ongoing debt crisis. 
The plan blends $900 billion-plus in spending cuts with a companion increase in 
the nations borrowing 
cap.','1234.56',323453,generate_series(3452,3462),7845,'0011','2005-07-16 
01:51:15+1359','2001-12-13 
01:51:15','((1,2),(0,3),(2,1))','((2,3)(4,5))','08:00:2b:01:02:03','1-2','Republicans
 had been working throughout the day Thursday to lock down support for their 
plan to raise the nations debt ceiling, even as Senate Democrats vowed to 
swiftly kill it if 
passed.','((2,3)(4,5))','(6,7)',11.222,'((4,5),7)',32,3214,'(1,0,2,3)','2010-02-21',43564,'$1,000.00','192.168.1','126.1.3.4','12:30:45','Johnson
 & Johnsons McNeil Consumer Healthcare announced the voluntary dosage reduction 
today. Labels will carry new dosing instructions this fall.The company says it 
will cut the maximum dosage of Regular Strength Tylenol and other 
acetaminophen-containing products in 2012.Acetaminophen is safe when used as 
directed, says Edwin Kuffner, MD, McNeil vice president of over-the-counter 
medical affairs. But, when too much is taken, it can cause liver damage.The 
action is intended to cut the risk of such accidental overdoses, the company 
says in a news release.','1','0',12,23);
{code}
3. extract the metadata out for the table
{code}
hawq extract -d postgres -o ~/parquet.yaml parquet_wt
{code}
4. create a two-level partition table
{code}
CREATE TABLE parquet_wt_subpartgzip2
  (id SERIAL,a1 int,a2 
char(5),a3 numeric,a4 boolean DEFAULT false ,a5 char DEFAULT 'd',a6 text,a7 
timestamp,a8 character varying(705),a9 bigint,a10 date,a11 varchar(600),a12 
text,a13 decimal,a14 real,a15 bigint,a16 int4 ,a17 bytea,a18 timestamp with 
time zone,a19 timetz,a20 path,a21 box,a22 macaddr,a23 interval,a24 character 
varying(800),a25 lseg,a26 point,a27 double precision,a28 circle,a29 int4,a30 
numeric(8),a31 polygon,a32 date,a33 real,a34 money,a35 cidr,a36 inet,a37 
time,a38 text,a39 bit,a40 bit varying(5),a41 smallint,a42 int ) 
WITH (appendonly=true, orientation=parquet) distributed 
randomly  Partition by range(a1) Subpartition by list(a2) subpartition template 
( default subpartition df_sp, subpartition sp1 values('M') , subpartition sp2 
values('F')   WIT

[jira] [Updated] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.

2016-11-03 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1145:
--
Assignee: Hubert Zhang  (was: Lei Chang)

> After registering a partition table, if we want to insert some data into the 
> table, it fails.
> -
>
> Key: HAWQ-1145
> URL: https://issues.apache.org/jira/browse/HAWQ-1145
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Hubert Zhang
> Fix For: 2.0.1.0-incubating
>
>
> Reproduce Steps:
> 1. Create a partition table
> CREATE TABLE parquet_LINEITEM_uncompressed(
>  L_ORDERKEY INT8,
>  L_PARTKEY BIGINT,
>  L_SUPPKEY BIGINT,
>  L_LINENUMBER BIGINT,
>  L_QUANTITY decimal,
>  L_EXTENDEDPRICE decimal,
>  L_DISCOUNT decimal,
>  L_TAX decimal,
>  L_RETURNFLAG CHAR(1),
>  L_LINESTATUS CHAR(1),
>  L_SHIPDATE date,
>  L_COMMITDATE date,

[jira] [Updated] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.

2016-11-03 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1145:
--
Description: 
Reproduce Steps:
1. Create a partition table
{code}
CREATE TABLE parquet_LINEITEM_uncompressed(
 L_ORDERKEY INT8,
 L_PARTKEY BIGINT,
 L_SUPPKEY BIGINT,
 L_LINENUMBER BIGINT,
 L_QUANTITY decimal,
 L_EXTENDEDPRICE decimal,
 L_DISCOUNT decimal,
 L_TAX decimal,
 L_RETURNFLAG CHAR(1),
 L_LINESTATUS CHAR(1),
 L_SHIPDATE date,
 L_COMMITDATE date,
 L_RECEIPTDATE date,
 L_SHIPINSTRUCT CHAR(25),

[jira] [Updated] (HAWQ-1091) HAWQ InputFormat Bugs

2016-11-10 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1091:
--
Component/s: (was: Command Line Tools)
 Storage

> HAWQ InputFormat Bugs
> -
>
> Key: HAWQ-1091
> URL: https://issues.apache.org/jira/browse/HAWQ-1091
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Storage
>Reporter: hongwu
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> "TPCHLocalTester.java" and "HAWQInputFormatPerformanceTest_TPCH.java" use a 
> "WHERE content>=0" filter, which is an obsolete condition from an old version 
> of HAWQ.
> The dbgen binary, which the generate_load_tpch.pl script needs in order to 
> generate the data used for running the mapreduce test cases, is not included 
> in the hawq repo. We should disable these cases.
> There is also a bug when the size in the extracted yaml file is zero.
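A minimal guard against the zero-size case mentioned above could look like the sketch below. The helper name and the yaml entry shape are hypothetical, not the actual HAWQ InputFormat code:

```python
def nonempty_entries(entries):
    """Drop yaml file entries whose size is zero; zero-length files carry
    no tuples, so skipping them when building input splits avoids feeding
    the InputFormat an empty file."""
    return [e for e in entries if e.get("size", 0) > 0]
```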





[jira] [Updated] (HAWQ-991) "HAWQ register" could register tables according to .yml configuration file

2016-11-10 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-991:
-
Component/s: (was: External Tables)

> "HAWQ register" could register tables according to .yml configuration file
> --
>
> Key: HAWQ-991
> URL: https://issues.apache.org/jira/browse/HAWQ-991
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: hongwu
>Assignee: hongwu
> Fix For: 2.0.1.0-incubating
>
>
> Scenario: 
> 1. Cluster disaster recovery: two clusters co-exist, and data is periodically 
> imported from Cluster A to Cluster B, so the data needs to be registered into 
> Cluster B.
> 2. Table rollback: do checkpoints somewhere, and roll back to a previous 
> checkpoint when needed.
> Description:
> Register according to a .yml configuration file: 
> hawq register [-h hostname] [-p port] [-U username] [-d databasename] [-c 
> config] [--force] [--repair]  
> Behaviors:
> 1. If the table doesn't exist, hawq register automatically creates the table 
> and registers the files listed in the .yml configuration file, using the file 
> sizes specified in the .yml file to update the catalog table. 
> 2. If the table already exists and neither --force nor --repair is specified, 
> no table is created; the files specified in the .yml file are registered 
> directly into the table. Note that if a file is already under the table 
> directory in HDFS, an error is thrown: to-be-registered files should not be 
> under the table path.
> 3. If the table already exists and --force is specified, all catalog contents 
> in pg_aoseg.pg_paqseg_$relid are cleared while the files on HDFS are kept, and 
> then all the files are re-registered into the table.  This is for scenario 2.
> 4. If the table already exists and --repair is specified, both the file folder 
> and the catalog table pg_aoseg.pg_paqseg_$relid are changed to the state the 
> .yml file configures. Note that some files newly generated since the 
> checkpoint may be deleted here. Also note that all the files in the .yml file 
> should be under the table folder on HDFS. Limitation: does not support cases 
> of hash table redistribution, table truncate, and table drop. This is for 
> scenario 3.
> Requirements:
> 1. The to-be-registered file paths have to be co-located with HAWQ in the same 
> HDFS cluster.
> 2. If the table to be registered is a hash table, the number of registered 
> files should be one or a multiple of the hash table bucket number.
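The hash-table requirement above reduces to a small divisibility check. A hedged sketch (the function name is hypothetical, not the actual validation code in hawq register):

```python
def valid_file_count(num_files, bucket_num):
    """True when the number of files being registered is a positive
    multiple of the hash table's bucket number, as the requirement for
    hash-distributed tables demands."""
    return bucket_num > 0 and num_files > 0 and num_files % bucket_num == 0
```

For example, registering 6 or 12 files into a table with bucket number 6 would pass, while 5 files would be rejected.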





[jira] [Resolved] (HAWQ-1035) support partition table register

2016-11-10 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma resolved HAWQ-1035.
---
Resolution: Fixed

> support partition table register
> 
>
> Key: HAWQ-1035
> URL: https://issues.apache.org/jira/browse/HAWQ-1035
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Lili Ma
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> Support partition table register, limited to 1-level partition tables, since 
> hawq extract only supports 1-level partition tables.
> Expected behavior:
> 1. Create a partition table in HAWQ, then extract its information out to a 
> .yml file.
> 2. Call hawq register, specifying the extracted .yml file and a new table 
> name; the files should be registered into the new table.
> The work to implement partition table register can be broken down as follows:
> 1. Modify the .yml configuration file parsing function, adding content for 
> partition tables.
> 2. Construct the partition table DDL according to the .yml configuration file.
> 3. Map each sub-partition table name to the table list in the .yml 
> configuration file.
> 4. Register the sub-partition tables one by one.





[jira] [Updated] (HAWQ-1113) In force mode, hawq register error when files in yaml is disordered

2016-11-10 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1113:
--
Affects Version/s: 2.0.1.0-incubating

> In force mode, hawq register error when files in yaml is disordered
> ---
>
> Key: HAWQ-1113
> URL: https://issues.apache.org/jira/browse/HAWQ-1113
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> In force mode, hawq register errors out when the files in the yaml are 
> disordered. For example, the file order in the yaml is as follows:
> {code}
>   Files:
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/2
> size: 250
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/4
> size: 250
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/5
> size: 258
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/6
> size: 270
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/3
> size: 258
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW2@/1
> size: 228
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/2
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/3
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/4
> size: 220
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/1
> size: 254
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/6
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/5
> size: 210
> {code}
> After hawq register succeeds, we select data from the table and get the error:
> {code}
> ERROR:  hdfs file length does not equal to metadata logic length! 
> (cdbdatalocality.c:1102)
> {code}
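One way to see the mismatch described above: if the registration code pairs catalog metadata with file entries positionally, disordered yaml entries can assign the wrong size to a segment file. A minimal sketch of normalizing the entries first, assuming a path layout of `.../<table_oid>/<segno>` (the `sort_file_entries` helper is hypothetical, not HAWQ's actual fix):

```python
# Hypothetical illustration: sort the yaml file entries by table directory
# and trailing segment number before registering, so each file's length
# lines up with the catalog's logical EOF for that segment.
def sort_file_entries(entries):
    def key(entry):
        # path like .../<table_oid>/<segno>
        head, table_dir, segno = entry["path"].rsplit("/", 2)
        return (table_dir, int(segno))
    return sorted(entries, key=key)

entries = [
    {"path": "/hawq_default/16385/1/17000/2", "size": 250},
    {"path": "/hawq_default/16385/1/17000/1", "size": 254},
    {"path": "/hawq_default/16385/1/17001/1", "size": 228},
]
print([e["path"] for e in sort_file_entries(entries)])
```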





[jira] [Updated] (HAWQ-1113) In force mode, hawq register error when files in yaml is disordered

2016-11-10 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1113:
--
Assignee: Chunling Wang  (was: Lei Chang)

> In force mode, hawq register error when files in yaml is disordered
> ---
>
> Key: HAWQ-1113
> URL: https://issues.apache.org/jira/browse/HAWQ-1113
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Chunling Wang
>Assignee: Chunling Wang
>
> In force mode, hawq register errors out when the files in the yaml are 
> disordered. For example, the file order in the yaml is as follows:
> {code}
>   Files:
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/2
> size: 250
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/4
> size: 250
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/5
> size: 258
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/6
> size: 270
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/3
> size: 258
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW2@/1
> size: 228
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/2
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/3
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/4
> size: 220
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/1
> size: 254
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/6
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/5
> size: 210
> {code}
> After hawq register succeeds, we select data from the table and get the error:
> {code}
> ERROR:  hdfs file length does not equal to metadata logic length! 
> (cdbdatalocality.c:1102)
> {code}





[jira] [Resolved] (HAWQ-1113) In force mode, hawq register error when files in yaml is disordered

2016-11-10 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma resolved HAWQ-1113.
---
   Resolution: Fixed
Fix Version/s: 2.0.1.0-incubating

> In force mode, hawq register error when files in yaml is disordered
> ---
>
> Key: HAWQ-1113
> URL: https://issues.apache.org/jira/browse/HAWQ-1113
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Chunling Wang
>Assignee: Chunling Wang
> Fix For: 2.0.1.0-incubating
>
>
> In force mode, hawq register errors out when the files in the yaml are 
> disordered. For example, the file order in the yaml is as follows:
> {code}
>   Files:
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/2
> size: 250
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/4
> size: 250
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/5
> size: 258
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/6
> size: 270
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/3
> size: 258
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW2@/1
> size: 228
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/2
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/3
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/4
> size: 220
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_OLD@/1
> size: 254
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/6
> size: 215
>   - path: /hawq_default/16385/@DATABASE_OID@/@TABLE_OID_NEW@/5
> size: 210
> {code}
> After hawq register succeeds, we select data from the table and get the error:
> {code}
> ERROR:  hdfs file length does not equal to metadata logic length! 
> (cdbdatalocality.c:1102)
> {code}





[jira] [Updated] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.

2016-11-20 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma updated HAWQ-1145:
--
Attachment: dists.dss
dbgen

[~xunzhang] you can use these two files to generate tpch data. The way to 
generate lineitem_1g is:
{code}
dbgen -b dists.dss -s 1 -T L > lineitem_1g
{code}

> After registering a partition table, if we want to insert some data into the 
> table, it fails.
> -
>
> Key: HAWQ-1145
> URL: https://issues.apache.org/jira/browse/HAWQ-1145
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Hubert Zhang
> Fix For: 2.0.1.0-incubating
>
> Attachments: dbgen, dists.dss
>
>
> Reproduce Steps:
> 1. Create a partition table
> {code}
> CREATE TABLE parquet_LINEITEM_uncompressed(
>  L_ORDERKEY INT8,
>  L_PARTKEY BIGINT,
>  L_SUPPKEY BIGINT,
>  L_LINENUMBER BIGINT,
>  L_QUANTITY decimal,
>  L_EXTENDEDPRICE decimal,
>  L_DISCOUNT decimal,
>  L_TAX decimal,
>  L_RETURNFLAG CHAR(1),
>  L_LINESTATUS CHAR(1),
>  L_SHIPDATE date,
>  L_COMMIT

[jira] [Created] (HAWQ-1167) Parquet format table estimate column width meets error for bpchar type

2016-11-21 Thread Lili Ma (JIRA)
Lili Ma created HAWQ-1167:
-

 Summary: Parquet format table estimate column width meets error 
for bpchar type
 Key: HAWQ-1167
 URL: https://issues.apache.org/jira/browse/HAWQ-1167
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Storage
Reporter: Lili Ma
Assignee: Lei Chang


In function estimateColumnWidth, the atttypmod attribute for type bpchar may be 
-1, which passes a wrong value to columnWidths. We should estimate a valid 
length for any type.
{code}
case HAWQ_TYPE_BPCHAR:
/* for char(n), atttypmod is n + 4 */
Assert(att->atttypmod > 4);
columnWidths[(*colidx)++] = att->atttypmod;
{code}
Reproduce steps:
1. create table A(a int, b bpchar);
2. select oid from pg_class where relname='a';
3. select attname, atttypmod from pg_attribute where attrelid=$oid

You will see that atttypmod is -1, which triggers the error.
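The width-estimation logic in question can be illustrated in a few lines. This is a pure-Python sketch of the idea of a fallback, not HAWQ's actual C code: the `DEFAULT_BPCHAR_WIDTH` constant and `estimate_bpchar_width` helper are assumptions for illustration.

```python
# Illustration of a fallback when bpchar's atttypmod is -1 (unconstrained
# char): use a default width estimate instead of asserting atttypmod > 4.
DEFAULT_BPCHAR_WIDTH = 64  # assumed fallback value, not HAWQ's actual constant

def estimate_bpchar_width(atttypmod):
    # for char(n), atttypmod is n + 4; -1 means no length constraint
    if atttypmod > 4:
        return atttypmod
    return DEFAULT_BPCHAR_WIDTH

print(estimate_bpchar_width(14))   # char(10): width taken from atttypmod
print(estimate_bpchar_width(-1))   # unconstrained bpchar: fallback estimate
```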





[jira] [Commented] (HAWQ-1145) After registering a partition table, if we want to insert some data into the table, it fails.

2016-11-21 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683216#comment-15683216
 ] 

Lili Ma commented on HAWQ-1145:
---

During diagnosis of this bug, I found another bug, which is described in HAWQ-1167.

> After registering a partition table, if we want to insert some data into the 
> table, it fails.
> -
>
> Key: HAWQ-1145
> URL: https://issues.apache.org/jira/browse/HAWQ-1145
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Affects Versions: 2.0.1.0-incubating
>Reporter: Lili Ma
>Assignee: Hubert Zhang
> Fix For: 2.0.1.0-incubating
>
> Attachments: dbgen, dists.dss
>
>
> Reproduce Steps:
> 1. Create a partition table
> {code}
> CREATE TABLE parquet_LINEITEM_uncompressed(
>  L_ORDERKEY INT8,
>  L_PARTKEY BIGINT,
>  L_SUPPKEY BIGINT,
>  L_LINENUMBER BIGINT,
>  L_QUANTITY decimal,
>  L_EXTENDEDPRICE decimal,
>  L_DISCOUNT decimal,
>  L_TAX decimal,
>  L_RETURNFLAG CHAR(1),
>  L_LINESTATUS CHAR(1),
>  L_SHIPDATE date,
>  L_COMMITDATE date,

[jira] [Commented] (HAWQ-1171) Support upgrade for hawq register.

2016-11-30 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710623#comment-15710623
 ] 

Lili Ma commented on HAWQ-1171:
---

It aims to provide multiple upgrade functions. But for the upgrade from HAWQ 
2.0.X to 2.1.0, we only need to upgrade the hawq register part. For future 
releases, we may need to upgrade other parts too, so we keep this script name. 

> Support upgrade for hawq register.
> --
>
> Key: HAWQ-1171
> URL: https://issues.apache.org/jira/browse/HAWQ-1171
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Core
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
>
> For the hawq register feature, we need to add some built-in functions to support 
> some catalog changes. This could be done by a hawqupgrade script.
> User interface:
> Hawq upgrade.





[jira] [Resolved] (HAWQ-1171) Support upgrade for hawq register.

2016-11-30 Thread Lili Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lili Ma resolved HAWQ-1171.
---
   Resolution: Fixed
Fix Version/s: 2.0.1.0-incubating

> Support upgrade for hawq register.
> --
>
> Key: HAWQ-1171
> URL: https://issues.apache.org/jira/browse/HAWQ-1171
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Core
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
> Fix For: 2.0.1.0-incubating
>
>
> For the hawq register feature, we need to add some built-in functions to support 
> some catalog changes. This could be done by a hawqupgrade script.
> User interface:
> Hawq upgrade.





[jira] [Comment Edited] (HAWQ-1171) Support upgrade for hawq register.

2016-11-30 Thread Lili Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710623#comment-15710623
 ] 

Lili Ma edited comment on HAWQ-1171 at 12/1/16 2:36 AM:


It aims to provide multiple upgrade functions. But for the upgrade from HAWQ 
2.0.0 to the current version, we only need to upgrade the hawq register part. 
For future releases, we may need to upgrade other parts too, so we keep this 
script name. 


was (Author: lilima):
It aims to provide multiple update functions. But for upgrade from HAWQ 2.0.X 
to 2.1.0, we only need to upgrade hawq register part.  For future releases, we 
may need upgrade other parts too, so we keep this script name. 

> Support upgrade for hawq register.
> --
>
> Key: HAWQ-1171
> URL: https://issues.apache.org/jira/browse/HAWQ-1171
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Core
>Reporter: Hubert Zhang
>Assignee: Hubert Zhang
> Fix For: 2.0.1.0-incubating
>
>
> For the hawq register feature, we need to add some built-in functions to support 
> some catalog changes. This could be done by a hawqupgrade script.
> User interface:
> Hawq upgrade.





  1   2   3   >