[GitHub] incubator-hawq pull request #1245: HAWQ-1476. Augment enable-ranger-plugin.s...

2017-05-31 Thread ictmalili
Github user ictmalili commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1245#discussion_r119519516
  
--- Diff: ranger-plugin/conf/ranger-servicedef-hawq.json ---
@@ -244,7 +244,7 @@
   "name": "authentication",
   "type": "enum",
   "subType": "authType",
-  "mandatory": false,
+  "mandatory": true,
--- End diff --

Yes, we aim to update the connection to HAWQ to use Kerberos when Kerberos 
authentication is set in pg_hba.conf. 
Why doesn't Ambari change pg_hba.conf in HAWQ once Kerberos is set up? I think 
we need to protect the connection to HAWQ.
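
For context, Kerberos authentication in HAWQ is enabled through a `gss` entry in 
pg_hba.conf; a minimal sketch (the address range, realm, and options here are 
placeholders, not values from this thread):

```
# TYPE  DATABASE  USER  ADDRESS      METHOD
host    all       all   0.0.0.0/0    gss include_realm=0 krb_realm=EXAMPLE.COM
```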


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hawq issue #1246: HAWQ-1475. Add LICENSE, NOTICE, and DISCLAIMER f...

2017-05-31 Thread rvs
Github user rvs commented on the issue:

https://github.com/apache/incubator-hawq/pull/1246
  
LGTM!




[GitHub] incubator-hawq issue #1246: HAWQ-1475. Add LICENSE, NOTICE, and DISCLAIMER f...

2017-05-31 Thread huor
Github user huor commented on the issue:

https://github.com/apache/incubator-hawq/pull/1246
  
@rvs, we have separate rpm packages for the HAWQ C/C++ components and the Java 
components (pxf, ranger). 

```
hawq_rpm_packages
├── apache-hawq-2.2.0.0-el7.x86_64.rpm
├── apache-tomcat-7.0.62-el6.noarch.rpm
├── hawq-ranger-plugin-2.2.0.0-1.el7.centos.noarch.rpm
├── pxf-3.2.1.0-1.el6.noarch.rpm
├── pxf-hbase-3.2.1.0-1.el6.noarch.rpm
├── pxf-hdfs-3.2.1.0-1.el6.noarch.rpm
├── pxf-hive-3.2.1.0-1.el6.noarch.rpm
├── pxf-jdbc-3.2.1.0-1.el6.noarch.rpm
├── pxf-json-3.2.1.0-1.el6.noarch.rpm
└── pxf-service-3.2.1.0-1.el6.noarch.rpm
```

This PR adds the LICENSE, NOTICE, and DISCLAIMER files for the HAWQ C/C++ 
components, i.e., apache-hawq-2.2.0.0-el7.x86_64.rpm.

For pxf (pxf-*.rpm, apache-tomcat-7.0.62-el6.noarch.rpm) and ranger 
(hawq-ranger-plugin-2.2.0.0-1.el7.centos.noarch.rpm), there are incompatible 
licenses; we will address them in a separate PR once your feedback on the pxf 
and ranger parts is available.




[jira] [Commented] (HAWQ-1473) document ranger plug-in service high availability

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032142#comment-16032142
 ] 

ASF GitHub Bot commented on HAWQ-1473:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/120#discussion_r119489636
  
--- Diff: markdown/ranger/ranger-integration-config.html.md.erb ---
@@ -129,6 +129,64 @@ Once the connection between HAWQ and Ranger is 
configured, you may choose to set
 5. Click **Save** to save your changes.
 6. Select **Service Actions > Restart All** and confirm that you want to 
restart the HAWQ cluster.
 
+## Step 3: (Optional) Register a Standby Ranger 
Plug-in Service
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If this 
service goes down, all HAWQ database operations will fail. Configure a highly 
available HAWQ Ranger Plug-in Service to eliminate possible downtime should 
this situation occur.
+
+Ranger Admin high availability and HAWQ Ranger Plug-in Service high 
availability are independent; you can configure HAWQ Ranger Plug-in Service HA 
without configuring HA for Ranger Admin.
+
+### Prerequisites 
+
+Before you configure HAWQ Ranger authentication in high availability mode, 
ensure that you have:
+
+- Installed or upgraded to a version of HAWQ that includes support for 
HAWQ Ranger Authentication.
+
+- (Optional) Configured Ranger Admin for high availability.
+
+- Configured a HAWQ standby master node for your HAWQ cluster.
+
+You must configure a standby master for your HAWQ deployment before 
enabling HAWQ Ranger high availability mode. If you have not configured your 
HAWQ standby master, follow the instructions in [Adding a HAWQ Standby 
Master](../admin/ambari-admin.html#amb-add-standby) (if you manage your HAWQ 
cluster with Ambari) or [Configuring Master 
Mirroring](../admin/MasterMirroring.html#standby_master_configure) (for a 
command-line-managed HAWQ cluster).
+
+- Registered the HAWQ Ranger Plug-in Service on your HAWQ master node.
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If you 
have not yet enabled the Ranger Plug-in Service, refer to [Install Ranger 
Connectivity to HAWQ](ranger-integration-config.html#jar) for registration 
instructions. (Optional) If you have configured Ranger Admin HA, make sure to 
identify the Ranger Admin HA proxy when you enable the plug-in.
+
+
+### Configuring the Standby Ranger Plug-in Service 
+
+The standby Ranger Plug-in Service runs on the HAWQ standby master node, 
utilizing the same port number as that when the service runs on the master 
node. To enable HAWQ Ranger high availability, you must register the standby 
Ranger Plug-in Service on the standby master node, and then restart the standby.
--- End diff --

This paragraph should be merged with the intro paragraph directly under 
**Step 3**.  Right now it repeats some of that info.


> document ranger plug-in service high availability
> -
>
> Key: HAWQ-1473
> URL: https://issues.apache.org/jira/browse/HAWQ-1473
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
>
> add RPS high availability information to the docs.  include config info as 
> well as failover scenarios.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (HAWQ-1473) document ranger plug-in service high availability

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032138#comment-16032138
 ] 

ASF GitHub Bot commented on HAWQ-1473:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/120#discussion_r119487987
  
--- Diff: markdown/ranger/ranger-integration-config.html.md.erb ---
@@ -129,6 +129,64 @@ Once the connection between HAWQ and Ranger is 
configured, you may choose to set
 5. Click **Save** to save your changes.
 6. Select **Service Actions > Restart All** and confirm that you want to 
restart the HAWQ cluster.
 
+## Step 3: (Optional) Register a Standby Ranger 
Plug-in Service
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If this 
service goes down, all HAWQ database operations will fail. Configure a highly 
available HAWQ Ranger Plug-in Service to eliminate possible downtime should 
this situation occur.
+
+Ranger Admin high availability and HAWQ Ranger Plug-in Service high 
availability are independent; you can configure HAWQ Ranger Plug-in Service HA 
without configuring HA for Ranger Admin.
--- End diff --

Might want to add that configuring HA for both is advised?  Also, this 
section uses the abbreviated term "Ranger Admin" throughout, which seems 
confusing to me.  Should probably stick with "Ranger Administration Host" to 
stay consistent with the first part of the doc.




[jira] [Commented] (HAWQ-1473) document ranger plug-in service high availability

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032139#comment-16032139
 ] 

ASF GitHub Bot commented on HAWQ-1473:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/120#discussion_r119488319
  
--- Diff: markdown/ranger/ranger-integration-config.html.md.erb ---
@@ -129,6 +129,64 @@ Once the connection between HAWQ and Ranger is 
configured, you may choose to set
 5. Click **Save** to save your changes.
 6. Select **Service Actions > Restart All** and confirm that you want to 
restart the HAWQ cluster.
 
+## Step 3: (Optional) Register a Standby Ranger 
Plug-in Service
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If this 
service goes down, all HAWQ database operations will fail. Configure a highly 
available HAWQ Ranger Plug-in Service to eliminate possible downtime should 
this situation occur.
+
+Ranger Admin high availability and HAWQ Ranger Plug-in Service high 
availability are independent; you can configure HAWQ Ranger Plug-in Service HA 
without configuring HA for Ranger Admin.
+
+### Prerequisites 
+
+Before you configure HAWQ Ranger authentication in high availability mode, 
ensure that you have:
+
+- Installed or upgraded to a version of HAWQ that includes support for 
HAWQ Ranger Authentication.
--- End diff --

Wondering if this first bullet is really needed...




[jira] [Commented] (HAWQ-1473) document ranger plug-in service high availability

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032143#comment-16032143
 ] 

ASF GitHub Bot commented on HAWQ-1473:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/120#discussion_r119490263
  
--- Diff: markdown/ranger/ranger-integration-config.html.md.erb ---
@@ -129,6 +129,64 @@ Once the connection between HAWQ and Ranger is 
configured, you may choose to set
 5. Click **Save** to save your changes.
 6. Select **Service Actions > Restart All** and confirm that you want to 
restart the HAWQ cluster.
 
+## Step 3: (Optional) Register a Standby Ranger 
Plug-in Service
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If this 
service goes down, all HAWQ database operations will fail. Configure a highly 
available HAWQ Ranger Plug-in Service to eliminate possible downtime should 
this situation occur.
+
+Ranger Admin high availability and HAWQ Ranger Plug-in Service high 
availability are independent; you can configure HAWQ Ranger Plug-in Service HA 
without configuring HA for Ranger Admin.
+
+### Prerequisites 
+
+Before you configure HAWQ Ranger authentication in high availability mode, 
ensure that you have:
+
+- Installed or upgraded to a version of HAWQ that includes support for 
HAWQ Ranger Authentication.
+
+- (Optional) Configured Ranger Admin for high availability.
+
+- Configured a HAWQ standby master node for your HAWQ cluster.
+
+You must configure a standby master for your HAWQ deployment before 
enabling HAWQ Ranger high availability mode. If you have not configured your 
HAWQ standby master, follow the instructions in [Adding a HAWQ Standby 
Master](../admin/ambari-admin.html#amb-add-standby) (if you manage your HAWQ 
cluster with Ambari) or [Configuring Master 
Mirroring](../admin/MasterMirroring.html#standby_master_configure) (for a 
command-line-managed HAWQ cluster).
+
+- Registered the HAWQ Ranger Plug-in Service on your HAWQ master node.
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If you 
have not yet enabled the Ranger Plug-in Service, refer to [Install Ranger 
Connectivity to HAWQ](ranger-integration-config.html#jar) for registration 
instructions. (Optional) If you have configured Ranger Admin HA, make sure to 
identify the Ranger Admin HA proxy when you enable the plug-in.
+
+
+### Configuring the Standby Ranger Plug-in Service 
+
+The standby Ranger Plug-in Service runs on the HAWQ standby master node, 
utilizing the same port number as that when the service runs on the master 
node. To enable HAWQ Ranger high availability, you must register the standby 
Ranger Plug-in Service on the standby master node, and then restart the standby.
+
+**Note**: If you configured and registered the master HAWQ Ranger Plug-in 
Service before you initialized your HAWQ standby master node, you do not need 
to perform the steps in this section.
--- End diff --

This note is confusing.  The last bullet under prerequisites specifically 
states that they should have already registered the master plug-in and standby 
master.  If it's true that setting up the master plug-in first and then 
configuring the standby master automatically registers the standby plug-in, 
then this note should probably appear much earlier (before the prerequisites). 




[jira] [Commented] (HAWQ-1473) document ranger plug-in service high availability

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032141#comment-16032141
 ] 

ASF GitHub Bot commented on HAWQ-1473:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/120#discussion_r119492058
  
--- Diff: markdown/reference/guc/parameter_definitions.html.md.erb ---
@@ -2147,6 +2149,14 @@ Identifies the port on which the HAWQ Ranger Plug-in 
Service runs. The `hawq_rps
 
|-|-|-|
 | valid port number | 8432 | master, reload |
 
+## hawq\_rps\_check\_local\_interval
+
+When HAWQ Ranger authentication high availability mode is enabled and the 
Ranger Plug-in Service is active on the standby master node, HAWQ attempts to 
switch back to the service located on the master node as soon as it becomes 
available. The HAWQ master periodically attempts to re-establish contact with 
the service on the local node, using `hawq_rps_check_local_interval` as the 
polling time interval (in seconds) for this contact.
--- End diff --

This wording is a bit unclear as to whether it's talking about just the 
standby ranger service being used (which is the intent) vs whether the entire 
master node is down and the standby plug-in is being used.  I think it would be 
better to repeat the wording used in `ranger-ha.html`, which starts "Should the 
HAWQ master node fail to communicate with the local Ranger Plug-in Service..."
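
To make the intended semantics concrete, here is a small model of the failback 
behavior (illustrative only; the function and variable names are mine, not 
HAWQ's): while the standby Ranger Plug-in Service is in use, the master 
re-probes its local service every `hawq_rps_check_local_interval` seconds and 
switches back once the local service responds.

```python
def pick_rps(active, local_up, now, last_probe, interval):
    """Return (service_to_use, new_last_probe_time).

    Models the documented failback behavior of
    hawq_rps_check_local_interval: while the standby Ranger Plug-in
    Service is active, re-probe the local (master-node) service every
    `interval` seconds and switch back as soon as it responds.
    """
    if active == "local":
        # Local service in use; nothing to fail back from.
        return "local", last_probe
    if now - last_probe < interval:
        # Too soon to re-probe; keep using the standby service.
        return "standby", last_probe
    # Interval elapsed: probe the local service, switch back if it is up.
    return ("local" if local_up else "standby"), now
```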




[jira] [Commented] (HAWQ-1473) document ranger plug-in service high availability

2017-05-31 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16032140#comment-16032140
 ] 

ASF GitHub Bot commented on HAWQ-1473:
--

Github user dyozie commented on a diff in the pull request:

https://github.com/apache/incubator-hawq-docs/pull/120#discussion_r119489263
  
--- Diff: markdown/ranger/ranger-integration-config.html.md.erb ---
@@ -129,6 +129,64 @@ Once the connection between HAWQ and Ranger is 
configured, you may choose to set
 5. Click **Save** to save your changes.
 6. Select **Service Actions > Restart All** and confirm that you want to 
restart the HAWQ cluster.
 
+## Step 3: (Optional) Register a Standby Ranger 
Plug-in Service
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If this 
service goes down, all HAWQ database operations will fail. Configure a highly 
available HAWQ Ranger Plug-in Service to eliminate possible downtime should 
this situation occur.
+
+Ranger Admin high availability and HAWQ Ranger Plug-in Service high 
availability are independent; you can configure HAWQ Ranger Plug-in Service HA 
without configuring HA for Ranger Admin.
+
+### Prerequisites 
+
+Before you configure HAWQ Ranger authentication in high availability mode, 
ensure that you have:
+
+- Installed or upgraded to a version of HAWQ that includes support for 
HAWQ Ranger Authentication.
+
+- (Optional) Configured Ranger Admin for high availability.
+
+- Configured a HAWQ standby master node for your HAWQ cluster.
+
+You must configure a standby master for your HAWQ deployment before 
enabling HAWQ Ranger high availability mode. If you have not configured your 
HAWQ standby master, follow the instructions in [Adding a HAWQ Standby 
Master](../admin/ambari-admin.html#amb-add-standby) (if you manage your HAWQ 
cluster with Ambari) or [Configuring Master 
Mirroring](../admin/MasterMirroring.html#standby_master_configure) (for a 
command-line-managed HAWQ cluster).
+
+- Registered the HAWQ Ranger Plug-in Service on your HAWQ master node.
+
+The HAWQ Ranger Plug-in Service runs on the HAWQ master node. If you 
have not yet enabled the Ranger Plug-in Service, refer to [Install Ranger 
Connectivity to HAWQ](ranger-integration-config.html#jar) for registration 
instructions. (Optional) If you have configured Ranger Admin HA, make sure to 
identify the Ranger Admin HA proxy when you enable the plug-in.
+
+
+### Configuring the Standby Ranger Plug-in Service 
--- End diff --

This heading title should just be "Procedure".  




[jira] [Closed] (HAWQ-1474) docs - instructions to create/use a minimal psql client pkg

2017-05-31 Thread Lisa Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisa Owen closed HAWQ-1474.
---

> docs - instructions to create/use a minimal psql client pkg
> ---
>
> Key: HAWQ-1474
> URL: https://issues.apache.org/jira/browse/HAWQ-1474
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Lisa Owen
>Assignee: David Yozie
> Fix For: 2.3.0.0-incubating
>
>
> add instructions to create, install, and use a minimal psql client pkg.  one 
> would install this package on a like linux client system outside of the hawq 
> cluster.
> this will basically provide instructions to:
> - package up the psql binary and libraries and an auto-generated 
> environment-setting file
> - install this on a client system
> - run the client





[jira] [Resolved] (HAWQ-1474) docs - instructions to create/use a minimal psql client pkg

2017-05-31 Thread Lisa Owen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lisa Owen resolved HAWQ-1474.
-
   Resolution: Fixed
Fix Version/s: 2.3.0.0-incubating

PR merged; resolving and closing.



[jira] [Assigned] (HAWQ-1461) Improve partition parameters validation for PXF-JDBC plugin

2017-05-31 Thread Lav Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lav Jain reassigned HAWQ-1461:
--

Assignee: Lav Jain  (was: Ed Espino)

> Improve partition parameters validation for PXF-JDBC plugin
> ---
>
> Key: HAWQ-1461
> URL: https://issues.apache.org/jira/browse/HAWQ-1461
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: PXF
>Reporter: Oleksandr Diachenko
>Assignee: Lav Jain
>
> h3. User didn't pass interval type:
> {code}
> CREATE EXTERNAL TABLE pxf_jdbc_multiple_fragments_by_date (t1text, t2
> text, num1  int, dub1  double precision, dec1  numeric, tm timestamp, r real, 
> bg bigint, b boolean, tn smallint, sml smallint, dt date, vc1 varchar(5), c1 
> char(3), bin bytea) LOCATION 
> (E'pxf://127.0.0.1:51200/hawq_types?PROFILE=Jdbc_DRIVER=org.postgresql.Driver_URL=jdbc:postgresql:pxfautomation//localhost:5432_BY=dt:date=2015-03-06:2015-03-20=1=adiachenko')
>  FORMAT 'CUSTOM' (formatter='pxfwritable_import');
> {code}
> Actual behavior:
> {code}
> select * from pxf_jdbc_multiple_fragments_by_date;
> ERROR:  remote component error (500) from '127.0.0.1:51200':  type  Exception 
> report   message   description   The server encountered an internal error 
> that prevented it from fulfilling this request.exception   
> java.lang.NullPointerException (libchurl.c:897)
> {code}
> h3. User didn't pass interval:
> {code}
> CREATE EXTERNAL TABLE pxf_jdbc_multiple_fragments_by_date (t1text, t2
> text, num1  int, dub1  double precision, dec1  numeric, tm timestamp, r real, 
> bg bigint, b boolean, tn smallint, sml smallint, dt date, vc1 varchar(5), c1 
> char(3), bin bytea) LOCATION 
> (E'pxf://127.0.0.1:51200/hawq_types?PROFILE=Jdbc_DRIVER=org.postgresql.Driver_URL=jdbc:postgresql:pxfautomation//localhost:5432_BY=dt:date=2015-03-06:2015-03-20=adiachenko')
>  FORMAT 'CUSTOM' (formatter='pxfwritable_import');
> {code}
> Actual behavior:
> {code}
> ERROR:  remote component error (500) from '127.0.0.1:51200':  type  Exception 
> report   message   description   The server encountered an internal error 
> that prevented it from fulfilling this request.exception   
> java.lang.NullPointerException (libchurl.c:897)
> {code}
> h3. User didn't pass the upper boundary of a range:
> {code}
> CREATE EXTERNAL TABLE pxf_jdbc_multiple_fragments_by_date (t1text, t2
> text, num1  int, dub1  double precision, dec1  numeric, tm timestamp, r real, 
> bg bigint, b boolean, tn smallint, sml smallint, dt date, vc1 varchar(5), c1 
> char(3), bin bytea) LOCATION 
> (E'pxf://127.0.0.1:51200/hawq_types?PROFILE=Jdbc_DRIVER=org.postgresql.Driver_URL=jdbc:postgresql:pxfautomation//localhost:5432_BY=dt:date=2015-03-06:=1:DAY=adiachenko')
>  FORMAT 'CUSTOM' (formatter='pxfwritable_import');
> {code}
> Actual behavior:
> {code}
> select * from pxf_jdbc_multiple_fragments_by_date;
> ERROR:  remote component error (500) from '127.0.0.1:51200':  type  Exception 
> report   message   java.lang.Exception: 
> java.lang.ArrayIndexOutOfBoundsException: 1description   The server 
> encountered an internal error that prevented it from fulfilling this request. 
>exception   javax.servlet.ServletException: java.lang.Exception: 
> java.lang.ArrayIndexOutOfBoundsException: 1 (libchurl.c:897)
> {code}
> h3. User didn't pass range:
> {code}
> CREATE EXTERNAL TABLE pxf_jdbc_multiple_fragments_by_date (t1text, t2
> text, num1  int, dub1  double precision, dec1  numeric, tm timestamp, r real, 
> bg bigint, b boolean, tn smallint, sml smallint, dt date, vc1 varchar(5), c1 
> char(3), bin bytea) LOCATION 
> (E'pxf://127.0.0.1:51200/hawq_types?PROFILE=Jdbc_DRIVER=org.postgresql.Driver_URL=jdbc:postgresql:pxfautomation//localhost:5432_BY=dt:date=1:DAY=adiachenko')
>  FORMAT 'CUSTOM' (formatter='pxfwritable_import');
> {code}
> Actual behavior:
> {code}
> select * from pxf_jdbc_multiple_fragments_by_date;
> ERROR:  remote component error (500) from '127.0.0.1:51200':  type  Exception 
> report   message   java.lang.Exception: java.lang.NullPointerException
> description   The server encountered an internal error that prevented it from 
> fulfilling this request.exception   javax.servlet.ServletException: 
> java.lang.Exception: java.lang.NullPointerException (libchurl.c:897)
> {code}
> Expected behavior for all cases: a user-friendly, meaningful message hinting to 
> the user which parameter is missing/incorrect.
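
A sketch of the friendlier validation the ticket asks for (a hypothetical 
helper, not PXF's actual code; the option names PARTITION_BY, RANGE, and 
INTERVAL follow the profile's documented parameters):

```python
def validate_partition_options(opts):
    """Raise a descriptive ValueError instead of letting a missing
    partition option surface as a NullPointerException."""
    if "PARTITION_BY" not in opts:
        return  # partitioning not requested; nothing to check
    column, sep, ptype = opts["PARTITION_BY"].partition(":")
    if not sep or not column or ptype not in ("int", "date", "enum"):
        raise ValueError("PARTITION_BY must be <column>:{int|date|enum}")
    if "RANGE" not in opts:
        raise ValueError("RANGE is required when PARTITION_BY is specified")
    lo, sep, hi = opts["RANGE"].partition(":")
    if ptype != "enum":
        # int and date partitions need a closed range and a step size.
        if not sep or not lo or not hi:
            raise ValueError("RANGE must be <start>:<end> for %s partitions" % ptype)
        if "INTERVAL" not in opts:
            raise ValueError("INTERVAL is required for %s partitions" % ptype)
```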





[jira] [Commented] (HAWQ-1454) Exclude certain jars from Ranger Plugin Service packaging

2017-05-31 Thread Lav Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16031504#comment-16031504
 ] 

Lav Jain commented on HAWQ-1454:


Commit Id is 
https://github.com/lavjain/incubator-hawq/commit/94a2c65b7f56ea93f0c6b0d5527ef431c4913c20

> Exclude certain jars from Ranger Plugin Service packaging
> -
>
> Key: HAWQ-1454
> URL: https://issues.apache.org/jira/browse/HAWQ-1454
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Security
>Reporter: Lav Jain
>Assignee: Ed Espino
> Fix For: backlog
>
>
> The following jars may cause conflicts in certain environments depending on 
> how the classes are being loaded.
> {code}
> WEB-INF/lib/jersey-json-1.9.jar
> WEB-INF/lib/jersey-core-1.9.jar
> WEB-INF/lib/jersey-server-1.9.jar
> {code}
> We need to exclude them while building the RPM.
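
One way to drop the jars at RPM build time — a sketch only; the actual HAWQ 
build scripts may do this differently, and the install paths here are 
illustrative — is `%exclude` in the spec file's `%files` section:

```
%files
/usr/local/hawq/ranger/webapps/rps/WEB-INF/lib/*.jar
%exclude /usr/local/hawq/ranger/webapps/rps/WEB-INF/lib/jersey-json-1.9.jar
%exclude /usr/local/hawq/ranger/webapps/rps/WEB-INF/lib/jersey-core-1.9.jar
%exclude /usr/local/hawq/ranger/webapps/rps/WEB-INF/lib/jersey-server-1.9.jar
```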





[jira] [Resolved] (HAWQ-1454) Exclude certain jars from Ranger Plugin Service packaging

2017-05-31 Thread Lav Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lav Jain resolved HAWQ-1454.

   Resolution: Fixed
Fix Version/s: backlog



[jira] [Commented] (HAWQ-56) Non-deterministic header results with "HEADER" option from external table

2017-05-31 Thread Jacek Dobrowolski (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030913#comment-16030913
 ] 

Jacek Dobrowolski commented on HAWQ-56:
---

https://hdb.docs.pivotal.io/200/hawq/pxf/ReadWritePXF.html#built-inprofiles

HdfsTextMulti  - Read delimited single or multi-line records (with quoted 
linefeeds) from plain text files on HDFS. This profile is not splittable (non 
parallel); therefore reading is slower than reading with HdfsTextSimple.

HdfsTextMulti is not splittable, so a single reader processes all of a file's 
blocks and strips only the first line of the file (the header line); it 
therefore handles the HEADER keyword correctly.

Conclusion: you should not disable HEADER for PROFILE=HdfsTextMulti.

> Non-deterministic header results with "HEADER" option from external table
> -
>
> Key: HAWQ-56
> URL: https://issues.apache.org/jira/browse/HAWQ-56
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: PXF
>Reporter: Goden Yao
>Assignee: Noa Horn
>Priority: Critical
> Fix For: 2.0.0.0-incubating
>
>
> *Repro Steps*
> External table definition
> {code:sql}
> drop external table if exists testtbl;
> create external table testtbl(i text, j text)
> location 
> ('pxf://nakaphdns/tmp/testdata/*?Fragmenter=com.pivotal.pxf.plugins.hdfs.HdfsDataFragmenter=com.pivotal.pxf.plugins.hdfs.TextFileAccessor=com.pivotal.pxf.plugins.hdfs.TextResolver')
> format 'TEXT' (delimiter ',' header);
> select * from testtbl;
> {code}
> example with 4 segment servers and 4 test files with headers in hdfs
> {code:sql}
> gpadmin=# select * from testtbl ;
>  i | j
> ---+---
>  3 | c
>  2 | b
>  1 | a
>  4 | d
> (4 rows)
> {code}
> With 5 test files
> {code:sql}
> gpadmin=# select * from testtbl ;
>  i  |   j
> +---
>  5  | e
>  2  | b
>  ID | Value
>  3  | c
>  1  | a
>  ID | VALUE
>  4  | d
> (7 rows)
> {code}
> *Analysis*
> When using HEADER option, header line is removed only once per segment.
> As a result there will be different results depending on the number of 
> segments/fragments are scanned. If the number of files is greater than the 
> number of segments, the header row is included in the data for some files.
> If the number of files is less than or equal to the number of segments, the 
> data retrieved is good. Thus, non-deterministic.
> The reason for this behavior is that header line handling is done by the 
> external protocol code (fileam.c) which checks if the header_line flag is 
> true, and if so skips the first line and sets the flag to false. This code 
> calls the custom protocol code (in our case pxf) to get the next tuples, and 
> so doesn't know if the tuples are from the same resource or not.
> From what I can see, in gpfdist protocol the problem is solved by letting the 
> custom protocol code handle this and marking the flag as false for the 
> external protocol infrastructure in fileam.c.
> *Proposed Solution*
> Add a check in pxf validator function (a function that is being called as 
> part of external table creation).
> This check will error out if HEADER option is used in a PXF table.
> Currently the validation function API only allows access to the table's URLs 
> (in the LOCATION part of the table), and not the format options. In order to 
> add the check an API change in the external protocol is required.
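
The per-segment (rather than per-file) header flag described in the analysis can 
be demonstrated with a small simulation (the names are mine; this models the 
fileam.c behavior described above, it is not HAWQ code):

```python
def scan(files, num_segments):
    """Simulate external-table reads where each segment strips a header
    line only once, while files are distributed to segments round-robin."""
    rows = []
    for seg in range(num_segments):
        header_pending = True  # one flag per SEGMENT, not per file
        for f in files[seg::num_segments]:
            for i, line in enumerate(f):
                if header_pending and i == 0:
                    header_pending = False  # strips only the first file's header
                    continue
                rows.append(line)
    return rows

# 4 files on 4 segments: one file per segment, every header is stripped.
files = [["ID,Value", "%d,x" % n] for n in range(1, 5)]
clean = scan(files, 4)
# 5 files on 4 segments: some segment reads a second file, whose
# header leaks into the result set — the non-determinism in the report.
leaky = scan(files + [["ID,Value", "5,x"]], 4)
```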





[jira] [Commented] (HAWQ-56) Non-deterministic header results with "HEADER" option from external table

2017-05-31 Thread Jacek Dobrowolski (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16030906#comment-16030906
 ] 

Jacek Dobrowolski commented on HAWQ-56:
---

It gives non-deterministic results with PROFILE=HdfsTextSimple, as it reads 
each file block in parallel.

However, when using PROFILE=HdfsTextMulti in the PXF table definition, the 
table is read serially and only the first file line is stripped - and that 
works correctly.

Please reopen this and allow using HEADER with PROFILE=HdfsTextMulti.

> Non-deterministic header results with "HEADER" option from external table
> -
>
> Key: HAWQ-56
> URL: https://issues.apache.org/jira/browse/HAWQ-56
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: PXF
>Reporter: Goden Yao
>Assignee: Noa Horn
>Priority: Critical
> Fix For: 2.0.0.0-incubating
>
>
> *Repro Steps*
> External table definition
> {code:sql}
> drop external table if exists testtbl;
> create external table testtbl(i text, j text)
> location 
> ('pxf://nakaphdns/tmp/testdata/*?Fragmenter=com.pivotal.pxf.plugins.hdfs.HdfsDataFragmenter=com.pivotal.pxf.plugins.hdfs.TextFileAccessor=com.pivotal.pxf.plugins.hdfs.TextResolver')
> format 'TEXT' (delimiter ',' header);
> select * from testtbl;
> {code}
> example with 4 segment servers and 4 test files with headers in hdfs
> {code:sql}
> gpadmin=# select * from testtbl ;
>  i | j
> ---+---
>  3 | c
>  2 | b
>  1 | a
>  4 | d
> (4 rows)
> {code}
> With 5 test files
> {code:sql}
> gpadmin=# select * from testtbl ;
>  i  |   j
> +---
>  5  | e
>  2  | b
>  ID | Value
>  3  | c
>  1  | a
>  ID | VALUE
>  4  | d
> (7 rows)
> {code}
> *Analysis*
> When using the HEADER option, the header line is removed only once per
> segment. As a result, the output differs depending on how many
> segments/fragments are scanned. If the number of files is greater than the
> number of segments, the header row is included in the data for some files.
> If the number of files is less than or equal to the number of segments, the
> retrieved data is correct. Thus, the behavior is non-deterministic.
> The reason for this behavior is that header line handling is done by the
> external protocol code (fileam.c), which checks whether the header_line flag
> is true and, if so, skips the first line and sets the flag to false. This
> code calls the custom protocol code (in our case PXF) to get the next
> tuples, and so does not know whether the tuples come from the same resource
> or not.
> From what I can see, in the gpfdist protocol the problem is solved by
> letting the custom protocol code handle this and mark the flag as false for
> the external protocol infrastructure in fileam.c.
> *Proposed Solution*
> Add a check in the PXF validator function (a function that is called as
> part of external table creation).
> This check will error out if the HEADER option is used in a PXF table.
> Currently the validation function API only allows access to the table's
> URLs (in the LOCATION part of the table definition), and not the format
> options. In order to add the check, an API change in the external protocol
> is required.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[GitHub] incubator-hawq issue #1243: HAWQ-1458. Fix share input scan bug for writer p...

2017-05-31 Thread paul-guo-
Github user paul-guo- commented on the issue:

https://github.com/apache/incubator-hawq/pull/1243
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-hawq pull request #1243: HAWQ-1458. Fix share input scan bug for w...

2017-05-31 Thread paul-guo-
Github user paul-guo- commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1243#discussion_r119279208
  
--- Diff: src/backend/executor/nodeShareInputScan.c ---
@@ -666,6 +717,29 @@ static int retry_write(int fd, char *buf, int wsize)
return 0;
 }
 
+
+
+/*
+ * generate_lock_file_name
+ *
+ * Called by reader or writer to make the unique lock file name.
+ */
+void generate_lock_file_name(char* p, int size, int share_id, const char* 
name)
+{
+   if (strncmp(name , "writer", strlen("writer")) == 0)
+   {
+   sisc_lockname(p, size, share_id, "ready");
+   strncat(p, name, lengthof(p) - strlen(p) - 1);
--- End diff --

Not lengthof(p). Should be size?


---


[GitHub] incubator-hawq pull request #1243: HAWQ-1458. Fix share input scan bug for w...

2017-05-31 Thread paul-guo-
Github user paul-guo- commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1243#discussion_r119279170
  
--- Diff: src/backend/executor/nodeMaterial.c ---
@@ -759,3 +764,30 @@ ExecEagerFreeMaterial(MaterialState *node)
}
 }
 
+
+/*
+ * mkLockFileForWriter
+ * 
+ * Create a unique lock file for writer, then use flock() to lock/unlock 
the lock file.
+ * We can make sure the lock file will be locked forever until the writer 
process quits.
+ */
+static void mkLockFileForWriter(int size, int share_id, char * name)
+{
+   char *lock_file;
+   int lock;
+
+   lock_file = (char *)palloc0(sizeof(char) * MAXPGPATH);
--- End diff --

MAXPGPATH -> size?


---


[GitHub] incubator-hawq pull request #1243: HAWQ-1458. Fix share input scan bug for w...

2017-05-31 Thread paul-guo-
Github user paul-guo- commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1243#discussion_r119279222
  
--- Diff: src/backend/executor/nodeShareInputScan.c ---
@@ -666,6 +717,29 @@ static int retry_write(int fd, char *buf, int wsize)
return 0;
 }
 
+
+
+/*
+ * generate_lock_file_name
+ *
+ * Called by reader or writer to make the unique lock file name.
+ */
+void generate_lock_file_name(char* p, int size, int share_id, const char* 
name)
+{
+   if (strncmp(name , "writer", strlen("writer")) == 0)
+   {
+   sisc_lockname(p, size, share_id, "ready");
+   strncat(p, name, lengthof(p) - strlen(p) - 1);
+   }
+   else
+   {
+   sisc_lockname(p, size, share_id, "done");
+   strncat(p, name, lengthof(p) - strlen(p) - 1);
--- End diff --

ditto.


---


[GitHub] incubator-hawq pull request #1243: HAWQ-1458. Fix share input scan bug for w...

2017-05-31 Thread paul-guo-
Github user paul-guo- commented on a diff in the pull request:

https://github.com/apache/incubator-hawq/pull/1243#discussion_r119278591
  
--- Diff: src/backend/executor/nodeMaterial.c ---
@@ -759,3 +764,30 @@ ExecEagerFreeMaterial(MaterialState *node)
}
 }
 
+
+/*
+ * mkLockFileForWriter
+ * 
+ * Create a unique lock file for writer, then use flock() to lock/unlock 
the lock file.
+ * We can make sure the lock file will be locked forever until the writer 
process quits.
+ */
+static void mkLockFileForWriter(int size, int share_id, char * name)
+{
+   char *lock_file;
+   int lock;
+
+   lock_file = (char *)palloc0(sizeof(char) * MAXPGPATH);
+   generate_lock_file_name(lock_file, size, share_id, name);
+   elog(DEBUG3, "The lock file for writer in SISC is %s", lock_file);
+   sisc_writer_lock_fd = open(lock_file, O_CREAT, S_IRWXU);
+   if(sisc_writer_lock_fd < 0)
+   {
+   elog(ERROR, "Could not create lock file %s for writer in SISC. 
The error number is %d", lock_file, errno);
+   }
+   lock = flock(sisc_writer_lock_fd, LOCK_EX | LOCK_NB);
+   if(lock == -1)
+   elog(DEBUG3, "Could not lock lock file  \"%s\" for writer in 
SISC . The error number is %d", lock_file, errno);
+   else if(lock == 0)
+   elog(DEBUG3, "Successfully locked lock file  \"%s\" for writer 
in SISC.The error number is %d", lock_file, errno);
--- End diff --

For (lock == 0), there is no need of errno in log.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---