[jira] [Created] (HAWQ-1485) Use user/password instead of credentials cache in Ranger lookup for HAWQ with Kerberos enabled.

2017-06-12 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1485:
---

 Summary: Use user/password instead of credentials cache in Ranger 
lookup for HAWQ with Kerberos enabled.
 Key: HAWQ-1485
 URL: https://issues.apache.org/jira/browse/HAWQ-1485
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Security
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


When the credentials cache is used:
trying a wrong password in the Ranger UI doesn't destroy the existing Kerberos 
credentials (created by the last successful kinit command).
This behavior is confusing to users.

So we should use user/password for Kerberos authentication.
Core logic:
{code}
Properties props = new Properties();
if (connectionProperties.containsKey(AUTHENTICATION) &&
        connectionProperties.get(AUTHENTICATION).equals(KERBEROS)) {
    // Kerberos mode
    props.setProperty("kerberosServerName", connectionProperties.get("principal"));
    props.setProperty("jaasApplicationName", "pgjdbc");
}

String url = String.format("jdbc:postgresql://%s:%s/%s",
        connectionProperties.get("hostname"), connectionProperties.get("port"), db);
props.setProperty("user", connectionProperties.get("username"));
props.setProperty("password", connectionProperties.get("password"));

return DriverManager.getConnection(url, props);
{code}
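
For illustration, a minimal sketch of how this logic might be driven, assuming 
the snippet above is wrapped in a helper such as 
getConnection(connectionProperties, db) (the helper name and the concrete 
values are illustrative assumptions, not Ranger's actual service-config keys):
{code}
Map<String, String> connectionProperties = new HashMap<>();
connectionProperties.put("hostname", "hawq-master.example.com");
connectionProperties.put("port", "5432");
connectionProperties.put("username", "rangerlookup");
connectionProperties.put("password", "lookup-password");
connectionProperties.put(AUTHENTICATION, KERBEROS); // same constants as above
connectionProperties.put("principal", "postgres");  // Kerberos service name

// With "jaasApplicationName" set, pgjdbc looks up the "pgjdbc" JAAS entry, and
// its callback handler can hand the user/password to the Krb5LoginModule, so a
// wrong password never touches an existing ticket cache.
try (Connection conn = getConnection(connectionProperties, "postgres")) {
    // run the lookup query here
}
{code}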






[jira] [Assigned] (HAWQ-1485) Use user/password instead of credentials cache in Ranger lookup for HAWQ with Kerberos enabled.

2017-06-12 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1485:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Use user/password instead of credentials cache in Ranger lookup for HAWQ with 
> Kerberos enabled.
> ---
>
> Key: HAWQ-1485
> URL: https://issues.apache.org/jira/browse/HAWQ-1485
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the credentials cache is used:
> trying a wrong password in the Ranger UI doesn't destroy the existing Kerberos 
> credentials (created by the last successful kinit command).
> This behavior is confusing to users.
> So we should use user/password for Kerberos authentication.
> Core logic:
> {code}
> Properties props = new Properties();
> if (connectionProperties.containsKey(AUTHENTICATION) &&
>         connectionProperties.get(AUTHENTICATION).equals(KERBEROS)) {
>     // Kerberos mode
>     props.setProperty("kerberosServerName", connectionProperties.get("principal"));
>     props.setProperty("jaasApplicationName", "pgjdbc");
> }
> String url = String.format("jdbc:postgresql://%s:%s/%s",
>         connectionProperties.get("hostname"), connectionProperties.get("port"), db);
> props.setProperty("user", connectionProperties.get("username"));
> props.setProperty("password", connectionProperties.get("password"));
> return DriverManager.getConnection(url, props);
> {code}





[jira] [Closed] (HAWQ-1485) Use user/password instead of credentials cache in Ranger lookup for HAWQ with Kerberos enabled.

2017-06-15 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1485.
---
Resolution: Fixed

fixed

> Use user/password instead of credentials cache in Ranger lookup for HAWQ with 
> Kerberos enabled.
> ---
>
> Key: HAWQ-1485
> URL: https://issues.apache.org/jira/browse/HAWQ-1485
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the credentials cache is used:
> trying a wrong password in the Ranger UI doesn't destroy the existing Kerberos 
> credentials (created by the last successful kinit command).
> This behavior is confusing to users.
> So we should use user/password for Kerberos authentication.
> Core logic:
> {code}
> Properties props = new Properties();
> if (connectionProperties.containsKey(AUTHENTICATION) &&
>         connectionProperties.get(AUTHENTICATION).equals(KERBEROS)) {
>     // Kerberos mode
>     props.setProperty("kerberosServerName", connectionProperties.get("principal"));
>     props.setProperty("jaasApplicationName", "pgjdbc");
> }
> String url = String.format("jdbc:postgresql://%s:%s/%s",
>         connectionProperties.get("hostname"), connectionProperties.get("port"), db);
> props.setProperty("user", connectionProperties.get("username"));
> props.setProperty("password", connectionProperties.get("password"));
> return DriverManager.getConnection(url, props);
> {code}





[jira] [Created] (HAWQ-1493) Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar

2017-06-29 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1493:
---

 Summary: Integrate Ranger lookup JAAS configuration in 
ranger-admin plugin jar
 Key: HAWQ-1493
 URL: https://issues.apache.org/jira/browse/HAWQ-1493
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Security
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


To support Ranger lookup against a kerberized HAWQ, we currently modify the 
java.security file and add a .java.login.config file to the ranger account.
But both files are global and affect other programs, so we need to integrate 
the JAAS configuration in a private scope.

After investigation, calling `setProperty("java.security.auth.login.config")` in 
the ranger-admin plugin code is a good solution.
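
A minimal sketch of the proposed private-scope approach (the JAAS file path 
shown is an illustrative assumption; the property name is the standard JAAS 
system property):
{code}
// Point JAAS at a login configuration shipped with the ranger-admin plugin,
// instead of editing the global java.security / ~/.java.login.config files.
System.setProperty("java.security.auth.login.config",
        "/path/to/ranger-admin/plugin/conf/hawq-lookup.jaas");
{code}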





[jira] [Assigned] (HAWQ-1493) Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar

2017-06-29 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1493:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar
> -
>
> Key: HAWQ-1493
> URL: https://issues.apache.org/jira/browse/HAWQ-1493
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> To support Ranger lookup against a kerberized HAWQ, we currently modify the 
> java.security file and add a .java.login.config file to the ranger account.
> But both files are global and affect other programs, so we need to integrate 
> the JAAS configuration in a private scope.
> After investigation, calling `setProperty("java.security.auth.login.config")` 
> in the ranger-admin plugin code is a good solution.





[jira] [Commented] (HAWQ-1493) Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar

2017-06-29 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069401#comment-16069401
 ] 

Hongxu Ma commented on HAWQ-1493:
-

Calling `setProperty("java.security.auth.login.config")` alone doesn't work,
because Ranger does some hacks via 
org.apache.ranger.audit.utils.InMemoryJAASConfiguration; we should use this 
class in the same way Ranger does.
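
A hedged sketch of that approach (the property keys follow 
InMemoryJAASConfiguration's "xasecure.audit.jaas.<entry>..." naming convention; 
the exact keys, option values, and init() signature should be verified against 
the Ranger version in use):
{code}
Properties props = new Properties();
// Define a JAAS entry named "pgjdbc" so the lookup JDBC driver can find it.
props.setProperty("xasecure.audit.jaas.pgjdbc.loginModuleName",
        "com.sun.security.auth.module.Krb5LoginModule");
props.setProperty("xasecure.audit.jaas.pgjdbc.loginModuleControlFlag", "required");
// Authenticate with user/password instead of the ticket cache (see HAWQ-1485).
props.setProperty("xasecure.audit.jaas.pgjdbc.option.useTicketCache", "false");

// Installs an in-memory javax.security.auth.login.Configuration, so no global
// java.security / .java.login.config edits are needed.
InMemoryJAASConfiguration.init(props);
{code}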


> Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar
> -
>
> Key: HAWQ-1493
> URL: https://issues.apache.org/jira/browse/HAWQ-1493
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> For support ranger lookup a kerberized HAWQ, we modify java.security file and 
> add a .java.login.config file into ranger account now.
> But both of the two files are global, have influenced other program, so we 
> need integrate the JAAS configuration in a private scope.
> After investigation, `setProperty("java.security.auth.login.config")` in 
> ranger-admin plugin code is a good solution.





[jira] [Closed] (HAWQ-1493) Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar

2017-07-11 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1493.
---
Resolution: Fixed

fixed

> Integrate Ranger lookup JAAS configuration in ranger-admin plugin jar
> -
>
> Key: HAWQ-1493
> URL: https://issues.apache.org/jira/browse/HAWQ-1493
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Security
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> To support Ranger lookup against a kerberized HAWQ, we currently modify the 
> java.security file and add a .java.login.config file to the ranger account.
> But both files are global and affect other programs, so we need to integrate 
> the JAAS configuration in a private scope.
> After investigation, calling `setProperty("java.security.auth.login.config")` 
> in the ranger-admin plugin code is a good solution.





[jira] [Created] (HAWQ-1506) Fix multi-append bug

2017-07-24 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1506:
---

 Summary: Fix multi-append bug
 Key: HAWQ-1506
 URL: https://issues.apache.org/jira/browse/HAWQ-1506
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: libhdfs
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() on it multiple times.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.





[jira] [Assigned] (HAWQ-1506) Fix multi-append bug

2017-07-24 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1506:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Fix multi-append bug
> 
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Commented] (HAWQ-1506) Fix multi-append bug

2017-07-24 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16098118#comment-16098118
 ] 

Hongxu Ma commented on HAWQ-1506:
-

After reading the HDFS code 
(hadoop-common/src/main/java/org/apache/hadoop/crypto/CryptoOutputStream.java):
we should call updateEncryptor() (which calculates the correct counter for 
AES-CTR) when opening a file.

That should fix this bug.
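
For intuition, a minimal sketch of the counter math involved, modeled on 
Hadoop's AesCtrCryptoCodec.calculateIV (the names mirror Hadoop's Java code; 
libhdfs3's actual updateEncryptor() is C++ and may differ in detail):
{code}
// AES-CTR encrypts block i with keystream IV0 + i, so when a file is reopened
// for append at byte offset pos, the encryptor must resume from counter = pos / 16.
// If the counter is not advanced, every append restarts the keystream at block 0,
// which matches the "only the first write is correct" symptom above.
static final int AES_BLOCK_SIZE = 16;

static void calculateIV(byte[] initIV, long counter, byte[] iv) {
    int i = iv.length;
    int j = 0;
    int sum = 0;
    while (i-- > 0) {
        // (sum >>> Byte.SIZE) carries the addition into the next higher byte.
        sum = (initIV[i] & 0xff) + (sum >>> Byte.SIZE);
        if (j++ < 8) { // the low 8 bytes receive the 64-bit counter, big-endian
            sum += (byte) counter & 0xff;
            counter >>>= 8;
        }
        iv[i] = (byte) sum;
    }
}

// On (re)open for append:
//   long counter = fileLength / AES_BLOCK_SIZE;
//   byte[] iv = new byte[AES_BLOCK_SIZE];
//   calculateIV(initIV, counter, iv);  // then re-init the AES-CTR cipher with iv
{code}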

> Fix multi-append bug
> 
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Updated] (HAWQ-1506) Fix multi-append bug when writing to an encryption zone directory

2017-07-24 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1506:

Summary: Fix multi-append bug when writing to an encryption zone directory  
(was: Fix multi-append bug)

> Fix multi-append bug when writing to an encryption zone directory
> -
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Updated] (HAWQ-1506) Fix multi-append bug when writing to an encryption zone

2017-07-24 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1506:

Summary: Fix multi-append bug when writing to an encryption zone  (was: Fix 
multi-append bug when writing to an encryption zone directory)

> Fix multi-append bug when writing to an encryption zone
> ---
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Updated] (HAWQ-1506) Support multi-append of a file within an encryption zone

2017-07-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1506:

Summary: Support multi-append of a file within an encryption zone  (was: Fix 
multi-append bug when writing to an encryption zone)

> Support multi-append of a file within an encryption zone
> --
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Updated] (HAWQ-1506) Support multi-append of a file within an encryption zone

2017-07-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1506:

Description: 
Currently, multi-append 
Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() on it multiple times.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.

  was:
Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() on it multiple times.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.


> Support multi-append of a file within an encryption zone
> --
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Currently, multi-append 
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Updated] (HAWQ-1506) Support multi-append of a file within an encryption zone

2017-07-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1506:

Description: 
Currently, multi-append can cause write 
Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() on it multiple times.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.

  was:
Currently, multi-append 
Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() on it multiple times.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.


> Support multi-append of a file within an encryption zone
> --
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Currently, multi-append can cause write 
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() on it multiple times.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Updated] (HAWQ-1506) Support multi-append of a file within an encryption zone

2017-07-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1506:

Description: 
Currently, multi-append (*serializable*) can cause incorrect content to be 
written.
Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() twice.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.

  was:
Currently, multi-append can cause write 
Reproduction steps:
# Open a file with the O_APPEND flag in an encryption zone directory.
# Call hdfsWrite() on it multiple times.
# Read back the whole file => only the content of the first write is correct.

So we need to fix it.


> Support multi-append of a file within an encryption zone
> --
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Currently, multi-append (*serializable*) can cause incorrect content to be 
> written.
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() twice.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Assigned] (HAWQ-1508) Fix broken Travis build caused by libssl path

2017-07-27 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1508:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Fix broken Travis build caused by libssl path
> ---
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> https://travis-ci.org/apache/incubator-hawq/builds/257989127?utm_source=github_status&utm_medium=notification
> {code}
> CMake Error at /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
>   Could NOT find SSL (missing: SSL_INCLUDE_DIR)
> Call Stack (most recent call first):
>   /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
>   CMake/FindSSL.cmake:24 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
>   CMakeLists.txt:24 (FIND_PACKAGE)
> -- Configuring incomplete, errors occurred!
> See also "/Users/travis/build/apache/incubator-hawq/depends/libhdfs3/build/CMakeFiles/CMakeOutput.log".
> failed to configure the project
> make[1]: *** [pre-config] Error 1
> make: *** [all] Error 2
> {code}





[jira] [Created] (HAWQ-1508) Fix broken Travis build caused by libssl path

2017-07-27 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1508:
---

 Summary: Fix broken Travis build caused by libssl path
 Key: HAWQ-1508
 URL: https://issues.apache.org/jira/browse/HAWQ-1508
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: libhdfs
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


https://travis-ci.org/apache/incubator-hawq/builds/257989127?utm_source=github_status&utm_medium=notification
{code}
CMake Error at /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
  Could NOT find SSL (missing: SSL_INCLUDE_DIR)
Call Stack (most recent call first):
  /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
  CMake/FindSSL.cmake:24 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
  CMakeLists.txt:24 (FIND_PACKAGE)
-- Configuring incomplete, errors occurred!
See also "/Users/travis/build/apache/incubator-hawq/depends/libhdfs3/build/CMakeFiles/CMakeOutput.log".
failed to configure the project
make[1]: *** [pre-config] Error 1
make: *** [all] Error 2
{code}





[jira] [Created] (HAWQ-1509) Support TDE read function

2017-07-27 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1509:
---

 Summary: Support TDE read function
 Key: HAWQ-1509
 URL: https://issues.apache.org/jira/browse/HAWQ-1509
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: libhdfs
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


Currently, we already support TDE write.
TDE read will be supported in this JIRA.





[jira] [Closed] (HAWQ-1508) Fix broken Travis build caused by libssl path

2017-07-27 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1508.
---
Resolution: Fixed

fixed

> Fix broken Travis build caused by libssl path
> ---
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> https://travis-ci.org/apache/incubator-hawq/builds/257989127?utm_source=github_status&utm_medium=notification
> {code}
> CMake Error at /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
>   Could NOT find SSL (missing: SSL_INCLUDE_DIR)
> Call Stack (most recent call first):
>   /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:388 (_FPHSA_FAILURE_MESSAGE)
>   CMake/FindSSL.cmake:24 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
>   CMakeLists.txt:24 (FIND_PACKAGE)
> -- Configuring incomplete, errors occurred!
> See also "/Users/travis/build/apache/incubator-hawq/depends/libhdfs3/build/CMakeFiles/CMakeOutput.log".
> failed to configure the project
> make[1]: *** [pre-config] Error 1
> make: *** [all] Error 2
> {code}





[jira] [Assigned] (HAWQ-1509) Support TDE read function

2017-08-04 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1509:
---

Assignee: Amy  (was: Hongxu Ma)

> Support TDE read function
> -
>
> Key: HAWQ-1509
> URL: https://issues.apache.org/jira/browse/HAWQ-1509
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Amy
> Fix For: 2.3.0.0-incubating
>
>
> Currently, we already support TDE write.
> TDE read will be supported in this JIRA.





[jira] [Assigned] (HAWQ-1193) TDE support in HAWQ

2017-08-04 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1193:
---

Assignee: Hongxu Ma  (was: Ivan Weng)

> TDE support in HAWQ
> ---
>
> Key: HAWQ-1193
> URL: https://issues.apache.org/jira/browse/HAWQ-1193
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: backlog
>
> Attachments: HAWQ_TDE_Design_ver0.1.pdf, HAWQ_TDE_Design_ver0.2 .pdf
>
>
>  TDE (transparent data encryption) has been supported since Hadoop 2.6:
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
> https://issues.apache.org/jira/browse/HDFS-6134
> Using TDE guarantees:
> 1, HDFS files are encrypted.
> 2, network transfer between HDFS and the libhdfs client is encrypted.
> So HAWQ will update libhdfs3 to support it.





[jira] [Closed] (HAWQ-1506) Support multi-append of a file within an encryption zone

2017-08-04 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1506.
---
Resolution: Fixed

fixed

> Support multi-append of a file within an encryption zone
> --
>
> Key: HAWQ-1506
> URL: https://issues.apache.org/jira/browse/HAWQ-1506
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> Currently, multi-append (*serializable*) can cause incorrect content to be 
> written.
> Reproduction steps:
> # Open a file with the O_APPEND flag in an encryption zone directory.
> # Call hdfsWrite() twice.
> # Read back the whole file => only the content of the first write is correct.
> So we need to fix it.





[jira] [Created] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-04 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1510:
---

 Summary: Add TDE-related functionality into hawq command line tools
 Key: HAWQ-1510
 URL: https://issues.apache.org/jira/browse/HAWQ-1510
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone (see the example after this list).

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

2, hawq state
show the encryption zone info if the user has enabled TDE in HAWQ.

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.
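
For illustration, the key is created up front with the Hadoop KMS CLI and then 
passed to init; the key name key_demo matches the command added in later 
revisions of this description, and this assumes a KMS is configured via 
dfs.encryption.key.provider.uri:
{code}
hadoop key create key_demo
hawq init cluster --tde_keyname key_demo
{code}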






[jira] [Assigned] (HAWQ-1511) Add TDE-related properties into function-test.xml

2017-08-04 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1511:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Add TDE-related properties into function-test.xml
> -
>
> Key: HAWQ-1511
> URL: https://issues.apache.org/jira/browse/HAWQ-1511
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> include:
> * dfs.encryption.key.provider.uri
> * hadoop.security.crypto.buffer.size





[jira] [Assigned] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-04 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1510:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> an existing (and non-empty) hawq_default directory cannot be converted into 
> an encryption zone.
> 2, hawq state
> show the encryption zone info if the user has enabled TDE in HAWQ.
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Created] (HAWQ-1511) Add TDE-related properties into function-test.xml

2017-08-04 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1511:
---

 Summary: Add TDE-related properties into function-test.xml
 Key: HAWQ-1511
 URL: https://issues.apache.org/jira/browse/HAWQ-1511
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: libhdfs
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


include:
* dfs.encryption.key.provider.uri
* hadoop.security.crypto.buffer.size
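
For illustration, a sketch of the corresponding hdfs-client.xml entries (the 
KMS URI and buffer size values are illustrative assumptions; 8192 is Hadoop's 
usual default crypto buffer size):
{code}
<property>
  <name>dfs.encryption.key.provider.uri</name>
  <value>kms://http@kms-host.example.com:16000/kms</value>
</property>
<property>
  <name>hadoop.security.crypto.buffer.size</name>
  <value>8192</value>
</property>
{code}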





[jira] [Updated] (HAWQ-1511) Add TDE-related properties into hdfs-client.xml

2017-08-06 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1511:

Summary: Add TDE-related properties into hdfs-client.xml  (was: Add 
TDE-related properties into function-test.xml)

> Add TDE-related properties into hdfs-client.xml
> ---
>
> Key: HAWQ-1511
> URL: https://issues.apache.org/jira/browse/HAWQ-1511
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> include:
> * dfs.encryption.key.provider.uri
> * hadoop.security.crypto.buffer.size





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-07 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

-2, hawq state-
show the encryption zone info if the user has enabled TDE in HAWQ.

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> an existing (and non-empty) hawq_default directory cannot be converted into 
> an encryption zone.
> -2, hawq state-
> -show the encryption zone info if the user has enabled TDE in HAWQ.-
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-07 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

-2, hawq state
show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

2, hawq state
show the encryption zone info if the user has enabled TDE in HAWQ.

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> an existing (and non-empty) hawq_default directory cannot be converted into 
> an encryption zone.
> -2, hawq state
> show the encryption zone info if the user has enabled TDE in HAWQ.-
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-07 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

-2, hawq state-
show the encryption zone info if the user has enabled TDE in HAWQ.

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

-2, hawq state
show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> an existing (and non-empty) hawq_default directory cannot be converted into 
> an encryption zone.
> -2, hawq state-
> show the encryption zone info if the user has enabled TDE in HAWQ.
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Comment Edited] (HAWQ-1511) Add TDE-related properties into hdfs-client.xml

2017-08-07 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117759#comment-16117759
 ] 

Hongxu Ma edited comment on HAWQ-1511 at 8/8/17 2:37 AM:
-

Currently, I think the default values of all properties (except 
dfs.encryption.key.provider.uri) are enough for users.


was (Author: hongxu ma):
Currently, I think the default values of all properties (except 
{code:java}
dfs.encryption.key.provider.uri
{code}
) are enough for users.

> Add TDE-related properties into hdfs-client.xml
> ---
>
> Key: HAWQ-1511
> URL: https://issues.apache.org/jira/browse/HAWQ-1511
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> include:
> * dfs.encryption.key.provider.uri
> * hadoop.security.crypto.buffer.size





[jira] [Commented] (HAWQ-1511) Add TDE-related properties into hdfs-client.xml

2017-08-07 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117759#comment-16117759
 ] 

Hongxu Ma commented on HAWQ-1511:
-

Currently, I think the default values of all properties (except 
{code:java}
dfs.encryption.key.provider.uri
{code}
) are enough for users.

> Add TDE-related properties into hdfs-client.xml
> ---
>
> Key: HAWQ-1511
> URL: https://issues.apache.org/jira/browse/HAWQ-1511
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> include:
> * dfs.encryption.key.provider.uri
> * hadoop.security.crypto.buffer.size





[jira] [Assigned] (HAWQ-1511) Add TDE-related properties into hdfs-client.xml

2017-08-07 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1511:
---

Assignee: Amy  (was: Hongxu Ma)

> Add TDE-related properties into hdfs-client.xml
> ---
>
> Key: HAWQ-1511
> URL: https://issues.apache.org/jira/browse/HAWQ-1511
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Amy
> Fix For: 2.3.0.0-incubating
>
>
> include:
> * dfs.encryption.key.provider.uri
> * hadoop.security.crypto.buffer.size





[jira] [Commented] (HAWQ-1514) TDE feature makes libhdfs3 require openssl1.1

2017-08-08 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16119343#comment-16119343
 ] 

Hongxu Ma commented on HAWQ-1514:
-

Hi Yi, 
In my tests, openssl >= 1.0.1 is enough. 
TDE needs the AES-CTR-256 algorithm, and from what I found it seems to be 
available only since OpenSSL v1.0.1.

But your failed 1.0.21 test is a bit strange; I will check it later.


> TDE feature makes libhdfs3 require openssl1.1
> -
>
> Key: HAWQ-1514
> URL: https://issues.apache.org/jira/browse/HAWQ-1514
> Project: Apache HAWQ
>  Issue Type: Task
>  Components: libhdfs
>Reporter: Yi Jin
>Assignee: Radar Lei
> Fix For: 2.3.0.0-incubating
>
>
> The new TDE feature delivered in libhdfs3 requires a specific version of 
> openssl: at least per my test, 1.0.21 does not work, while a library built 
> from 1.1 source code passed.
> So maybe we need some build and installation instruction improvements. 





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-14 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> an existing (and non-empty) hawq_default directory cannot be converted into 
> an encryption zone.
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if the user has enabled TDE in HAWQ.-
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Created] (HAWQ-1518) Add a GUC for showing whether the data directory is an encryption zone

2017-08-16 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1518:
---

 Summary: Add a GUC for showing whether the data directory is an 
encryption zone
 Key: HAWQ-1518
 URL: https://issues.apache.org/jira/browse/HAWQ-1518
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Catalog
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


A read-only GUC for showing whether the data directory is an encryption zone.





[jira] [Updated] (HAWQ-1518) Add a UDF for showing whether the data directory is an encryption zone

2017-08-16 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1518:

Summary: Add a UDF for showing whether the data directory is an encryption 
zone  (was: Add a GUC for showing whether the data directory is an encryption 
zone)

> Add a UDF for showing whether the data directory is an encryption zone
> --
>
> Key: HAWQ-1518
> URL: https://issues.apache.org/jira/browse/HAWQ-1518
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Catalog
>Reporter: Hongxu Ma
>Assignee: Radar Lei
> Fix For: 2.3.0.0-incubating
>
>
> A read-only GUC for showing whether the data directory is an encryption zone.





[jira] [Updated] (HAWQ-1518) Add a UDF for showing whether the data directory is an encryption zone

2017-08-16 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1518:

Description: A read-only UDF for showing whether the data directory is an 
encryption zone.  (was: A read-only GUC for showing whether the data directory 
is an encryption zone.)

> Add a UDF for showing whether the data directory is an encryption zone
> --
>
> Key: HAWQ-1518
> URL: https://issues.apache.org/jira/browse/HAWQ-1518
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Catalog
>Reporter: Hongxu Ma
>Assignee: Radar Lei
> Fix For: 2.3.0.0-incubating
>
>
> A read-only UDF for showing whether the data directory is an encryption zone.





[jira] [Commented] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-27 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143442#comment-16143442
 ] 

Hongxu Ma commented on HAWQ-1510:
-

Note: 
creating an encryption zone needs the HDFS **superuser privilege**,
so if the HAWQ user and the HDFS superuser are not the same, you should create 
the encryption zone on the HAWQ directory manually before running the hawq 
init script.
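
For example (this is the same command later added to the issue description):
{code}
hdfs crypto -createZone -keyName key_demo -path /hawq_default/
{code}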

> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> an existing (and non-empty) hawq_default directory cannot be converted into 
> an encryption zone.
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if the user has enabled TDE in HAWQ.-
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-27 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
* an existing (and non-empty) hawq_default directory cannot be converted into 
an encryption zone.
* 

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
an existing (and non-empty) hawq_default directory cannot be converted into an 
encryption zone.

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> * an existing (and non-empty) hawq_default directory cannot be converted 
> into an encryption zone.
> * 
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if the user has enabled TDE in HAWQ.-
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-27 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
* an existing (and non-empty) hawq_default directory cannot be converted into 
an encryption zone.
* creating an encryption zone needs the HDFS *superuser privilege*, so if the 
HAWQ user and the HDFS superuser are not the same, you should create the 
encryption zone on the HAWQ directory manually before running the hawq init 
script.

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
* an existing (and non-empty) hawq_default directory cannot be converted into 
an encryption zone.
* if the HAWQ user and the HDFS superuser are not the same, you should create 
the encryption zone on the HAWQ directory manually before running the hawq 
init script.

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in HAWQ:
> the user should pass a key name (already created by the hadoop key command) 
> as a parameter when executing the init command; this makes the whole 
> hawq_default directory an encryption zone.
> note:
> * an existing (and non-empty) hawq_default directory cannot be converted 
> into an encryption zone.
> * creating an encryption zone needs the HDFS *superuser privilege*, so if 
> the HAWQ user and the HDFS superuser are not the same, you should create the 
> encryption zone on the HAWQ directory manually before running the hawq init 
> script.
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if the user has enabled TDE in HAWQ.-
> 3, hawq register
> files cannot be registered across different encryption zones / non-encrypted 
> zones.
> 4, hawq extract
> warn the user that the table data is stored in an encryption zone if TDE is 
> enabled in HAWQ.





[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-27 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
* an existing (and non-empty) hawq_default directory cannot be converted into 
an encryption zone.
* if the HAWQ user and the HDFS superuser are not the same, you should create 
the encryption zone on the HAWQ directory manually before running the hawq 
init script.

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.


  was:
1, hawq init
the only way to enable TDE in HAWQ:
the user should pass a key name (already created by the hadoop key command) as 
a parameter when executing the init command; this makes the whole hawq_default 
directory an encryption zone.

note:
* an existing (and non-empty) hawq_default directory cannot be converted into 
an encryption zone.
* 

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if the user has enabled TDE in HAWQ.-

3, hawq register
files cannot be registered across different encryption zones / non-encrypted 
zones.

4, hawq extract
warn the user that the table data is stored in an encryption zone if TDE is 
enabled in HAWQ.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in hawq:
> the user should give a key name (already created by the hadoop key command) as 
> a parameter when executing the init command; it makes the whole hawq_default 
> directory an encryption zone.
> note:
> * transferring an existing (and non-empty) hawq_default directory into an 
> encryption zone is not supported.
> * if the hawq user and the hdfs superuser are not the same one, you should 
> create the encryption zone on the hawq directory manually before running the 
> hawq-init script.
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if TDE is enabled in hawq.-
> 3, hawq register 
> cannot register files located in different encryption zones / non-encryption 
> zones.
> 4, hawq extract
> give the user a warning that the table data is stored in an encryption zone if 
> TDE is enabled in hawq.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-08-27 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in hawq:
the user should give a key name (already created by the hadoop key command) as 
a parameter when executing the init command; it makes the whole hawq_default 
directory an encryption zone.

note:
* transferring an existing (and non-empty) hawq_default directory into an 
encryption zone is not supported.
* creating an encryption zone needs the hdfs *superuser privilege*, so if the 
hawq user and the hdfs superuser are not the same one, you should create the 
encryption zone on the hawq directory manually before running the hawq-init 
script, example:
{code}
hdfs crypto -createZone -keyName key_demo -path /hawq_default/
{code}

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if TDE is enabled in hawq.-

3, hawq register 
cannot register files located in different encryption zones / non-encryption 
zones.

4, hawq extract
give the user a warning that the table data is stored in an encryption zone if 
TDE is enabled in hawq.


  was:
1, hawq init
the only way to enable TDE in hawq:
the user should give a key name (already created by the hadoop key command) as 
a parameter when executing the init command; it makes the whole hawq_default 
directory an encryption zone.

note:
* transferring an existing (and non-empty) hawq_default directory into an 
encryption zone is not supported.
* creating an encryption zone needs the hdfs *superuser privilege*, so if the 
hawq user and the hdfs superuser are not the same one, you should create the 
encryption zone on the hawq directory manually before running the hawq-init 
script.

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if TDE is enabled in hawq.-

3, hawq register 
cannot register files located in different encryption zones / non-encryption 
zones.

4, hawq extract
give the user a warning that the table data is stored in an encryption zone if 
TDE is enabled in hawq.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in hawq:
> the user should give a key name (already created by the hadoop key command) as 
> a parameter when executing the init command; it makes the whole hawq_default 
> directory an encryption zone.
> note:
> * transferring an existing (and non-empty) hawq_default directory into an 
> encryption zone is not supported.
> * creating an encryption zone needs the hdfs *superuser privilege*, so if the 
> hawq user and the hdfs superuser are not the same one, you should create the 
> encryption zone on the hawq directory manually before running the hawq-init 
> script, example:
> {code}
> hdfs crypto -createZone -keyName key_demo -path /hawq_default/
> {code}
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if TDE is enabled in hawq.-
> 3, hawq register 
> cannot register files located in different encryption zones / non-encryption 
> zones.
> 4, hawq extract
> give the user a warning that the table data is stored in an encryption zone if 
> TDE is enabled in hawq.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HAWQ-1520) gpcheckhdfs should skip hdfs trash directory

2017-08-28 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1520:
---

 Summary: gpcheckhdfs should skip hdfs trash directory
 Key: HAWQ-1520
 URL: https://issues.apache.org/jira/browse/HAWQ-1520
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Command Line Tools
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.3.0.0-incubating


When the hdfs trash feature is enabled, there is a *Trash* directory under the 
encryption zone.
Example:

{code}
[gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
/hawq/hawq-1503886333  tdekey
[gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
Found 1 items
drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
/hawq/hawq-1503886333/.Trash
{code}

But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an empty 
directory.
We should fix it to skip this trash directory.
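
A minimal sketch of the intended fix, assuming the emptiness check walks the 
directory listing (illustrative only, not the actual gpcheckhdfs code):

{code}
# Hypothetical sketch: treat the data directory as empty when it
# contains nothing except the HDFS trash directory.
def is_effectively_empty(entry_names, trash_name=".Trash"):
    # entry_names: base names listed under the HAWQ data directory
    return all(name == trash_name for name in entry_names)

# is_effectively_empty([".Trash"])          -> True
# is_effectively_empty([".Trash", "16385"]) -> False
{code}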



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HAWQ-1520) gpcheckhdfs should skip hdfs trash directory

2017-08-28 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1520:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> gpcheckhdfs should skip hdfs trash directory
> 
>
> Key: HAWQ-1520
> URL: https://issues.apache.org/jira/browse/HAWQ-1520
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the hdfs trash feature is enabled, there is a *Trash* directory under the 
> encryption zone.
> Example:
> {code}
> [gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
> /hawq/hawq-1503886333  tdekey
> [gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
> /hawq/hawq-1503886333/.Trash
> {code}
> But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an 
> empty directory.
> We should fix it to skip this trash directory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HAWQ-1520) gpcheckhdfs should skip hdfs trash directory

2017-08-28 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16143497#comment-16143497
 ] 

Hongxu Ma commented on HAWQ-1520:
-

about hdfs trash:
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_mc_config_trash.html

> gpcheckhdfs should skip hdfs trash directory
> 
>
> Key: HAWQ-1520
> URL: https://issues.apache.org/jira/browse/HAWQ-1520
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the hdfs trash feature is enabled, there is a *Trash* directory under the 
> encryption zone.
> Example:
> {code}
> [gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
> /hawq/hawq-1503886333  tdekey
> [gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
> /hawq/hawq-1503886333/.Trash
> {code}
> But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an 
> empty directory.
> We should fix it to skip this trash directory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HAWQ-1520) gpcheckhdfs should skip hdfs trash directory

2017-08-31 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1520:

Description: 
When the hdfs trash feature is enabled, there is a *Trash* directory under the 
encryption zone.
Example:

{code}
[gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
/hawq/hawq-1503886333  tdekey
[gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
Found 1 items
drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
/hawq/hawq-1503886333/.Trash
{code}

But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an empty 
directory.
We should fix it to skip this trash directory.

Note:


  was:
When the hdfs trash feature is enabled, there is a *Trash* directory under the 
encryption zone.
Example:

{code}
[gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
/hawq/hawq-1503886333  tdekey
[gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
Found 1 items
drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
/hawq/hawq-1503886333/.Trash
{code}

But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an empty 
directory.
We should fix it to skip this trash directory.


> gpcheckhdfs should skip hdfs trash directory
> 
>
> Key: HAWQ-1520
> URL: https://issues.apache.org/jira/browse/HAWQ-1520
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the hdfs trash feature is enabled, there is a *Trash* directory under the 
> encryption zone.
> Example:
> {code}
> [gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
> /hawq/hawq-1503886333  tdekey
> [gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
> /hawq/hawq-1503886333/.Trash
> {code}
> But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an 
> empty directory.
> We should fix it to skip this trash directory.
> Note:



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HAWQ-1520) gpcheckhdfs should skip hdfs trash directory

2017-08-31 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1520:

Description: 
When the hdfs trash feature is enabled, there is a *Trash* directory under the 
encryption zone.
Example:

{code}
[gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
/hawq/hawq-1503886333  tdekey
[gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
Found 1 items
drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
/hawq/hawq-1503886333/.Trash
{code}

But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an empty 
directory.
We should fix it to skip this trash directory.

Note:
If you just enable trash without an encryption zone, the trash directory is 
*/user//.Trash/*, so it doesn't influence gpcheckhdfs in that condition.

  was:
When the hdfs trash feature is enabled, there is a *Trash* directory under the 
encryption zone.
Example:

{code}
[gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
/hawq/hawq-1503886333  tdekey
[gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
Found 1 items
drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
/hawq/hawq-1503886333/.Trash
{code}

But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an empty 
directory.
We should fix it to skip this trash directory.

Note:



> gpcheckhdfs should skip hdfs trash directory
> 
>
> Key: HAWQ-1520
> URL: https://issues.apache.org/jira/browse/HAWQ-1520
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the hdfs trash feature is enabled, there is a *Trash* directory under the 
> encryption zone.
> Example:
> {code}
> [gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
> /hawq/hawq-1503886333  tdekey
> [gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
> /hawq/hawq-1503886333/.Trash
> {code}
> But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an 
> empty directory.
> We should fix it to skip this trash directory.
> Note:
> If you just enable trash without an encryption zone, the trash directory is 
> */user//.Trash/*, so it doesn't influence gpcheckhdfs in that 
> condition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1520) gpcheckhdfs should skip hdfs trash directory

2017-09-06 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1520.
---
Resolution: Fixed

fixed

> gpcheckhdfs should skip hdfs trash directory
> 
>
> Key: HAWQ-1520
> URL: https://issues.apache.org/jira/browse/HAWQ-1520
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> When the hdfs trash feature is enabled, there is a *Trash* directory under the 
> encryption zone.
> Example:
> {code}
> [gpadmin@test1 hawq]$ sudo -u hdfs hdfs crypto -listZones
> /hawq/hawq-1503886333  tdekey
> [gpadmin@test1 hawq]$ hdfs dfs -ls /hawq/hawq-1503886333/
> Found 1 items
> drwxrwxrwt   - hdfs hdfs  0 2017-08-27 23:59 
> /hawq/hawq-1503886333/.Trash
> {code}
> But the gpcheckhdfs script doesn't consider it (/hawq/hawq-1503886333) an 
> empty directory.
> We should fix it to skip this trash directory.
> Note:
> If you just enable trash without an encryption zone, the trash directory is 
> */user//.Trash/*, so it doesn't influence gpcheckhdfs in that 
> condition.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-09-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1510:

Description: 
1, hawq init
the only way to enable TDE in hawq:
the user should give a key name (already created by the hadoop key command) as 
a parameter when executing the init command; it makes the whole hawq_default 
directory an encryption zone.

note:
* transferring an existing (and non-empty) hawq_default directory into an 
encryption zone is not supported.
* creating an encryption zone needs the hdfs *superuser privilege*, so if the 
hawq user and the hdfs superuser are not the same one, you should create the 
encryption zone on the hawq directory manually before running the hawq-init 
script, example:
{code}
hdfs crypto -createZone -keyName key_demo -path /hawq_default/
{code}

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if TDE is enabled in hawq.-

-3, hawq register-
cannot register files located in different encryption zones / non-encryption 
zones.

-4, hawq extract-
give the user a warning that the table data is stored in an encryption zone if 
TDE is enabled in hawq.


  was:
1, hawq init
the only way to enable TDE in hawq:
the user should give a key name (already created by the hadoop key command) as 
a parameter when executing the init command; it makes the whole hawq_default 
directory an encryption zone.

note:
* transferring an existing (and non-empty) hawq_default directory into an 
encryption zone is not supported.
* creating an encryption zone needs the hdfs *superuser privilege*, so if the 
hawq user and the hdfs superuser are not the same one, you should create the 
encryption zone on the hawq directory manually before running the hawq-init 
script, example:
{code}
hdfs crypto -createZone -keyName key_demo -path /hawq_default/
{code}

command:
{code}
hawq init cluster --tde_keyname key_demo
{code}

-2, hawq state-
-show the encryption zone info if TDE is enabled in hawq.-

3, hawq register 
cannot register files located in different encryption zones / non-encryption 
zones.

4, hawq extract
give the user a warning that the table data is stored in an encryption zone if 
TDE is enabled in hawq.



> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in hawq:
> the user should give a key name (already created by the hadoop key command) as 
> a parameter when executing the init command; it makes the whole hawq_default 
> directory an encryption zone.
> note:
> * transferring an existing (and non-empty) hawq_default directory into an 
> encryption zone is not supported.
> * creating an encryption zone needs the hdfs *superuser privilege*, so if the 
> hawq user and the hdfs superuser are not the same one, you should create the 
> encryption zone on the hawq directory manually before running the hawq-init 
> script, example:
> {code}
> hdfs crypto -createZone -keyName key_demo -path /hawq_default/
> {code}
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if TDE is enabled in hawq.-
> -3, hawq register-
> cannot register files located in different encryption zones / non-encryption 
> zones.
> -4, hawq extract-
> give the user a warning that the table data is stored in an encryption zone if 
> TDE is enabled in hawq.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1510) Add TDE-related functionality into hawq command line tools

2017-09-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1510.
---
Resolution: Fixed

done

> Add TDE-related functionality into hawq command line tools
> --
>
> Key: HAWQ-1510
> URL: https://issues.apache.org/jira/browse/HAWQ-1510
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Command Line Tools
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: 2.3.0.0-incubating
>
>
> 1, hawq init
> the only way to enable TDE in hawq:
> the user should give a key name (already created by the hadoop key command) as 
> a parameter when executing the init command; it makes the whole hawq_default 
> directory an encryption zone.
> note:
> * transferring an existing (and non-empty) hawq_default directory into an 
> encryption zone is not supported.
> * creating an encryption zone needs the hdfs *superuser privilege*, so if the 
> hawq user and the hdfs superuser are not the same one, you should create the 
> encryption zone on the hawq directory manually before running the hawq-init 
> script, example:
> {code}
> hdfs crypto -createZone -keyName key_demo -path /hawq_default/
> {code}
> command:
> {code}
> hawq init cluster --tde_keyname key_demo
> {code}
> -2, hawq state-
> -show the encryption zone info if TDE is enabled in hawq.-
> -3, hawq register-
> cannot register files located in different encryption zones / non-encryption 
> zones.
> -4, hawq extract-
> give the user a warning that the table data is stored in an encryption zone if 
> TDE is enabled in hawq.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Closed] (HAWQ-1193) TDE support in HAWQ

2017-09-25 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1193.
---
Resolution: Fixed

finished

> TDE support in HAWQ
> ---
>
> Key: HAWQ-1193
> URL: https://issues.apache.org/jira/browse/HAWQ-1193
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
> Fix For: backlog
>
> Attachments: HAWQ_TDE_Design_ver0.1.pdf, HAWQ_TDE_Design_ver0.2 .pdf
>
>
>  TDE(transparently data encrypted) has been supported after hadoop 2.6:
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/TransparentEncryption.html
> https://issues.apache.org/jira/browse/HDFS-6134
> Use TDE can promise:
> 1, hdfs file is encrypted.
> 2, network transfer between hdfs and libhdfs client is encrypted.
> So hawq will update libhdfs3 to support it.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (HAWQ-1544) prompt file count doesn't match hash bucket number when reorganize table

2017-11-06 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1544:
---

 Summary: prompt file count doesn't match hash bucket number when 
reorganize table
 Key: HAWQ-1544
 URL: https://issues.apache.org/jira/browse/HAWQ-1544
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Catalog
Reporter: Hongxu Ma
Assignee: Radar Lei


Reproduce:
{code}
postgres=# create table t_vm ( i int ) distributed randomly;
CREATE TABLE
postgres=# insert into t_vm select generate_series(1,1);
INSERT 0 1
postgres=# show default_hash_table_bucket_number ;
 default_hash_table_bucket_number
--
 6
(1 row)

postgres=# set default_hash_table_bucket_number=8;
SET
postgres=# show default_hash_table_bucket_number ;
 default_hash_table_bucket_number
--
 8
(1 row)

postgres=# alter table t_vm set with(reorganize=true) distributed by (i) ;
ALTER TABLE
postgres=# select count(*) from t_vm;
ERROR:  file count 8 in catalog is not in proportion to the bucket number 6 of 
hash table with oid=16619, some data may be lost, if you still want to continue 
the query by considering the table as random, set GUC 
allow_file_count_bucket_num_mismatch to on and try again. 
(cdbdatalocality.c:3776)
{code}
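
The check behind this error compares the file count recorded in the catalog 
against the table's bucket number; a rough sketch of the rule, under the 
assumption that a hash-distributed table's file count must be a multiple of its 
bucket number:

{code}
# Hypothetical sketch of the consistency rule behind the error above:
# a hash table written with bucket number B should leave k*B files,
# so 8 catalog files against bucket number 6 fails the check.
def file_count_matches_buckets(file_count, bucket_number):
    return file_count % bucket_number == 0

assert file_count_matches_buckets(8, 8)
assert not file_count_matches_buckets(8, 6)  # the reported case
{code}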



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HAWQ-1544) prompt file count doesn't match hash bucket number when reorganize table

2017-11-06 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1544:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> prompt file count doesn't match hash bucket number when reorganize table
> 
>
> Key: HAWQ-1544
> URL: https://issues.apache.org/jira/browse/HAWQ-1544
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>
> Reproduce:
> {code}
> postgres=# create table t_vm ( i int ) distributed randomly;
> CREATE TABLE
> postgres=# insert into t_vm select generate_series(1,1);
> INSERT 0 1
> postgres=# show default_hash_table_bucket_number ;
>  default_hash_table_bucket_number
> --
>  6
> (1 row)
> postgres=# set default_hash_table_bucket_number=8;
> SET
> postgres=# show default_hash_table_bucket_number ;
>  default_hash_table_bucket_number
> --
>  8
> (1 row)
> postgres=# alter table t_vm set with(reorganize=true) distributed by (i) ;
> ALTER TABLE
> postgres=# select count(*) from t_vm;
> ERROR:  file count 8 in catalog is not in proportion to the bucket number 6 
> of hash table with oid=16619, some data may be lost, if you still want to 
> continue the query by considering the table as random, set GUC 
> allow_file_count_bucket_num_mismatch to on and try again. 
> (cdbdatalocality.c:3776)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HAWQ-1546) The data files on hdfs after hawq load data were too large!!!

2017-11-11 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1546?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16248427#comment-16248427
 ] 

Hongxu Ma commented on HAWQ-1546:
-

Hi ~lynn
HAWQ is not good at dealing with real-time data; calling the insert statement 
frequently can produce oversized data files (and also make select slow), which 
is a known issue of HAWQ (on the parquet format).
As a workaround, you can reduce the load frequency; every hour may be a good 
choice, but obviously it loses the real-time ability...

Maybe an AO table is better for your scenario. I will check it later, but 
obviously using an AO table can cause some performance loss.
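
As a concrete illustration of the reduced-frequency workaround, a sketch that 
buffers rows on the client and flushes them as one multi-row insert per 
interval (psycopg2 and the helper function are assumptions, not part of the 
original report):

{code}
# Hypothetical sketch: buffer rows locally and flush them in one
# multi-row INSERT per interval instead of many tiny INSERTs.
import psycopg2

def flush(conn, rows):
    # rows: buffered (id, name, age, sex) tuples collected by the loader
    if not rows:
        return
    with conn.cursor() as cur:
        values = b",".join(cur.mogrify("(%s,%s,%s,%s)", r) for r in rows)
        cur.execute("insert into person_l1 values " + values.decode("utf-8"))
    conn.commit()
    del rows[:]
{code}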


> The data files on hdfs after hawq load data were too large!!!
> -
>
> Key: HAWQ-1546
> URL: https://issues.apache.org/jira/browse/HAWQ-1546
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libhdfs
>Reporter: lynn
>Assignee: Radar Lei
>
> create table person_l1 (id int, name varchar(20), age int, sex 
> char(1))with(appendonly=true,orientation=parquet,compresstype=snappy);
> create table person_l2 (id int, name varchar(20), age int, sex 
> char(1))with(appendonly=true,orientation=parquet,compresstype=snappy);
> Execute the insert statement 480 times:
> sh insert.sh
> Script contents: 
> i4.sql:
> insert into person_l1 values(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1');
> insert.sh:
> #!/bin/bash
> num=480
> for ((i=0; i<$num; i=$[$i+1])) do
>psql -d test -f i4.sql
> done
> Execute the insert statement once:
> 1920 rows of data
>  psql -d test -f i1.sql 
> Script contents:
> i1.sql:
> SET hawq_rm_stmt_nvseg=10;
> SET hawq_rm_stmt_vseg_memory='512mb';
> insert into person_l2 values(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, 
> '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 
> 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, 
> '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 
> 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, 
> '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 
> 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, 
> '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, 
> '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 
> 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, 
> '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 
> 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, 
> '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 
> 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, 
> '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, 
> '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 
> 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, 
> '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 
> 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, 
> '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 
> 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, 
> '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, 
> '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 
> 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, 
> '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 
> 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, 
> '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 
> 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, 
> '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, 
> '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 
> 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, 
> '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 
> 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 'lynn', 28, 
> '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, '1'),(4, 
> 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 'lynn', 28, 
> '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28, '1'),(3, 
> 'lynn', 28, '1'),(4, 'lynn', 28, '1'),(1, 'lynn', 28, '1'),(2, 'lynn', 28

[jira] [Commented] (HAWQ-1552) hawq does not support hdfs storage policy?

2017-11-19 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258831#comment-16258831
 ] 

Hongxu Ma commented on HAWQ-1552:
-

Bad news: HAWQ doesn't support it now.
HAWQ implements its own hdfs access library, libhdfs3, so it can't leverage the 
benefits of the latest HDFS features...


> hawq does not support hdfs storage policy?
> --
>
> Key: HAWQ-1552
> URL: https://issues.apache.org/jira/browse/HAWQ-1552
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: libhdfs
>Reporter: lynn
>Assignee: Radar Lei
>
> 1. Set the ALL_SSD storage policy on the '/ssd' path on HDFS:
> hdfs storagepolicies -setStoragePolicy -path /ssd -policy ALL_SSD
> 2. check
> [hdfs@master1 ~]$ hdfs storagepolicies -getStoragePolicy -path /ssd
> The storage policy of /ssd:
> BlockStoragePolicy{ALL_SSD:12, storageTypes=[SSD], creationFallbacks=[DISK], 
> replicationFallbacks=[DISK]}
> 3. put file to hdfs
> hdfs dfs -put dd.txt /ssd/fs_ssd
> 4.check block location
> [hdfs@master1 ~]$ hdfs fsck /ssd/fs_ssd/dd.txt -blocks -locations -files
> decommissioned replica(s) and 0 decommissioning replica(s).
> 0. BP-845848702-192.168.1.130-1496396138316:blk_1075677761_7587369 len=7 
> repl=3 
> [DatanodeInfoWithStorage[192.168.1.133:50010,DS-1510d4e4-cfdb-4184-8f47-7417b91f4f5c,{color:red}SSD{color}],
>  
> DatanodeInfoWithStorage[192.168.1.132:50010,DS-7d498d01-8242-4621-8901-fe397a8196c3,{color:red}SSD{color}],
>  
> DatanodeInfoWithStorage[192.168.1.134:50010,DS-37c4e804-1b2a-4156-a54c-cecc8393bb09,{color:red}SSD{color}]]
> 5. In hawq, create filespace fs_ssd and tablespace ts_ssd; fs_ssd points to 
> the /ssd/fs_ssd path 
> 6.psql create table 
> create table p(i 
> int)with(appendonly=true,orientation=parquet,compresstype=snappy)  tablespace 
> fs_ssd;
> 7. psql insert data
> insert into p values(1);
> 8. query the file on hdfs
> select c.relname, d.dat2tablespace tablespace_id, d.oid database_id, 
> c.relfilenode table_id
>   from pg_database d, pg_class c, pg_namespace n
>  where c.relnamespace = n.oid
>and d.datname = current_database()
>and c.relname = 'p';
> relname | tablespace_id | database_id | table_id 
> -+---+-+--
>  p   |   1021474 | 1021475 |  1037187
> 9. check the file of table "p" locations
> [hdfs@master1 ~]$ hdfs fsck /ssd/fs_ssd/1021474/1021475/1037187/1 -blocks 
> -locations -files
> Connecting to namenode via 
> http://master1.bigdata:50070/fsck?ugi=hdfs&blocks=1&locations=1&files=1&path=%2Fssd%2Ffs_ssd%2F1021474%2F1021475%2F1037187%2F1
> FSCK started by hdfs (auth:SIMPLE) from /192.168.1.130 for path 
> /ssd/fs_ssd/1021474/1021475/1037187/1 at Fri Nov 17 17:26:17 CST 2017
> /ssd/fs_ssd/1021474/1021475/1037187/1 188 bytes, 1 block(s):  OK
> 0. BP-845848702-192.168.1.130-1496396138316:blk_1075677763_7587371 len=188 
> repl=3 
> [DatanodeInfoWithStorage[192.168.1.134:50010,DS-4be28698-6ebd-4ae0-a515-f3fb5e1293ab,{color:red}DISK{color}],
>  
> DatanodeInfoWithStorage[192.168.1.133:50010,DS-99d56cac-5af0-483d-b93f-a1bbae038934,{color:red}DISK{color}],
>  
> DatanodeInfoWithStorage[192.168.1.132:50010,DS-22c09ee4-49ac-47ed-a592-4f0e84776086,{color:red}DISK{color}]]
> The ALL_SSD storage policy doesn't work!!!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HAWQ-1578) Regression Test (Feature->Ranger)Failed because pxfwritable_import_beginscan function was not found

2018-01-02 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309225#comment-16309225
 ] 

Hongxu Ma commented on HAWQ-1578:
-

[~Chiyang Wan]
This was caused by your recent commit:
https://github.com/apache/incubator-hawq/commit/76e38c53b9377a055e6a2db6f63dc2e984c25025

Please help to fix it, thanks!


> Regression Test (Feature->Ranger)Failed because pxfwritable_import_beginscan 
> function was not found 
> 
>
> Key: HAWQ-1578
> URL: https://issues.apache.org/jira/browse/HAWQ-1578
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: PXF, Tests
>Reporter: WANG Weinan
>Assignee: Chiyang Wan
>
> The TestHawqRanger suite failed when running PXFHiveTest and PXFHBaseTest; the 
> test log is shown as follows:
> Note: Google Test filter = TestHawqRanger.PXFHiveTest
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from TestHawqRanger
> [ RUN  ] TestHawqRanger.PXFHiveTest
> lib/sql_util.cpp:197: Failure
> Value of: is_sql_ans_diff
>   Actual: true
> Expected: false
> lib/sql_util.cpp:203: Failure
> Value of: true
>   Actual: true
> Expected: false
> [  FAILED  ] TestHawqRanger.PXFHiveTest (89777 ms)
> [--] 1 test from TestHawqRanger (89777 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (89777 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] TestHawqRanger.PXFHiveTest
>  1 FAILED TEST
> [125/133] TestHawqRanger.PXFHiveTest returned/aborted with exit code 1 (89787 
> ms)
> [128/133] TestHawqRanger.PXFHBaseTest (87121 ms)  
>   
> Note: Google Test filter = TestHawqRanger.PXFHBaseTest
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from TestHawqRanger
> [ RUN  ] TestHawqRanger.PXFHBaseTest
> lib/sql_util.cpp:197: Failure
> Value of: is_sql_ans_diff
>   Actual: true
> Expected: false
> lib/sql_util.cpp:203: Failure
> Value of: true
>   Actual: true
> Expected: false
> [  FAILED  ] TestHawqRanger.PXFHBaseTest (87098 ms)
> [--] 1 test from TestHawqRanger (87098 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (87099 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] TestHawqRanger.PXFHBaseTest
> We can find some suspicious logs in the master segment log file:
> 2018-01-03 05:21:30.170970 
> UTC,"gpadmin","hawq_feature_test_db",p109703,th-290256608,"127.0.0.1","56288",2018-01-03
>  05:21:29 
> UTC,14669,con2342,cmd4,seg-1,,,x14669,sx1,"ERROR","XX000","pxfwritable_import_beginscan
>  function was not found (nodeExternalscan.c:310)",,"select * from 
> test_hbase;",0,,"nodeExternalscan.c",310,"Stack trace:
> 10x8cf31e postgres errstart (elog.c:505)
> 20x8d11bb postgres elog_finish (elog.c:1459)
> 30x69134a postgres ExecInitExternalScan (nodeExternalscan.c:215)
> 40x670b9d postgres ExecInitNode (execProcnode.c:371)
> 50x69b7d1 postgres ExecInitMotion (nodeMotion.c:1096)
> 60x670064 postgres ExecInitNode (execProcnode.c:629)
> 70x66a407 postgres ExecutorStart (execMain.c:2048)
> 80x7f8fcd postgres PortalStart (pquery.c:1308)
> 90x7f0628 postgres  (postgres.c:1795)
> 10   0x7f1cb0 postgres PostgresMain (postgres.c:4897)
> 11   0x7a40c0 postgres  (postmaster.c:5486)
> 12   0x7a6e89 postgres PostmasterMain (postmaster.c:1459)
> 13   0x4a5a59 postgres main (main.c:226)
> 14   0x7fceea8a1d1d libc.so.6 __libc_start_main (??:0)
> 15   0x4a5ad9 postgres  (??:0)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (HAWQ-1578) Regression Test (Feature->Ranger)Failed because pxfwritable_import_beginscan function was not found

2018-01-03 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309263#comment-16309263
 ] 

Hongxu Ma commented on HAWQ-1578:
-

Thanks! 
It seems the PXF external table (pxfXXX, e.g. pxfwritable_import) is not 
considered in nodeExternalscan.c:310.

In my opinion, a workaround fix is acceptable.

> Regression Test (Feature->Ranger)Failed because pxfwritable_import_beginscan 
> function was not found 
> 
>
> Key: HAWQ-1578
> URL: https://issues.apache.org/jira/browse/HAWQ-1578
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: PXF, Tests
>Reporter: WANG Weinan
>Assignee: Chiyang Wan
>
> The TestHawqRanger suite failed when running PXFHiveTest and PXFHBaseTest; the 
> test log is shown as follows:
> Note: Google Test filter = TestHawqRanger.PXFHiveTest
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from TestHawqRanger
> [ RUN  ] TestHawqRanger.PXFHiveTest
> lib/sql_util.cpp:197: Failure
> Value of: is_sql_ans_diff
>   Actual: true
> Expected: false
> lib/sql_util.cpp:203: Failure
> Value of: true
>   Actual: true
> Expected: false
> [  FAILED  ] TestHawqRanger.PXFHiveTest (89777 ms)
> [--] 1 test from TestHawqRanger (89777 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (89777 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] TestHawqRanger.PXFHiveTest
>  1 FAILED TEST
> [125/133] TestHawqRanger.PXFHiveTest returned/aborted with exit code 1 (89787 
> ms)
> [128/133] TestHawqRanger.PXFHBaseTest (87121 ms)  
>   
> Note: Google Test filter = TestHawqRanger.PXFHBaseTest
> [==] Running 1 test from 1 test case.
> [--] Global test environment set-up.
> [--] 1 test from TestHawqRanger
> [ RUN  ] TestHawqRanger.PXFHBaseTest
> lib/sql_util.cpp:197: Failure
> Value of: is_sql_ans_diff
>   Actual: true
> Expected: false
> lib/sql_util.cpp:203: Failure
> Value of: true
>   Actual: true
> Expected: false
> [  FAILED  ] TestHawqRanger.PXFHBaseTest (87098 ms)
> [--] 1 test from TestHawqRanger (87098 ms total)
> [--] Global test environment tear-down
> [==] 1 test from 1 test case ran. (87099 ms total)
> [  PASSED  ] 0 tests.
> [  FAILED  ] 1 test, listed below:
> [  FAILED  ] TestHawqRanger.PXFHBaseTest
> We can find some suspicious logs in the master segment log file:
> 2018-01-03 05:21:30.170970 
> UTC,"gpadmin","hawq_feature_test_db",p109703,th-290256608,"127.0.0.1","56288",2018-01-03
>  05:21:29 
> UTC,14669,con2342,cmd4,seg-1,,,x14669,sx1,"ERROR","XX000","pxfwritable_import_beginscan
>  function was not found (nodeExternalscan.c:310)",,"select * from 
> test_hbase;",0,,"nodeExternalscan.c",310,"Stack trace:
> 10x8cf31e postgres errstart (elog.c:505)
> 20x8d11bb postgres elog_finish (elog.c:1459)
> 30x69134a postgres ExecInitExternalScan (nodeExternalscan.c:215)
> 40x670b9d postgres ExecInitNode (execProcnode.c:371)
> 50x69b7d1 postgres ExecInitMotion (nodeMotion.c:1096)
> 60x670064 postgres ExecInitNode (execProcnode.c:629)
> 70x66a407 postgres ExecutorStart (execMain.c:2048)
> 80x7f8fcd postgres PortalStart (pquery.c:1308)
> 90x7f0628 postgres  (postgres.c:1795)
> 10   0x7f1cb0 postgres PostgresMain (postgres.c:4897)
> 11   0x7a40c0 postgres  (postmaster.c:5486)
> 12   0x7a6e89 postgres PostmasterMain (postmaster.c:1459)
> 13   0x4a5a59 postgres main (main.c:226)
> 14   0x7fceea8a1d1d libc.so.6 __libc_start_main (??:0)
> 15   0x4a5ad9 postgres  (??:0)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (HAWQ-1450) New HAWQ executor with vectorization & possible code generation

2018-01-22 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1450:
---

Assignee: Hongxu Ma  (was: Lei Chang)

> New HAWQ executor with vectorization & possible code generation
> ---
>
> Key: HAWQ-1450
> URL: https://issues.apache.org/jira/browse/HAWQ-1450
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lei Chang
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog
>
>
> Most HAWQ executor code is inherited from postgres & gpdb. Let's discuss how 
> to build a new hawq executor with vectorization and possibly code generation. 
> These optimizations may potentially improve the query performance a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1450) New HAWQ executor with vectorization & possible code generation

2018-01-22 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335470#comment-16335470
 ] 

Hongxu Ma commented on HAWQ-1450:
-

We will contribute a vectorized executor *extension* (so it doesn't influence 
other non-vectorized logic).

For now, it just includes some small vectorized operators/functions, but it is 
a good start for this issue.

The design doc will be released later; comments are welcome.

 

> New HAWQ executor with vectorization & possible code generation
> ---
>
> Key: HAWQ-1450
> URL: https://issues.apache.org/jira/browse/HAWQ-1450
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lei Chang
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog
>
>
> Most HAWQ executor code is inherited from postgres & gpdb. Let's discuss how 
> to build a new hawq executor with vectorization and possibly code generation. 
> These optimizations may potentially improve the query performance a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HAWQ-1583) Add vectorized executor extension and GUC

2018-01-23 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1583:
---

 Summary: Add vectorized executor extension and GUC
 Key: HAWQ-1583
 URL: https://issues.apache.org/jira/browse/HAWQ-1583
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Query Execution
Reporter: Hongxu Ma
Assignee: Lei Chang
 Fix For: backlog


The vectorized executor will be implemented as an extension (located in the 
contrib directory).

And a GUC is used to enable the vectorized executor, e.g.:

{code}

postgres=# set vectorized_executor_enable to on;

// run the new vectorized executor

postgres=# set vectorized_executor_enable to off;

// run the original HAWQ executor


{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1583) Add vectorized executor extension and GUC

2018-01-23 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1583:

Description: 
The vectorized executor will be implemented as an extension (located in the 
contrib directory).

And a GUC is used to enable the vectorized executor, e.g.:
{code:java}
postgres=# set vectorized_executor_enable to on;

// run the new vectorized executor

postgres=# set vectorized_executor_enable to off;

// run the original HAWQ executor
{code}

  was:
The vectorized executor will be implemented as an extension (located in the 
contrib directory).

And a GUC is used to enable the vectorized executor, e.g.:

{code}

postgres=# set vectorized_executor_enable to on;

// run the new vectorized executor

postgres=# set vectorized_executor_enable to off;

// run the original HAWQ executor


{code}


> Add vectorized executor extension and GUC
> -
>
> Key: HAWQ-1583
> URL: https://issues.apache.org/jira/browse/HAWQ-1583
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: Query Execution
>Reporter: Hongxu Ma
>Assignee: Lei Chang
>Priority: Major
> Fix For: backlog
>
>
> The vectorized executor will be implemented as an extension (located in the 
> contrib directory).
> And a GUC is used to enable the vectorized executor, e.g.:
> {code:java}
> postgres=# set vectorized_executor_enable to on;
> // run the new vectorized executor
> postgres=# set vectorized_executor_enable to off;
> // run the original HAWQ executor
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1514) TDE feature makes libhdfs3 require openssl1.1

2018-01-31 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347938#comment-16347938
 ] 

Hongxu Ma commented on HAWQ-1514:
-

We opened a PR to check the openssl version in hawq configure (in more detail: 
it looks for the needed function).

It prompts an error message if the check fails.
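
A rough illustration of the idea (not the actual configure check): probe the 
installed libcrypto for a symbol that only exists from OpenSSL 1.1 on, e.g. 
EVP_CIPHER_CTX_reset, which replaced EVP_CIPHER_CTX_cleanup in 1.1.0:

{code}
# Hypothetical sketch of the version probe behind the configure check.
import ctypes.util
from ctypes import CDLL

def openssl_supports_tde():
    path = ctypes.util.find_library("crypto")
    if path is None:
        return False  # no OpenSSL installed at all
    lib = CDLL(path)
    # EVP_CIPHER_CTX_reset exists only in OpenSSL >= 1.1.0
    return hasattr(lib, "EVP_CIPHER_CTX_reset")
{code}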

 

> TDE feature makes libhdfs3 require openssl1.1
> -
>
> Key: HAWQ-1514
> URL: https://issues.apache.org/jira/browse/HAWQ-1514
> Project: Apache HAWQ
>  Issue Type: Task
>  Components: libhdfs
>Reporter: Yi Jin
>Assignee: WANG Weinan
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> The new TDE feature delivered in libhdfs3 requires a specific version of 
> openssl; at least per my test, 1.0.21 does not work, and a library built from 
> 1.1 source code passed.
> So maybe we need some build and installation instruction improvements. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1514) TDE feature makes libhdfs3 require openssl1.1

2018-02-01 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16349843#comment-16349843
 ] 

Hongxu Ma commented on HAWQ-1514:
-

I have updated the "build and install" wiki:

Added the needed version and how to assign a specific openssl path via the 
environment variable "{{DEPENDENCY_INSTALL_PREFIX}}".

> TDE feature makes libhdfs3 require openssl1.1
> -
>
> Key: HAWQ-1514
> URL: https://issues.apache.org/jira/browse/HAWQ-1514
> Project: Apache HAWQ
>  Issue Type: Task
>  Components: libhdfs
>Reporter: Yi Jin
>Assignee: WANG Weinan
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> The new TDE feature delivered in libhdfs3 requires a specific version of 
> openssl; at least per my test, 1.0.21 does not work, and a library built from 
> 1.1 source code passed.
> So maybe we need some build and installation instruction improvements. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (HAWQ-1514) TDE feature makes libhdfs3 require openssl1.1

2018-02-01 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1514.
---
Resolution: Fixed

fixed

> TDE feature makes libhdfs3 require openssl1.1
> -
>
> Key: HAWQ-1514
> URL: https://issues.apache.org/jira/browse/HAWQ-1514
> Project: Apache HAWQ
>  Issue Type: Task
>  Components: libhdfs
>Reporter: Yi Jin
>Assignee: WANG Weinan
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> The new TDE feature delivered in libhdfs3 requires a specific version of 
> openssl; at least per my test, 1.0.21 does not work, and a library built from 
> 1.1 source code passed.
> So maybe we need some build and installation instruction improvements. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (HAWQ-1544) prompt file count doesn't match hash bucket number when reorganize table

2018-02-07 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1544.
---
   Resolution: Fixed
Fix Version/s: 2.3.0.0-incubating

fixed

> prompt file count doesn't match hash bucket number when reorganize table
> 
>
> Key: HAWQ-1544
> URL: https://issues.apache.org/jira/browse/HAWQ-1544
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Catalog
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> Reproduce:
> {code}
> postgres=# create table t_vm ( i int ) distributed randomly;
> CREATE TABLE
> postgres=# insert into t_vm select generate_series(1,1);
> INSERT 0 1
> postgres=# show default_hash_table_bucket_number ;
>  default_hash_table_bucket_number
> --
>  6
> (1 row)
> postgres=# set default_hash_table_bucket_number=8;
> SET
> postgres=# show default_hash_table_bucket_number ;
>  default_hash_table_bucket_number
> --
>  8
> (1 row)
> postgres=# alter table t_vm set with(reorganize=true) distributed by (i) ;
> ALTER TABLE
> postgres=# select count(*) from t_vm;
> ERROR:  file count 8 in catalog is not in proportion to the bucket number 6 
> of hash table with oid=16619, some data may be lost, if you still want to 
> continue the query by considering the table as random, set GUC 
> allow_file_count_bucket_num_mismatch to on and try again. 
> (cdbdatalocality.c:3776)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HAWQ-1591) Common tuple batch structure for vectorized execution

2018-02-25 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1591:
---

 Summary: Common tuple batch structure for vectorized execution
 Key: HAWQ-1591
 URL: https://issues.apache.org/jira/browse/HAWQ-1591
 Project: Apache HAWQ
  Issue Type: Sub-task
  Components: Query Execution
Reporter: Hongxu Ma
Assignee: Lei Chang
 Fix For: backlog


A common tuple batch structure for vectorized execution, which holds the tuples 
transferred between vectorized operators.
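
As a rough sketch of what such a structure could look like (column-oriented, 
fixed capacity, with a selection vector), purely illustrative and not the 
actual HAWQ definition:

{code}
# Hypothetical sketch of a columnar tuple batch passed between
# vectorized operators: one value array per column, per-column null
# flags, and a selection vector marking the rows still alive.
class TupleBatch(object):
    def __init__(self, ncols, capacity=1024):
        self.columns = [[None] * capacity for _ in range(ncols)]
        self.nulls = [[False] * capacity for _ in range(ncols)]
        self.selected = list(range(capacity))
        self.nrows = 0  # number of rows currently filled
{code}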

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HAWQ-1594) Memory leak in standby master (gpsyncagent process)

2018-03-13 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1594:
---

 Summary: Memory leak in standby master (gpsyncagent process)
 Key: HAWQ-1594
 URL: https://issues.apache.org/jira/browse/HAWQ-1594
 Project: Apache HAWQ
  Issue Type: Bug
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: backlog


In a high workload scenario, the gpsyncagent process of the standby master 
consumes memory continuously until it is restarted.

There is a memory leak somewhere.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HAWQ-1594) Memory leak in standby master (gpsyncagent process)

2018-03-13 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1594:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Memory leak in standby master (gpsyncagent process)
> ---
>
> Key: HAWQ-1594
> URL: https://issues.apache.org/jira/browse/HAWQ-1594
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog
>
>
> In a high workload scenario, the gpsyncagent process of the standby master 
> consumes memory continuously until it is restarted.
> There is a memory leak somewhere.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1594) Memory leak in standby master (gpsyncagent process)

2018-03-13 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16398030#comment-16398030
 ] 

Hongxu Ma commented on HAWQ-1594:
-

After some investigation, this issue is the same as GPDB MPP-20203; I will 
port the code later.

> Memory leak in standby master (gpsyncagent process)
> ---
>
> Key: HAWQ-1594
> URL: https://issues.apache.org/jira/browse/HAWQ-1594
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog
>
>
> In a high workload scenario, the gpsyncagent process of the standby master 
> consumes memory continuously until it is restarted.
> There is a memory leak somewhere.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HAWQ-1594) Memory leak in standby master (gpsyncagent process)

2018-03-13 Thread Hongxu Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16398030#comment-16398030
 ] 

Hongxu Ma edited comment on HAWQ-1594 at 3/14/18 3:19 AM:
--

After some investigation, this issue is the same as a fixed issue of GPDB; I 
will port the code later.


was (Author: hongxu ma):
After some investigation, this issue is the same as GPDB MPP-20203; I will 
port the code later.

> Memory leak in standby master (gpsyncagent process)
> ---
>
> Key: HAWQ-1594
> URL: https://issues.apache.org/jira/browse/HAWQ-1594
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog
>
>
> In a high workload scenario, the gpsyncagent process of the standby master 
> consumes memory continuously until it is restarted.
> There is a memory leak somewhere.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (HAWQ-1594) Memory leak in standby master (gpsyncagent process)

2018-04-11 Thread Hongxu Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1594.
---
   Resolution: Fixed
Fix Version/s: 2.4.0.0-incubating

fixed

> Memory leak in standby master (gpsyncagent process)
> ---
>
> Key: HAWQ-1594
> URL: https://issues.apache.org/jira/browse/HAWQ-1594
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog, 2.4.0.0-incubating
>
>
> In a high workload scenario, the gpsyncagent process of the standby master 
> consumes memory continuously until it is restarted.
> There is a memory leak somewhere.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HAWQ-1618) Segment panic at workfile_mgr_close_file() when transaction ROLLBACK

2018-05-29 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1618:
---

Assignee: Hongxu Ma  (was: Lei Chang)

> Segment panic at workfile_mgr_close_file() when transaction ROLLBACK
> 
>
> Key: HAWQ-1618
> URL: https://issues.apache.org/jira/browse/HAWQ-1618
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Log:
> {code}
> 2018-05-23 15:49:14.843058 
> UTC,"user","db",p179799,th401824032,"172.31.6.17","6935",2018-05-23 15:47:39 
> UTC,1260445558,con25148,cmd7,seg21,slice82,,x1260445558,sx1,"ERROR","25M01","*canceling
>  MPP operation*",,"INSERT INTO ...
> 2018-05-23 15:49:15.253671 UTC,,,p179799,th0,,,2018-05-23 15:47:39 
> UTC,0,con25148,cmd7,seg21,slice82"PANIC","XX000","Unexpected internal 
> error: Segment process r
> eceived signal SIGSEGV",,,0"1    0x8ce2a3 postgres gp_backtrace + 0xa3
> 2    0x8ce491 postgres  + 0x8ce491
> 3    0x7f2d147ae7e0 libpthread.so.0  + 0x147ae7e0
> 4    0x91f4ad postgres workfile_mgr_close_file + 0xd
> 5    0x90bc84 postgres  + 0x90bc84
> 6    0x4e6b60 postgres AbortTransaction + 0x240
> 7    0x4e75c5 postgres AbortCurrentTransaction + 0x25
> 8    0x7ed81a postgres PostgresMain + 0x6ea
> 9    0x7a0c50 postgres  + 0x7a0c50
> 10   0x7a3a19 postgres PostmasterMain + 0x759
> 11   0x4a5309 postgres main + 0x519
> 12   0x7f2d13cead1d libc.so.6 __libc_start_main + 0xfd
> 13   0x4a5389 postgres  + 0x4a5389"
> {code}
>  
> Core stack:
> {code}
> (gdb) bt
> #0  0x7f2d147ae6ab in raise () from libpthread.so.0
> #1  0x008ce552 in SafeHandlerForSegvBusIll (postgres_signal_arg=11, 
> processName=) at elog.c:4573
> #2  
> #3  *workfile_mgr_close_file* (work_set=0x0, file=0x7f2ce96d2de0, 
> canReportError=canReportError@entry=0 '\000') at workfile_file.c:129
> #4  0x0090bc84 in *ntuplestore_cleanup* (fNormal=0 '\000', 
> canReportError=0 '\000', ts=0x21f4810) at tuplestorenew.c:654
> #5  XCallBack_NTS (event=event@entry=XACT_EVENT_ABORT, 
> nts=nts@entry=0x21f4810) at tuplestorenew.c:674
> #6  0x004e6b60 in CallXactCallbacksOnce (event=) at 
> xact.c:3660
> #7  AbortTransaction () at xact.c:2871
> #8  0x004e75c5 in AbortCurrentTransaction () at xact.c:3377
> #9  0x007ed81a in PostgresMain (argc=, argv= out>, argv@entry=0x182c900, username=0x17ddcd0 "user") at postgres.c:4648
> #10 0x007a0c50 in BackendRun (port=0x17cfb10) at postmaster.c:5915
> #11 BackendStartup (port=0x17cfb10) at postmaster.c:5484
> #12 ServerLoop () at postmaster.c:2163
> #13 0x007a3a19 in PostmasterMain (argc=, 
> argv=) at postmaster.c:1454
> #14 0x004a5309 in main (argc=9, argv=0x1785d10) at main.c:226
> {code}
>  
> Repro:
> {code}
> # create test table
> drop table if exists testsisc; 
> create table testsisc (i1 int, i2 int, i3 int, i4 int); 
> insert into testsisc select i, i % 1000, i % 10, i % 75 from 
> generate_series(0,1) i;
> drop table if exists to_insert_into; 
> create table to_insert_into as 
> with ctesisc as 
>  (select count(i1) as c1,i3 as c2 from testsisc group by i3)
> select t1.c1 as c11, t1.c2 as c12, t2.c1 as c21, t2.c2 as c22
> from ctesisc as t1, ctesisc as t2
> where t1.c1 = t2.c2
> limit 10;
> # run a long time query
> begin;
> set gp_simex_run=on;
> set gp_cte_sharing=on;
> insert into to_insert_into
> with ctesisc as 
>  (select count(i1) as c1,i3 as c2 from testsisc group by i3)
> select *
> from ctesisc as t1, ctesisc as t2
> where t1.c1 = t2.c2;
> commit;
> {code}
> Kill one segment process when the second query is running. Then you will find 
> a panic log in the segment log.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HAWQ-1618) Segment panic at workfile_mgr_close_file() when transaction ROLLBACK

2018-05-29 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1618:
---

 Summary: Segment panic at workfile_mgr_close_file() when 
transaction ROLLBACK
 Key: HAWQ-1618
 URL: https://issues.apache.org/jira/browse/HAWQ-1618
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Query Execution
Reporter: Hongxu Ma
Assignee: Lei Chang
 Fix For: 2.4.0.0-incubating


Log:

{code}
2018-05-23 15:49:14.843058 UTC,"user","db",p179799,th401824032,"172.31.6.17","6935",2018-05-23 15:47:39 UTC,1260445558,con25148,cmd7,seg21,slice82,,x1260445558,sx1,"ERROR","25M01","*canceling MPP operation*",,"INSERT INTO ...
2018-05-23 15:49:15.253671 UTC,,,p179799,th0,,,2018-05-23 15:47:39 UTC,0,con25148,cmd7,seg21,slice82"PANIC","XX000","Unexpected internal error: Segment process received signal SIGSEGV",,,0"1    0x8ce2a3 postgres gp_backtrace + 0xa3
2    0x8ce491 postgres  + 0x8ce491
3    0x7f2d147ae7e0 libpthread.so.0  + 0x147ae7e0
4    0x91f4ad postgres workfile_mgr_close_file + 0xd
5    0x90bc84 postgres  + 0x90bc84
6    0x4e6b60 postgres AbortTransaction + 0x240
7    0x4e75c5 postgres AbortCurrentTransaction + 0x25
8    0x7ed81a postgres PostgresMain + 0x6ea
9    0x7a0c50 postgres  + 0x7a0c50
10   0x7a3a19 postgres PostmasterMain + 0x759
11   0x4a5309 postgres main + 0x519
12   0x7f2d13cead1d libc.so.6 __libc_start_main + 0xfd
13   0x4a5389 postgres  + 0x4a5389"
{code}

 

Core stack:

{code}
(gdb) bt
#0  0x7f2d147ae6ab in raise () from libpthread.so.0
#1  0x008ce552 in SafeHandlerForSegvBusIll (postgres_signal_arg=11, processName=<optimized out>) at elog.c:4573
#2  <signal handler called>
#3  *workfile_mgr_close_file* (work_set=0x0, file=0x7f2ce96d2de0, canReportError=canReportError@entry=0 '\000') at workfile_file.c:129
#4  0x0090bc84 in *ntuplestore_cleanup* (fNormal=0 '\000', canReportError=0 '\000', ts=0x21f4810) at tuplestorenew.c:654
#5  XCallBack_NTS (event=event@entry=XACT_EVENT_ABORT, nts=nts@entry=0x21f4810) at tuplestorenew.c:674
#6  0x004e6b60 in CallXactCallbacksOnce (event=<optimized out>) at xact.c:3660
#7  AbortTransaction () at xact.c:2871
#8  0x004e75c5 in AbortCurrentTransaction () at xact.c:3377
#9  0x007ed81a in PostgresMain (argc=<optimized out>, argv=<optimized out>, argv@entry=0x182c900, username=0x17ddcd0 "user") at postgres.c:4648
#10 0x007a0c50 in BackendRun (port=0x17cfb10) at postmaster.c:5915
#11 BackendStartup (port=0x17cfb10) at postmaster.c:5484
#12 ServerLoop () at postmaster.c:2163
#13 0x007a3a19 in PostmasterMain (argc=<optimized out>, argv=<optimized out>) at postmaster.c:1454
#14 0x004a5309 in main (argc=9, argv=0x1785d10) at main.c:226
{code}

 

Repro:

{code}
# create test table
drop table if exists testsisc;
create table testsisc (i1 int, i2 int, i3 int, i4 int);
insert into testsisc select i, i % 1000, i % 10, i % 75 from generate_series(0,1) i;

drop table if exists to_insert_into;
create table to_insert_into as
with ctesisc as
 (select count(i1) as c1,i3 as c2 from testsisc group by i3)
select t1.c1 as c11, t1.c2 as c12, t2.c1 as c21, t2.c2 as c22
from ctesisc as t1, ctesisc as t2
where t1.c1 = t2.c2
limit 10;

# run a long time query
begin;
set gp_simex_run=on;
set gp_cte_sharing=on;
insert into to_insert_into
with ctesisc as
 (select count(i1) as c1,i3 as c2 from testsisc group by i3)
select *
from ctesisc as t1, ctesisc as t2
where t1.c1 = t2.c2;
commit;
{code}

Kill one segment process while the second query is running. You will then find 
the panic log in the segment log.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1618) Segment panic at workfile_mgr_close_file() when transaction ROLLBACK

2018-05-29 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16494743#comment-16494743
 ] 

Hongxu Ma commented on HAWQ-1618:
-

This issue has already been fixed in GPDB:
https://github.com/greenplum-db/gpdb/commit/f0a0a593bde8cf0c9b9bbc79061a09f7164b54f7#diff-b37908413e016b54a45d549cf0539121

We should port this fix to HAWQ.
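
Frame #3 above shows workfile_mgr_close_file() being entered with work_set=0x0 
from the tuplestore abort callback. As an editor's illustration only (the real 
patch is the GPDB commit linked above; the wrapper name and close helper below 
are stand-ins), the shape of the fix is a NULL guard on that cleanup path:
{code}
// Editor's sketch, not the actual GPDB patch: tolerate a NULL workfile
// set when a tuplestore is torn down during transaction ROLLBACK.
struct ExecWorkFile;   // stand-ins for the real server structs
struct workfile_set;

void workfile_mgr_close_file(workfile_set *work_set, ExecWorkFile *file,
                             bool canReportError);
void ExecWorkFile_Close(ExecWorkFile *file);   // assumed low-level close helper

void ntuplestore_close_workfile(workfile_set *work_set, ExecWorkFile *file)
{
    if (work_set == nullptr)
    {
        // The workfile set is already gone during abort; close the raw
        // file handle instead of dereferencing the NULL set.
        if (file != nullptr)
            ExecWorkFile_Close(file);
        return;
    }
    workfile_mgr_close_file(work_set, file, /* canReportError = */ false);
}
{code}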


> Segment panic at workfile_mgr_close_file() when transaction ROLLBACK
> 
>
> Key: HAWQ-1618
> URL: https://issues.apache.org/jira/browse/HAWQ-1618
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Log:
> {code}
> 2018-05-23 15:49:14.843058 
> UTC,"user","db",p179799,th401824032,"172.31.6.17","6935",2018-05-23 15:47:39 
> UTC,1260445558,con25148,cmd7,seg21,slice82,,x1260445558,sx1,"ERROR","25M01","*canceling
>  MPP operation*",,"INSERT INTO ...
> 2018-05-23 15:49:15.253671 UTC,,,p179799,th0,,,2018-05-23 15:47:39 
> UTC,0,con25148,cmd7,seg21,slice82"PANIC","XX000","Unexpected internal 
> error: Segment process r
> eceived signal SIGSEGV",,,0"1    0x8ce2a3 postgres gp_backtrace + 0xa3
> 2    0x8ce491 postgres  + 0x8ce491
> 3    0x7f2d147ae7e0 libpthread.so.0  + 0x147ae7e0
> 4    0x91f4ad postgres workfile_mgr_close_file + 0xd
> 5    0x90bc84 postgres  + 0x90bc84
> 6    0x4e6b60 postgres AbortTransaction + 0x240
> 7    0x4e75c5 postgres AbortCurrentTransaction + 0x25
> 8    0x7ed81a postgres PostgresMain + 0x6ea
> 9    0x7a0c50 postgres  + 0x7a0c50
> 10   0x7a3a19 postgres PostmasterMain + 0x759
> 11   0x4a5309 postgres main + 0x519
> 12   0x7f2d13cead1d libc.so.6 __libc_start_main + 0xfd
> 13   0x4a5389 postgres  + 0x4a5389"
> {code}
>  
> Core stack:
> {code}
> (gdb) bt
> #0  0x7f2d147ae6ab in raise () from libpthread.so.0
> #1  0x008ce552 in SafeHandlerForSegvBusIll (postgres_signal_arg=11, 
> processName=<optimized out>) at elog.c:4573
> #2  <signal handler called>
> #3  *workfile_mgr_close_file* (work_set=0x0, file=0x7f2ce96d2de0, 
> canReportError=canReportError@entry=0 '\000') at workfile_file.c:129
> #4  0x0090bc84 in *ntuplestore_cleanup* (fNormal=0 '\000', 
> canReportError=0 '\000', ts=0x21f4810) at tuplestorenew.c:654
> #5  XCallBack_NTS (event=event@entry=XACT_EVENT_ABORT, 
> nts=nts@entry=0x21f4810) at tuplestorenew.c:674
> #6  0x004e6b60 in CallXactCallbacksOnce (event=<optimized out>) at 
> xact.c:3660
> #7  AbortTransaction () at xact.c:2871
> #8  0x004e75c5 in AbortCurrentTransaction () at xact.c:3377
> #9  0x007ed81a in PostgresMain (argc=<optimized out>, argv=<optimized 
> out>, argv@entry=0x182c900, username=0x17ddcd0 "user") at postgres.c:4648
> #10 0x007a0c50 in BackendRun (port=0x17cfb10) at postmaster.c:5915
> #11 BackendStartup (port=0x17cfb10) at postmaster.c:5484
> #12 ServerLoop () at postmaster.c:2163
> #13 0x007a3a19 in PostmasterMain (argc=<optimized out>, 
> argv=<optimized out>) at postmaster.c:1454
> #14 0x004a5309 in main (argc=9, argv=0x1785d10) at main.c:226
> {code}
>  
> Repro:
> {code}
> # create test table
> drop table if exists testsisc; 
> create table testsisc (i1 int, i2 int, i3 int, i4 int); 
> insert into testsisc select i, i % 1000, i % 10, i % 75 from 
> generate_series(0,1) i;
> drop table if exists to_insert_into; 
> create table to_insert_into as 
> with ctesisc as 
>  (select count(i1) as c1,i3 as c2 from testsisc group by i3)
> select t1.c1 as c11, t1.c2 as c12, t2.c1 as c21, t2.c2 as c22
> from ctesisc as t1, ctesisc as t2
> where t1.c1 = t2.c2
> limit 10;
> # run a long time query
> begin;
> set gp_simex_run=on;
> set gp_cte_sharing=on;
> insert into to_insert_into
> with ctesisc as 
>  (select count(i1) as c1,i3 as c2 from testsisc group by i3)
> select *
> from ctesisc as t1, ctesisc as t2
> where t1.c1 = t2.c2;
> commit;
> {code}
> Kill one segment process while the second query is running. You will then 
> find the panic log in the segment log.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Reopened] (HAWQ-1508) Fix travis broken caused by libssl path

2018-05-30 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reopened HAWQ-1508:
-

Broken again, this time caused by json-c.

> Fix travis broken caused by libssl path
> ---
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> https://travis-ci.org/apache/incubator-hawq/builds/257989127?utm_source=github_status&utm_medium=notification
> {code}
> CMake Error at 
> /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:148
>  (message):
>   Could NOT find SSL (missing: SSL_INCLUDE_DIR)
> Call Stack (most recent call first):
>   
> /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:388
>  (_FPHSA_FAILURE_MESSAGE)
>   CMake/FindSSL.cmake:24 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
>   CMakeLists.txt:24 (FIND_PACKAGE)
> -- Configuring incomplete, errors occurred!
> See also 
> "/Users/travis/build/apache/incubator-hawq/depends/libhdfs3/build/CMakeFiles/CMakeOutput.log".
> failed to configure the project
> make[1]: *** [pre-config] Error 1
> make: *** [all] Error 2
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1508) Fix travis broken caused by dependency changed

2018-05-30 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1508:

Summary: Fix travis broken caused by dependency changed  (was: Fix travis 
broken caused by libssl path)

> Fix travis broken caused by dependency changed
> --
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> https://travis-ci.org/apache/incubator-hawq/builds/257989127?utm_source=github_status&utm_medium=notification
> {code}
> CMake Error at 
> /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:148
>  (message):
>   Could NOT find SSL (missing: SSL_INCLUDE_DIR)
> Call Stack (most recent call first):
>   
> /usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:388
>  (_FPHSA_FAILURE_MESSAGE)
>   CMake/FindSSL.cmake:24 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)
>   CMakeLists.txt:24 (FIND_PACKAGE)
> -- Configuring incomplete, errors occurred!
> See also 
> "/Users/travis/build/apache/incubator-hawq/depends/libhdfs3/build/CMakeFiles/CMakeOutput.log".
> failed to configure the project
> make[1]: *** [pre-config] Error 1
> make: *** [all] Error 2
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1508) Fix travis broken caused by dependency changed

2018-05-30 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1508:

Description: 
HAWQ travis-ci:

[https://travis-ci.org/apache/incubator-hawq/builds]

 

It usually breaks when some dependency changes.

  was:
https://travis-ci.org/apache/incubator-hawq/builds/257989127?utm_source=github_status&utm_medium=notification
{code}
CMake Error at 
/usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:148
 (message):

  Could NOT find SSL (missing: SSL_INCLUDE_DIR)

Call Stack (most recent call first):

  
/usr/local/Cellar/cmake/3.6.2/share/cmake/Modules/FindPackageHandleStandardArgs.cmake:388
 (_FPHSA_FAILURE_MESSAGE)

  CMake/FindSSL.cmake:24 (FIND_PACKAGE_HANDLE_STANDARD_ARGS)

  CMakeLists.txt:24 (FIND_PACKAGE)

-- Configuring incomplete, errors occurred!

See also 
"/Users/travis/build/apache/incubator-hawq/depends/libhdfs3/build/CMakeFiles/CMakeOutput.log".

failed to configure the project

make[1]: *** [pre-config] Error 1

make: *** [all] Error 2
{code}


> Fix travis broken caused by dependency changed
> --
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> HAWQ travis-ci:
> [https://travis-ci.org/apache/incubator-hawq/builds]
>  
> It usually breaks when some dependency changes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1508) Fix travis broken caused by dependency changed

2018-05-30 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1508:

Description: 
HAWQ travis-ci:

[https://travis-ci.org/apache/incubator-hawq/builds]

 

It usually breaks because of dependency changes.

We should pay attention to it and bring it back to green when it breaks.

  was:
HAWQ travis-ci:

[https://travis-ci.org/apache/incubator-hawq/builds]

 

It usually breaks when some dependency changes.


> Fix travis broken caused by dependency changed
> --
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> HAWQ travis-ci:
> [https://travis-ci.org/apache/incubator-hawq/builds]
>  
> It usually breaks because of dependency changes.
> We should pay attention to it and bring it back to green when it breaks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1508) Fix travis broken

2018-05-30 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1508:

Summary: Fix travis broken  (was: Fix travis broken caused by dependency 
changed)

> Fix travis broken
> -
>
> Key: HAWQ-1508
> URL: https://issues.apache.org/jira/browse/HAWQ-1508
> Project: Apache HAWQ
>  Issue Type: Sub-task
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> HAWQ travis-ci:
> [https://travis-ci.org/apache/incubator-hawq/builds]
>  
> It usually breaks because of dependency changes.
> We should pay attention to it and bring it back to green when it breaks.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-06-04 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501160#comment-16501160
 ] 

Hongxu Ma commented on HAWQ-1597:
-

Thanks! [~wlin]

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime 
> Filter Design.pdf, HAWQ Runtime Filter Design.pdf, q17_modified_hawq.gif
>
>
> The Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive applications 
> to quickly filter data, and they are usually applied to hash joins in OLAP 
> systems. The basic idea is: when hash joining two tables, build Bloom filter 
> information for the inner table during the build phase, then push this 
> information down to the scan of the outer table, so that fewer tuples from the 
> outer table are returned to the hash join node and joined against the hash 
> table. This can greatly improve hash join performance if the selectivity 
> is high.
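
(Editor's illustration, not HAWQ's actual code) A minimal sketch of the 
build/probe mechanics described above; the filter size and double-hashing 
scheme are arbitrary choices:
{code}
// Minimal Bloom filter sketch for a hash-join runtime filter.
#include <cstdint>
#include <vector>

class BloomFilter {
public:
    explicit BloomFilter(std::size_t nbits)
        : bits_((nbits + 63) / 64, 0), nbits_(nbits) {}

    // Build phase: insert every inner-side join key.
    void insert(uint64_t key) {
        for (int i = 0; i < kHashes; ++i) setBit(slot(key, i));
    }

    // Scan phase: drop outer tuples for which this returns false.
    bool mayContain(uint64_t key) const {
        for (int i = 0; i < kHashes; ++i)
            if (!getBit(slot(key, i))) return false;  // definitely absent
        return true;  // probably present (false positives possible)
    }

private:
    static const int kHashes = 2;

    std::size_t slot(uint64_t key, int i) const {
        // Double hashing: h1 + i*h2 simulates k independent hash functions.
        uint64_t h1 = key * 0x9E3779B97F4A7C15ULL;
        uint64_t h2 = ((key ^ (key >> 33)) * 0xC2B2AE3D27D4EB4FULL) | 1;
        return (h1 + static_cast<uint64_t>(i) * h2) % nbits_;
    }
    void setBit(std::size_t b)       { bits_[b / 64] |= 1ULL << (b % 64); }
    bool getBit(std::size_t b) const { return (bits_[b / 64] >> (b % 64)) & 1; }

    std::vector<uint64_t> bits_;
    std::size_t nbits_;
};
{code}
The scan-side mayContain() check is what lets the outer scan hand fewer tuples 
to the join node.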



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HAWQ-1597) Implement Runtime Filter for Hash Join

2018-06-04 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16501160#comment-16501160
 ] 

Hongxu Ma edited comment on HAWQ-1597 at 6/5/18 2:05 AM:
-

Good job, thanks! [~wlin]


was (Author: hongxu ma):
Thanks! [~wlin]

> Implement Runtime Filter for Hash Join
> --
>
> Key: HAWQ-1597
> URL: https://issues.apache.org/jira/browse/HAWQ-1597
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lin Wen
>Assignee: Lin Wen
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
> Attachments: 111BA854-7318-46A7-8338-5F2993D60FA3.png, HAWQ Runtime 
> Filter Design.pdf, HAWQ Runtime Filter Design.pdf, q17_modified_hawq.gif
>
>
> The Bloom filter is a space-efficient probabilistic data structure invented in 
> 1970, used to test whether an element is a member of a set.
> Nowadays, Bloom filters are widely used in OLAP and data-intensive applications 
> to quickly filter data, and they are usually applied to hash joins in OLAP 
> systems. The basic idea is: when hash joining two tables, build Bloom filter 
> information for the inner table during the build phase, then push this 
> information down to the scan of the outer table, so that fewer tuples from the 
> outer table are returned to the hash join node and joined against the hash 
> table. This can greatly improve hash join performance if the selectivity 
> is high.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (HAWQ-1618) Segment panic at workfile_mgr_close_file() when transaction ROLLBACK

2018-06-11 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma resolved HAWQ-1618.
-
Resolution: Won't Fix

> Segment panic at workfile_mgr_close_file() when transaction ROLLBACK
> 
>
> Key: HAWQ-1618
> URL: https://issues.apache.org/jira/browse/HAWQ-1618
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Log:
> {code}
> 2018-05-23 15:49:14.843058 
> UTC,"user","db",p179799,th401824032,"172.31.6.17","6935",2018-05-23 15:47:39 
> UTC,1260445558,con25148,cmd7,seg21,slice82,,x1260445558,sx1,"ERROR","25M01","*canceling
>  MPP operation*",,"INSERT INTO ...
> 2018-05-23 15:49:15.253671 UTC,,,p179799,th0,,,2018-05-23 15:47:39 
> UTC,0,con25148,cmd7,seg21,slice82"PANIC","XX000","Unexpected internal 
> error: Segment process r
> eceived signal SIGSEGV",,,0"1    0x8ce2a3 postgres gp_backtrace + 0xa3
> 2    0x8ce491 postgres  + 0x8ce491
> 3    0x7f2d147ae7e0 libpthread.so.0  + 0x147ae7e0
> 4    0x91f4ad postgres workfile_mgr_close_file + 0xd
> 5    0x90bc84 postgres  + 0x90bc84
> 6    0x4e6b60 postgres AbortTransaction + 0x240
> 7    0x4e75c5 postgres AbortCurrentTransaction + 0x25
> 8    0x7ed81a postgres PostgresMain + 0x6ea
> 9    0x7a0c50 postgres  + 0x7a0c50
> 10   0x7a3a19 postgres PostmasterMain + 0x759
> 11   0x4a5309 postgres main + 0x519
> 12   0x7f2d13cead1d libc.so.6 __libc_start_main + 0xfd
> 13   0x4a5389 postgres  + 0x4a5389"
> {code}
>  
> Core stack:
> {code}
> (gdb) bt
> #0  0x7f2d147ae6ab in raise () from libpthread.so.0
> #1  0x008ce552 in SafeHandlerForSegvBusIll (postgres_signal_arg=11, 
> processName=<optimized out>) at elog.c:4573
> #2  <signal handler called>
> #3  *workfile_mgr_close_file* (work_set=0x0, file=0x7f2ce96d2de0, 
> canReportError=canReportError@entry=0 '\000') at workfile_file.c:129
> #4  0x0090bc84 in *ntuplestore_cleanup* (fNormal=0 '\000', 
> canReportError=0 '\000', ts=0x21f4810) at tuplestorenew.c:654
> #5  XCallBack_NTS (event=event@entry=XACT_EVENT_ABORT, 
> nts=nts@entry=0x21f4810) at tuplestorenew.c:674
> #6  0x004e6b60 in CallXactCallbacksOnce (event=<optimized out>) at 
> xact.c:3660
> #7  AbortTransaction () at xact.c:2871
> #8  0x004e75c5 in AbortCurrentTransaction () at xact.c:3377
> #9  0x007ed81a in PostgresMain (argc=<optimized out>, argv=<optimized 
> out>, argv@entry=0x182c900, username=0x17ddcd0 "user") at postgres.c:4648
> #10 0x007a0c50 in BackendRun (port=0x17cfb10) at postmaster.c:5915
> #11 BackendStartup (port=0x17cfb10) at postmaster.c:5484
> #12 ServerLoop () at postmaster.c:2163
> #13 0x007a3a19 in PostmasterMain (argc=<optimized out>, 
> argv=<optimized out>) at postmaster.c:1454
> #14 0x004a5309 in main (argc=9, argv=0x1785d10) at main.c:226
> {code}
>  
> Repro:
> {code}
> # create test table
> drop table if exists testsisc; 
> create table testsisc (i1 int, i2 int, i3 int, i4 int); 
> insert into testsisc select i, i % 1000, i % 10, i % 75 from 
> generate_series(0,1) i;
> drop table if exists to_insert_into; 
> create table to_insert_into as 
> with ctesisc as 
>  (select count(i1) as c1,i3 as c2 from testsisc group by i3)
> select t1.c1 as c11, t1.c2 as c12, t2.c1 as c21, t2.c2 as c22
> from ctesisc as t1, ctesisc as t2
> where t1.c1 = t2.c2
> limit 10;
> # run a long time query
> begin;
> set gp_simex_run=on;
> set gp_cte_sharing=on;
> insert into to_insert_into
> with ctesisc as 
>  (select count(i1) as c1,i3 as c2 from testsisc group by i3)
> select *
> from ctesisc as t1, ctesisc as t2
> where t1.c1 = t2.c2;
> commit;
> {code}
> Kill one segment process while the second query is running. You will then 
> find the panic log in the segment log.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (HAWQ-1627) Support setting the max protocol message size when talking with HDFS

2018-06-19 Thread Hongxu Ma (JIRA)
Hongxu Ma created HAWQ-1627:
---

 Summary: Support setting the max protocol message size when 
talking with HDFS
 Key: HAWQ-1627
 URL: https://issues.apache.org/jira/browse/HAWQ-1627
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: libhdfs
Reporter: Hongxu Ma
Assignee: Radar Lei
 Fix For: 2.4.0.0-incubating


Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
adjusted.
When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
you will see the following lines in the HAWQ master log:
{code}
2018-06-20 11:21:56.768003 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
was rejected because it was too big (more than 67108864 bytes).  To increase 
the limit (or to
disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
2018-06-20 11:21:56.771657 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
RPC call ""getFsStats"" on server ""localhost:9000"":
RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
protocol mismatch: RPC channel cannot parse response header.
@   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
@   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
@   
Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
@   
Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
2018-06-20 11:21:56.771711 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
:invoke(Hdfs::Internal::RpcCall const&)
@   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
@   Hdfs::Internal::NamenodeImpl::getFsStats()
@   Hdfs::Internal::NamenodeProxy::getFsStats()
@   Hdfs::Internal::FileSystemImpl::getFsStats()
@   Hdfs::Internal::FileSystemImpl::connect()
@   Hdfs::FileSystem::connect(char const*, char const*, char const*)
@   Hdfs::FileSystem::connect(char const*)
@   hdfsBuilderConnect
@   gpfs_hdfs_connect
@   HdfsConnect
@   HdfsGetConnection
@   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
{code}

Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ should 
also add a guc for it.
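
As an editor's illustration (not the actual libhdfs3 patch; the function name 
and the way the limit is plumbed in are assumptions), the protobuf ceiling 
mentioned in the log is raised per message via 
CodedInputStream::SetTotalBytesLimit(), which is what such a guc would 
ultimately control:
{code}
// Editor's sketch: parse an RPC response with a configurable size limit
// instead of protobuf's hard-coded 64MB default. "maxMessageSize" would
// come from the proposed guc (assumption).
#include <google/protobuf/io/coded_stream.h>
#include <google/protobuf/message.h>

bool parseRpcResponse(const void *buf, int size,
                      google::protobuf::Message &msg, int maxMessageSize) {
    google::protobuf::io::CodedInputStream stream(
        static_cast<const google::protobuf::uint8 *>(buf), size);
    // Messages above the limit are rejected with the "A protocol message
    // was rejected because it was too big" error seen above.
    stream.SetTotalBytesLimit(maxMessageSize, maxMessageSize);
    return msg.ParseFromCodedStream(&stream);
}
{code}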




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1627) Support setting the max protocol message size when talking with HDFS

2018-06-19 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1627:

Description: 
Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
adjusted.
When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
you will see the following lines in the HAWQ master log:
{code}
2018-06-20 11:21:56.768003 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
was rejected because it was too big (more than 67108864 bytes).  To increase 
the limit (or to
disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
2018-06-20 11:21:56.771657 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
RPC call ""getFsStats"" on server ""localhost:9000"":
RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
protocol mismatch: RPC channel cannot parse response header.
@   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
@   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
@   
Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
@   
Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
2018-06-20 11:21:56.771711 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
:invoke(Hdfs::Internal::RpcCall const&)
@   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
@   Hdfs::Internal::NamenodeImpl::getFsStats()
@   Hdfs::Internal::NamenodeProxy::getFsStats()
@   Hdfs::Internal::FileSystemImpl::getFsStats()
@   Hdfs::Internal::FileSystemImpl::connect()
@   Hdfs::FileSystem::connect(char const*, char const*, char const*)
@   Hdfs::FileSystem::connect(char const*)
@   hdfsBuilderConnect
@   gpfs_hdfs_connect
@   HdfsConnect
@   HdfsGetConnection
@   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
{code}

Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ should 
also add a guc for it.


  was:
Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
adjusted.
When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
you will see the following lines in the HAWQ master log:
{code}
2018-06-20 11:21:56.768003 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
[libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
was rejected because it was too big (more than 67108864 bytes).  To increase 
the limit (or to
disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
2018-06-20 11:21:56.771657 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
RPC call ""getFsStats"" on server ""localhost:9000"":
RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
protocol mismatch: RPC channel cannot parse response header.
@   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
@   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
@   
Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
@   
Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
2018-06-20 11:21:56.771711 
CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
log:
:invoke(Hdfs::Internal::RpcCall const&)
@   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
@   Hdfs::Internal::NamenodeImpl::getFsStats()
@   Hdfs::Internal::NamenodeProxy::getFsStats()
@   Hdfs::Internal::FileSystemImpl::getFsStats()
@   Hdfs::Internal::FileSystemImpl::connect()
@   Hdfs::FileSystem::connect(char const*, char const*, char const*)
@   Hdfs::FileSystem::connect(char const*)
@   hdfsBuilderConnect
@   gpfs_hdfs_connect
@   HdfsConnect
@   HdfsGetConnection
@   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
{code}

Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ also 
should has a guc about it.



> Support setting the max protocol message size when talking with HDFS
> 
>
> Key: HAWQ-1627
> URL: https://issues.apache.org/jira/browse/HAWQ-1627
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating

[jira] [Assigned] (HAWQ-1627) Support setting the max protocol message size when talking with HDFS

2018-06-19 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma reassigned HAWQ-1627:
---

Assignee: Hongxu Ma  (was: Radar Lei)

> Support setting the max protocol message size when talking with HDFS
> 
>
> Key: HAWQ-1627
> URL: https://issues.apache.org/jira/browse/HAWQ-1627
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
> adjusted.
> When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
> you will see the following lines in the HAWQ master log:
> {code}
> 2018-06-20 11:21:56.768003 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> [libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
> was rejected because it was too big (more than 67108864 bytes).  To increase 
> the limit (or to
> disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
> google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771657 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> 2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
> RPC call ""getFsStats"" on server ""localhost:9000"":
> RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
> protocol mismatch: RPC channel cannot parse response header.
> @   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
> @   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
> @   
> Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
> @   
> Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771711 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> :invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::getFsStats()
> @   Hdfs::Internal::NamenodeProxy::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::connect()
> @   Hdfs::FileSystem::connect(char const*, char const*, char const*)
> @   Hdfs::FileSystem::connect(char const*)
> @   hdfsBuilderConnect
> @   gpfs_hdfs_connect
> @   HdfsConnect
> @   HdfsGetConnection
> @   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
> {code}
> Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ should 
> also add a guc for it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1450) New HAWQ executor with vectorization & possible code generation

2018-06-19 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16517807#comment-16517807
 ] 

Hongxu Ma commented on HAWQ-1450:
-

All work finished, thanks! [~weinan003] [~zhangshujie]

From Weinan's mail in hawq-dev:
"At present, it supports 7 data types that trigger the vectorized functions, 
reaching two to eight times acceleration compared with regular execution in 
different performance test queries. Furthermore, it has an extensible, 
friendly framework, so you can patch other types in as you need."

> New HAWQ executor with vectorization & possible code generation
> ---
>
> Key: HAWQ-1450
> URL: https://issues.apache.org/jira/browse/HAWQ-1450
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lei Chang
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog
>
> Attachments: hawq_vectorized_execution_design_v0.1.pdf
>
>
> Most HAWQ executor code is inherited from postgres & gpdb. Let's discuss how 
> to build a new hawq executor with vectorization and possibly code generation. 
> These optimizations may potentially improve query performance a lot.
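
(Editor's illustration, with made-up function names) The core idea of 
vectorized execution is to evaluate an expression over a batch of values per 
call, rather than once per tuple:
{code}
// Toy contrast between tuple-at-a-time and vectorized evaluation.
#include <cstddef>
#include <cstdint>

// Tuple-at-a-time: one call (plus interpretation overhead) per row.
int32_t add_one_row(int32_t v) { return v + 1; }

// Vectorized: one call per batch; the tight loop is friendly to the
// compiler's auto-vectorizer / SIMD units.
void add_one_batch(const int32_t *in, int32_t *out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] + 1;
}
{code}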



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HAWQ-1450) New HAWQ executor with vectorization & possible code generation

2018-06-19 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma updated HAWQ-1450:

Fix Version/s: 2.4.0.0-incubating

> New HAWQ executor with vectorization & possible code generation
> ---
>
> Key: HAWQ-1450
> URL: https://issues.apache.org/jira/browse/HAWQ-1450
> Project: Apache HAWQ
>  Issue Type: New Feature
>  Components: Query Execution
>Reporter: Lei Chang
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: backlog, 2.4.0.0-incubating
>
> Attachments: hawq_vectorized_execution_design_v0.1.pdf
>
>
> Most HAWQ executor code is inherited from postgres & gpdb. Let's discuss how 
> to build a new hawq executor with vectorization and possibly code generation. 
> These optimizations may potentially improve query performance a lot.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HAWQ-1627) Support setting the max protocol message size when talking with HDFS

2018-06-20 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518816#comment-16518816
 ] 

Hongxu Ma edited comment on HAWQ-1627 at 6/21/18 2:32 AM:
--

User can set this property in _etc/hdfs-client.xml_:
{code:xml}
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
{code}
Default value is: 67108864(64M)
  


was (Author: hongxu ma):
User can set this value in _etc/hdfs-client.xml_:
{code}
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
{code}

Default value is: 67108864(64M)
 

> Support setting the max protocol message size when talking with HDFS
> 
>
> Key: HAWQ-1627
> URL: https://issues.apache.org/jira/browse/HAWQ-1627
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
> adjusted.
> When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
> you will see the following lines in the HAWQ master log:
> {code}
> 2018-06-20 11:21:56.768003 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> [libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
> was rejected because it was too big (more than 67108864 bytes).  To increase 
> the limit (or to
> disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
> google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771657 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> 2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
> RPC call ""getFsStats"" on server ""localhost:9000"":
> RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
> protocol mismatch: RPC channel cannot parse response header.
> @   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
> @   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
> @   
> Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
> @   
> Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771711 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> :invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::getFsStats()
> @   Hdfs::Internal::NamenodeProxy::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::connect()
> @   Hdfs::FileSystem::connect(char const*, char const*, char const*)
> @   Hdfs::FileSystem::connect(char const*)
> @   hdfsBuilderConnect
> @   gpfs_hdfs_connect
> @   HdfsConnect
> @   HdfsGetConnection
> @   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
> {code}
> Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ should 
> also add a guc for it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1627) Support setting the max protocol message size when talking with HDFS

2018-06-20 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518816#comment-16518816
 ] 

Hongxu Ma commented on HAWQ-1627:
-

User can set this value in _etc/hdfs-client.xml_:
{code}
<property>
  <name>ipc.maximum.data.length</name>
  <value>134217728</value>
</property>
{code}

Default value is: 67108864(64M)
 

> Support setting the max protocol message size when talking with HDFS
> 
>
> Key: HAWQ-1627
> URL: https://issues.apache.org/jira/browse/HAWQ-1627
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
> adjusted.
> When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
> you will see the following lines in the HAWQ master log:
> {code}
> 2018-06-20 11:21:56.768003 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> [libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
> was rejected because it was too big (more than 67108864 bytes).  To increase 
> the limit (or to
> disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
> google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771657 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> 2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
> RPC call ""getFsStats"" on server ""localhost:9000"":
> RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
> protocol mismatch: RPC channel cannot parse response header.
> @   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
> @   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
> @   
> Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
> @   
> Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771711 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> :invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::getFsStats()
> @   Hdfs::Internal::NamenodeProxy::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::connect()
> @   Hdfs::FileSystem::connect(char const*, char const*, char const*)
> @   Hdfs::FileSystem::connect(char const*)
> @   hdfsBuilderConnect
> @   gpfs_hdfs_connect
> @   HdfsConnect
> @   HdfsGetConnection
> @   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
> {code}
> Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ should 
> also add a guc for it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1636) Compile apache hawq failure due to unsupported syntax in libyarn on osx 10.11

2018-07-04 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16533182#comment-16533182
 ] 

Hongxu Ma commented on HAWQ-1636:
-

[~oushu1wangziming1] Thanks.

This issue is the same as the HAWQ travis failure:

[https://travis-ci.org/apache/incubator-hawq/builds/399048338?utm_source=github_status&utm_medium=notification]

I think it's an environment problem (e.g. the compiler) and can be fixed easily.

Are you planning to open a PR? It's a good opportunity for a new contributor :-)
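
For reference, the likely shape of the fix is just qualifying the type (a 
sketch; "RMInfo" is a placeholder, since the truncated error message elides 
the real element type of rmConfInfos):
{code}
// Hypothetical fix in ApplicationClient.cpp: qualify the iterator type
// explicitly, since newer libc++ no longer brings "vector" into scope here.
#include <vector>

struct RMInfo {};  // placeholder for the real element type (assumption)

void iterateRMConfInfos(std::vector<RMInfo> &rmConfInfos) {
    for (std::vector<RMInfo>::iterator it = rmConfInfos.begin();
         it != rmConfInfos.end(); ++it) {
        // ... existing loop body ...
    }
}
{code}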

> Compile apache hawq failure due to unsupported syntax in libyarn on osx 10.11
> -
>
> Key: HAWQ-1636
> URL: https://issues.apache.org/jira/browse/HAWQ-1636
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Build
>Affects Versions: 2.3.0.0-incubating
>Reporter: WangZiming
>Assignee: Radar Lei
>Priority: Major
>
> Following the instructions 
> ([https://cwiki.apache.org/confluence/display/HAWQ/Build+and+Install]) to 
> build Apache HAWQ on OSX 10.11 fails due to unsupported syntax in libyarn:
> {code:java}
> 1. ./configure
> 2. make
> [ 9%] Building CXX object 
> src/CMakeFiles/libyarn-shared.dir/libyarnclient/ApplicationClient.cpp.o
> cd /Users/wangziming/workplace/incubator-hawq/depends/libyarn/build/src && 
> /usr/bin/g++ -DTEST_HDFS_PREFIX=\"./\" -D_GNU_SOURCE -D__STDC_FORMAT_MACROS 
> -Dlibyarn_shared_EXPORTS 
> -I/Users/wangziming/workplace/incubator-hawq/depends/thirdparty/googletest/googletest/include
>  
> -I/Users/wangziming/workplace/incubator-hawq/depends/thirdparty/googletest/googlemock/include
>  -I/Users/wangziming/workplace/incubator-hawq/depends/libyarn/src 
> -I/Users/wangziming/workplace/incubator-hawq/depends/libyarn/src/common 
> -I/Users/wangziming/workplace/incubator-hawq/depends/libyarn/build/src 
> -I/usr/local/include -I/usr/include/libxml2 
> -I/Users/wangziming/workplace/incubator-hawq/depends/libyarn/mock 
> -fno-omit-frame-pointer -msse4.2 -std=c++0x -O2 -g -DNDEBUG -fPIC -o 
> CMakeFiles/libyarn-shared.dir/libyarnclient/ApplicationClient.cpp.o -c 
> /Users/wangziming/workplace/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp
> /Users/wangziming/workplace/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:76:10:
>  error: no template named 'vector'; did you mean 'std::vector'?
> for (vector::iterator it = rmConfInfos.begin();
> ^~
> std::vector
> /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/vector:457:29: 
> note: 'std::vector' declared here
> class _LIBCPP_TYPE_VIS_ONLY vector
> ^
> /Users/wangziming/workplace/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:79:14:
>  error: no template named 'vector'; did you mean 'std::vector'?
> for (vector::iterator it2 = rmInfos.begin();
> ^~
> std::vector
> /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/iterator:1244:75:
>  note: 'std::vector' declared here
> template  friend class _LIBCPP_TYPE_VIS_ONLY vector;
> ^
> /Users/wangziming/workplace/incubator-hawq/depends/libyarn/src/libyarnclient/ApplicationClient.cpp:98:17:
>  warning: format specifies type 'int' but the argument has type 'size_type' 
> (aka 'unsigned long') [-Wformat]
> rmInfos.size());
> ^~
> /Users/wangziming/workplace/incubator-hawq/depends/libyarn/src/common/Logger.h:59:47:
>  note: expanded from macro 'LOG'
> Yarn::Internal::RootLogger.printf(s, fmt, ##__VA_ARGS__)
> ^~~
> 1 warning and 2 errors generated.
> make[4]: *** 
> [src/CMakeFiles/libyarn-shared.dir/libyarnclient/ApplicationClient.cpp.o] 
> Error 1
> make[3]: *** [src/CMakeFiles/libyarn-shared.dir/all] Error 2
> make[2]: *** [all] Error 2
> make[1]: *** [build] Error 2
> make: *** [all] Error 2{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Closed] (HAWQ-1627) Support setting the max protocol message size when talking with HDFS

2018-07-09 Thread Hongxu Ma (JIRA)


 [ 
https://issues.apache.org/jira/browse/HAWQ-1627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hongxu Ma closed HAWQ-1627.
---
Resolution: Fixed

fixed

> Support setting the max protocol message size when talking with HDFS
> 
>
> Key: HAWQ-1627
> URL: https://issues.apache.org/jira/browse/HAWQ-1627
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: libhdfs
>Reporter: Hongxu Ma
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.4.0.0-incubating
>
>
> Currently, the max protocol message size in libhdfs is 64MB, and it cannot be 
> adjusted.
> When the limit is exceeded (e.g. when accessing a very big HDFS table/file), 
> you will see the following lines in the HAWQ master log:
> {code}
> 2018-06-20 11:21:56.768003 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> [libprotobuf ERROR google/protobuf/io/coded_stream.cc:208] A protocol message 
> was rejected because it was too big (more than 67108864 bytes).  To increase 
> the limit (or to
> disable these warnings), see CodedInputStream::SetTotalBytesLimit() in 
> google/protobuf/io/coded_stream.h.""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771657 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> 2018-06-20 11:21:56.771492, p75751, th0x7fffcd7303c0, ERROR Failed to invoke 
> RPC call ""getFsStats"" on server ""localhost:9000"":
> RpcChannel.cpp: 783: HdfsRpcException: RPC channel to ""localhost:9000"" got 
> protocol mismatch: RPC channel cannot parse response header.
> @   Hdfs::Internal::RpcChannelImpl::readOneResponse(bool)
> @   Hdfs::Internal::RpcChannelImpl::checkOneResponse()
> @   
> Hdfs::Internal::RpcChannelImpl::invokeInternal(std::__1::shared_ptr)
> @   
> Hdfs::Internal::RpcChannelImpl:""SysLoggerMain","syslogger.c",518,
> 2018-06-20 11:21:56.771711 
> CST,,,p75703,th-8481004160,,,seg-1,"LOG","0","3rd party error 
> log:
> :invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::invoke(Hdfs::Internal::RpcCall const&)
> @   Hdfs::Internal::NamenodeImpl::getFsStats()
> @   Hdfs::Internal::NamenodeProxy::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::getFsStats()
> @   Hdfs::Internal::FileSystemImpl::connect()
> @   Hdfs::FileSystem::connect(char const*, char const*, char const*)
> @   Hdfs::FileSystem::connect(char const*)
> @   hdfsBuilderConnect
> @   gpfs_hdfs_connect
> @   HdfsConnect
> @   HdfsGetConnection
> @   HdfsGetFileBlockLocati""SysLoggerMain","syslogger.c",518,
> {code}
> Considering HDFS has a guc *"ipc.maximum.data.length"* to set it, HAWQ should 
> also add a guc for it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HAWQ-1642) How to use Ranger to control access to HAWQ row and column read?

2018-07-24 Thread Hongxu Ma (JIRA)


[ 
https://issues.apache.org/jira/browse/HAWQ-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553891#comment-16553891
 ] 

Hongxu Ma commented on HAWQ-1642:
-

Unfortunately, this feature is not supported on the HAWQ side, although Ranger 
already supports it.

There is also no plan to support it in the future.

 

> How to use Ranger to control access to HAWQ row and column read?
> 
>
> Key: HAWQ-1642
> URL: https://issues.apache.org/jira/browse/HAWQ-1642
> Project: Apache HAWQ
>  Issue Type: Wish
>  Components: Storage
>Reporter: ercengsha
>Assignee: Hongxu Ma
>Priority: Major
> Fix For: 2.3.0.0-incubating
>
>
> How to use Ranger to control access to HAWQ row and column read?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


<    1   2