[jira] [Resolved] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen resolved HAWQ-637.
--
Resolution: Fixed

> RM process is error if property is missing in hawq-site.xml
> ---
>
> Key: HAWQ-637
> URL: https://issues.apache.org/jira/browse/HAWQ-637
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> Start HAWQ in YARN mode when the YARN RM address is not configured in 
> hawq-site.xml. The scripts report that HAWQ started successfully, but the RM 
> process is not running correctly.
> ```
> gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, 
> master resource managercon4 error exit in 2m 0s
> ```
> ```
> x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
> pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
> #2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
> #3  0x009d6047 in errfinish (dummy=0) at elog.c:597
> #4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
> resource broker failed to start resource broker process. error=%d") at 
> elog.c:1463
> #5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
> resourcebroker_LIBYARN.c:96
> #6  0x00a5d924 in RB_start (isforked=1 '\001') at 
> resourcebroker_API.c:58
> #7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
> #8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
> resourcemanager.c:513
> #9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
> resourcemanager.c:332
> #10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
> #11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
> #12 0x00895c77 in do_reaper () at postmaster.c:4021
> #13 0x0089203b in ServerLoop () at postmaster.c:2136
> #14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
> postmaster.c:1454
> #15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
> ```
> The fix is to let the RB and RM processes keep working normally; the RB just 
> cannot register itself with the Hadoop YARN RM, since the configuration in 
> hawq-site.xml is not correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231684#comment-15231684
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

Github user linwen closed the pull request at:

https://github.com/apache/incubator-hawq/pull/565


> RM process is error if property is missing in hawq-site.xml
> ---
>
> Key: HAWQ-637
> URL: https://issues.apache.org/jira/browse/HAWQ-637
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> Start HAWQ in YARN mode when the YARN RM address is not configured in 
> hawq-site.xml. The scripts report that HAWQ started successfully, but the RM 
> process is not running correctly.
> ```
> gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, 
> master resource managercon4 error exit in 2m 0s
> ```
> ```
> x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
> pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
> #2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
> #3  0x009d6047 in errfinish (dummy=0) at elog.c:597
> #4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
> resource broker failed to start resource broker process. error=%d") at 
> elog.c:1463
> #5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
> resourcebroker_LIBYARN.c:96
> #6  0x00a5d924 in RB_start (isforked=1 '\001') at 
> resourcebroker_API.c:58
> #7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
> #8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
> resourcemanager.c:513
> #9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
> resourcemanager.c:332
> #10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
> #11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
> #12 0x00895c77 in do_reaper () at postmaster.c:4021
> #13 0x0089203b in ServerLoop () at postmaster.c:2136
> #14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
> postmaster.c:1454
> #15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
> ```
> The fix is to let the RB and RM processes keep working normally; the RB just 
> cannot register itself with the Hadoop YARN RM, since the configuration in 
> hawq-site.xml is not correct.





[jira] [Assigned] (HAWQ-640) Wrongly timing the requests for YARN containers causes resource manager not fully acquire resource

2016-04-07 Thread Yi Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Jin reassigned HAWQ-640:
---

Assignee: Yi Jin  (was: Lei Chang)

> Wrongly timing the requests for YARN containers causes resource manager not 
> fully acquire resource
> --
>
> Key: HAWQ-640
> URL: https://issues.apache.org/jira/browse/HAWQ-640
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Yi Jin
>Assignee: Yi Jin
>






[jira] [Created] (HAWQ-640) Wrongly timing the requests for YARN containers causes resource manager not fully acquire resource

2016-04-07 Thread Yi Jin (JIRA)
Yi Jin created HAWQ-640:
---

 Summary: Wrongly timing the requests for YARN containers causes 
resource manager not fully acquire resource
 Key: HAWQ-640
 URL: https://issues.apache.org/jira/browse/HAWQ-640
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Resource Manager
Reporter: Yi Jin
Assignee: Lei Chang








[jira] [Resolved] (HAWQ-605) Some segment capacity changes are not logged out and when segment goes to up status, the capacity is not adjusted

2016-04-07 Thread Yi Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yi Jin resolved HAWQ-605.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> Some segment capacity changes are not logged out and when segment goes to up 
> status, the capacity is not adjusted
> -
>
> Key: HAWQ-605
> URL: https://issues.apache.org/jira/browse/HAWQ-605
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Yi Jin
>Assignee: Yi Jin
> Fix For: 2.0.0
>
>






[jira] [Commented] (HAWQ-634) Let resource manager cancel waiting and return allocated resource when unregister connection rpc is called

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231615#comment-15231615
 ] 

ASF GitHub Bot commented on HAWQ-634:
-

GitHub user jiny2 opened a pull request:

https://github.com/apache/incubator-hawq/pull/568

HAWQ-634. Let resource manager cancel waiting and return allocated resource 
when unregister connection rpc is called



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jiny2/incubator-hawq HAWQ-634

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/568.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #568


commit 3d1ddd1af78c1d7b0a33e4e6e6d707e1d3be4a30
Author: YI JIN 
Date:   2016-04-08T04:27:42Z

HAWQ-634. Let resource manager cancel waiting and return allocated resource 
when unregister connection rpc is called




> Let resource manager cancel waiting and return allocated resource when 
> unregister connection rpc is called
> --
>
> Key: HAWQ-634
> URL: https://issues.apache.org/jira/browse/HAWQ-634
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Resource Manager
>Reporter: Yi Jin
>Assignee: Yi Jin
>
> This is an improvement to help quickly recycle resources used by a cancelled 
> query.





[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231511#comment-15231511
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

Github user jiny2 commented on the pull request:

https://github.com/apache/incubator-hawq/pull/565#issuecomment-207179261
  
LGTM +1


> RM process is error if property is missing in hawq-site.xml
> ---
>
> Key: HAWQ-637
> URL: https://issues.apache.org/jira/browse/HAWQ-637
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> Start HAWQ in YARN mode when the YARN RM address is not configured in 
> hawq-site.xml. The scripts report that HAWQ started successfully, but the RM 
> process is not running correctly.
> ```
> gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, 
> master resource managercon4 error exit in 2m 0s
> ```
> ```
> x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
> pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
> #2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
> #3  0x009d6047 in errfinish (dummy=0) at elog.c:597
> #4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
> resource broker failed to start resource broker process. error=%d") at 
> elog.c:1463
> #5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
> resourcebroker_LIBYARN.c:96
> #6  0x00a5d924 in RB_start (isforked=1 '\001') at 
> resourcebroker_API.c:58
> #7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
> #8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
> resourcemanager.c:513
> #9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
> resourcemanager.c:332
> #10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
> #11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
> #12 0x00895c77 in do_reaper () at postmaster.c:4021
> #13 0x0089203b in ServerLoop () at postmaster.c:2136
> #14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
> postmaster.c:1454
> #15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
> ```
> The fix is to let the RB and RM processes keep working normally; the RB just 
> cannot register itself with the Hadoop YARN RM, since the configuration in 
> hawq-site.xml is not correct.





[jira] [Commented] (HAWQ-636) hawq check should read configurations in /etc/security/limits.d

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231504#comment-15231504
 ] 

ASF GitHub Bot commented on HAWQ-636:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/566


> hawq check should read configurations in /etc/security/limits.d
> ---
>
> Key: HAWQ-636
> URL: https://issues.apache.org/jira/browse/HAWQ-636
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Radar Lei
>Assignee: Radar Lei
>
> Currently 'hawq check' only checks configurations from 
> /etc/security/limits.conf.
> We should also include the configuration files inside /etc/security/limits.d/.
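
The request above can be illustrated with a minimal Python sketch. It is not the code from the pull request; the function names are hypothetical, and it assumes that the files in /etc/security/limits.d are read in sorted order after limits.conf and that a later entry for the same domain, type and item overrides an earlier one.

```python
import glob
import os


def limits_files(conf="/etc/security/limits.conf",
                 conf_dir="/etc/security/limits.d"):
    """List limits.conf (if present) followed by the sorted limits.d/*.conf files."""
    files = [conf] if os.path.isfile(conf) else []
    files.extend(sorted(glob.glob(os.path.join(conf_dir, "*.conf"))))
    return files


def parse_limits(lines):
    """Parse '<domain> <type> <item> <value>' lines, skipping comments and blanks."""
    entries = []
    for line in lines:
        fields = line.split("#", 1)[0].split()
        if len(fields) == 4:
            entries.append(tuple(fields))
    return entries


def effective_limits(sources):
    """Merge parsed entries from several line sources; later sources win (assumption)."""
    merged = {}
    for lines in sources:
        for domain, limit_type, item, value in parse_limits(lines):
            merged[(domain, limit_type, item)] = value
    return merged


def load_effective_limits():
    """Read every file returned by limits_files() and merge them in order."""
    sources = []
    for path in limits_files():
        with open(path) as handle:
            sources.append(handle.readlines())
    return effective_limits(sources)
```

A checker could then compare the values returned by load_effective_limits() against the limits it expects for the relevant user.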





[jira] [Commented] (HAWQ-636) hawq check should read configurations in /etc/security/limits.d

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231483#comment-15231483
 ] 

ASF GitHub Bot commented on HAWQ-636:
-

Github user yaoj2 commented on the pull request:

https://github.com/apache/incubator-hawq/pull/566#issuecomment-207168475
  
+1


> hawq check should read configurations in /etc/security/limits.d
> ---
>
> Key: HAWQ-636
> URL: https://issues.apache.org/jira/browse/HAWQ-636
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Radar Lei
>Assignee: Radar Lei
>
> Currently 'hawq check' only checks configurations from 
> /etc/security/limits.conf.
> We should also include the configuration files inside /etc/security/limits.d/.





[jira] [Commented] (HAWQ-636) hawq check should read configurations in /etc/security/limits.d

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15231479#comment-15231479
 ] 

ASF GitHub Bot commented on HAWQ-636:
-

Github user huor commented on the pull request:

https://github.com/apache/incubator-hawq/pull/566#issuecomment-207167202
  
+1


> hawq check should read configurations in /etc/security/limits.d
> ---
>
> Key: HAWQ-636
> URL: https://issues.apache.org/jira/browse/HAWQ-636
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Radar Lei
>Assignee: Radar Lei
>
> Currently 'hawq check' only checks configurations from 
> /etc/security/limits.conf.
> We should also include the configuration files inside /etc/security/limits.d/.





[jira] [Updated] (HAWQ-628) \l+ returns hcatalog not supported error

2016-04-07 Thread Goden Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Goden Yao updated HAWQ-628:
---
Description: 
This is likely not from HAWQ client

{code}
gpadmin=# \l+
ERROR:  database hcatalog (OID 6120) is reserved (dbsize.c:185)

gpadmin=# \l
                 List of databases
   Name    |  Owner  | Encoding | Access privileges
-----------+---------+----------+-------------------
 gpadmin   | gpadmin | UTF8     |
 hcatalog  | gpadmin | UTF8     |
 lc_demo   | gpadmin | UTF8     |
 postgres  | gpadmin | UTF8     |
 template0 | gpadmin | UTF8     |
 template1 | gpadmin | UTF8     |
(6 rows)
{code}


*Expected Behavior*
show hcatalog under \l+ normally with as much information as we can provide

  was:
{code}
gpadmin=# \l+
ERROR:  database hcatalog (OID 6120) is reserved (dbsize.c:185)

gpadmin=# \l
                 List of databases
   Name    |  Owner  | Encoding | Access privileges
-----------+---------+----------+-------------------
 gpadmin   | gpadmin | UTF8     |
 hcatalog  | gpadmin | UTF8     |
 lc_demo   | gpadmin | UTF8     |
 postgres  | gpadmin | UTF8     |
 template0 | gpadmin | UTF8     |
 template1 | gpadmin | UTF8     |
(6 rows)
{code}


*Expected Behavior*
show hcatalog under \l+ normally with as much information as we can provide


> \l+ returns hcatalog not supported error
> 
>
> Key: HAWQ-628
> URL: https://issues.apache.org/jira/browse/HAWQ-628
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Hcatalog, PXF
>Reporter: Goden Yao
>Assignee: Goden Yao
>
> This is likely not from HAWQ client
> {code}
> gpadmin=# \l+
> ERROR:  database hcatalog (OID 6120) is reserved (dbsize.c:185)
> gpadmin=# \l
>                  List of databases
>    Name    |  Owner  | Encoding | Access privileges
> -----------+---------+----------+-------------------
>  gpadmin   | gpadmin | UTF8     |
>  hcatalog  | gpadmin | UTF8     |
>  lc_demo   | gpadmin | UTF8     |
>  postgres  | gpadmin | UTF8     |
>  template0 | gpadmin | UTF8     |
>  template1 | gpadmin | UTF8     |
> (6 rows)
> {code}
> *Expected Behavior*
> show hcatalog under \l+ normally with as much information as we can provide





[jira] [Created] (HAWQ-639) HAWQ support ORC as a native storage format

2016-04-07 Thread Goden Yao (JIRA)
Goden Yao created HAWQ-639:
--

 Summary: HAWQ support ORC as a native storage format
 Key: HAWQ-639
 URL: https://issues.apache.org/jira/browse/HAWQ-639
 Project: Apache HAWQ
  Issue Type: New Feature
  Components: Storage
Reporter: Goden Yao
Assignee: Lei Chang


As a user, I want to be able to store my table as ORC format in HAWQ





[jira] [Commented] (HAWQ-633) Do not error out when cleaning up workfiles during AbortTransaction

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230521#comment-15230521
 ] 

ASF GitHub Bot commented on HAWQ-633:
-

Github user gcaragea closed the pull request at:

https://github.com/apache/incubator-hawq/pull/562


> Do not error out when cleaning up workfiles during AbortTransaction
> ---
>
> Key: HAWQ-633
> URL: https://issues.apache.org/jira/browse/HAWQ-633
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Query Execution
>Reporter: George Caragea
>Assignee: George Caragea
>
> If we run out of disk space when creating temporary spill files, we will 
> error out while writing to disk and abort the transaction. 
> When aborting the transaction, as part of the cleanup code we call 
> workfile_mgr_unlink_directory() to delete the directory containing all the 
> work files. But in some cases that directory might not even have been 
> created, because of the lack of disk space. 
> Instead of erroring out again, just give a warning and continue with the 
> abort code. 
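
The "warn and continue" behaviour described above can be sketched in Python as follows. This is only an illustration of the pattern, not the C change made in HAWQ; unlink_workfile_directory is a hypothetical stand-in for workfile_mgr_unlink_directory(), and the warning text is invented for the example.

```python
import os
import shutil
import tempfile
import warnings


def unlink_workfile_directory(path):
    """Remove a spill directory; warn instead of raising if removal fails.

    Returns True if the directory was removed and False otherwise, so the
    caller (the abort path) can carry on with the rest of its cleanup.
    """
    try:
        shutil.rmtree(path)
        return True
    except OSError as exc:
        # Covers the case where the directory was never created.
        warnings.warn("could not remove workfile directory %s: %s" % (path, exc))
        return False


if __name__ == "__main__":
    existing = tempfile.mkdtemp()
    print(unlink_workfile_directory(existing))
    print(unlink_workfile_directory(os.path.join(existing, "never_created")))
```

The point of returning a flag rather than raising is that a second error during abort processing would interrupt the remaining cleanup steps.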





[jira] [Created] (HAWQ-638) gpload bug using pip installed PyGreSQL

2016-04-07 Thread hongwu (JIRA)
hongwu created HAWQ-638:
---

 Summary: gpload bug using pip installed PyGreSQL
 Key: HAWQ-638
 URL: https://issues.apache.org/jira/browse/HAWQ-638
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Command Line Tools
Reporter: hongwu
Assignee: Lei Chang


Greenplum's gpload relies on a private patch to PyGreSQL that is used 
internally, and gpload.py was copied from Greenplum incompletely, so the 
gpload tool generates an error when it is used.

Details: 
self.db.notices() depends on the implementation of pg.DB.notices, which was 
implemented internally in Greenplum, so it is wrong to use this attribute in 
the gpload tool of HAWQ.

Reference:
https://github.com/apache/incubator-hawq/blob/master/tools/bin/gpload.py#L704

https://github.com/greenplum-db/gpdb/blob/master/gpMgmt/bin/pythonSrc/PyGreSQL-4.0/pgmodule.c#L2929
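
One defensive way to handle a missing notices attribute is sketched below. It is an assumption for illustration and not necessarily the change made for this issue; collect_notices, PatchedDB and StockDB are hypothetical names, and the two classes only stand in for a connection object with and without the Greenplum-specific method.

```python
def collect_notices(db):
    """Return pending notices if the connection exposes notices(), else []."""
    notices = getattr(db, "notices", None)
    if callable(notices):
        return list(notices())
    return []


class PatchedDB(object):
    """Stand-in for a connection built on the privately patched PyGreSQL."""

    def notices(self):
        return ["NOTICE: example"]


class StockDB(object):
    """Stand-in for a connection from a pip-installed PyGreSQL without notices()."""


if __name__ == "__main__":
    print(collect_notices(PatchedDB()))
    print(collect_notices(StockDB()))
```

With a guard like this, the loader would keep working on an unpatched driver and would simply report no notices.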





[jira] [Commented] (HAWQ-638) gpload bug using pip installed PyGreSQL

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230281#comment-15230281
 ] 

ASF GitHub Bot commented on HAWQ-638:
-

Github user xunzhang commented on the pull request:

https://github.com/apache/incubator-hawq/pull/567#issuecomment-206923112
  
cc @radarwave 
ps: I am not sure why Jenkins could not merge into master.


> gpload bug using pip installed PyGreSQL
> ---
>
> Key: HAWQ-638
> URL: https://issues.apache.org/jira/browse/HAWQ-638
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: hongwu
>Assignee: Lei Chang
>
> Greenplum's gpload relies on a private patch to PyGreSQL that is used 
> internally, and gpload.py was copied from Greenplum incompletely, so the 
> gpload tool generates an error when it is used.
> Details: 
> self.db.notices() depends on the implementation of pg.DB.notices, which was 
> implemented internally in Greenplum, so it is wrong to use this attribute in 
> the gpload tool of HAWQ.
> Reference:
> https://github.com/apache/incubator-hawq/blob/master/tools/bin/gpload.py#L704
> https://github.com/greenplum-db/gpdb/blob/master/gpMgmt/bin/pythonSrc/PyGreSQL-4.0/pgmodule.c#L2929





[jira] [Commented] (HAWQ-638) gpload bug using pip installed PyGreSQL

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230276#comment-15230276
 ] 

ASF GitHub Bot commented on HAWQ-638:
-

GitHub user xunzhang opened a pull request:

https://github.com/apache/incubator-hawq/pull/567

HAWQ-638. gpload bugfix using pip installed PyGreSQL

See https://issues.apache.org/jira/browse/HAWQ-638 for detailed info.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xunzhang/incubator-hawq hotfix-gpload

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/567.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #567


commit 29a3c439eb8e426d79d3c1e71f37c0d84f3792da
Author: xunzhang 
Date:   2016-04-07T14:01:59Z

HAWQ-638. gpload bugfix using pip installed PyGreSQL




> gpload bug using pip installed PyGreSQL
> ---
>
> Key: HAWQ-638
> URL: https://issues.apache.org/jira/browse/HAWQ-638
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Command Line Tools
>Reporter: hongwu
>Assignee: Lei Chang
>
> Greenplum's gpload relies on a private patch to PyGreSQL that is used 
> internally, and gpload.py was copied from Greenplum incompletely, so the 
> gpload tool generates an error when it is used.
> Details: 
> self.db.notices() depends on the implementation of pg.DB.notices, which was 
> implemented internally in Greenplum, so it is wrong to use this attribute in 
> the gpload tool of HAWQ.
> Reference:
> https://github.com/apache/incubator-hawq/blob/master/tools/bin/gpload.py#L704
> https://github.com/greenplum-db/gpdb/blob/master/gpMgmt/bin/pythonSrc/PyGreSQL-4.0/pgmodule.c#L2929





[jira] [Commented] (HAWQ-636) hawq check should read configurations in /etc/security/limits.d

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230188#comment-15230188
 ] 

ASF GitHub Bot commented on HAWQ-636:
-

GitHub user radarwave opened a pull request:

https://github.com/apache/incubator-hawq/pull/566

HAWQ-636. hawq check include /etc/security/limits.d

Tested with config files in '/etc/security/limits.d'; it works well with 
different config files and with different users when a user domain is defined.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/radarwave/incubator-hawq HAWQ-636

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/566.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #566


commit f8fe2cf88e0b529f2a48657e4c7c14b515580e7b
Author: rlei 
Date:   2016-04-07T10:16:42Z

HAWQ-636. hawq check include /etc/security/limits.d




> hawq check should read configurations in /etc/security/limits.d
> ---
>
> Key: HAWQ-636
> URL: https://issues.apache.org/jira/browse/HAWQ-636
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Radar Lei
>Assignee: Radar Lei
>
> Currently 'hawq check' only checks configurations from 
> /etc/security/limits.conf.
> We should also include the configuration files inside /etc/security/limits.d/.





[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230060#comment-15230060
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

GitHub user linwen opened a pull request:

https://github.com/apache/incubator-hawq/pull/565

HAWQ-637. Fix RM process error if property is missing or invalid in h…

Please review. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/linwen/incubator-hawq hawq-637

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/565.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #565


commit fda5929b784bcd39458dc6b0dec9d7617110dc38
Author: Wen Lin 
Date:   2016-04-07T10:30:42Z

HAWQ-637. Fix RM process error if property is missing or invalid in 
hawq-site.xml with Yarn mode.




> RM process is error if property is missing in hawq-site.xml
> ---
>
> Key: HAWQ-637
> URL: https://issues.apache.org/jira/browse/HAWQ-637
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> Start HAWQ in YARN mode when the YARN RM address is not configured in 
> hawq-site.xml. The scripts report that HAWQ started successfully, but the RM 
> process is not running correctly.
> ```
> gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, 
> master resource managercon4 error exit in 2m 0s
> ```
> ```
> x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
> pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
> #2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
> #3  0x009d6047 in errfinish (dummy=0) at elog.c:597
> #4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
> resource broker failed to start resource broker process. error=%d") at 
> elog.c:1463
> #5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
> resourcebroker_LIBYARN.c:96
> #6  0x00a5d924 in RB_start (isforked=1 '\001') at 
> resourcebroker_API.c:58
> #7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
> #8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
> resourcemanager.c:513
> #9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
> resourcemanager.c:332
> #10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
> #11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
> #12 0x00895c77 in do_reaper () at postmaster.c:4021
> #13 0x0089203b in ServerLoop () at postmaster.c:2136
> #14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
> postmaster.c:1454
> #15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
> ```
> The fix is to let the RB and RM processes keep working normally; the RB just 
> cannot register itself with the Hadoop YARN RM, since the configuration in 
> hawq-site.xml is not correct.





[jira] [Assigned] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread Lin Wen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Wen reassigned HAWQ-637:


Assignee: Lin Wen  (was: Lei Chang)

> RM process is error if property is missing in hawq-site.xml
> ---
>
> Key: HAWQ-637
> URL: https://issues.apache.org/jira/browse/HAWQ-637
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> Start HAWQ in YARN mode when the YARN RM address is not configured in 
> hawq-site.xml. The scripts report that HAWQ started successfully, but the RM 
> process is not running correctly.
> ```
> gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, 
> master resource managercon4 error exit in 2m 0s
> ```
> ```
> x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
> pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
> #2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
> #3  0x009d6047 in errfinish (dummy=0) at elog.c:597
> #4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
> resource broker failed to start resource broker process. error=%d") at 
> elog.c:1463
> #5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
> resourcebroker_LIBYARN.c:96
> #6  0x00a5d924 in RB_start (isforked=1 '\001') at 
> resourcebroker_API.c:58
> #7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
> #8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
> resourcemanager.c:513
> #9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
> resourcemanager.c:332
> #10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
> #11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
> #12 0x00895c77 in do_reaper () at postmaster.c:4021
> #13 0x0089203b in ServerLoop () at postmaster.c:2136
> #14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
> postmaster.c:1454
> #15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
> ```
> The fix is to let the RB and RM processes keep working normally; the RB just 
> cannot register itself with the Hadoop YARN RM, since the configuration in 
> hawq-site.xml is not correct.





[jira] [Created] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread Lin Wen (JIRA)
Lin Wen created HAWQ-637:


 Summary: RM process is error if property is missing in 
hawq-site.xml
 Key: HAWQ-637
 URL: https://issues.apache.org/jira/browse/HAWQ-637
 Project: Apache HAWQ
  Issue Type: Bug
  Components: Resource Manager
Reporter: Lin Wen
Assignee: Lei Chang


Start HAWQ in YARN mode when the YARN RM address is not configured in 
hawq-site.xml. The scripts report that HAWQ started successfully, but the RM 
process is not running correctly.
```
gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, master 
resource managercon4 error exit in 2m 0s
```
```
x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
#1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
#2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
#3  0x009d6047 in errfinish (dummy=0) at elog.c:597
#4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
resource broker failed to start resource broker process. error=%d") at 
elog.c:1463
#5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
resourcebroker_LIBYARN.c:96
#6  0x00a5d924 in RB_start (isforked=1 '\001') at 
resourcebroker_API.c:58
#7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
#8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
resourcemanager.c:513
#9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
resourcemanager.c:332
#10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
#11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
#12 0x00895c77 in do_reaper () at postmaster.c:4021
#13 0x0089203b in ServerLoop () at postmaster.c:2136
#14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
postmaster.c:1454
#15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
```

The fix is to let the RB and RM processes keep working normally; the RB just 
cannot register itself with the Hadoop YARN RM, since the configuration in 
hawq-site.xml is not correct.
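
To catch this kind of misconfiguration before start-up, a small check along the following lines could be run against hawq-site.xml. This is a hedged sketch and not part of the fix: the function name is hypothetical, the file is assumed to use the Hadoop-style configuration/property/name/value layout, and the two property names are assumptions that should be verified against the HAWQ release in use.

```python
import xml.etree.ElementTree as ET

# Assumed names of the YARN-related properties; verify for your HAWQ release.
REQUIRED_YARN_PROPERTIES = (
    "hawq_rm_yarn_address",
    "hawq_rm_yarn_scheduler_address",
)


def missing_yarn_properties(xml_text, required=REQUIRED_YARN_PROPERTIES):
    """Return the required property names that are absent or have an empty value."""
    root = ET.fromstring(xml_text)
    values = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name:
            values[name] = value
    return [name for name in required if not values.get(name)]


# Illustrative file content in which the YARN RM address is not configured.
SAMPLE = """<configuration>
  <property>
    <name>hawq_global_rm_type</name>
    <value>yarn</value>
  </property>
  <property>
    <name>hawq_rm_yarn_scheduler_address</name>
    <value>localhost:8030</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(missing_yarn_properties(SAMPLE))
```

Such a check only reports missing or empty values; it does not verify that a configured address is reachable.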





[jira] [Assigned] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread Ming LI (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming LI reassigned HAWQ-635:


Assignee: Ming LI  (was: Lei Chang)

> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Ming LI
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Resolved] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread Ming LI (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming LI resolved HAWQ-635.
--
Resolution: Fixed

> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Lei Chang
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Commented] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230048#comment-15230048
 ] 

ASF GitHub Bot commented on HAWQ-635:
-

Github user asfgit closed the pull request at:

https://github.com/apache/incubator-hawq/pull/564


> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Lei Chang
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Commented] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15230042#comment-15230042
 ] 

ASF GitHub Bot commented on HAWQ-635:
-

Github user liming01 commented on the pull request:

https://github.com/apache/incubator-hawq/pull/564#issuecomment-206797814
  
@wangzw , yes, I agree with you. Thanks.


> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Lei Chang
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Assigned] (HAWQ-636) hawq check should read configurations in /etc/security/limits.d

2016-04-07 Thread Radar Lei (JIRA)

 [ 
https://issues.apache.org/jira/browse/HAWQ-636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radar Lei reassigned HAWQ-636:
--

Assignee: Radar Lei  (was: Lei Chang)

> hawq check should read configurations in /etc/security/limits.d
> ---
>
> Key: HAWQ-636
> URL: https://issues.apache.org/jira/browse/HAWQ-636
> Project: Apache HAWQ
>  Issue Type: Improvement
>  Components: Command Line Tools
>Reporter: Radar Lei
>Assignee: Radar Lei
>
> Currently 'hawq check' only checks configurations from 
> /etc/security/limits.conf.
> We should also include the configuration files inside /etc/security/limits.d/





[jira] [Created] (HAWQ-636) hawq check should read configurations in /etc/security/limits.d

2016-04-07 Thread Radar Lei (JIRA)
Radar Lei created HAWQ-636:
--

 Summary: hawq check should read configurations in 
/etc/security/limits.d
 Key: HAWQ-636
 URL: https://issues.apache.org/jira/browse/HAWQ-636
 Project: Apache HAWQ
  Issue Type: Improvement
  Components: Command Line Tools
Reporter: Radar Lei
Assignee: Lei Chang


Currently 'hawq check' only checks configurations from /etc/security/limits.conf.

We should also include the configuration files inside /etc/security/limits.d/
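
A rough sketch of the requested behaviour, in C++ for illustration only. The helper names `collectLimitsFiles` and `readLimitLines` are hypothetical and are not part of 'hawq check'; the sketch assumes the drop-in files of interest end in `.conf` and should be read in sorted order after limits.conf.

```cpp
#include <algorithm>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper: returns <securityDir>/limits.conf (if present)
// followed by the *.conf files in <securityDir>/limits.d, sorted by path.
std::vector<fs::path> collectLimitsFiles(const fs::path& securityDir) {
    std::vector<fs::path> files;
    const fs::path mainConf = securityDir / "limits.conf";
    if (fs::is_regular_file(mainConf)) {
        files.push_back(mainConf);
    }

    std::vector<fs::path> dropIns;
    const fs::path dropInDir = securityDir / "limits.d";
    if (fs::is_directory(dropInDir)) {
        for (const auto& entry : fs::directory_iterator(dropInDir)) {
            if (entry.is_regular_file() && entry.path().extension() == ".conf") {
                dropIns.push_back(entry.path());
            }
        }
    }
    std::sort(dropIns.begin(), dropIns.end());
    files.insert(files.end(), dropIns.begin(), dropIns.end());
    return files;
}

// Hypothetical helper: returns every non-blank, non-comment line from the
// given files, in file order, with leading whitespace removed.
std::vector<std::string> readLimitLines(const std::vector<fs::path>& files) {
    std::vector<std::string> lines;
    for (const auto& file : files) {
        std::ifstream in(file);
        std::string line;
        while (std::getline(in, line)) {
            const auto first = line.find_first_not_of(" \t");
            if (first == std::string::npos || line[first] == '#') {
                continue;  // skip blank lines and comments
            }
            lines.push_back(line.substr(first));
        }
    }
    return lines;
}
```

Calling `readLimitLines(collectLimitsFiles("/etc/security"))` would then yield the entries from both locations, so a check that previously looked only at limits.conf can evaluate the combined list.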





[jira] [Commented] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229958#comment-15229958
 ] 

ASF GitHub Bot commented on HAWQ-635:
-

Github user wangzw commented on the pull request:

https://github.com/apache/incubator-hawq/pull/564#issuecomment-206774175
  
If the issue is caused by an exception thrown by 
RpcClientImpl::getChannel(), the only possible cause is that the system ran out 
of memory or failed to create a thread. In either case, you will see the error 
message "RpcClient failed to create a channel to" in the HAWQ log. 

Please refer to the code in RpcClientImpl::getChannel():


try {
    ...

    rc->addRef();

    if (!cleaning) {
        cleaning = true;

        if (cleaner.joinable()) {
            cleaner.join();
        }

        CREATE_THREAD(cleaner, bind(::clean, this));
    }
} catch ...



Isn't it much easier to fix this issue by moving ```rc->addRef();``` to the 
end of the ```try``` block?
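
The effect of that reordering can be sketched with a simplified stand-in. `Channel`, `startCleaner`, `acquireEarly`, and `acquireLate` are hypothetical names for illustration only; this is not the libhdfs3 code, and it assumes the failing step is the thread creation.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical ref-counted channel, standing in for the real class.
class Channel {
public:
    void addRef() { ++refs_; }
    void release() {
        assert(refs_ > 0);
        --refs_;
    }
    int refs() const { return refs_; }

private:
    int refs_ = 0;
};

// Stand-in for a step that may throw, such as creating a thread.
inline void startCleaner(bool fail) {
    if (fail) {
        throw std::runtime_error("failed to create thread");
    }
}

// Current ordering: the reference is taken before the step that can throw,
// so a failure leaves a reference that nobody will ever release.
inline bool acquireEarly(Channel& ch, bool fail) {
    try {
        ch.addRef();
        startCleaner(fail);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}

// Suggested ordering: the reference is taken only after every step that
// can throw has succeeded, so a failure leaves the count untouched.
inline bool acquireLate(Channel& ch, bool fail) {
    try {
        startCleaner(fail);
        ch.addRef();
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

With `fail` set to true, `acquireEarly` returns false but leaves the count at 1, whereas `acquireLate` leaves it at 0. Anything that waits for the count to reach zero at shutdown can only finish in the second case.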



> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Lei Chang
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Commented] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229933#comment-15229933
 ] 

ASF GitHub Bot commented on HAWQ-635:
-

Github user wangzw commented on the pull request:

https://github.com/apache/incubator-hawq/pull/564#issuecomment-206763422
  
Sure, will do.


> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Lei Chang
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Commented] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229915#comment-15229915
 ] 

ASF GitHub Bot commented on HAWQ-635:
-

GitHub user liming01 opened a pull request:

https://github.com/apache/incubator-hawq/pull/564

HAWQ-635. QE process does not exit in libhdfs

1) The problem is caused by an incorrect reference count in the RpcChannelImpl 
class, so the process runs into a dead loop when it exits. I suspect it is 
caused by an exception thrown in RpcClientImpl::getChannel(), which has already 
called addRef() but does not call close() when the exception occurs.

2) This problem cannot be fixed with a signal, because the cleanup process has 
already been invoked by the time the dead loop occurs.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/liming01/incubator-hawq 
mli/process_not_exit_libhdfs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/564.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #564


commit 3156d8a0fa4197ab154ac3fbc3f6f42aa1715dbf
Author: Ming LI 
Date:   2016-04-07T08:32:39Z

HAWQ-635. QE process does not exit in libhdfs




> QE process does not exit in libhdfs
> ---
>
> Key: HAWQ-635
> URL: https://issues.apache.org/jira/browse/HAWQ-635
> Project: Apache HAWQ
>  Issue Type: Bug
>Reporter: Ming LI
>Assignee: Lei Chang
>
> The QE process cannot exit. 
> The calling stack is:
> [gpadmin@sdw3 ~]$ pstack 489333
> #0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
> #1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec 
> const&) () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
> #2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
> from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
> #6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
> #7  0x7ff755b04456 in __do_global_dtors_aux () from 
> /data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
> #8  0x in ?? ()





[jira] [Created] (HAWQ-635) QE process does not exit in libhdfs

2016-04-07 Thread Ming LI (JIRA)
Ming LI created HAWQ-635:


 Summary: QE process does not exit in libhdfs
 Key: HAWQ-635
 URL: https://issues.apache.org/jira/browse/HAWQ-635
 Project: Apache HAWQ
  Issue Type: Bug
Reporter: Ming LI
Assignee: Lei Chang


The QE process cannot exit. 
The calling stack is:

[gpadmin@sdw3 ~]$ pstack 489333
#0  0x0033f560ef3d in nanosleep () from /lib64/libpthread.so.0
#1  0x7ff75309c74a in boost::this_thread::hiden::sleep_for(timespec const&) 
() from 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libboost_thread.so.1.53.0
#2  0x7ff755b850b8 in Hdfs::Internal::RpcChannelImpl::waitForExit() () from 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
#3  0x7ff755b97eff in Hdfs::Internal::RpcClientImpl::close() () from 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
#4  0x7ff755b98094 in Hdfs::Internal::RpcClientImpl::~RpcClientImpl() () 
from 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
#5  0x00540c59 in boost::detail::shared_count::~shared_count() ()
#6  0x0033f52361bd in __cxa_finalize () from /lib64/libc.so.6
#7  0x7ff755b04456 in __do_global_dtors_aux () from 
/data/pulse-agent-data/HAWQ-main-FeatureTest-opt-sanity/product/hawq/./lib/libhdfs3.so.1
#8  0x in ?? ()





[jira] [Commented] (HAWQ-614) Table with Segment Reject Limit fails to flush AO file when all data is rejected

2016-04-07 Thread hongwu (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15229848#comment-15229848
 ] 

hongwu commented on HAWQ-614:
-

Yes. I tried to reproduce the scenario with gpload (a CSV file with 2000 bad 
lines, ERROR_LIMIT = 3000), and it works correctly too. 
Please paste the gpload log file and your YAML configuration file.

> Table with Segment Reject Limit fails to flush AO file when all data is 
> rejected
> 
>
> Key: HAWQ-614
> URL: https://issues.apache.org/jira/browse/HAWQ-614
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: External Tables
>Reporter: Kyle R Dunn
>Assignee: hongwu
>Priority: Minor
> Attachments: image008.jpg
>
>
> An error message (attached) is received if *all* data gets rejected (for any 
> reason) when using segment reject limit option with an error table.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)