[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231684#comment-15231684
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

Github user linwen closed the pull request at:

https://github.com/apache/incubator-hawq/pull/565


> RM process is error if property is missing in hawq-site.xml
> ---
>
> Key: HAWQ-637
> URL: https://issues.apache.org/jira/browse/HAWQ-637
> Project: Apache HAWQ
>  Issue Type: Bug
>  Components: Resource Manager
>Reporter: Lin Wen
>Assignee: Lin Wen
>
> When HAWQ is started in YARN mode and the YARN RM address is not configured in
> hawq-site.xml, the start scripts report that HAWQ started successfully, but the
> RM process is not running correctly.
> ```
> gpadmin  235458 235448  0 08:48 ?00:00:00 postgres: port  5432, 
> master resource managercon4 error exit in 2m 0s
> ```
> ```
> x86_64 libidn-1.18-2.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 
> pam-1.1.1-13.el6.x86_64 zlib-1.2.3-29.el6.x86_64
> (gdb) bt
> #0  0x00366e6e14d3 in __select_nocancel () from /lib64/libc.so.6
> #1  0x00b885d0 in pg_usleep (microsec=3000) at pgsleep.c:43
> #2  0x009dd9d8 in elog_debug_linger (edata=0x117c6c0) at elog.c:4129
> #3  0x009d6047 in errfinish (dummy=0) at elog.c:597
> #4  0x009d86b4 in elog_finish (elevel=21, fmt=0xdb4de0 "YARN mode 
> resource broker failed to start resource broker process. error=%d") at 
> elog.c:1463
> #5  0x00a5e96b in RB_LIBYARN_start (isforked=1 '\001') at 
> resourcebroker_LIBYARN.c:96
> #6  0x00a5d924 in RB_start (isforked=1 '\001') at 
> resourcebroker_API.c:58
> #7  0x00a9417f in MainHandlerLoop () at resourcemanager.c:545
> #8  0x00a940d0 in ResManagerMainServer2ndPhase () at 
> resourcemanager.c:513
> #9  0x00a93b64 in ResManagerMain (argc=3, argv=0x7fffdefaa6f0) at 
> resourcemanager.c:332
> #10 0x00a93d72 in ResManagerProcessStartup () at resourcemanager.c:400
> #11 0x0089525f in CommenceNormalOperations () at postmaster.c:3673
> #12 0x00895c77 in do_reaper () at postmaster.c:4021
> #13 0x0089203b in ServerLoop () at postmaster.c:2136
> #14 0x008911ae in PostmasterMain (argc=9, argv=0x3407940) at 
> postmaster.c:1454
> #15 0x007aaf1a in main (argc=9, argv=0x3407940) at main.c:226
> ```
> The fix is to let the RB and RM processes keep running normally; the RB simply 
> cannot register itself with the Hadoop YARN RM because the configuration in 
> hawq-site.xml is not correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231531#comment-15231531
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

Github user huor commented on the pull request:

https://github.com/apache/incubator-hawq/pull/565#issuecomment-207183443
  
@jiny2:
1. For critical parameters, where an incorrect value might leave the RM unable 
to start, we need to error out while loading them. This makes the user aware 
of critical configuration issues as early as possible (at the hawq start 
stage, etc.).
2. For parameters whose correctness is not so obvious, we can use the fix in 
this change to postpone the check to runtime.
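
To make the two cases above concrete, here is a minimal sketch in C of how a
load-time check could separate them. It is not the code from this pull
request: the `SiteConf` struct, the `ConfCheck` codes and `check_site_conf()`
are hypothetical names, and the choice of which setting counts as critical
(the YARN RM address) and which is deferred (the queue name) is an assumption
made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; these are not HAWQ's actual error codes. */
typedef enum {
    CONF_OK = 0,
    CONF_FATAL,    /* critical setting is unusable: error out while loading */
    CONF_DEFERRED  /* setting cannot be judged yet: start, check at runtime */
} ConfCheck;

/* Hypothetical view of a few settings read from hawq-site.xml. */
typedef struct {
    const char *rm_type;      /* "none" or "yarn" */
    const char *yarn_address; /* "host:port", NULL if the property is missing */
    const char *yarn_queue;   /* NULL if the property is missing */
} SiteConf;

static bool is_blank(const char *s)
{
    return s == NULL || s[0] == '\0';
}

/* Accepts "host:port" with a non-empty host and a port in 1..65535. */
static bool looks_like_host_port(const char *s)
{
    const char *colon;
    long port = 0;

    if (is_blank(s))
        return false;
    colon = strrchr(s, ':');
    if (colon == NULL || colon == s || colon[1] == '\0')
        return false;
    for (const char *p = colon + 1; *p != '\0'; p++) {
        if (*p < '0' || *p > '9')
            return false;
        port = port * 10 + (*p - '0');
        if (port > 65535)
            return false;
    }
    return port > 0;
}

/* Classify the configuration at load time. */
ConfCheck check_site_conf(const SiteConf *conf)
{
    assert(conf != NULL);

    if (is_blank(conf->rm_type))
        return CONF_FATAL;
    if (strcmp(conf->rm_type, "none") == 0)
        return CONF_OK;            /* YARN settings are not consulted */
    if (strcmp(conf->rm_type, "yarn") != 0)
        return CONF_FATAL;

    /* Assumed critical: without a usable address the broker cannot start. */
    if (!looks_like_host_port(conf->yarn_address))
        return CONF_FATAL;

    /* Assumed non-critical: only YARN itself can say whether the queue is valid. */
    if (is_blank(conf->yarn_queue))
        return CONF_DEFERRED;

    return CONF_OK;
}
```

A caller at the hawq start stage would stop on `CONF_FATAL`, while
`CONF_DEFERRED` would let the RM come up and leave the error to be reported
at runtime.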



[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15231511#comment-15231511
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

Github user jiny2 commented on the pull request:

https://github.com/apache/incubator-hawq/pull/565#issuecomment-207179261
  
LGTM +1



[jira] [Commented] (HAWQ-637) RM process is error if property is missing in hawq-site.xml

2016-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/HAWQ-637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15230060#comment-15230060
 ] 

ASF GitHub Bot commented on HAWQ-637:
-

GitHub user linwen opened a pull request:

https://github.com/apache/incubator-hawq/pull/565

HAWQ-637. Fix RM process error if property is missing or invalid in h…

Please review. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/linwen/incubator-hawq hawq-637

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-hawq/pull/565.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #565


commit fda5929b784bcd39458dc6b0dec9d7617110dc38
Author: Wen Lin 
Date:   2016-04-07T10:30:42Z

HAWQ-637. Fix RM process error if property is missing or invalid in 
hawq-site.xml with Yarn mode.



