[ 
https://issues.apache.org/jira/browse/SLIDER-1055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15094145#comment-15094145
 ] 

Steve Loughran commented on SLIDER-1055:
----------------------------------------

Well, this is interesting: it also means that the process would be decoupled 
from the agent process lifecycle: if YARN destroyed the container, the slider 
agent would go, but not hbase.

I'm renaming this JIRA to explicitly call out HBase, as it is (presumably) 
something specifically in the hbase package. 

[~te...@apache.org], can you take a quick look at this?

> hbase-daemon executed by slider is excepted from nodemanager container 
> monitoring
> ---------------------------------------------------------------------------------
>
>                 Key: SLIDER-1055
>                 URL: https://issues.apache.org/jira/browse/SLIDER-1055
>             Project: Slider
>          Issue Type: Bug
>          Components: application/hbase
>    Affects Versions: Slider 0.81
>            Reporter: kyungwan nam
>
> Here is the NodeManager log from a host where an HBASE_REGIONSERVER component 
> is running:
> {code}
> 2016-01-12 14:11:49,237 DEBUG monitor.ContainersMonitorImpl 
> (ContainersMonitorImpl.java:run(361)) - Current ProcessTree list : [ 9801 ]
> 2016-01-12 14:11:49,237 DEBUG monitor.ContainersMonitorImpl 
> (ContainersMonitorImpl.java:run(436)) - Constructing ProcessTree for : PID = 
> 9801 ContainerId = container_e07_1451897008090_0009_01_000003
> 2016-01-12 14:11:49,262 DEBUG util.ProcfsBasedProcessTree 
> (ProcfsBasedProcessTree.java:updateProcessTree(274)) - [ 9801 9806 ]
> 2016-01-12 14:11:49,262 INFO  monitor.ContainersMonitorImpl 
> (ContainersMonitorImpl.java:run(458)) - Memory usage of ProcessTree 9801 for 
> container-id container_e07_1451897008090_0009_01_000003: 14.2 MB of 1 GB 
> physical memory used; 517.1 MB of 2.1 GB virtual memory used
> {code}
> The memory used by the container is lower than I expected, because the 
> monitored PIDs (9801, 9806) are only the slider-agent processes. The 
> regionserver process was excluded from monitoring.
> Here is the output of "ps axjf":
> {code}
>  9798  9801  9801  9801 ?           -1 Ss     500   0:00      \_ /bin/bash -c 
> python ./infra/agent/slider-agent/agent/main.py --label 
> container_e07_1451897008090_0009_01_000003___HBASE_REGIONSERVER --zk-quorum 
>  9801  9806  9801  9801 ?           -1 Sl     500   0:01          \_ python 
> ./infra/agent/slider-agent/agent/main.py --label 
> container_e07_1451897008090_0009_01_000003___HBASE_REGIONSERVER --zk-quorum 
>     1  9979  9801  9801 ?           -1 S      500   0:00 bash 
> /volume/nodemanager/usercache/yarn/appcache/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/app/install/hbase-0.98.13-hadoop2/bin/hbase-daemon.sh
>  --config 
> /volume/nodemanager/usercache/yarn/appcache/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/app/install/hbase-0.98.13-hadoop2/conf
>  foreground_start regionserver
>  9979  9994  9801  9801 ?           -1 Sl     500   0:10  \_ 
> /package/jdk-1.7.0_45/bin/java -Dproc_regionserver 
> -XX:OnOutOfMemoryError=kill -9 %p -Xmx1000m -XX:+UseConcMarkSweepGC 
> -XX:ErrorFile=/var/logs/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/hs_err_pid%p.log
>  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps 
> -Xloggc:/var/logs/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/gc.log-201601121408
>  -Xmn200m -XX:CMSInitiatingOccupancyFraction=70 -Xms1024m -Xmx1024m 
> -Dhbase.log.dir=/var/logs/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003
>  -Dhbase.log.file=hbase-yarn-regionserver.log 
> -Dhbase.home.dir=/volume/nodemanager/usercache/yarn/appcache/application_1451897008090_0009/container_e07_1451897008090_0009_01_000003/app/install/hbase-0.98.13-hadoop2/bin/..
>  -Dhbase.id.str=yarn -Dhbase.root.logger=INFO,RFA 
> -Djava.library.path=/package/hadoop-yarn-2.7.1-arch-centos6-x86_64/lib/native 
> -Dhbase.security.logger=INFO,RFAS 
> org.apache.hadoop.hbase.regionserver.HRegionServer start
> {code}
> With the default ProcfsBasedProcessTree, the process tree is determined by 
> the parent-child relationships between processes, so a daemonized process 
> (ppid=1) cannot be included in the process tree.
> I don't know whether this can be fixed in Slider.
> Would it need another ResourceCalculatorProcessTree implementation to replace 
> ProcfsBasedProcessTree?
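
To make the reporter's point concrete, here is a small Python sketch (not YARN code) that rebuilds the tree from the PPID/PID/PGID/SID columns of the "ps axjf" output above. `tree_by_parent` only approximates the parent-child walk described for ProcfsBasedProcessTree. `tree_by_session` is a purely hypothetical alternative grouping, shown only because all four processes share SID 9801; it is not an existing ResourceCalculatorProcessTree. The process names are shortened for readability.

```python
from collections import namedtuple

# Columns transcribed from the "ps axjf" output: PPID, PID, PGID, SID.
Proc = namedtuple("Proc", "ppid pid pgid sid name")

PROCS = [
    Proc(9798, 9801, 9801, 9801, "bash -c python main.py"),
    Proc(9801, 9806, 9801, 9801, "python main.py (slider-agent)"),
    Proc(1,    9979, 9801, 9801, "bash hbase-daemon.sh"),
    Proc(9979, 9994, 9801, 9801, "java HRegionServer"),
]


def tree_by_parent(procs, root_pid):
    """Collect root_pid and its descendants by following PPID links only."""
    children = {}
    for p in procs:
        children.setdefault(p.ppid, []).append(p.pid)
    found, stack = [], [root_pid]
    while stack:
        pid = stack.pop()
        found.append(pid)
        stack.extend(children.get(pid, []))
    return sorted(found)


def tree_by_session(procs, root_pid):
    """Hypothetical alternative: collect every process in root_pid's session."""
    by_pid = {p.pid: p for p in procs}
    sid = by_pid[root_pid].sid
    return sorted(p.pid for p in procs if p.sid == sid)


if __name__ == "__main__":
    print("by parent :", tree_by_parent(PROCS, 9801))
    print("by session:", tree_by_session(PROCS, 9801))
```

The parent-based walk yields only 9801 and 9806, matching the "[ 9801 9806 ]" line in the NodeManager log, while hbase-daemon.sh (9979, reparented to PID 1) and the regionserver JVM (9994) are missed. Grouping by session would pick them up in this particular listing, but whether that is a safe basis for container monitoring is an open question.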



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
