[ 
https://issues.apache.org/jira/browse/AURORA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16095732#comment-16095732
 ] 

Reza Motamedi edited comment on AURORA-1939 at 7/21/17 3:42 AM:
----------------------------------------------------------------

Following [~StephanErb]'s suggestion, I tried to guard psutil's `oneshot` as a 
critical section. However, it does not seem to work:


{code}
...
from threading import Lock
...

oneshot_lock = Lock()

def process_to_sample(process):
  """ Given a psutil.Process, return a current ProcessSample """
  try:
    with oneshot_lock:
      with process.oneshot():
        # the nonblocking get_cpu_percent call is stateful on a particular Process object, and hence
        # >2 consecutive calls are required before it will return a non-zero value
        rate = process.cpu_percent(0.0) / 100.0
        cpu_times = process.cpu_times()
...
{code}




> Thermos landing (host) page reports incorrect CPU rates when it is busy
> -----------------------------------------------------------------------
>
>                 Key: AURORA-1939
>                 URL: https://issues.apache.org/jira/browse/AURORA-1939
>             Project: Aurora
>          Issue Type: Bug
>            Reporter: Reza Motamedi
>            Priority: Minor
>
> Thermos Observer uses `psutil` to monitor resource consumption of Thermos 
> Processes. On a busy machine, I have noticed negative CPU values when 
> visiting the Thermos landing page.
> In my test I reproduced this by starting many processes that constantly 
> create short-lived children. This indicates that between the time 
> `process_collector_psutil` looks up the Process children and the time it 
> calculates the CPU time, the pid of the child is reused by another, 
> much younger process, which leads to negative CPU times.
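
To make the failure mode in the description concrete, here is a minimal sketch of how a reused pid can produce a negative rate, and one possible guard. It is not the Thermos implementation: `Sample`, `naive_cpu_rate`, and `cpu_rate` are hypothetical names, and the guard assumes each sample also records the process start time (with psutil that value could come from `Process.create_time()`).

```python
from collections import namedtuple

# Hypothetical sample record for illustration; not the Thermos ProcessSample type.
# cpu_seconds is cumulative CPU time, timestamp is when the sample was taken.
Sample = namedtuple("Sample", ["pid", "create_time", "cpu_seconds", "timestamp"])


def naive_cpu_rate(previous, current):
    """Rate computed from the pid alone; goes negative if the pid was reused."""
    elapsed = current.timestamp - previous.timestamp
    return (current.cpu_seconds - previous.cpu_seconds) / elapsed


def cpu_rate(previous, current):
    """Return the CPU rate between two samples, or None if they are not comparable."""
    if previous.pid != current.pid:
        return None
    if previous.create_time != current.create_time:
        # Same pid but a different start time: the pid now belongs to a
        # younger process, so the cumulative CPU times are unrelated.
        return None
    elapsed = current.timestamp - previous.timestamp
    if elapsed <= 0:
        return None
    return (current.cpu_seconds - previous.cpu_seconds) / elapsed


# An old child with 12s of accumulated CPU time, then a new process that
# received the same pid and has only used 0.5s so far.
old = Sample(pid=4242, create_time=1000.0, cpu_seconds=12.0, timestamp=1060.0)
reused = Sample(pid=4242, create_time=1060.5, cpu_seconds=0.5, timestamp=1061.0)

print(naive_cpu_rate(old, reused))  # negative value
print(cpu_rate(old, reused))        # None: the sample pair is discarded
```

Discarding the pair is only one option; a collector could instead treat the new process as starting from zero. Either way, the check relies on the start time rather than on a lock, since a lock inside the observer cannot prevent the kernel from recycling a pid.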



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
