Hi Prabhu,

Unfortunately, I don't believe that is the same issue we are seeing. We are
experiencing high CPU usage, and we are not getting OOM errors.

Is there reason to believe they're the same issue?
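
For anyone who wants to check for the symptom from outside the UI, here is a minimal sketch, assuming the RM REST API is reachable. It lists applications the RM reports as RUNNING that hold no running containers. The host name and the helper names (fetch_json, apps_without_containers, report) are hypothetical, and I have not confirmed that this check isolates the problem; it only compares what the RM claims against container counts.

```python
import json
from urllib.request import Request, urlopen

# Hypothetical address: replace with the active RM's web address.
RM_URL = "http://rm-host.example.com:8088"


def fetch_json(path):
    """GET a ResourceManager REST endpoint and parse the JSON body."""
    req = Request(RM_URL + path, headers={"Accept": "application/json"})
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)


def apps_without_containers(apps_response):
    """Return ids of apps listed as RUNNING that hold no running containers."""
    # The RM returns {"apps": null} when no application matches the query.
    apps = (apps_response.get("apps") or {}).get("app", [])
    return [
        app["id"]
        for app in apps
        # A missing runningContainers field is treated as zero (an assumption).
        if app.get("state") == "RUNNING" and app.get("runningContainers", 0) <= 0
    ]


def report():
    """Print cluster-wide container count next to the suspicious app ids."""
    running = fetch_json("/ws/v1/cluster/apps?states=RUNNING")
    metrics = fetch_json("/ws/v1/cluster/metrics")["clusterMetrics"]
    print("containersAllocated:", metrics.get("containersAllocated"))
    print("RUNNING apps with no containers:", apps_without_containers(running))
```

Calling report() while the RM is in the bad state would show whether the apps on the dashboard still hold containers according to the RM itself.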


On Tue, Apr 2, 2019, 2:15 AM Prabhu Josephraj <pjos...@cloudera.com> wrote:

> Hi George,
>
>     I have seen this issue: the RM UI shows the old job list and the RM
> process heap usage is high. This is due to a bug fixed by YARN-7163.
> Can you test with the patch from YARN-7163?
>
> Thanks,
> Prabhu Joseph
>
>
> On Tue, Apr 2, 2019 at 4:59 AM George Liaw <george.a.l...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> Using Hadoop 2.7.2.
>> Wondering if anyone's seen an issue before where every once in a while
>> the resource manager gets into a weird state where the Applications
>> dashboard shows jobs running, but there are no actual jobs running on the
>> cluster. When this happens we'll see RM CPU usage flat-lining at very high
>> levels (around 85%), but the DataNodes/NodeManagers will have no load
>> because no jobs are running. If we restart the RM and let it fail over to
>> the stand-by, the cluster will go back to normal behavior and start running
>> jobs again after 15-30 minutes.
>>
>> Bit of a strange situation - not entirely sure why the RM would fail to
>> realize that the jobs have finished running and that the jobs sitting in
>> ACCEPTED state are free to run. Also strange that the RM gets stuck at high
>> CPU usage.
>>
>> If anyone can point me in the right direction on how to debug or resolve
>> this, that would be much appreciated!
>>
>> --
>> George A. Liaw
>>
>> (408) 318-7920
>> george.a.l...@gmail.com
>> LinkedIn <http://www.linkedin.com/in/georgeliaw/>
>>
>
