Hi George,

I have seen this issue: the RM UI shows the old job list and the RM process heap usage is high. This is due to a bug fixed by YARN-7163. Can you test with the patch from YARN-7163?
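
To confirm the symptom before patching, one rough check is to compare what the RM reports as running against what is actually allocated. The sketch below is only a heuristic, not part of YARN-7163: it reads the RM REST endpoint /ws/v1/cluster/metrics and flags the case where appsRunning is non-zero while containersAllocated is zero. The function names and the host in the usage note are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_cluster_metrics(rm_url):
    """Read the clusterMetrics object from the RM REST API.

    rm_url is the RM web address, e.g. "http://<rm-host>:8088".
    """
    url = rm_url.rstrip("/") + "/ws/v1/cluster/metrics"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)["clusterMetrics"]


def looks_stale(metrics):
    """Heuristic only (an assumption, not an official check).

    A running application holds at least its AM container, so
    "apps running" together with "no containers allocated" suggests
    the RM is reporting a stale job list.
    """
    running = metrics.get("appsRunning", 0)
    containers = metrics.get("containersAllocated", 0)
    return running > 0 and containers == 0
```

For example, looks_stale(fetch_cluster_metrics("http://rm-host:8088")) returning True while the NodeManagers are idle would match the behaviour described, where rm-host is a placeholder for your active RM.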

Thanks,
Prabhu Joseph

On Tue, Apr 2, 2019 at 4:59 AM George Liaw <george.a.l...@gmail.com> wrote:

> Hi all,
>
> Using Hadoop 2.7.2.
> Wondering if anyone's seen an issue before where every once in a while the
> resource manager gets into a weird state where the Applications dashboard
> shows jobs running, but there are no actual jobs running on the cluster.
> When this happens we'll see RM cpu usage flat-lining at very high levels
> (around 85%), but the datanodes/nodemanagers will have no load because of
> no jobs running. If we restart the RM and let it fail over to the stand-by,
> the cluster will go back to normal behavior and start running jobs again
> after 15-30 minutes.
>
> Bit of a strange situation - not entirely sure why the RM would fail to
> realize that the jobs have finished running and that the jobs sitting in
> accepted state are free to run. Also strange that the RM gets stuck at high
> cpu usage.
>
> If anyone can point me in the right direction on how to debug or resolve
> this, that would be much appreciated!
>
> --
> George A. Liaw
>
> (408) 318-7920
> george.a.l...@gmail.com
> LinkedIn <http://www.linkedin.com/in/georgeliaw/>