Hi Vinay,

In the .out files I could see nothing other than the output of ulimit -a. Do I need to enable any other kind of logging to get more information?

Thanks,
Kishore

On Mon, Dec 16, 2013 at 5:41 PM, Vinayakumar B <vinayakuma...@huawei.com> wrote:

> Hi Krishna,
>
> Please check the daemons' .out files as well. You may find something.
>
> Cheers,
> Vinayakumar B
>
> *From:* Krishna Kishore Bonagiri [mailto:write2kish...@gmail.com]
> *Sent:* 16 December 2013 16:50
> *To:* user@hadoop.apache.org
> *Subject:* Re: Yarn -- one of the daemons getting killed
>
> Hi Vinod,
>
> Yes, I am running on Linux.
>
> I was actually searching for a corresponding message in /var/log/messages
> to confirm that the OOM killer killed my daemons, but could not find any
> such message there! According to the following link, it looks like if it is
> a memory issue, I should see a message even if OOM is disabled, but I don't
> see it.
>
> http://www.redhat.com/archives/taroon-list/2007-August/msg00006.html
>
> And, is memory consumption higher on a two-node cluster than on a
> single-node one? Also, I see this problem only when I give "*" as the node
> name.
>
> One other thing I suspected was the allowed number of user processes. I
> increased that to 31000 from 1024, but that also didn't help.
>
> Thanks,
> Kishore
>
> On Fri, Dec 13, 2013 at 11:51 PM, Vinod Kumar Vavilapalli <
> vino...@hortonworks.com> wrote:
>
> Yes, that is what I suspect. That is why I asked if everything is on a
> single node. If you are running Linux, the Linux OOM killer may be shooting
> things down. When that happens, you will see something like "killed process"
> in the system's syslog.
>
> Thanks,
> +Vinod
>
> On Dec 13, 2013, at 4:52 AM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
> Vinod,
>
> One more thing I observed is that my Client, which submits Application
> Masters one after another continuously, also gets killed sometimes. So, it
> is always one of the Java processes that is getting killed. Does that
> indicate excessive memory usage by them, or something like that, that is
> causing them to die? If so, how can we resolve this kind of issue?
>
> Thanks,
> Kishore
>
> On Fri, Dec 13, 2013 at 10:16 AM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
> No, I am running on a 2-node cluster.
>
> On Fri, Dec 13, 2013 at 1:52 AM, Vinod Kumar Vavilapalli <
> vino...@hortonworks.com> wrote:
>
> Is all of this on a single node?
>
> Thanks,
> +Vinod
>
> On Dec 12, 2013, at 3:26 AM, Krishna Kishore Bonagiri <
> write2kish...@gmail.com> wrote:
>
> Hi,
>
> I am running a small application on YARN (2.2.0) in a loop of 500 times,
> and while doing so one of the daemons (node manager, resource manager, or
> data node) is getting killed (I mean disappearing) at a random point. I see
> no information in the corresponding log files. How can I find out why this
> is happening?
>
> One more observation is that this happens only when I am using "*" for the
> node name in the container requests; when I use a specific node name,
> everything is fine.
>
> Thanks,
> Kishore
>
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity
> to which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.
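
For anyone following this thread later: Vinod's suggestion to look for "killed process" in the syslog can be scripted. The snippet below is only a rough sketch, not something posted by the participants. It assumes the kernel writes lines of the form "Killed process <pid> (<name>) ..." and that they end up in /var/log/messages (the file Kishore mentions); the function names find_oom_kills and scan_log are made up for illustration. Depending on the distribution, kernel messages may land in a different file or only in dmesg, so an empty result does not by itself rule out the OOM killer.

```python
import re
from typing import Iterable, List, Tuple

# Assumed kernel message shape: "Killed process 1234 (java) total-vm:..."
# Matching is case-insensitive because the capitalisation varies by kernel.
OOM_KILL_RE = re.compile(r"killed process (\d+) \(([^)]+)\)", re.IGNORECASE)


def find_oom_kills(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (pid, process name, full line) for each line reporting a kill."""
    hits = []
    for line in lines:
        match = OOM_KILL_RE.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2), line.rstrip("\n")))
    return hits


def scan_log(path: str = "/var/log/messages") -> List[Tuple[int, str, str]]:
    """Scan one syslog file; the default path is the one named in the thread."""
    with open(path, errors="replace") as handle:
        return find_oom_kills(handle)


if __name__ == "__main__":
    try:
        for pid, name, line in scan_log():
            print(f"pid={pid} name={name}: {line}")
    except OSError as exc:
        # The file may be missing or readable only by root on some systems.
        print(f"could not read log: {exc}")
```

If the killed daemons show up here with the name java, that would support the memory explanation; if nothing shows up, other causes such as an external kill signal are still open.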