> If my Groom Child Process fails for some reason, the processes are not killed
> automatically

I have experienced this problem before too. My guess is that when one of the
processes crashes with an OutOfMemory error, the other processes wait for it
indefinitely. This is a bug.

On Sat, Aug 29, 2015 at 1:02 AM, Behroz Sikander <[email protected]> wrote:
> Just another quick question. If my Groom Child Process fails for some
> reason, the processes are not killed automatically. If I run the JPS command, I
> can still see something like "3791 GroomServer$BSPPeerChild". Is this the
> expected behavior?
>
> I am using the latest hama version (0.7.0).
> Regards,
> Behroz
>
> On Fri, Aug 28, 2015 at 4:12 PM, Behroz Sikander <[email protected]> wrote:
>
>> Ok I will try it out.
>>
>> No, actually I am learning a lot by facing these problems. It is actually a
>> good thing :D
>>
>> Regards,
>> Behroz
>>
>> On Fri, Aug 28, 2015 at 5:52 AM, Edward J. Yoon <[email protected]>
>> wrote:
>>
>>> > message managers. Hmmm, I will recheck my logic related to messages. Btw
>>>
>>> Serialization (like GraphJobMessage) is a good idea. It stores multiple
>>> messages in serialized form in a single object to reduce the memory
>>> usage and RPC overhead.
>>>
>>> > what is the limit of these message managers ? How much data at a single
>>> > time they can handle ?
>>>
>>> It depends on memory.
>>>
>>> > P.S. Each day, as I am moving towards a big cluster I am running into
>>> > problems (a lot of them :D).
>>>
>>> Haha, sorry for the inconvenience and thanks for your reports.
>>>
>>> On Fri, Aug 28, 2015 at 11:25 AM, Behroz Sikander <[email protected]>
>>> wrote:
>>> > Ok. So, I do have a memory problem. I will try to scale out.
>>> >
>>> > *>>Each task processor has two message managers, one for outgoing and one
>>> > for incoming. All these are handled in memory, so it sometimes requires
>>> > large memory space.*
>>> >
>>> > So, you mean that before barrier synchronization, I have a lot of data in
>>> > message managers. Hmmm, I will recheck my logic related to messages. Btw
>>> > what is the limit of these message managers ? How much data at a single
>>> > time they can handle ?
>>> >
>>> > P.S. Each day, as I am moving towards a big cluster I am running into
>>> > problems (a lot of them :D).
>>> >
>>> > Regards,
>>> > Behroz Sikander
>>> >
>>> > On Fri, Aug 28, 2015 at 4:04 AM, Edward J. Yoon <[email protected]>
>>> > wrote:
>>> >
>>> >> > for 3 Groom child process + 2GB for Ubuntu OS). Is this correct
>>> >> > understanding ?
>>> >>
>>> >> and,
>>> >>
>>> >> > on a big dataset. I think these exceptions have something to do with Ubuntu
>>> >> > OS killing the hama process due to lack of memory. So, I was curious about
>>> >>
>>> >> Yes, you're right.
>>> >>
>>> >> Each task processor has two message managers, one for outgoing and one
>>> >> for incoming. All these are handled in memory, so it sometimes
>>> >> requires large memory space. To solve the OutOfMemory issue, you
>>> >> should scale out your cluster by increasing the number of nodes and
>>> >> job tasks, or optimize your algorithm. Another option is a
>>> >> disk-spillable message manager. This is not supported yet.
>>> >>
>>> >> On Fri, Aug 28, 2015 at 10:45 AM, Behroz Sikander <[email protected]>
>>> >> wrote:
>>> >> > Hi,
>>> >> > Yes. According to hama-default.xml, each machine will open 3 processes with
>>> >> > 2GB memory each. This means that my VMs need at least 8GB memory (2GB each
>>> >> > for 3 Groom child processes + 2GB for Ubuntu OS). Is this correct
>>> >> > understanding ?
>>> >> >
>>> >> > I recently ran into the following exceptions when I was trying to run hama
>>> >> > on a big dataset. I think these exceptions have something to do with Ubuntu
>>> >> > OS killing the hama process due to lack of memory. So, I was curious about
>>> >> > my configurations.
>>> >> > 'BSP task process exit with nonzero status of 137.'
>>> >> > 'BSP task process exit with nonzero status of 1'
>>> >> >
>>> >> > Regards,
>>> >> > Behroz
>>> >> >
>>> >> > On Fri, Aug 28, 2015 at 3:04 AM, Edward J. Yoon <[email protected]>
>>> >> > wrote:
>>> >> >
>>> >> >> Hi,
>>> >> >>
>>> >> >> You can change the max tasks per node by setting the property below in
>>> >> >> hama-site.xml. :-)
>>> >> >>
>>> >> >> <property>
>>> >> >>   <name>bsp.tasks.maximum</name>
>>> >> >>   <value>3</value>
>>> >> >>   <description>The maximum number of BSP tasks that will be run
>>> >> >>   simultaneously by a groom server.</description>
>>> >> >> </property>
>>> >> >>
>>> >> >> On Fri, Aug 28, 2015 at 5:18 AM, Behroz Sikander <[email protected]>
>>> >> >> wrote:
>>> >> >> > Hi,
>>> >> >> > Recently, I noticed that my hama deployment is only opening 3 processes per
>>> >> >> > machine. This is because of the configuration settings in the default hama
>>> >> >> > file.
>>> >> >> >
>>> >> >> > My question is why 3 and why not 5 or 7 ? What criteria should be
>>> >> >> > considered if I want to increase the value ?
>>> >> >> >
>>> >> >> > Regards,
>>> >> >> > Behroz
>>> >> >>
>>> >> >> --
>>> >> >> Best Regards, Edward J. Yoon
>>> >>
>>> >> --
>>> >> Best Regards, Edward J. Yoon
>>>
>>> --
>>> Best Regards, Edward J. Yoon

--
Best Regards, Edward J. Yoon
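
The memory estimate and the two exit codes discussed in this thread can be checked with a short sketch. It is only an illustration: the function names are hypothetical and not part of Hama, and the inputs (3 tasks per groom, 2GB per child process, 2GB reserved for the OS) are the figures quoted in the thread. The signal decoding assumes the common Unix convention that a process killed by signal N is reported with exit status 128 + N, so 137 would correspond to signal 9 (SIGKILL). That is consistent with the OS killing the task for lack of memory, but does not prove it.

```python
# Shells and many launchers report "killed by signal N" as exit status 128 + N.
SIGNAL_EXIT_BASE = 128


def required_node_memory_gb(max_tasks, child_heap_gb, os_reserve_gb):
    """Rough per-node memory estimate: one heap per BSP task plus an OS reserve."""
    return max_tasks * child_heap_gb + os_reserve_gb


def signal_from_exit_status(status):
    """Return the signal number implied by an exit status, or None if the
    status does not look like a signal-induced exit (assumed 128 + N rule)."""
    if status > SIGNAL_EXIT_BASE:
        return status - SIGNAL_EXIT_BASE
    return None


if __name__ == "__main__":
    # Figures from the thread: bsp.tasks.maximum = 3, 2GB per child, 2GB for the OS.
    print(required_node_memory_gb(3, 2, 2))
    # The two statuses reported in the thread.
    print(signal_from_exit_status(137))
    print(signal_from_exit_status(1))
```

With the thread's figures the estimate comes to 8GB, matching the number given above. Status 1 does not map to a signal under this convention, so it more likely indicates an ordinary failure inside the task itself.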
