Hi Edward,

A memory problem was the first thing I suspected too, so I increased bsp.child.java.opts to 8 GB and the heap of the GroomServers to 4 GB. After that, the 84-task run worked, but the 60-task run still fails as described earlier. Should I give it even more memory? I would think these amounts per task and per GroomServer should be enough.
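
For reference, here is a rough sketch of the per-node heap budget these settings imply. It assumes all 12 task slots on a node are in use and every child JVM grows to its full 8 GB; it counts heap only, not JVM overhead or the OS. The class and method names are hypothetical and not part of Hama.

```java
public class HeapBudget {
    // Worst-case heap demand on one node, in GB:
    // every task slot at its maximum heap, plus the GroomServer heap.
    static long worstCaseHeapGb(int tasksPerNode, long childHeapGb, long groomHeapGb) {
        return tasksPerNode * childHeapGb + groomHeapGb;
    }

    public static void main(String[] args) {
        int tasksPerNode = 12;   // maximum tasks per node in my setup
        long childHeapGb = 8;    // max heap set via bsp.child.java.opts
        long groomHeapGb = 4;    // GroomServer heap
        long nodeRamGb = 128;    // physical RAM per machine

        long demand = worstCaseHeapGb(tasksPerNode, childHeapGb, groomHeapGb);
        System.out.println("Worst-case heap per node: " + demand + " GB of " + nodeRamGb + " GB");
        System.out.println("Fits in RAM: " + (demand <= nodeRamGb));
    }
}
```

By this estimate a fully loaded node needs about 100 GB of heap out of 128 GB, so the configured heap sizes should fit in physical memory.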

Regards,
Steven

On Wed, Nov 20, 2013 at 12:16 PM, Edward J. Yoon <[email protected]> wrote:

> > The only case the program does run, is when I use the maximum number of
> > machines (i.e. 7 machines, with 12 cores, 128GB ram..). I set the maximum
> > number of tasks to 12 per node, thus 84. But when I force the program to run
> > with 60 tasks, the "Job Failed" comes up with no additional info.
>
> Your case looks like a memory problem. Can you check the memory space
> during job execution? or try to increase the max heap of BSP child
> JVM.
>
> > the "Job Failed" comes up with no additional info.
>
> Sorry for the inconvenience, i'll check it out and see what's wrong.
>
> On Wed, Nov 20, 2013 at 6:22 PM, Steven van Beelen <[email protected]> wrote:
> > I have a very similar problem as Anveshi Charuvaka is mailing about.
> >
> > What I found additionally when I set task logging to DEBUG mode, is that the
> > DEBUG logs get interrupted at same point and replaced with the "INFO
> > bsp.BSPJobClient: Job failed." message.
> > My program works in local, distributed and pseudo mode, so that's probably
> > not the issue.
> >
> > The only case the program does run, is when I use the maximum number of
> > machines (i.e. 7 machines, with 12 cores, 128GB ram..). I set the maximum
> > number of tasks to 12 per node, thus 84. But when I force the program to run
> > with 60 tasks, the "Job Failed" comes up with no additional info.
> >
> > Last note: I'm running an Inverted Indexing algorithm with a data set of
> > approximately 17 GB.
> > Could someone help me with this?
> >
> > Regards, Steven
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
