Can I combine the Spilling Queue with the Sorted Message Queue (e.g. conf.set(MessageManager.QUEUE_TYPE_CLASS, "org.apache.hama.bsp.message.queue.SortedMessageQueue");)? My implementation expects the messages to be received sorted, hence the question.
My program has only one superstep. It is an implementation of inverted indexing that first reads in a Sequence File consisting of <key, value> pairs, where the key is a Text object and the value an IntWritable. The program parses each Text object and stores every separate word together with its frequency. After each document, it sends a message to another peer containing the word, the document id and the frequency. Once all the documents have been worked through, sync() is called. After that, a list is created for every word, consisting of all the <document_id, frequency> pairs found.

On Wed, Nov 20, 2013 at 2:40 PM, Edward J. Yoon <[email protected]> wrote:

> Why don't you use Spilling Queue? Then, it'll work without no problem.
>
> >> > Last note: I'm running an Inverted Indexing algorithm with a data set of
> >> > approximately 17 GB.
>
> How many supersteps is needed? If your job is too
> communication-intensive, maybe you should consider another approach.
>
> On Wed, Nov 20, 2013 at 10:14 PM, Steven van Beelen <[email protected]> wrote:
> > Hi Edward,
> >
> > That was the issue I was thinking of first. So, I increased
> > bsp.child.java.opts to 8Gb and that of the Groomservers to 4Gb.
> > After that, the 84-tasks run worked, but with 60 tasks it fails as said
> > above.
> > Should I give it more memory? I would think that these amounts per
> > task/Groomserver should be enough.
> >
> > Regars, Steven
> >
> > On Wed, Nov 20, 2013 at 12:16 PM, Edward J. Yoon <[email protected]> wrote:
> >
> >> > The only case the program does run, is when I use the maximum number of
> >> > machines (i.e. 7 machines, with 12 cores, 128GB ram..). I set the maximum
> >> > number of tasks to 12 per node, thus 84. But when I force the program to run
> >> > with 60 tasks, the "Job Failed" comes up with no additional info.
> >>
> >> Your case looks like a memory problem. Can you check the memory space
> >> during job execution? or try to increase the max heap of BSP child
> >> JVM.
> >>
> >> > the "Job Failed" comes up with no additional info.
> >>
> >> Sorry for the inconvenience, i'll check it out and see what's wrong.
> >>
> >> On Wed, Nov 20, 2013 at 6:22 PM, Steven van Beelen <[email protected]> wrote:
> >> > I have a very similar problem as Anveshi Charuvaka is mailing about.
> >> >
> >> > What I found additionally when I set task logging to DEBUG mode, is that the
> >> > DEBUG logs get interrupted at same point and replaced with the "INFO
> >> > bsp.BSPJobClient: Job failed." message.
> >> > My program works in local, distributed and pseudo mode, so that's probably
> >> > not the issue.
> >> >
> >> > The only case the program does run, is when I use the maximum number of
> >> > machines (i.e. 7 machines, with 12 cores, 128GB ram..). I set the maximum
> >> > number of tasks to 12 per node, thus 84. But when I force the program to run
> >> > with 60 tasks, the "Job Failed" comes up with no additional info.
> >> >
> >> > Last note: I'm running an Inverted Indexing algorithm with a data set of
> >> > approximately 17 GB.
> >> > Could someone help me with this?
> >> >
> >> > Regards, Steven
> >>
> >> --
> >> Best Regards, Edward J. Yoon
> >> @eddieyoon
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
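
For reference, the step after sync() described at the top of this message can be sketched in plain Java roughly as follows. This is only an illustration and not the actual job code: it does not use the Hama API at all, the Posting class and the buildIndex method are hypothetical names, and it assumes one message per (word, document) pair. A TreeMap keeps the words ordered, which is the kind of ordering that receiving the messages already sorted would make cheaper to produce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class InvertedIndexSketch {

    /** Hypothetical stand-in for one received message: word, document id, frequency. */
    public static final class Posting {
        public final String word;
        public final int docId;
        public final int frequency;

        public Posting(String word, int docId, int frequency) {
            this.word = word;
            this.docId = docId;
            this.frequency = frequency;
        }
    }

    /**
     * Groups the received messages into one list per word.
     * Each list element is a two-element array: {document id, frequency}.
     */
    public static Map<String, List<int[]>> buildIndex(Iterable<Posting> received) {
        // TreeMap keeps the words in sorted order regardless of arrival order.
        Map<String, List<int[]>> index = new TreeMap<>();
        for (Posting p : received) {
            index.computeIfAbsent(p.word, w -> new ArrayList<>())
                 .add(new int[] {p.docId, p.frequency});
        }
        return index;
    }

    public static void main(String[] args) {
        // Made-up sample messages, as they might arrive after sync().
        List<Posting> received = Arrays.asList(
            new Posting("hama", 1, 2),
            new Posting("bsp", 1, 1),
            new Posting("hama", 2, 5));

        for (Map.Entry<String, List<int[]>> e : buildIndex(received).entrySet()) {
            StringBuilder line = new StringBuilder(e.getKey()).append(" ->");
            for (int[] pair : e.getValue()) {
                line.append(" <").append(pair[0]).append(", ").append(pair[1]).append(">");
            }
            System.out.println(line);
        }
    }
}
```

If the messages already arrive sorted by word, the map lookup per message could be replaced by a single pass that starts a new list whenever the word changes, which is why I am asking about the sorted queue.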
