The documentation on Starfish (http://www.cs.duke.edu/starfish/index.html) looks promising, though I have not used it. I wonder if others on the list have found it more useful than setting mapred.task.profile.
C

On Feb 29, 2012, at 3:53 PM, Mark question wrote:
> I've used Hadoop profiling (.prof) to show the stack trace, but it was hard
> to follow. I used jConsole locally, since I couldn't find a way to set a port
> number for child processes when running them remotely. Linux commands
> (top, /proc) showed me that the virtual memory is almost twice my
> physical memory, which means swapping is happening, which is what I'm
> trying to avoid.
>
> So basically, is there a way to assign a port to child processes to monitor
> them remotely (asked before by Xun), or would you recommend another
> monitoring tool?
>
> Thank you,
> Mark
>
>
> On Wed, Feb 29, 2012 at 11:35 AM, Charles Earl <charles.ce...@gmail.com> wrote:
>
>> Mark,
>> So if I understand, it is more the memory management that you are
>> interested in, rather than a need to run an existing C or C++ application
>> in the MapReduce platform?
>> Have you done profiling of the application?
>> C
>> On Feb 29, 2012, at 2:19 PM, Mark question wrote:
>>
>>> Thanks Charles. I'm running Hadoop for research to perform duplicate
>>> detection methods. To go deeper, I need to understand what's slowing my
>>> program, which usually starts with analyzing memory to predict the best
>>> input size for a map task. So you're saying piping can help me control
>>> memory even though it's running on a VM eventually?
>>>
>>> Thanks,
>>> Mark
>>>
>>> On Wed, Feb 29, 2012 at 11:03 AM, Charles Earl <charles.ce...@gmail.com>
>>> wrote:
>>>
>>>> Mark,
>>>> Both streaming and pipes allow this, perhaps more so pipes at the level
>>>> of the mapreduce task. Can you provide more details on the application?
>>>> On Feb 29, 2012, at 1:56 PM, Mark question wrote:
>>>>
>>>>> Hi guys, thought I should ask this before I use it ... will using C
>>>>> over Hadoop give me the usual C memory management? For example,
>>>>> malloc(), sizeof()? My guess is no, since this will all eventually be
>>>>> turned into bytecode, but I need more control over memory, which
>>>>> obviously is hard for me to do with Java.
>>>>>
>>>>> Let me know of any advantages you know about streaming in C over
>>>>> Hadoop.
>>>>> Thank you,
>>>>> Mark
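A note on the swapping question raised in the thread: a virtual size larger than physical memory does not by itself show that swapping is happening, since a JVM typically reserves more address space than it touches. A rough way to check is to read the swap counters that Linux exposes under /proc. The sketch below is only an illustration: the function names are hypothetical, and it assumes a Linux kernel whose /proc/meminfo reports SwapTotal and SwapFree and whose /proc/<pid>/status may report VmSwap (older kernels omit that field).

```python
import re

# Matches lines such as "VmRSS:      1234 kB" or "SwapFree:  2048 kB".
_KB_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB$")


def parse_kb_fields(text):
    """Return {field: kilobytes} for every 'Name: N kB' line in text."""
    fields = {}
    for line in text.splitlines():
        match = _KB_LINE.match(line.strip())
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def swap_used_kb(meminfo_text):
    """System-wide swap in use, from the contents of /proc/meminfo."""
    fields = parse_kb_fields(meminfo_text)
    return fields["SwapTotal"] - fields["SwapFree"]


def process_memory_kb(status_text):
    """Virtual, resident and swapped-out size of one process, in kB."""
    fields = parse_kb_fields(status_text)
    # A missing field (e.g. VmSwap on older kernels) is reported as None.
    return fields.get("VmSize"), fields.get("VmRSS"), fields.get("VmSwap")


if __name__ == "__main__":
    import sys

    # Pass the pid of a task JVM as the first argument; defaults to this process.
    pid = sys.argv[1] if len(sys.argv) > 1 else "self"
    with open("/proc/meminfo") as f:
        print("swap used (kB):", swap_used_kb(f.read()))
    with open("/proc/%s/status" % pid) as f:
        print("VmSize, VmRSS, VmSwap (kB):", process_memory_kb(f.read()))
```

Run on a task node with the pid of a child JVM, a non-zero VmSwap for that process (or system-wide swap use that grows while the job runs) is a more direct sign of swapping than comparing virtual and physical sizes.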