understanding my failing job, Giraph/Hadoop memory usage, under-utilized nodes, and moving forward

2014-09-22 Thread Matthew Cornell
Hi Folks, I've spent the last two months learning, installing, coding, and analyzing the performance of our Giraph app, and I'm able to run it on small inputs on our tiny cluster (yay!). I am now stuck trying to figure out why larger inputs fail, why only some compute nodes are being used, and ...

Re: understanding my failing job, Giraph/Hadoop memory usage, under-utilized nodes, and moving forward

2014-09-22 Thread Matthew Saltz
Sorry, that should be org.apache.giraph.utils.MemoryUtils.getRuntimeMemoryStats(); I left out the 'giraph'. On Mon, Sep 22, 2014 at 8:10 PM, Matthew Saltz sal...@gmail.com wrote: Hi Matthew, I answered a few of your questions in-line (unfortunately they might not help the larger problem, but ...
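
[Editor's note: a minimal sketch of how that utility might be used to watch per-worker heap consumption from inside a job. Only the MemoryUtils call itself comes from this thread; the computation class, its type parameters, and the logging placement are hypothetical.]

    import org.apache.giraph.graph.BasicComputation;
    import org.apache.giraph.graph.Vertex;
    import org.apache.giraph.utils.MemoryUtils;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;

    /**
     * Hypothetical computation that prints this worker's JVM heap usage
     * once per superstep. getRuntimeMemoryStats() formats the free/total/max
     * heap figures from java.lang.Runtime into a single string.
     */
    public class MemoryLoggingComputation extends BasicComputation<
        LongWritable, DoubleWritable, DoubleWritable, DoubleWritable> {

      @Override
      public void preSuperstep() {
        // Runs once per worker at the start of each superstep, so the
        // logs show how heap usage grows as the job progresses.
        System.out.println("Superstep " + getSuperstep() + ": "
            + MemoryUtils.getRuntimeMemoryStats());
      }

      @Override
      public void compute(
          Vertex<LongWritable, DoubleWritable, DoubleWritable> vertex,
          Iterable<DoubleWritable> messages) {
        vertex.voteToHalt();
      }
    }

The stats end up in each map task's stdout log, which is one way to see whether a particular worker is the one running out of memory.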

Re: understanding my failing job, Giraph/Hadoop memory usage, under-utilized nodes, and moving forward

2014-09-22 Thread Matthew Saltz
Hi Matthew, I answered a few of your questions in-line (unfortunately they might not help with the larger problem, but hopefully they'll help a bit). Best, Matthew On Mon, Sep 22, 2014 at 5:50 PM, Matthew Cornell m...@matthewcornell.org wrote: Hi Folks, I've spent the last two months learning, ...

The relation between the number of partitions, number of workers, number of mappers

2014-09-22 Thread xuhong zhang
I know that the number of mappers equals the number of workers * mapred.tasktracker.map.tasks.maximum. How about the number of partitions? Thanks -- Xuhong Zhang

Re: The relation between the number of partitions, number of workers, number of mappers

2014-09-22 Thread Lukas Nalezenec
Hi,
Number of mappers = number of workers.
Number of partitions = multiplier * (number of workers)^2 by default (multiplier = 1 by default).
Lukas
On 22.9.2014 23:18, xuhong zhang wrote: I know that the number of mappers equals the number of workers * ...
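
[Editor's note: a toy arithmetic sketch of the default Lukas describes. The class and method names here are invented for illustration and are not Giraph API.]

    /** Toy illustration of the default partition count (names invented). */
    public final class PartitionMath {

      // Default in Lukas's reply: multiplier * workers^2, multiplier = 1.
      static int defaultPartitionCount(int workers, int multiplier) {
        return multiplier * workers * workers;
      }

      public static void main(String[] args) {
        // With the default multiplier of 1, 10 workers give
        // 1 * 10 * 10 = 100 partitions, i.e. 10 partitions per worker.
        System.out.println(defaultPartitionCount(10, 1));  // prints 100
      }
    }

So under the defaults each worker holds roughly as many partitions as there are workers, and the total partition count grows quadratically as workers are added.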