Dear Avery,

Regarding this decision about resource allocation, do you have a
methodology or rule of thumb that helps you decide which configuration
is likely to perform well?
For example, with a given input (number of graph vertices), can you
estimate what number of workers and how much memory per worker would be
optimal? Or the other way around: given a pool of resources (cores &
memory), what's a reasonable graph size?
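
To make the question concrete, here is the kind of back-of-envelope
estimate I have in mind. The per-vertex and per-edge byte costs and the
usable heap per worker below are purely illustrative assumptions on my
part, not measured Giraph figures:

// Rough sizing sketch; all constants are assumed, not measured.
public class GiraphSizingSketch {
    static final long BYTES_PER_VERTEX = 200;  // assumed in-memory cost per vertex
    static final long BYTES_PER_EDGE = 50;     // assumed in-memory cost per edge
    static final long HEAP_PER_WORKER = 8L * 1024 * 1024 * 1024;  // assume ~8 GB usable heap

    // Estimate how many workers would be needed to hold the whole graph in memory.
    static long estimateWorkers(long vertices, long edges) {
        long totalBytes = vertices * BYTES_PER_VERTEX + edges * BYTES_PER_EDGE;
        return (totalBytes + HEAP_PER_WORKER - 1) / HEAP_PER_WORKER;  // round up
    }

    public static void main(String[] args) {
        // e.g. 1 billion vertices, 10 billion edges -> ~82 workers under these assumptions
        System.out.println(estimateWorkers(1_000_000_000L, 10_000_000_000L));
    }
}

Is that roughly how you reason about it, or do other factors (message
traffic, number of supersteps) dominate in practice?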

That insight would be really interesting.

Thanks,
Alexandros

On 11 December 2012 19:40, Avery Ching <ach...@apache.org> wrote:

> We are running several Giraph applications in production using our version
> of Hadoop (Corona) at Facebook.  The part you have to be careful about is
> ensuring you have enough resources for your job to run.  But otherwise, we
> are able to run at FB-scale (i.e. 1 billion+ nodes, many more edges).
>
> Avery
>
>
> On 12/11/12 5:58 AM, Gustavo Enrique Salazar Torres wrote:
>
>> Hi:
>>
>> I implemented a graph algorithm to recommend content to our users.
>> Although it is working (the implementation uses Mahout), it is very
>> inefficient because I have to run many iterations in order to perform
>> a breadth-first search on my graph.
>> I would like to use Giraph for that task. I would like to know if it is
>> production ready. I'm running jobs on Amazon EMR.
>>
>> Thanks in advance.
>> Gustavo
>>
>
>
