I like the idea of keeping the messages out of the vertices. There is a lot
of unneeded data copying/GC going on, and if this eliminates some of it, that
would be fantastic and, I think, a big help memory-wise through the whole job run.
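
A minimal sketch of the idea, assuming a simple sum-of-messages compute step.
This is plain Python, not Giraph's API; Vertex, MessageStore and superstep are
hypothetical names used only for illustration. The vertex carries no message
list. Messages sit in a separate store, one compact buffer per destination,
and are dropped as soon as the vertex has consumed them.

```python
from array import array
from collections import defaultdict


class Vertex:
    """Hypothetical vertex that holds only its own state, no message list."""
    __slots__ = ("vertex_id", "value")

    def __init__(self, vertex_id, value=0.0):
        self.vertex_id = vertex_id
        self.value = value


class MessageStore:
    """Hypothetical store that keeps messages outside the vertices.

    Each destination gets one typed array of doubles, so messages are stored
    as raw values in a single buffer rather than as one object per message.
    """

    def __init__(self):
        self._inbox = defaultdict(lambda: array("d"))

    def send(self, dest_id, message):
        self._inbox[dest_id].append(message)

    def drain(self, dest_id):
        # Hand the buffer to the caller and forget it, so it can be freed
        # once the vertex has finished computing.
        return self._inbox.pop(dest_id, array("d"))


def superstep(vertices, store):
    """Add the sum of incoming messages to each vertex value, then drop them."""
    for v in vertices:
        v.value += sum(store.drain(v.vertex_id))
```

Whether this actually reduces copying and GC pressure in a real job would
have to be measured; the sketch only shows the shape of the separation.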

On Fri, Aug 3, 2012 at 4:03 AM, Gianmarco De Francisci Morales <
g...@apache.org> wrote:

> Hi,
>
> > > Are you saying that out-of-core is faster than hitting memory boundaries
> > > (i.e. GC)?  It is a bit tough to imagine that out-of-core beats in-core
> > > =).
> >
> > That's the only explanation I could think of, honestly it sounds wrong to
> > me too. But those are the results I keep getting. If someone has a better
> > one I'd love to hear it :-)
>
>
> I am not surprised.
> Streaming sequentially from a disk is faster than random reading from
> memory [1].
> Add the GC overhead, and you get an explanation for your results.
>
> [1] The Pathologies of Big Data,
> http://queue.acm.org/detail.cfm?id=1563874
>
> Cheers,
> --
> Gianmarco
>
