The Giraph/Pregel model is based on bulk synchronous parallel (BSP) computing, where the details of how the parallelization occurs are hidden from the programmer (the infrastructure does this for you). Additionally, the APIs are built for graph processing. Since the computing model is well defined (BSP), the infrastructure can checkpoint the state of the application at the appropriate time and also handle failures without user interaction.
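
To make the split between user code and infrastructure concrete, here is a minimal single-process sketch of the BSP model, using the classic "propagate the maximum value" example. This is not the Giraph API: `compute` and `run_supersteps` are hypothetical names, and a real system would distribute the vertices across workers and checkpoint at the barriers.

```python
# Toy single-process simulation of the vertex-centric BSP model.
# NOT the Giraph API: compute() and run_supersteps() are hypothetical names.


def compute(value, messages):
    """User code: one vertex, one superstep. Returns (new_value, changed)."""
    new_value = max([value] + messages)
    return new_value, new_value != value


def run_supersteps(graph, values):
    """'Infrastructure' code: message delivery, barriers, and halting."""
    values = dict(values)
    # Superstep 0: every vertex sends its value to its neighbors.
    inbox = {v: [] for v in graph}
    for v, neighbors in graph.items():
        for n in neighbors:
            inbox[n].append(values[v])
    supersteps = 1
    # Stop once no messages are in flight (all vertices have halted).
    while any(inbox.values()):
        outbox = {v: [] for v in graph}
        for v in graph:
            if not inbox[v]:
                continue  # vertex stays halted until a message arrives
            values[v], changed = compute(values[v], inbox[v])
            if changed:
                for n in graph[v]:
                    outbox[n].append(values[v])
        # Barrier: messages sent in this superstep are only visible in the next.
        inbox = outbox
        supersteps += 1
    return values, supersteps


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    final_values, _ = run_supersteps(graph, {"a": 3, "b": 6, "c": 1})
    print(final_values)
```

The only part a user would write here is `compute`. Because all state changes happen between barriers, the end of each superstep is a natural point for the infrastructure to checkpoint.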

MPI is a much lower-level and more generic API, where messages are sent to processes. Users must pack/unpack their own messages and deliver messages to the appropriate data structures. Users must partition their own data. As of MPI 2, a failed process leaves the application in an undefined state (usually dead).
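
For contrast, here is a rough sketch of the chores that fall to the user in a message-passing program: partitioning, packing, unpacking, and delivery. It makes no MPI calls at all (the actual send/receive would go through an MPI library), and the wire format and every function name below are assumptions made up for this illustration.

```python
import struct

# Hypothetical wire format for this sketch:
# destination vertex id (little-endian int32) followed by a value (float64).
MESSAGE_FORMAT = "<id"


def owner_rank(vertex_id, num_ranks):
    """Hand-written partitioning: decide which process owns a vertex."""
    return vertex_id % num_ranks


def pack_message(dest_vertex, value):
    """Serialize one message into raw bytes before sending it."""
    return struct.pack(MESSAGE_FORMAT, dest_vertex, value)


def unpack_message(buffer):
    """Deserialize raw bytes back into (dest_vertex, value)."""
    return struct.unpack(MESSAGE_FORMAT, buffer)


def deliver(buffers, local_inbox):
    """Unpack received buffers and route each value to the right local structure."""
    for buffer in buffers:
        dest_vertex, value = unpack_message(buffer)
        local_inbox.setdefault(dest_vertex, []).append(value)


if __name__ == "__main__":
    inbox = {}
    deliver([pack_message(7, 2.5), pack_message(3, 4.0)], inbox)
    print(owner_rank(7, 4), inbox)
```

In Giraph/Pregel all of this sits below the API, which is why the user code can be limited to the per-vertex computation.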

Hope that helps,

Avery

On 8/6/13 10:19 AM, Yang wrote:
It seems that the paradigm offered by Giraph/Pregel is very similar to the programming paradigm of PVM and, to a lesser degree, MPI. Using PVM, we often engage in such "iterative cycles" where all the nodes sync on a barrier and then enter the next cycle.

So what are the extra features offered by Giraph/Pregel? I can see persistence/restarting of tasks, and maybe abstraction of the user-code-specific part into the API so that users are not concerned with the actual message passing (message passing is done by the framework).

Thanks
Yang
