Hi, I know MapReduce/Hadoop and am trying to get my head around BSP (Hama/Giraph) by comparing MR and BSP.
- The Map phase in MR is similar to the computation phase in BSP. BSP allows processes to exchange data in the communication phase, but there is no communication between mappers in the Map phase, although data does flow from Map tasks to Reduce tasks. Please correct me if I am wrong. Are there any other significant differences?

- After going through the documentation for Hama and Giraph, I noticed that both use Hadoop as the underlying framework, and in both an MR job is submitted. Does each superstep in BSP correspond to a job in MR? Where are the incoming and outgoing messages and the state stored: HDFS, HBase, locally, or is it pluggable?

- If a vertex is deactivated and then activated again after receiving a message, does it run on the same node or on a different node in the cluster?

Regards,
Praveen
