Wow, thanks for sharing the nice comment and blog post!! I'll look at it soon and try to join this discussion. :)
On Sat, Jul 3, 2010 at 1:08 AM, Claudio Martella <[email protected]> wrote:
> I'll try to make things clear.
>
> The end of a superstep is reached through the deactivation of all the
> vertices. In Pregel the superstep is over when all the vertices have
> called VoteToHalt() (in Hama it's done by the sync() method). This
> happens at the end of each computation by each vertex. Each vertex is
> activated by the arrival of a message directed to it. This means that
> each superstep's computation is atomic, and that should be taken into
> account in the design of the algorithm. That's a change of paradigm,
> and it's what Pregel's authors call programming from the "vertex's
> perspective".
>
> So no, there's no assumption that all the vertices are active at the
> beginning of each superstep.
>
> About Felix's argument: yes, fewer supersteps mean less communication
> and synchronization overhead. At the same time, longer supersteps make
> it more probable that certain vertices end their computation earlier
> than others, leaving them idle for a long time (waiting for the others
> to finish) and losing computational time. So ideally there should be a
> good balance between long computational supersteps (decreasing
> communication overhead) and short computational supersteps (decreasing
> idle time).
> This is an intrinsic problem of BSP models because of the barrier. By
> contrast, dataflow models don't have barriers and each computation is
> more independent, therefore more similar to the model you have in mind.
>
> Hope this helps.
>
> I attach the text from my blog post (roughly obtained with html2text)
> as requested.
>
> Cheers,
>
> Claudio
>
>
> zercal wrote:
>> The papers I found about Pregel are not very detailed;
>> is it "http://portal.acm.org/citation.cfm?id=1582716.1582723"?
>> I guess that in this paper vertices are assumed to be all active at
>> every superstep. Simple random access will reduce some communication
>> cost but take more supersteps.
>> However, is there any vertex selection method such that at every
>> superstep each vertex knows whether to be active, according to the
>> information it has kept and received from other vertices?
>> But I couldn't find more detail in that paper...
>> Besides, I cannot access your blog. Would you please send me your
>> article?
>>
>> Thank you very much!
>> from Xiong Chenyan
>>
>> On 2010-07-02 20:04:40, "Felix Halim" <[email protected]> wrote:
>>
>>> Exactly how to activate a particular vertex is not clear from the
>>> paper (is it random access?), and this feature is probably not as
>>> good as it sounds for complex graph algorithms. It might be better
>>> to assume all vertices are active (to reduce the overhead of the
>>> flag needed and the space to make it randomly accessible, by
>>> storing it in blocks or whatever).
>>>
>>> Here is my argument:
>>>
>>> The way Pregel (and existing MR) works is iterative, where each
>>> iteration is separated by a super-step barrier at which all
>>> messages have to arrive. Algorithms that have fewer super-steps are
>>> preferable to those that have a large number of super-steps. In
>>> fact, we should measure algorithms in terms of the number of
>>> super-steps required. To minimize the number of super-steps, we
>>> likely need to activate as many vertices as possible to do all the
>>> work in the current super-step, rather than spill over to the next
>>> super-step. In this case, the feature to "turn off" vertices is
>>> useless, since most of the time all vertices will be active to
>>> effectively reduce the number of super-steps.
>>>
>>> Unfortunately, I don't have experiments to back up my argument... I
>>> don't have Pregel...
>>>
>>> Felix Halim
>>>
>>>
>>> On Fri, Jul 2, 2010 at 7:37 PM, Claudio Martella
>>> <[email protected]> wrote:
>>>
>>>> I did too. See:
>>>>
>>>> http://blog.acaro.org/entry/pregel-is-out-but-what-is-pregel
>>>>
>>>>
>>>> Felix Halim wrote:
>>>>
>>>>> I have.
>>>>> See my comment on this blog:
>>>>>
>>>>> http://blog.udanax.org/2010/06/summary-of-google-pregel.html
>>>>>
>>>>> Felix Halim
>>>>>
>>>>>
>>>>> On Tue, Jun 8, 2010 at 4:00 AM, Mark Kerzner <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> has anybody read it?
>>>>>>
>>>>>> Thank you,
>>>>>> Mark
>>>>>>
>>>> --
>>>> Claudio Martella
>>>> Digital Technologies
>>>> Unit Research & Development - Analyst
>>>>
>>>> TIS innovation park
>>>> Via Siemens 19 | Siemensstr. 19
>>>> 39100 Bolzano | 39100 Bozen
>>>> Tel. +39 0471 068 123
>>>> Fax +39 0471 068 129
>>>> [email protected] http://www.tis.bz.it
>>>>
>>>> Short information regarding use of personal data. According to
>>>> Section 13 of Italian Legislative Decree no. 196 of 30 June 2003,
>>>> we inform you that we process your personal data in order to fulfil
>>>> contractual and fiscal obligations and also to send you information
>>>> regarding our services and events. Your personal data are processed
>>>> with and without electronic means and by respecting data subjects'
>>>> rights, fundamental freedoms and dignity, particularly with regard
>>>> to confidentiality, personal identity and the right to personal
>>>> data protection. At any time and without formalities you can write
>>>> an e-mail to [email protected] in order to object to the processing
>>>> of your personal data for the purpose of sending advertising
>>>> materials and also to exercise the right to access personal data
>>>> and other rights referred to in Section 7 of Decree 196/2003. The
>>>> data controller is TIS Techno Innovation Alto Adige, Siemens Street
>>>> n. 19, Bolzano. You can find the complete information on the web
>>>> site www.tis.bz.it.
>
> claudio martella
> Google Pregel is out. But what is Pregel?
> June 21, 2010
> Google Pregel's paper is finally out. But what is Pregel? It has been
> mentioned in many posts talking about NoSQL, GraphDBs, Big Data, even
> the Facebook OpenGraph, so it looks like, apart from the hype, there's
> a little confusion about what it is, what it does, what it's good for
> and what it certainly is not.
> Let's start with the latter.
> Google Pregel is not a database (neither RDBMS nor NoSQL), not a
> key-value store, nor any new means of storing data (big or small as it
> might be). Putting it in the same lists as GraphDBs like Neo4j,
> HyperGraphDB or even Twitter's FlockDB is somewhat like putting
> MapReduce in the NoSQL group.
> GraphDBs are storage systems that use graph representations for data,
> where each node represents an entity with a unique id, a type and
> properties. An arc represents a relationship between two nodes and can
> itself have a type and properties. Think of a GraphDB as an RDBMS
> where instead of tables you have a graph. It's the Semantic Web's
> triple stores brought to general purpose.
> Why would you use a GraphDB? Well, as you can describe your data in
> terms of entities and relationships, you're able to avoid defining a
> schema as we know it. It's a smaller step towards the schema-less
> representation of data that actual NoSQL systems provide.
> Informally, you can describe your data with ER diagrams without
> translating them into tables, keeping the representation dynamic and
> avoiding the costs of schema redefinition in RDBMSs. Plus, it's very
> efficient and easy to write queries that allow you to get, for
> example, all the followers of a user, all the users he follows, the
> items related to him (tweets he wrote?) and maybe the users connected
> to his items (users retweeting or replying to his tweets?) in one go.
> That's basically what Twitter wants to achieve with FlockDB, but with
> a general GraphDB you can describe your data and the relationships
> between your data as you wish.
> You store data in a GraphDB and you recall it in an easy and efficient
> way. So what's Pregel good for? What if you want to mine the data in
> the graph (i.e. Google's PageRank, Facebook's social network analysis,
> Twitter's retweeting/authority analysis)?
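The property-graph data model described in the blog post (nodes with an id, a type and properties; arcs with a type and properties) can be sketched in a few lines. This is a purely illustrative toy, not the API of any real GraphDB; all class and method names here are made up.

```python
# Toy property graph: nodes carry an id, a type and properties;
# arcs carry a type and properties. Illustrative only.
class Node:
    def __init__(self, node_id, node_type, **props):
        self.id, self.type, self.props = node_id, node_type, props

class Graph:
    def __init__(self):
        self.nodes = {}   # id -> Node
        self.arcs = []    # (src_id, arc_type, dst_id, props)

    def add_node(self, node_id, node_type, **props):
        self.nodes[node_id] = Node(node_id, node_type, **props)

    def add_arc(self, src, arc_type, dst, **props):
        self.arcs.append((src, arc_type, dst, props))

    def neighbors(self, node_id, arc_type, direction="out"):
        # Follow typed arcs out of (or into) a node in one pass.
        if direction == "out":
            return [d for s, t, d, _ in self.arcs
                    if s == node_id and t == arc_type]
        return [s for s, t, d, _ in self.arcs
                if d == node_id and t == arc_type]

# "All the followers of a user ... and the items related to him":
g = Graph()
g.add_node("alice", "user")
g.add_node("bob", "user")
g.add_node("t1", "tweet", text="hello")
g.add_arc("bob", "follows", "alice")
g.add_arc("alice", "wrote", "t1")

followers = g.neighbors("alice", "follows", direction="in")
tweets = g.neighbors("alice", "wrote")
```

The point is that the queries are plain traversals over the relationship structure, with no schema or join to define first.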
> Google reports that 80% of their distributed computation is based on
> MapReduce (Google Maps, search indexing, clustering in Google News,
> reports for Google Trends, Google Translate etc.), so we can only
> guess that the remaining 20% is based on Pregel, and the authors
> report they can work with graphs of the size of billions of vertices.
> Plus, implementing PageRank takes just about 15 lines of code...
> That's what Pregel is for. So what is it? Pregel is a system for
> large-scale graph processing. It provides a fault-tolerant framework
> for the execution of graph algorithms in parallel over many machines.
> Think of it as MapReduce re-thought for graph operations.
> But what's wrong with MapReduce and graph algorithms? Nothing in
> particular, though it can lead to suboptimal performance, because the
> graph state has to be passed from one phase to the other, generating
> a lot of I/O. In general we can say it has some usability issues, as
> it doesn't provide a way to do any per-vertex calculation; it's not
> easy to express graph algorithms in M/R. Pregel fills a gap, as there
> are no frameworks for graph processing that address both
> distributability and fault-tolerance.
> Pregel's architecture is inspired by the Bulk Synchronous Parallel
> model introduced by Valiant. BSP is a computational model for the
> execution of parallel algorithms on top of multiple sequential Von
> Neumann machines. It gives an abstraction, just like M/R, that allows
> the programmer to think about the parallel expression of his solution
> without the hassle of communication and memory allocation in a
> distributed system. Before we get into details, I think two things
> have to be underlined.
> First, again like M/R, although the model is used by Google to
> distribute computation among multiple computers, that's not
> necessary: in principle BSP also fits parallel programming on SMP or
> NUMA machines and mainframes.
> Second, although the model is used by Google to distribute graph
> processing, BSP can be used to distribute other kinds of algorithms,
> like matrix manipulation, just like M/R.
> Ok, how does BSP work? I'll take the diagram and snippet from the BSP
> page on Wikipedia:
> "A BSP computer consists of processors connected by a communication
> network. Each processor has a fast local memory, and may follow
> different threads of computation.
> A BSP computation proceeds in a series of global supersteps. A
> superstep consists of three ordered stages:
> 1. Concurrent computation: Several computations take place on every
> participating processor. Each process only uses values stored in the
> local memory of the processor. The computations are independent in
> the sense that they occur asynchronously of all the others.
> 2. Communication: At this stage, the processes exchange data between
> themselves.
> 3. Barrier synchronisation: When a process reaches this point (the
> barrier), it waits until all other processes have finished their
> communication actions.
> The figure below shows this in diagrammatic form. The processes are
> not regarded as having a particular linear order (from left to right
> or otherwise), and may be mapped to processors in any way."
> [bsp architecture]
> Ok, basically at every superstep every processor executes the same
> algorithm on its data: its state and the incoming messages. At
> superstep t every processor will work on its state, which is the
> result of its computation at superstep t-1, and the messages sent to
> it at superstep t-1. As a result of the superstep-t computation the
> processor will send messages to other processors, and these messages
> will be the incoming messages at superstep t+1. And the cycle goes
> on. The barrier synchronisation is the moment where t becomes t+1.
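The superstep cycle just described (local computation on state plus last superstep's messages, then a barrier where t becomes t+1) can be simulated in a few lines of Python. This is a toy, sequential stand-in for a real BSP machine; `bsp_run` and `shift` are illustrative names, not part of any library.

```python
# Toy BSP runner: at superstep t each processor computes on its local
# state plus the messages sent to it at superstep t-1; the "barrier"
# is the point where all outgoing messages become the next superstep's
# incoming messages at once.
def bsp_run(states, compute, supersteps):
    """states: dict proc_id -> local state.
    compute(proc_id, state, inbox) -> (new_state, {dst: [msgs]})."""
    inboxes = {p: [] for p in states}
    for t in range(supersteps):
        outboxes = {}
        # 1. concurrent computation (done sequentially here, for clarity)
        for p in states:
            states[p], outboxes[p] = compute(p, states[p], inboxes[p])
        # 2.+3. communication and barrier: messages sent at superstep t
        # are only visible at superstep t+1
        inboxes = {p: [] for p in states}
        for p in states:
            for dst, msgs in outboxes[p].items():
                inboxes[dst].extend(msgs)
    return states

# Example computation: processor p adds up what it received and sends
# its own id one hop around a ring of 3 processors each superstep.
def shift(p, state, inbox):
    return state + sum(inbox), {(p + 1) % 3: [p]}
```

Running `bsp_run({0: 0, 1: 0, 2: 0}, shift, 2)` shows the one-superstep delay: ids sent at superstep 0 only reach their neighbor's state at superstep 1.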
> It is easy to see that each computation should take approximately the
> same amount of time, otherwise a long-lasting computation will force
> the others to wait idle.
> How does Pregel implement BSP? Quoting Pregel's original paper: "The
> input to a Pregel computation is a directed graph in which each
> vertex is uniquely identified by a string vertex identifier. Each
> vertex is associated with a modifiable, user-defined value. The
> directed edges are associated with their source vertices, and each
> edge consists of a modifiable, user-defined value and a target vertex
> identifier. A typical Pregel computation consists of input, when the
> graph is initialized, followed by a sequence of supersteps separated
> by global synchronization points until the algorithm terminates, and
> finishing with output.
> Within each superstep the vertices compute in parallel, each
> executing the same user-defined function that expresses the logic of
> a given algorithm. A vertex can modify its state or that of its
> outgoing edges, receive messages sent to it in the previous
> superstep, send messages to other vertices (to be received in the
> next superstep), or even mutate the topology of the graph. Edges are
> not first-class citizens in this model, having no associated
> computation.
> Algorithm termination is based on every vertex voting to halt. In
> superstep 0, every vertex is in the active state; all active vertices
> participate in the computation of any given superstep. A vertex
> deactivates itself by voting to halt. This means that the vertex has
> no further work to do unless triggered externally, and the Pregel
> framework will not execute that vertex in subsequent supersteps
> unless it receives a message. If reactivated by a message, a vertex
> must explicitly deactivate itself again. The algorithm as a whole
> terminates when all vertices are simultaneously inactive and there
> are no messages in transit."
> The mapping between BSP and Pregel is very simple: each local
> computation of BSP maps to the user-defined function in Pregel, and
> the communication often, but not necessarily, corresponds to edge
> connectivity between nodes. The barrier is defined by the halt voting
> of all the active nodes.
> From the perspective of the API, Pregel requires the implementation
> of the virtual Compute() method of the class Vertex. The class Vertex
> itself provides VoteToHalt(), SendMessageTo(), GetValue(),
> GetOutEdgeIterator() and the const methods superstep() and
> vertex_id().
> Like M/R, it provides the possibility to define Combiners in order to
> reduce message-passing overhead by combining messages together where
> semantically possible. Like Sawzall, Pregel provides Aggregators,
> which allow global communication by receiving messages from multiple
> vertices, combining them and sending the result back to the vertices.
> They are useful for statistics (think of a histogram of vertex
> degrees) or for global control (for example, an aggregator can
> collect all the vertices' PageRank deltas to calculate the
> convergence condition).
> From the perspective of architecture, Pregel follows a master/worker
> design, like most of the other Google frameworks. The master is
> responsible for partitioning the graph with a hash function based on
> the vertex ID (like hash(ID) mod #partitions, although a
> topology-aware partitioner might be able to minimize communication
> between workers by keeping messages intra-machine) but doesn't
> compute any partition itself. At the beginning of the computation the
> workers register with the master.
> Once the graph is partitioned and the partitions are assigned to
> workers, the master issues the start of the superstep. Each worker
> loops through all its active vertices, calling the Compute() method
> and delivering the messages collected in the previous superstep. The
> new messages are delivered before the end of the superstep, right
> before telling the master the list of active vertices for the next
> superstep.
> After the computation halts, the master might ask the workers to dump
> their graph partition to disk.
> At the moment there are no other projects that handle such
> computational power over graphs. Parallel BGL and CGMLib can handle
> parallel processing on graphs but don't scale to this size and are
> not fault-tolerant. Hama, an Apache-incubated project, aims at
> developing a model similar to Pregel, but it's not complete yet.
> So, where do I go now?
> Google's Research Blog announcement
> BSP World-wide
> NoSQL GraphDB

--
Best Regards, Edward J. Yoon
[email protected]
http://blog.udanax.org
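To tie the thread together, here is a Pregel-style PageRank sketch, mirroring in Python the C++ Vertex API named in the blog post (Compute(), VoteToHalt(), SendMessageTo()) and its termination rule (all vertices inactive and no messages in transit). The single-machine runner below stands in for the master/worker machinery; the names, the fixed 30-iteration cutoff, and the assumption that every vertex has at least one outgoing edge are all illustrative, not Google's implementation.

```python
# Pregel-style PageRank sketch: each vertex runs the same compute()
# per superstep, reads last superstep's messages, sends new ones, and
# votes to halt when done.
class Vertex:
    def __init__(self, vid, out_edges, num_vertices):
        self.id, self.out_edges, self.n = vid, out_edges, num_vertices
        self.value = 1.0 / num_vertices
        self.active = True
        self.outbox = []

    def vote_to_halt(self):
        self.active = False

    def send_message_to(self, dst, msg):
        self.outbox.append((dst, msg))

    def compute(self, superstep, messages):
        if superstep > 0:
            self.value = 0.15 / self.n + 0.85 * sum(messages)
        if superstep < 30:  # fixed iteration count, as in the paper's example
            share = self.value / len(self.out_edges)
            for dst in self.out_edges:
                self.send_message_to(dst, share)
        else:
            self.vote_to_halt()

def run(graph):
    """graph: dict vertex_id -> list of out-neighbor ids (all non-empty)."""
    n = len(graph)
    vertices = {v: Vertex(v, outs, n) for v, outs in graph.items()}
    inboxes = {v: [] for v in graph}
    superstep = 0
    # Terminate when all vertices are inactive and no messages are in transit.
    while any(v.active for v in vertices.values()) or any(inboxes.values()):
        for v in vertices.values():
            if v.active or inboxes[v.id]:
                v.active = True  # a message reactivates a halted vertex
                v.compute(superstep, inboxes[v.id])
        # barrier: deliver this superstep's messages for the next one
        inboxes = {v: [] for v in graph}
        for v in vertices.values():
            for dst, msg in v.outbox:
                inboxes[dst].append(msg)
            v.outbox = []
        superstep += 1
    return {v.id: v.value for v in vertices.values()}
```

On a 3-vertex cycle (`{"a": ["b"], "b": ["c"], "c": ["a"]}`) the ranks converge to 1/3 each and sum to 1, and the whole vertex program is indeed in the neighborhood of the "about 15 lines" the blog post mentions.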
