Hello everybody. Almost a month later, I am bumping this topic because there is still no clear answer about the fate of the PartitionContext class, introduced in Giraph-504 and included in Giraph-1.0.0. It seems that this feature was not ported to the new version (1.1.0). Although I strongly believe that the new Giraph design already fulfils the purpose of PartitionContext and so makes it unnecessary, I do not have any evidence to support that.
Does anybody have a clue?

~~~~~~~~~~~~~~~~~~~
Ing. Alessio Arleo
PhD Student in Industrial and Information Engineering
MSc in Computer and Automation Engineering
BSc in Computer and Electronic Engineering
Linkedin: it.linkedin.com/in/IngArleo
Skype: Ing. Alessio Arleo
Tel: +39 075 5853920
Cell: +39 349 0575782
~~~~~~~~~~~~~~~~~~~

> On 25 Feb 2015, at 19:56, Arjun Sharma <as469...@gmail.com> wrote:
>
> Thanks Matthew for your replies! They are quite helpful. Regarding question number 4, I see a commit of PartitionContext here by Maja
> http://mail-archives.apache.org/mod_mbox/giraph-commits/201302.mbox/%3c20130209001122.ddad73a...@tyr.zones.apache.org%3E
> but it seems to be removed from the current version?
>
> On Wed, Feb 25, 2015 at 3:30 AM, Matthew Saltz <sal...@gmail.com> wrote:
> Hi,
>
> 1) The partitions are processed in parallel based on the number of threads you specify. The vertices within a partition are processed sequentially. You may want to use more partitions than threads, that way if one partition takes a particularly long time to be processed, the other threads can continue processing the remaining partitions. If you have four machines with 12 threads each for example, with one worker per machine, the default number of partitions will be 4^2 = 16 partitions, whereas you actually have 48 threads available, so you'd probably want to specify the number of partitions manually to a larger number to take advantage of parallelism.
> 2) Yes
> 3) If you are only doing single threading, there's no reason to do multiple partitions per worker
> 3 (the second one)) I'm not familiar with the out-of-core functionality
> 4) I'm not sure
>
> I'm basing this on the version of Giraph from this summer, not the most recent release, but I don't think this part has changed. May want to verify by looking at the code.
>
> Best,
> Matthew
>
> On Wed, Feb 25, 2015 at 3:25 AM, Arjun Sharma <as469...@gmail.com> wrote:
> Hi,
>
> I understand that by default, the number of partitions = number of workers ^ 2. So, if we have N workers, each worker will process N partitions. I have a number of questions:
>
> 1- By default, does Giraph process the N partitions within a single worker sequentially? If yes, when setting the parameter giraph.numComputeThreads, will partitions within each thread be computed sequentially?
>
> 2- By default, does Giraph keep all partitions in memory?
>
> 3- If the answers to 1 and 2 are yes and yes, is there any advantage from using multiple partitions versus a single partition in the case of single threading per worker?
>
> 3- How do out-of-core partitions affect out-of-core messages? Are they completely independent? For example, if the number of partitions to be kept in memory is set to a number less than N, and at the same time all messages are set to be kept in memory, will ALL messages be kept in memory, even those from out-of-core partitions? If the situation is reversed, where all partitions are kept in memory, and out-of-core messaging is set, will messages from memory-based partitions be saved on disk?
>
> 4- Is there a class like a PartitionContext, where you can access preSuperstep and postSuperstep *per partition*, along the lines of WorkerContext?
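
P.S. For anyone following the thread, the partition arithmetic in Matthew's first answer can be sketched as below. This is plain Java and does not call any Giraph API. The class name, the method names and the "partitions per thread" multiplier are my own assumptions for illustration; only the workers^2 default and the 4 machines x 12 threads example come from the thread.

```java
// Illustrative sketch only: no Giraph classes are used here.
final class PartitionSizing {

    // Default described in the thread: number of partitions = workers^2.
    static int defaultPartitions(int workers) {
        return workers * workers;
    }

    // Compute threads available across the cluster, one worker per machine.
    static int totalThreads(int workers, int threadsPerWorker) {
        return workers * threadsPerWorker;
    }

    // Hypothetical rule of thumb (an assumption, not a Giraph rule):
    // give each thread a few partitions so that a slow partition
    // does not leave the other threads idle.
    static int suggestedPartitions(int workers, int threadsPerWorker,
                                   int partitionsPerThread) {
        return totalThreads(workers, threadsPerWorker) * partitionsPerThread;
    }

    public static void main(String[] args) {
        int workers = 4;           // four machines, one worker each
        int threadsPerWorker = 12; // value of giraph.numComputeThreads

        System.out.println("default partitions: " + defaultPartitions(workers));
        System.out.println("total threads:      " + totalThreads(workers, threadsPerWorker));
        System.out.println("suggested (x2):     "
                + suggestedPartitions(workers, threadsPerWorker, 2));
    }
}
```

With the numbers from the example, the default gives 16 partitions for 48 threads, so two thirds of the threads would have nothing to do; any manually chosen count at or above 48 avoids that, and the factor of 2 is only one possible choice.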