Hello everybody.

Almost a month later, I am bumping this topic because there is still no clear 
answer about the fate of the PartitionContext class, introduced in Giraph-504 
and included in Giraph-1.0.0. It seems that this feature was not ported 
to the new version (1.1.0). Although I strongly believe that the new Giraph 
design already fulfils the purpose of PartitionContext, making it unnecessary, 
I do not have any evidence to support that. 

Does anybody have a clue?

~~~~~~~~~~~~~~~~~~~

Ing. Alessio Arleo

PhD student in Industrial and Information Engineering

MSc in Computer and Automation Engineering
BSc in Computer and Electronic Engineering

Linkedin: it.linkedin.com/in/IngArleo <http://it.linkedin.com/in/IngArleo>
Skype: Ing. Alessio Arleo

Tel: +39 075 5853920
Cell: +39 349 0575782

~~~~~~~~~~~~~~~~~~~



> On 25 Feb 2015, at 19:56, Arjun Sharma <as469...@gmail.com> wrote:
> 
> Thanks Matthew for your replies! They are quite helpful. Regarding question 
> number 4, I see a commit of PartitionContext by Maja here:
> http://mail-archives.apache.org/mod_mbox/giraph-commits/201302.mbox/%3c20130209001122.ddad73a...@tyr.zones.apache.org%3E
> but it seems to have been removed from the current version?
> 
> 
> On Wed, Feb 25, 2015 at 3:30 AM, Matthew Saltz <sal...@gmail.com> wrote:
> Hi,
> 
> 1) The partitions are processed in parallel based on the number of threads 
> you specify. The vertices within a partition are processed sequentially. You 
> may want to use more partitions than threads, so that if one partition takes 
> a particularly long time to process, the other threads can continue 
> processing the remaining partitions. For example, if you have four machines 
> with 12 threads each and one worker per machine, the default number of 
> partitions will be 4^2 = 16, whereas you actually have 48 threads 
> available, so you'd probably want to set the number of partitions 
> manually to a larger number to take advantage of the parallelism. 
> 2) Yes 
> 3) If you are only doing single threading, there's no reason to do multiple 
> partitions per worker
> 3 (the second one)) I'm not familiar with the out-of-core functionality
> 4) I'm not sure
> 
> I'm basing this on the version of Giraph from this summer, not the most 
> recent release, but I don't think this part has changed. You may want to 
> verify by looking at the code.
> 
> Best,
> Matthew
> 
> On Wed, Feb 25, 2015 at 3:25 AM, Arjun Sharma <as469...@gmail.com> wrote:
> Hi,
> 
> I understand that by default, the number of partitions = number of workers ^ 
> 2. So, if we have N workers, each worker will process N partitions. I have a 
> number of questions:
> 
> 1- By default, does Giraph process the N partitions within a single worker 
> sequentially? If yes, when setting the parameter giraph.numComputeThreads, 
> will partitions within each thread be computed sequentially?
> 
> 2- By default, does Giraph keep all partitions in memory?
> 
> 3- If the answers to 1 and 2 are yes and yes, is there any advantage from 
> using multiple partitions versus a single partition in the case of single 
> threading per worker?
> 
> 3- How do out-of-core partitions affect out-of-core messages? Are they 
> completely independent? For example, if the number of partitions to be kept 
> in memory is set to a number less than N, and at the same time all messages 
> are set to be kept in memory, will ALL messages be kept in memory, even those 
> from out-of-core partitions? If the situation is reversed, where all 
> partitions are kept in memory, and out-of-core messaging is set, will 
> messages from memory-based partitions be saved on disk?
> 
> 4- Is there a class like a PartitionContext, where you can access 
> preSuperstep and postSuperstep *per partition*, along the lines of 
> WorkerContext?
> 
> 
> 
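
For reference, here is a minimal sketch of the partition arithmetic from 
Matthew's answer to question 1, in plain Java with no Giraph dependency. The 
class and method names are made up for illustration, and the "workers squared" 
default is taken from this thread rather than checked against the Giraph 
source. I believe the partition count can be overridden with the 
giraph.userPartitionCount option, but please verify that name against 
GiraphConstants for your version.

```java
// Hypothetical helper, not part of Giraph: it only reproduces the
// arithmetic discussed in this thread.
final class PartitionMath {
    private PartitionMath() {}

    // Default partition count as described in the thread: workers squared.
    // Assumption: the real default in your Giraph version may differ.
    static int defaultPartitions(int workers) {
        return workers * workers;
    }

    // Total compute threads across the cluster, one worker per machine.
    static int totalThreads(int workers, int threadsPerWorker) {
        return workers * threadsPerWorker;
    }

    // A partition count that gives every thread several partitions, so a
    // slow partition does not leave the other threads idle.
    static int suggestedPartitions(int workers, int threadsPerWorker,
                                   int partitionsPerThread) {
        return totalThreads(workers, threadsPerWorker) * partitionsPerThread;
    }

    public static void main(String[] args) {
        int workers = 4;           // four machines, one worker each
        int threadsPerWorker = 12; // e.g. giraph.numComputeThreads=12

        System.out.println("default partitions: "
                + defaultPartitions(workers));                   // 16
        System.out.println("available threads:  "
                + totalThreads(workers, threadsPerWorker));      // 48
        System.out.println("suggested partitions (2 per thread): "
                + suggestedPartitions(workers, threadsPerWorker, 2)); // 96
    }
}
```

With the default of 16 partitions, at most 16 of the 48 threads have work at 
any time, which is why a larger manual setting is suggested above. The factor 
of 2 partitions per thread is only an illustrative choice.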
