Re: How to utilize combiners

2013-08-21 Thread Claudio Martella
Hi Kyle,

Combiners are set by the user, as you recognized, and are called automatically
by the infrastructure at different points along the message path. Combined
messages are passed transparently to the compute method (that is, a vertex
receives fewer messages than it would have received without a combiner).
Have a look at the PageRank examples and benchmark code.
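
To make the mechanics concrete, here is a minimal plain-Java sketch of what
the infrastructure conceptually does with a sum combiner. It is not Giraph
code: CombinerSketch, combine() and deliver() are hypothetical names used only
for illustration, and the real infrastructure may combine at more than one
point along the path. The point is only that user code never calls the
combiner itself; compute() simply sees the already-combined value.

```java
import java.util.HashMap;
import java.util.Map;

class CombinerSketch {
    /** Hypothetical stand-in for a sum combiner: folds one more message into the running value. */
    static double combine(double current, double incoming) {
        return current + incoming;
    }

    /**
     * What the framework is assumed to do on the user's behalf: keep one slot
     * per destination vertex and fold every message for that vertex into it.
     */
    static Map<Long, Double> deliver(long[] targets, double[] values) {
        Map<Long, Double> inbox = new HashMap<>();
        for (int i = 0; i < targets.length; i++) {
            // First message for a vertex is stored as-is; later ones are combined.
            inbox.merge(targets[i], values[i], CombinerSketch::combine);
        }
        return inbox;
    }

    public static void main(String[] args) {
        long[] targets = {1L, 1L, 2L};
        double[] values = {0.5, 0.25, 1.0};
        Map<Long, Double> inbox = deliver(targets, values);
        // Vertex 1 was sent two messages but receives a single combined one.
        System.out.println("vertex 1 receives " + inbox.get(1L));
        System.out.println("vertex 2 receives " + inbox.get(2L));
    }
}
```

In this sketch vertex 1 is sent 0.5 and 0.25 but its compute method would see
one message with value 0.75.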

Best,
Claudio


On Tue, Aug 20, 2013 at 8:51 PM, Kyle Orlando kyle.r.orla...@gmail.com wrote:

 Hey all,

 I was wondering if there was any example code I could look at that uses a
 combiner.  Creating your own Combiner is easy enough, e.g.
 DoubleSumCombiner, but I am confused as to how/where I would use the
 classes in my code.

 For example, say I wanted to utilize the DoubleSumCombiner class to sum up
 all of the messages arriving at a particular vertex at the beginning of the
 superstep, and I wanted to do this for each vertex in the graph.  Where
 should I instantiate a DoubleSumCombiner, when should I call the combine()
 and createInitialMessage() methods, etc. in the compute() method?

 What further confuses me is that I see that the MasterCompute class has
 methods for setCombiner() and getCombiner(), and that there is also a
 command line option -c to specify a Combiner.  I'm not really sure if these
 are even necessary, but if they are, I don't know how these come into play
 either.

 Some clarification or direction towards an example would be nice!

 Thanks,
 --
 Kyle Orlando
 Computer Engineering Major
 University of Maryland




-- 
   Claudio Martella
   claudio.marte...@gmail.com


RE: Dynamic Graphs

2013-08-21 Thread Marco Aurelio Barbosa Fagnani Lotz
Dear Mr. Martella,

Once the conditions for updating the vertex database are met, what is the
best way for the Injector Vertex to call an input reader again?

I am able to access all the HDFS data, but I guess the vertex would need to
have access to the input splits and also to the vertex input format that I
designate. Am I correct? Or is there a way to just ask ZooKeeper to
create new splits from a given path in DFS and distribute them to the workers?

Best Regards,
Marco Lotz

From: Claudio Martella claudio.marte...@gmail.com
Sent: 14 August 2013 15:25
To: user@giraph.apache.org
Subject: Re: Dynamic Graphs

Hi Marco,

Giraph currently does not support that. One way of doing this would be to
have a specific (pseudo-)vertex act as the injector of the new vertices
and edges. For example, it would read a file from HDFS and call the mutable API
during the computation, superstep after superstep.


On Wed, Aug 14, 2013 at 3:02 PM, Marco Aurelio Barbosa Fagnani Lotz
m.a.b.l...@stu12.qmul.ac.uk wrote:
Hello all,

I would like to know if there is any way to use dynamic graphs with Giraph. By
dynamic I mean graphs that may change while Giraph is
computing/deliberating. The changes are in the input file and are not caused by
the graph computation itself.

Is there any way to analyse such a graph using Giraph? If not, does anyone have
any idea/suggestion on whether it is possible to modify the framework in order
to process it?

Best Regards,
Marco Lotz



--
   Claudio Martella
   claudio.marte...@gmail.com


Re: Dynamic Graphs

2013-08-21 Thread Claudio Martella
As I said, the injection of the new vertices/edges would have to be done
manually, hence without any support of the infrastructure. I'd suggest
you implement a WorkerContext class that supports the reading of a specific
file with a specific format (under your control) from HDFS, and that is
accessed by this particular special vertex (e.g. based on the vertex ID).
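
A rough plain-Java sketch of that idea follows. It does not use the Giraph
API: MutationSource stands in for the per-worker context that reads the file,
INJECTOR_ID and the "src dst" line format are assumptions made up for the
example, and the mutations are applied immediately here, whereas requests made
through the mutable API would take effect according to Giraph's own rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class InjectorSketch {
    /** Hypothetical ID reserved for the special injector (pseudo-)vertex. */
    static final long INJECTOR_ID = -1L;

    /** Hypothetical per-worker state that hands out pending edge lines in batches. */
    static class MutationSource {
        private final Iterator<String> lines;

        MutationSource(List<String> lines) {
            this.lines = lines.iterator();
        }

        /** Returns up to batchSize unread "src dst" lines; empty once the input is exhausted. */
        List<String> nextBatch(int batchSize) {
            List<String> batch = new ArrayList<>();
            while (batch.size() < batchSize && lines.hasNext()) {
                batch.add(lines.next());
            }
            return batch;
        }
    }

    /** Per-vertex logic: only the injector touches the mutation source. */
    static void compute(long vertexId, MutationSource source, Map<Long, Set<Long>> graph) {
        if (vertexId != INJECTOR_ID) {
            return; // ordinary vertices would run the actual algorithm here
        }
        for (String line : source.nextBatch(2)) {
            String[] parts = line.trim().split("\\s+");
            long src = Long.parseLong(parts[0]);
            long dst = Long.parseLong(parts[1]);
            graph.computeIfAbsent(src, k -> new TreeSet<>()).add(dst); // add edge, creating src if needed
            graph.computeIfAbsent(dst, k -> new TreeSet<>());          // make sure dst exists too
        }
    }

    /** One superstep: iterate over a snapshot of the IDs, since compute() may add vertices. */
    static void superstep(MutationSource source, Map<Long, Set<Long>> graph) {
        for (long id : new ArrayList<>(graph.keySet())) {
            compute(id, source, graph);
        }
    }

    public static void main(String[] args) {
        Map<Long, Set<Long>> graph = new TreeMap<>();
        graph.put(INJECTOR_ID, new TreeSet<>());
        MutationSource source = new MutationSource(Arrays.asList("1 2", "2 3", "3 1"));
        for (int step = 0; step < 2; step++) {
            superstep(source, graph);
            System.out.println("after superstep " + step + ": " + graph);
        }
    }
}
```

With a batch size of two, the first superstep injects the first two edges and
the second superstep injects the remaining one.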

Does this make sense?


-- 
   Claudio Martella
   claudio.marte...@gmail.com


Re: MultiVertexInputFormat

2013-08-21 Thread Maja Kabiljo
Hi Yasser,

You can do this through the Configuration parameters. You should call:
description1.addParameter("myApplication.vertexInputPath", "file1.txt");
and
description2.addParameter("myApplication.vertexInputPath", "file2.txt");
Then, in the code of your InputFormat class, you can read this parameter from
the Configuration. If it doesn't already, make sure your InputFormat implements
ImmutableClassesGiraphConfigurable, so that the configuration is set on it
automatically.
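
The pattern can be sketched in plain Java as follows. None of this is the
Giraph API: Description, Format and instantiate() are hypothetical stand-ins,
a Map plays the role of the Configuration, and the assumption is that each
input format instance is handed the job configuration with its own
description's parameters applied on top. Under that assumption, both formats
can read the same key and still get different paths.

```java
import java.util.HashMap;
import java.util.Map;

class PerInputParameterSketch {
    /** The parameter name used in the calls above. */
    static final String PATH_KEY = "myApplication.vertexInputPath";

    /** Hypothetical stand-in for an input format description with its own parameters. */
    static class Description {
        final String formatName;
        final Map<String, String> parameters = new HashMap<>();

        Description(String formatName) {
            this.formatName = formatName;
        }

        void addParameter(String name, String value) {
            parameters.put(name, value);
        }
    }

    /** Hypothetical stand-in for a configurable input format. */
    static class Format {
        private Map<String, String> conf;

        /** Called by the framework, not by user code. */
        void setConf(Map<String, String> conf) {
            this.conf = conf;
        }

        /** The format looks its path up in whatever configuration it was given. */
        String inputPath() {
            return conf.get(PATH_KEY);
        }
    }

    /** Assumed framework behaviour: copy the base configuration, overlay the description's parameters. */
    static Format instantiate(Map<String, String> baseConf, Description description) {
        Map<String, String> merged = new HashMap<>(baseConf);
        merged.putAll(description.parameters);
        Format format = new Format();
        format.setConf(merged);
        return format;
    }

    public static void main(String[] args) {
        Map<String, String> baseConf = new HashMap<>();
        Description description1 = new Description("first");
        description1.addParameter(PATH_KEY, "file1.txt");
        Description description2 = new Description("second");
        description2.addParameter(PATH_KEY, "file2.txt");
        System.out.println(instantiate(baseConf, description1).inputPath());
        System.out.println(instantiate(baseConf, description2).inputPath());
    }
}
```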

You can also take a look at HiveGiraphRunner, which uses multiple inputs and
sets the parameters the user passes on the command line.

Hope this helps,
Maja

From: Yasser Altowim yasser.alto...@ericsson.com
Reply-To: user@giraph.apache.org
Date: Monday, August 19, 2013 9:16 AM
To: user@giraph.apache.org
Subject: RE: MultiVertexInputFormat

Hi Guys,

 Any help on this will be appreciated. I am repeating my question and my 
code below:


I am implementing an algorithm in Giraph that reads the vertex values from two
input files, each with its own format. I am not using any EdgeInputFormatClass.
I am now using VertexInputFormatDescription along with MultiVertexInputFormat,
but still could not figure out how to set the vertex input path for each input
format class. Can you please take a look at my code below and show me how to
set the vertex input path? I have taken a look at HiveGiraphRunner but still no
luck. Thanks

if (null == getConf()) {
  conf = new Configuration();
}

GiraphConfiguration gconf = new GiraphConfiguration(getConf());
int workers = Integer.parseInt(arg0[2]);
gconf.setWorkerConfiguration(workers, workers, 100.0f);

List<VertexInputFormatDescription> vertexInputDescriptions =
    Lists.newArrayList();

// Input one
VertexInputFormatDescription description1 =
    new VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
// how to set the vertex input path? i.e. how to say that I want to read
// file1.txt using this input format class
vertexInputDescriptions.add(description1);

// Input two
VertexInputFormatDescription description2 =
    new VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
// how to set the vertex input path?
vertexInputDescriptions.add(description2);

GiraphConstants.VERTEX_INPUT_FORMAT_CLASS.set(gconf,
    MultiVertexInputFormat.class);

VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
    InputFormatDescription.toJsonString(vertexInputDescriptions));

gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
gconf.setComputationClass(UseCase1Vertex.class);
GiraphJob job = new GiraphJob(gconf, "Use Case 1");
FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
return job.run(true) ? 0 : -1;


Thanks in advance.

Best,
Yasser

From: Yasser Altowim [mailto:yasser.alto...@ericsson.com]
Sent: Friday, August 16, 2013 11:36 AM
To: user@giraph.apache.org
Subject: RE: MultiVertexInputFormat

Thanks a lot Avery for your response. I am now using 
VertexInputFormatDescription, but still could not figure out how to set the 
Vertex input path. I just need to read the vertex values from two different 
files, each with its own format. I am not using  any EdgeInputFormatClass.

 Can you please take a look at my code below and show me how to set the 
Vertex Input Path? Thanks


if (null == getConf()) {
  conf = new Configuration();
}

GiraphConfiguration gconf = new GiraphConfiguration(getConf());
int workers = Integer.parseInt(arg0[2]);
gconf.setWorkerConfiguration(workers, workers, 100.0f);

List<VertexInputFormatDescription> vertexInputDescriptions =
    Lists.newArrayList();

// Input one
VertexInputFormatDescription description1 =
    new VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
// how to set the vertex input path?
vertexInputDescriptions.add(description1);

// Input two
VertexInputFormatDescription description2 =
    new VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
// how to set the vertex input path?
vertexInputDescriptions.add(description2);

VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
    InputFormatDescription.toJsonString(vertexInputDescriptions));

gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
gconf.setComputationClass(UseCase1Vertex.class);
GiraphJob job = new GiraphJob(gconf, "Use Case 1");
FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));