RE: Exception with Large Graphs

2013-09-03 Thread Yasser Altowim
Hi Avery,

Thanks for your response. The data I am loading is almost 9 GB, and I 
have 10 nodes, each with 4 GB of RAM.

Best,
Yasser

From: Avery Ching [mailto:ach...@apache.org]
Sent: Friday, August 30, 2013 4:43 PM
To: user@giraph.apache.org
Subject: Re: Exception with Large Graphs

That error is from the master dying (likely as a result of a worker dying 
first).  Can you do a rough calculation of the size of the data that you 
expect to be loaded and check whether the memory is enough?
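
A rough version of that calculation can be sketched from the figures in the log lines quoted further down in this thread (8732537 vertices loaded, free/total/max heap of 489.84M / 3222.56M / 3555.56M). The sketch below is only an estimate under stated assumptions: it treats all used heap as vertex and edge data (which overstates the cost, since uncollected garbage and buffers are included), and it assumes the 128,000,000 vertices are spread evenly over 10 workers. The class and method names are made up for illustration.

```java
public class HeapEstimate {
    private static final double BYTES_PER_MB = 1024.0 * 1024.0;

    /** Average heap bytes in use per loaded vertex (edges and all overhead included). */
    public static double bytesPerVertex(double totalMb, double freeMb, long verticesLoaded) {
        return (totalMb - freeMb) * BYTES_PER_MB / verticesLoaded;
    }

    /** Projected heap in MB for a worker that ends up holding verticesPerWorker vertices. */
    public static double projectedHeapMb(double bytesPerVertex, long verticesPerWorker) {
        return bytesPerVertex * verticesPerWorker / BYTES_PER_MB;
    }

    public static void main(String[] args) {
        // Figures taken from the 10:23:49 log line in this thread.
        double perVertex = bytesPerVertex(3222.56, 489.84, 8_732_537L);
        // Assumption: an even split of 128,000,000 vertices over 10 workers.
        long verticesPerWorker = 128_000_000L / 10;
        double neededMb = projectedHeapMb(perVertex, verticesPerWorker);
        double maxHeapMb = 3555.56;
        System.out.printf("about %.0f bytes per vertex%n", perVertex);
        System.out.printf("projected %.0f MB needed, %.2f MB max heap%n", neededMb, maxHeapMb);
        System.out.println(neededMb > maxHeapMb ? "likely does not fit" : "may fit");
    }
}
```

Under those assumptions the projection comes out above the maximum heap reported in the log, which would be consistent with workers stalling during loading, though the estimate alone does not prove that this is the cause.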

On 8/30/13 11:19 AM, Yasser Altowim wrote:
Guys,

   Can someone please help me with this issue? Thanks.

Best,
Yasser

From: Yasser Altowim
Sent: Thursday, August 29, 2013 11:16 AM
To: user@giraph.apache.org
Subject: Exception with Large Graphs

Hi,

 I am implementing an algorithm using Giraph, and I was able to run it 
on relatively small datasets (64,000,000 vertices and 128,000,000 edges). 
However, when I increase the dataset to 128,000,000 vertices and 
256,000,000 edges, the job takes a very long time to load the vertices 
and then fails with the exception below.

I have tried increasing the heap size and the task timeout value in 
the mapred-site.xml configuration file, and even varying the number of 
workers from 1 to 10, but I still get the same exception. I have a cluster 
of 10 nodes, and each node has 4 GB of RAM.  Thanks in advance.

2013-08-29 10:22:53,150 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet 
java.util.concurrent.FutureTask@1a129460
2013-08-29 10:22:53,151 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Waiting for 
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4
2013-08-29 10:23:07,938 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 7769685 vertices at 14250.953615591572 vertices/sec 15539370 edges at 
28500.77593053654 edges/sec Memory (free/total/max) = 680.21M / 3207.44M / 
3555.56M
2013-08-29 10:23:14,538 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8019685 vertices at 14533.557468366102 vertices/sec 16039370 edges at 
29065.97491865343 edges/sec Memory (free/total/max) = 906.80M / 3242.75M / 
3555.56M
2013-08-29 10:23:21,888 INFO org.apache.giraph.worker.InputSplitsCallable: 
loadFromInputSplit: Finished loading 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/9 (v=1212852, e=2425704)
2013-08-29 10:23:37,911 INFO org.apache.giraph.worker.InputSplitsHandler: 
reserveInputSplit: Reserved input split path 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19, overall roughly 
7.518797% input splits reserved
2013-08-29 10:23:37,923 INFO org.apache.giraph.worker.InputSplitsCallable: 
getInputSplit: Reserved 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19 from ZooKeeper and 
got input split 
'org.apache.giraph.io.formats.multi.InputSplitWithInputFormatIndex@24004559'
2013-08-29 10:23:44,313 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8482537 vertices at 14585.340134636266 vertices/sec 16965074 edges at 
29169.59449002283 edges/sec Memory (free/total/max) = 538.93M / 3186.13M / 
3555.56M
2013-08-29 10:23:49,963 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8732537 vertices at 14870.726503632277 vertices/sec 17465074 edges at 
29740.356341344923 edges/sec Memory (free/total/max) = 489.84M / 3222.56M / 
3555.56M
2013-08-29 10:34:28,371 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet 
java.util.concurrent.FutureTask@1a129460
2013-08-29 10:34:34,847 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Waiting for 
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4
2013-08-29 10:34:34,850 INFO 
org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server window 
metrics MBytes/sec sent = 0, MBytes/sec received = 0.0161, MBytesSent = 0.0002, 
MBytesReceived = 12.3175, ave sent req MBytes = 0, ave received req MBytes = 
0.0587, secs waited = 765.881
2013-08-29 10:34:35,698 INFO org.apache.zookeeper.ClientCnxn: Client session 
timed out, have not heard from server in 649805ms for sessionid 
0x140cb1140540006, closing socket connection and attempting reconnect
2013-08-29 10:34:42,471 WARN org.apache.giraph.bsp.BspService: process: 
Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent 
state:Disconnected type:None path:null
2013-08-29 10:34:42,472 WARN org.apache.giraph.worker.InputSplitsHandler: 
process: Problem with zookeeper, got event with path null, state Disconnected, 
event type None
2013-08-29 10:34:43,819 INFO

RE: Exception with Large Graphs

2013-08-30 Thread Yasser Altowim
Guys,

   Can someone please help me with this issue? Thanks.

Best,
Yasser

From: Yasser Altowim
Sent: Thursday, August 29, 2013 11:16 AM
To: user@giraph.apache.org
Subject: Exception with Large Graphs

Hi,

 I am implementing an algorithm using Giraph, and I was able to run it 
on relatively small datasets (64,000,000 vertices and 128,000,000 edges). 
However, when I increase the dataset to 128,000,000 vertices and 
256,000,000 edges, the job takes a very long time to load the vertices 
and then fails with the exception below.

I have tried increasing the heap size and the task timeout value in 
the mapred-site.xml configuration file, and even varying the number of 
workers from 1 to 10, but I still get the same exception. I have a cluster 
of 10 nodes, and each node has 4 GB of RAM.  Thanks in advance.

2013-08-29 10:22:53,150 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet 
java.util.concurrent.FutureTask@1a129460
2013-08-29 10:22:53,151 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Waiting for 
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4
2013-08-29 10:23:07,938 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 7769685 vertices at 14250.953615591572 vertices/sec 15539370 edges at 
28500.77593053654 edges/sec Memory (free/total/max) = 680.21M / 3207.44M / 
3555.56M
2013-08-29 10:23:14,538 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8019685 vertices at 14533.557468366102 vertices/sec 16039370 edges at 
29065.97491865343 edges/sec Memory (free/total/max) = 906.80M / 3242.75M / 
3555.56M
2013-08-29 10:23:21,888 INFO org.apache.giraph.worker.InputSplitsCallable: 
loadFromInputSplit: Finished loading 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/9 (v=1212852, e=2425704)
2013-08-29 10:23:37,911 INFO org.apache.giraph.worker.InputSplitsHandler: 
reserveInputSplit: Reserved input split path 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19, overall roughly 
7.518797% input splits reserved
2013-08-29 10:23:37,923 INFO org.apache.giraph.worker.InputSplitsCallable: 
getInputSplit: Reserved 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19 from ZooKeeper and 
got input split 
'org.apache.giraph.io.formats.multi.InputSplitWithInputFormatIndex@24004559'
2013-08-29 10:23:44,313 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8482537 vertices at 14585.340134636266 vertices/sec 16965074 edges at 
29169.59449002283 edges/sec Memory (free/total/max) = 538.93M / 3186.13M / 
3555.56M
2013-08-29 10:23:49,963 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8732537 vertices at 14870.726503632277 vertices/sec 17465074 edges at 
29740.356341344923 edges/sec Memory (free/total/max) = 489.84M / 3222.56M / 
3555.56M
2013-08-29 10:34:28,371 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet 
java.util.concurrent.FutureTask@1a129460
2013-08-29 10:34:34,847 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Waiting for 
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4
2013-08-29 10:34:34,850 INFO 
org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server window 
metrics MBytes/sec sent = 0, MBytes/sec received = 0.0161, MBytesSent = 0.0002, 
MBytesReceived = 12.3175, ave sent req MBytes = 0, ave received req MBytes = 
0.0587, secs waited = 765.881
2013-08-29 10:34:35,698 INFO org.apache.zookeeper.ClientCnxn: Client session 
timed out, have not heard from server in 649805ms for sessionid 
0x140cb1140540006, closing socket connection and attempting reconnect
2013-08-29 10:34:42,471 WARN org.apache.giraph.bsp.BspService: process: 
Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent 
state:Disconnected type:None path:null
2013-08-29 10:34:42,472 WARN org.apache.giraph.worker.InputSplitsHandler: 
process: Problem with zookeeper, got event with path null, state Disconnected, 
event type None
2013-08-29 10:34:43,819 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
connection to server slave5.ericsson-magic.net/10.126.72.165:22181
2013-08-29 10:34:44,077 INFO org.apache.zookeeper.ClientCnxn: Socket connection 
established to slave5.ericsson-magic.net/10.126.72.165:22181, initiating session
2013-08-29 10:34:44,220 WARN org.apache.giraph.bsp.BspService: process: Got 
unknown null path event WatchedEvent state:Expired type:None path:null
2013-08-29 10:34:44,220 WARN org.apache.giraph.worker.InputSplitsHandler: 
process: Problem with zookeeper, got event with path null, state Expired, event 
type None
2013-08-29

Exception with Large Graphs

2013-08-29 Thread Yasser Altowim
Hi,

 I am implementing an algorithm using Giraph, and I was able to run it 
on relatively small datasets (64,000,000 vertices and 128,000,000 edges). 
However, when I increase the dataset to 128,000,000 vertices and 
256,000,000 edges, the job takes a very long time to load the vertices 
and then fails with the exception below.

I have tried increasing the heap size and the task timeout value in 
the mapred-site.xml configuration file, and even varying the number of 
workers from 1 to 10, but I still get the same exception. I have a cluster 
of 10 nodes, and each node has 4 GB of RAM.  Thanks in advance.

2013-08-29 10:22:53,150 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460
2013-08-29 10:22:53,151 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Waiting for 
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4
2013-08-29 10:23:07,938 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 7769685 vertices at 14250.953615591572 vertices/sec 15539370 edges at 
28500.77593053654 edges/sec Memory (free/total/max) = 680.21M / 3207.44M / 
3555.56M
2013-08-29 10:23:14,538 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8019685 vertices at 14533.557468366102 vertices/sec 16039370 edges at 
29065.97491865343 edges/sec Memory (free/total/max) = 906.80M / 3242.75M / 
3555.56M
2013-08-29 10:23:21,888 INFO org.apache.giraph.worker.InputSplitsCallable: 
loadFromInputSplit: Finished loading 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/9 (v=1212852, e=2425704)
2013-08-29 10:23:37,911 INFO org.apache.giraph.worker.InputSplitsHandler: 
reserveInputSplit: Reserved input split path 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19, overall roughly 
7.518797% input splits reserved
2013-08-29 10:23:37,923 INFO org.apache.giraph.worker.InputSplitsCallable: 
getInputSplit: Reserved 
/_hadoopBsp/job_201308290837_0003/_vertexInputSplitDir/19 from ZooKeeper and 
got input split 
'org.apache.giraph.io.formats.multi.InputSplitWithInputFormatIndex@24004559'
2013-08-29 10:23:44,313 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8482537 vertices at 14585.340134636266 vertices/sec 16965074 edges at 
29169.59449002283 edges/sec Memory (free/total/max) = 538.93M / 3186.13M / 
3555.56M
2013-08-29 10:23:49,963 INFO 
org.apache.giraph.worker.VertexInputSplitsCallable: readVertexInputSplit: 
Loaded 8732537 vertices at 14870.726503632277 vertices/sec 17465074 edges at 
29740.356341344923 edges/sec Memory (free/total/max) = 489.84M / 3222.56M / 
3555.56M
2013-08-29 10:34:28,371 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460
2013-08-29 10:34:34,847 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Waiting for 
org.apache.giraph.utils.ProgressableUtils$FutureWaitable@30d320e4
2013-08-29 10:34:34,850 INFO 
org.apache.giraph.comm.netty.handler.RequestDecoder: decode: Server window 
metrics MBytes/sec sent = 0, MBytes/sec received = 0.0161, MBytesSent = 0.0002, 
MBytesReceived = 12.3175, ave sent req MBytes = 0, ave received req MBytes = 
0.0587, secs waited = 765.881
2013-08-29 10:34:35,698 INFO org.apache.zookeeper.ClientCnxn: Client session 
timed out, have not heard from server in 649805ms for sessionid 
0x140cb1140540006, closing socket connection and attempting reconnect
2013-08-29 10:34:42,471 WARN org.apache.giraph.bsp.BspService: process: 
Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent 
state:Disconnected type:None path:null
2013-08-29 10:34:42,472 WARN org.apache.giraph.worker.InputSplitsHandler: 
process: Problem with zookeeper, got event with path null, state Disconnected, 
event type None
2013-08-29 10:34:43,819 INFO org.apache.zookeeper.ClientCnxn: Opening socket 
connection to server slave5.ericsson-magic.net/10.126.72.165:22181
2013-08-29 10:34:44,077 INFO org.apache.zookeeper.ClientCnxn: Socket connection 
established to slave5.ericsson-magic.net/10.126.72.165:22181, initiating session
2013-08-29 10:34:44,220 WARN org.apache.giraph.bsp.BspService: process: Got 
unknown null path event WatchedEvent state:Expired type:None path:null
2013-08-29 10:34:44,220 WARN org.apache.giraph.worker.InputSplitsHandler: 
process: Problem with zookeeper, got event with path null, state Expired, event 
type None
2013-08-29 10:34:44,221 INFO org.apache.zookeeper.ClientCnxn: EventThread shut 
down
2013-08-29 10:34:44,240 INFO org.apache.zookeeper.ClientCnxn: Unable to 
reconnect to ZooKeeper service, session 0x140cb1140540006 has expired, closing 
socket connection
2013-08-29 10:35:35,442 INFO org.apache.giraph.utils.ProgressableUtils: 
waitFor: Future result not ready yet java.util.concurrent.FutureTask@1a129460
2013-08-29 10:35:35,443 INFO 

RE: MultiVertexInputFormat

2013-08-28 Thread Yasser Altowim
Thanks Maja for your response. That works but as I told you I had to modify the 
implementation of the MultiVertexInputFormat. I am posting my fix here in case 
someone runs into a similar problem.

  @Override
  public VertexReader<I, V, E> createVertexReader(InputSplit inputSplit,
      TaskAttemptContext context) throws IOException {
    if (inputSplit instanceof InputSplitWithInputFormatIndex) {
      // When multithreaded input is used we need to make sure other threads
      // don't change context's configuration while we use it
      synchronized (context) {
        InputSplitWithInputFormatIndex split =
            (InputSplitWithInputFormatIndex) inputSplit;
        VertexInputFormat<I, V, E> vertexInputFormat =
            vertexInputFormats.get(split.getInputFormatIndex());
        VertexReader<I, V, E> vertexReader =
            vertexInputFormat.createVertexReader(split.getSplit(), context);
        return new WrappedVertexReader<I, V, E>(
            vertexReader, vertexInputFormat.getConf()) {
          @Override
          public void initialize(InputSplit inputSplit,
              TaskAttemptContext context) throws IOException,
              InterruptedException {
            // When multithreaded input is used we need to make sure other
            // threads don't change context's configuration while we use it
            synchronized (context) {
              super.initialize(inputSplit, context);
            }
          }
        };
      }
    } else {
      throw new IllegalStateException("createVertexReader: Got InputSplit " +
          "which was not created by this class: " +
          inputSplit.getClass().getName());
    }
  }

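I changed the super.initialize(inputSplit, context) call inside the anonymous 
WrappedVertexReader above (shown in red in the original message) to the 
following:
super.initialize(((InputSplitWithInputFormatIndex) inputSplit).getSplit(),
    context);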
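
The point of that one-line change is that the wrapped reader should see the inner split, not the wrapper that carries the input format index. A minimal plain-Java model of the idea follows. It does not use Giraph or Hadoop: Split, FileLikeSplit, IndexedSplit and FileLikeReader are hypothetical stand-ins, and the ClassCastException shown for the unfixed path is an assumption about how such a reader could fail, since the message does not say which error was actually seen.

```java
public class UnwrapSplitSketch {
    /** Hypothetical stand-in for a Hadoop InputSplit. */
    interface Split { }

    /** Hypothetical stand-in for a concrete split that a reader knows how to read. */
    static final class FileLikeSplit implements Split {
        final String path;
        FileLikeSplit(String path) { this.path = path; }
    }

    /** Hypothetical stand-in for InputSplitWithInputFormatIndex: an inner split plus an index. */
    static final class IndexedSplit implements Split {
        private final Split inner;
        private final int inputFormatIndex;
        IndexedSplit(Split inner, int inputFormatIndex) {
            this.inner = inner;
            this.inputFormatIndex = inputFormatIndex;
        }
        Split getSplit() { return inner; }
        int getInputFormatIndex() { return inputFormatIndex; }
    }

    /** Hypothetical reader that only understands its own split type. */
    static final class FileLikeReader {
        String initialize(Split split) {
            // A concrete reader typically casts the split to the type it expects.
            return ((FileLikeSplit) split).path;
        }
    }

    /** Unwraps the split before delegating, mirroring the one-line fix. */
    public static String initializeUnwrapped(FileLikeReader reader, Split split) {
        Split target = (split instanceof IndexedSplit)
            ? ((IndexedSplit) split).getSplit()
            : split;
        return reader.initialize(target);
    }

    public static void main(String[] args) {
        Split wrapped = new IndexedSplit(new FileLikeSplit("file1.txt"), 0);
        FileLikeReader reader = new FileLikeReader();
        try {
            reader.initialize(wrapped); // wrapper passed straight through
        } catch (ClassCastException e) {
            System.out.println("without unwrapping: ClassCastException");
        }
        System.out.println("with unwrapping: " + initializeUnwrapped(reader, wrapped));
    }
}
```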


Best,
Yasser

From: Maja Kabiljo [mailto:majakabi...@fb.com]
Sent: Wednesday, August 21, 2013 8:24 PM
To: user@giraph.apache.org
Subject: Re: MultiVertexInputFormat

Hi Yasser,

You can do this through the Configuration parameters. You should call:
description1.addParameter("myApplication.vertexInputPath", "file1.txt");
and
description2.addParameter("myApplication.vertexInputPath", "file2.txt");
Then, in the code of your InputFormat class, you can read this parameter from 
the Configuration. If it doesn't already, make sure your InputFormat implements 
ImmutableClassesGiraphConfigurable, so that the configuration is set on it 
automatically.
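
The idea in the paragraph above is that each description carries its own key/value pairs, so the same key can hold a different path for each input format. A minimal plain-Java model of that idea is sketched here. It is not Giraph code: Description and configurationFor are hypothetical names, plain maps stand in for the Hadoop Configuration, and "shared.option" is a made-up key. Only the key "myApplication.vertexInputPath" and the two file names come from the message.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class PerInputParameters {
    /** Hypothetical stand-in for an input format description with its own parameters. */
    static final class Description {
        final String formatName;
        final Map<String, String> parameters = new LinkedHashMap<>();
        Description(String formatName) { this.formatName = formatName; }
        void addParameter(String name, String value) { parameters.put(name, value); }
    }

    /** Copies the shared settings and overlays one description's parameters on the copy. */
    public static Map<String, String> configurationFor(Map<String, String> shared, Description d) {
        Map<String, String> copy = new HashMap<>(shared);
        copy.putAll(d.parameters);
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new HashMap<>();
        shared.put("shared.option", "same-for-all"); // hypothetical shared setting

        Description description1 = new Description("UseCase1FirstVertexInputFormat");
        description1.addParameter("myApplication.vertexInputPath", "file1.txt");
        Description description2 = new Description("UseCase1SecondVertexInputFormat");
        description2.addParameter("myApplication.vertexInputPath", "file2.txt");

        // Each input format sees its own value under the same key.
        System.out.println(configurationFor(shared, description1).get("myApplication.vertexInputPath"));
        System.out.println(configurationFor(shared, description2).get("myApplication.vertexInputPath"));
    }
}
```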

You can also take a look at HiveGiraphRunner, which uses multiple inputs and 
sets the parameters that the user passes on the command line.

Hope this helps,
Maja

From: Yasser Altowim <yasser.alto...@ericsson.com>
Reply-To: user@giraph.apache.org
Date: Monday, August 19, 2013 9:16 AM
To: user@giraph.apache.org
Subject: RE: MultiVertexInputFormat

Hi Guys,

 Any help on this will be appreciated. I am repeating my question and my 
code below:


I am implementing an algorithm in Giraph that reads the vertex values from two 
input files, each with its own format. I am not using any EdgeInputFormatClass. 
I am now using VertexInputFormatDescription along with MultiVertexInputFormat, 
but I still could not figure out how to set the vertex input path for each input 
format class. Can you please take a look at my code below and show me how to 
set the vertex input path? I have taken a look at HiveGiraphRunner but still had 
no luck. Thanks.

if (null == getConf()) {
  conf = new Configuration();
}

GiraphConfiguration gconf = new GiraphConfiguration(getConf());
int workers = Integer.parseInt(arg0[2]);
gconf.setWorkerConfiguration(workers, workers, 100.0f);

List<VertexInputFormatDescription> vertexInputDescriptions =
    Lists.newArrayList();

// Input one
VertexInputFormatDescription description1 = new
    VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
// how to set the vertex input path? i.e. how to say that I want to read
// file1.txt using this input format class
vertexInputDescriptions.add(description1);

// Input two
VertexInputFormatDescription description2 = new
    VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
// how to set the vertex input path?
vertexInputDescriptions.add(description2);

GiraphConstants.VERTEX_INPUT_FORMAT_CLASS.set(gconf,
    MultiVertexInputFormat.class);

VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
    InputFormatDescription.toJsonString(vertexInputDescriptions));

gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
gconf.setComputationClass(UseCase1Vertex.class);
GiraphJob job = new GiraphJob(gconf, "Use Case 1");
FileOutputFormat.setOutputPath

RE: MultiVertexInputFormat

2013-08-19 Thread Yasser Altowim
Hi Guys,

 Any help on this will be appreciated. I am repeating my question and my 
code below:


I am implementing an algorithm in Giraph that reads the vertex values from two 
input files, each with its own format. I am not using any EdgeInputFormatClass. 
I am now using VertexInputFormatDescription along with MultiVertexInputFormat, 
but I still could not figure out how to set the vertex input path for each input 
format class. Can you please take a look at my code below and show me how to 
set the vertex input path? I have taken a look at HiveGiraphRunner but still had 
no luck. Thanks.

if (null == getConf()) {
  conf = new Configuration();
}

GiraphConfiguration gconf = new GiraphConfiguration(getConf());
int workers = Integer.parseInt(arg0[2]);
gconf.setWorkerConfiguration(workers, workers, 100.0f);

List<VertexInputFormatDescription> vertexInputDescriptions =
    Lists.newArrayList();

// Input one
VertexInputFormatDescription description1 = new
    VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
// how to set the vertex input path? i.e. how to say that I want to read
// file1.txt using this input format class
vertexInputDescriptions.add(description1);

// Input two
VertexInputFormatDescription description2 = new
    VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
// how to set the vertex input path?
vertexInputDescriptions.add(description2);

GiraphConstants.VERTEX_INPUT_FORMAT_CLASS.set(gconf,
    MultiVertexInputFormat.class);

VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
    InputFormatDescription.toJsonString(vertexInputDescriptions));

gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
gconf.setComputationClass(UseCase1Vertex.class);
GiraphJob job = new GiraphJob(gconf, "Use Case 1");
FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
return job.run(true) ? 0 : -1;


Thanks in advance.

Best,
Yasser

From: Yasser Altowim [mailto:yasser.alto...@ericsson.com]
Sent: Friday, August 16, 2013 11:36 AM
To: user@giraph.apache.org
Subject: RE: MultiVertexInputFormat

Thanks a lot, Avery, for your response. I am now using 
VertexInputFormatDescription, but I still could not figure out how to set the 
vertex input path. I just need to read the vertex values from two different 
files, each with its own format. I am not using any EdgeInputFormatClass.

 Can you please take a look at my code below and show me how to set the 
vertex input path? Thanks.


   if (null == getConf()) {
     conf = new Configuration();
   }

   GiraphConfiguration gconf = new GiraphConfiguration(getConf());
   int workers = Integer.parseInt(arg0[2]);
   gconf.setWorkerConfiguration(workers, workers, 100.0f);

   List<VertexInputFormatDescription> vertexInputDescriptions =
       Lists.newArrayList();

   // Input one
   VertexInputFormatDescription description1 = new
       VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
   // how to set the vertex input path?
   vertexInputDescriptions.add(description1);

   // Input two
   VertexInputFormatDescription description2 = new
       VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
   // how to set the vertex input path?
   vertexInputDescriptions.add(description2);

   VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
       InputFormatDescription.toJsonString(vertexInputDescriptions));

   gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
   gconf.setComputationClass(UseCase1Vertex.class);
   GiraphJob job = new GiraphJob(gconf, "Use Case 1");
   FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
   return job.run(true) ? 0 : -1;



Best,
Yasser

From: Avery Ching [mailto:ach...@apache.org]
Sent: Friday, August 16, 2013 9:50 AM
To: user@giraph.apache.org
Subject: Re: MultiVertexInputFormat

This is doable in Giraph; you can use as many vertex or edge input formats as 
you like (via GIRAPH-639).  You just need to choose MultiVertexInputFormat 
and/or MultiEdgeInputFormat.

See VertexInputFormatDescription for vertex input formats

  /**
   * VertexInputFormats description - JSON array containing a JSON array for
   * each vertex input. Vertex input JSON arrays contain one or two elements -
   * first one is the name of vertex input class, and second one is JSON object
   * with all specific parameters for this vertex input. For example:
   * [["VIF1",{"p":"v1"}],["VIF2",{"p":"v2","q":"v"}]]
   */
  public static final StrConfOption VERTEX_INPUT_FORMAT_DESCRIPTIONS =
      new StrConfOption("giraph.multiVertexInput.descriptions", null,
          "VertexInputFormats description - JSON array containing

RE: MultiVertexInputFormat

2013-08-16 Thread Yasser Altowim
Thanks a lot, Avery, for your response. I am now using 
VertexInputFormatDescription, but I still could not figure out how to set the 
vertex input path. I just need to read the vertex values from two different 
files, each with its own format. I am not using any EdgeInputFormatClass.

 Can you please take a look at my code below and show me how to set the 
vertex input path? Thanks.


   if (null == getConf()) {
     conf = new Configuration();
   }

   GiraphConfiguration gconf = new GiraphConfiguration(getConf());
   int workers = Integer.parseInt(arg0[2]);
   gconf.setWorkerConfiguration(workers, workers, 100.0f);

   List<VertexInputFormatDescription> vertexInputDescriptions =
       Lists.newArrayList();

   // Input one
   VertexInputFormatDescription description1 = new
       VertexInputFormatDescription(UseCase1FirstVertexInputFormat.class);
   // how to set the vertex input path?
   vertexInputDescriptions.add(description1);

   // Input two
   VertexInputFormatDescription description2 = new
       VertexInputFormatDescription(UseCase1SecondVertexInputFormat.class);
   // how to set the vertex input path?
   vertexInputDescriptions.add(description2);

   VertexInputFormatDescription.VERTEX_INPUT_FORMAT_DESCRIPTIONS.set(gconf,
       InputFormatDescription.toJsonString(vertexInputDescriptions));

   gconf.setVertexOutputFormatClass(UseCase1OutputFormat.class);
   gconf.setComputationClass(UseCase1Vertex.class);
   GiraphJob job = new GiraphJob(gconf, "Use Case 1");
   FileOutputFormat.setOutputPath(job.getInternalJob(), new Path(arg0[1]));
   return job.run(true) ? 0 : -1;



Best,
Yasser

From: Avery Ching [mailto:ach...@apache.org]
Sent: Friday, August 16, 2013 9:50 AM
To: user@giraph.apache.org
Subject: Re: MultiVertexInputFormat

This is doable in Giraph; you can use as many vertex or edge input formats as 
you like (via GIRAPH-639).  You just need to choose MultiVertexInputFormat 
and/or MultiEdgeInputFormat.

See VertexInputFormatDescription for vertex input formats

  /**
   * VertexInputFormats description - JSON array containing a JSON array for
   * each vertex input. Vertex input JSON arrays contain one or two elements -
   * first one is the name of vertex input class, and second one is JSON object
   * with all specific parameters for this vertex input. For example:
   * [["VIF1",{"p":"v1"}],["VIF2",{"p":"v2","q":"v"}]]
   */
  public static final StrConfOption VERTEX_INPUT_FORMAT_DESCRIPTIONS =
      new StrConfOption("giraph.multiVertexInput.descriptions", null,
          "VertexInputFormats description - JSON array containing a JSON " +
          "array for each vertex input. Vertex input JSON arrays contain " +
          "one or two elements - first one is the name of vertex input " +
          "class, and second one is JSON object with all specific parameters " +
          "for this vertex input. For example: [[\"VIF1\",{\"p\":\"v1\"}]," +
          "[\"VIF2\",{\"p\":\"v2\",\"q\":\"v\"}]]\"");

See EdgeInputFormatDescription for edge input formats

  /**
   * EdgeInputFormats description - JSON array containing a JSON array for
   * each edge input. Edge input JSON arrays contain one or two elements -
   * first one is the name of edge input class, and second one is JSON object
   * with all specific parameters for this edge input. For example:
   * [["EIF1",{"p":"v1"}],["EIF2",{"p":"v2","q":"v"}]]
   */
  public static final StrConfOption EDGE_INPUT_FORMAT_DESCRIPTIONS =
      new StrConfOption("giraph.multiEdgeInput.descriptions", null,
          "EdgeInputFormats description - JSON array containing a JSON array " +
          "for each edge input. Edge input JSON arrays contain one or two " +
          "elements - first one is the name of edge input class, and second " +
          "one is JSON object with all specific parameters for this edge " +
          "input. For example: [[\"EIF1\",{\"p\":\"v1\"}]," +
          "[\"EIF2\",{\"p\":\"v2\",\"q\":\"v\"}]]");
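
To make the description format concrete, here is a small plain-Java sketch that builds a string of the shape described in the Javadoc above: an outer JSON array with one inner array per input, holding the class name and, optionally, an object of parameters. It is only an illustration of the format. The class DescriptionJson and its methods are hypothetical, and this is not how Giraph itself produces the string (the driver code elsewhere in this thread uses InputFormatDescription.toJsonString for that).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DescriptionJson {
    /** Wraps a string in double quotes, escaping backslashes and quotes. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** One input: ["ClassName"] or ["ClassName",{"key":"value",...}]. */
    public static String entry(String className, Map<String, String> params) {
        StringBuilder sb = new StringBuilder("[").append(quote(className));
        if (!params.isEmpty()) {
            sb.append(",{");
            boolean first = true;
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (!first) {
                    sb.append(",");
                }
                sb.append(quote(e.getKey())).append(":").append(quote(e.getValue()));
                first = false;
            }
            sb.append("}");
        }
        return sb.append("]").toString();
    }

    /** All inputs: an outer array of the per-input arrays. */
    public static String descriptions(List<String> entries) {
        return "[" + String.join(",", entries) + "]";
    }

    public static void main(String[] args) {
        Map<String, String> p1 = new LinkedHashMap<>();
        p1.put("p", "v1");
        Map<String, String> p2 = new LinkedHashMap<>();
        p2.put("p", "v2");
        p2.put("q", "v");

        List<String> entries = new ArrayList<>();
        entries.add(entry("VIF1", p1));
        entries.add(entry("VIF2", p2));
        // prints [["VIF1",{"p":"v1"}],["VIF2",{"p":"v2","q":"v"}]]
        System.out.println(descriptions(entries));
    }
}
```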

Hope that helps,

Avery

On 8/16/13 8:45 AM, Yasser Altowim wrote:
Guys, any help with this will be appreciated. Thanks.

From: Yasser Altowim [mailto:yasser.alto...@ericsson.com]
Sent: Thursday, August 15, 2013 2:07 PM
To: user@giraph.apache.org
Subject: MultiVertexInputFormat

Hi,

 I am implementing an algorithm using Giraph. My algorithm needs 
to read input data from two files, each with its own format. My questions are:


1.   How can I use the MultiVertexInputFormat class? Is there any example 
that shows how this class can be used?

2.   How can I specify this class when running my job using the Giraph 
Runner or a driver class?

Thanks in advance.

Best,
Yasser




MultiVertexInputFormat

2013-08-15 Thread Yasser Altowim
Hi,

 I am implementing an algorithm using Giraph. My algorithm needs 
to read input data from two files, each with its own format. My questions are:


1.   How can I use the MultiVertexInputFormat class? Is there any example 
that shows how this class can be used?

2.   How can I specify this class when running my job using the Giraph 
Runner or a driver class?

Thanks in advance.

Best,
Yasser