Vertex with no outgoing edges doesn't execute Compute in Superstep 0

2014-08-28 Thread Sardeshmukh, Vivek
Hello all,


I was playing around with the shortest-path example. I decided to write my own 
input format to match SNAP's LJ dataset 
(http://snap.stanford.edu/data/soc-LiveJournal1.html). This is an edge format, 
so I wrote LongFloatTextEdgeInputFormat (which is similar to 
IntNullTextEdgeInputFormat in the io/formats directory).

To test this I used the following input:

0 1
0 2
0 3
1 2
2 0
2 3

All edges have weight 1.

I ran the ShortestPath example with this edge input format and the above input 
file, and got the following output:

0 2.0
1 0
2 1.0
3 0


Notice that for vertex 3 the value should be 2 (the source is vertex 1). I 
thought there was some problem with my input format, so I went back to the 
vertex input format specified in the quick start guide. Here is my vertex input 
in JSON format:

[0,0,[[1,1],[2,1],[3,1]]]
[1,0,[[2,1]]]
[2,0,[[3,1],[0,1]]]


Notice that vertex 3 doesn't have any outgoing edges, so I didn't add an entry 
for it. Even with this I got the same output. Then I enabled debug logging and 
found that vertex 3 doesn't execute Superstep 0 at all. It only executed 
Supersteps 3 and 4 (in Superstep 3 it receives a message from vertex 2). Also, 
in Superstep 3 it shows vertex value = 0.

Does this mean that vertices with no outgoing edges are not active at the 
beginning? Is there any way to fix this?


A quick and dirty fix would be to add a line to the vertex input file: 
[3, 0, []]

But what if I don't want to use the vertex input format and want to use the 
edge input format described above?
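
To make the symptom concrete, here is a small plain-Java simulation of the run described above. It is only a sketch and does not use the Giraph API: the class name LazyVertexDemo, the run method, and the lazyDefault parameter are made up for illustration. It assumes that only vertices listed as edge sources exist in Superstep 0 (where they reset their value to "infinity"), and that a vertex missing from the input is created when its first message arrives, with whatever default value it is given.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation of superstep-style shortest paths; NOT Giraph code.
class LazyVertexDemo {

    /**
     * edges: {source, target} pairs, each with weight 1.
     * Assumption: only edge sources exist in superstep 0; any other vertex is
     * created on its first incoming message with the value lazyDefault.
     */
    static Map<Long, Double> run(long[][] edges, long source, double lazyDefault) {
        Map<Long, List<Long>> out = new HashMap<>();
        for (long[] e : edges) {
            out.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }
        Map<Long, Double> value = new TreeMap<>();
        // Superstep 0: every loaded vertex resets its value to "infinity".
        for (long v : out.keySet()) {
            value.put(v, Double.MAX_VALUE);
        }
        Map<Long, Double> inbox = new HashMap<>();
        inbox.put(source, 0.0);
        while (!inbox.isEmpty()) {
            Map<Long, Double> next = new HashMap<>();
            for (Map.Entry<Long, Double> m : inbox.entrySet()) {
                long v = m.getKey();
                // A vertex absent from the input is created here, after superstep 0.
                value.putIfAbsent(v, lazyDefault);
                if (m.getValue() < value.get(v)) {
                    value.put(v, m.getValue());
                    for (long t : out.getOrDefault(v, Collections.<Long>emptyList())) {
                        next.merge(t, m.getValue() + 1.0, Math::min);
                    }
                }
            }
            inbox = next;
        }
        return value;
    }

    public static void main(String[] args) {
        long[][] edges = {{0, 1}, {0, 2}, {0, 3}, {1, 2}, {2, 0}, {2, 3}};
        System.out.println("default 0.0:       " + run(edges, 1L, 0.0));
        System.out.println("default MAX_VALUE: " + run(edges, 1L, Double.MAX_VALUE));
    }
}
```

Under these assumptions, a default of 0.0 leaves vertex 3 stuck at 0 because no incoming distance is ever smaller, while a default of Double.MAX_VALUE lets it settle at 2. That suggests looking at how the late-created vertex gets its initial value rather than at the input format itself.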


Thank you.

Sincerely,
Vivek


RE: giraph 1.1.0 Execution Error

2014-08-07 Thread Sardeshmukh, Vivek
Hi Xenia, 

I think there is some problem with ZooKeeper. Can you make sure that the 
ZooKeeper server is running? If it is running, is it on port 22181 (your 
Giraph job is trying to connect on this port)? If ZooKeeper is running on a 
different port, try running your Giraph job with 
-Dgiraph.zkList=<zookeeper server ip>:<zookeeper port> 

I'm not sure whether you have to start an instance of ZooKeeper separately or 
whether Giraph will start one for you. I have a separate instance running on my 
cluster, and I specify the server and port via the -Dgiraph.zkList option. 
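
One quick way to tell "nothing is listening" apart from a Giraph-side problem is to try opening a TCP connection to the host and port from the log. The sketch below uses only the Java standard library; the class name ZkPortCheck and the canConnect helper are made up for illustration, and the default port 22181 is simply the one that appears in the log. A successful connection only shows that something accepts connections on that port, not that it is a healthy ZooKeeper server.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP connection to host:port succeeds.
class ZkPortCheck {

    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused", timeouts, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the real host and port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 22181;
        System.out.println(host + ":" + port + " reachable: "
                + canConnect(host, port, 2000));
    }
}
```

Running it from one of the worker nodes against the host in the log (for example, java ZkPortCheck DataNode2 22181) would show whether that node can reach the port at all.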

I hope that works. 

Vivek


From: xeniad20 xenia...@gmail.com
Sent: Thursday, August 7, 2014 3:46 PM
To: user@giraph.apache.org
Subject: giraph 1.1.0 Execution Error

Hi experts,

I am trying to run Giraph 1.1.0 on a small cluster, but I get the
following errors:

2014-08-07 23:35:46,141 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server DataNode2/10.190.12.33:22181. Will not attempt to authenticate using SASL (unknown error)
2014-08-07 23:35:46,142 WARN org.apache.zookeeper.ClientCnxn: Session 0x147b22ebf420001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
 at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2014-08-07 23:35:46,243 WARN org.apache.giraph.zk.ZooKeeperExt: deleteExt: Connection loss on attempt 2, waiting 5000 msecs before retrying.
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /_hadoopBsp/job_201408072332_0003/_applicationAttemptsDir/0/_superstepDir/1/_workerHealthyDir/datanode1_1
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873)
 at org.apache.giraph.zk.ZooKeeperExt.deleteExt(ZooKeeperExt.java:302)
 at org.apache.giraph.worker.BspServiceWorker.unregisterHealth(BspServiceWorker.java:768)
 at org.apache.giraph.worker.BspServiceWorker.failureCleanup(BspServiceWorker.java:782)
 at org.apache.giraph.graph.GraphTaskManager.workerFailureCleanup(GraphTaskManager.java:900)
 at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:100)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
 at org.apache.hadoop.mapred.Child.main(Child.java:249)
2014-08-07 23:35:48,126 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server DataNode2/10.190.12.33:22181. Will not attempt to authenticate using SASL (unknown error)
2014-08-07 23:35:48,127 WARN org.apache.zookeeper.ClientCnxn: Session 0x147b22ebf420001 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
 at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
 at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
2014-08-07 23:35:49,368 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread Thread-12, msg = createExt: Failed to create /_hadoopBsp/job_201408072332_0003/_workerProgresses/1 after 3 tries!, exiting...
java.lang.IllegalStateException: createExt: Failed to create /_hadoopBsp/job_201408072332_0003/_workerProgresses/1 after 3 tries!
 at org.apache.giraph.zk.ZooKeeperExt.createExt(ZooKeeperExt.java:182)
 at org.apache.giraph.zk.ZooKeeperExt.createOrSetExt(ZooKeeperExt.java:247)
 at org.apache.giraph.worker.WorkerProgress.writeToZnode(WorkerProgress.java:110)
 at org.apache.giraph.worker.WorkerProgressWriter$1.run(WorkerProgressWriter.java:59)
 at java.lang.Thread.run(Thread.java:724)

However, Giraph 1.0.0 runs without any problems.
What might be the solution for the above errors?

Any help is appreciated.

Thanks
Xenia


RE: Setting variable value in Compute class and using it in the next superstep

2014-07-23 Thread Sardeshmukh, Vivek
Hello again,


As Tom and Matthew suggested, I wrote my own custom vertex value class and 
input format class. I followed Matthew's example to create my own custom vertex 
class, but now I'm getting the following error while running the program:


java.lang.IllegalStateException: newInstance: Illegal access org.apache.giraph.examples.DeltaVertexWritable
 at org.apache.giraph.utils.ReflectionUtils.newInstance(ReflectionUtils.java:84)
 at org.apache.giraph.utils.WritableUtils.createWritable(WritableUtils.java:68)
 at org.apache.giraph.factories.DefaultVertexValueFactory.newInstance(DefaultVertexValueFactory.java:48)
 at org.apache.giraph.conf.ImmutableClassesGiraphConfiguration.createVertexValue(ImmutableClassesGiraphConfiguration.java:729)
 at org.apache.giraph.utils.VertexIterator.resetEmptyVertex(VertexIterator.java:69)
 at org.apache.giraph.utils.VertexIterator.init(VertexIterator.java:60)
 at org.apache.giraph.comm.requests.SendWorkerVerticesRequest.doRequest(SendWorkerVerticesRequest.java:108)
 at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.doRequest(NettyWorkerClientRequestProcessor.java:466)
 at org.apache.giraph.comm.netty.NettyWorkerClientRequestProcessor.flush(NettyWorkerClientRequestProcessor.java:412)
 at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:241)
 at org.apache.giraph.worker.InputSplitsCallable.call(InputSplitsCallable.java:60)
 at org.apache.giraph.utils.LogStacktraceCallable.call(LogStacktraceCallable.java:51)
 at java.util.concurrent.FutureTask.run(FutureTask.java:262)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
 at java.lang.Thread.run(Thread.java:744)





Here is my DeltaVertexWritable class - 
https://gist.github.com/sar-vivek/df09cca17cc3f6b5ac60


I tried digging a bit but didn't have any success [in the first place, I didn't 
even understand the error message!]
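
For what it's worth, the trace shows the value object being created reflectively (ReflectionUtils.newInstance), and in plain Java an "illegal access" during reflective creation typically means the class or its no-argument constructor is not accessible to the caller. Whether that is what happens with DeltaVertexWritable is only a guess, since the gist is not reproduced here. The sketch below shows the general Java behaviour with two made-up stand-in classes; none of these names come from Giraph or from the gist.

```java
import java.lang.reflect.InvocationTargetException;

// Hypothetical stand-ins for a vertex value class; NOT the class from the gist.
class ValueWithPrivateConstructor {
    private ValueWithPrivateConstructor() { }
}

class ValueWithPublicConstructor {
    public ValueWithPublicConstructor() { }
}

class ReflectiveCreateDemo {

    // Tries to create an instance through the no-argument constructor.
    static String tryCreate(Class<?> type) {
        try {
            type.getDeclaredConstructor().newInstance();
            return "created";
        } catch (IllegalAccessException e) {
            // The constructor exists but the caller may not invoke it.
            return "illegal access";
        } catch (NoSuchMethodException | InstantiationException
                | InvocationTargetException e) {
            return "failed: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryCreate(ValueWithPrivateConstructor.class));
        System.out.println(tryCreate(ValueWithPublicConstructor.class));
    }
}
```

If this is the cause, making the value class public and giving it a public no-argument constructor would be the first thing to try.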



Thank you.

Vivek

From: Sardeshmukh, Vivek vivek-sardeshm...@uiowa.edu
Sent: Monday, July 21, 2014 6:06 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next 
superstep


Thank you Matthew. Now writing a custom vertex class and input format seems 
doable! Thank you.



--
Vivek

From: Matthew Saltz sal...@gmail.com
Sent: Monday, July 21, 2014 5:50 PM
To: user@giraph.apache.org
Subject: Re: Setting variable value in Compute class and using it in the next 
superstep

Yeah, that's true. Sorry, I forgot that part. Luckily, it isn't too tricky 
either, depending on the input format of your graph. Here's another example 
(https://gist.github.com/saltzm/ab7172c57dec927061be) to get you started, for 
a very simple input format for edges with no values. I basically took the code 
straight from LongLongNullTextInputFormat 
(http://giraph.apache.org/apidocs/org/apache/giraph/io/formats/LongLongNullTextInputFormat.html) 
and modified it where I needed to so that it returns the InputFormat that I 
needed for my code. You'll probably be better off digging through some of the 
already implemented InputFormat classes that come with Giraph to do something 
similar, since I'm guessing your input files will be different from mine. Take 
a look at the subclasses of TextVertexInputFormat 
(http://giraph.apache.org/apidocs/org/apache/giraph/io/formats/TextVertexInputFormat.html), 
since they deal with a lot of common input format styles, and see if you can 
modify their code to work with your custom vertex data format. Now, the example 
I give you is also easy because I just use the default constructor of the 
class, but if you need to load additional data from the file into your vertex 
data and the default constructor isn't appropriate, you may have to do some 
extra parsing and legwork for that.

Best of luck,
Matthew



On Tue, Jul 22, 2014 at 12:28 AM, Sardeshmukh, Vivek 
vivek-sardeshm...@uiowa.edu wrote:

Thank you Matthew for the example link. It is helpful. I'll give it a shot.


If I have a custom vertex class, isn't it necessary to change the 
VertexInputFormat class too? This class loads the data into the vertex, and if 
the vertex has a custom value field then it doesn't know how to load the input. 
Am I right?


Vivek

From: Schweiger, Tom thschwei...@ebay.com
Sent: Monday, July 21, 2014 5:16 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next 
superstep


For more than one flag, a custom class is necessary (unless you're able to, 
say, toggle the sign bit to get double usage out of a value).
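
A minimal sketch of the sign-bit idea, assuming the stored value is a non-negative distance held in a double: the magnitude carries the distance and the sign bit carries one boolean flag. The class and method names are made up for illustration. Math.copySign and a raw bit test are used so that a distance of 0.0 can carry the flag too (as -0.0).

```java
// Hypothetical helper: packs one boolean flag into the sign bit of a
// non-negative double, so a single double value can carry both.
class SignBitFlag {

    static double encode(double distance, boolean flag) {
        if (Double.isNaN(distance) || distance < 0) {
            throw new IllegalArgumentException("distance must be non-negative");
        }
        return Math.copySign(distance, flag ? -1.0 : 1.0);
    }

    static boolean flag(double encoded) {
        // Test the raw sign bit so that -0.0 is also seen as "flag set".
        return (Double.doubleToRawLongBits(encoded) & Long.MIN_VALUE) != 0L;
    }

    static double distance(double encoded) {
        return Math.abs(encoded);
    }

    public static void main(String[] args) {
        double packed = encode(2.5, true);
        System.out.println(distance(packed) + " flag=" + flag(packed));
    }
}
```

This only works for one flag and only while the value itself never needs to be negative, which is why more flags call for a custom class.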

I've started a private thread with Vivek to get a better understanding of what 
he was trying to solve.

And you are also correct that there isn't much

Setting variable value in Compute class and using it in the next superstep

2014-07-21 Thread Sardeshmukh, Vivek
Hi, all--


In my algorithm, I need to set a flag if certain conditions hold (locally at a 
vertex v). If this flag is set, then execute some other block of code *only 
once*, and do nothing until some other condition holds.


My question is: can I declare a flag variable in the class where I override the 
compute function? I defined the flag as a public variable and set it once the 
conditions are met, but it seems the value is not carried over to the next 
superstep.

I dug a little in this mailing list and found this:

https://www.mail-archive.com/user@giraph.apache.org/msg01266.html


This post also suggests (along with what I described above) to have a field in 
the vertex value itself. For that I need to change the vertex input format and 
also create my own custom vertex class. Is it really necessary?
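
For reference, a value class that carries a flag next to the distance could look roughly like the sketch below. To keep it runnable without Hadoop on the classpath it does not implement org.apache.hadoop.io.Writable; it only mirrors that interface's two methods, write(DataOutput) and readFields(DataInput). The class name FlaggedDistance, its field names, and the roundTrip helper are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical vertex value: a distance plus one flag. In a real job this
// would implement Hadoop's Writable; here it only mirrors its two methods.
class FlaggedDistance {
    private double distance;
    private boolean heavyEdgesRelaxed;

    // Public no-argument constructor, so the class can be created reflectively.
    public FlaggedDistance() {
        this(Double.MAX_VALUE, false);
    }

    public FlaggedDistance(double distance, boolean heavyEdgesRelaxed) {
        this.distance = distance;
        this.heavyEdgesRelaxed = heavyEdgesRelaxed;
    }

    public double getDistance() { return distance; }
    public boolean isHeavyEdgesRelaxed() { return heavyEdgesRelaxed; }
    public void setDistance(double distance) { this.distance = distance; }
    public void setHeavyEdgesRelaxed(boolean relaxed) { this.heavyEdgesRelaxed = relaxed; }

    public void write(DataOutput out) throws IOException {
        out.writeDouble(distance);
        out.writeBoolean(heavyEdgesRelaxed);
    }

    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        distance = in.readDouble();
        heavyEdgesRelaxed = in.readBoolean();
    }

    // Serializes a value and reads it back into a fresh instance.
    static FlaggedDistance roundTrip(FlaggedDistance value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        value.write(new DataOutputStream(bytes));
        FlaggedDistance copy = new FlaggedDistance();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        FlaggedDistance copy = roundTrip(new FlaggedDistance(3.0, true));
        System.out.println(copy.getDistance() + " " + copy.isHeavyEdgesRelaxed());
    }
}
```

The point of the round trip is that anything stored in the vertex value survives serialization between supersteps, whereas a field on the compute class is not tied to any particular vertex.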


By the way, I am using Giraph 1.1.0 compiled against Hadoop 1.0.3. I was able 
to run SimpleShortestPathComputation successfully.


Here are more technical details of my algorithm: I am trying to implement the 
Delta-stepping shortest path algorithm ( 
http://dl.acm.org/citation.cfm?id=740136 or 
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.2200 ). This was 
mentioned in the Pregel paper. A vertex relaxes its light edges if it belongs 
to the minimum bucket index (of course, aggregators!). Once a vertex is done 
relaxing light edges, it relaxes its heavy edges once (here is where I need a 
flag). A vertex may be re-inserted into a newer bucket and may have to execute 
all the steps described here again.
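
For readers unfamiliar with the terms, the two basic quantities can be sketched as follows, using the usual delta-stepping definitions: a vertex with tentative distance d belongs to bucket floor(d / delta), and an edge is light if its weight is at most delta and heavy otherwise. The class and method names are made up for illustration, and delta is assumed to be positive.

```java
// Hypothetical helpers for the two basic delta-stepping quantities.
class DeltaSteppingHelpers {

    // Bucket i holds vertices with tentative distance in [i*delta, (i+1)*delta).
    static long bucketIndex(double tentativeDistance, double delta) {
        return (long) Math.floor(tentativeDistance / delta);
    }

    // Light edges have weight <= delta; all other edges are heavy.
    static boolean isLight(double weight, double delta) {
        return weight <= delta;
    }

    public static void main(String[] args) {
        double delta = 2.0;
        System.out.println("bucket of 7.5: " + bucketIndex(7.5, delta));
        System.out.println("weight 2.0 light? " + isLight(2.0, delta));
    }
}
```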


Thanks.


Sincerely,

Vivek
A beginner in Giraph (and Java too!)



RE: Setting variable value in Compute class and using it in the next superstep

2014-07-21 Thread Sardeshmukh, Vivek
Thank you Tom for your prompt reply.


If that is the case then I might be doing something wrong. I'll take a closer 
look with debug enabled and keep you posted.


Thank you again.


Vivek


From: Schweiger, Tom thschwei...@ebay.com
Sent: Monday, July 21, 2014 4:37 PM
To: user@giraph.apache.org
Subject: RE: Setting variable value in Compute class and using it in the next 
superstep

And in answer to:

This post also suggests (along with what I described above) to have a field in 
the vertex value itself. For that I need to change the vertex input format and 
also create my own custom vertex class. Is it really necessary?

No, you don't need a custom vertex class or vertex input format. You can 
create/initialize the value at the beginning of the first superstep.


From: Sardeshmukh, Vivek [vivek-sardeshm...@uiowa.edu]
Sent: Monday, July 21, 2014 2:05 PM
To: user@giraph.apache.org
Subject: Setting variable value in Compute class and using it in the next 
superstep

