File reading in Giraph

2014-04-05 Thread Jyoti Yadav
Hi folks,

Is it possible to read a text file in the master compute function? The text
file just contains some integer values.

Seeking your ideas. Thanks.
With Regards

Jyoti
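A minimal sketch of the parsing half of this, in plain Java: inside a Giraph MasterCompute you would typically open the stream in initialize() via Hadoop's FileSystem (something like FileSystem.get(getConf()).open(new Path("/path/ints.txt")) — treat that call chain as an assumption) and hand it to a reader; the helper below only shows the integer-parsing part over any Reader, so it can stand alone.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class IntFileReader {
    // Parse whitespace-separated integer values from any Reader.
    // In a Giraph MasterCompute, wrap an HDFS input stream opened in
    // initialize() instead of the StringReader used in main() below.
    public static List<Integer> readInts(Reader in) throws IOException {
        List<Integer> values = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                for (String tok : line.trim().split("\\s+")) {
                    if (!tok.isEmpty()) {
                        values.add(Integer.parseInt(tok));
                    }
                }
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the HDFS file: a reader over sample content.
        List<Integer> v = readInts(new StringReader("1 2 3\n42\n"));
        System.out.println(v);  // [1, 2, 3, 42]
    }
}
```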


Giraph job hangs and is eventually killed

2014-04-05 Thread John Yost
Hi Everyone,

I have a shortest path implementation that completes and outputs the
correct results to a counter, but then hangs after the last superstep and
is eventually killed by Hadoop.

Here's the output from the console:

[main-SendThread(localhost.localdomain:2181)] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
localhost.localdomain/127.0.0.1:2181. Will not attempt to authenticate
using SASL (unknown error)
[main-SendThread(localhost.localdomain:2181)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
localhost.localdomain/127.0.0.1:2181, initiating session
[main-SendThread(localhost.localdomain:2181)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
localhost.localdomain/127.0.0.1:2181, sessionid = 0x1451fc674a30007,
negotiated timeout = 4
14/04/04 22:19:44 INFO job.JobProgressTracker: Data from 1 workers -
Storing data: 0 out of 11 vertices stored; 0 out of 1 partitions stored;
min free memory on worker 1 - 119.73MB, average 119.73MB
14/04/04 22:19:45 INFO mapred.JobClient:  map 100% reduce 0%
14/04/04 22:19:49 INFO job.JobProgressTracker: Data from 1 workers -
Storing data: 0 out of 11 vertices stored; 0 out of 1 partitions stored;
min free memory on worker 1 - 119.73MB, average 119.73MB
14/04/04 22:19:54 INFO job.JobProgressTracker: Data from 1 workers -
Storing data: 0 out of 11 vertices stored; 0 out of 1 partitions stored;
min free memory on worker 1 - 119.44MB, average 119.44MB

This is the stack trace I see in Hadoop after the job is killed:

Caused by: java.lang.IllegalStateException: waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@43349eef
    at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:193)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:151)
    at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:136)
    at org.apache.giraph.utils.ProgressableUtils.getFutureResult(ProgressableUtils.java:99)
    at org.apache.giraph.utils.ProgressableUtils.getResultsWithNCallables(ProgressableUtils.java:233)
    at org.apache.giraph.worker.BspServiceWorker.saveVertices(BspServiceWorker.java:1033)
    at org.apache.giraph.worker.BspServiceWorker.cleanup(BspServiceWorker.java:1179)
    at org.apache.giraph.graph.GraphTaskManager.cleanup(GraphTaskManager.java:843)
    at org.apache.giraph.graph.GraphMapper.cleanup(GraphMapper.java:81)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    ... 7 more
Caused by: java.util.concurrent.ExecutionException: java.lang.IllegalStateException: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /user/prototype/giraph/twitter-path-result/_temporary/_attempt_201404012018_0003_m_01_0/part-m-1 for DFSClient_attempt_201404012018_0003_m_01_0_-1149212770_1 on client 127.0.0.1 because current leaseholder is trying to recreate file.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1452)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1324)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1266)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:668)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:647)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:578)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1393)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1389)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1387)

I realize that the root cause appears to be within Hadoop and not Giraph,
but I am wondering if there is a Giraph configuration parameter I am missing.
In researching the HDFS exception (not many posts on this, BTW), one
responder opined that this exception is due to speculative execution being
enabled.
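If speculative execution is indeed the culprit, it can be disabled per job. On a Hadoop 1.x-era cluster (which the mapred.JobClient output suggests) the relevant properties are the ones below; the jar name, class, and remaining arguments are placeholders for your own job:

```shell
# Disable speculative execution for this job only (Hadoop 1.x property names;
# jar/class/arguments are placeholders for your own setup).
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner \
  -D mapred.map.tasks.speculative.execution=false \
  -D mapred.reduce.tasks.speculative.execution=false \
  ...
```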

Also, I tested a standard Map/Reduce job writing to the same datablock and
it worked fine, so I don't think HDFS is the problem (corrupt datablock,
etc.).

Any ideas?

--John


Re: Custom partitioning among workers

2014-04-05 Thread Akshay Trivedi
Hi,
Thank you Lukas for your reply. I have distributed my graph using a
custom WorkerGraphPartitioner and MasterGraphPartitioner, but I am
having difficulty labeling the vertices initially. In order to
label the vertices, I need to run a breadth-first search (BFS) on the graph.
Where should the BFS code be placed? If the master's compute() method is
called before the partitioner, I can label the vertices there using BFS
and use those labels to distribute my graph. If not, what should I do?

Regards,
Akshay
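As a sketch of the labeling step itself, here is a BFS that assigns each connected component a distinct label, written as plain Java over an adjacency map rather than against the Giraph API (where this code should run — e.g. in a separate preprocessing job — is exactly the open question above):

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class BfsLabeler {
    // Label each connected component with a distinct integer via BFS.
    public static Map<Long, Integer> label(Map<Long, List<Long>> adj) {
        Map<Long, Integer> labels = new HashMap<>();
        int nextLabel = 0;
        for (Long start : adj.keySet()) {
            if (labels.containsKey(start)) {
                continue;  // already reached from an earlier BFS
            }
            Queue<Long> queue = new ArrayDeque<>();
            queue.add(start);
            labels.put(start, nextLabel);
            while (!queue.isEmpty()) {
                Long v = queue.poll();
                for (Long n : adj.getOrDefault(v, List.of())) {
                    if (!labels.containsKey(n)) {
                        labels.put(n, nextLabel);
                        queue.add(n);
                    }
                }
            }
            nextLabel++;
        }
        return labels;
    }

    public static void main(String[] args) {
        // Two components: {1, 2} and {3}.
        Map<Long, List<Long>> adj = Map.of(
            1L, List.of(2L),
            2L, List.of(1L),
            3L, List.of());
        Map<Long, Integer> labels = label(adj);
        System.out.println(labels.get(1L).equals(labels.get(2L)));  // true
        System.out.println(labels.get(1L).equals(labels.get(3L)));  // false
    }
}
```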

On Sat, Apr 5, 2014 at 4:07 AM, Lukas Nalezenec
lukas.naleze...@firma.seznam.cz wrote:
 Hi,
 Make the labels part of the vertex id (I know, it's limiting), then implement
 custom WorkerGraphPartitioner and MasterGraphPartitioner.

 Regards
 Lukas


 On 4.4.2014 13:59, Akshay Trivedi wrote:

 Hi all,
 I need help partitioning a graph. I input the graph and label it. Now
 I want all vertices with the same label assigned to one worker, and
 no vertices with a different label assigned to that worker. In this
 way each group of vertices sharing the same label is assigned to a
 unique worker. Can anyone help me with this?
 Thank you in advance

 Regards
 Akshay
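One way to read the label-in-the-vertex-id suggestion above: pack the label into the high bits of the id and have the partitioner derive the partition from those bits alone, so equal labels always map to the same partition. The bit layout and method names below are illustrative (in Giraph this logic would live in your custom WorkerGraphPartitioner/MasterGraphPartitioner), and note the constraint: distinct labels only land on distinct workers if the number of partitions is at least the number of labels.

```java
public class LabelPartitioning {
    // Pack a 16-bit label and a 48-bit local id into one long vertex id.
    public static long makeVertexId(int label, long localId) {
        return ((long) label << 48) | (localId & 0xFFFFFFFFFFFFL);
    }

    // Recover the label from the high 16 bits of the id.
    public static int labelOf(long vertexId) {
        return (int) (vertexId >>> 48);
    }

    // Partition purely by label, so every vertex with the same label
    // lands in the same partition (and hence on the same worker).
    public static int partitionOf(long vertexId, int numPartitions) {
        return labelOf(vertexId) % numPartitions;
    }

    public static void main(String[] args) {
        long a = makeVertexId(7, 123);
        long b = makeVertexId(7, 456);
        long c = makeVertexId(8, 123);
        System.out.println(partitionOf(a, 16) == partitionOf(b, 16));  // true
        System.out.println(partitionOf(a, 16) == partitionOf(c, 16));  // false
    }
}
```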




Re: Custom partitioning among workers

2014-04-05 Thread Lukas Nalezenec

Hi,

You can simply write two Giraph jobs.
If you patch the source code with the patch from ticket GIRAPH-886, you
can, IMHO, change vertex ids and partitioning in the middle of the computation.

Regards
Lukas
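The two-job approach above might look something like this on the command line: a first job that computes labels and writes relabeled vertices, then a second job that reads that output with the custom partitioner. Every class name and path here is a hypothetical placeholder; -vip, -op, and -w are standard GiraphRunner options:

```shell
# Job 1 (placeholder classes/paths): compute labels, write labeled vertex ids.
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner my.LabelingComputation \
  -vip /user/akshay/graph/input -op /user/akshay/graph/labeled -w 4
# Job 2: run the real computation over the labeled graph, using the
# custom label-aware partitioner configured for this job.
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner my.MainComputation \
  -vip /user/akshay/graph/labeled -op /user/akshay/graph/output -w 4
```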
