You have to list dangling nodes explicitly in your adjacency list, so your input must look like this:

0 1 2
1 1 2
2 1 2 3
3   <-- this one was missing, causing the NullPointerException.
5

See http://wiki.apache.org/hama/PageRank under "Submit your own Webgraph".

> This piece of text will make Site1 adjacent to Site2 and Site3, Site2 to Site3,
> and Site3 is a dangling node. As you can see, a site is always on the
> leftmost side (we call it the key-site), and the outlinks are separated by
> tabs (\t) as the following elements.
> Make sure that every site's outlink can somewhere be found in the file as
> a key-site. Otherwise it will result in weird
> NullPointerExceptions<http://wiki.apache.org/hama/NullPointerExceptions>.

Good luck.

On 27 April 2012 at 16:56, SWP <[email protected]> wrote:

> I am dealing with the PageRank example
> from hama-dist-0.5.0-incubating-source.tar.gz RC2
> which I downloaded from
> http://people.apache.org/~edwardyoon/dist/
> a few days ago.
>
> My input graph has some "dangling edges", that is, edges pointing to
> non-existing nodes.
> Here are the adjacencies of a small example. The format is:
> source target1 target2 target3 ...
>
> 0 1 2
> 1 1 2
> 2 1 2 3
> 5
>
> You see that 2 has an edge directed to 3 but there is no adjacency list
> given for 3.
>
> Now, when I run this example through pagerank-text2seq and then the
> pagerank example, I get a NullPointerException:
>
> 12/04/27 16:15:17 ERROR bsp.LocalBSPRunner: Exception during BSP execution!
> java.lang.NullPointerException
>     at org.apache.hama.graph.GraphJobRunner.bsp(GraphJobRunner.java:96)
>     at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.run(LocalBSPRunner.java:256)
>     at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:286)
>     at org.apache.hama.bsp.LocalBSPRunner$BSPRunner.call(LocalBSPRunner.java:1)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
>
> The problem appears to be that when GraphJobRunner's bsp() method looks
> up the vertex to which the message is addressed,
> it is not found in the vertices map.
> (By the way, if you replace 5 with 3 in the example, it works - because
> then the target vertex can be looked up.)
>
> See the vertices.get(e.getKey()) statement in the code snippet below.
> Of course one can avoid the exception by adding a check in
> GraphJobRunner.java (at line about 95) like this:
>
>     if (vertices.containsKey(e.getKey())) {
>         vertices.get(e.getKey()).compute(msgs.iterator());
>     } else {
>         System.out.println("Ignoring message(s) '" + msgs + "' sent to vertex '" + e.getKey() + "'");
>     }
>
> However, what I really want is:
> check within PageRank.PageRankVertex's compute() method whether the target
> vertex exists
> before sending out a message to it.
>
> That is, in PageRank.java (line 60), instead of
>
>     sendMessageToNeighbors(new DoubleWritable(this.getValue().get() / numEdges));
>
> I would like to send messages only to "existing" vertices, that is,
> those which have an adjacency list in the input.
>
> Any hints how this can be achieved?
> It appears that I am not supposed to access the vertices field of the
> GraphJobRunner class in some way from within the PageRank.PageRankVertex
> class?
>
> I concede that my example graph may qualify as invalid input ... but on
> the other hand: how could I add those missing vertices after a first pass
> through the adjacency lists input?
>
> Clemens Gröpl
>
> --
> Semantic Web Project, IT
>
> Unister GmbH
> Barfußgäßchen 11 | 04109 Leipzig
>
> Phone: +49 (0)341 49288 4496
> [email protected]
> www.unister.de <http://www.unister.de>
>
> Authorized managing director: Thomas Wagner
> Leipzig District Court, HRB: 19056

--
Thomas Jungblut
Berlin <[email protected]>
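
P.S. On the question of adding the missing vertices after a first pass over the input: one option is a small preprocessing step on the text file before it goes through pagerank-text2seq. The following is only a sketch in plain Java and is not part of Hama. The class name DanglingNodeFixer and the method addMissingKeySites are made up for illustration, and it assumes the adjacency list fits in memory and that fields are separated by whitespace (it writes tabs, as the wiki asks).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Hama: meant to run over the text
// adjacency list before it is converted with pagerank-text2seq.
class DanglingNodeFixer {

    // Returns the input lines (re-joined with tabs) followed by one
    // key-only line for every outlink target that never appears as a
    // key-site, i.e. in the leftmost column.
    static List<String> addMissingKeySites(List<String> lines) {
        Set<String> keys = new LinkedHashSet<>();
        Set<String> targets = new LinkedHashSet<>();
        List<String> result = new ArrayList<>();

        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            // Assumption: accept any whitespace as the field separator.
            String[] fields = trimmed.split("\\s+");
            keys.add(fields[0]);
            for (int i = 1; i < fields.length; i++) {
                targets.add(fields[i]);
            }
            result.add(String.join("\t", fields));
        }

        // Whatever is a target but never a key is a missing dangling node.
        targets.removeAll(keys);
        result.addAll(targets);
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("0\t1\t2", "1\t1\t2", "2\t1\t2\t3", "5");
        for (String line : addMissingKeySites(input)) {
            System.out.println(line);
        }
    }
}
```

For the example graph above, this appends a single line containing only 3, which is the entry that was missing. The existing lines are kept in their original order.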
