Hi Paolo, I see what you mean.
However, the problem with this input is that the dangling vertices that don't have a line of their own (such as 11) cannot contribute their accumulated rank, as no vertex for them will be instantiated. So counting them doesn't help either.

I think that we should rely on users supplying valid input (a line for each vertex) and not try to correct for that in the vertex class. Creating a line for each vertex from such a file is an easy task that is doable with a single MapReduce pass over the data beforehand.

--sebastian

On 28.05.2012 22:13, Paolo Castagna wrote:
> Hi Sebastian,
> you can try yourself with some simple input (which contains dangling nodes).
>
> For example, say I have this adjacency list (with mistakes/repetitions and
> self-links, ignore those):
>
> 1 1 2 2
> 2 3 5 7 9 11
> 3 3 3 6
> 4
> 5 1 2 3 11
> 8 10
> 10 5
>
> How many vertices? 7 or 11?
>
> I think this graph has 11 and it's 11 you need to use as number of vertices
> when you compute PageRank.
>
> Paolo
>
> Sebastian Schelter wrote:
>> Hi Paolo,
>>
>> Why would getNumVertices() not give back the correct number of vertices?
>> This call should always give back the overall number of vertices (if it
>> doesn't we have to fix it) and you shouldn't have to rely on tricks to
>> count stuff via aggregators.
>>
>> --sebastian
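
P.S. To make the preprocessing idea above concrete, here is a minimal sketch of the logic such a pass could implement. It is plain in-memory Java, not an actual Hadoop job, and the class and method names (AdjacencyNormalizer, normalize) are made up for illustration. The loop plays the role of the map step (every vertex that appears anywhere on a line gets a key) and the sorted map plays the role of the shuffle/reduce step (one output line per key). It assumes whitespace-separated numeric vertex ids and keeps repeated edges and self-links exactly as they appear in the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Giraph or Hadoop.
public class AdjacencyNormalizer {

    // Returns one line per vertex, including vertices that only occur as
    // edge targets (dangling vertices), which get a line with no targets.
    public static List<String> normalize(List<String> lines) {
        // Sorted by vertex id; stands in for grouping by key in the reducer.
        Map<Long, List<Long>> adjacency = new TreeMap<>();

        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] tokens = trimmed.split("\\s+");
            long source = Long.parseLong(tokens[0]);
            List<Long> targets =
                adjacency.computeIfAbsent(source, k -> new ArrayList<>());
            for (int i = 1; i < tokens.length; i++) {
                long target = Long.parseLong(tokens[i]);
                targets.add(target);
                // Make sure the target gets a line of its own as well.
                adjacency.computeIfAbsent(target, k -> new ArrayList<>());
            }
        }

        List<String> result = new ArrayList<>();
        for (Map.Entry<Long, List<Long>> entry : adjacency.entrySet()) {
            StringBuilder sb = new StringBuilder(Long.toString(entry.getKey()));
            for (long target : entry.getValue()) {
                sb.append(' ').append(target);
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // The adjacency list from the quoted example.
        List<String> input = Arrays.asList(
            "1 1 2 2",
            "2 3 5 7 9 11",
            "3 3 3 6",
            "4",
            "5 1 2 3 11",
            "8 10",
            "10 5");
        for (String line : normalize(input)) {
            System.out.println(line);
        }
    }
}
```

On the example input this produces 11 lines, with 6, 7, 9 and 11 each getting a line that has no targets, so every vertex would be instantiated when the file is loaded.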