Hey guys, I have been playing with PageRank lately and got some weird results. I wanted to use the input format reader and output format writer defined inside SimplePageRankComputation, so I passed my input file and specified those classes on the command line.
Some of the vertices got a value > 1. So I had a look at the logs and noticed that the job is generating its own vertices and my input file is never used. The Vertex class inside the SimplePageRankVertexReader has the following lines:

    LongWritable vertexId = new LongWritable(
        (inputSplit.getSplitIndex() * totalRecords) + recordsRead);
    DoubleWritable vertexValue = new DoubleWritable(vertexId.get() * 10d);
    long targetVertexId =
        (vertexId.get() + 1) % (inputSplit.getNumSplits() * totalRecords);
    float edgeValue = vertexId.get() * 100f;

And in the task logs, it prints:

    2013-05-30 11:28:20,808 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=0, vertexValue=0.0, targetVertexId=1, edgeValue=0.0
    2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=1, vertexValue=10.0, targetVertexId=2, edgeValue=100.0
    2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=2, vertexValue=20.0, targetVertexId=3, edgeValue=200.0
    2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=3, vertexValue=30.0, targetVertexId=4, edgeValue=300.0
    2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=4, vertexValue=40.0, targetVertexId=5, edgeValue=400.0
    2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=5, vertexValue=50.0, targetVertexId=6, edgeValue=500.0
    2013-05-30 11:28:20,809 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=6, vertexValue=60.0, targetVertexId=7, edgeValue=600.0
    2013-05-30 11:28:20,810 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=7, vertexValue=70.0, targetVertexId=8, edgeValue=700.0
    2013-05-30 11:28:20,810 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=8, vertexValue=80.0, targetVertexId=9, edgeValue=800.0
    2013-05-30 11:28:20,810 INFO org.apache.giraph.examples.SimplePageRankVertex$SimplePageRankVertexReader: next: Return vertexId=9, vertexValue=90.0, targetVertexId=0, edgeValue=900.0

Could someone explain why this is happening?

Thanks!

--
Maria Stylianou
Intern at Telefonica, Barcelona, Spain
Master Student of European Master in Distributed Computing<http://www.kth.se/en/studies/programmes/master/em/emdc>
marsty5.wordpress.com
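
To make the quoted reader lines easier to reason about, here is a minimal standalone sketch in plain Java that evaluates the same four formulas without any Giraph or Hadoop classes. The class name SyntheticVertexSketch and its helper methods are hypothetical, and the values splitIndex = 0, numSplits = 1 and totalRecords = 10 are assumptions inferred from the log above (ten vertices, with vertex 9 pointing back to vertex 0); the post does not state them.

```java
// Standalone sketch: plain Java only, no Giraph or Hadoop dependencies.
// The class and method names are hypothetical; the formulas are copied
// from the reader lines quoted in the post.
class SyntheticVertexSketch {

    // (inputSplit.getSplitIndex() * totalRecords) + recordsRead
    static long vertexId(int splitIndex, long totalRecords, long recordsRead) {
        return (splitIndex * totalRecords) + recordsRead;
    }

    // vertexId.get() * 10d
    static double vertexValue(long vertexId) {
        return vertexId * 10d;
    }

    // (vertexId.get() + 1) % (inputSplit.getNumSplits() * totalRecords)
    static long targetVertexId(long vertexId, int numSplits, long totalRecords) {
        return (vertexId + 1) % (numSplits * totalRecords);
    }

    // vertexId.get() * 100f
    static float edgeValue(long vertexId) {
        return vertexId * 100f;
    }

    public static void main(String[] args) {
        // Assumed values, inferred from the log in the post.
        int splitIndex = 0;
        int numSplits = 1;
        long totalRecords = 10;

        for (long recordsRead = 0; recordsRead < totalRecords; recordsRead++) {
            long id = vertexId(splitIndex, totalRecords, recordsRead);
            System.out.println("vertexId=" + id
                + ", vertexValue=" + vertexValue(id)
                + ", targetVertexId=" + targetVertexId(id, numSplits, totalRecords)
                + ", edgeValue=" + edgeValue(id));
        }
    }
}
```

Every input to these formulas is a split index or a record counter. None of them is a value parsed from an input file, which is consistent with the vertices in the log forming a simple ring 0 -> 1 -> ... -> 9 -> 0 regardless of the file that was supplied.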