[ https://issues.apache.org/jira/browse/MAHOUT-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884344#comment-13884344 ]
Sebastian Schelter commented on MAHOUT-742:
-------------------------------------------

Mahout also contains a math library; first you should check whether it already contains what you need :)

> Pagerank implementation in Map/Reduce
> -------------------------------------
>
>                 Key: MAHOUT-742
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-742
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Graph
>    Affects Versions: 0.6
>            Reporter: Christoph Nagel
>            Assignee: Sebastian Schelter
>             Fix For: 0.6
>
>         Attachments: MAHOUT-742.patch
>
>
> Hi,
> my name is Christoph Nagel. I'm a student at the Technical University of Berlin and
> am participating in the course of Isabel Drost and Sebastian Schelter.
> My work is to implement the PageRank algorithm for the case where the PageRank vector
> fits in memory.
> For the computation I used the naive algorithm shown in the book 'Mining of
> Massive Datasets' by Rajaraman & Ullman
> (http://www-scf.usc.edu/~csci572/2012Spring/UllmanMiningMassiveDataSets.pdf).
> Matrix and vector multiplication are done with Mahout methods.
> Most of the work is the transformation of the input graph, which has to consist of a
> nodes file and an edges file.
> Format of nodes file: <node>\n
> Format of edges file: <startNode>\t<endNode>\n
> Therefore I created the following classes:
> * LineIndexer: assigns each line an index
> * EdgesToIndex: indexes the nodes of the edges
> * EdgesIndexToTransitionMatrix: creates the transition matrix
> * Pagerank: computes PR from transition matrix
> * JoinNodesWithPagerank: creates the joined output
> * PagerankExampleJob: does the complete job
> Each class has a test (except PagerankExampleJob), and I used the example from the
> book for evaluation.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
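
For readers who want to follow the pipeline described in the issue, here is a minimal single-machine sketch in plain Python. It is not the code from MAHOUT-742.patch and does not use Mahout or Map/Reduce. The function names are hypothetical and only loosely mirror the class names listed above, and the four-node graph in the usage note is an assumed toy input. It implements only the naive iteration v' = Mv (no damping or taxation), so it assumes every node has at least one outgoing edge.

```python
def index_nodes(node_lines):
    """Assign each node a 0-based index in input order (role of LineIndexer)."""
    return {line.strip(): i for i, line in enumerate(node_lines)}


def transition_matrix(index, edge_lines):
    """Build a column-stochastic matrix M from '<startNode>\\t<endNode>' lines.

    M[dst][src] is 1/outdegree(src) for every edge src -> dst.
    A node without outgoing edges would leave an all-zero column;
    this sketch assumes that does not happen.
    """
    n = len(index)
    out_links = [[] for _ in range(n)]
    for line in edge_lines:
        start, end = line.rstrip("\n").split("\t")
        out_links[index[start]].append(index[end])
    matrix = [[0.0] * n for _ in range(n)]
    for src, targets in enumerate(out_links):
        for dst in targets:
            matrix[dst][src] += 1.0 / len(targets)
    return matrix


def pagerank(matrix, iterations=100):
    """Naive power iteration: start uniform, repeat v <- M v."""
    n = len(matrix)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        rank = [sum(row[j] * rank[j] for j in range(n)) for row in matrix]
    return rank


def join_nodes_with_rank(index, rank):
    """Map each node name back to its score (role of JoinNodesWithPagerank)."""
    return {node: rank[i] for node, i in index.items()}
```

Usage on an assumed toy graph with edges A->B, A->C, A->D, B->A, B->D, C->A, D->B, D->C:

```python
nodes = ["A\n", "B\n", "C\n", "D\n"]
edges = ["A\tB\n", "A\tC\n", "A\tD\n", "B\tA\n",
         "B\tD\n", "C\tA\n", "D\tB\n", "D\tC\n"]
idx = index_nodes(nodes)
scores = join_nodes_with_rank(idx, pagerank(transition_matrix(idx, edges)))
```

Solving v = Mv by hand for this graph gives A = 3/9 and B = C = D = 2/9, which the iteration approaches.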