[ https://issues.apache.org/jira/browse/MAHOUT-742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884338#comment-13884338 ]
Nilesh Chakraborty commented on MAHOUT-742:
-------------------------------------------

Also, what do I need to know about Mahout's policy for using 3rd party linear algebra libraries like MTJ, Colt etc.? Say I need to borrow a lot of functionality from one such library: do I need to rewrite the code so as to eliminate any dependency on it? What about Apache Commons? I'd also appreciate it if you could give me some pointers/resources where such guidelines are detailed.

> Pagerank implementation in Map/Reduce
> -------------------------------------
>
>                 Key: MAHOUT-742
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-742
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Graph
>    Affects Versions: 0.6
>            Reporter: Christoph Nagel
>            Assignee: Sebastian Schelter
>             Fix For: 0.6
>
>         Attachments: MAHOUT-742.patch
>
>
> Hi,
> my name is Christoph Nagel. I'm a student at the Technical University of Berlin and
> am participating in the course of Isabel Drost and Sebastian Schelter.
> My work is to implement the pagerank algorithm for the case where the pagerank vector
> fits in memory.
> For the computation I used the naive algorithm shown in the book 'Mining of
> Massive Datasets' by Rajaraman & Ullman
> (http://www-scf.usc.edu/~csci572/2012Spring/UllmanMiningMassiveDataSets.pdf).
> Matrix and vector multiplication are done with Mahout methods.
> Most of the work is the transformation of the input graph, which has to consist of a
> nodes file and an edges file.
> Format of nodes file: <node>\n
> Format of edges file: <startNode>\t<endNode>\n
> Therefore I created the following classes:
> * LineIndexer: assigns each line an index
> * EdgesToIndex: indexes the nodes of the edges
> * EdgesIndexToTransitionMatrix: creates the transition matrix
> * Pagerank: computes PR from transition matrix
> * JoinNodesWithPagerank: creates the joined output
> * PagerankExampleJob: does the complete job
> Each class has a test (except PagerankExampleJob), and I used the example from the
> book for evaluation.
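
For reference, here is a minimal single-machine sketch of the naive iteration described in the issue, written in plain Java with no Mahout or third-party dependencies. It is not taken from MAHOUT-742.patch: the class name PagerankSketch, the pagerank method, its parameters, and the damping factor beta are all illustrative assumptions. Nodes are assumed to be already indexed 0..n-1 (the role LineIndexer and EdgesToIndex play in the patch), and with beta = 1.0 the loop reduces to repeatedly multiplying the rank vector by the transition matrix.

```java
import java.util.Arrays;

// Hypothetical sketch; names and parameters are assumptions, not the patch's API.
class PagerankSketch {

    // edges[i] = {startNode, endNode}, with nodes already indexed 0..n-1.
    // beta is the probability of following a link; (1 - beta) is spread
    // uniformly over all nodes. Nodes without out-links are not treated
    // specially, so their rank leaks out of the system.
    static double[] pagerank(int n, int[][] edges, double beta, int iterations) {
        int[] outDegree = new int[n];
        for (int[] e : edges) {
            outDegree[e[0]]++;
        }

        // Start from the uniform distribution.
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            Arrays.fill(next, (1.0 - beta) / n);
            // Sparse form of next = beta * M * rank, where the transition
            // matrix entry M[end][start] is 1 / outDegree[start].
            for (int[] e : edges) {
                next[e[1]] += beta * rank[e[0]] / outDegree[e[0]];
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Four-node graph: A->B, A->C, A->D, B->A, B->D, C->A, D->B, D->C
        // (A=0, B=1, C=2, D=3).
        int[][] edges = {
            {0, 1}, {0, 2}, {0, 3}, {1, 0}, {1, 3}, {2, 0}, {3, 1}, {3, 2}
        };
        double[] rank = pagerank(4, edges, 1.0, 100);
        System.out.println(Arrays.toString(rank));
    }
}
```

For this graph the vector (3/9, 2/9, 2/9, 2/9) is a fixed point of the update with beta = 1.0, so the printed values should approach it. Iterating over the edge list rather than a dense matrix keeps memory proportional to the number of edges plus the rank vector, which matches the "pagerank vector fits in memory" assumption stated in the issue.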