On Friday 14 October 2011 15:03:16 Thomas Anderson wrote:
> I read the wiki (http://wiki.apache.org/nutch/NewScoring#LinkRank) stating
> the process of LinkRank is iterative and scores tend to converge after
> iteration. However, from the source I discover it seems that the
> job always reads from the same input path and produces to the same
> output path. For instance,
>
> runCounter() reads input from nodes and returns the number of nodes
> runInitializer() reads from nodes and initializes inLinkScore
>
> then iteration (default is 10)
> runInverted() reads from nodes, where inLinkScore is initialized,
> outlinks, and loops; then produces output to
> linkrank-<random>/inverted
> runAnalysis() reads from nodes (inLinkScore is inited), and inverted
> path (in previous step); then produces output to
> linkrank-<random>/nodes
The score for X and for Y after the first iteration is (1 - damping) + (damping * sum(inlinkScore)). Suppose X also links to Y; then sum(inlinkScore) for Y will change in the next iteration, because X has a new value after the first iteration. This is convergence: the deltas between iterations flatten out with each iteration.

> This seems to me with the same process to calculate the scores, the
> result of LinkRank will always be the same at each iteration. So I
> can't understand very well how scores would converge. What place would
> be the key point to spot at? Or any doc that may explain this more
> detail?
>
> Thanks.

--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
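To make the point about changing inlink sums concrete, here is a toy sketch in Python (not Nutch code). It applies the formula quoted above, (1 - damping) + (damping * sum(inlinkScore)), to a three-page graph in which A links to X and Y, and X also links to Y. Several things here are assumptions for illustration only: the function name linkrank, the damping value 0.85, the initial score 1.0, the rule that a page splits its score evenly over its outlinks, and the idea that each iteration's output scores are the input to the next one.

```python
def linkrank(outlinks, damping=0.85, iterations=10, initial=1.0):
    """Toy LinkRank-style iteration (hypothetical helper, not the Nutch API).

    outlinks maps each page to the list of pages it links to.
    Returns (scores, deltas); deltas[i] is the largest absolute score
    change seen in iteration i.
    """
    # Collect every page that appears as a source or as a target.
    nodes = set(outlinks)
    for targets in outlinks.values():
        nodes.update(targets)

    scores = {n: initial for n in nodes}
    deltas = []
    for _ in range(iterations):
        # Sum the score each page receives from its inlinks.
        # Assumption: a page divides its current score evenly over its outlinks.
        inlink_sum = {n: 0.0 for n in nodes}
        for src, targets in outlinks.items():
            if not targets:
                continue
            share = scores[src] / len(targets)
            for t in targets:
                inlink_sum[t] += share

        # Formula from the reply: (1 - damping) + damping * sum(inlinkScore).
        new_scores = {n: (1 - damping) + damping * inlink_sum[n] for n in nodes}
        deltas.append(max(abs(new_scores[n] - scores[n]) for n in nodes))

        # Assumption: this iteration's output becomes the next iteration's input.
        scores = new_scores
    return scores, deltas


if __name__ == "__main__":
    graph = {"A": ["X", "Y"], "X": ["Y"], "Y": []}
    scores, deltas = linkrank(graph)
    for i, d in enumerate(deltas, start=1):
        print(f"iteration {i}: max delta = {d:.6f}")
    print(scores)
```

The same code runs in every iteration, but its input differs: Y's inlink sum depends on X's score, which itself changed in the previous pass. In this small acyclic graph the deltas shrink on each pass and reach zero after the third iteration; a graph with cycles would approach its fixed point more gradually.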

