[ 
https://issues.apache.org/jira/browse/NUTCH-635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605139#action_12605139
 ] 

Andrzej Bialecki  commented on NUTCH-635:
-----------------------------------------

One more question: you said the algorithm converges, but do you have a 
reference set of values for this dataset, calculated with some other PageRank 
implementation? It would be worthwhile to verify that the values are indeed 
PageRank as described, and not yet another subtle variation such as our OPIC ;)

There are a few Java packages for computing PageRank; we could adapt one of 
those to serve as a baseline:

http://law.dsi.unimi.it/
http://webla.sourceforge.net/javadocs/pt/tumba/links/PageRank.html
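
As a rough illustration of what such a baseline check could look like, here is 
a minimal sketch in Python of textbook PageRank by power iteration, plus a 
helper for comparing two score sets. It is not taken from either package above 
or from the attached patches; the names pagerank, max_abs_diff, and graph, the 
damping factor of 0.85, and the even redistribution of rank from pages without 
outlinks are all assumptions made for the example.

```python
def pagerank(outlinks, damping=0.85, tol=1e-10, max_iter=200):
    """Textbook PageRank by power iteration.

    outlinks maps each page to a list of pages it links to (hypothetical
    input format chosen for this sketch).
    """
    nodes = sorted(set(outlinks) | {t for ts in outlinks.values() for t in ts})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Assumption: rank held by pages with no outlinks is spread evenly.
        dangling = sum(rank[u] for u in nodes if not outlinks.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new = {u: base for u in nodes}
        for u in nodes:
            targets = outlinks.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
        # Stop once the total change in one iteration is negligible.
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def max_abs_diff(reference, candidate):
    """Largest per-page score difference; pages missing from candidate count as 0."""
    return max(abs(reference[k] - candidate.get(k, 0.0)) for k in reference)


# Tiny hypothetical link graph: A -> B, A -> C, B -> C, C -> A.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(graph)
```

Scores exported from the tool under test could then be passed to max_abs_diff 
together with ranks. If the tool normalizes scores differently (for example, 
so that they do not sum to 1), both sets would need rescaling to a common sum 
before the comparison means anything.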


> LinkAnalysis Tool for Nutch
> ---------------------------
>
>                 Key: NUTCH-635
>                 URL: https://issues.apache.org/jira/browse/NUTCH-635
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.0.0
>         Environment: All
>            Reporter: Dennis Kubes
>            Assignee: Dennis Kubes
>             Fix For: 1.0.0
>
>         Attachments: NUTCH-635-1-20080612.patch, NUTCH-635-2-20080613.patch, 
> NUTCH-635-3-20080614.patch
>
>
> This is a basic PageRank-type link analysis tool for Nutch which simulates a 
> sparse matrix using inlinks and outlinks and converges after a given number 
> of iterations.  This tool is meant to replace the current scoring system in 
> Nutch with a system that converges instead of exponentially increasing 
> scores.  Also includes a tool to create an outlinkdb.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
