Github user jingyimei commented on the issue:

https://github.com/apache/madlib/pull/294

The current master has another issue. When you run

```
DROP TABLE IF EXISTS vertex, "EDGE";
CREATE TABLE vertex(
    id INTEGER
);
CREATE TABLE "EDGE"(
    src INTEGER,
    dest INTEGER,
    user_id INTEGER
);
INSERT INTO vertex VALUES
(0),
(1),
(2);
INSERT INTO "EDGE" VALUES
(0, 1, 1),
(0, 2, 1),
(1, 2, 1),
(2, 1, 1),
(0, 1, 2);

DROP TABLE IF EXISTS pagerank_ppr_grp_out;
DROP TABLE IF EXISTS pagerank_ppr_grp_out_summary;
SELECT pagerank(
    'vertex',                -- Vertex table
    'id',                    -- Vertex id column
    '"EDGE"',                -- "EDGE" table
    'src=src, dest=dest',    -- "EDGE" args
    'pagerank_ppr_grp_out',  -- Output table of PageRank
    NULL,                    -- Default damping factor (0.85)
    NULL,                    -- Default max iters (100)
    NULL,                    -- Default Threshold
    'user_id');
```

you will get the following result:

```
madlib=# select * from pagerank_ppr_grp_out order by user_id, id;
 user_id | id |     pagerank
---------+----+-------------------
       1 |  0 |              0.05
       1 |  0 |              0.05
       1 |  1 | 0.614906399170753
       1 |  2 | 0.614906399170753
       2 |  0 |             0.075
       2 |  1 |           0.13875
(6 rows)
```

Here the PageRank scores for user_id=1 do not sum to 1, as they should. This PR fixes the issue and gives the right numbers. However, dev check did not have a case that would catch this. I suggest adding this corner case to dev check to test future changes.
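
One possible shape for such a dev-check case, as a minimal sketch only: after running the grouped `pagerank()` call from the example, query the output table for any group whose scores do not sum to 1. The tolerance of `1e-6` and the alias `total_score` are assumptions made for illustration, not values taken from the existing test suite.

```sql
-- Hypothetical check against the output table from the example above.
-- Returns one row per group whose PageRank scores do not sum to 1
-- (within an assumed tolerance); an empty result means every group passes.
SELECT user_id, sum(pagerank) AS total_score
FROM pagerank_ppr_grp_out
GROUP BY user_id
HAVING abs(sum(pagerank) - 1.0) > 1e-6  -- tolerance is an assumption
ORDER BY user_id;
```

A real dev-check case would probably wrap this condition in whatever assertion helper the test suite already uses, so that a non-empty result fails the run.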
---