Github user jingyimei commented on the issue:

    https://github.com/apache/madlib/pull/294
  
    The current master has another issue:
    When you run
    ```
    DROP TABLE IF EXISTS vertex, "EDGE";
    CREATE TABLE vertex(
    id INTEGER
    );
    CREATE TABLE "EDGE"(
    src INTEGER,
    dest INTEGER,
    user_id INTEGER
    );
    INSERT INTO vertex VALUES
    (0),
    (1),
    (2);
    INSERT INTO "EDGE" VALUES
    (0, 1, 1),
    (0, 2, 1),
    (1, 2, 1),
    (2, 1, 1),
    (0, 1, 2);
    
    
    DROP TABLE IF EXISTS pagerank_ppr_grp_out;
    DROP TABLE IF EXISTS pagerank_ppr_grp_out_summary;
    SELECT pagerank(
    'vertex', -- Vertex table
    'id', -- Vertex id column
    '"EDGE"', -- "EDGE" table
    'src=src, dest=dest', -- "EDGE" args
    'pagerank_ppr_grp_out', -- Output table of PageRank
    NULL, -- Default damping factor (0.85)
    NULL, -- Default max iters (100)
    NULL, -- Default Threshold 
    'user_id');
    ```
    
    you will get the following result:
    ```
    madlib=# select * from pagerank_ppr_grp_out order by user_id, id;
    user_id | id | pagerank
    ---------+----+-------------------
    1 | 0 | 0.05
    1 | 0 | 0.05
    1 | 1 | 0.614906399170753
    1 | 2 | 0.614906399170753
    2 | 0 | 0.075
    2 | 1 | 0.13875
    (6 rows)
    ```
    
    Here, for user_id=1, the PageRank scores do not sum to 1 as they should.
    This PR fixes the issue and produces the correct values. However, the dev
    check has no case that catches this problem. I suggest adding this corner
    case to the dev check so that future changes are tested against it.
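
    A minimal sketch of what such a check could look like, written in plain
    SQL against the output table from the example above. The tolerance of
    1e-6 is an assumption, not a value taken from the existing tests, and
    the actual dev check may prefer to wrap the condition in its own
    assertion helper:
    ```
    -- Return every group whose PageRank scores do not sum to 1
    -- (within an assumed tolerance). An empty result means the check passes.
    SELECT user_id, SUM(pagerank) AS total_pagerank
    FROM pagerank_ppr_grp_out
    GROUP BY user_id
    HAVING ABS(SUM(pagerank) - 1.0) > 1e-6
    ORDER BY user_id;
    ```
    Grouping by user_id keeps the check per group, so a wrong total in one
    group cannot be masked by the others.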

