[
https://issues.apache.org/jira/browse/MADLIB-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429013#comment-16429013
]
Himanshu Pandey commented on MADLIB-1084:
-----------------------------------------
Frank McQuillan, Jingyi Mei,
I tested PageRank with this GUC set to OFF:
{code}
set optimizer_enable_tablescan = off;
{code}
When the above GUC is ON, ORCA does a Table Scan; when it's OFF, it uses a
Seq Scan.
With the GUC turned OFF, the install check time (and the query runtime) is
reduced to about half:
*GPDB 5.6.1 (CentOS Linux release 7.4.1708 (Core) )*
{code}
TEST CASE RESULT|Module: glm|gamma.sql_in|PASS|Time: 13015 milliseconds
TEST CASE RESULT|Module: glm|binomial.sql_in|PASS|Time: 4352 milliseconds
TEST CASE RESULT|Module: graph|wcc.sql_in|PASS|Time: 2425 milliseconds
TEST CASE RESULT|Module: graph|sssp.sql_in|PASS|Time: 4210 milliseconds
TEST CASE RESULT|Module: graph|pagerank.sql_in|PASS|Time: 76533 milliseconds
TEST CASE RESULT|Module: graph|measures.sql_in|PASS|Time: 2181 milliseconds
TEST CASE RESULT|Module: graph|hits.sql_in|PASS|Time: 2986 milliseconds
TEST CASE RESULT|Module: graph|bfs.sql_in|PASS|Time: 5325 milliseconds
TEST CASE RESULT|Module: graph|apsp.sql_in|PASS|Time: 1683 milliseconds
TEST CASE RESULT|Module: linear_systems|sparse_linear_sytems.sql_in|PASS|Time: 908 milliseconds
{code}
Query results
With grouping:
{code}
gpadmin=# SELECT madlib.pagerank('vertex', 'id', 'edge', 'src=src, dest=dest', 'pagerank_out', NULL, NULL, NULL, 'user_id', '{1,3}');
pagerank
----------
(1 row)
Time: 49630.882 ms
{code}
Without grouping:
{code}
gpadmin=# SELECT madlib.pagerank('vertex', 'id', 'edge', 'src=src, dest=dest', 'pagerank_out', NULL, NULL, NULL, NULL, '{1,3}');
pagerank
----------
(1 row)
Time: 7767.580 ms
{code}
I am not sure whether we can set GUCs internally in MADlib, so this is just an FYI.
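One way to limit the setting to a single call, without changing anything inside MADlib, might be SET LOCAL inside a transaction. This is only a sketch: it assumes the caller controls the transaction, and it reuses the table and column names from the grouping query above. SET LOCAL reverts automatically at COMMIT or ROLLBACK, so the rest of the session keeps its default.
{code}
-- Sketch only: scope the GUC to one transaction instead of the whole session.
BEGIN;
-- Takes effect only until the end of this transaction.
SET LOCAL optimizer_enable_tablescan = off;
-- Same call as the grouping test above; 'pagerank_out' must not already exist.
SELECT madlib.pagerank('vertex', 'id', 'edge', 'src=src, dest=dest', 'pagerank_out', NULL, NULL, NULL, 'user_id', '{1,3}');
COMMIT;
-- After COMMIT the GUC is back to its previous value.
SHOW optimizer_enable_tablescan;
{code}
Whether the same speedup holds in this form would still need to be measured.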
> Graph - Personalized PageRank
> -----------------------------
>
> Key: MADLIB-1084
> URL: https://issues.apache.org/jira/browse/MADLIB-1084
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Graph
> Reporter: Frank McQuillan
> Assignee: Himanshu Pandey
> Priority: Major
> Fix For: v1.14
>
> Attachments: GraphTest.py
>
>
> Personalized PageRank is a variant of regular PageRank.
> Please refer to
> [http://madlib.apache.org/docs/latest/group__grp__pagerank.html] as a
> starting point.
> Reference:
> Neighborhood Formation and Anomaly Detection in Bipartite Graphs
> [http://www.cs.cmu.edu/~deepay/mywww/papers/icdm05.pdf]