[
https://issues.apache.org/jira/browse/MADLIB-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16166695#comment-16166695
]
Jingyi Mei commented on MADLIB-1124:
------------------------------------
Question: In a directed multigraph, there may be duplicate edges. How should
HITS handle them?
The current implementation doesn’t check whether edges are distinct, which
means that if there are multiple edges from one vertex to another, each of them
is counted separately. For example, suppose paper A contains multiple links
that refer to paper B. With our current calculation, A’s hub score is added
multiple times when computing B’s authority score. As a result, A plays a
'more important' role than other vertices that have only one edge pointing to
B. Does this make sense? Or should we treat every vertex equally and count only
distinct edges between vertices?
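
To make the difference concrete, here is a minimal sketch in plain Python. It is
not the MADlib implementation; the function names, the L2 normalization, and
the toy graph are assumptions for illustration only. It runs HITS once on the
raw edge list (duplicates counted) and once on the distinct edges.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit L2 norm (assumed normalization for this sketch)."""
    norm = sqrt(sum(value * value for value in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {vertex: value / norm for vertex, value in scores.items()}


def hits(edges, iterations=50):
    """Return (authority, hub) dicts; every (src, dst) entry in edges counts once."""
    vertices = {vertex for edge in edges for vertex in edge}
    hub = {vertex: 1.0 for vertex in vertices}
    auth = {vertex: 1.0 for vertex in vertices}
    for _ in range(iterations):
        # Authority update: add the hub score of the source for each incoming edge.
        new_auth = {vertex: 0.0 for vertex in vertices}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        # Hub update: add the authority score of the target for each outgoing edge.
        new_hub = {vertex: 0.0 for vertex in vertices}
        for src, dst in edges:
            new_hub[src] += new_auth[dst]
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return auth, hub


# Hypothetical toy graph: paper A links to paper B three times.
edges = [("A", "B"), ("A", "B"), ("A", "B"), ("C", "B"), ("C", "D")]

auth_dup, hub_dup = hits(edges)                 # duplicates counted
auth_distinct, hub_distinct = hits(set(edges))  # distinct edges only

print("B/D authority ratio, duplicates:", auth_dup["B"] / auth_dup["D"])
print("B/D authority ratio, distinct:  ", auth_distinct["B"] / auth_distinct["D"])
```

With duplicates counted, the repeated A -> B links raise B's authority relative
to D's. Passing set(edges) treats each pair of vertices at most once per
direction, which is the "distinct edges" alternative asked about above.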
> Graph - HITS algorithm
> ----------------------
>
> Key: MADLIB-1124
> URL: https://issues.apache.org/jira/browse/MADLIB-1124
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Graph
> Reporter: Frank McQuillan
> Assignee: Jingyi Mei
> Fix For: v2.0
>
> Attachments: pagerank_hits.png
>
>