[ https://issues.apache.org/jira/browse/NUTCH-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732172#action_12732172 ]
Andrzej Bialecki commented on NUTCH-650:
-----------------------------------------

"nutchbase" is ok for now, although it sounds cryptic. +1 on importing and closing this issue.

I don't believe OPIC scoring can work well, even if we implement it as intended - the dynamic nature of the webgraph is IMHO not properly addressed even in the original paper (authors propose a smoothing schema based on a history of past values). In my opinion we should strive to create a more elegant scoring API than the current one (which owes much to the way Nutch passed bits of data between different data stores), and use PageRank as the default.

Re: use of Katta for distributed indexing - let's discuss this on the list.

> Hbase Integration
> -----------------
>
>                 Key: NUTCH-650
>                 URL: https://issues.apache.org/jira/browse/NUTCH-650
>             Project: Nutch
>          Issue Type: New Feature
>    Affects Versions: 1.0.0
>            Reporter: Doğacan Güney
>            Assignee: Doğacan Güney
>             Fix For: 1.1
>
>         Attachments: hbase-integration_v1.patch, hbase_v2.patch,
> malformedurl.patch, meta.patch, meta2.patch, nofollow-hbase.patch,
> nutch-habase.patch, searching.diff, slash.patch
>
>
> This issue will track nutch/hbase integration

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.