Re: Social Network Link Prediction in Mahout

2013-06-09 Thread Peter Holland
Thanks Sean,
I'm just looking at section 3.3, "Coping without preference values", of
the Mahout in Action book, which explains how to work with
GenericBooleanPrefDataModel.







Social Network Link Prediction in Mahout

2013-06-08 Thread Peter Holland
Hi All,
I am trying to use Mahout for Link Prediction in a Social Network.

The data I have is an edge list with 9.4 million rows. The edge list is a
CSV file where each node is an integer value and a row represents an edge
between two nodes. For example:

3432, 5098
3423, 6710
4490, 5843
4490, 2039
.

This is a directed graph, so row 1 means that node 3432 follows node 5098.

I would like to build a recommender to calculate the top 10 nodes a user
might like to connect to next. The problem I have is that the recommender
classes need input in the form (user, item, value). So, how can I first
calculate a value to represent the 'weight' of an edge? For example,
EdgeRank?

Any help would be greatly appreciated.
Thank you,
Peter


Re: Social Network Link Prediction in Mahout

2013-06-08 Thread Sean Owen
Use an implementation that doesn't expect a rating. These are
so-called 'boolean' implementations, like GenericBooleanPrefDataModel.
For example, you can build an item-based recommender with the boolean
version of the item-based recommender and a log-likelihood similarity.
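
To make the log-likelihood option a bit more concrete, here is a minimal sketch in plain Java (no Mahout dependency) of the log-likelihood ratio for a 2x2 co-occurrence table. It is applied to two nodes by counting how many users follow both, only one, or neither. The class name LlrSketch, the helper names, and the tiny edge list are hypothetical. This is not Mahout's implementation, only an illustration of the kind of score such a similarity is built on.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class LlrSketch {
    // x * ln(x), with the usual convention that 0 * ln(0) = 0.
    static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    // Unnormalized entropy of a group of counts: N ln N - sum(k ln k).
    static double entropy(long... counts) {
        long sum = 0;
        double parts = 0.0;
        for (long c : counts) {
            sum += c;
            parts += xLogX(c);
        }
        return xLogX(sum) - parts;
    }

    // Log-likelihood ratio (G^2) for the table [[k11, k12], [k21, k22]].
    static double llr(long k11, long k12, long k21, long k22) {
        double rows = entropy(k11 + k12, k21 + k22);
        double cols = entropy(k11 + k21, k12 + k22);
        double cells = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point rounding.
        return Math.max(0.0, 2.0 * (rows + cols - cells));
    }

    // Each edge is {follower, followed}; group followers by followed node.
    static Map<Long, Set<Long>> followersByNode(long[][] edges) {
        Map<Long, Set<Long>> followers = new HashMap<>();
        for (long[] e : edges) {
            followers.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return followers;
    }

    // LLR between two followed nodes, given the total number of users.
    static double nodeLlr(Map<Long, Set<Long>> followers,
                          long a, long b, long totalUsers) {
        Set<Long> fa = followers.getOrDefault(a, Collections.emptySet());
        Set<Long> fb = followers.getOrDefault(b, Collections.emptySet());
        long both = 0;
        for (Long u : fa) {
            if (fb.contains(u)) {
                both++;
            }
        }
        long onlyA = fa.size() - both;
        long onlyB = fb.size() - both;
        long neither = totalUsers - fa.size() - fb.size() + both;
        return llr(both, onlyA, onlyB, neither);
    }

    public static void main(String[] args) {
        // Hypothetical toy graph: users 1-3 follow both 100 and 200.
        long[][] edges = {
            {1, 100}, {2, 100}, {3, 100},
            {1, 200}, {2, 200}, {3, 200},
            {4, 300}, {5, 300}
        };
        Map<Long, Set<Long>> followers = followersByNode(edges);
        System.out.println(nodeLlr(followers, 100L, 200L, 5));
    }
}
```

Note that the ratio measures strength of association in either direction, so a strongly anti-correlated pair also gets a large score. With a large, sparse graph the "neither" count dominates and this matters much less.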

Or, yes, you can calculate some meaningful edge weight to add more info
to your model. Maybe the number of times the two users interacted? The
resulting number can be used as a 'rating', although I don't know if
you will get great results since it doesn't act a lot like a rating.
Instead, use the log of this number.
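
As a rough sketch of the log idea, the snippet below turns a per-edge interaction count into a (user, item, value) CSV line. The interaction counts are hypothetical, since the edge list in this thread carries no counts. The class and method names are made up for illustration. Using log(1 + count) rather than a plain log is an assumption on my part, chosen so that a single interaction does not map to a weight of zero.

```java
import java.util.Locale;

class LogWeightSketch {
    // Damp a raw interaction count; log1p(n) = ln(1 + n) is an assumed
    // variant of "the log of this number" that keeps count 1 above zero.
    static double weight(long interactions) {
        if (interactions < 1) {
            throw new IllegalArgumentException("count must be at least 1");
        }
        return Math.log1p(interactions);
    }

    // Format one edge as "user,item,value" with four decimal places.
    static String toCsvLine(long user, long item, long interactions) {
        return user + "," + item + ","
                + String.format(Locale.ROOT, "%.4f", weight(interactions));
    }

    public static void main(String[] args) {
        // Hypothetical counts for two of the example edges.
        System.out.println(toCsvLine(3432, 5098, 1));
        System.out.println(toCsvLine(4490, 5843, 25));
    }
}
```

The log compresses heavy interaction counts so that a pair with 100 interactions is not treated as 100 times stronger than a pair with one.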

Or, use an algorithm that is comfortable with count-like input, like
ALS with the implicit data option turned on.

Sean
