Re: Social Network Link Prediction in Mahout

2013-06-09 Thread Peter Holland
Thanks Sean,
I'm just looking at section "3.3 Coping without preference values" of the
Mahout in Action book, which explains how to work with
GenericBooleanPrefDataModel.






Re: Social Network Link Prediction in Mahout

2013-06-08 Thread Sean Owen
Use an implementation that doesn't expect a rating. These are the
so-called 'boolean' implementations, like GenericBooleanPrefDataModel.
For example, you can build an item-based recommender with the boolean
version of the item-based recommender and a log-likelihood similarity.
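
To make the log-likelihood idea concrete, here is a minimal, self-contained
sketch of the log-likelihood ratio statistic computed from a 2x2 co-occurrence
table of followers. It does not call Mahout; the class name LlrSketch and the
helper similarity() are made up for illustration, and Mahout's own
LogLikelihoodSimilarity may scale or normalise the score differently.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration class; not part of Mahout.
final class LlrSketch {

    private static double xLogX(long x) {
        return x == 0 ? 0.0 : x * Math.log(x);
    }

    /** Unnormalised entropy of a group of counts: N*ln(N) - sum(k*ln(k)). */
    private static double entropy(long... counts) {
        long sum = 0;
        double result = 0.0;
        for (long c : counts) {
            result += xLogX(c);
            sum += c;
        }
        return xLogX(sum) - result;
    }

    /**
     * Log-likelihood ratio for a 2x2 contingency table:
     * k11 = users following both nodes, k12 = only the first,
     * k21 = only the second, k22 = neither.
     */
    static double logLikelihoodRatio(long k11, long k12, long k21, long k22) {
        double rowEntropy = entropy(k11 + k12, k21 + k22);
        double columnEntropy = entropy(k11 + k21, k12 + k22);
        double matrixEntropy = entropy(k11, k12, k21, k22);
        // Clamp tiny negative values caused by floating-point round-off.
        return Math.max(0.0, 2.0 * (rowEntropy + columnEntropy - matrixEntropy));
    }

    /** Scores two nodes from their follower sets; totalUsers must cover both sets. */
    static double similarity(Set<Long> followersA, Set<Long> followersB, long totalUsers) {
        Set<Long> both = new HashSet<>(followersA);
        both.retainAll(followersB);
        long k11 = both.size();
        long k12 = followersA.size() - k11;
        long k21 = followersB.size() - k11;
        long k22 = totalUsers - k11 - k12 - k21;
        return logLikelihoodRatio(k11, k12, k21, k22);
    }
}
```

A score near zero means the overlap in followers is about what chance would
produce; larger scores mean the two nodes share followers more (or less) often
than independence would predict. No ratings are needed, only who follows whom.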

Or, yes, you can calculate some meaningful edge weight to add more info
to your model. Maybe the number of times the two users interacted? The
resulting number can be used as a 'rating', although I don't know if
you will get great results, since it doesn't act a lot like a rating.
Instead, use the log of this number.
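
As a rough sketch of that log transform: the code below counts repeated
"from,to" interaction records and emits one weight per edge. The interaction
log is hypothetical (the edge list in the question has one row per edge), the
class name LogEdgeWeights is invented, and using log(1 + count) rather than
plain log(count) is an assumption made here so that a single interaction does
not get a weight of zero.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of Mahout.
final class LogEdgeWeights {

    /** Counts repeated "from,to" records and returns log(1 + count) per edge. */
    static Map<String, Double> weigh(Iterable<String> interactions) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : interactions) {
            String[] parts = line.split(",");
            if (parts.length < 2) {
                continue; // skip blank or malformed rows
            }
            String edge = parts[0].trim() + "," + parts[1].trim();
            counts.merge(edge, 1, Integer::sum);
        }
        Map<String, Double> weights = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            // log1p(n) = ln(1 + n): damps large counts, keeps single interactions above 0.
            weights.put(e.getKey(), Math.log1p(e.getValue()));
        }
        return weights;
    }

    public static void main(String[] args) {
        // Made-up interaction log: 3432 interacted with 5098 three times.
        List<String> sample = Arrays.asList(
                "3432, 5098", "3432, 5098", "3432, 5098", "4490, 2039");
        // Prints one "user,item,value" line per distinct edge.
        weigh(sample).forEach((edge, w) -> System.out.println(edge + "," + w));
    }
}
```

Each printed line has the (user, item, value) shape that the rating-based
recommender classes expect.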

Or, use an algorithm that is comfortable with count-like input, like
ALS with the "implicit data" option turned on.

Sean



Social Network Link Prediction in Mahout

2013-06-08 Thread Peter Holland
Hi All,
I am trying to use Mahout for Link Prediction in a Social Network.

The data I have is an edge list with 9.4 million rows. The edge list is a
CSV file where each node is an integer value and a row represents an edge
between two nodes. For example:

3432, 5098
3423, 6710
4490, 5843
4490, 2039
.

This is a directed graph so row 1 means that node 3432 follows node 5098.
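
For illustration, a minimal sketch of reading rows in this format into a
directed "follows" map, where the first column is the follower and the second
is the node being followed. The class name EdgeListSketch is invented, and the
rows are passed in as strings rather than read from the real file.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Mahout.
final class EdgeListSketch {

    /** Maps each follower to the set of nodes it follows (direction is preserved). */
    static Map<Long, Set<Long>> parse(Iterable<String> rows) {
        Map<Long, Set<Long>> follows = new TreeMap<>();
        for (String row : rows) {
            String[] parts = row.split(",");
            if (parts.length != 2) {
                continue; // skip blank or malformed rows
            }
            long follower = Long.parseLong(parts[0].trim());
            long followee = Long.parseLong(parts[1].trim());
            follows.computeIfAbsent(follower, k -> new TreeSet<>()).add(followee);
        }
        return follows;
    }
}
```

Because the graph is directed, 5098 appears only as a value for 3432 and gets
no entry of its own unless it follows someone.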

I would like to build a recommender to calculate the top 10 nodes a user
might like to connect to next. The problem I have is that the recommender
classes need input in the form (user, item, value). So, how can I first
calculate a value to represent the 'weight' of an edge? For example,
EdgeRank?

Any help would be greatly appreciated.
Thank you,
Peter