Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-10 Thread Pat Ferrel
There are many algorithms in Mahout but not all are equal. Some combinations 
never perform well even though they are described in Mahout in Action. The 
combination below is probably not the best.

You seem to assume your user similarity metric is better than Mahout’s? Do you 
have more users or items?

If I were you I'd try user or item based recs in Mahout using LLR similarity.
It’s always performed best when I’ve compared. I say this partly because I
know of no way to do what you ask without writing some code, and partly
because I bet it will outperform.

Also be aware that the only good way to compare completely different 
recommenders is A/B user testing.

On Feb 10, 2015, at 3:39 AM, Eugenio Tacchini wrote:

Hi all,
I am new to Mahout but I work with recommender systems. I have just tried
to implement a simple user-based recommender:

DataModel dm = new FileDataModel(new File("data/ratings.dat"));

UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);

UserNeighborhood neighborhood =
    new ThresholdUserNeighborhood(0.1, similarity, dm);

UserBasedRecommender r =
    new GenericUserBasedRecommender(dm, neighborhood, similarity);

I would like to compare the results of this recommender with another I
implemented using another technology. The only difference between the two
algorithms is the way I choose neighbors. Since I am not very fluent in
Java, instead of implementing the second algorithm in Mahout I would like
to manually specify the neighbors for each user. Is this possible? What is
the easiest way to provide an alternative user-user similarity matrix
(computed using my algorithm)?

Just to recap: I want to use GenericUserBasedRecommender but provide an
alternative user similarity matrix, without reimplementing my similarity
algorithm in Java. Basically, if I could import the similarities from a text
file it would be great, but other methods are fine as well.

Thanks a lot in advance.

Eugenio Tacchini



Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-11 Thread Eugenio Tacchini
Hello Pat and thanks for your reply,
I know that when users >> items, item-based normally works better, and I
don't assume my similarity metric works better. For research purposes,
though, I have to compare:

- RMSE produced by a Pearson correlation user-based algorithm VS
- RMSE produced by a user-based algorithm where similarities are computed
in a completely different and non-standard way (algorithm implemented in C)

so I am looking for a way to manually assign the user similarities. The
test will be performed on just a couple of datasets, so it's fine if I have
to hard-code the assignment.
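
For reference, RMSE over n held-out ratings is
sqrt((1/n) * sum((predicted_i - actual_i)^2)). The sketch below computes it
in plain Java so that both recommenders can be scored with the same formula.
The class and method names (RmseSketch, rmse) are made up for illustration
and are not Mahout APIs. Taste also ships an RMSRecommenderEvaluator, which
may be the more convenient route inside Mahout.

```java
/** Hypothetical helper, not part of Mahout: RMSE over paired ratings. */
final class RmseSketch {
    private RmseSketch() {}

    /** Returns sqrt(mean((predicted - actual)^2)) over all pairs. */
    static double rmse(double[] predicted, double[] actual) {
        if (predicted.length != actual.length || predicted.length == 0) {
            throw new IllegalArgumentException(
                "arrays must be non-empty and of equal length");
        }
        double sumSquaredError = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double diff = predicted[i] - actual[i];
            sumSquaredError += diff * diff;
        }
        return Math.sqrt(sumSquaredError / predicted.length);
    }
}
```

Whichever implementation is used, the two algorithms should be scored on the
same held-out ratings, otherwise the RMSE values are not comparable.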

Eugenio




Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-11 Thread Juanjo Ramos
You can create a custom class with your similarity implementation. All
you need is for that class to implement the UserSimilarity interface. Then
use it here

UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);

instead of PearsonCorrelationSimilarity:

// CustomUserSimilarity implements UserSimilarity
UserSimilarity similarity = new CustomUserSimilarity(dm);

If the implementation of that CustomUserSimilarity is in C, you may want to
look into JNI (Java Native Interface) to call C code from Java.

Best,
Juanjo.



Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-11 Thread Eugenio Tacchini
Yes, I know I can implement a custom user similarity, but what I want to do
is pass fixed, pre-computed user similarities, which I have already stored
in a text file, to Mahout in the easiest way possible, since I am not a Java
programmer.

If there is no way to do it, I will implement CustomUserSimilarity just by
reading the text file, storing its contents in memory and returning the
corresponding similarity. I should make sure the text file is read just
once, though.
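
A minimal sketch of that plan, assuming a hypothetical file layout of one
userID1,userID2,similarity triple per line and symmetric similarities. It
uses only the JDK. FileUserSimilaritySketch and its method are illustrative
names, and wiring this into Mahout would still mean wrapping the lookup in a
class that implements the UserSimilarity interface. The lines are parsed
once, in the constructor, and every later lookup is served from memory.
Returning NaN for unknown pairs is an assumption about how missing entries
should be treated.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory table of pre-computed user-user similarities. */
final class FileUserSimilaritySketch {
    // Outer key: smaller user ID; inner key: larger user ID.
    private final Map<Long, Map<Long, Double>> table = new HashMap<>();

    /** Parses lines of the assumed form "userID1,userID2,similarity" once. */
    FileUserSimilaritySketch(List<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // tolerate blank lines
            }
            String[] parts = line.split(",");
            if (parts.length != 3) {
                throw new IllegalArgumentException("bad line: " + raw);
            }
            long a = Long.parseLong(parts[0].trim());
            long b = Long.parseLong(parts[1].trim());
            double s = Double.parseDouble(parts[2].trim());
            table.computeIfAbsent(Math.min(a, b), k -> new HashMap<>())
                 .put(Math.max(a, b), s);
        }
    }

    /** Symmetric lookup; NaN means the pair was not in the file. */
    double userSimilarity(long userID1, long userID2) {
        Map<Long, Double> row = table.get(Math.min(userID1, userID2));
        Double value = (row == null) ? null : row.get(Math.max(userID1, userID2));
        return (value == null) ? Double.NaN : value;
    }
}
```

The lines could come from a single call such as
Files.readAllLines(Paths.get("similarities.csv")), where the file name is a
placeholder.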

Eugenio





Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-11 Thread Juanjo Ramos
Yes. Your approach sounds about right.

As far as I know, you cannot just pass Mahout a file of user similarities
and have it create a UserSimilarity object, as it can do with the
DataModel.

When I have done something like that in the past, I had to write my own
code to parse the file and load it into memory.



Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-13 Thread Eugenio Tacchini
Ok, thanks for your support.

Eugenio


Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-13 Thread Eugenio Tacchini
I am trying to add the fixed user similarities in the easiest possible way.

This is my starting code (a normal user-based algorithm based on Pearson
Correlation):

UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);

UserNeighborhood neighborhood =
    new NearestNUserNeighborhood(15, 0.1, similarity, dm);
GenericUserBasedRecommender recommender =
    new GenericUserBasedRecommender(dm, neighborhood, similarity);

I would say my (pseudo) code will be:

// I don't need this anymore:
// UserSimilarity similarity = new PearsonCorrelationSimilarity(dm);

// I don't need this anymore:
// UserNeighborhood neighborhood =
//     new NearestNUserNeighborhood(15, 0.1, similarity, dm);

1) Read the similarities from a file ...

2) Build the neighborhood and similarity objects according to my matrix.

3) GenericUserBasedRecommender recommender =
       new GenericUserBasedRecommender(dm, neighborhood, similarity);

Part 2) is the most difficult. I thought the neighborhood object stored,
for each user, that user's neighbors, but from inspecting it in Eclipse I
see it holds a lot of other information and the neighbors do not seem to be
listed there. Are they instead retrieved using getUserNeighborhood +
userSimilarity? I am getting lost here, also because I am almost new to Java.
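
As far as I can tell, a Taste neighborhood does not store a fixed neighbor
list: getUserNeighborhood(userID) derives it from whatever UserSimilarity it
was given, so a custom similarity may be all that step 2 needs. The
plain-Java sketch below shows the idea behind a nearest-N selection with a
minimum similarity threshold. NeighborhoodSketch, PairSimilarity and
nearestN are hypothetical names, not Mahout classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of deriving neighbors from a similarity. */
final class NeighborhoodSketch {
    private NeighborhoodSketch() {}

    /** Any source of pairwise similarities, e.g. a table read from a file. */
    interface PairSimilarity {
        double between(long userA, long userB);
    }

    /**
     * Returns up to n other users whose similarity to userID is at least
     * minSimilarity, ordered from most to least similar.
     */
    static long[] nearestN(long userID, long[] allUserIDs,
                           PairSimilarity similarity, int n,
                           double minSimilarity) {
        List<Long> candidates = new ArrayList<>();
        for (long other : allUserIDs) {
            if (other == userID) {
                continue; // a user is not their own neighbor
            }
            double s = similarity.between(userID, other);
            if (!Double.isNaN(s) && s >= minSimilarity) {
                candidates.add(other);
            }
        }
        // Sort by similarity to userID, highest first.
        candidates.sort(Comparator.comparingDouble(
            (Long other) -> similarity.between(userID, other)).reversed());
        int size = Math.min(n, candidates.size());
        long[] result = new long[size];
        for (int i = 0; i < size; i++) {
            result[i] = candidates.get(i);
        }
        return result;
    }
}
```

The arguments 15 and 0.1 in the NearestNUserNeighborhood call above play the
roles of n and minSimilarity here.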

Is there anyone who can give me some hints about this task?

Thanks a lot in advance.

Eugenio



Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-13 Thread Ted Dunning
On Fri, Feb 13, 2015 at 11:11 AM, Eugenio Tacchini <
eugenio.tacch...@gmail.com> wrote:

> Is there anyone who can give me some hints about this task?
>

Another way to look at this is to try to wedge this into the item
similarity code.

There are hooks available in the map-reduce version of item similarity to
put an arbitrary user distance in.  This only works well if there are
sparsity constraints that limit the number of distances that need to be
computed, but if it works, it can be really excellent.  This would allow
you to put your distances in and still use an indicator-based recommender.


Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-13 Thread Pat Ferrel
If the user -> similar users relationship is really fixed for some test, this
isn’t even a Mahout problem. All you need to do is create a linear combination
of all the similar users' preferences and rank accordingly. This produces
ranked recs for some “current user”. If you have a record of user preferences
and similar users, a DB will do this just fine for a test.
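
One way to read the linear combination, sketched in plain Java: score each
item the current user has not rated by the similarity-weighted average of the
neighbors' ratings, then sort by score. The weighting scheme (dividing by the
sum of absolute similarities) and all names here are assumptions for
illustration, not something specified in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical ranking by a weighted combination of neighbor ratings. */
final class WeightedNeighborRecsSketch {
    private WeightedNeighborRecsSketch() {}

    /**
     * neighborWeights: neighbor user ID -> similarity to the current user.
     * ratings: user ID -> (item ID -> rating).
     * Returns unrated item IDs, best predicted score first.
     */
    static List<Long> rank(Map<Long, Double> neighborWeights,
                           Map<Long, Map<Long, Double>> ratings,
                           long currentUser) {
        Map<Long, Double> weightedSum = new HashMap<>();
        Map<Long, Double> weightTotal = new HashMap<>();
        Map<Long, Double> alreadyRated =
            ratings.getOrDefault(currentUser, new HashMap<>());

        for (Map.Entry<Long, Double> neighbor : neighborWeights.entrySet()) {
            double w = neighbor.getValue();
            Map<Long, Double> prefs = ratings.get(neighbor.getKey());
            if (prefs == null) {
                continue; // neighbor has no recorded ratings
            }
            for (Map.Entry<Long, Double> pref : prefs.entrySet()) {
                if (alreadyRated.containsKey(pref.getKey())) {
                    continue; // only recommend items the user has not rated
                }
                weightedSum.merge(pref.getKey(), w * pref.getValue(), Double::sum);
                weightTotal.merge(pref.getKey(), Math.abs(w), Double::sum);
            }
        }

        List<Long> items = new ArrayList<>(weightedSum.keySet());
        items.removeIf(i -> weightTotal.get(i) == 0.0); // avoid dividing by zero
        items.sort((a, b) -> Double.compare(
            weightedSum.get(b) / weightTotal.get(b),
            weightedSum.get(a) / weightTotal.get(a)));
        return items;
    }
}
```

The same computation can be expressed as a join and a GROUP BY over a ratings
table and a neighbors table, which is presumably what doing it in a DB would
look like.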

The current code in spark-rowsimilarity will give similar users based on 
interaction input data using LLR. Adding a custom distance metric to 
SimilarityAnalysis.rowSimilarity should be pretty easy.

So you have several ways to go using new code or old Taste code. To make it 
work generally you’ll have to write some code since your metric is really new.





Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-14 Thread Eugenio Tacchini
Hi Ted, I don't have any sparsity constraints: I need all the distances, but
they are already computed. I have a text file which gives the pairwise
distances among all the users, and I need to feed these distances to the
Mahout user-based algorithm.

Hi Pat, I don't understand why it is not a Mahout problem. My goal is to
evaluate (by RMSE) the output of a user-based algorithm while comparing
different user similarity measures. Mahout already has everything I need,
except that I cannot give it a custom similarity matrix as input.

Eugenio



Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-14 Thread Pat Ferrel
I just meant that you can make recommendations with the data you have, without 
using Mahout. But I see now that you are trying to use it to calculate RMSE. 
And that requires Taste. I believe using it has already been described below.
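
As a rough illustration of the "linear combination of the similar users'
preferences" mentioned in the quoted message below, here is a minimal Java
sketch that needs no Mahout at all. The class name NeighborScorer, the method
score, and the in-memory map layout are all made up for this example, and
normalizing by the sum of the similarities is only one common choice, not
something the thread prescribes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout.
class NeighborScorer {
    /**
     * Scores every item rated by at least one neighbor as the
     * similarity-weighted average of those neighbors' ratings.
     *
     * neighborSims: neighbor user ID -> similarity to the current user
     * ratings:      user ID -> (item ID -> rating)
     */
    static Map<Long, Double> score(Map<Long, Double> neighborSims,
                                   Map<Long, Map<Long, Double>> ratings) {
        Map<Long, Double> weightedSum = new HashMap<>();
        Map<Long, Double> weightTotal = new HashMap<>();
        for (Map.Entry<Long, Double> n : neighborSims.entrySet()) {
            double sim = n.getValue();
            Map<Long, Double> prefs = ratings.get(n.getKey());
            // Assumption: neighbors without ratings or with non-positive
            // similarity are ignored.
            if (prefs == null || sim <= 0.0) {
                continue;
            }
            for (Map.Entry<Long, Double> p : prefs.entrySet()) {
                weightedSum.merge(p.getKey(), sim * p.getValue(), Double::sum);
                weightTotal.merge(p.getKey(), sim, Double::sum);
            }
        }
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Double> e : weightedSum.entrySet()) {
            scores.put(e.getKey(), e.getValue() / weightTotal.get(e.getKey()));
        }
        return scores;
    }
}
```

Sorting the returned map by value gives the ranked list. The sketch does not
remove items the current user has already rated; a real test would have to
filter those out first.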

It should be noted that, except for a few special cases, RMSE is no longer 
considered a very good test of recommenders. It is not useful unless you are 
really trying to predict ratings. If you want to optimize _ranking_, in other 
words to show the best n recommendations, you want a precision metric 
like MAP (mean average precision). 

MAP is not built into any flavor of Mahout. It should also be noted that 
offline comparison of different algorithms is fraught with problems and so 
should only be undertaken with a good degree of skepticism. 
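
Since MAP is not available out of the box, a minimal Java sketch of it may
help. The class and method names (RankingMetrics, averagePrecision,
meanAveragePrecision) are invented for this example. It assumes one ranked
list of item IDs and one set of held-out relevant items per user, and it
divides by the number of relevant items, which is one common convention among
several.

```java
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout.
class RankingMetrics {
    /** Average precision of one ranked list against a set of relevant items. */
    static double averagePrecision(List<Long> ranked, Set<Long> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                // Precision at this rank: hits so far / items seen so far.
                sum += (double) hits / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    /** Mean of the per-user average precision; the lists are parallel. */
    static double meanAveragePrecision(List<List<Long>> rankedPerUser,
                                       List<Set<Long>> relevantPerUser) {
        if (rankedPerUser.isEmpty()) {
            return 0.0;
        }
        double total = 0.0;
        for (int u = 0; u < rankedPerUser.size(); u++) {
            total += averagePrecision(rankedPerUser.get(u), relevantPerUser.get(u));
        }
        return total / rankedPerUser.size();
    }
}
```

For a ranked list 1, 2, 3, 4 with relevant items 1 and 3, the precision is
1/1 at the first hit and 2/3 at the second, so the average precision is
(1 + 2/3) / 2, roughly 0.83.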

On Feb 14, 2015, at 6:05 AM, Eugenio Tacchini wrote:

Hi Ted, I don't have any sparsity constraints: I need all the distances,
but they are already computed. I have a text file with the pairwise
distances among all the users, and I need to feed these distances into
Mahout's user-based algorithm.

Hi Pat, I don't understand why it is not a Mahout problem. My goal is to
evaluate (by RMSE) the output of a user-based algorithm while comparing
different user similarity measures. Mahout already has everything I need,
except that I cannot supply a custom similarity matrix as input.

Eugenio

2015-02-13 21:51 GMT+01:00 Pat Ferrel :

> If the user -> similar users relationship is really fixed for some test
> this isn’t even a Mahout problem… All you need to do is create a linear
> combination of all the similar user's preferences and rank accordingly.
> This produces ranked recs for some “current user”. If you have a record of
> user preferences and similar users it’s not even a Mahout thing. A DB will
> do this just fine for a test.
> 
> The current code in spark-rowsimilarity will give similar users based on
> interaction input data using LLR. Adding a custom distance metric to
> SimilarityAnalysis.rowSimilarity should be pretty easy.
> 
> So you have several ways to go using new code or old Taste code. To make
> it work generally you’ll have to write some code since your metric is
> really new.
> 
> 
> On Feb 13, 2015, at 11:14 AM, Ted Dunning  wrote:
> 
> On Fri, Feb 13, 2015 at 11:11 AM, Eugenio Tacchini <
> eugenio.tacch...@gmail.com> wrote:
> 
>> Is there anyone who can give me some hints about this task?
>> 
> 
> Another way to look at this is to try to wedge this into the item
> similarity code.
> 
> There are hooks available in the map-reduce version of item similarity to
> put an arbitrary user distance in.  This only works well if there are
> sparsity constraints that limit the number of distances that need to be
> computed, but if it works, it can be really excellent.  This would allow
> you to put your distances in and still use an indicator-based recommender.
> 
> 



Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-15 Thread Ted Dunning
On Sat, Feb 14, 2015 at 6:05 AM, Eugenio Tacchini <
eugenio.tacch...@gmail.com> wrote:

> Hi Pat, I don't understand why it is not a Mahout problem. My goal is to
> evaluate (by RMSE) the output of a user-based algorithm while comparing
> different user similarity measures. Mahout already has everything I need,
> except that I cannot supply a custom similarity matrix as input.
>

You should be able to implement a special user similarity function that is
simply a lookup.  If you have all the distances, then you aren't worried
about efficiency, by definition.
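
A minimal sketch of such a lookup, in plain Java so that it runs without
Mahout on the classpath. Everything here is an assumption made for
illustration: the class name SimilarityTable, the method names, and the input
format of one "userA,userB,similarity" triple per line. The thread speaks of
distances, so converting them to similarities before loading is also assumed
and left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Mahout.
class SimilarityTable {
    private final Map<Long, Map<Long, Double>> table = new HashMap<>();

    /** Builds the table from lines of the assumed form "userA,userB,similarity". */
    static SimilarityTable fromLines(List<String> lines) {
        SimilarityTable t = new SimilarityTable();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            String[] f = line.split(",");
            if (f.length != 3) {
                throw new IllegalArgumentException("Bad line: " + line);
            }
            long a = Long.parseLong(f[0].trim());
            long b = Long.parseLong(f[1].trim());
            double s = Double.parseDouble(f[2].trim());
            // Store both directions so the lookup is symmetric.
            t.put(a, b, s);
            t.put(b, a, s);
        }
        return t;
    }

    private void put(long a, long b, double s) {
        table.computeIfAbsent(a, k -> new HashMap<>()).put(b, s);
    }

    /** Returns the stored similarity, or NaN when the pair is unknown. */
    double similarity(long a, long b) {
        Map<Long, Double> row = table.get(a);
        Double s = (row == null) ? null : row.get(b);
        return (s == null) ? Double.NaN : s;
    }
}
```

To plug this into Taste, one would still write a small class implementing
Mahout's UserSimilarity interface whose userSimilarity(long, long) method
delegates to similarity(...) above, with the remaining interface methods
(setPreferenceInferrer and refresh, as far as I recall) left as no-ops. That
class can then be passed to ThresholdUserNeighborhood and
GenericUserBasedRecommender in place of PearsonCorrelationSimilarity. The
lines themselves can be read with java.nio.file.Files.readAllLines.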


Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-16 Thread Eugenio Tacchini
Yes, I know that RMSE and offline evaluation have many drawbacks, but I
just need to compute it.

Yes, I need to implement a lookup function. I was wondering what the
easiest way is, since I am not a Java programmer and I started using Mahout
only a few days ago.

Eugenio


2015-02-15 1:10 GMT+01:00 Ted Dunning :

> On Sat, Feb 14, 2015 at 6:05 AM, Eugenio Tacchini <
> eugenio.tacch...@gmail.com> wrote:
>
> > Hi Pat, I don't understand why it is not a Mahout problem. My goal is to
> > evaluate (by RMSE) the output of a user-based algorithm while comparing
> > different user similarity measures. Mahout already has everything I need,
> > except that I cannot supply a custom similarity matrix as input.
> >
>
> You should be able to implement a special user similarity function that is
> simply a lookup.  If you have all the distances, then you aren't worried
> about efficiency, by definition.
>


Re: How can I manually specify user similarities in the user-based algorithm?

2015-02-16 Thread Ted Dunning
On Mon, Feb 16, 2015 at 1:25 AM, Eugenio Tacchini <
eugenio.tacch...@gmail.com> wrote:

> Yes, I need to implement a lookup function. I was wondering what the
> easiest way is, since I am not a Java programmer and I started using Mahout
> only a few days ago.
>

Without Java programming, there won't be any easy way.