Re: Why some userId has no recommendations?

2014-02-13 Thread Koobas
User 3 gave a recommendation to item 107. User 5 did not rate 107. On Thu, Feb 13, 2014 at 1:57 AM, Suresh M suresh4mas...@gmail.com wrote: user 5 has given rating for all 5 books, So there will be no recommendations for him. On 12 February 2014 08:55, jiangwen jiang jiangwen...@gmail.com

Re: Why some userId has no recommendations?

2014-02-13 Thread Koobas
I guess you would get a 107 as a recommendation for 5 if you switched to user-based? On Thu, Feb 13, 2014 at 8:21 AM, Koobas koo...@gmail.com wrote: User 3 gave a recommendation to item 107. User 5 did not rate 107. On Thu, Feb 13, 2014 at 1:57 AM, Suresh M suresh4mas...@gmail.com wrote

Re: Why some userId has no recommendations?

2014-02-12 Thread Koobas
5 should get 107 as a recommendation, whether user-based or item-based. No clue why you're not getting it. On Wed, Feb 12, 2014 at 11:50 PM, jiangwen jiang jiangwen...@gmail.com wrote: Hi, all: I try to use the Mahout API to make recommendations, but I find some userId has no recommendations,

Re: generic latent variable recommender question

2014-01-25 Thread Koobas
to approximate the rating values. That's exactly what I was thinking. Thanks for your reply. On Sat, Jan 25, 2014 at 5:08 AM, Koobas koo...@gmail.com wrote: A generic latent variable recommender question. I passed the user-item matrix through a low rank approximation, with either something like

generic latent variable recommender question

2014-01-24 Thread Koobas
A generic latent variable recommender question. I passed the user-item matrix through a low rank approximation, with either something like ALS or SVD, and now I have the feature vectors for all users and all items. Case 1: I want to recommend items to a user. I compute a dot product of the user’s
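For Case 1, that scoring step can be sketched in NumPy (the toy sizes, the seed, and the names X and Y are assumptions for illustration, not from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, k = 4, 6, 3

# Hypothetical factors standing in for the ALS/SVD output.
X = rng.random((n_users, k))   # user-feature matrix (rows are users)
Y = rng.random((n_items, k))   # item-feature matrix (rows are items)

def recommend(u, top_n=3):
    # One dot product per item against user u's feature vector,
    # then take the highest-scoring items.
    scores = Y @ X[u]
    return np.argsort(scores)[::-1][:top_n]

top = recommend(0)
```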

ALS and SVD feature vectors

2013-09-04 Thread Koobas
In ALS the coincidence matrix is approximated by XY', where X is user-feature, Y is item-feature. Now, here is the question: are/should the feature vectors be normalized before computing recommendations? Now, what happens in the case of SVD? The vectors are normal by definition. Are singular
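To see what normalization changes, a small hedged experiment: raw dot products versus cosine scores over the same assumed random factors. Items with long feature vectors get boosted by raw dot products, so the two rankings can differ:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.random((5, 4))   # user-feature (rows are users)
Y = rng.random((8, 4))   # item-feature (rows are items)

u = X[0]
dot_scores = Y @ u                                      # raw dot products
cos_scores = dot_scores / (np.linalg.norm(Y, axis=1)    # normalized scores
                           * np.linalg.norm(u))

dot_rank = np.argsort(dot_scores)[::-1]
cos_rank = np.argsort(cos_scores)[::-1]
```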

Re: ALS and SVD feature vectors

2013-09-04 Thread Koobas
dlie...@gmail.com wrote: On Wed, Sep 4, 2013 at 10:07 AM, Koobas koo...@gmail.com wrote: In ALS the coincidence matrix is approximated by XY', where X is user-feature, Y is item-feature. Now, here is the question: are/should the feature vectors be normalized before computing

Re: ALS and SVD feature vectors

2013-09-04 Thread Koobas
! Straight to the point. That's the answer I was looking for. Also, thanks to Ted. He pretty much said the same thing. On Wed, Sep 4, 2013 at 6:07 PM, Koobas koo...@gmail.com wrote: In ALS the coincidence matrix is approximated by XY', where X is user-feature, Y is item-feature. Now, here

Re: Paper on Mahout's ALS implementation accepted at RecSys'13

2013-07-23 Thread Koobas
Same request here. Can you share the paper? On Tue, Jul 23, 2013 at 6:47 AM, 刘鎏 liuliu@gmail.com wrote: Congratulations~ By the way, could the paper be shared? THX~ Best, LiuLiu On Mon, Jul 22, 2013 at 2:22 AM, Sebastian Schelter s...@apache.org wrote: I'm happy to announce

Re: Using Mahout for low-volume data

2013-07-15 Thread Koobas
Is a factorizing recommender a better idea for low volume data in general? On Mon, Jul 15, 2013 at 11:35 AM, Ted Dunning ted.dunn...@gmail.com wrote: With such small data, this sounds (without thinking too much) like you are doing reasonably well with LLR similarity. Have you tried a

Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
I am guessing (comments welcome) that it is going to be difficult to guarantee reproducibility under parallel execution conditions. MapReduce has reduction in its name. Reduction operations are the main cause of irreproducibility in parallel codes, because changing the order of summations changes
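A tiny illustration of the point about reductions: the same three floats summed in two different orders give two different answers in IEEE double precision:

```python
# Three floats whose sum depends on the order of accumulation:
vals = [1e16, 1.0, -1e16]

left_to_right = (vals[0] + vals[1]) + vals[2]   # 1.0 is absorbed by 1e16
reordered     = (vals[0] + vals[2]) + vals[1]   # cancellation happens first
```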

Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
On Mon, Jun 24, 2013 at 5:07 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin kazm...@hotmail.com wrote: I agree with you, I should have mentioned earlier that it would be good to separate noise from data and deal with only what is separable.

Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
this will change as soon as CaaS machine learning goes mainstream. On Mon, Jun 24, 2013 at 2:29 PM, Koobas koo...@gmail.com wrote: On Mon, Jun 24, 2013 at 5:07 PM, Dmitriy Lyubimov dlie...@gmail.com wrote: On Mon, Jun 24, 2013 at 1:35 PM, Michael Kazekin kazm...@hotmail.com wrote

Re: Consistent repeatable results for distributed ALS-WR recommender

2013-06-24 Thread Koobas
Well, you know, the issue is there, whether we like it or not. Maybe replication is enough, maybe not. If there is a workshop on that issue, it's on the radar. http://beamtenherrschaft.blogspot.com/2013/06/acm-recsys-2013-workshop-on.html On Mon, Jun 24, 2013 at 6:36 PM, Sean Owen

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Koobas
Since I am primarily an HPC person, probably a naive question from the ML perspective. What if, when computing recommendations, we don't exclude what the user already has, and then see if the items he has end up being recommended to him (compute some appropriate metric / ratio)? Wouldn't that be
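The idea can be sketched as a toy metric (the function name and the sample data are made up for illustration):

```python
def hit_ratio(ranked_items, already_has, n=5):
    """Fraction of the user's known items that show up in the top-n list,
    assuming the recommender did NOT filter out items the user already has."""
    top = ranked_items[:n]
    hits = sum(1 for item in already_has if item in top)
    return hits / min(n, len(already_has))

recs = [3, 7, 1, 9, 4, 2]   # hypothetical unfiltered ranked output
owned = {7, 9, 5}           # items the user already has
ratio = hit_ratio(recs, owned)
```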

Re: evaluating recommender with boolean prefs

2013-06-07 Thread Koobas
of explains why precision/recall can be really low in these tests. I would not be surprised if you get 0 in some cases, on maybe small input. Is it a bad predictor? maybe, but it's not clear. On Fri, Jun 7, 2013 at 8:06 PM, Koobas koo...@gmail.com wrote: Since I am primarily an HPC person

Re: Blending initial recommendations for cross recommendation

2013-05-31 Thread Koobas
I am also very interested in the answer to this question. Just to reiterate, if you use different recommenders, e.g., kNN user-based, kNN item-based, ALS, each one produces recommendations on a different scale. So how do you combine them? On Fri, May 31, 2013 at 3:07 PM, Dominik Hübner
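One common workaround, sketched here as an assumption rather than anything from Mahout: convert each recommender's scores to ranks, which are scale-free, and aggregate the ranks (a Borda-count style blend):

```python
def to_ranks(scores):
    """Rank 0 = best-scored item; ranks ignore the score scale entirely."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks

knn_scores = [0.9, 0.1, 0.5]    # e.g. similarity-weighted sums
als_scores = [12.0, 30.0, 7.0]  # e.g. dot products on a different scale
blended = [a + b for a, b in zip(to_ranks(knn_scores), to_ranks(als_scores))]
best_item = min(range(len(blended)), key=lambda i: blended[i])
```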

Re: Blending initial recommendations for cross recommendation

2013-05-31 Thread Koobas
as a weak, though still useful, inspirational guide. On Fri, May 31, 2013 at 3:18 PM, Koobas koo...@gmail.com wrote: I am also very interested in the answer to this question. Just to reiterate, if you use different recommenders, e.g., kNN user-based, kNN item-based, ALS, each one produces

Re: Clustering product views and sales

2013-05-06 Thread Koobas
Since Dominik mentioned item-based and ALS, let me throw in a question here. I believe that one of the Netflix prize solutions combined KNN and ALS. 1) What is the best way to combine the results of both? 2) Is there really merit to this approach? 3) Are there other combinations that make sense?

Re: Clustering product views and sales

2013-05-06 Thread Koobas
I think I see the picture now. Thanks! On Mon, May 6, 2013 at 5:25 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Mon, May 6, 2013 at 12:50 PM, Koobas koo...@gmail.com wrote: Since Dominik mentioned item-based and ALS, let me throw in a question here. I believe that one of the Netflix

Re: cross recommender

2013-04-15 Thread Koobas
to purchases. All these are implicit preferences but that's not the important part for this technique. On Apr 10, 2013, at 4:15 PM, Koobas koo...@gmail.com wrote: Retail data may be hard to impossible, but one can improvise. It seems to be fairly common to use Wikipedia articles (Myrrix, GraphLab

Re: cross recommender

2013-04-10 Thread Koobas
Retail data may be hard to impossible, but one can improvise. It seems to be fairly common to use Wikipedia articles (Myrrix, GraphLab). Another idea is to use StackOverflow tags (Myrrix examples). Although they are only good for emulating implicit feedback. On Wed, Apr 10, 2013 at 6:48 PM, Ted

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-08 Thread Koobas
Okay, it sheds some light on the problem. Thanks for sharing. On Mon, Apr 8, 2013 at 4:33 AM, Sean Owen sro...@gmail.com wrote: PS I think the issue is really more like this, after some more testing. When lambda (overfitting parameter) is high, the X and Y in the factorization A = X*Y' are

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-06 Thread Koobas
On Fri, Apr 5, 2013 at 8:07 AM, Sean Owen sro...@gmail.com wrote: OK yes you're on to something here. I should clarify. Koobas you are right that the ALS algorithm itself is fine here as far as my knowledge takes me. The thing it inverts to solve for a row of X is something like (Y' * Cu * Y

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-06 Thread Koobas
, but it's not really a matter of condition number or machine precision. Condition numbers are 1 in these cases but not that large. On Sun, Apr 7, 2013 at 12:19 AM, Koobas koo...@gmail.com wrote: I don't see why the inverse of Y'*Y does not exist. What Y do you end up with?

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-05 Thread Koobas
Let me try to wrap my head around it On Fri, Apr 5, 2013 at 8:07 AM, Sean Owen sro...@gmail.com wrote: OK yes you're on to something here. I should clarify. Koobas you are right that the ALS algorithm itself is fine here as far as my knowledge takes me. The thing it inverts to solve

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
On Thu, Apr 4, 2013 at 9:13 AM, Ted Dunning ted.dunn...@gmail.com wrote: Typically, to deal with this kind of problem, you need to follow one of two courses. First, you can use a so-called rank-revealing QR which uses a pivoting strategy to push all of the small elements of R as far down the

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
On Thu, Apr 4, 2013 at 9:36 AM, Sean Owen sro...@gmail.com wrote: Yeah I've got the pivoting part down -- I think. The problem is that I can't seem to identify the problem by simple thresholding. For example, a diagonal like 10 9 8 0.0001 0.001 obviously has a problem. But so might 100 90

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
. Is there established procedure for evaluating the ill-conditioned-ness of matrices -- like a principled choice of threshold above which you say it's ill-conditioned, based on k, etc.? On Thu, Apr 4, 2013 at 3:19 PM, Koobas koo...@gmail.com wrote: So, the problem is that the kxk matrix is ill
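One standard heuristic (an assumption here, not something settled in the thread) is to threshold singular values at max(m, n) * machine epsilon * s_max, which is the default tolerance NumPy's matrix_rank uses. Applied to diagonals like the ones discussed above:

```python
import numpy as np

def numerical_rank(M):
    # Standard rank tolerance: max(m, n) * eps * largest singular value.
    s = np.linalg.svd(M, compute_uv=False)
    tol = max(M.shape) * np.finfo(M.dtype).eps * s[0]
    return int((s > tol).sum())

clearly_bad = np.diag([10.0, 9.0, 8.0, 1e-16, 1e-17])  # rank-deficient
borderline  = np.diag([10.0, 9.0, 8.0, 1e-4, 1e-3])    # small but above tol
```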

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
of matrices -- like a principled choice of threshold above which you say it's ill-conditioned, based on k, etc.? On Thu, Apr 4, 2013 at 3:19 PM, Koobas koo...@gmail.com wrote: So, the problem is that the kxk matrix is ill-conditioned, or is there more to it?

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
I took Movie Lens 100K data without ratings and ran non-weighted ALS in Matlab. I set the number of features to k=2000, which is larger than either dimension of the input matrix (1000 x 1700). I used QR to do the inversion. It runs without problems. Can you share your data? On Thu, Apr 4, 2013 at 1:10 PM, Koobas koo

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
dummy data like below, without maybe k=10. If it completes with error that's a problem! Okay, let me try it 0,0,1 0,1,4 0,2,3 1,2,3 2,1,4 2,3,3 2,4,2 3,0,5 3,2,2 3,4,3 4,3,5 5,0,2 5,1,4 On Thu, Apr 4, 2013 at 7:05 PM, Koobas koo...@gmail.com wrote: I took Movie Lens 100K
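For reference, those 13 triples form a 6 x 5 matrix, and a minimal non-weighted ALS on them can be sketched in NumPy (k, lambda, the seed, and the 20 sweeps are assumed values; zeros are fit as real ratings, as in the Matlab experiment):

```python
import numpy as np

# The 13 (user, item, rating) triples from the message, as a 6 x 5 matrix:
A = np.array([[1, 4, 3, 0, 0],
              [0, 0, 3, 0, 0],
              [0, 4, 0, 3, 2],
              [5, 0, 2, 0, 3],
              [0, 0, 0, 5, 0],
              [2, 4, 0, 0, 0]], dtype=float)

m, n = A.shape
k, lam = 2, 0.1                 # assumed rank and ridge term
rng = np.random.default_rng(0)
X, Y = rng.random((m, k)), rng.random((n, k))

for _ in range(20):
    # Alternate ridge-regularized least-squares solves over the small
    # k x k normal equations; no per-entry weighting.
    X = np.linalg.solve(Y.T @ Y + lam * np.eye(k), Y.T @ A.T).T
    Y = np.linalg.solve(X.T @ X + lam * np.eye(k), X.T @ A).T

err = np.linalg.norm(A - X @ Y.T)
```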

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
:04 PM, Koobas koo...@gmail.com wrote: On Thu, Apr 4, 2013 at 2:23 PM, Sean Owen sro...@gmail.com wrote: Does it complete without problems? It may complete without error but the result may be garbage. The matrix that's inverted is not going to be singular due to round-off. Even if it's

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
k was 10 On Thu, Apr 4, 2013 at 3:37 PM, Koobas koo...@gmail.com wrote: No major problems: A = [1 4 3 0 0; 0 0 3 0 0; 0 4 0 3 2; 5 0 2 0 3; 0 0 0 5 0; 2 4 0 0 0]

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
Sorry, the image was off. This is more like it: [image: Inline image 1] On Thu, Apr 4, 2013 at 3:38 PM, Koobas koo...@gmail.com wrote: k was 10 On Thu, Apr 4, 2013 at 3:37 PM, Koobas koo...@gmail.com wrote: No major problems: A = 1 4 3 0 0 0 0 3

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
0 0 0 0 1.4307 0.1803 0 0 0 0 0 0 0 0 0 1.1404 On Thu, Apr 4, 2013 at 3:46 PM, Koobas koo...@gmail.com wrote: Sorry, the image was off. This is more like it: [image: Inline image 1] On Thu, Apr 4

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
BTW, my initialization of X and Y is simply random: X = rand(m,k); Y = rand(k,n); On Thu, Apr 4, 2013 at 3:51 PM, Koobas koo...@gmail.com wrote: It's done in one iteration. This is the R from QR factorization: 5.0663 5.8122 4.9704 4.3987 6.3400 4.5970 5.0334 4.2581

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
Makes perfect sense. Thanks for the explanation. On Thu, Apr 4, 2013 at 6:11 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Thu, Apr 4, 2013 at 4:16 PM, Koobas koo...@gmail.com wrote: The Mahout QR that I whipped up a couple of months ago is not rank revealing, but it is pretty easy

Re: Detecting rank-deficiency, or worse, via QR decomposition

2013-04-04 Thread Koobas
at 8:54 PM, Koobas koo...@gmail.com wrote: BTW, my initialization of X and Y is simply random: X = rand(m,k); Y = rand(k,n); On Thu, Apr 4, 2013 at 3:51 PM, Koobas koo...@gmail.com wrote: It's done in one iteration. This is the R from QR factorization: 5.0663 5.8122

Re: Regarding ItemBased Recommendation Results

2013-03-28 Thread Koobas
Are the suggestions completely different, or somewhat different? What about the neighborhoods? On Thu, Mar 28, 2013 at 10:09 AM, ch raju ch.raju...@gmail.com wrote: Hi all, I am working on mahout-0.7 recommendations, ran following command from the command line ./bin/mahout

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Koobas
matrix? Are there any indicators that it results in better recommendations? Koobas

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Koobas
On Mon, Mar 25, 2013 at 9:52 AM, Sean Owen sro...@gmail.com wrote: On Mon, Mar 25, 2013 at 1:41 PM, Koobas koo...@gmail.com wrote: But the assumption works nicely for click-like data. Better still when you can weakly prefer to reconstruct the 0 for missing observations and much more

Re: Mathematical background of ALS recommenders

2013-03-25 Thread Koobas
regularization entirely. I misspoke. I meant lambda=1. On Mon, Mar 25, 2013 at 2:14 PM, Koobas koo...@gmail.com wrote: On Mon, Mar 25, 2013 at 9:52 AM, Sean Owen sro...@gmail.com wrote: On Mon, Mar 25, 2013 at 1:41 PM, Koobas koo...@gmail.com wrote: But the assumption works nicely for click-like

Re: Mathematical background of ALS recommenders

2013-03-24 Thread Koobas
will chip in. Koobas On Sun, Mar 24, 2013 at 10:19 PM, Dominik Huebner cont...@dhuebner.com wrote: It's quite hard for me to get the mathematical concepts of the ALS recommenders. It would be great if someone could help me to figure out the details. This is my current status: 1. The item-feature (M

Re: reproducibility

2013-03-17 Thread Koobas
. Not sure about KNN though. On Sun, Mar 17, 2013 at 3:03 AM, Koobas koo...@gmail.com wrote: Can anybody shed any light on the issue of reproducibility in Mahout, with and without Hadoop, specifically in the context of kNN and ALS recommenders?

Re: reproducibility

2013-03-17 Thread Koobas
. On Sun, Mar 17, 2013 at 1:43 PM, Koobas koo...@gmail.com wrote: I am asking the basic reproducibility question. If I run twice on the same dataset, with the same hardware setup, will I always get the same resuts? Or is there any chance that on two different runs, the same user will get

Re: Problem in Deploying Mahout Recommender As a Web Service

2013-03-13 Thread Koobas
On Wed, Mar 13, 2013 at 5:01 AM, Manuel Blechschmidt manuel.blechschm...@gmx.de wrote: Hi Reinhard, here you go: https://github.com/ManuelB/facebook-recommender-demo The example above provides a SOAP interface and a REST interface using Java EE 6. It is not scalable for a lot of reasons

GenericUserBasedRecommender vs GenericItemBasedRecommender

2013-02-21 Thread Koobas
In the GenericUserBasedRecommender the concept of a neighborhood seems to be fundamental. I.e., it is a classic implementation of the kNN algorithm. But it is not the case with the GenericItemBasedRecommender. I understand that the two approaches are not meant to be completely symmetric, but
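For contrast, the user-based side of that symmetry can be sketched in a few lines (a toy cosine-similarity kNN with assumed data, not the actual GenericUserBasedRecommender internals):

```python
import numpy as np

# Toy ratings matrix: rows are users, columns are items.
R = np.array([[5, 4, 0, 0],
              [4, 5, 1, 0],
              [0, 0, 4, 5],
              [1, 0, 5, 4]], dtype=float)

def topk_neighbors(u, k=1):
    # Cosine similarity of user u against every other user.
    norms = np.linalg.norm(R, axis=1)
    sims = (R @ R[u]) / (norms * norms[u])
    sims[u] = -np.inf               # exclude the user themselves
    return np.argsort(sims)[::-1][:k]

def score_items(u, k=1):
    # Score items by averaging the ratings of the k nearest neighbors.
    nbrs = topk_neighbors(u, k)
    return R[nbrs].mean(axis=0)
```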

Re: GenericUserBasedRecommender vs GenericItemBasedRecommender

2013-02-21 Thread Koobas
do I find more information about this? And thanks for the instantaneous reply :) On Thu, Feb 21, 2013 at 2:37 PM, Koobas koo...@gmail.com wrote: In the GenericUserBasedRecommender the concept of a neighborhood seems to be fundamental. I.e., it is a classic implementation of the kNN

Re: Boolean preferences and evaluation

2013-01-25 Thread Koobas
to N users but the quality of recommendations overall. In this particular data set, which is rich and un-noisy, the ratings are probably valuable information and I imagine you will do better with any approach that doesn't drop them. On Fri, Jan 25, 2013 at 2:19 AM, Koobas koo...@gmail.com

Re: Boolean preferences and evaluation

2013-01-24 Thread Koobas
A naive question: Boolean recommender means that we are ignoring ratings, but aren't recommendations still weighted by user-user similarities or item-item similarities? Which would also mean that increasing the neighborhood will not deteriorate the results, because bad contributions from farther

Re: Boolean preferences and evaluation

2013-01-24 Thread Koobas
On Thu, Jan 24, 2013 at 7:41 PM, Ted Dunning ted.dunn...@gmail.com wrote: That doesn't mean that is a bad recommendation. People don't rate things for simple reasons. Generally, they rate things that are close to what they like and they rate things negatively that are very close to what

Re: Any utility to solve the matrix inversion in Map/Reduce Way

2013-01-22 Thread Koobas
, Jan 21, 2013 at 1:12 AM, Colin Wang colin.bin.wang.mah...@gmail.com wrote: Hi Koobas, I am trying on a dense matrix in Hadoop, a thousand by a thousand in size. How do HPC guys solve this problem? Any references? Thank you, Colin On Mon, Jan 21, 2013 at 11:49 AM, Koobas koo

Re: Any utility to solve the matrix inversion in Map/Reduce Way

2013-01-20 Thread Koobas
Colin, I am more of an HPC guy. I am a Mahout noob myself. Are we talking about a dense matrix? What size? On Sun, Jan 20, 2013 at 9:34 PM, Colin Wang colin.bin.wang.mah...@gmail.com wrote: Hi Koobas, I want the first one. Do you have any suggestions? Thank you, Colin On Fri, Jan 18

Re: Any utility to solve the matrix inversion in Map/Reduce Way

2013-01-17 Thread Koobas
Matrix inversion, as in explicitly computing the inverse, e.g. computing variance / covariance, or matrix inversion, as in solving a linear system of equations? On Thu, Jan 17, 2013 at 7:49 PM, Colin Wang colin.bin.wang.mah...@gmail.com wrote: Hi All, I want to solve the matrix inversion,
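The distinction in code, under assumed random data: solving the linear system directly versus forming the explicit inverse. Both are shown; the solve is normally preferred for a single right-hand side:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.random((n, n)) + n * np.eye(n)   # made diagonally dominant on purpose
b = rng.random(n)

x_solve = np.linalg.solve(M, b)   # solve the system (usually what you want)
x_inv   = np.linalg.inv(M) @ b    # explicit inverse (e.g. for covariances)
```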

Re: alternating least squares

2013-01-09 Thread Koobas
, Koobas koo...@gmail.com wrote: Okay, I got a little bit further in my understanding. The matrix of ratings R is replaced with the binary matrix P. Then R is used again in regularization. I get it. This takes care of the situations when you have user-item interactions, but you don't have

Re: alternating least squares

2013-01-09 Thread Koobas
notation? Because it seems to me that I have to go one row of X, (one column of Y) at a time. Is that really so, or am I missing something? On Wed, Jan 9, 2013 at 10:13 AM, Koobas koo...@gmail.com wrote: On Wed, Jan 9, 2013 at 12:40 AM, Sean Owen sro...@gmail.com wrote: I think the model you're

Re: alternating least squares

2013-01-08 Thread Koobas
storage and hit with Householder? The underlying question being the computational complexity, i.e. number of floating point operations involved. On Tue, Jan 8, 2013 at 4:03 PM, Sebastian Schelter s...@apache.org wrote: Hi Koobas, We have two classes that implement the solutions described

Re: alternating least squares

2013-01-08 Thread Koobas
On Tue, Jan 8, 2013 at 6:41 PM, Sean Owen sro...@gmail.com wrote: There's definitely a QR decomposition in there for me since solving A = X Y' for X is X = A Y (Y' * Y)^-1 and you need some means to compute the inverse of that (small) matrix. Sean, I think I got it. 1) A Y is a handful of
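Sean's formula X = A Y (Y'Y)^-1 can be sketched two equivalent ways in NumPy: solve against the small Gram matrix, or go through the QR of Y using Y'Y = R'R (the toy sizes and data here are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 6, 5, 3
A = rng.random((m, n))
Y = rng.random((n, k))

# X = A Y (Y'Y)^-1, computed by solving k x k systems instead of inverting:
G = Y.T @ Y                                  # small k x k Gram matrix
X = np.linalg.solve(G, (A @ Y).T).T

# Same thing through the QR route: Y = Q R implies Y'Y = R'R,
# so apply two triangular solves instead of forming G at all.
Q, R = np.linalg.qr(Y)
X_qr = np.linalg.solve(R, np.linalg.solve(R.T, (A @ Y).T)).T
```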

Re: alternating least squares

2013-01-08 Thread Koobas
On Tue, Jan 8, 2013 at 7:18 PM, Ted Dunning ted.dunn...@gmail.com wrote: But is it actually QR of Y? Ted, This is my understanding: In the process of solving the least squares problem, you end up inverting a small square matrix (Y' * Y)-1. How it is done is irrelevant. Since the matrix is

Re: alternating least squares

2013-01-08 Thread Koobas
On Tue, Jan 8, 2013 at 7:17 PM, Koobas koo...@gmail.com wrote: On Tue, Jan 8, 2013 at 6:41 PM, Sean Owen sro...@gmail.com wrote: There's definitely a QR decomposition in there for me since solving A = X Y' for X is X = A Y (Y' * Y)^-1 and you need some means to compute the inverse

Re: Remove unused recommenders?

2012-12-06 Thread Koobas
As a n00b, I am still revolving in the kNN space. Could you please point me to some details on ALS. Thanks! On Thu, Dec 6, 2012 at 10:14 AM, Sean Owen sro...@gmail.com wrote: The tree-based ones are very old and not fast, and were more of an experiment. I recall a few questions about them but

Re: Remove unused recommenders?

2012-12-06 Thread Koobas
it is not going away. I need a starting point to get up to speed. Thanks for clarifying. On Thu, Dec 6, 2012 at 3:18 PM, Koobas koo...@gmail.com wrote: As a n00b, I am still revolving in the kNN space. Could you please point me to some details on ALS. Thanks! On Thu, Dec 6, 2012 at 10

Re: Mahout Amazon EMR usage cost

2012-12-05 Thread Koobas
I am very happy to see that I started a lively thread. I am a newcomer to the field, so this is all very useful. Now yet another naive question. Ted is probably going to go ballistic ;) Assuming that simple overlap methods suck, is there still a metric that works better than others (i.e. Tanimoto

Re: Mahout Amazon EMR usage cost

2012-12-05 Thread Koobas
On Wed, Dec 5, 2012 at 7:03 PM, Ted Dunning ted.dunn...@gmail.com wrote: On Wed, Dec 5, 2012 at 5:29 PM, Koobas koo...@gmail.com wrote: ... Now yet another naive question. Ted is probably going to go ballistic ;) I hope not. Assuming that simple overlap methods suck