date:20110422

Re: Recommend output: User vs. Item, Tanimoto vs. LogLikelihood

2011-04-22 Thread Lance Norskog

The "abstract information structure" encoded in the item-item graph is completely different from the user-user graph. Also, there are different User-based and Item-based approaches. Comparing recommendations is hard. It is not really possible to make an absolute or even fuzzy ranking of "what shoul

Recommend output: User vs. Item, Tanimoto vs. LogLikelihood

2011-04-22 Thread Otis Gospodnetic

Hi, Given the same input data, should the same list of recommended items be returned regardless of whether one uses Item-based or User-based recommendations? I always thought the answer was yes (same "matrix" just flipped differently is how I imagined it), but I recently saw output of some M
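One way to see why the two approaches need not agree is a toy experiment. The sketch below is not Mahout code: the data set, the function names, and the "sum of similarities" scoring are all assumptions made for illustration, and Mahout's own estimators differ in detail. It only shows that user-user and item-item Tanimoto similarities computed from the same boolean preference data can rank candidate items differently.

```python
# Hypothetical boolean preference data: user -> set of items.
PREFS = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "x"},
    "u3": {"a", "y"},
    "u4": {"b", "y"},
    "u5": {"a", "y"},
    "u6": {"y", "w"},
    "u7": {"y", "w"},
}


def tanimoto(x, y):
    """Tanimoto (Jaccard) coefficient of two sets."""
    union = len(x | y)
    return len(x & y) / union if union else 0.0


def invert(prefs):
    """Flip user -> items into item -> users (the 'flipped matrix')."""
    by_item = {}
    for user, items in prefs.items():
        for item in items:
            by_item.setdefault(item, set()).add(user)
    return by_item


def recommend_user_based(prefs, user):
    """Score each unseen item by summing the similarity of users who have it."""
    scores = {}
    for other, items in prefs.items():
        if other == user:
            continue
        sim = tanimoto(prefs[user], items)
        for item in items - prefs[user]:
            scores[item] = scores.get(item, 0.0) + sim
    return scores


def recommend_item_based(prefs, user):
    """Score each unseen item by summing its similarity to the user's items."""
    by_item = invert(prefs)
    scores = {}
    for candidate, users in by_item.items():
        if candidate in prefs[user]:
            continue
        scores[candidate] = sum(
            tanimoto(users, by_item[own]) for own in prefs[user]
        )
    return scores


if __name__ == "__main__":
    print("user-based:", recommend_user_based(PREFS, "u1"))
    print("item-based:", recommend_item_based(PREFS, "u1"))
```

On this toy data the user-based scores put "y" ahead of "x" for u1, while the item-based scores put "x" ahead of "y": item "y" is popular among users who share nothing with u1, which dilutes its item-item similarity but does not affect the user-based score.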

Re: kmeans on space-delimited input data,

2011-04-22 Thread Vincent Xue

Hello vs, I am also a beginner Mahout user, and I think the problem may be with your initial step of converting the txt matrix to a sequence file. I had a similar task: converting a tab-delimited matrix into a sequence file for SVD computations. What I did was to write some custom Java code
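The parsing half of such a conversion can be sketched independently of Hadoop. The code below is an illustration only, not the custom Java code mentioned above: the function names are hypothetical, and it stops at producing (row index, row values) pairs, which is the key/value shape one would then hand to a sequence-file writer.

```python
def parse_matrix(text, delimiter=None):
    """Parse delimited text into a list of (row_index, [float, ...]) pairs.

    delimiter=None splits on any run of whitespace, so the same function
    handles space-delimited and tab-delimited input.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        values = [float(token) for token in line.split(delimiter)]
        rows.append((len(rows), values))
    widths = {len(values) for _, values in rows}
    if len(widths) > 1:
        raise ValueError(f"ragged matrix: row widths {sorted(widths)}")
    return rows


def is_symmetric(rows, tol=1e-9):
    """Check that the parsed rows form a square, symmetric matrix."""
    n = len(rows)
    if any(len(values) != n for _, values in rows):
        return False
    return all(
        abs(rows[i][1][j] - rows[j][1][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


if __name__ == "__main__":
    sample = "1.0 2.0 3.0\n2.0 5.0 6.0\n3.0 6.0 9.0\n"
    parsed = parse_matrix(sample)
    print(len(parsed), "rows, symmetric:", is_symmetric(parsed))
```

Validating row widths and symmetry before writing anything out makes a malformed input file fail early, rather than surfacing later as an obscure error inside the clustering or SVD job.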

kmeans on space-delimited input data,

2011-04-22 Thread vs

Mahout Users, I have seen posts attempting to answer the problem I have at hand, but I would like to seek some comments from those who have been successful in resolving this issue. (1) Input data: A space-delimited symmetric matrix of 500x500 double values. The entire matrix is in one-single fil

Re: Does the Feature Hashing and Collision in the SGD will harm the performance of the algorithm?

2011-04-22 Thread Ted Dunning

Yes. But how do we specify the input? And how do we specify the encodings? This is what has always held me back in the past. Should we just allow classes to be specified on the command line? On Fri, Apr 22, 2011 at 8:47 AM, Dmitriy Lyubimov wrote: > Maybe there's a space for Mr based input c

Re: Does the Feature Hashing and Collision in the SGD will harm the performance of the algorithm?

2011-04-22 Thread Ted Dunning

On Fri, Apr 22, 2011 at 6:39 AM, Stanley Xu wrote:
> One more question, I am also trying to test the MixedGradient, it looks
> like the RankingGradient will take much more time than the DefaultGradient.
>
This is probably due to memory use. You need to review which way you group users.
>
> If

Re: Anyway to speedup the category feature parsing and encoding in the SGD algorithm?

2011-04-22 Thread Ted Dunning

Look at VectorWritable
On Fri, Apr 22, 2011 at 6:57 AM, Stanley Xu wrote:
> Hi Ted,
>
> Which class do you mean for the sparse vector as Writable?
>
> I checked the code that neither the RandomAccessSparseVector nor
> SequentialAccessSparseVector implemented the Writable interface.
>
> Thanks.
>

Re: Does the Feature Hashing and Collision in the SGD will harm the performance of the algorithm?

2011-04-22 Thread Dmitriy Lyubimov

Maybe there is indeed room for an MR-based input conversion job as a command-line routine? I was thinking along the same lines. Maybe even along with standardization of the values. Some formal definition of the inputs being fed to it. Apologies for brevity. Sent from my Android. -Dmitriy On Apr 21,

Re: Anyway to speedup the category feature parsing and encoding in the SGD algorithm?

2011-04-22 Thread Stanley Xu

Hi Ted, Which class do you mean for the sparse vector as Writable? I checked the code, and neither RandomAccessSparseVector nor SequentialAccessSparseVector implements the Writable interface. Thanks. On Fri, Apr 22, 2011 at 12:49 PM, Ted Dunning wrote: > The binary format is already defin

Re: Does the Feature Hashing and Collision in the SGD will harm the performance of the algorithm?

2011-04-22 Thread Stanley Xu

Got it. Thanks so much, Ted. One more question: I am also trying to test the MixedGradient, and it looks like the RankingGradient takes much more time than the DefaultGradient. If I set the alpha to 0.5, it takes 50 times as long as the DefaultGradient; I thought it should be like t
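A back-of-the-envelope check of the reported slowdown, under a loudly stated assumption: that a mixed gradient picks the ranking gradient with probability alpha and the default gradient otherwise, so the expected per-example cost is a weighted average of the two. The function names and the cost model are hypothetical and are not taken from the Mahout source.

```python
def expected_cost(alpha, cost_default, cost_ranking):
    """Expected per-example cost if the ranking gradient is chosen with
    probability alpha and the default gradient otherwise (assumed model)."""
    return alpha * cost_ranking + (1.0 - alpha) * cost_default


def implied_ranking_cost(alpha, slowdown, cost_default=1.0):
    """Solve expected_cost(...) == slowdown * cost_default for cost_ranking."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    return (slowdown - (1.0 - alpha)) * cost_default / alpha


if __name__ == "__main__":
    # A 50x overall slowdown at alpha = 0.5, as reported in the message.
    print(implied_ranking_cost(alpha=0.5, slowdown=50.0))
```

Under this model, a 50x slowdown at alpha = 0.5 would mean a single ranking-gradient step costs on the order of 100 default-gradient steps, so the overall slowdown is driven by the cost of the ranking step rather than by alpha alone.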


10 matches
