RE: Getting rating for all the files

Martin, Nick Mon, 30 Sep 2013 16:14:53 -0700

Hi all, 

I have the same question as Deepak does below: where can I find the 
user-based recommender in the Mahout command line?

I don't see it listed in the valid program names:

Valid program names are:
  arff.vector: : Generate Vectors from an ARFF file or directory
  baumwelch: : Baum-Welch algorithm for unsupervised HMM training
  canopy: : Canopy clustering
  cat: : Print a file or resource as the logistic regression models would see it
  cleansvd: : Cleanup and verification of SVD output
  clusterdump: : Dump cluster output to text
  clusterpp: : Groups Clustering Output In Clusters
  cmdump: : Dump confusion matrix in HTML or text formats
  cvb: : LDA via Collapsed Variation Bayes (0th deriv. approx)
  cvb0_local: : LDA via Collapsed Variation Bayes, in memory locally.
  dirichlet: : Dirichlet Clustering
  eigencuts: : Eigencuts spectral clustering
  evaluateFactorization: : compute RMSE and MAE of a rating matrix 
factorization against probes
  fkmeans: : Fuzzy K-means clustering
  fpg: : Frequent Pattern Growth
  hmmpredict: : Generate random sequence of observations by given HMM
  itemsimilarity: : Compute the item-item-similarities for item-based 
collaborative filtering
  kmeans: : K-means clustering
  lucene.vector: : Generate Vectors from a Lucene index
  matrixdump: : Dump matrix in CSV format
  matrixmult: : Take the product of two matrices
  meanshift: : Mean Shift clustering
  minhash: : Run Minhash clustering
  parallelALS: : ALS-WR factorization of a rating matrix
  recommendfactorized: : Compute recommendations using the factorization of a 
rating matrix
  recommenditembased: : Compute recommendations using item-based collaborative 
filtering
  regexconverter: : Convert text files on a per line basis based on regular 
expressions
  rowid: : Map SequenceFile<Text,VectorWritable> to 
{SequenceFile<IntWritable,VectorWritable>, SequenceFile<IntWritable,Text>}
  rowsimilarity: : Compute the pairwise similarities of the rows of a matrix
  runAdaptiveLogistic: : Score new production data using a probably trained and 
validated AdaptivelogisticRegression model
  runlogistic: : Run a logistic regression model against CSV data
  seq2encoded: : Encoded Sparse Vector generation from Text sequence files
  seq2sparse: : Sparse Vector generation from Text sequence files
  seqdirectory: : Generate sequence files (of Text) from a directory
  seqdumper: : Generic Sequence File dumper
  seqmailarchives: : Creates SequenceFile from a directory containing gzipped 
mail archives
  seqwiki: : Wikipedia xml dump to sequence file
  spectralkmeans: : Spectral k-means clustering
  split: : Split Input data into test and train sets
  splitDataset: : split a rating dataset into training and probe parts
  ssvd: : Stochastic SVD
  svd: : Lanczos Singular Value Decomposition
  testnb: : Test the Vector-based Bayes classifier
  trainAdaptiveLogistic: : Train an AdaptivelogisticRegression model
  trainlogistic: : Train a logistic regression using stochastic gradient descent
  trainnb: : Train the Vector-based Bayes classifier
  transpose: : Take the transpose of a matrix
  validateAdaptiveLogistic: : Validate an AdaptivelogisticRegression model 
against hold-out data set
  vecdist: : Compute the distances between a set of Vectors (or Cluster or 
Canopy, they must fit in memory) and a list of Vectors
  vectordump: : Dump vectors from a sequence file to text
  viterbi: : Viterbi decoding of hidden states from given output states sequence

-----Original Message-----
From: Deepak Subhramanian [mailto:deepak.subhraman...@gmail.com] 
Sent: Sunday, September 29, 2013 4:06 PM
To: user@mahout.apache.org
Subject: Re: Getting rating for all the files

I tried writing a UserRecommendation program in Java, but it gives me fewer 
results than the ItemBasedRecommendation. Does anyone else have any thoughts 
on my previous question?
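
One possible reason a user-based approach returns fewer results can be
sketched in plain Java. This is a minimal illustration under stated
assumptions, not Mahout's implementation: the class UserBasedSketch, its
methods, and the sample ratings are all hypothetical. A rating is estimated
as a similarity-weighted average over other users who rated the item, so a
user with no overlapping ratings gets no estimate at all.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, minimal user-based scorer. Not Mahout's implementation. */
final class UserBasedSketch {

    /** Pearson correlation over the items both users rated; NaN if undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        double sumA = 0, sumB = 0, sumAA = 0, sumBB = 0, sumAB = 0;
        int n = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other == null) {
                continue; // item not rated by both users
            }
            double x = e.getValue();
            double y = other;
            sumA += x;
            sumB += y;
            sumAA += x * x;
            sumBB += y * y;
            sumAB += x * y;
            n++;
        }
        if (n < 2) {
            return Double.NaN; // too little overlap to correlate
        }
        double cov = sumAB - sumA * sumB / n;
        double varA = sumAA - sumA * sumA / n;
        double varB = sumBB - sumB * sumB / n;
        if (varA <= 0 || varB <= 0) {
            return Double.NaN; // constant ratings, correlation undefined
        }
        return cov / Math.sqrt(varA * varB);
    }

    /** Similarity-weighted average of neighbours' ratings; NaN if nobody qualifies. */
    static double estimate(Map<Long, Map<Long, Double>> prefs, long user, long item) {
        Map<Long, Double> mine = prefs.get(user);
        if (mine == null) {
            return Double.NaN;
        }
        double weighted = 0, total = 0;
        for (Map.Entry<Long, Map<Long, Double>> e : prefs.entrySet()) {
            if (e.getKey() == user) {
                continue;
            }
            Double rating = e.getValue().get(item);
            if (rating == null) {
                continue; // this neighbour never rated the item
            }
            double sim = pearson(mine, e.getValue());
            if (Double.isNaN(sim) || sim <= 0) {
                continue; // only positively correlated neighbours count here
            }
            weighted += sim * rating;
            total += sim;
        }
        return total == 0 ? Double.NaN : weighted / total;
    }

    /** Made-up ratings: users 1 and 2 overlap, user 3 shares no items with them. */
    static Map<Long, Map<Long, Double>> sampleData() {
        Map<Long, Map<Long, Double>> prefs = new HashMap<>();
        put(prefs, 1L, 10L, 5.0);
        put(prefs, 1L, 11L, 3.0);
        put(prefs, 1L, 12L, 4.0);
        put(prefs, 2L, 10L, 4.0);
        put(prefs, 2L, 11L, 2.0);
        put(prefs, 2L, 12L, 3.0);
        put(prefs, 2L, 13L, 5.0);
        put(prefs, 3L, 20L, 2.0);
        put(prefs, 3L, 21L, 4.0);
        return prefs;
    }

    private static void put(Map<Long, Map<Long, Double>> prefs,
                            long user, long item, double rating) {
        prefs.computeIfAbsent(user, k -> new HashMap<>()).put(item, rating);
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> prefs = sampleData();
        System.out.println(estimate(prefs, 1L, 13L)); // 5.0
        System.out.println(estimate(prefs, 3L, 10L)); // NaN: no overlap with anyone
    }
}
```

In this sketch user 3 never receives an estimate, which is the kind of gap
that would show up as missing users or items in the output.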

On Sun, Sep 29, 2013 at 7:24 PM, Deepak Subhramanian < 
deepak.subhraman...@gmail.com> wrote:

> Thanks Nick. I am planning to try user-based recommendation since there 
> are only a small number of users. I don't see a recommenduserbased option 
> in the command-line utility for Mahout. Does that mean I have to write 
> a Java program to use the UserBasedRecommender?
>
>
> On Sun, Sep 29, 2013 at 7:22 PM, Martin, Nick <nimar...@pssd.com> wrote:
>
>> I'll need to defer to one of the other math whizzes on the potential 
>> reasons for recommendations for certain users not appearing. My 
>> suspicion is that you either don't have sufficient co-occurrence 
>> of specific users/items to support a recommendation, or you may need 
>> to experiment with a different similarity measure.
>>
>> Anyone else want to weigh in?
>>
>>
>>
>> Sent from my iPhone
>>
>> On Sep 29, 2013, at 1:14 PM, "Deepak Subhramanian" < 
>> deepak.subhraman...@gmail.com> wrote:
>>
>> > Sorry, my mistake. I am getting the lower ratings for some of the
>> > users and items, but my issue is not solved: I am not getting
>> > ratings for some of the users and some of the items.
>> >
>> > My usersFile has 8000 users and my itemsFile has 4000 items, but I
>> > get recommendations for only 5000 users and 1500 items, and the
>> > maximum number of recommendations given is 258. What could be the
>> > reason that there are no item recommendations for 3000 users and
>> > 2500 items? Is it because no similarities exist between those users
>> > and items?
>> >
>> >
>> > On Sun, Sep 29, 2013 at 4:46 PM, Deepak Subhramanian < 
>> > deepak.subhraman...@gmail.com> wrote:
>> >
>> >> Thanks Nick. As I mentioned earlier, I am getting ratings only for
>> >> the top recommended products instead of ratings for all 4000
>> >> products. I am setting the numRecommendations parameter to 4000 and
>> >> maxPrefsPerUser to 4000. Should it give 4000 items in the list for
>> >> each user? For some reason the output for items with lower ratings
>> >> is not displayed. I see the default limit is 10.
>> >>
>> >> I am not sure whether I am not getting ratings for 4000 items
>> >> because I am passing the wrong options for this Mahout version, or
>> >> whether it is an issue with Mahout version 0.7. I am using 0.7
>> >> (mahout-examples-0.7-cdh4.3.1.jar).
>> >>
>> >> I see the parameter name changed in the latest version I checked out
>> >> from git (0.9-SNAPSHOT):
>> >>
>> >> maxPrefsPerUserConsidered =
>> >>     jobConf.getInt(MAX_PREFS_PER_USER_CONSIDERED,
>> >>         DEFAULT_MAX_PREFS_PER_USER_CONSIDERED);
>> >>
>> >> Will using the latest version help?
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Sun, Sep 29, 2013 at 12:29 PM, Martin, Nick <nimar...@pssd.com>
>> wrote:
>> >>
>> >>> There should be a score after each recommended item (e.g.
>> >>> 123456:2.6) in your output. Lower scores would be the ones you're
>> >>> interested in.
>> >>>
>> >>> Sent from my iPhone
>> >>>
>> >>> On Sep 28, 2013, at 8:25 AM, "Deepak Subhramanian" < 
>> >>> deepak.subhraman...@gmail.com> wrote:
>> >>>
>> >>>> Hi
>> >>>>
>> >>>> I am trying to predict the ratings for some items for some users
>> >>>> using item-based collaborative filtering. I tried mahout
>> >>>> recommenditembased, but it shows only the top 10 items, or more if
>> >>>> I pass the --numRecommendations parameter. It doesn't show items
>> >>>> that have a lower predicted rating. What is the best approach to
>> >>>> get ratings for items that have a low predicted rating?
>> >>>>
>> >>>>
>> >>>> I tried this command.
>> >>>>
>> >>>> mahout recommenditembased --input mahoutrecoinput --usersFile 
>> >>>> recouserlist  --itemsFile  recoitemlist --output 
>> >>>> /mahoutrecooutputpearsonnew -s SIMILARITY_PEARSON_CORRELATION 
>> >>>> --numRecommendations 4000  --maxPrefsPerUser 4000
>> >>>>
>> >>>> Also I tried using the estimatePreference method on the recommender.
>> >>>>
>> >>>> Please help .
>> >>
>> >>
>> >>
>> >> --
>> >> Deepak Subhramanian
>> >
>> >
>> >
>> > --
>> > Deepak Subhramanian
>>
>
>
>
> --
> Deepak Subhramanian
>

--
Deepak Subhramanian
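
To check how many users and items actually appear in the job output, the
item:score pairs mentioned in the thread can be parsed and counted. The
sketch below assumes each output line has the form
userID<TAB>[itemID:score,itemID:score,...]; that layout, the class name
RecoOutputParser, and the sample lines are assumptions for illustration and
should be checked against the real output files.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for lines shaped like "userID<TAB>[item:score,...]". */
final class RecoOutputParser {

    /** Returns the user ID that precedes the tab. */
    static long parseUser(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator: " + line);
        }
        return Long.parseLong(line.substring(0, tab).trim());
    }

    /** Returns itemID -> score in the order the pairs appear on the line. */
    static Map<Long, Double> parseItems(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("no tab separator: " + line);
        }
        String list = line.substring(tab + 1).trim();
        if (list.startsWith("[") && list.endsWith("]")) {
            list = list.substring(1, list.length() - 1);
        }
        Map<Long, Double> scores = new LinkedHashMap<>();
        if (list.isEmpty()) {
            return scores; // user listed with no recommendations
        }
        for (String pair : list.split(",")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("no item:score pair: " + pair);
            }
            long item = Long.parseLong(pair.substring(0, colon).trim());
            double score = Double.parseDouble(pair.substring(colon + 1).trim());
            scores.put(item, score);
        }
        return scores;
    }

    public static void main(String[] args) {
        // Made-up sample lines, not real job output.
        String[] lines = {"1\t[101:4.5,102:3.0]", "2\t[101:2.6]"};
        Set<Long> users = new HashSet<>();
        Set<Long> items = new HashSet<>();
        for (String line : lines) {
            users.add(parseUser(line));
            items.addAll(parseItems(line).keySet());
        }
        System.out.println(users.size() + " users, " + items.size() + " items");
    }
}
```

Comparing these counts with the sizes of the users and items files shows
which IDs never received a recommendation.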
