Re: Tuning parameters for ALS-WR

2013-09-11 Thread Sean Owen
On Wed, Sep 11, 2013 at 12:22 AM, Parimi Rohit rohit.par...@gmail.com wrote:

 1. Do we have to follow this setting, to compare algorithms? Can't we
 report the parameter combination for which we get the highest mean average
 precision for the test data, when trained on the train set, without any
 validation set?


As Ted alludes to, this would overfit. Think of it as two learning
processes: you learn model hyper-parameters like lambda, and you learn
model parameters like your matrix decomposition. So there must be two
levels of hold-out.
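
To make the two levels concrete, here is a minimal Python sketch (illustration
only, not Mahout code): train(params, data) and score(model, data) are
hypothetical placeholders standing in for ALS training and whatever metric you
evaluate with.

import random

def tune_and_evaluate(interactions, candidate_params, train, score, seed=42):
    rng = random.Random(seed)
    data = list(interactions)
    rng.shuffle(data)

    # Outer hold-out: the test set is never consulted while choosing
    # hyper-parameters.
    n = len(data)
    test, rest = data[:n // 4], data[n // 4:]

    # Inner hold-out: the validation set is used only to pick hyper-parameters.
    m = len(rest)
    validation, train_split = rest[:m // 3], rest[m // 3:]

    # Level 1: learn hyper-parameters by scoring on the validation set.
    best_params = max(candidate_params,
                      key=lambda p: score(train(p, train_split), validation))

    # Level 2: learn parameters with the chosen hyper-parameters, then report
    # a single number on the untouched test set.
    final_model = train(best_params, train_split)
    return best_params, score(final_model, test)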


 2. Do we have to tune the similarityclass parameter in item-based CF? If
 so, do we compare the mean average precision values based on validation
 data, and then report the same for the test set?


Yes, you are conceptually searching over the entire hyper-parameter space. If
the similarity metric is one of those, you are trying different metrics.
Grid search, just brute-force trying combinations, works for 1-2
hyper-parameters. Otherwise I'd try randomly choosing parameters, really,
or else it will take way too long to explore. You try to pick
hyper-parameters 'nearer' to those that have yielded better scores.
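
Something like this random search, for instance (a Python sketch, illustration
only; evaluate() is a hypothetical function that trains ALS-WR on the training
set with the given hyper-parameters and returns mean average precision on the
validation set, and the ranges are just examples):

import random

def random_search(evaluate, num_trials=50, seed=1):
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(num_trials):
        params = {
            "lambda_": 10 ** rng.uniform(-3, 1),   # regularization, 0.001 to 10
            "alpha": 10 ** rng.uniform(0, 2),      # confidence weight, 1 to 100
            "num_features": rng.choice([10, 20, 50, 100, 200]),
        }
        score = evaluate(**params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score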


Re: Tuning parameters for ALS-WR

2013-09-11 Thread Ted Dunning
On Wed, Sep 11, 2013 at 12:07 AM, Sean Owen sro...@gmail.com wrote:

  2. Do we have to tune the similarityclass parameter in item-based CF?
 If
  so, do we compare the mean average precision values based on validation
  data, and then report the same for the test set?
 
 
 Yes, you are conceptually searching over the entire hyper-parameter space. If
 the similarity metric is one of those, you are trying different metrics.
 Grid search, just brute-force trying combinations, works for 1-2
 hyper-parameters. Otherwise I'd try randomly choosing parameters, really,
 or else it will take way too long to explore. You try to pick
 hyper-parameters 'nearer' to those that have yielded better scores.


Or use a real exploration algorithm.  For my favorite (hear that horn
blowing?) see this article on recorded-step meta-mutation:
http://arxiv.org/abs/0803.3838
The idea is a randomized search, but with something akin to momentum.  This
lets you search nasty landscapes with pretty good robustness and
smooth ones with fast convergence.  The code and theory are simple and
there is an implementation in Mahout.
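
As a very rough Python sketch of the idea only (not the Mahout implementation,
and simplified relative to the paper): each accepted candidate remembers the
step that produced it, and the next mutation is drawn around that step, so the
otherwise random search gains momentum in directions that have been paying off.

import random

def recorded_step_search(objective, start, iterations=200, seed=0):
    rng = random.Random(seed)
    x = list(start)            # current best point
    step = [0.1] * len(x)      # last accepted step (the "recorded step")
    best = objective(x)
    for _ in range(iterations):
        # Propose a step centred on the previously successful step, with
        # enough spread that the search can still change direction.
        new_step = [rng.gauss(s, abs(s) + 0.05) for s in step]
        candidate = [xi + si for xi, si in zip(x, new_step)]
        value = objective(candidate)
        if value > best:       # keep improvements and record the step taken
            x, step, best = candidate, new_step, value
    return x, best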


Re: Tuning parameters for ALS-WR

2013-09-10 Thread Ted Dunning
You definitely need to separate into three sets.

Another way to put it is that with cross validation, any learning algorithm
needs to have test data withheld from it.  The remaining data is training
data to be used by the learning algorithm.

Some training algorithms, such as the one that you describe, divide their
training data into portions so that they can learn hyper-parameters
separately from parameters.  Whether the learning algorithm does this or
uses some other technique to come to a final value for the model has no
bearing on whether the original test data is withheld, and because the test
data has to be unconditionally withheld, any sub-division of the training
data cannot include any of the test data.

In your case, you hold back 25% as test data.  Then you divide the remaining
75% into 25% validation and 50% training (as fractions of the original data).
The validation set has to be separate from the 50% in order to avoid
over-fitting, and the test data has to be separate from the training+validation
for the same reason.
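
A minimal Python sketch of that per-user 50/25/25 split (illustration only;
clicks_by_user is a hypothetical dict mapping each user id to the list of
items that user clicked):

import random

def split_per_user(clicks_by_user, seed=7):
    rng = random.Random(seed)
    train, validation, test = {}, {}, {}
    for user, items in clicks_by_user.items():
        items = list(items)
        rng.shuffle(items)
        n = len(items)
        # 25% test, held out first and never used while tuning.
        test[user] = items[:n // 4]
        # Of the remainder, 25% of the original goes to validation ...
        validation[user] = items[n // 4:n // 2]
        # ... and the last 50% goes to training.
        train[user] = items[n // 2:]
    return train, validation, test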

On Tue, Sep 10, 2013 at 4:22 PM, Parimi Rohit rohit.par...@gmail.com wrote:

 Hi All,

 I was wondering if there is any experimental design to tune the parameters
 of the ALS algorithm in Mahout, so that we can compare its recommendations with
 recommendations from another algorithm.

 My datasets have implicit data, and I would like to use the following design
 for tuning the ALS parameters (alpha, lambda, numFeatures).

 1. Split the data such that, for each user, 50% of the clicks go to train,
 25% go to validation, and 25% go to test.

 2. Create the user and item features by applying the ALS algorithm to the
 training data, and test on the validation set. (We can pick the parameters
 which minimize the RMSE score; in the case of implicit data, between Pui and XY’.)
 3. Once we find the parameters which give the best RMSE value on
 validation, use the user and item matrices generated for those parameters
 to predict the top-k items and test them against the items in the test set
 (compute mean average precision).

 Although the above setting looks good, I have a few questions:

 1. Do we have to follow this setting, to compare algorithms? Can't we
 report the parameter combination for which we get the highest mean average
 precision for the test data, when trained on the train set, without any
 validation set?
 2. Do we have to tune the similarityclass parameter in item-based CF? If
 so, do we compare the mean average precision values based on validation
 data, and then report the same for the test set?

 My ultimate objective is to compare different algorithms, but I am confused
 as to how to compare the best results (based on parameter tuning) between
 algorithms. Are there any publications that explain this in detail? Any
 help/comments about the design of experiments are much appreciated.

 Thanks,
 Rohit