2010/1/18 Ted Dunning <ted.dunn...@gmail.com>:
> These bounds were too tight in any case. I had to loosen other bounds
> during development and should have loosened these as well.
>
> Your change is a good one.
Great! So here is the follow-up: I have written a real training convergence test and identified and fixed two bugs in the classification and training code:

http://github.com/ogrisel/mahout/commit/4f9bdbf2ba0642be47d5e4b0e9da116a12bebc9e

I plan to go on adding more tests (e.g. a test that the L1 prior leads to sparse parameters while the other priors don't). Also, I think your default values for lambda (10 or 1.0) are much too strong for such a low-dimensional test and lead to all-zero parameters with an L1 prior.

I also plan to write an example based on the Wikipedia country data (adapted from the Bayesian classifier example). I plan to use Hadoop to build the vectorized training sets with different feature counts in parallel, and to use Hadoop to train several versions of the regressor in parallel with a grid search on the hyperparameters (lambda, the prior and the learning rate).

Tell me if you plan to work on this this week so that we can synchronize our efforts. It would be great if you branched my git branch for this if so. Manual patch handling is tedious :)

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name
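P.S. A minimal sketch (plain Python, not Mahout code, with made-up toy data) of the sparsity test idea mentioned above: SGD logistic regression where the L1 penalty is applied by per-step soft-thresholding, so uninformative weights get driven to exactly zero, while an L2 penalty only shrinks them.

```python
# Toy sketch, NOT Mahout's implementation: compare L1 (soft-thresholding)
# and L2 (weight decay) penalties on an SGD-trained logistic regression.
import math
import random

random.seed(0)

def make_data(n=500, d=50):
    """Synthetic data: only features 0 and 1 carry signal, the rest are noise."""
    data = []
    for _ in range(n):
        x = [random.gauss(0.0, 1.0) for _ in range(d)]
        p = 1.0 / (1.0 + math.exp(-(2.0 * x[0] - 2.0 * x[1])))
        data.append((x, 1 if random.random() < p else 0))
    return data

def train(data, penalty, lam=0.1, eta=0.1, epochs=20):
    d = len(data[0][0])
    w = [0.0] * d
    for _ in range(epochs):
        for x, y in data:
            z = sum(wj * xj for wj, xj in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss w.r.t. the logit
            for j in range(d):
                w[j] -= eta * g * x[j]
                if penalty == "l2":
                    w[j] -= eta * lam * w[j]  # decay: shrinks, never zeroes
                else:
                    # L1 via soft-thresholding: small weights snap to 0.0
                    s = eta * lam
                    w[j] = max(w[j] - s, 0.0) if w[j] > 0 else min(w[j] + s, 0.0)
    return w

data = make_data()
w_l1 = train(data, "l1")
w_l2 = train(data, "l2")
n_zero = lambda w: sum(1 for wj in w if wj == 0.0)
print("exact zeros  L1:", n_zero(w_l1), " L2:", n_zero(w_l2))
```

A test along these lines would simply assert that the L1-trained model has strictly more exactly-zero weights than the L2-trained one, and it also shows why a large lambda (10 or 1.0) on low-dimensional data zeroes everything: the per-step shrinkage then dominates the gradient signal.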