> On 2011-12-13 13:24:28, Ted Dunning wrote: > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, > > lines 36-41 > > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line36> > > > > Direct and exact quotes from the paper should be either avoided or > > acknowledged. Better here to rephrase the language.
Rephrased the language at revision 5. > On 2011-12-13 13:24:28, Ted Dunning wrote: > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, > > lines 60-63 > > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line60> > > > > Again, just quoting the paper is not a good idea. This isn't adding > > any information in any case since the exact same language was used in the > > class level java doc. > > > > It would be nice here to note that the average is an *unweighted* > > average. Rephrased the language at revision 5. > On 2011-12-13 13:24:28, Ted Dunning wrote: > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, > > lines 87-88 > > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line87> > > > > This looks like a bad key to use here. This key should be the average of log-likelihood of the best OnlineLogisticRegression in AdaptiveLogisticRegression. > On 2011-12-13 13:24:28, Ted Dunning wrote: > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, > > line 40 > > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line40> > > > > I don't think that this is correct. Is this really what the output is? > > Why are you dividing by a weight vector? How do you compute this score? > > > > Or do you mean to not divide here? > > > > If so, why do you use a score as the key? The way to explain it may be bad, but it means the Map output key is score and Map output value is new weight vector. I rewrote the comment at revision 5. > On 2011-12-13 13:24:28, Ted Dunning wrote: > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java, > > lines 34-35 > > <https://reviews.apache.org/r/3072/diff/4/?file=64285#file64285line34> > > > > I don't think that this is correct. In the google paper, the average > > was unweighted. In any case how do you compute this score for weighting? > > > > Also, if the key is the score, how does the reducer work since each > > reduce function will only see one score? Are you assuming that there is > > exactly one reducer? The original paper(http://aclweb.org/anthology-new/N/N10/N10-1069.pdf) says it is a weighted average, but my simple experiment showed that the unweighted average was better than the weighted average. So I rewrote the code as the unweighted average at revision 5. The number of reducers should be set to one. I added the comment accordingly at revision 5. The number of reducers is set at runIteration function at Driver class. - issei ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/3072/#review3875 ----------------------------------------------------------- On 2011-12-14 08:59:29, issei yoshida wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/3072/ > ----------------------------------------------------------- > > (Updated 2011-12-14 08:59:29) > > > Review request for mahout. > > > Summary > ------- > > MAHOUT-918 Parallelized SGD in MapReduce > > > This addresses bug MAHOUT-918. > https://issues.apache.org/jira/browse/MAHOUT-918 > > > Diffs > ----- > > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java > 1214116 > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java > PRE-CREATION > > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java > PRE-CREATION > > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java > PRE-CREATION > > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java > PRE-CREATION > > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java > PRE-CREATION > > trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/3072/diff > > > Testing > ------- > > > Thanks, > > issei > >