> On 2011-12-13 13:24:28, Ted Dunning wrote:
> > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java,
> >  lines 36-41
> > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line36>
> >
> >     Direct and exact quotes from the paper should be either avoided or 
> > acknowledged.  Better here to rephrase the language.

Rephrased the language at revision 5.


> On 2011-12-13 13:24:28, Ted Dunning wrote:
> > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java,
> >  lines 60-63
> > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line60>
> >
> >     Again, just quoting the paper is not a good idea.  This isn't adding 
> > any information in any case since the exact same language was used in the 
> > class level java doc.
> >     
> >     It would be nice here to note that the average is an *unweighted* 
> > average.

Rephrased the language at revision 5.


> On 2011-12-13 13:24:28, Ted Dunning wrote:
> > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java,
> >  lines 87-88
> > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line87>
> >
> >     This looks like a bad key to use here.

This key should be the average of log-likelihood of the best 
OnlineLogisticRegression in AdaptiveLogisticRegression.


> On 2011-12-13 13:24:28, Ted Dunning wrote:
> > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java,
> >  line 40
> > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line40>
> >
> >     I don't think that this is correct.  Is this really what the output is? 
> >  Why are you dividing by a weight vector?  How do you compute this score?
> >     
> >     Or do you mean to not divide here?
> >     
> >     If so, why do you use a score as the key?

The way to explain it may be bad, but it means the Map output key is score and 
Map output value is new weight vector.
I rewrote the comment at revision 5.


> On 2011-12-13 13:24:28, Ted Dunning wrote:
> > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java,
> >  lines 34-35
> > <https://reviews.apache.org/r/3072/diff/4/?file=64285#file64285line34>
> >
> >     I don't think that this is correct.  In the google paper, the average 
> > was unweighted.  In any case how do you compute this score for weighting?
> >     
> >     Also, if the key is the score, how does the reducer work since each 
> > reduce function will only see one score?  Are you assuming that there is 
> > exactly one reducer?

The original paper(http://aclweb.org/anthology-new/N/N10/N10-1069.pdf) says it 
is a weighted average,
but my simple experiment showed that the unweighted average was better than the 
weighted average.
So I rewrote the code as the unweighted average at revision 5.

The number of reducers should be set to one. I added the comment accordingly at 
revision 5.
The number of reducers is set at runIteration function at Driver class.


- issei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3875
-----------------------------------------------------------


On 2011-12-14 08:59:29, issei yoshida wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/3072/
> -----------------------------------------------------------
> 
> (Updated 2011-12-14 08:59:29)
> 
> 
> Review request for mahout.
> 
> 
> Summary
> -------
> 
> MAHOUT-918 Parallelized SGD in MapReduce
> 
> 
> This addresses bug MAHOUT-918.
>     https://issues.apache.org/jira/browse/MAHOUT-918
> 
> 
> Diffs
> -----
> 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java
>  1214116 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java
>  PRE-CREATION 
>   
> trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java
>  PRE-CREATION 
>   
> trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java
>  PRE-CREATION 
>   
> trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java
>  PRE-CREATION 
>   
> trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java
>  PRE-CREATION 
>   
> trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java
>  PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/3072/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> issei
> 
>

Reply via email to