[ https://issues.apache.org/jira/browse/MAHOUT-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13169223#comment-13169223 ]

jirapos...@reviews.apache.org commented on MAHOUT-918:
------------------------------------------------------



bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, lines 36-41
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line36>
bq.  >
bq.  >     Direct and exact quotes from the paper should be either avoided or acknowledged.  Better here to rephrase the language.

Rephrased the language in revision 5.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java, lines 60-63
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64283#file64283line60>
bq.  >
bq.  >     Again, just quoting the paper is not a good idea.  This isn't adding any information in any case since the exact same language was used in the class level java doc.
bq.  >     
bq.  >     It would be nice here to note that the average is an *unweighted* average.

Rephrased the language in revision 5.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, lines 87-88
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line87>
bq.  >
bq.  >     This looks like a bad key to use here.

This key is intended to be the average log-likelihood of the best 
OnlineLogisticRegression in the AdaptiveLogisticRegression.
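
To make "average log-likelihood" concrete, here is a minimal sketch for the 
binary case only. It is an illustration under my own assumptions, not the code 
in the patch: the class name, method name and the clamping constant are 
hypothetical, and Mahout's own classes are deliberately not used.

```java
/** Hypothetical illustration; not the code from the MAHOUT-918 patch. */
public class AverageLogLikelihoodSketch {

  /**
   * Mean log-likelihood of binary labels under predicted probabilities.
   * labels[i] is 0 or 1; probs[i] is the predicted probability of label 1.
   */
  static double averageLogLikelihood(int[] labels, double[] probs) {
    double sum = 0.0;
    for (int i = 0; i < labels.length; i++) {
      // Probability the model assigned to the label that was actually observed.
      double p = labels[i] == 1 ? probs[i] : 1.0 - probs[i];
      // Clamp away from zero (assumed constant) so log() never returns -Infinity.
      sum += Math.log(Math.max(p, 1e-12));
    }
    return sum / labels.length;
  }

  public static void main(String[] args) {
    int[] labels = {1, 0, 1};
    double[] probs = {0.9, 0.2, 0.6};
    System.out.println(averageLogLikelihood(labels, probs));
  }
}
```

A value closer to zero means the model assigned higher probability to the 
observed labels, which is why such a score can be used to rank models.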


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java, line 40
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64284#file64284line40>
bq.  >
bq.  >     I don't think that this is correct.  Is this really what the output is?  Why are you dividing by a weight vector?  How do you compute this score?
bq.  >     
bq.  >     Or do you mean to not divide here?
bq.  >     
bq.  >     If so, why do you use a score as the key?

My explanation may have been unclear. What I meant is that the map output key 
is the score and the map output value is the new weight vector.
I rewrote the comment in revision 5.


bq.  On 2011-12-13 13:24:28, Ted Dunning wrote:
bq.  > trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java, lines 34-35
bq.  > <https://reviews.apache.org/r/3072/diff/4/?file=64285#file64285line34>
bq.  >
bq.  >     I don't think that this is correct.  In the google paper, the average was unweighted.  In any case how do you compute this score for weighting?
bq.  >     
bq.  >     Also, if the key is the score, how does the reducer work since each reduce function will only see one score?  Are you assuming that there is exactly one reducer?

The original paper (http://aclweb.org/anthology-new/N/N10/N10-1069.pdf) 
describes a weighted average, but in a simple experiment of mine the unweighted 
average performed better than the weighted average.
So I changed the code to use the unweighted average in revision 5.

The number of reducers should be set to one. I added a comment to that effect 
in revision 5.
The number of reducers is set in the runIteration function of the Driver class.
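
For reference, the difference between the two averaging schemes can be sketched 
as follows. This is only an illustration using plain double arrays; the class 
name, the method names and the mix coefficients are hypothetical and are not 
taken from the patch or from Mahout's Vector API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; not the code from the MAHOUT-918 patch. */
public class ParameterMixingSketch {

  /** Unweighted average: every mapper's weight vector counts equally. */
  static double[] unweightedAverage(List<double[]> vectors) {
    double[] result = new double[vectors.get(0).length];
    for (double[] v : vectors) {
      for (int i = 0; i < result.length; i++) {
        result[i] += v[i] / vectors.size();
      }
    }
    return result;
  }

  /** Weighted average: vector k is scaled by mix[k] / sum(mix). */
  static double[] weightedAverage(List<double[]> vectors, double[] mix) {
    double total = Arrays.stream(mix).sum();
    double[] result = new double[vectors.get(0).length];
    for (int k = 0; k < vectors.size(); k++) {
      double[] v = vectors.get(k);
      for (int i = 0; i < result.length; i++) {
        result[i] += v[i] * mix[k] / total;
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Two weight vectors, as if produced by two mappers (made-up numbers).
    List<double[]> fromMappers = Arrays.asList(
        new double[] {1.0, 2.0},
        new double[] {3.0, 6.0});
    System.out.println(Arrays.toString(unweightedAverage(fromMappers)));
    System.out.println(Arrays.toString(
        weightedAverage(fromMappers, new double[] {3.0, 1.0})));
  }
}
```

With equal mix coefficients the two methods give the same result, so the 
unweighted average is the special case in which no mapper is favoured.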


- issei


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3072/#review3875
-----------------------------------------------------------


On 2011-12-14 08:59:29, issei yoshida wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3072/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-12-14 08:59:29)
bq.  
bq.  
bq.  Review request for mahout.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  MAHOUT-918 Parallelized SGD in MapReduce
bq.  
bq.  
bq.  This addresses bug MAHOUT-918.
bq.      https://issues.apache.org/jira/browse/MAHOUT-918
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/PassiveAggressive.java 1214116 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveReducer.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDDriver.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapper.java PRE-CREATION 
bq.    trunk/core/src/main/java/org/apache/mahout/classifier/sgd/mapreduce/SGDReducer.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/AdaptiveLogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/LogisticRegressionMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/PassiveAggressiveMapReduceTest.java PRE-CREATION 
bq.    trunk/core/src/test/java/org/apache/mahout/classifier/sgd/mapreduce/SGDMapReduceTest.java PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/3072/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  issei
bq.  
bq.


                
> Implement SGD based classifiers using MapReduce
> -----------------------------------------------
>
>                 Key: MAHOUT-918
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-918
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>    Affects Versions: 0.6
>            Reporter: issei yoshida
>         Attachments: MAHOUT-918.patch, design.pdf
>
>
> Implement SGD-based classifiers (Logistic Regression, Adaptive Logistic 
> Regression and Passive-Aggressive) using MapReduce.
> They are implemented using the Iterative Parameter Mixtures algorithm, which 
> is described in the following papers.
> http://research.google.com/pubs/pub36948.html
> http://aclweb.org/anthology-new/N/N10/N10-1069.pdf
> http://books.nips.cc/papers/files/nips22/NIPS2009_0345.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
