Github user GeorgeDittmar commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6112#discussion_r31387313

    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala ---
    @@ -717,6 +719,49 @@ class SparseVector(
             new SparseVector(size, ii, vv)
           }
         }
    +
    +  override def argmax: Int = {
    +    if (size == 0) {
    +      -1
    +    } else {
    +
    +      //grab first active index and value by default
    +      var maxIdx = indices(0)
    +      var maxValue = values(0)
    +
    +      foreachActive { (i, v) =>
    +        if (v > maxValue) {
    +          maxIdx = i
    +          maxValue = v
    +        }
    +      }
    +
    +      // look for inactive values incase all active node values are negative
    +      if(size != values.size && maxValue <= 0){
    --- End diff --

    Found another corner case: a 0 value defined in the active set of values at the very end of the vector. I'm wondering if it might make more sense to enforce more strongly in the SparseVector implementation that 0s can't be in the set of active values? Maybe that's too strict a rule, but it would cut down on these corner cases. It seems odd to allow the addition of active entries with value 0 when they should really be inactive. Also, if we call SparseVector.toSparseVector, it looks like it filters out the zeros to begin with, so it might make sense to do this more formally at object creation time. @mengxr thoughts?
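To make the corner cases concrete, here is a minimal, hypothetical Python sketch (not the Spark implementation) of the same argmax logic for a sparse vector, where unstored positions are implicit zeros. It includes the fallback scan for inactive positions when the best active value is non-positive, and it handles the case raised above: an explicit 0 stored as an active value late in the vector, where the true argmax is an earlier implicit zero.

```python
def sparse_argmax(size, indices, values):
    """Argmax of a sparse vector; unstored entries are implicitly 0.0.

    Illustrative sketch mirroring the logic discussed in the diff.
    Ties are broken by the smallest index, matching dense-argmax behavior.
    """
    if size == 0:
        return -1

    # Default to the first active entry, then scan the active values.
    max_idx, max_value = indices[0], values[0]
    for i, v in zip(indices, values):
        if v > max_value:
            max_idx, max_value = i, v

    # If inactive (implicit-zero) entries exist and the best active value
    # is <= 0, an implicit zero may be the argmax: 0 beats any negative
    # value, and on a tie with an explicit 0 the smaller index wins.
    if len(values) < size and max_value <= 0:
        active = set(indices)
        first_inactive = next(i for i in range(size) if i not in active)
        if max_value < 0 or first_inactive < max_idx:
            return first_inactive

    return max_idx
```

For example, `sparse_argmax(3, [0, 2], [-1.0, 0.0])` must return 1, not 2: the explicit 0 at index 2 ties with the implicit 0 at index 1, and the smaller index wins, which is exactly the corner case of an explicit zero stored at the end of the active set.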