On Sep 30, 2009, at 4:03 PM, Jake Mannix wrote:

Having equals() effectively delegate to
getName().equals(other.getName()) && equivalent(other) means that we need to
be extra careful about implementations of hashCode():

If we are not going to break the contract between equals() and hashCode(),
and we're having equals() *only* take into account the mathematical contents
and the name, then I'd say what we need to do is implement hashCode() in a
top-level class like AbstractVector, so that every implementation hashes the
name and contents the same way.

That is what is happening.
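
For readers following the thread, here is a minimal sketch of the idea being discussed: equals() checks the name and then delegates to equivalent(), and hashCode() lives in the top-level class so that dense and sparse implementations with the same name and contents hash alike. All class names here (SketchVector, DenseSketchVector, SparseSketchVector) are hypothetical stand-ins, and the hash formula is an assumption for illustration; this is not the actual Mahout code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical, simplified stand-in for a top-level class such as AbstractVector.
abstract class SketchVector {
    private final String name;

    SketchVector(String name) { this.name = name; }

    String getName() { return name; }
    abstract int size();
    abstract double get(int index);

    // Mathematical equality: same size and the same value at every index.
    boolean equivalent(SketchVector other) {
        if (size() != other.size()) return false;
        for (int i = 0; i < size(); i++) {
            if (get(i) != other.get(i)) return false;
        }
        return true;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchVector)) return false;
        SketchVector other = (SketchVector) o;
        return Objects.equals(name, other.name) && equivalent(other);
    }

    // Depends only on the name and the non-zero entries, so any two vectors
    // that are equal under equals() above get the same hash code.
    // Summing per-entry terms makes the result independent of iteration order.
    @Override
    public int hashCode() {
        int result = Objects.hashCode(name);
        for (int i = 0; i < size(); i++) {
            double v = get(i);
            if (v != 0.0) {
                result += (31 * i + 17) * Double.hashCode(v);
            }
        }
        return result;
    }
}

// Dense storage: every element is held in an array.
final class DenseSketchVector extends SketchVector {
    private final double[] values;

    DenseSketchVector(String name, double[] values) {
        super(name);
        this.values = values.clone();
    }

    @Override int size() { return values.length; }
    @Override double get(int index) { return values[index]; }
}

// Sparse storage: only non-zero elements are held.
final class SparseSketchVector extends SketchVector {
    private final int cardinality;
    private final Map<Integer, Double> nonZero = new HashMap<>();

    SparseSketchVector(String name, int cardinality) {
        super(name);
        this.cardinality = cardinality;
    }

    void set(int index, double value) {
        if (value == 0.0) nonZero.remove(index); else nonZero.put(index, value);
    }

    @Override int size() { return cardinality; }
    @Override double get(int index) {
        Double v = nonZero.get(index);
        return v == null ? 0.0 : v;
    }
}
```

Because neither subclass overrides equals() or hashCode(), a dense and a sparse vector with the same name and values compare equal and land in the same hash bucket. A real sparse implementation would probably want to iterate only its stored entries when hashing; the order-independent sum is one way to keep that option open.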


(Is something funny going on with JIRA?  Seems broken...)

Yes, there is something wrong.  Infra is aware of it.



 -jake

On Wed, Sep 30, 2009 at 10:01 AM, Sean Owen (JIRA) <j...@apache.org> wrote:


  [ https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760956#action_12760956 ]

Sean Owen commented on MAHOUT-165:
----------------------------------

Are my conclusions sound then:

We agree that equals() should be 'pretty strict'. The conventional Java wisdom is that equals(), in fact, ought not return true for instances of differing classes, unless you really know what you're doing. I guess we do. :)

If the idea behind equals() is "do class-specific stuff, otherwise, check names, and use equivalent() then", then we don't need strictEquivalence() --
where's it used?

(If I represented the logic correctly above -- is that as simple as we can
make it? seems a touch complex)

I am not sure anything is 'broken' in practice here but I sense it could be
simpler.


Using better primitives hash for sparse vector for performance gains
--------------------------------------------------------------------

               Key: MAHOUT-165
               URL: https://issues.apache.org/jira/browse/MAHOUT-165
           Project: Mahout
        Issue Type: Improvement
        Components: Matrix
  Affects Versions: 0.2
          Reporter: Shashikant Kore
          Assignee: Grant Ingersoll
           Fix For: 0.2

Attachments: colt.jar, mahout-165-trove.patch, MAHOUT-165.patch,
mahout-165.patch


In SparseVector, we need a primitive hash map for indices and values. The
present implementation of this hash map is not as efficient as some of the
implementations in non-Apache projects.
In an experiment, I found that, for get/set operations, Colt's primitive
hash map performs an order of magnitude better than
OrderedIntDoubleMapping. For iteration it is 2x slower, though.
Using Colt in SparseVector improved the performance of canopy generation. For
an experimental dataset, the current implementation takes 50 minutes. Using
Colt reduces this to 19-20 minutes, a reduction of about 60%.
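
To make the term "primitive hash map" concrete, here is a minimal sketch of an open-addressing int-to-double map backed by plain arrays, which avoids boxing Integer and Double objects. The class name IntDoubleHashSketch and every design detail (linear probing, load factor of 0.5, no removal, -1 as the empty marker) are assumptions for illustration; this is not Colt's implementation and not OrderedIntDoubleMapping.

```java
import java.util.Arrays;

// Hypothetical illustration of a primitive int -> double hash map.
final class IntDoubleHashSketch {
    // Vector indices are non-negative, so -1 can mark an empty slot.
    private static final int FREE = -1;

    private int[] keys;
    private double[] values;
    private int size;

    IntDoubleHashSketch(int expectedSize) {
        int capacity = 8;
        while (capacity < expectedSize * 2) capacity <<= 1; // keep capacity a power of two
        keys = new int[capacity];
        values = new double[capacity];
        Arrays.fill(keys, FREE);
    }

    // Linear probing: returns the slot holding the key, or the first free slot.
    private int slotFor(int key) {
        int mask = keys.length - 1;
        int slot = ((key * 0x9E3779B9) >>> 7) & mask; // multiplicative scrambling of the key
        while (keys[slot] != FREE && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    // Missing entries read as 0.0, matching sparse-vector semantics.
    double get(int key) {
        if (key < 0) return 0.0;
        int slot = slotFor(key);
        return keys[slot] == key ? values[slot] : 0.0;
    }

    void set(int key, double value) {
        if (key < 0) throw new IllegalArgumentException("index must be non-negative");
        if ((size + 1) * 2 > keys.length) resize(); // keep the table at most half full
        int slot = slotFor(key);
        if (keys[slot] == FREE) {
            keys[slot] = key;
            size++;
        }
        values[slot] = value;
    }

    int size() { return size; }

    // Doubles the table and reinserts every stored entry.
    private void resize() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new double[keys.length];
        Arrays.fill(keys, FREE);
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != FREE) set(oldKeys[i], oldValues[i]);
        }
    }
}
```

A hash layout like this gives constant expected time for get/set, while iteration has to scan every slot including the empty ones. That is one plausible reason a hash-based map could be slower to iterate than an ordered array mapping, though the thread itself does not say why.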

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
