On Sep 30, 2009, at 4:03 PM, Jake Mannix wrote:
Having equals() effectively delegate to
getName().equals(other.getName()) && equivalent(other) means that we need
to be extra careful about implementations of hashCode().
If we are not going to break the contract between equals() and hashCode(),
and equals() takes into account *only* the mathematical contents and the
name, then I'd say what we need to do is implement hashCode() in a
top-level class like AbstractVector.
That is what is happening.
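
To illustrate the point about keeping hashCode() consistent with a name-plus-contents equals(), here is a minimal Java sketch. The classes SketchVector, DenseSketch, and SparseSketch are hypothetical stand-ins invented for this example, not Mahout's actual AbstractVector, DenseVector, or SparseVector, and the hash formula is only one possible choice. The idea is that the hash is computed once in the base class from the name and the non-zero entries, so a dense and a sparse instance that are equal also hash alike.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical base class (assumption: not Mahout's real AbstractVector).
abstract class SketchVector {
    private final String name;

    SketchVector(String name) { this.name = name; }

    String getName() { return name; }
    abstract int size();
    abstract double get(int index);

    // Mathematical equality: same size and same value at every index.
    boolean equivalent(SketchVector other) {
        if (size() != other.size()) return false;
        for (int i = 0; i < size(); i++) {
            if (get(i) != other.get(i)) return false;
        }
        return true;
    }

    // equals() looks only at the name and the mathematical contents,
    // so instances of different subclasses may be equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchVector)) return false;
        SketchVector other = (SketchVector) o;
        return Objects.equals(name, other.name) && equivalent(other);
    }

    // Defined once here (and final) so no subclass can hash on its own
    // internal layout. Only the name, size, and non-zero entries contribute.
    @Override
    public final int hashCode() {
        int result = Objects.hashCode(name);
        result = 31 * result + size();
        for (int i = 0; i < size(); i++) {
            double v = get(i);
            if (v != 0.0) {
                result += 31 * i + Double.hashCode(v);
            }
        }
        return result;
    }
}

// Dense storage: one array slot per index.
final class DenseSketch extends SketchVector {
    private final double[] values;

    DenseSketch(String name, double... values) {
        super(name);
        this.values = values.clone();
    }

    @Override int size() { return values.length; }
    @Override double get(int index) { return values[index]; }
}

// Sparse storage: only explicitly set entries are kept.
final class SparseSketch extends SketchVector {
    private final int cardinality;
    private final Map<Integer, Double> entries = new HashMap<>();

    SparseSketch(String name, int cardinality) {
        super(name);
        this.cardinality = cardinality;
    }

    void set(int index, double value) { entries.put(index, value); }

    @Override int size() { return cardinality; }
    @Override double get(int index) {
        Double v = entries.get(index);
        return v == null ? 0.0 : v;
    }
}
```

With this arrangement, a DenseSketch named "v" holding (1.0, 0.0, 2.0) and a SparseSketch named "v" of size 3 with entries at indices 0 and 2 compare equal and return the same hash code, while changing either the name or any entry breaks equality.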
(Is something funny going on with JIRA? Seems broken...)
Yes, there is something wrong. Infra is aware of it.
-jake
On Wed, Sep 30, 2009 at 10:01 AM, Sean Owen (JIRA) <j...@apache.org> wrote:
[https://issues.apache.org/jira/browse/MAHOUT-165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760956#action_12760956]
Sean Owen commented on MAHOUT-165:
----------------------------------
Are my conclusions sound then:

We agree that equals() should be 'pretty strict'. The conventional Java
wisdom is that equals(), in fact, ought not return true for instances of
differing classes, unless you really know what you're doing. I guess we
do. :)

If the idea behind equals() is "do class-specific stuff, otherwise, check
names, and use equivalent() then", then we don't need strictEquivalence()
-- where's it used?

(If I represented the logic correctly above -- is that as simple as we can
make it? It seems a touch complex.)

I am not sure anything is 'broken' in practice here but I sense it could
be simpler.
Using better primitives hash for sparse vector for performance gains
--------------------------------------------------------------------
Key: MAHOUT-165
URL: https://issues.apache.org/jira/browse/MAHOUT-165
Project: Mahout
Issue Type: Improvement
Components: Matrix
Affects Versions: 0.2
Reporter: Shashikant Kore
Assignee: Grant Ingersoll
Fix For: 0.2
Attachments: colt.jar, mahout-165-trove.patch, MAHOUT-165.patch, mahout-165.patch
In SparseVector, we need a primitive hash map for indices and values. The
present implementation of this hash map is not as efficient as some of the
other implementations in non-Apache projects.
In an experiment, I found that, for get/set operations, the primitive hash
of Colt performs an order of magnitude better than
OrderedIntDoubleMapping. For iteration it is 2x slower, though.
Using Colt in SparseVector improved the performance of canopy generation.
For an experimental dataset, the current implementation takes 50 minutes.
Using Colt reduces this to 19-20 minutes. That's a reduction of about 60%
in running time.
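
For readers unfamiliar with the term, a "primitive hash map" stores int keys and double values directly in arrays rather than in boxed objects. The following is a minimal sketch of the general technique (open addressing with linear probing). The class name IntDoubleHashSketch, the hash constant, and the 50% load factor are all assumptions made for illustration; this is neither Colt's implementation nor Mahout's OrderedIntDoubleMapping, and it omits removal entirely.

```java
import java.util.Arrays;

// Hypothetical sketch of an int -> double open-addressing hash map.
final class IntDoubleHashSketch {
    // Marks an empty slot; this sketch assumes keys (vector indices) are >= 0.
    private static final int FREE = -1;

    private int[] keys;
    private double[] values;
    private int size;

    IntDoubleHashSketch(int expectedSize) {
        int capacity = 16;
        // Keep capacity a power of two and at least twice the expected size.
        while (capacity < expectedSize * 2) capacity <<= 1;
        keys = new int[capacity];
        values = new double[capacity];
        Arrays.fill(keys, FREE);
    }

    // Returns the slot holding the key, or the free slot where it would go.
    // Terminates because the table is never more than about half full.
    private int slotFor(int key) {
        int mask = keys.length - 1;
        int slot = ((key * 0x9E3779B9) >>> 7) & mask;
        while (keys[slot] != FREE && keys[slot] != key) {
            slot = (slot + 1) & mask; // linear probing
        }
        return slot;
    }

    // Missing keys read as 0.0, matching sparse-vector semantics.
    double get(int key) {
        if (key < 0) return 0.0;
        int slot = slotFor(key);
        return keys[slot] == key ? values[slot] : 0.0;
    }

    void set(int key, double value) {
        if (key < 0) throw new IllegalArgumentException("negative index: " + key);
        int slot = slotFor(key);
        if (keys[slot] == FREE) {
            keys[slot] = key;
            values[slot] = value;
            size++;
            if (size * 2 > keys.length) grow();
        } else {
            values[slot] = value; // overwrite an existing entry
        }
    }

    // Doubles the table and reinserts every entry.
    private void grow() {
        int[] oldKeys = keys;
        double[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new double[keys.length];
        Arrays.fill(keys, FREE);
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldKeys[i] != FREE) set(oldKeys[i], oldValues[i]);
        }
    }

    int size() { return size; }
}
```

A structure like this makes get and set roughly constant time on average, whereas iteration has to scan the whole backing array including empty slots. That trade-off is at least consistent with the faster get/set and slower iteration reported above, though the actual cause in Colt was not measured here.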
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/
Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
using Solr/Lucene:
http://www.lucidimagination.com/search