Re: [jira] Commented: (MAHOUT-379) SequentialAccessSparseVector.equals does not agree with AbstractVector.equivalent

Jeff Eastman Sun, 18 Apr 2010 09:53:54 -0700

I can think of situations where I need to use a clusterId as the key-part and a Vector as the value-part. If the Vector is going to have a consistent identity as it moves through jobs then that would need to be inside the Vector.


On 4/18/10 8:41 AM, Jake Mannix wrote:

Which one is "this"?  Wrapping Vector impls into a NamedVector/LabeledVector,
or seeing if we even need the label *inside* of the Vector itself, and instead
just having those live in the "key" part of the key-value pair in Hadoop, like
DistributedRowMatrix has it?


   -jake

On Sun, Apr 18, 2010 at 3:44 AM, Sean Owen <sro...@gmail.com> wrote:

Yeah, why don't I have a crack at this. The change as it stands is
already too big for what it is (though I believe they're good
changes). Then we can look at more changes; it sounds like there are
several ideas for streamlining vectors, which is a great thing to
think about at this early stage.

On Sun, Apr 18, 2010 at 12:54 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

How about this alternative:

NamedVector: {Vector: wrapped, String: name}
Vector: AbstractVector
AbstractVector: DenseVector | SequentialSparseVector | HashSparseVector

This avoids the multiplicative explosion of vector types.
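
A minimal Java sketch of the wrapper idea above, for illustration only. The Vector interface and DenseVector class here are tiny hypothetical stand-ins, not Mahout's actual classes, and the accessor names getName and getDelegate are assumptions. The point is that the name lives in one wrapper that delegates to any implementation, so no per-implementation "named" subclass is needed.

```java
// Hypothetical, heavily reduced stand-in for a vector interface.
interface Vector {
    int size();
    double get(int index);
}

// Hypothetical dense implementation, used only to have something to wrap.
final class DenseVector implements Vector {
    private final double[] values;

    DenseVector(double... values) {
        this.values = values.clone();
    }

    @Override
    public int size() {
        return values.length;
    }

    @Override
    public double get(int index) {
        return values[index];
    }
}

// The wrapper: {Vector: wrapped, String: name}. It is itself a Vector,
// so it can be passed anywhere a plain Vector is expected.
final class NamedVector implements Vector {
    private final Vector wrapped;
    private final String name;

    NamedVector(Vector wrapped, String name) {
        this.wrapped = wrapped;
        this.name = name;
    }

    String getName() {
        return name;
    }

    Vector getDelegate() {
        return wrapped;
    }

    // All numeric behavior is forwarded to the wrapped vector.
    @Override
    public int size() {
        return wrapped.size();
    }

    @Override
    public double get(int index) {
        return wrapped.get(index);
    }
}
```

With this shape, adding a new sparse or dense implementation does not require a matching named variant: one NamedVector covers all of them, and the name travels with the vector even when the key of a key-value pair is something else, such as a clusterId.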
