On Mon, Apr 19, 2010 at 9:43 PM, Sean Owen <sro...@gmail.com> wrote: > More on Vector, as I'm browsing through it: > > AbstractVector.minus(Vector) says: > > public Vector minus(Vector x) { > if (size() != x.size()) { > throw new CardinalityException(); > } > if (x instanceof RandomAccessSparseVector || x instanceof DenseVector) { > // TODO: if both are RandomAccess check the numNonDefault to > determine which to iterate > Vector result = x.clone(); > Iterator<Element> iter = iterateNonZero(); > while (iter.hasNext()) { > Element e = iter.next(); > result.setQuick(e.index(), e.get() - result.getQuick(e.index())); > } > return result; > } else { // TODO: check the numNonDefault elements to further optimize > Vector result = clone(); > Iterator<Element> iter = x.iterateNonZero(); > while (iter.hasNext()) { > Element e = iter.next(); > result.setQuick(e.index(), getQuick(e.index()) - e.get()); > } > return result; > } > } > > > The stanza after the instanceof checks can just become the body of an > overriding method in these two subclasses right? > > Since we're computing "this - that", makes sense to only look at > "that" where it is nonzero. But the first version iterates over > indices where "this" is nonzero, so it's wrong. (Yeah just checked > with a test.) > > Was the intent to compute "that - this" in this case, so to be able to > iterate over nonzero elements of "this", and then invert it at the > end? This works: > > @Override > public Vector minus(Vector that) { > if (this.size() != that.size()) { > throw new CardinalityException(); > } > Vector result = that.clone(); > Iterator<Element> iter = this.iterateNonZero(); > while (iter.hasNext()) { > Element thisElement = iter.next(); > int index = thisElement.index(); > result.setQuick(index, that.getQuick(index) - > thisElement.get()); // this is "-(that - this)" > } > return result.times(-1.0); > } > > It's nice but involves another Vector allocation at the end. > > But then I'm also confused since this is the version intended for > DenseVector and RandomAccessSparseVector. I'd imagine it's best used > with SequentialAccessSparseVector, where "iterateNonZero()" has the > most benefit? > > > But that's only an issue to optimize if you iterate over "this", like > in my update above. Could we just have one implementation that > iterates over "that"? It's more straightforward. The issue isn't > really implementation but number of non-zero elements in "this" versus > "that", as the TODO comments point out.
You can test the minus performance using the Benchmark tool in utils. I believe this optimization was done because, inserting or editing a sequential access vector is very expensive compared to others.