Re: Vector and Matrices - The Next Gen

Dan Filimon Sat, 20 Apr 2013 09:46:57 -0700

On Thu, Apr 18, 2013 at 11:41 PM, Robin Anil <[email protected]> wrote:


> Next obvious speedups ideas I can think of are:
>
> 1) Batch insert into OpenIntDoubleHashMap(OIDHM) and
> OrderedIntDoubleMapping(OIDM). This way mutable operations like plus() or
> minus() can iterate on the Intersection elements and add the difference in
> one go. Can anyone think of a smart way to rehash based on new input
> elements ?
>
> 2) Speed up aggregate and assign methods(Dan is doing that with)
>

Regarding this, I'm testing the code to see if anything breaks, and then I
want to see what the performance is like.
I'm experimenting with making every operation a variant of aggregate() or
assign().
This is useful because there's just one code path to look at, and we can
focus on high-level optimizations that apply to a larger class of functions.

Here is a preliminary version:
https://reviews.apache.org/r/10669/diff/#index_header
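
To make the idea concrete, here is a minimal sketch of what "every operation
is a variant of aggregate() or assign()" could look like. It is not the code
in the review above: it uses plain double[] arrays instead of Vector, Java 8
lambdas, and a hypothetical class name (AggregateAssignSketch) with
illustrative helper names.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch: plain arrays stand in for vectors, and none of these
// names are taken from the actual patch.
final class AggregateAssignSketch {

  // assign: overwrite each a[i] with f(a[i], b[i]) and return a.
  static double[] assign(double[] a, double[] b, DoubleBinaryOperator f) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("size mismatch");
    }
    for (int i = 0; i < a.length; i++) {
      a[i] = f.applyAsDouble(a[i], b[i]);
    }
    return a;
  }

  // aggregate: fold combiner(a[i], b[i]) over all i using aggregator.
  // Returns 0 for empty input (an assumption made for this sketch).
  static double aggregate(double[] a, double[] b,
                          DoubleBinaryOperator aggregator,
                          DoubleBinaryOperator combiner) {
    if (a.length != b.length) {
      throw new IllegalArgumentException("size mismatch");
    }
    if (a.length == 0) {
      return 0.0;
    }
    double result = combiner.applyAsDouble(a[0], b[0]);
    for (int i = 1; i < a.length; i++) {
      result = aggregator.applyAsDouble(result, combiner.applyAsDouble(a[i], b[i]));
    }
    return result;
  }

  // Every operation below is a one-line variant of assign() or aggregate().
  static double[] plus(double[] a, double[] b) {
    return assign(a.clone(), b, Double::sum);
  }

  static double[] minus(double[] a, double[] b) {
    return assign(a.clone(), b, (x, y) -> x - y);
  }

  static double dot(double[] a, double[] b) {
    return aggregate(a, b, Double::sum, (x, y) -> x * y);
  }

  static double distanceSquared(double[] a, double[] b) {
    return aggregate(a, b, Double::sum, (x, y) -> (x - y) * (x - y));
  }
}
```

With this shape, any optimization made inside assign() or aggregate() (for
example, skipping zeros in sparse inputs) would be picked up by plus(),
minus(), dot() and the rest without touching them.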

Regarding the parallelization, the results would be valid as long as the
aggregating function is both commutative and associative (which we can now
check), but adding the parallelization here might be too much work.
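
As a rough illustration of why both properties matter, here is a sketch of a
sharded aggregate. It assumes a simple "index mod numShards" sharding
function, a plain double[] input, and a hypothetical class name
(ShardedAggregateSketch); none of this is taken from the patch. Because each
shard folds a strided subset of the elements and the partial results are then
folded together, the elements are effectively reordered and regrouped, which
only gives the sequential answer when the aggregator is commutative and
associative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.DoubleBinaryOperator;

// Hypothetical sketch of a sharded, multi-threaded aggregate.
final class ShardedAggregateSketch {

  // Folds all values with aggregator, using one task per shard.
  // Element i belongs to shard (i % shards).
  static double aggregate(double[] values, DoubleBinaryOperator aggregator, int numShards) {
    if (values.length == 0) {
      throw new IllegalArgumentException("empty input");
    }
    // Never use more shards than elements, so every shard is non-empty.
    final int shards = Math.max(1, Math.min(numShards, values.length));
    ExecutorService pool = Executors.newFixedThreadPool(shards);
    try {
      List<Future<Double>> partials = new ArrayList<>();
      for (int s = 0; s < shards; s++) {
        final int shard = s;
        partials.add(pool.submit((Callable<Double>) () -> {
          // Fold this shard's elements: shard, shard + shards, ...
          double acc = values[shard];
          for (int i = shard + shards; i < values.length; i += shards) {
            acc = aggregator.applyAsDouble(acc, values[i]);
          }
          return acc;
        }));
      }
      // Fold the per-shard partial results in shard order.
      double result = partials.get(0).get();
      for (int s = 1; s < shards; s++) {
        result = aggregator.applyAsDouble(result, partials.get(s).get());
      }
      return result;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while aggregating", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("shard task failed", e);
    } finally {
      pool.shutdown();
    }
  }
}
```

Calling aggregate(values, Double::sum, 3) or aggregate(values, Math::max, 3)
on small integer inputs matches the sequential result, while a
non-associative function such as (x, y) -> x - y generally does not. For
floating-point sums the result can also differ in the last bits, since
addition of doubles is only approximately associative.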


> 3) Generalize caching framework of derived properties like
> getLengthSquared() and extend it into other things, like commons norms (L1,
> L2), numNonZeros(),
>
> 4) Parallelize operations: Use a consistent sharding function to trivially
> parallelize certain iterative operations across multiple threads.
>
> 6) Replace current DenseVector and/or encapsulate JBlas inside it.
>
> 7) Improve exception handling.
>
> All these can be independent projects. I know I wont get time to get to
> this, I am more than happy to review
>
