Hello Nathaniel,
> I'd also be interested in hearing more about the memory requirements of this 
> approach. How does the temporary memory required typically scale with the 
> size of the input arrays? Is there an upper bound on the temporary memory 
> required?
> 
Currently the algorithm will not create an intermediate array larger than the 
largest input or output array (call this maximum_array_size). This gives an 
upper bound on the temporary memory of 
(number_of_terms/2 + 1) * maximum_array_size. In practice the total rarely 
exceeds maximum_array_size itself, as building large outer products is not 
usually helpful. The intermediate views are also dereferenced after they are 
used, which I believe frees the arrays correctly; however, this is one thing I 
am not sure is handled in the best way, and it could use further testing. 
Computing cumulative memory should also be possible for the brute-force path 
algorithm, but I am not sure this is possible for the faster greedy path 
algorithm without large changes.
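
To make that concrete, here is a rough sketch of how such a bound could be 
computed. The helper name is hypothetical (not an opt_einsum internal), and it 
counts array elements rather than bytes:

import numpy as np

def temp_memory_upper_bound(arrays, output_shape):
    # Hypothetical helper, not opt_einsum code: largest single array
    # involved in the contraction, in elements.
    maximum_array_size = max(
        [a.size for a in arrays] + [int(np.prod(output_shape))]
    )
    number_of_terms = len(arrays)
    # No intermediate exceeds maximum_array_size, and at most
    # (number_of_terms/2 + 1) intermediates are alive at once.
    return (number_of_terms // 2 + 1) * maximum_array_size

For a three-term expression like 'ij,jk,kl->il' this would be called as 
temp_memory_upper_bound([A, B, C], (N, N)).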

Overall this sounds great. If anyone has a suggestion for where this should 
live, I can start working on a PR and we can work out the remaining issues 
there.
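
For anyone skimming, here is a toy version of the N^8 -> N^5 transformation 
described in the quoted message below. This is a sketch of the idea, not the 
opt_einsum code itself, and the expression is a guess at the canonical 5-term 
transformation:

import numpy as np

N = 10
C = np.random.rand(N, N)
I = np.random.rand(N, N, N, N)

# One-shot einsum: loops over all eight indices, so it scales as N^8.
ref = np.einsum('pi,qj,ijkl,rk,sl->pqrs', C, C, I, C, C)

# Pairwise path: transform one index at a time with tensordot. Every
# intermediate stays rank 4 and each step costs only N^5.
T = np.tensordot(C, I, axes=(1, 0))   # pi,ijkl -> pjkl
T = np.tensordot(C, T, axes=(1, 1))   # qj,pjkl -> qpkl
T = np.tensordot(C, T, axes=(1, 2))   # rk,qpkl -> rqpl
T = np.tensordot(C, T, axes=(1, 3))   # sl,rqpl -> srqp
result = T.transpose(3, 2, 1, 0)      # srqp -> pqrs

assert np.allclose(ref, result)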

-Daniel Smith

> On Jan 3, 2015, at 10:57 AM, Nathaniel Smith <n...@pobox.com> wrote:
> 
> On 3 Jan 2015 02:46, "Daniel Smith" <dgasm...@icloud.com> wrote:
> >
> > Hello everyone,
> >
> > I have been working on a chunk of code that sets out to provide a single 
> > function that can take an arbitrary einsum expression and compute it in an 
> > optimal way. While np.einsum can compute arbitrary expressions, there are 
> > two drawbacks to using pure einsum: it does not consider building 
> > intermediate arrays for possible reductions in overall rank, and it is not 
> > currently capable of using a vendor BLAS. I have been working on a project 
> > that aims to solve both issues simultaneously:
> >
> > https://github.com/dgasmith/opt_einsum
> >
> > This program first builds the optimal way to contract the tensors 
> > together, or, using my own nomenclature, a “path.” This path is then 
> > iterated over, using tensordot when possible and einsum for everything 
> > else. In test cases the worst-case scenario adds a 20 microsecond overhead 
> > penalty and, in the best-case scenario, it can reduce the overall rank of 
> > the tensor. The primary (if somewhat exaggerated) example is a 5-term N^8 
> > index transformation that can be reduced to N^5; even when N is very small 
> > (N=10) there is a 2,000-fold speed increase over pure einsum or, if using 
> > tensordot, a 2,400-fold speed increase. This is somewhat similar to the 
> > new np.linalg.multi_dot function.
> 
> This sounds super awesome. Who *wouldn't* want a free 2,000x speed increase? 
> And I especially like your test suite.
> 
> I'd also be interested in hearing more about the memory requirements of this 
> approach. How does the temporary memory required typically scale with the 
> size of the input arrays? Is there an upper bound on the temporary memory 
> required?
> 
> > As such, I am very interested in implementing this into numpy. While I 
> > think opt_einsum is in a pretty good place, there is still quite a bit to 
> > do (see outstanding issues in the README). Even if this is not something 
> > that would fit into numpy I would still be very interested in your comments.
> 
> We would definitely be interested in integrating this functionality into 
> numpy. After all, half the point of having an interface like einsum is that 
> it provides a clean boundary where we can swap in complicated, sophisticated 
> machinery without users having to care. No one wants to curate their own pile 
> of custom optimized libraries. :-)
> 
> -n
> 

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion