Hi, I was playing around with einsum (which is really cool, by the way!), and I noticed something strange (or unexpected at least) performance-wise. Here is an example:
Let x and y be two NxM arrays, and W be an NxN array. I want to compute f = x^{ik} W_i^j y_{jk}, summing over all three indices (I hope the notation is clear enough). What surprises me is that it's much faster to compute the two products separately, as in ((xW)y) or (x(Wy)), than to pass the three-operand expression to einsum directly. I ran a quick benchmark with the following script:

#========================================================
import numpy as np

def f1(x, W, y):
    # single three-operand einsum call
    return np.einsum('ik,ij,jk', x, W, y)

def f2(x, W, y):
    # same scalar via two pairwise contractions
    return np.einsum('jk,jk', np.einsum('ik,ij->jk', x, W), y)

n = 30
m = 150
x = np.random.random(size=(n, m))
y = np.random.random(size=(n, m))
W = np.random.random(size=(n, n))

setup = """\
from __main__ import f1, f2, x, y, W
"""

if __name__ == '__main__':
    import timeit
    min_f1_time = min(timeit.repeat(stmt='f1(x,W,y)', setup=setup,
                                    repeat=10, number=10000))
    min_f2_time = min(timeit.repeat(stmt='f2(x,W,y)', setup=setup,
                                    repeat=10, number=10000))
    print('f1: {time}'.format(time=min_f1_time))
    print('f2: {time}'.format(time=min_f2_time))
#============================================================

On a particular trial on my 64-bit Intel machine running Linux with numpy 1.6.1, I get:

f1: 2.86719584465
f2: 0.891730070114

Of course, I know nothing of einsum's internals, and there's probably a good reason for this. Still, it's quite counterintuitive to me; I thought I'd mention it because a quick search didn't bring up anything on this topic, and AFAIK einsum's docs/examples don't cover the case with more than two operands.

Anyway, I hope I've been clear enough and that I'm not rehashing an already-debated topic. Thanks in advance for any clarification,

Eugenio

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
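For anyone who wants to check the two routes really compute the same scalar: the contraction sum_{ijk} x[ik] W[ij] y[jk] can also be written with an ordinary matrix product followed by an elementwise sum. A minimal sketch (variable names and the seeded generator are my additions; the optimize keyword only exists in NumPy >= 1.12, not in the 1.6.1 used above):

#========================================================
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility
n, m = 30, 150
x = rng.random((n, m))
y = rng.random((n, m))
W = rng.random((n, n))

# Three-operand call, as in f1
f_direct = np.einsum('ik,ij,jk', x, W, y)

# Pairwise contraction, as in f2
f_pairwise = np.einsum('jk,jk', np.einsum('ik,ij->jk', x, W), y)

# Same scalar via a plain matrix product:
# (W @ y)[i, k] = sum_j W[i, j] y[j, k], then multiply by x and sum over i, k
f_matmul = np.sum(x * (W @ y))

assert np.allclose(f_direct, f_pairwise)
assert np.allclose(f_direct, f_matmul)

# In NumPy >= 1.12, einsum can pick such a pairwise order itself:
f_opt = np.einsum('ik,ij,jk', x, W, y, optimize=True)
assert np.allclose(f_direct, f_opt)
#========================================================

The matmul form makes the cost asymmetry visible: the pairwise route does one N x N by N x M product plus an O(N*M) reduction, whereas a naive three-operand loop touches all N*N*M index triples at once.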