Andrea,

One note: transposing is almost free. It just rearranges the strides, i.e. it changes how the array is interpreted. It doesn't actually move the data around.
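To illustrate the point about strides, here is a minimal sketch. It assumes a small hypothetical float64 array; the names a, t and c are made up for illustration and do not come from the code in this thread.

```python
import numpy as np

# Hypothetical small array standing in for the (500, 8) result discussed here.
a = np.arange(12, dtype=np.float64).reshape(3, 4)

# Transposing returns a view: same buffer, shape and strides swapped.
t = a.T
print(a.shape, a.strides)  # (3, 4) (32, 8) for this float64 array
print(t.shape, t.strides)  # (4, 3) (8, 32)

# Both arrays refer to the same memory, so a write through one
# is visible through the other.
print(np.shares_memory(a, t))
t[0, 1] = -1.0
print(a[1, 0])

# Data only gets moved when a contiguous copy is requested explicitly.
c = np.ascontiguousarray(t)
print(np.shares_memory(a, c))
```

So the cost of .T does not depend on the size of the array. A real copy only happens if something later asks for contiguous memory in the transposed order.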
-CHB

Sent from my iPhone

On Oct 7, 2017, at 2:58 AM, Andrea Gavana <andrea.gav...@gmail.com> wrote:

Apologies, correct timeit code this time (I had gotten the wrong shape for the output matrix in the loop case):

if __name__ == '__main__':

    repeat = 1000
    items = [Item('item_%d'%(i+1)) for i in xrange(500)]

    output = numpy.asarray([item.do_something() for item in items]).T

    statements = ['''
                  output = numpy.asarray([item.do_something() for item in items]).T
                  ''',
                  '''
                  output = numpy.empty((8, 500))
                  for i, item in enumerate(items):
                      output[:, i] = item.do_something()
                  ''']

    methods = ['List Comprehension', 'Empty plus Loop   ']

    setup = 'from __main__ import numpy, items'

    for stmnt, method in zip(statements, methods):

        elapsed = timeit.repeat(stmnt, setup=setup, number=1, repeat=repeat)
        minv, maxv, meanv = min(elapsed), max(elapsed), numpy.mean(elapsed)
        elapsed.sort()
        best_of_3 = numpy.mean(elapsed[0:3])
        result = numpy.asarray((minv, maxv, meanv, best_of_3))*repeat

        print method, ': MIN: %0.2f ms , MAX: %0.2f ms , MEAN: %0.2f ms , BEST OF 3: %0.2f ms'%tuple(result.tolist())

Results are the same as before...

On 7 October 2017 at 11:52, Andrea Gavana <andrea.gav...@gmail.com> wrote:

> Hi All,
>
> I have this little snippet of code:
>
> import timeit
> import numpy
>
> class Item(object):
>
>     def __init__(self, name):
>
>         self.name = name
>         self.values = numpy.random.rand(8, 1)
>
>     def do_something(self):
>
>         sv = self.values.sum(axis=0)
>         array = numpy.empty((8, ))
>         f = numpy.dot(0.5*numpy.ones((8, )), self.values)[0]
>         array.fill(f)
>         return array
>
> In my real application, the method do_something does a bit more than that,
> but I believe the snippet is enough to start playing with it. What I have
> is a list of (on average) 500-1,000 classes Item, and I am trying to
> retrieve the output of do_something for each of them in a single, big 2D
> numpy array.
>
> My current approach is to use list comprehension like this:
>
> output = numpy.asarray([item.do_something() for item in items]).T
>
> (Note: I need the transpose of that 2D array, always).
>
> But then I thought: why not preallocate the output array and make a
> simple loop:
>
> output = numpy.empty((500, 8))
> for i, item in enumerate(items):
>     output[i, :] = item.do_something()
>
> I was expecting this version to be marginally faster - as the previous one
> has to call asarray and then transpose the matrix, but I was in for a
> surprise:
>
> if __name__ == '__main__':
>
>     repeat = 1000
>     items = [Item('item_%d'%(i+1)) for i in xrange(500)]
>
>     statements = ['''
>                   output = numpy.asarray([item.do_something() for item in items]).T
>                   ''',
>                   '''
>                   output = numpy.empty((500, 8))
>                   for i, item in enumerate(items):
>                       output[i, :] = item.do_something()
>                   ''']
>
>     methods = ['List Comprehension', 'Empty plus Loop   ']
>
>     setup = 'from __main__ import numpy, items'
>
>     for stmnt, method in zip(statements, methods):
>
>         elapsed = timeit.repeat(stmnt, setup=setup, number=1, repeat=repeat)
>         minv, maxv, meanv = min(elapsed), max(elapsed), numpy.mean(elapsed)
>         elapsed.sort()
>         best_of_3 = numpy.mean(elapsed[0:3])
>         result = numpy.asarray((minv, maxv, meanv, best_of_3))*repeat
>
>         print method, ': MIN: %0.2f ms , MAX: %0.2f ms , MEAN: %0.2f ms , BEST OF 3: %0.2f ms'%tuple(result.tolist())
>
> I get this:
>
> List Comprehension : MIN: 7.32 ms , MAX: 9.13 ms , MEAN: 7.85 ms , BEST OF 3: 7.33 ms
> Empty plus Loop    : MIN: 7.99 ms , MAX: 9.57 ms , MEAN: 8.31 ms , BEST OF 3: 8.01 ms
>
> Now, I know that list comprehensions are renowned for being insanely fast,
> but I thought that doing asarray plus transpose would by far defeat their
> advantage, especially since the list comprehension is used to call a
> method, not to do some simple arithmetic inside it...
>
> I guess I am missing something obvious here...
> oh, and if anyone has suggestions about how to improve my crappy code
> (performance wise), please feel free to add your thoughts.
>
> Thank you.
>
> Andrea.

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@python.org
https://mail.python.org/mailman/listinfo/numpy-discussion