On Mar 26, 7:31 am, Steve Howell <showel...@yahoo.com> wrote: > On Mar 24, 4:19 pm, Paul Rubin <no.em...@nospam.invalid> wrote: > > > kj <no.em...@please.post> writes: > > > Is there a sequence-oriented equivalent to the sum built-in? E.g.: > > > seq_sum(((1, 2), (5, 6))) --> (1, 2) + (5, 6) --> (1, 2, 5, 6) > > > use itertools.chain for this. A few people have mentioned that sum will > > also work, but I think for that purpose it could have O(n**2) > > complexity. > > I agree on the practical matter that itertools.chain and other > solutions are usually the way to go for most tasks that involve > iterating through several lists. > > From a purely academic standpoint, I'm not convinced that sum() is > inefficient in terms of big-O complexity, though. > > show...@showell-laptop:~$ python > Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) > [GCC 4.3.3] on linux2 > >>> class StupidList: > ... def __init__(self, lst): > ... print 'creating', lst > ... self.lst = lst > ... def __add__(self, other): > ... self.lst += '|' > ... self.lst.extend(other.lst) > ... return self > ... > >>> result = sum([StupidList([1, 2]), StupidList([3,4]), > StupidList([5,6])], StupidList([0])) > creating [1, 2] > creating [3, 4] > creating [5, 6] > creating [0] > >>> result.lst > [0, '|', 1, 2, '|', 3, 4, '|', 5, 6] > > If I'm interpreting the above program correctly, then sum() is doing > the most efficient thing under the hood--it appears to do the > equivalent of += without creating unnecessary objects for intermediate > sums. > > I think the special-case error message might be a case where > practicality simply beats out purity. It would be nice if sum() were > completely duck-typed-let-you-shoot-yourself-in-foot-if-you-know-what- > you-are-doing, but maybe this was such a pitfall at one time, that > extra safeguards were put into sum(). I wonder how severely sum(), > without the restriction, would underperform join() on modern versions > of Python, though. > > >>> sum('1', '2') > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: sum() can't sum strings [use ''.join(seq) instead] > > Note that you can easily fake out sum() to get duck typing. > > >>> class EmptyStringStarter: > ... def __add__(self, other): return other > ... > >>> empty = EmptyStringStarter() > >>> sum(['hello ', 'world'], empty) > 'hello world'
Looking at the code answers my own questions: http://svn.python.org/view/python/trunk/Python/bltinmodule.c?view=markup Look for builtin_sum(). First you see the guard against strings: /* reject string values for 'start' parameter */ if (PyObject_TypeCheck(result, &PyBaseString_Type)) { PyErr_SetString(PyExc_TypeError, "sum() can't sum strings [use ''.join(seq) instead]"); Py_DECREF(iter); return NULL; } Py_INCREF(result); Also, Paul's suspicion that sum() works in O(N squared) for lists is confirmed by this comment: /* It's tempting to use PyNumber_InPlaceAdd instead of PyNumber_Add here, to avoid quadratic running time when doing 'sum(list_of_lists, [])'. However, this would produce a change in behaviour: a snippet like empty = [] sum([[x] for x in range(10)], empty) would change the value of empty. */ It's interesting, though, that you can construct classes pretty easily, as I did above, that avoid the quadratic behavior, as long as you do not mind mutating the start value, which I think is usually fine, since the original start value usually is not useful afterward anyway. -- http://mail.python.org/mailman/listinfo/python-list