Stephen Hansen wrote:
There really is just a right way versus a wrong way to join strings
together; using + is always the wrong way. Sometimes that level of 'wrong'
is so tiny that no one cares, like if you are using it to join together two
small strings. But when joining together a sequence of strings, the wrong
amplifies to become /clearly/ wrong.
Then I'm fine with sum() being smart enough to recognize this
horrid case and do the "right" thing by returning ''.join()
instead. If sum() were limited to int/floats like some
array/numpy functions explicitly claim, that would be an "oh, we
only handle these specific things and nothing else". But sum()
is defined over "things that have an __add__ method", and strings
have an __add__ method, making this breakage purely for
breakage's sake.
>>> class W:
...     def __init__(self, s):
...         self.s = s
...     def __add__(self, other):
...         return W(self.s + other.s)
...     def __repr__(self): return "<W(%r)>" % self.s
...     def __str__(self): return self.s
...
>>> lst = [W('hello'), W('world'), W('foo')]
>>> print sum(lst, W(''))
helloworldfoo
It's not an error (that it *can't* be done)...it's just plain
ornery :)
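For anyone who has not run into it, here is a minimal sketch of the behavior being discussed, written in Python 3 syntax (the session above is Python 2). The Wrapped class is a hypothetical stand-in for the W class above. The only point is that sum() rejects a str start value while accepting any other object that defines __add__.

```python
class Wrapped:
    """Hypothetical wrapper around a str; mirrors the W class above."""
    def __init__(self, s):
        self.s = s

    def __add__(self, other):
        return Wrapped(self.s + other.s)

    def __str__(self):
        return self.s


words = ["hello", "world", "foo"]

# A str start value is rejected outright with TypeError.
try:
    sum(words, "")
    rejected = False
except TypeError:
    rejected = True

# The same concatenation goes through once the strings are wrapped.
wrapped_total = str(sum((Wrapped(w) for w in words), Wrapped("")))

# The spelling that sum() wants you to use instead.
joined = "".join(words)
```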
count = 0
for i in range(1000000):
    if i % 1000: count += 1
instead of specifying the step-size? Or even forcing me to precompute this
constant value for `count` because looping is inefficient in this case?
That comparison is apples to... rocket launchers.
The case with sum has nothing at all to do with the above example, or with it
maybe one day trying to "force" you into doing one thing or the other in the
name of Efficiency-- or starting down some data-hiding road.
For sum() to error out because strings are a special-case of
inefficiency, the above loop should error out too because it's
much more efficient to just say
count = 999000
To look at the "for" loop version and tell me that's dumb is
exactly why I feel the sum() case is dumb. If I have performance
problems because I'm sum()ing strings when I should be
''.join()ing them, it's my responsibility to read the docs on
sum() and see that's a foolish thing for me to be doing. But
don't tell me I *can't* do dumb things.
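The 999000 figure can be checked by hand: of the 1000000 values in range(1000000), exactly 1000 are multiples of 1000 (0, 1000, ..., 999000), and the loop counts everything else. A small sketch confirming that, where the names N and STEP are illustrative only:

```python
N = 1000000
STEP = 1000

# The loop from the example: count every i that is NOT a multiple of STEP.
count = 0
for i in range(N):
    if i % STEP:
        count += 1

# Multiples of STEP below N, found by stepping instead of testing each i.
multiples = len(range(0, N, STEP))

# Closed form: everything minus the multiples.
closed_form = N - multiples
```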
Yes, sum() is doing some "hand holding" here, but only in one specific case:
because it's -always-wrong- to use it in that case.
What's always wrong is giving me an *error* when the semantics
are perfectly valid. I don't care if the implementation is
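def sum(iterable, default=0):
    # basestring is the Python 2 base class of str and unicode
    if isinstance(default, basestring):
        # keep the start value as a prefix, as __add__ would
        return default + ''.join(iterable)
    else:
        result = default
        for item in iterable:
            result += item
        return result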
to do the "right" thing of performing __add__ on all the elements
of the iterable unless it's a string. If you want to
special-case strings to perform a ''.join(), then go right ahead.
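A runnable sketch of that idea in Python 3 syntax, where str takes the place of Python 2's basestring. The name lenient_sum is hypothetical, chosen so as not to shadow the builtin. This is one way such a special case could look, not how the builtin is implemented.

```python
def lenient_sum(iterable, start=0):
    """Add up the items, delegating to str.join when start is a str."""
    if isinstance(start, str):
        # Let join build the result instead of concatenating item by item.
        return start + "".join(iterable)
    result = start
    for item in iterable:
        # Plain addition, so anything with __add__ works.
        result = result + item
    return result


numbers_total = lenient_sum([1, 2, 3])
strings_total = lenient_sum(["hello", "world", "foo"], "")
lists_total = lenient_sum([[1], [2, 3]], [])
```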
The "consenting adults" argument sort of applies, sure. But these general
principles aren't absolutes. None of them are. In this case, someone decided
that it was way too easy for someone to NOT know that this is wrong and make
a mistake.
Among consenting adults, it's not "wrong". You'll just discover
there are better ways when your sum() becomes a hot-spot for CPU
cycles.
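If you want to see where that hot-spot comes from, a rough timing sketch along these lines would do it. It uses functools.reduce with operator.add as a stand-in for what a string-accepting sum() would do. The sizes and repeat counts are arbitrary, and actual timings depend on the interpreter and machine, so none are shown.

```python
import operator
import timeit
from functools import reduce

# Arbitrary test data: 2000 short strings.
parts = ["x" * 10] * 2000


def concat_by_add():
    # One addition per item, each producing a new, longer string.
    return reduce(operator.add, parts, "")


def concat_by_join():
    # A single call that assembles the whole result.
    return "".join(parts)


add_seconds = timeit.timeit(concat_by_add, number=20)
join_seconds = timeit.timeit(concat_by_join, number=20)
same_result = concat_by_add() == concat_by_join()
```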
Just a burr in my boots.
-tkc
--
http://mail.python.org/mailman/listinfo/python-list