Stephen Hansen wrote:
There really is just a right way versus a wrong way to join strings
together; using + is always the wrong way. Sometimes that level of 'wrong'
is so tiny that no one cares, like if you are using it to join together two
small strings. But when joining together a sequence of strings, the wrong
amplifies to become /clearly/ wrong.

Then I'm fine with sum() being smart enough to recognize this horrid case and do the "right" thing by returning ''.join() instead. If sum() were limited to int/floats like some array/numpy functions explicitly claim, that would be an "oh, we only handle these specific things and nothing else". But sum() is defined over "things that have an __add__ method", and strings have an __add__ method, making this breakage purely for breakage's sake.

  >>> class W:
  ...     def __init__(self, s):
  ...             self.s = s
  ...     def __add__(self, other):
  ...             return W(self.s + other.s)
  ...     def __repr__(self): return "<W(%r)>" % self.s
  ...     def __str__(self): return self.s
  ...
  >>> lst = [W('hello'), W('world'), W('foo')]
  >>> print sum(lst, W(''))
  helloworldfoo

It's not an error (that it *can't* be done)...it's just plain ornery :)
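
For anyone who hasn't tripped over it, here is a small sketch of the behavior in question. It assumes Python 3 (the session above is Python 2), and the helper name `try_sum` is made up purely for illustration:

```python
def try_sum(items, start):
    """Return sum(items, start), or the TypeError message if sum() refuses."""
    try:
        return sum(items, start)
    except TypeError as exc:
        return "TypeError: %s" % exc

words = ['hello', 'world', 'foo']

# A str start value is rejected outright, even though str has __add__.
print(try_sum(words, ''))

# The same data wrapped in lists goes through: list concatenation is allowed.
print(try_sum([[w] for w in words], []))

# The recommended spelling for strings.
print(''.join(words))
```

The list case is the telling one: it is repeated concatenation just the same, yet only strings get singled out for the error.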

 count = 0
 for i in range(1000000):
   if i % 1000: count += 1

instead of specifying the step-size?  Or even forcing me to precompute this
constant value for `count` because looping is inefficient in this case?

That comparison is apples to... rocket launchers.

The case with sum has nothing at all to do with the above example, or with
it maybe one day trying to "force" you into doing one thing or the other in
the name of Efficiency-- or starting down some data-hiding road.

For sum() to error out because strings are a special-case of inefficiency, the above loop should error out too because it's much more efficient to just say

  count = 999000

To look at the "for" loop version and tell me that's dumb is exactly why I feel the sum() case is dumb. If I have performance problems because I'm sum()ing strings when I should be ''.join()ing them, it's my responsibility to read the docs on sum() and see that's a foolish thing for me to be doing. But don't tell me I *can't* do dumb things.
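
For the record, the loop and the constant do agree: of the 1,000,000 values in the range, exactly 1,000 are multiples of 1000, and those are the ones the `if` skips. A quick check, assuming Python 3 (the name `expected` is mine, not part of the original loop):

```python
# Count the values in range(1000000) that are NOT multiples of 1000.
count = 0
for i in range(1000000):
    if i % 1000:  # a non-zero remainder is truthy
        count += 1

# Closed form: total values minus the multiples of 1000.
expected = 1000000 - 1000000 // 1000

print(count, expected)
```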

Yes, sum() is doing some "hand holding" here, but only in one specific case:
because it's -always-wrong- to use it in that case.

What's always wrong is giving me an *error* when the semantics are perfectly valid. I don't care if the implementation is

  def sum(iterable, default=0):
    # Python 2 spelling: basestring covers both str and unicode
    if isinstance(default, basestring):
      return ''.join(iterable)
    else:
      result = default
      for item in iterable:
        result += item
      return result

to do the "right" thing of performing __add__ on all the elements of the iterable unless it's a string. If you want to special-case strings to perform a ''.join() then go right ahead.
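
A runnable take on that idea might look like the sketch below. It assumes Python 3, and `lenient_sum` is a hypothetical name of my own, not anything in the standard library or a description of how the builtin is written:

```python
def lenient_sum(iterable, start=0):
    """Like sum(), but joins strings instead of raising TypeError.

    Hypothetical helper for illustration only.
    """
    if isinstance(start, str):
        # Special case: hand the strings to str.join, keeping the start value.
        return start + ''.join(iterable)
    result = start
    for item in iterable:
        # Plain + rather than += so a mutable start value is never modified.
        result = result + item
    return result


print(lenient_sum(['hello', 'world', 'foo'], ''))
print(lenient_sum([1, 2, 3]))
print(lenient_sum([[1], [2, 3]], []))
```

Strings take the fast path and everything else with an __add__ behaves as it does today, so nobody gets an error for semantics that are perfectly valid.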

The "consenting adults" argument sort of applies, sure. But these general
principles aren't absolutes. None of them are. In this case, someone decided
that it was way too easy for someone to NOT know that this is wrong and make
a mistake.

Among consenting adults, it's not "wrong". You'll just discover there are better ways when your sum() becomes a hot-spot for CPU cycles.

Just a burr in my boots.

-tkc


