[issue27458] Allow subtypes of unicode/str to hit the optimized unicode_concatenate block

Steven D'Aprano Tue, 05 Jul 2016 18:35:33 -0700

Steven D'Aprano added the comment:

I haven't studied your code in detail (I won't be qualified to judge it) but I 
notice this comment:


  /* Hit the faster unicode_concatenate method if and only if
     all the following conditions are true:
     1. The left operand is a unicode type
     2. The right operand is a unicode type
     3. The left operand's __add__ method isn't overriden */


I don't think that's sufficient. You also need to check the right operand's 
__radd__ method if it is a subclass of the left:

class A(str):
    pass

class B(str):
    def __radd__(self, other):
        return B(other.uppper() + str(self))

a = A("hello ")
b = B("world")
assert a + b == "HELLO world"



Also you have some tests, but I don't think they're sufficient. You have a 
comment about "Longer strings to prevent interning" but I'm not sure that a 
mere 21 characters is guaranteed now and forever to do that. I'd be much 
happier to see a long string which is not an identifier:

s = S("# ! ~"*50)  # hopefully will not be interned

plus an assertion to check that it is not interned. That way, if the rules for 
interning ever change, the test will fail loudly and clearly rather than 
silently do the wrong thing.

(Better still would be an actual language API for un-interning strings, if one 
exists.)

Also, I see that you have tests to check that the optimized path is taken for 
subclasses, but do you have tests to check that it is NOT taken when it 
shouldn't be?

----------
nosy: +steven.daprano
versions: +Python 3.6

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue27458>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

[issue27458] Allow subtypes of unicode/str to hit the optimized unicode_concatenate block

Reply via email to