Sorry, I think I accidentally left out a clause here - I meant that the
rationale for /always returning a 'str'/ (as opposed to returning a
subclass) is missing; the PEP just says:

> The only difference between the real implementation and the above is
> that, as with other string methods like replace, the methods will
> raise a TypeError if any of self, pre or suf is not an instace of str,
> and will cast subclasses of str to builtin str objects.

I think the rationale for these differences is not made entirely clear,
specifically the "and will cast subclasses of str to builtin str
objects" part.
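
For reference, the casting behavior the PEP points to can be observed on
existing methods today. A minimal sketch, where `MyStr` is a hypothetical
subclass used only for illustration:

```python
class MyStr(str):
    """Hypothetical str subclass that overrides nothing."""


s = MyStr("prefix-rest")

# Existing methods such as replace return a builtin str, not MyStr.
replaced = s.replace("prefix-", "")

# Default slicing also returns a builtin str unless __getitem__ is overridden.
sliced = s[len("prefix-"):]

print(type(replaced).__name__, type(sliced).__name__)
```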

I think it would be best to define the truncation in terms of
__getitem__ - possibly with the caveat that implementations are allowed
(but not required) to return `self` unchanged if no match is found.
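
To make that concrete, here is a rough sketch of what a slicing-based
definition could look like. This is only an illustration, not the PEP's
wording or the C implementation; the free function `cutprefix` and the
`Tagged` subclass are hypothetical names:

```python
def cutprefix(s, pre):
    """Hypothetical slicing-based definition (an assumption, not the PEP text)."""
    if not isinstance(s, str) or not isinstance(pre, str):
        raise TypeError("expected str arguments")
    if pre and s.startswith(pre):
        # Slicing goes through type(s).__getitem__, so a subclass that
        # overrides __getitem__ decides what type comes back.
        return s[len(pre):]
    # No match: this sketch returns the object unchanged (allowed, not required).
    return s


class Tagged(str):
    """Hypothetical subclass that keeps its own type when sliced."""

    def __getitem__(self, key):
        return type(self)(super().__getitem__(key))


plain = cutprefix("prefix-rest", "prefix-")
tagged = cutprefix(Tagged("prefix-rest"), "prefix-")
unchanged_input = Tagged("other")
unchanged = cutprefix(unchanged_input, "prefix-")
```

With this definition a plain `str` still gets a `str` back, while a
subclass only has to override `__getitem__` rather than re-implement the
two new methods.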

Best,
Paul

P.S. Dennis - just noticed in this reply that there is a typo in the PEP
- s/instace/instance

On 3/22/20 12:15 PM, Victor Stinner wrote:
> tl;dr A method implemented in C is more efficient than hand-written
> pure-Python code, and it's less error-prone.
>
> I don't know if it has already been said previously, but I hate
> having to manually compute the string length when writing:
>
> if line.startswith("prefix"): line = line[6:]
>
> Usually what I do is to open a Python REPL and I type: len("prefix")
> and copy-paste the result :-)
>
> Passing the length directly risks a mistake. What if I write
> line[7:] and it works most of the time because of a space, but
> sometimes the space is omitted randomly and the application fails?
>
> --
>
> The lazy approach is:
>
> if line.startswith("prefix"): line = line[len("prefix"):]
>
> Such code makes my "micro-optimizer heart" bleed since I know that
> Python is stupid and calls len() at runtime; the compiler is unable to
> optimize it (sadly for good reasons, the len name can be overridden)  :-(
>
> => line.cutprefix("prefix") is more efficient! ;-) It's also shorter.
>
> Victor
>
> On Sun, 22 Mar 2020 at 17:02, Paul Ganssle <p...@ganssle.io> wrote:
>> I don't see any rationale in the PEP or in the python-ideas thread
>> (admittedly I didn't read the whole thing, I just Ctrl + F-ed "subclass"
>> there). Is this just for consistency with other methods like .casefold?
>>
>> I can understand why you'd want it to be consistent, but I think it's
>> misguided in this case. It adds unnecessary complexity for subclass
>> implementers to need to re-implement these two additional methods, and I
>> can see no obvious reason why this behavior would be necessary, since
>> these methods can be implemented in terms of string slicing.
>>
>> Even if you wanted to use `str`-specific optimizations in C that aren't
>> available if you are constrained to use the subclass's __getitem__, it's
>> inexpensive to add a "PyUnicode_CheckExact(self)" check to hit a "fast
>> path" that doesn't use slice.
>>
>> I think defining this in terms of string slicing makes the most sense
>> (and, notably, slicing itself returns `str` unless explicitly
>> overridden, so the default is for it to return `str` anyway...).
>>
>> Either way, it would be nice to see the rationale included in the PEP
>> somewhere.
>>
>> Best,
>> Paul
>>
>> On 3/22/20 7:16 AM, Eric V. Smith wrote:
>>> On 3/22/2020 1:42 AM, Nick Coghlan wrote:
>>>> On Sun, 22 Mar 2020 at 15:13, Cameron Simpson <c...@cskk.id.au> wrote:
>>>>> On 21Mar2020 12:45, Eric V. Smith <e...@trueblade.com> wrote:
>>>>>> On 3/21/2020 12:39 PM, Victor Stinner wrote:
>>>>>>> Well, if CPython is modified to implement tagged pointers and
>>>>>>> supports storing short strings (a few latin1 characters) as a
>>>>>>> pointer, it may become harder to keep the same behavior for
>>>>>>> "x is y" where x and y are strings.
>>>>> Are you suggesting that it could become impossible to write this
>>>>> function:
>>>>>
>>>>>      def myself(o):
>>>>>          return o
>>>>>
>>>>> and not be able to rely on "o is myself(o)"? That seems... a pretty
>>>>> nasty breaking change for the language.
>>>> Other way around - because strings are immutable, their identity isn't
>>>> supposed to matter, so it's possible that functions that currently
>>>> return the exact same object in some cases may in the future start
>>>> returning a different object with the same value.
>>>>
>>>> Right now, in CPython, with no tagged pointers, we return the full
>>>> existing pointer wherever we can, as that saves us a data copy. With
>>>> tagged pointers, the pointer storage effectively *is* the instance, so
>>>> you can't really replicate that existing "copy the reference not the
>>>> storage" behaviour any more.
>>>>
>>>> That said, it's also possible that identity for tagged pointers would
>>>> be value based (similar to the effect of the small integer cache and
>>>> string interning), in which case the entire question would become
>>>> moot.
>>>>
>>>> Either way, the PEP shouldn't be specifying that a new object *must*
>>>> be returned, and it also shouldn't be specifying that the same object
>>>> *can't* be returned.
>>> Agreed. I think the PEP should say that a str will be returned (in the
>>> event of a subclass, assuming that's what we decide), but if the
>>> argument is exactly a str, that it may or may not return the original
>>> object.
>>>
>>> Eric
>>>
>>> _______________________________________________
>>> Python-Dev mailing list -- python-dev@python.org
>>> To unsubscribe send an email to python-dev-le...@python.org
>>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>>> Message archived at
>>> https://mail.python.org/archives/list/python-dev@python.org/message/JHM7T6JZU56PWYRJDG45HMRBXE3CBXMX/
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>
>

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6563MZW5CPYLR6ESMIVYOS32BZ2PAFDJ/
Code of Conduct: http://python.org/psf/codeofconduct/
