[Python-ideas] Re: Custom string prefixes

Andrew Barnert via Python-ideas Thu, 29 Aug 2019 14:12:02 -0700

On Aug 29, 2019, at 07:52, Steven D'Aprano <st...@pearwood.info> wrote:
> 
>> On Thu, Aug 29, 2019 at 05:30:39AM -0700, Andrew Barnert wrote:
>>> On Aug 29, 2019, at 04:58, Steven D'Aprano <st...@pearwood.info> wrote:
>>> 
>>> - quote marks are also used for function calls, but only a limited 
>>> subset of function calls (those which take a single string literal 
>>> argument).
>> 
>> This is a disingenuous argument.
>> 
>> When you read spam.eggs, of course you know that that means to call 
>> the __getattr__('eggs') method on spam. But do you actually read it as 
>> a special method calling syntax that’s restricted to taking a single 
>> string that must be an identifier as an argument
> 
> You make a good point about abstractions, but you are missing the 
> critical point that spam.eggs *doesn't look like a string*. Things that 
> look similar should be similar; things which are different should not 
> look similar.


Which is exactly why you’d read 1.23dec or 1.23f as a number, because it looks 
like a number and also acts like a number, rather than as a function call that 
takes the string '1.23', even if you know that’s how it’s implemented.
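
To make the “that’s how it’s implemented” part concrete, here is a minimal 
sketch of what a numeric suffix might desugar to. The registry and the 
apply_suffix name are hypothetical, made up for illustration; no version of 
the proposal has specified them, and 1.23dec is not valid Python today.

```python
from decimal import Decimal

# Hypothetical registry mapping a suffix to its handler. Nothing like this
# exists in Python; it only illustrates the idea being discussed.
AFFIX_HANDLERS = {"dec": Decimal}


def apply_suffix(literal_text, suffix):
    """Simulate what a compiler might do on seeing a literal like 1.23dec."""
    return AFFIX_HANDLERS[suffix](literal_text)


# Stands in for the hypothetical literal 1.23dec.
price = apply_suffix("1.23", "dec")
total = price + 1  # behaves like a number, whatever the implementation
```

The handler receives the text '1.23', but the reader only ever deals with a 
value that looks like a number and acts like one.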

And most of the string affixes people have suggested are for string-ish things. 
I’m not sure what a “version string” is, but I might design that as an actual 
subclass of str that adds extractor methods and overrides comparison. A 
compiled regex isn’t literally a string, but neither is a bytes; it’s still 
clearly _similar_ to a string, in important ways. And so is a path, or a URL 
(although I don’t know what you’d use the url prefix for in Python, given that 
we don’t have a string-ish type like ObjC’s NSURL to return and I don’t think 
we need one, but presumably whoever wrote the url affix would be someone who 
disagreed and packaged the prefix with such a class).
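
A rough sketch of the version-string design I have in mind, assuming plain 
dotted numeric versions. The Version class and its parts/major names are 
hypothetical, not taken from any library or from the proposal.

```python
class Version(str):
    """Hypothetical str subclass for dotted numeric versions like '1.10'."""

    def parts(self):
        # Extractor: turn "1.10.2" into the tuple (1, 10, 2).
        return tuple(int(p) for p in self.split("."))

    @property
    def major(self):
        return self.parts()[0]

    # str already defines all four ordering methods, so each one has to be
    # overridden explicitly to compare numerically instead of lexically.
    def __lt__(self, other):
        return self.parts() < Version(other).parts()

    def __le__(self, other):
        return self.parts() <= Version(other).parts()

    def __gt__(self, other):
        return self.parts() > Version(other).parts()

    def __ge__(self, other):
        return self.parts() >= Version(other).parts()


old, new = Version("1.9"), Version("1.10")
```

Here new > old holds, while the plain strings compare the other way round, 
and every ordinary str method still works on both objects.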

And versions of the proposal that allow delimiters other than quotes so you can 
write things like regex/a.*b/, well, I’d need to see a specific proposal to be 
sure, but that seems even less objectionable in this regard. That looks like 
nothing else in Python, but it looks like a regex in awk or sed or perl, so I’d 
probably read it as a regex object.

> I acknowledge your point (and the OP's) that many things in Python are 
> ultimately implemented as function calls. But none of those things look 
> like strings:
> 
> - The argument to the import statement looks like an identifier 
>  (since it is an identifier, not an arbitrary string);
> 
> - The argument to __getattr__ etc looks like an identifier
>  (since it is an identifier, not an arbitrary string);
> 
> - The argument to __getitem__ is an arbitrary expression, not just
>  a string.

The arguments to the dec and f affix handlers look like numeric literals, not 
arbitrary strings.

The arguments to path and version are… probably string literal representations 
(with the quotes and all), not arbitrary strings. Although that does depend on 
the details of the specific proposal: if _any_ of your killer uses needs 
uncooked strings, then either you come up with something overcomplicated like 
C++, where you can register three different kinds of affixes, or you just always 
pass uncooked strings (because it’s trivial to cook on demand but impossible to 
de-cook).
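
A small illustration of that asymmetry. The cook helper is a hypothetical 
name, and it assumes the uncooked body contains no unescaped double quotes or 
newlines.

```python
import ast


def cook(raw_body):
    """Cook an uncooked literal body by evaluating it as a str literal.

    Assumes raw_body has no unescaped double quotes or newlines.
    """
    return ast.literal_eval('"' + raw_body + '"')


# Three different uncooked spellings...
raw_spellings = [r"\x41", r"\u0041", "A"]

# ...all cook to the same one-character string, so cooking loses the
# information about which spelling the source actually used.
cooked = [cook(raw) for raw in raw_spellings]
```

A handler that is given the uncooked text can always call something like 
cook itself; a handler that is given only 'A' cannot tell which of the three 
spellings was written.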

And the arguments to regex may be some _other_ kind of restricted special 
string that… I don’t think anyone has tried to define yet, but you can vaguely 
imagine what it would have to be like, and it certainly won’t be any arbitrary 
string.

> Let me suggest some design principles that should hold for languages 
> with more-or-less "conventional" syntax. Languages like APL or Forth 
> excluded.
> 
> - anything using ' or " quotation marks as delimiters (with or without 
>  affixes) ought to return a string, and nothing but a string;

So b"abc" should not be allowed?

Let’s say I created a native-UTF16-string type to deal with some horrible 
Windows or Java stuff. Why would this principle of yours suggest that I 
shouldn’t be allowed to use u16"" just like b""?

This is a design guideline for affixes, custom or otherwise. Which could be 
useful as a filter on the list of proposed uses, to see if any good ones remain 
(and if no string affix uses remain, then of course the proposal is either 
useless or should be restricted to just numbers or whatever), but it can’t be 
an argument against all affixes, or against custom affixes, or anything else 
generic like that.

> - as a strong preference, anything using quotation marks as delimiters
>  ought to be processed at compile-time (f-strings are a conspicuous 
>  exception to that principle);

I don’t see why you should even want to _know_ whether a literal is processed 
at compile time, much less have a strong preference.

Here are things you probably really do care about: (a) they act like strings, 
(b) they act like constants, (c) if there are potential issues parsing them, 
you see those issues as soon as possible, (d) working with them is more than 
fast enough. Compile time is neither necessary (Haskell) nor sufficient (Tcl) 
for any of that. So why insist on compile-time instead of insisting on a-d?

> No I'm not. I'm going to think of it as a *string*, because it looks 
> like a string.

Well, yes. It’s a path string, or a regex string, or a version string, or 
whatever, which is loosely a kind of string but not literally one. Like bytes.

Or it’s a sql cursor, in which case it was probably a misuse of the feature.

> Particularly given the OP's preference for single-letter prefixes.

OK, I will agree with you there that the overuse of single-letter prefixes in 
the motivating examples is a worrying sign. In principle there’s nothing wrong 
with single letters (and I think I can make a good case for the f suffix as a 
good use in 3D-math code). And a program that used a whole ton of version 
strings and version string constants might find it useful to use v instead of 
ver. But I’m having a hard time imagining such a program existing. (Even 
something like pip or the PyPI backend might have lots of version strings, but 
why would it have lots of version string constants?)

So, maybe that’s a sign that the OP’s eventual detailed set of use cases is not 
going to make me happy. Of course the burden is on the proposer, and if every 
proposed string affix use case ends up looking bad, then I’d either oppose the 
proposal or suggest that it be restricted to numeric affixes or something.

But that’s not a reason to reject the proposal before seeing it, or to argue 
that whatever it is can’t conceivably be good because of [some posited 
universal principle that doesn’t even hold in Python today].

> 1.23f doesn't look like a string, it looks like a number. I have no 
> objection to that in principle, although of course there is a question 
> whether float32 is important enough to justify either builtin syntax or 
> custom, user-defined syntax.

As I’ve said before, I believe that anything that doesn’t have a builtin type 
does not deserve builtin syntax. And I don’t understand why that isn’t a 
near-ubiquitous viewpoint. But it’s not just you; at least three people (all of 
whom dislike the whole concept of custom affixes) seem at least in principle 
open to the idea of adding builtin affixes for types that don’t exist. Which 
makes me think it’s almost certainly not that you’re all crazy, but that I’m 
missing something important. Can you explain it to me?
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/GZF2UHWTJNNREOMUEB3HB5BISNHYXFZH/
Code of Conduct: http://python.org/psf/codeofconduct/
