[Python-ideas] Re: Custom string prefixes

Andrew Barnert via Python-ideas Wed, 28 Aug 2019 17:20:57 -0700

On Aug 28, 2019, at 01:05, Paul Moore <p.f.mo...@gmail.com> wrote:
> 
> On Wed, 28 Aug 2019 at 05:04, Andrew Barnert via Python-ideas
> <python-ideas@python.org> wrote:
>> What matters here is not whether things like the OP’s czt'abc' or my 1.23f 
>> or 1.23d are literals to the compiler, but whether they’re readable ways to 
>> enter constant values to the human reader.
>> 
>> If so, they’re useful. Period.
>> 
>> Now, it’s possible that even though they’re useful, the feature is still not 
>> worth adding because of Chris’s issue that it can be abused, or because 
>> there’s an unavoidable performance cost that makes it a bad idea to rely on 
>> them, or because they’re not useful in _enough_ code to be worth the effort, 
>> or whatever. Those are questions worth discussing. But arguing about whether 
>> they meet (one of the three definitions of) “literal” is not relevant.
> 
> Extended (I'm avoiding the term "custom" for now) literals like 0.2f,
> 3.14D, re/^hello.*/ or qw{a b c} have a fairly solid track record in
> other languages, and I think in general have proved both useful and
> straightforward in those languages. And even in Python, constructs
> like f-strings and complex numbers are examples of such things.
> However, I know of almost no examples of other languages that have
> added *user-definable* literal types (with the notable exception of
> C++, and I don't believe I've seen use of that feature in user code -
> which is not to say that it's not used). That to me says that there
> are complexities in extending the question to user-defined literals
> that we need to be careful of.


Agreed 100%. That’s why I think we need a more concrete proposal, that includes 
at least some thought on implementation, before we can go any farther, as I 
said in my first reply.

The OP wanted to get some feeling of whether at least some people might find 
some version of this useful before going further. I think we’ve got that now 
(the fact that not 100% of the responders agree doesn’t change that), so we 
need to get more detailed now.

My own proposal was just to answer the charge that any design will inherently 
be impossible or magical or complicated by giving a design that is none of 
those. It shouldn’t be taken as any more than that. If there are good use cases 
for prefixes, prefixes plus suffixes, etc., then my proposal can’t get you 
there, so let’s wait for the OP’s

> Some specific
> questions which would need to be dealt with:
> 
> 1. What is valid in the "literal" part of the construct (this is the
> p"C:\" question)?

I think this pretty much has to be either (a) exactly what’s valid in the 
equivalent literals today, or (b) something equally simple to describe, and 
parse, even if it’s different (like really-raw strings, or perlesque regex with 
delimiters other than quotes, or whatever).

Either way, I think you want to use the same rule for all affixed literals, not 
allow a choice of different ones like C++ does.
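
To make option (a) concrete, here is a small sketch of what "exactly what's valid in the equivalent literals today" implies for the p"C:\" question. It only uses today's string literals; the `compiles` helper is a name made up for this illustration, and the hypothetical `p` prefix itself is not modelled.

```python
def compiles(source):
    """Return True if `source` is a valid Python expression today."""
    try:
        compile(source, "<demo>", "eval")
    except SyntaxError:
        return False
    return True


# Each key is Python source text, shown as it would appear in a file.
examples = {
    'r"C:\\"': "raw string ending in a single backslash",
    '"C:\\"': "normal string ending in a single backslash",
    '"C:\\\\"': "normal string with the backslash doubled",
    'r"C:\\temp"': "raw string, backslash kept as-is",
    '"C:\\temp"': "normal string, backslash-t becomes a tab",
}

for source, description in examples.items():
    status = "valid" if compiles(source) else "SyntaxError"
    print(f"{source:12} {status:12} ({description})")
```

Under rule (a), a path affix applied to an ordinary string body would inherit both the trailing-backslash restriction and the escape processing, which is why a different but equally simple rule (b) might be worth considering.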

> 2. How do definitions of literal syntax get brought into scope in time
> for the parser to act on them (this is about "import xyz_literal"
> making xyz"a string" valid but leaving abc"a string" as a syntax
> error)?

I don’t know that this is actually necessary. If `abc"a string"` raises an 
error at execution time rather than compile time, yes, that’s different from 
how most syntax errors work today, but is it really unacceptable? (Notice that 
in the most typical case, the error still gets raised from importing the module 
or from the top level of the script—but that’s just the most typical case, not 
all cases—you could get those errors from, say, calling a method, which you 
don’t normally expect.)
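
As a rough sketch of what the execution-time error could look like, assume (purely hypothetically) that the compiler rewrote `xyz"a string"` into a registry lookup plus a call. The names `AFFIXES`, `register_affix` and `apply_affix` are invented for this illustration and are not part of any actual proposal.

```python
# Hypothetical sketch: none of these names exist in Python.
# Assume the compiler rewrites  xyz"a string"  into  apply_affix("xyz", "a string").

AFFIXES = {}  # hypothetical runtime registry: affix name -> constructor


def register_affix(name, func):
    """Make `name` usable as an affix (stand-in for whatever an import would do)."""
    AFFIXES[name] = func


def apply_affix(name, text):
    """Look the affix up when the expression is evaluated, not when it is compiled."""
    try:
        func = AFFIXES[name]
    except KeyError:
        raise NameError(f"unknown string affix {name!r}") from None
    return func(text)


register_affix("xyz", str.upper)


class Greeter:
    def ok(self):
        return apply_affix("xyz", "a string")  # stands in for xyz"a string"

    def late_error(self):
        return apply_affix("abc", "a string")  # stands in for abc"a string"


g = Greeter()  # defining and instantiating the class raises nothing
print(g.ok())
try:
    g.late_error()  # the error only shows up once the method is called
except NameError as exc:
    print("late error:", exc)
```

The class body is accepted without complaint, and the unknown affix only surfaces when `late_error` runs, which is the "calling a method" case described above.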

There’s clearly a trade-off here, because the only other alternative (at least 
that I’ve thought of or seen from anyone else; I’d love to be wrong) is that 
what you’ve imported and/or registered affects how later imports work (and 
doesn’t that mean some kind of registry hash needs to get encoded in .pyc files 
or something too?). While that is normal for people who use import hooks, most 
people don’t use import hooks most of the time, and I suspect that weirdness 
would be more off-putting than the late errors.

Another big one: How do custom prefixes interact with builtin string prefixes? 
For suffixes, there’s no problem suffixing, say, a b-string, but for prefixes, 
there is. If this is going to be allowed, there are multiple ways it could be 
designed, but someone has to pick one and specify it.

(Actually, for suffixes, there _is_ a similar issue: is `1.2jd` a `d` suffix on 
the literal `1.2j`, or a `jd` suffix on `1.2`? I think the former, because it’s 
a trivially simple rule that doesn’t need to touch any of the rest of the 
grammar. Plus, not only is it likely to never matter, but in the rare cases 
where it does matter, I think it’s the rule you’d want. For example, if I 
created my own ComplexDecimal class and wanted to use a suffix for it, why 
would I want to define both `d` and `jd` instead of just defining `d` and 
having it work with imaginary literals?)
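
A minimal sketch of that rule, assuming the tokenizer first matches the longest existing numeric literal and then treats any identifier that follows as the suffix. The `NUMBER` pattern is deliberately simplified (decimal only, no underscores or hex), and `split_suffix` is a name invented for this illustration.

```python
import re

# Simplified decimal literal, followed by an optional identifier-like suffix.
# The imaginary marker j/J is consumed as part of the literal, not the suffix.
NUMBER = re.compile(
    r"""
    (?P<literal>
        (?:\d+\.\d*|\.\d+|\d+)   # integer or fractional digits
        (?:[eE][+-]?\d+)?        # optional exponent
        [jJ]?                    # optional imaginary marker
    )
    (?P<suffix>[A-Za-z_]\w*)?    # optional user-defined suffix
    $
    """,
    re.VERBOSE,
)


def split_suffix(token):
    """Split a numeric token into (existing literal, suffix or None)."""
    match = NUMBER.match(token)
    if match is None:
        raise ValueError(f"not a numeric token: {token!r}")
    return match.group("literal"), match.group("suffix")


for token in ["1.23d", "1.2j", "1.2jd", "1e5f"]:
    print(token, "->", split_suffix(token))
```

With this rule `1.2jd` splits into the literal `1.2j` and the suffix `d`, so a single `d` handler would see both real and imaginary literals.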

> These questions also fundamentally affect other tools like IDEs,
> linters, code formatters, etc.

Good point.

I was thinking that any rule that’s simple enough for Python and humans to 
parse will probably be reasonably simple for other tools, and any rule that 
isn’t simple enough for Python and humans is probably a non-starter anyway.

But the “look up affixes at compile time” idea is an example of something that 
would be easy for Python and for humans but difficult for single-file-at-a-time 
tools, so this can be important.

> In addition, there is the question of how user-defined literals would
> get turned into constants within the code. In common with list
> expressions, tuples, etc, user-defined literals would need to be
> handled as translating into runtime instructions for constructing the
> value (i.e., a function call). But people typically don't expect
> values that take the form of a literal like this to be "just" syntax
> sugar for a function call. So there's an education issue here. Code
> will get errors at runtime that the users might have expected to
> happen at compile time, or in the linter.

I really don’t think this one is a serious issue. Many people never need to 
learn that -2, 1+2j, (1,2), etc. are not literals, or which of those get 
optimized by CPython and packed into co_consts anyway, or which things that 
don’t even look like literals get similarly optimized. So how often will they 
need to know whether 1.23f is a literal, not a literal but optimized into a 
const, or neither?
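
For anyone who does want to check, here is a small sketch using the standard `ast` module (assuming Python 3.8 or later, where literals parse to `ast.Constant`). The `node_kind` helper is a name made up for this example. The contents of `co_consts` are a CPython implementation detail, so they are printed rather than relied on.

```python
import ast


def node_kind(source):
    """Return the name of the AST node type that an expression parses to."""
    return type(ast.parse(source, mode="eval").body).__name__


# Only the first of these is a literal as far as the grammar is concerned.
for source in ["2", "-2", "1+2j", "(1, 2)"]:
    print(f"{source:8} -> {node_kind(source)}")

# What the compiler then folds into constants is implementation-specific.
code = compile("x = -2; y = 1+2j; z = (1, 2)", "<demo>", "exec")
print(code.co_consts)
```

The parser reports `Constant`, `UnaryOp`, `BinOp` and `Tuple` respectively, yet most people writing `-2` never need to know that.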

> Also, it's worth noting that the benefits of *user-defined* literals
> are *not* the same as the benefits of things like 0.2f, or 3.14d, or
> even re/^hello.*/. Those things may well be useful. But the benefit
> you gain from *user-defined* literals is that of letting the end user
> make the design decisions, rather than the language designer. And
> that's a subtly different thing.

That’s a good point, but I think you’re missing something big here.

Think about it this way: assuming f and frac and dec and re and sql and so on 
are useful, our options are:

1) people don’t get a useful feature
2) we add user-defined affixes
3) we add all of these as builtin affixes

While #3 theoretically isn’t impossible, it’s wildly implausible, and probably 
a bad idea to boot, so the realistic choice is between 1 and 2.

Now, you’re right that choice 2 inherently means that we’re putting a new 
design decision on the end user (or library designer). That is definitely a 
factor to be weighed in the decision. But I don’t think it’s an immediate 
disqualifying factor. And in fact, if the feature is properly designed to be 
restrictive enough (but not too restrictive) I don’t think it will even end up 
being that big of a deal. There are all kinds of things that we leave up to the 
user, from the trivial (e.g., in Haskell, a capital letter means a type rather 
than a value; in Python it’s entirely up to each project whether it means 
anything at all) to the drastic but rarely used (import hooks probably being 
the most extreme). This one isn’t going to be trivial, but I think it will fall 
much closer to the less-disruptive side than many people are assuming. It’s 
only going to touch a small part of the grammar, and the language in general. 
(And if that turns out not to be true of the actual proposal, then I probably 
won’t support the actual proposal.)

> So, to summarise, the real problem with user defined literal proposals
> is that the benefit they give hasn't yet proven sufficient to push
> anyone to properly address all of the design-time details. We keep
> having high-level "would this be useful" debates, but never really
> focus on the key question, of what, in precise detail, is the "this"
> that we're talking about - so people are continually making arguments
> based on how they conceive such a feature might work. A really good
> example here is the p"C:\" question. Is the proposal that the "string
> part" of the literal is just a normal string? If so, then how do you
> address this genuine issue that not all paths are valid? What about
> backslash-escapes (p"C:\temp")? Is the string a raw string or not? If
> the proposal is that the path-literal code can define how the string
> is parsed, then *how does that work*?
> 
> The OP even made this point explicitly:
> 
>> I'm not discussing possible implementation of this feature just yet, we can 
>> get to
>> that point later when there is a general understanding that this is worth 
>> considering.
> 
> I don't think we *can* agree on much without the implementation
> details (well, other than "yes, it's worth discussing, but only if
> someone proposes a properly specified design"

Again, agreed.

_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/UK3MP6ZHUPKZSYUGUDQKS2E56KROF7WZ/
Code of Conduct: http://python.org/psf/codeofconduct/
