On 17 Aug 2015 08:13, Barry Warsaw wrote:
On Aug 18, 2015, at 12:58 AM, Chris Angelico wrote:

The linters could tell you that you have no 'end' or 'start' just as
easily when it's in that form as when it's written out in full.
Certainly the mismatched brackets could easily be caught by any sort
of syntax highlighter. The rules for f-strings are much simpler than,
say, the PHP rules and the differences between ${...} and {$...},
which I've seen editors get wrong.

I'm really asking whether it's technically feasible and realistically possible
for them to do so.  I'd love to hear from the maintainers of pyflakes, pylint,
Emacs, vim, and other editors, linters, and other static analyzers on a rough
technical assessment of whether they can support this and how much work it
would be.

With the current format string proposals (allowing arbitrary expressions) I think I'd implement it in our parser with a FORMAT_STRING_TOKEN, a FORMAT_STRING_JOIN_OPERATOR and a FORMAT_STRING_FORMAT_OPERATOR.

A FORMAT_STRING_TOKEN would be started by f followed by any of the quote forms (', ", ''' or """) and ended either by the matching closing quote or just before an open brace that is not escaped.

A FORMAT_STRING_JOIN_OPERATOR would be emitted for the '{', which we'd colour either as part of the string or in the regular brace colour. This also opens a parsing context in which a colon becomes the FORMAT_STRING_FORMAT_OPERATOR and the right-hand side of that binary operator is another FORMAT_STRING_TOKEN. The final close brace becomes another FORMAT_STRING_JOIN_OPERATOR, and the rest of the string is a FORMAT_STRING_TOKEN.

So it'd translate something like this:

f"This {text} is my {string:>{length+3}}"

FORMAT_STRING_TOKEN[f"This ]
FORMAT_STRING_JOIN_OPERATOR[{]
IDENTIFIER[text]
FORMAT_STRING_JOIN_OPERATOR[}]
FORMAT_STRING_TOKEN[ is my ]
FORMAT_STRING_JOIN_OPERATOR[{]
IDENTIFIER[string]
FORMAT_STRING_FORMAT_OPERATOR[:]
FORMAT_STRING_TOKEN[>]
FORMAT_STRING_JOIN_OPERATOR[{]
IDENTIFIER[length]
OPERATOR[+]
NUMBER[3]
FORMAT_STRING_JOIN_OPERATOR[}]
FORMAT_STRING_TOKEN[]
FORMAT_STRING_JOIN_OPERATOR[}]
FORMAT_STRING_TOKEN["]
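
To make the idea concrete, here's a rough sketch in plain Python (not our actual parser, and ignoring doubled braces, escapes and string literals inside the expressions) of a scan that produces that stream for a single-line f-string. The expression parts come out as one EXPRESSION token each; a real tool would feed them back through its normal expression tokenizer to get the IDENTIFIER/OPERATOR/NUMBER tokens shown above.

def tokenize_fstring(literal):
    """Produce the detailed token stream sketched above for a
    single-line f-string literal."""
    tokens, _ = _span(literal, 0, stop_on_close=False)
    return tokens

def _span(s, i, stop_on_close):
    # Literal text interleaved with {...} fields, ending at the end of
    # the string or (inside a format spec) at the matching '}'.
    tokens, text = [], ""
    while i < len(s):
        c = s[i]
        if c == "}" and stop_on_close:
            break
        if c == "{":
            tokens += [("FORMAT_STRING_TOKEN", text),
                       ("FORMAT_STRING_JOIN_OPERATOR", "{")]
            text = ""
            field, i = _field(s, i + 1)
            tokens += field
            tokens.append(("FORMAT_STRING_JOIN_OPERATOR", "}"))
            i += 1                      # step over the field's '}'
            continue
        text += c
        i += 1
    tokens.append(("FORMAT_STRING_TOKEN", text))
    return tokens, i

def _field(s, i):
    # One replacement field: an expression, optionally followed by ':'
    # and a format spec that may itself contain nested fields.
    # Returns the tokens and the index of the field's closing '}'.
    depth, expr = 0, ""
    while depth or s[i] not in ":}":
        depth += s[i] in "([{"
        depth -= s[i] in ")]}"
        expr += s[i]
        i += 1
    tokens = [("EXPRESSION", expr)]
    if s[i] == ":":
        tokens.append(("FORMAT_STRING_FORMAT_OPERATOR", ":"))
        spec, i = _span(s, i + 1, stop_on_close=True)
        tokens += spec
    return tokens, i

for kind, text in tokenize_fstring('f"This {text} is my {string:>{length+3}}"'):
    print("{}[{}]".format(kind, text))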

I *believe* (without having tried it) that this would let us produce a valid tokenisation (in our model) without too much difficulty, and highlight/analyse correctly, including validating matching braces. Getting the precedence correct on the operators might be more difficult, but we may also just produce an AST that looks like a function call, since that will give us "good enough" handling once we're past tokenisation.
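
For instance (purely to illustrate the shape, not necessarily what the compiler will emit), the example string above behaves like an ordinary str.format call, and a call-shaped AST node like that is easy for existing analysis to digest:

text, string, length = "example", "value", 7

# Roughly the call-shaped equivalent of the f-string above: the three
# expressions become the arguments and the literal segments the template.
print("This {} is my {:>{}}".format(text, string, length + 3))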

A simpler tokenisation that would probably be sufficient for many editors would be to treat the first and last segments ([f"This {] and [}"]) as groupings and each intervening section of text as a separator, giving this:

OPEN_GROUPING[f"This {]
EXPRESSION[text]
COMMA[} is my {]
EXPRESSION[string]
COMMA[:>{]
EXPRESSION[length+3]
COMMA[}}]
CLOSE_GROUPING["]

Initial parsing may be a little harder, but it should mean less trouble when expressions spread across multiple lines, since that is already handled for other types of groupings. And if any code analysis is occurring, it should be happening for dict/list/etc. contents already, and so format strings will get it too.
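
As a sketch of that simpler scheme (again rough, glossing over doubled braces, strings inside the expressions, and so on), a single pass with one "am I inside an expression?" flag is enough to reproduce the stream above, except that here the trailing '}}' and the closing quote fall into one CLOSE_GROUPING token:

def group_fstring(literal):
    # Split an f-string literal into OPEN_GROUPING / EXPRESSION / COMMA /
    # CLOSE_GROUPING tokens, treating every literal span (text, braces,
    # colons, format specs) as a separator between expressions.
    tokens, span, depth, in_expr = [], "", 0, False
    for c in literal:
        if in_expr and depth == 0 and c in ":}":
            # A top-level ':' or '}' ends the current expression.
            tokens.append(("EXPRESSION", span))
            span, in_expr = c, False
        elif not in_expr and c == "{":
            # Everything up to and including the '{' is a separator.
            kind = "OPEN_GROUPING" if not tokens else "COMMA"
            tokens.append((kind, span + c))
            span, in_expr = "", True
        else:
            if in_expr:
                depth += c in "([{"
                depth -= c in ")]}"
            span += c
    tokens.append(("CLOSE_GROUPING", span))
    return tokens

for kind, text in group_fstring('f"This {text} is my {string:>{length+3}}"'):
    print("{}[{}]".format(kind, text))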

So I'm confident we can support it, and I expect either of these two approaches will work for most tools without too much trouble. (There's also a middle ground where you create new tokens for format string components, but combine them like the second example.)

Cheers,
Steve
