On 21 September 2015 at 05:22, Eric V. Smith <e...@trueblade.com> wrote:
>
>> On Sep 20, 2015, at 11:15 AM, Serhiy Storchaka <storch...@gmail.com> wrote:
>>
>>> On 20.09.15 16:51, Eric V. Smith wrote:
>>>> On 9/20/2015 8:37 AM, Nick Coghlan wrote:
>>>>> On 19 September 2015 at 21:03, Eric V. Smith <e...@trueblade.com> wrote:
>>>>> Instead of calling __format__, I've changed the code generator to call
>>>>> format(expr1, spec1). As an optimization, I might add special opcodes to
>>>>> deal with this and string concatenation, but that's for another day (if
>>>>> ever).
>>>>
>>>> Does this mean overriding format at the module level or in builtins
>>>> will affect the way f-strings are evaluated at runtime? (I don't have
>>>> a strong preference one way or the other, but I think the PEP should
>>>> be explicit as to the expected behaviour rather than leaving it as
>>>> implementation defined).
>>>
>>> Yes, in the current implementation, if you mess with format(), str(),
>>> repr(), or ascii() you can break f-strings. The latter 3 are used to
>>> implement !s, !r, and !a.
>>>
>>> I have a plan to change this, by adding one or more opcodes to implement
>>> the formatting and string joining. I'll defer a decision on updating the
>>> PEP until I can establish the feasibility (and desirability) of that
>>> approach.
>>
>> I propose to add internal builting formatter type. Instances should be
>> marshallable and callable. The code generated for f-strings should just load
>> formatter constant and call it with arguments. The formatter builds
>> resulting string by concatenating literal strings and results of formatting
>> arguments with specified specifications.
>>
>> Later we could change compiler (just peephole optimizer?) to replace
>> literal_string.format(*args) and literal_string % args with calling
>> precompiled formatter.
>>
>> Later we could rewrite str.format, str.__mod__ and re.sub to create
>> temporary formatter object and call it.
>>
>> Later we could expose public API for creating formatter object. It can be
>> used by third-party template engines.
>>
>
> I think this is InterpolationTemplate from PEP 501.

It's certainly a similar idea, although PEP 501 just proposed storing
strings and tuples on the code object, with the interpolation template
itself still being a mutable object constructed at runtime. Serhiy's
suggestion goes a step further to suggest making the template itself
immutable, and passing in all the potentially mutable data as method
arguments.

I think there's a simpler approach available though, which is to go
the way we went in introducing first the __import__ builtin and later
the __build_class__ builtin to encapsulate some of the complexity of
their respective statements without requiring a raft of new opcodes.

The last draft of PEP 501 before I deferred it proposed the following
for interpolation templates, since it was able to rely on having
f-strings available as a primitive and wanted to offer more
flexibility than string formatting needs:

    _raw_template = "Substitute {names} and {expressions()} at runtime"
    _parsed_template = (
        ("Substitute ", "names"),
        (" and ", "expressions()"),
        (" at runtime", None),
    )
    _field_values = (names, expressions())
    _format_specifiers = (f"", f"")
    template = types.InterpolationTemplate(_raw_template,
                                           _parsed_template,
                                           _field_values,
                                           _format_specifiers)

A __format__ builtin (or a dedicated opcode) could use a simpler data
model that consisted of the following constant and variable elements:

    Compile time constant: tuple of (<leading_text>,
        <tuple_of_leading_specifier_elements>) pairs
    Runtime variable: tuple of (<substitution_field_value>,
        <tuple_of_specifier_substitution_field_values>) pairs

If the format string didn't end with a substitution field, then the
runtime variable tuple would be 1 element shorter than the constant
tuple.

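To make that data model concrete, here is a rough sketch of what the two tuples might look like for the example template above. The names constant_parts, variable_parts and lengths_are_consistent, the sample values, and the exact shape of the specifier elements (a single empty string for a field with no format spec, an empty tuple after trailing text) are all assumptions made for illustration; nothing here is settled by either PEP.

```python
# Hypothetical illustration of the proposed data model for
#   f"Substitute {names} and {expressions()} at runtime"
# Every name and value below is invented for this example.

names = ["a", "b"]


def expressions():
    return 42


# Compile time constant: (leading_text, specifier_elements) pairs.
# Assumption: a field with no format spec carries one empty string,
# and the trailing text (which no field follows) carries nothing.
constant_parts = (
    ("Substitute ", ("",)),
    (" and ", ("",)),
    (" at runtime", ()),
)

# Runtime variable: (field_value, specifier_field_values) pairs.
# Neither field has substitutions inside its format spec, so the
# second element of each pair is empty.
variable_parts = (
    (names, ()),
    (expressions(), ()),
)


def lengths_are_consistent(constant_parts, variable_parts):
    """Check the length relationship between the two tuples.

    The constant tuple has one extra entry when the template ends
    with literal text; equal lengths are assumed to mean that the
    template ends with a substitution field.
    """
    extra = len(constant_parts) - len(variable_parts)
    return extra in (0, 1)


print(lengths_are_consistent(constant_parts, variable_parts))
```

The point of the split is that constant_parts can be stored once on the code object, while only variable_parts has to be rebuilt each time the expression is evaluated.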
With that approach, __format__ (or an opcode that popped these two
tuples directly off the stack) could be defined as something like:

    def __format__(constant_parts, variable_parts):
        num_fields = len(variable_parts)
        segments = []
        for idx, (leading_text, specifier_constants) in enumerate(constant_parts):
            segments.append(leading_text)
            if idx < num_fields:
                field_value, specifier_variables = variable_parts[idx]
                if specifier_variables:
                    specifier = __format__(specifier_constants,
                                           specifier_variables)
                else:
                    assert len(specifier_constants) == 1
                    specifier = specifier_constants[0]
                if specifier.startswith("!"):
                    # Handle "!a", "!r", "!s" by modifying field_value
                    # *and* specifier
                    pass
                # An empty specifier gives the default formatting
                segments.append(format(field_value, specifier))
        return "".join(segments)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia