Nick Coghlan wrote:
[snip]
> The rules for name fields would then become:
>
> 1. Numeric fields start with a digit and are terminated by any
>    non-numeric character.
>
> 2. An identifier name field is terminated by any one of:
>    '}' (terminates the replacement field)
>    '!' (terminates identifier field, starts conversion specifier)
>    ':' (terminates identifier field, starts format specifier)
>    '.' (terminates identifier field, starts new identifier field for
>    subattribute)
>    '[' (terminates identifier field, starts index field)
>
> 3. An index field is terminated by ']' (subsequent character will
>    determine next field)

+1

> That second set of rules is *far* more in line with the behaviour of
> the rest of the language than the status quo, so unless the difficulty
> of making the str.format mini-language parser work that way is truly
> prohibitive, it certainly seems worthwhile to tidy up the semantics.
>
> The index field behaviour should definitely be fixed, as it poses no
> backwards compatibility concerns. The brace matching behaviour should
> probably be left alone, as changing it would potentially break
> currently valid format strings (e.g. "{a{0}}".format(**{'a{0}':1})
> produces '1' now, but would raise an exception if the brace matching
> rules were changed).

-1 for leaving the brace matching behavior alone, as it's very
unintuitive for *the user*. For the implementor it may make sense to
count matching braces, but definitely not for the user. I don't believe
that "{a{0}}" is a real use case that anyone already relies on, as it's
a hard violation of what the documentation currently says.

I'd rather disallow braces in the replacement field before the format
specifier altogether, or at least closing braces. Furthermore, the
double-escaping sounds reasonable in the format specifier, but not
elsewhere.

My motivation is that the user should be able to glance at the format
string and see where the replacement fields are. This is probably what
the PEP intends to say when disallowing braces inside the replacement
field.

In my opinion, it's easy to write the parser so that braces are handled
in whatever way we choose. Or maybe not easy, but no harder than any
other way of handling braces.

> So +1 on making the str.format parser accept anything other than ']'
> inside an index field and turn the whole thing into an ordinary
> string, -1 on making any other changes to the brace-matching
> behaviour.
>
> That would leave us with the following set of rules for name fields:
>
> 1. Numeric fields start with a digit and are terminated by any
>    non-numeric character.
>
> 2. An identifier name field is terminated by any one of:
>    '}' (terminates the replacement field, unless preceded by a
>    matching '{' character, in which case it is ignored and included in
>    the string)
>    '!' (terminates identifier field, starts conversion specifier)
>    ':' (terminates identifier field, starts format specifier)
>    '.' (terminates identifier field, starts new identifier field for
>    subattribute)
>    '[' (terminates identifier field, starts index field)
>
> 3. An index field is terminated by ']' (subsequent character will
>    determine next field)
>
> Note that brace-escaping currently doesn't work inside name fields, so
> that should also be fixed:
>
> >>> "{0[{{]}".format({'{':1})
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: unmatched '{' in format
> >>> "{a{{}".format(**{'a{':1})
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: unmatched '{' in format

-1. Why do we need braces inside replacement fields at all (except for
inner replacements in the format specifier)? I strongly believe that
the PEP's use case is the simple one:

    '{foo}'.format(foo=10)

In my opinion, these '{!#%}'.format(**{'!#%': 10}) cases are not real.
The current documentation requires field_name to be a valid identifier,
and this is a sane requirement. The only problem is that parsing
identifiers correctly is very hard, so the parser can be made simpler
by accepting some non-identifiers. But we still don't have to accept
braces.

---

As a somewhat separate issue, I'm confused about this:

    >>> '{a[1][2]}'.format(a={1:{2:3}})
    '3'

and even more about this:

    >>> '{a[1].foo[2]}'.format(a={1:namedtuple('x', 'foo')({2:3})})
    '3'

Why does this work? It's against the current documentation. The
documented syntax only allows zero or one attribute name and zero or
one element index, in this order.
Is it intentional that we allow arbitrary chains of getattr and
__getitem__? If we do, this should be documented, too.

Petri
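
The chained lookup discussed above can be reproduced with the public
string.Formatter class, which makes it easier to see where the chain of
attribute and index accesses is resolved. This is only a minimal
sketch: it assumes that Formatter.get_field walks the field name the
same way str.format does, and the Record namedtuple and the data dict
are made-up sample values, not anything from the thread.

```python
from collections import namedtuple
from string import Formatter

# Hypothetical sample data, shaped like the example in the message above.
Record = namedtuple("Record", "foo")
data = {1: Record(foo={2: 3})}

formatter = Formatter()

# parse() yields (literal_text, field_name, format_spec, conversion)
# tuples; the whole accessor chain arrives as a single field_name string.
parsed = list(formatter.parse("{a[1].foo[2]}"))

# get_field() resolves that chain against the arguments and returns
# (object, first_key), where first_key is the leading name or index.
value, first_key = formatter.get_field("a[1].foo[2]", (), {"a": data})

# str.format gives the same answer for the same chain.
result = "{a[1].foo[2]}".format(a=data)

print(parsed)            # [('', 'a[1].foo[2]', '', None)]
print(value, first_key)  # 3 a
print(result)            # 3
```

Note that the digit-only index keys ("1" and "2") are looked up as
integers, which is why the dicts above use int keys.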