+1 on adding something like parse to the language. -0 on the assignment feature... it just doesn't seem that beneficial to me. But once parse exists in the language, a rational and limited conversation about the f-string assignment feature becomes much more possible.
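
To make "something like parse" concrete, here is a rough sketch of the shape I have in mind: a plain function that returns its matches rather than touching locals. To be clear, this is only an illustration built on re, not the parse package's implementation or API. The name simple_parse is made up, and it only understands {name} (shortest-match string) and {name:d} (integer) fields.

```python
import re

# Hypothetical helper for illustration only; NOT the parse package's API.
# Recognizes "{name}" and "{name:d}" fields and nothing else.
_FIELD = re.compile(r"\{([A-Za-z_]\w*)(?::(d))?\}")


def simple_parse(template, text):
    """Return a dict of matched fields, or None if text does not fit template."""
    regex_parts = []
    converters = {}
    pos = 0
    for field in _FIELD.finditer(template):
        # Literal text between fields must match exactly.
        regex_parts.append(re.escape(template[pos:field.start()]))
        name, spec = field.group(1), field.group(2)
        if spec == "d":
            # Integer field: an optional sign followed by digits, converted with int().
            regex_parts.append(f"(?P<{name}>[-+]?\\d+)")
            converters[name] = int
        else:
            # Untyped field: non-greedy, so the leftmost field gets the shortest match.
            regex_parts.append(f"(?P<{name}>.+?)")
            converters[name] = str
        pos = field.end()
    regex_parts.append(re.escape(template[pos:]))

    match = re.fullmatch("".join(regex_parts), text, flags=re.DOTALL)
    if match is None:
        return None
    return {name: converters[name](value) for name, value in match.groupdict().items()}


print(simple_parse("{x}:{y}", "a:b:c"))                     # {'x': 'a', 'y': 'b:c'}
print(simple_parse("{hh:d}:{mm:d}:{ss:d}", "11:59:59"))     # {'hh': 11, 'mm': 59, 'ss': 59}
print(simple_parse("A {animal}?", "She turned me into a newt."))  # None
```

Since the result is just a dict (or None), it can be inspected and tested like any other return value, and for these simple cases feeding it back through str.format with the same template reproduces the input string.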

On Sat, Sep 19, 2020, 3:47 PM Christopher Barker <python...@gmail.com> wrote:

> On Sat, Sep 19, 2020 at 12:10 PM Wes Turner <wes.tur...@gmail.com> wrote:
>
>> Regex uses the ? symbol to indicate that something is a "non-greedy"
>> match (to default to "shortest match")
>>
>
> exactly -- Regex was designed to be a parsing language, format specifiers
> were not.
>
> I'm quite surprised by how little the parse package has had to adapt the
> format language to a parsing language, but it has indeed adapted it. I'm
> honestly not sure how confusing it would be to have a built-in parsing
> language that looks like the format one, but behaves differently. I suspect
> it's particularly an issue if we did assigning to f-strings, and less so if
> it were a string method or standalone function.
>
> Trying parse with my earlier example in this thread:
>
> In [1]: x, y, z = 23, 45, 67
>
> In [2]: a_string = f"{x}{y}{z}"
>
> In [3]: a_string
> Out[3]: '234567'
>
> In [4]: from parse import parse
> In [5]: parse("{x}{y}{z}", a_string)
> Out[5]: <Result () {'x': '2', 'y': '3', 'z': '4567'}>
>
> In [6]: parse("{x:d}{y:d}{z:d}", a_string)
> Out[6]: <Result () {'x': 2345, 'y': 6, 'z': 7}>
>
> So that's interesting -- different level of "greediness" for strings than
> integers
>
> In [7]: parse("{x:2d}{y:2d}{z:2d}", a_string)
> Out[7]: <Result () {'x': 23, 'y': 45, 'z': 67}>
>
> And now we get back what we started with -- not bad.
>
> I'm liking this -- I think it would be good to have parse, or something
> like it, in the stdlib, maybe as a string method.
>
> Then maybe consider some auto-assigning behavior -- though I'm pretty
> sceptical of that, and Wes' point about debugging is a good one. It would
> create a real debugging / testing nightmare to have stuff auto-assigned
> into locals.
>
> -CHB
>
>
>> import re
>> str_ = "a:b:c"
>> assert re.match(r'(.*):(.*)', str_).groups() == ("a:b", "c")
>> assert re.match(r'(.*?):(.*)', str_).groups() == ("a", "b:c")
>>
>> Typically, debugging parsing issues involves testing the output of a
>> function (not changes to locals()).
>>
>> Parse defaults to (case-insensitive) non-greedy/shortest-match:
>>
>> > parse() will always match the shortest text necessary (from left to
>> > right) to fulfil the parse pattern, so for example:
>>
>> > >>> pattern = '{dir1}/{dir2}'
>> > >>> data = 'root/parent/subdir'
>> > >>> sorted(parse(pattern, data).named.items())
>> > [('dir1', 'root'), ('dir2', 'parent/subdir')]
>>
>> > So, even though {'dir1': 'root/parent', 'dir2': 'subdir'} would also
>> > fit the pattern, the actual match represents the shortest successful
>> > match for dir1.
>>
>> https://github.com/r1chardj0n3s/parse#potential-gotchas
>>
>> https://github.com/r1chardj0n3s/parse#format-specification :
>>
>> > Note: attempting to match too many datetime fields in a single parse()
>> > will currently result in a resource allocation issue. A TooManyFields
>> > exception will be raised in this instance. The current limit is about 15.
>> > It is hoped that this limit will be removed one day.
>>
>>
>> On Sat, Sep 19, 2020, 1:00 PM Rob Cliffe via Python-ideas <
>> python-ideas@python.org> wrote:
>>
>>> Parsing can be ambiguous:
>>> f"{x}:{y}" = "a:b:c"
>>> Does this set
>>> x = "a"
>>> y = "b:c"
>>> or
>>> x = "a:b"
>>> y = "c"
>>> Rob Cliffe
>>>
>>> On 17/09/2020 05:52, Dennis Sweeney wrote:
>>> > TL;DR: I propose the following behavior:
>>> >
>>> > >>> s = "She turned me into a newt."
>>> > >>> f"She turned me into a {animal}." = s
>>> > >>> animal
>>> > 'newt'
>>> >
>>> > >>> f"A {animal}?" = s
>>> > Traceback (most recent call last):
>>> >   File "<pyshell#2>", line 1, in <module>
>>> >     f"A {animal}?" = s
>>> > ValueError: f-string assignment target does not match 'She turned me into a newt.'
>>> >
>>> > >>> f"{hh:d}:{mm:d}:{ss:d}" = "11:59:59"
>>> > >>> hh, mm, ss
>>> > (11, 59, 59)
>>> >
>>> > === Rationale ===
>>> >
>>> > Part of the reason I like f-strings so much is that they reduce the
>>> > cognitive overhead of reading code: they allow you to see *what* is
>>> > being inserted into a string in a way that also effortlessly shows
>>> > *where* in the string the value is being inserted. There is no need to
>>> > "paint-by-numbers" and remember which variable is {0} and which is {1}
>>> > in an unnecessary extra layer of indirection. F-strings allow string
>>> > formatting that is not only intelligible, but *locally* intelligible.
>>> >
>>> > What I propose is the inverse feature, where you can assign a string
>>> > to an f-string, and the interpreter will maintain an invariant kept
>>> > in many other cases:
>>> >
>>> > >>> a[n] = 17
>>> > >>> a[n] == 17
>>> > True
>>> >
>>> > >>> obj.x = "foo"
>>> > >>> obj.x == "foo"
>>> > True
>>> >
>>> > # Proposed:
>>> > >>> f"It is {hh}:{mm} {am_or_pm}" = "It is 11:45 PM"
>>> > >>> f"It is {hh}:{mm} {am_or_pm}" == "It is 11:45 PM"
>>> > True
>>> > >>> hh
>>> > '11'
>>> >
>>> > This could be thought of as analogous to the C language's scanf
>>> > function, something I've always felt was just slightly lacking in
>>> > Python. I think such a feature would more clearly allow readers of
>>> > Python code to answer the question "What kinds of strings are allowed
>>> > here?". It would add certainty to programs that accept strings,
>>> > confirming early that the data you have is the data you want.
>>> > The code reads like a specification that beginners can understand in
>>> > a blink.
>>> >
>>> >
>>> > === Existing way of achieving this ===
>>> >
>>> > As of now, you could achieve the behavior with regular expressions:
>>> >
>>> > >>> import re
>>> > >>> pattern = re.compile(r'It is (.+):(.+) (.+)')
>>> > >>> match = pattern.fullmatch("It is 11:45 PM")
>>> > >>> hh, mm, am_or_pm = match.groups()
>>> > >>> hh
>>> > '11'
>>> >
>>> > But this suffers from the same paint-by-numbers, extra-indirection
>>> > issue that old-style string formatting runs into, an issue that
>>> > f-strings improve upon.
>>> >
>>> > You could also do a strange mishmash of built-in str operations, like
>>> >
>>> > >>> s = "It is 11:45 PM"
>>> > >>> empty, rest = s.split("It is ")
>>> > >>> assert empty == ""
>>> > >>> hh, rest = rest.split(":")
>>> > >>> mm, am_or_pm = rest.split(" ")
>>> > >>> hh
>>> > '11'
>>> >
>>> > But this is 5 different lines to express one simple idea.
>>> > How many different times have you written a micro-parser like this?
>>> >
>>> >
>>> > === Specification (open to bikeshedding) ===
>>> >
>>> > In general, the goal would be to pursue the assignment-becomes-equal
>>> > invariant above. By default, assignment targets within f-strings would
>>> > be matched as strings. However, adding in a format specifier would
>>> > allow the matches to be evaluated as different data types, e.g.
>>> > f'{foo:d}' = "1" would make foo become the integer 1. If a more complex
>>> > format specifier was added that did not match anything that the
>>> > f-string could produce as an expression, then we'd still raise a
>>> > ValueError:
>>> >
>>> > >>> f"{x:.02f}" = "0.12345"
>>> > Traceback (most recent call last):
>>> >   File "<pyshell#2>", line 1, in <module>
>>> >     f"{x:.02f}" = "0.12345"
>>> > ValueError: f-string assignment target does not match '0.12345'
>>> >
>>> > If we're feeling adventurous, one could turn the !r repr flag in a
>>> > match into an eval() of the matched string.
>>> >
>>> > The f-string would match with the same eager semantics as regular
>>> > expressions, backtracking when a match is not made on the first
>>> > attempt.
>>> >
>>> > Let me know what you think!
>>> > _______________________________________________
>>> > Python-ideas mailing list -- python-ideas@python.org
>>> > To unsubscribe send an email to python-ideas-le...@python.org
>>> > https://mail.python.org/mailman3/lists/python-ideas.python.org/
>>> > Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/JEGSKODAK5MCO2HHUF4555JZPZ6SKNEC/
>>> > Code of Conduct: http://python.org/psf/codeofconduct/
>>> _______________________________________________
>>> Python-ideas mailing list -- python-ideas@python.org
>>> To unsubscribe send an email to python-ideas-le...@python.org
>>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>>> Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/CVPRH5MEEUV2HPP4QOSZQDGQ6CWAXCY7/
>>> Code of Conduct: http://python.org/psf/codeofconduct/
>>>
>> _______________________________________________
>> Python-ideas mailing list -- python-ideas@python.org
>> To unsubscribe send an email to python-ideas-le...@python.org
>> https://mail.python.org/mailman3/lists/python-ideas.python.org/
>> Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/HFNRY3HB4CJXPKOX6ZXBPZ7V2TZ3O4FY/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
>   - Teaching
>   - Scientific Software Development
>   - Desktop GUI and Web Development
>   - wxPython, numpy, scipy, Cython
> _______________________________________________
> Python-ideas mailing list -- python-ideas@python.org
> To unsubscribe send an email to python-ideas-le...@python.org
> https://mail.python.org/mailman3/lists/python-ideas.python.org/
> Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BUZPGEC4EESBBVBAIV5G4RJ7SUED4XCX/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/P7XDJ436LUSPTNQNKR6QE2RIRWUXDYLT/
Code of Conduct: http://python.org/psf/codeofconduct/