On 2015-03-09 15:29, Antoon Pardon wrote:
> On 09-03-15 at 13:50, Tim Chase wrote:
> >> (?:(?!_|\d)\w)\w+
> >
> > If you don't have to treat it as an atom, you can simplify that to
> > just
> >
> >   (?!_|\d)\w+
> >
> > which just means that the first character can't be an underscore
> > or digit.
> >
> > Though for a Py3 identifier, the underscore is acceptable as a
> > first character ("__init__"), so you can simplify it even further
> > to just
> >
> >   (?!\d)\w+
>
> No that doesn't work. To begin with my attempt above should have
> been:
>
>   (?:(?!_|\d)\w)\w*

Did you actually test my suggestion? The "(?!\d)\w+" means "one or
more word characters, but the first one can't be a digit" because the
"(?!...)" is zero-width. This should match single-character strings,
including a single underscore.

> because an identifier can just be one letter. So when I change the '+'
> into a '*' in your suggestion I get this:
>
> >>> r = re.compile(r"(?!\d)\w*")
> >>> r.match('√')
> <_sre.SRE_Match object; span=(0, 0), match=''>
>
> But the √ is not a letter.

Notice that you match an empty string there because the (?!\d) is
zero-width, and thus you match 0-or-more word characters by matching
nothing. Try anchoring it with a "$" at the end to see that it
doesn't really match.

-tkc
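
For anyone following along, here is a minimal sketch of the point above,
assuming Python 3's re module with its default Unicode-aware \w. The
names PLUS, STAR, STAR_ANCHORED and matched_text are made up for this
illustration and are not from the thread. It compares the '+' pattern,
the '*' variant, and the '*' variant anchored with "$".

```python
import re

# Pattern from the thread: one or more word characters, first one not a digit.
PLUS = re.compile(r"(?!\d)\w+")
# The '*' variant, which can succeed by matching nothing at all.
STAR = re.compile(r"(?!\d)\w*")
# The same variant anchored with '$', so it must reach the end of the string.
STAR_ANCHORED = re.compile(r"(?!\d)\w*$")

SQRT = "\u221a"  # the square-root sign from the quoted session


def matched_text(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = pattern.match(text)
    return None if m is None else m.group(0)


if __name__ == "__main__":
    for text in ["_", "x", "__init__", "9lives", SQRT]:
        print(
            repr(text),
            "plus:", repr(matched_text(PLUS, text)),
            "star:", repr(matched_text(STAR, text)),
            "star$:", repr(matched_text(STAR_ANCHORED, text)),
        )
```

An empty-string result from the '*' variant on the square-root sign
means the pattern matched zero characters, not that the sign counts as a
word character; the '+' and anchored forms return None for it.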