On 4/1/2018 10:20 PM, Tim Peters wrote:
[MRAB <pyt...@mrabarnett.plus.com>]
A thread on python-ideas is talking about the prefixes of string literals,
and the regex used in IDLE.

Line 25 of Lib\idlelib\colorizer.py is:

     stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?"

which looks slightly wrong to me.

This must be a holdover from years ago, before I was involved. I have wondered about it but left it as is. Thanks for confirming that it is not right.

The \b will apply only to the first choice.

Shouldn't it be more like:

     stringprefix = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"

?
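
The precedence point can be checked in isolation. This is a minimal sketch, not IDLE code: the trailing `?` is dropped from both patterns so that a failed prefix match is visible as None, and the names ALTS, original_prefix, grouped_prefix and first_prefix are made up for the illustration.

```python
import re

ALTS = "r|u|f|fr|rf|b|br|rb"

# Same alternation as the colorizer line, minus the trailing "?".
# Here \b belongs to the first alternative only: (\br)|u|f|...
original_prefix = re.compile(r"(?i:\b" + ALTS + ")")

# Suggested grouping: \b is outside the alternation and applies to all of it.
grouped_prefix = re.compile(r"\b(?i:" + ALTS + ")")

def first_prefix(pattern, text):
    """Return the first prefix the pattern finds in text, or None."""
    m = pattern.search(text)
    return m.group(0) if m else None

print(first_prefix(original_prefix, "kr"))  # None: \br needs a word boundary
print(first_prefix(original_prefix, "ku"))  # u: no boundary check on "u"
print(first_prefix(grouped_prefix, "kr"))   # None
print(first_prefix(grouped_prefix, "ku"))   # None: boundary now required
```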

See below.

I believe the change would capture its real intent.  It doesn't seem
to matter a whole lot, though - IDLE isn't a syntax checker, and
applies heuristics to color on the fly based on best guesses.  As is,
if you type this fragment into an IDLE shell:

kr"sdf"

only the last 5 characters get "string colored", presumably because of
the leading \br in the original regexp.  But if you type in

ku"sdf"

the last 6 characters get "string colored", because - as you pointed
out - the \b part of the original regexp has no effect on anything
other than the r following \b.
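
The two fragments above can be reproduced outside IDLE with a rough sketch. It assumes a simplified double-quoted string body (DQ_BODY is a stand-in; the real colorizer pattern also covers single and triple quotes), and colored() is a hypothetical helper that returns the text the pattern would treat as a string.

```python
import re

# Simplified stand-in for a double-quoted string body (an assumption).
DQ_BODY = r'"[^"\\\n]*"'

original = re.compile(r"(?i:\br|u|f|fr|rf|b|br|rb)?" + DQ_BODY)
proposed = re.compile(r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?" + DQ_BODY)

def colored(pattern, text):
    """Return the first substring the pattern would treat as a string."""
    m = pattern.search(text)
    return m.group(0) if m else None

for fragment in ('kr"sdf"', 'ku"sdf"'):
    print(fragment, colored(original, fragment), colored(proposed, fragment))
# kr"sdf" "sdf" "sdf"
# ku"sdf" u"sdf" "sdf"
```

With the original pattern the two fragments are treated differently (5 versus 6 characters); with the regrouped pattern both lose the prefix.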

I tested with uf versus ur, both of which look plausibly legal but are not.

But in neither case is the fragment legit Python.  If you do type in
legit Python, it makes no difference (legit string literals always
start at a word boundary, regardless of whether the regexp checks for
that).

I want uniform behavior. I decided to drop the \b because I prefer coloring the maximal legal string rather than the minimal one. I think the contrast between two characters that are each legal by themselves, but colored differently when put together, makes the bug more obvious.
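
For comparison, here is a sketch of what dropping the \b does to the uf and ur cases. The pattern without_b is my reading of "drop the \b", not a copy of the committed change (see the issue for that), and DQ_BODY is again a simplified stand-in for the string body.

```python
import re

# Simplified stand-in for a double-quoted string body (an assumption).
DQ_BODY = r'"[^"\\\n]*"'

with_b = re.compile(r"(?i:\br|u|f|fr|rf|b|br|rb)?" + DQ_BODY)
without_b = re.compile(r"(?i:r|u|f|fr|rf|b|br|rb)?" + DQ_BODY)

def colored(pattern, text):
    """Return the first substring the pattern would treat as a string."""
    m = pattern.search(text)
    return m.group(0) if m else None

for fragment in ('uf"x"', 'ur"x"'):
    print(fragment, colored(with_b, fragment), colored(without_b, fragment))
# uf"x" f"x" f"x"
# ur"x" "x" r"x"

# A legal literal is matched the same way by both patterns.
print(colored(with_b, 'x = rb"abc"'), colored(without_b, 'x = rb"abc"'))
```

Without the \b, the longest legal prefix next to the quote is always included, so uf and ur behave alike: the stray leading u is left uncolored in both.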

https://bugs.python.org/issue33204

--
Terry Jan Reedy

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev