[Python-Dev] Re: Inline links in Misc/NEWS entries

2019-08-12 Thread Mariatta
> I would like to understand why some developers dislike including it, even
> when the reST syntax is provided.

This has something to do with the use of blurb/blurb-it. Both tools
specifically say "single paragraph with simple ReST markup".

Reading further into blurb's source code, it says the format of the news blurb
should be as follows:

  * The BODY section should be a single paragraph of English text
    in ReST format.  It should not use the following ReST markup
    features:
      * section headers
      * comments
      * directives, citations, or footnotes
      * Any features that require significant line breaks,
        like lists, definition lists, quoted paragraphs, line blocks,
        literal code blocks, and tables.


Perhaps Larry has more context on why the news entry should be "simple"?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3B4HRDQH5LFT7B4SLKMGOQNN56TP5A4X/


[Python-Dev] Re: Raw string literals and trailing backslash

2019-08-12 Thread Glenn Linderman

On 8/12/2019 10:21 PM, Serhiy Storchaka wrote:

12.08.19 22:41, Glenn Linderman wrote:

On 8/12/2019 12:08 AM, Serhiy Storchaka wrote:
Currently a raw literal cannot end in a single backslash (e.g. in 
r"C:\User\"). Although there are reasons for this, it is an old 
gotcha, and there are many closed issues about it. This question is 
even included in the FAQ.


Hmm. I didn't find it in the documentation, and after searching several 
ways for it in a FAQ, I wasn't able to find it there either.


https://docs.python.org/3/faq/design.html#why-can-t-raw-strings-r-strings-end-with-a-backslash 



Thanks. After my Google searches failed, I looked at the Python FAQ TOC, 
and the sections that seemed most promising seemed to be "General" and 
"Programming" and "Python on Windows".  I never thought to look under 
"Design and History".  "Programming" actually had a section on strings, 
and it wasn't there... which reduced my enthusiasm for reading the whole 
thing, and since it is in 8 sections, it was cumbersome to do a global 
search in the browser.


It looks like the FAQ is part of the standard documentation, but it 
seems like it would be more useful if there were cross-links between the 
documentation and the related FAQs.




Thanks for your investigation, Serhiy. Point 3 seems like the easiest 
way to convert most regular expressions containing  \" or \'  from  
r"..." form to v"""...""", without disturbing the internal gibberish 
in the regular expression, and without needing significant analysis.


No new prefix is needed, since a single trailing backslash is never a 
problem in a regular expression (as it is illegal RE syntax).


I'd be interested in your comments on my future import idea, 
either here or privately. After 30 years of Python, it seems that there 
are quite a few warts in the string syntax, and a fresh start might be 
appropriate, as well as simpler to document, learn, and teach, and 
future import would allow a gradual, opt-in migration. It may be a long 
time, if ever, before the current syntax warts could be removed and the 
future import eliminated, but from the sounds of things, it might also 
be a long time, if ever, before there can be agreement on adding new 
escapes or giving errors for bad ones in the present syntax: making any 
changes without introducing a new prefix is a breaking, incompatible change.





Regarding point 4, if it is a string literal used as a regexp, 
internal triple quotes can be recoded as   "{3}  and  '{3} .


Good point! This is yet another option.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SHAEY4H6CKPBEYQ7WU5T2LIBTATRUXKX/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Serhiy Storchaka

12.08.19 22:51, Glenn Linderman wrote:

On 8/12/2019 12:11 AM, Serhiy Storchaka wrote:
For example, in many cases `\"` can be replaced with 
`"'"'r"`, but it does not look very readable.


No, that is not readable.  But neither does it seem to be valid syntax, 
or else I'm not sure what you are saying. Ah, maybe you were saying that 
a sequence like the '\"' that is already embedded in a raw string can be 
converted to the sequence `"'"'r"` also embedded in the raw string. That 
makes the syntax work, but if that is what you were saying, your 
translation dropped the \ from before the ", since the raw string 
preserves both the \ and the ".


Yes, this is what I meant. Thank you for the correction. I dropped the `\` 
because in the context of a regular expression `\"` and `"` are the same, 
and the backslash is only used to prevent `"` from ending the string 
literal. This is why `\"` is so rarely used in other strings: only in 
regular expressions does the `\` before `"` not matter.


Regarding the readability, I think any use of implicitly concatenated 
strings should have at least two spaces or a newline between them to 
make the implicit concatenation clearer.


Agreed. I wrote it without spaces for dramatic effect.
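
A minimal sketch of the rewrite discussed in this message, using a hypothetical sample pattern (not one taken from the stdlib). It suggests that the rewritten literal is a different string, since the backslash is gone, yet it behaves the same as a regular expression:

```python
import re

# Original form: \" keeps the quote inside the raw literal,
# but the backslash stays in the resulting string.
original = r"say \"(\w+)\""

# Rewritten form: implicit concatenation, with each quote character
# supplied by a short '"' literal (spaced out for readability).
rewritten = r"say "  '"'  r"(\w+)"  '"'

sample = 'say "hello"'

print(original == rewritten)          # False: the two strings differ
print(re.findall(original, sample))   # ['hello']
print(re.findall(rewritten, sample))  # ['hello']
```

The equivalence holds only because the string is consumed by the regex engine; for any other use the dropped backslash would change the value.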
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IFRYF5GUDNUTF7EJPYZO2QY3VHCM7FPK/


[Python-Dev] Re: Raw string literals and trailing backslash

2019-08-12 Thread Serhiy Storchaka

12.08.19 22:41, Glenn Linderman wrote:

On 8/12/2019 12:08 AM, Serhiy Storchaka wrote:
Currently a raw literal cannot end in a single backslash (e.g. in 
r"C:\User\"). Although there are reasons for this, it is an old 
gotcha, and there are many closed issues about it. This question is 
even included in the FAQ.


Hmm. I didn't find it in the documentation, and after searching several 
ways for it in a FAQ, I wasn't able to find it there either.


https://docs.python.org/3/faq/design.html#why-can-t-raw-strings-r-strings-end-with-a-backslash

Thanks for your investigation, Serhiy.  Point 3 seems like the easiest 
way to convert most regular expressions containing  \" or \'  from  
r"..." form to v"""...""", without disturbing the internal gibberish in 
the regular expression, and without needing significant analysis.


No new prefix is needed, since a single trailing backslash is never a 
problem in a regular expression (as it is illegal RE syntax).


Regarding point 4, if it is a string literal used as a regexp, internal 
triple quotes can be recoded as   "{3}  and  '{3} .


Good point! This is yet another option.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VR34LGEWNJVNKIFNXW7R3CCHFH6USYTT/


[Python-Dev] Inline links in Misc/NEWS entries

2019-08-12 Thread Kyle Stanley
Recently, on Discuss, I created a new topic: 
https://discuss.python.org/t/should-news-entries-contain-documentation-links/2127

However, many may not have the time to read the full post or don’t regularly 
check the core workflow category on Discuss, so I'll provide a shortened 
version here.

During my experience thus far reviewing PRs on GitHub, I've noticed a 
significant degree of inconsistency when it comes to the usage of inline links 
with reST and Sphinx in Misc/NEWS entries. Since many of my contributions have 
involved documentation changes, I've familiarized myself with most of the syntax.

Many of the features in markup languages provide visual modifications rather 
than functional ones. However, with the inline markup supported by reST and 
processing from Sphinx, there's a functional improvement as well. For example, 
the usage of :func:\`\` (escaped for mailman) provides a link to 
the relevant docs for the function. 

In the context of news entries, this allows readers to click on a function, 
method, class, etc for more information. This can be useful if it's something 
they're not familiar with, or when the changes affected the docs. For those who 
are familiar with the structure of docs.python.org, the cross link to the docs 
may not seem at all necessary. 

However, for readers that are either newer or not familiar with the structure, 
they might be led astray into 2.7 docs or an entirely wrong page. This happens 
especially frequently when using external search engines.

I'm not at all suggesting that every PR author should be required to use it or 
know all of the reST constructs. However, I would like for everyone to be aware 
of the potential usefulness of including inline links in news entries, and 
mention it in the devguide.

Also, I would like to understand why some developers dislike including it, even 
when the reST syntax is provided. The majority of authors so far would add my 
suggestion to their PR, but there have been some that don't want anything 
besides plaintext in their news entry.

Personally, I think it provides further inclusiveness to readers of all levels 
of experience and QoL at a very minimal cost.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HTAFGTQIZJQUCU6QCVF3KFD3VFGFOBWV/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Neil Schemenauer
On 2019-08-10, Serhiy Storchaka wrote:
> Actually we need to distinguish between the author and the user of the code
> and show warnings only to the author. Using .pyc files was just a heuristic:
> the author compiles the Python code, and the user uses compiled .pyc files.
> It would be nice to have a more reliable way to determine who owns the code.
> It is related not only to SyntaxWarnings, but also to runtime
> DeprecationWarnings. Maybe silence warnings only for read-only files and make
> files installed by pip read-only?

Identifying the author vs the user seems like a good idea.  Relying
on the OS filesystem seems like a solution that would cause some
challenges.  Can we embed that information in the .pyc file instead?
That way, Python knows that it is a module/package that has been
installed with pip or similar and the end user is likely not the
author.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OVUKO7BJHG3JBKKGOWYWK4HTJ4SICCSK/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Glenn Linderman

On 8/12/2019 12:11 AM, Serhiy Storchaka wrote:

11.08.19 23:07, Glenn Linderman wrote:

On 8/11/2019 1:26 AM, Serhiy Storchaka wrote:

10.08.19 22:10, Glenn Linderman wrote:
I wonder how many raw strings actually use the \"  escape 
productively? Maybe that should be deprecated too! ?  I can't think 
of a good and necessary use for it, can anyone?


This is an interesting question. I have performed some experiments. 
15 files in the stdlib (not counting the tokenizer) use \' or \" in 
raw strings. And one test (test_venv) fails because they are used 
in third-party code. All cases are in regular expressions. It 
is possible to rewrite them, but it is a less trivial task than fixing 
invalid escape sequences. So changing this will require a much 
longer deprecation period.


Couldn't they be rewritten using the above idiom? Why would that be 
less trivial?
Or by using triple quotes, so the \" could be written as " ? That 
seems trivial.


Yes, they could. You can use a different quote character, triple quotes, 
or string literal concatenation. There are many options, and you should 
choose what is applicable in each particular case and what is optimal. 
You need to analyze the whole string literal, and the code transformation 
is usually more complex than just duplicating a backslash or adding 
the `r` prefix. For example, in many cases `\"` can be replaced with 
`"'"'r"`, but it does not look very readable.


No, that is not readable.  But neither does it seem to be valid syntax, 
or else I'm not sure what you are saying. Ah, maybe you were saying that 
a sequence like the '\"' that is already embedded in a raw string can be 
converted to the sequence `"'"'r"` also embedded in the raw string. That 
makes the syntax work, but if that is what you were saying, your 
translation dropped the \ from before the ", since the raw string 
preserves both the \ and the ".


Regarding the readability, I think any use of implicitly concatenated 
strings should have at least two spaces or a newline between them to 
make the implicit concatenation clearer.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VGFQFBKNPNKBJLOPDHQWGCJ6WPK7IDKT/


[Python-Dev] Re: Raw string literals and trailing backslash

2019-08-12 Thread Glenn Linderman

On 8/12/2019 12:08 AM, Serhiy Storchaka wrote:
Currently a raw literal cannot end in a single backslash (e.g. in 
r"C:\User\"). Although there are reasons for this, it is an old 
gotcha, and there are many closed issues about it. This question is 
even included in the FAQ.


Hmm. I didn't find it in the documentation, and after searching several 
ways for it in a FAQ, I wasn't able to find it there either.




The most common workarounds are:

    r"C:\User" "\\"

and

    r"C:\User\ "[:-1]
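
A minimal sketch checking the two workarounds above against each other. The source text of the failing literal is passed to `compile()` as data so that this snippet itself stays valid:

```python
# Both workarounds build the same 8-character string ending in one backslash.
concatenated = r"C:\User" "\\"   # raw literal followed by an ordinary literal
sliced = r"C:\User\ "[:-1]       # raw literal with a sacrificial trailing space

print(concatenated == sliced)        # True
print(concatenated.endswith("\\"))   # True
print(len(concatenated))             # 8

# The direct spelling r"C:\User\" is rejected by the compiler.
try:
    compile('r"C:\\User\\"', "<example>", "eval")
    direct_spelling_compiles = True
except SyntaxError:
    direct_spelling_compiles = False
print(direct_spelling_compiles)      # False
```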

I tried an experiment. It was easy to make the parser allow a 
trailing backslash character. It was more difficult to change the 
Python implementation in the tokenizer module. But this change breaks 
existing code in more places than I expected. 14 Python files in the 
stdlib (not counting tokenizer.py) will need to be fixed. In all cases 
it is a regular expression.


A few examples:

1.
    r"([\"\\])"

If only one type of quote is used in a string, we can just use a 
different kind of quote for creating the string literal and remove 
the escaping.


    r'(["\\])'

2.
    r'(\'[^\']*\'|"[^"]*"|...'

If different types of quotes are used in different parts of a string, 
we can use implicit concatenation of string literals created with 
different quotes (in any case, such a regular expression is long and 
should be split over several lines at semantic boundaries).


    r"('[^']*'|"
    r'"[^"]*"|'
    r'...'

3.
    r"([^.'\"\\#]\b|^)"

You can also use triple quotes if the string contains both types of 
quotes together.


    r"""([^.'"\\#]\b|^)"""

4. In rare cases a multiline raw string literal can contain both 
`'''` and `"""`. In this case you can use implicit concatenation of 
string literals created with different triple quotes.
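
The rewrites in points 1 to 3 can be sanity-checked with a short sketch. One assumption: the elided tail of example 2 (`...`) is replaced here by a closing parenthesis so that the pattern compiles, and the sample text is hypothetical.

```python
import re

# (original literal, rewritten literal) pairs modelled on points 1-3.
pairs = [
    # 1: switch to the other kind of quote
    (r"([\"\\])", r'(["\\])'),
    # 2: implicit concatenation, split at a semantic boundary
    (r'(\'[^\']*\'|"[^"]*")',
     r"('[^']*'|"
     r'"[^"]*")'),
    # 3: triple quotes
    (r"([^.'\"\\#]\b|^)", r"""([^.'"\\#]\b|^)"""),
]

sample = """x = 'it' + "is" # a \\ b"""

for old, new in pairs:
    # The strings differ, because the backslashes before quotes are gone ...
    assert old != new
    # ... but as regular expressions they find the same matches.
    assert re.findall(old, sample) == re.findall(new, sample)

print("all rewritten patterns match like the originals")
```

This only compares behaviour on one sample; it is a check of the idea, not a proof of equivalence.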


See https://github.com/python/cpython/pull/15217 .

I do not think we are ready for such a breaking change. It will break 
more code than forbidding unrecognized escape sequences, and the 
required fixes are less trivial.


Thanks for your investigation, Serhiy.  Point 3 seems like the easiest 
way to convert most regular expressions containing  \" or \'  from  
r"..." form to v"""...""", without disturbing the internal gibberish in 
the regular expression, and without needing significant analysis.


Regarding point 4, if it is a string literal used as a regexp, internal 
triple quotes can be recoded as   "{3}  and  '{3} .  But whether or not 
it is used as a regexp, I fail to find a syntax that permits the 
creation of a multiline raw string containing both "'''" and '"""', 
without using implicit concatenation. Since implicit concatenation must 
already be in use for that case, converting from  raw string to verbatim 
string is straightforward.
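
A small sketch of the `"{3}` recoding for the regexp case. The sample text is hypothetical; the point is that a run of three quotes can be matched without the literal itself containing three quotes in a row:

```python
import re

# Each pattern matches a run of exactly three quote characters,
# written with a repetition count instead of three literal quotes.
triple_double = re.compile(r'"{3}')
triple_single = re.compile(r"'{3}")

# Both recoded forms fit inside a single triple-quoted raw literal.
either = re.compile(r"""("{3}|'{3})""")

sample = "a = \"\"\"doc\"\"\" + '''other'''"

print(len(triple_double.findall(sample)))   # 2
print(len(triple_single.findall(sample)))   # 2
print(len(either.findall(sample)))          # 4
```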




___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BVTHINCVSXGYG5VCIRPP7MAIF2ACWWUZ/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Terry Reedy

On 8/12/2019 6:34 AM, Eric V. Smith wrote:

On 8/12/2019 2:52 AM, Greg Ewing wrote:

Eric V. Smith wrote:
I'm not in any way serious about this. I just want people to realize 
how many wacky combinations there would be.


It doesn't matter how many combinations there are, as long as
multiple prefixes combine in the way you would expect, which
they do as far as I can see.


In general I agree, although there's some cognitive overhead to which 
combinations are valid or not. There's no "fu" strings, for example.


But for reading code that doesn't matter, so your point stands.


Please no more combinations. The presence of both legal and illegal 
combinations is already a mild nightmare for processing and testing. 
idlelib.colorizer has the following re to detect legal combinations


stringprefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"

and the following test strings to make sure it works

"# All valid prefixes for unicode and byte strings should be colored.\n"
"'x', '''x''', \"x\", \"\"\"x\"\"\"\n"
"r'x', u'x', R'x', U'x', f'x', F'x'\n"
"fr'x', Fr'x', fR'x', FR'x', rf'x', rF'x', Rf'x', RF'x'\n"
"b'x',B'x', br'x',Br'x',bR'x',BR'x', rb'x', rB'x',Rb'x',RB'x'\n"
"# Invalid combinations of legal characters should be half colored.\n"
"ur'x', ru'x', uf'x', fu'x', UR'x', ufr'x', rfu'x', xf'x', fx'x'\n"

Or, if another prefix is added, please add an expanded 
guaranteed-correct regex to the stdlib somewhere.
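
One way to gain some confidence in such a regex is to compare it with what the running interpreter actually accepts. This is only a sketch: the helper names are hypothetical, the pattern is copied from this message, and the comparison covers just the letters r, u, f and b.

```python
import itertools
import re

stringprefix = r"(?i:r|u|f|fr|rf|b|br|rb)?"   # copied from the message above
prefix_re = re.compile(stringprefix)

def regex_accepts(prefix):
    """True if the whole prefix is matched by the colorizer pattern."""
    return prefix_re.fullmatch(prefix) is not None

def compiler_accepts(prefix):
    """True if the running interpreter accepts prefix + 'x' as an expression."""
    try:
        compile(prefix + "'x'", "<prefix-check>", "eval")
    except SyntaxError:
        return False
    return True

# Every combination of zero to three letters from r, u, f, b in both cases.
letters = "rRuUfFbB"
candidates = [
    "".join(combo)
    for n in range(4)
    for combo in itertools.product(letters, repeat=n)
]

# Prefixes on which the regex and the compiler disagree.
mismatches = [p for p in candidates if regex_accepts(p) != compiler_accepts(p)]
print(mismatches)
```

An empty list means the regex and the compiler agree on every candidate. A check like this would need its alphabet extended if a new prefix letter were ever added.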


--
Terry Jan Reedy
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EVDDJEA25YKPTKX6RZY55Q66NJWTOH3A/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Terry Reedy

On 8/7/2019 6:57 PM, raymond.hettin...@gmail.com wrote:

For me, these warnings are continuing to arise almost daily.  See two recent 
examples below.


Both examples are fragile, as explained below.  They make me more in 
favor of no longer guessing what \ means in the default mode.


The transition is a different matter.  I wonder if future imports could 
be (or have been) used.



In both cases, the code previously had always worked without complaint.


Because they are in the subset of examples of the type that work 
without adding an r prefix.  Others in the class require an r prefix.


Ascii art:


''' How old-style formatting works with positional placeholders
print('The answer is %d today, but was %d yesterday' % (new, old))
  \o
   \o
'''


In general, ascii art needs an r prefix.  Even if one example gets away 
without, an edited version or a new example may not.  In the example 
above, the o looks weird.  Suppose '\' were used instead.  Suppose one 
pointed to parentheses instead and ended up with this teaching example.


'''Sample code with parentheses:
print('The answer is %d today, but was %d yesterday' % (new, old))
\---\
  \--\
These parentheses are properly nested.
'''
Whoops. This is what I mean by fragile.

A new example:

alpha_slide = '''
-
\abcd
*\bcd
**\cd
***\d
\
-
'''
print(alpha_slide)
# This looks nice in source, but the result is
-
bcd
*cd
**\cd
***\d
-
where the appearance of \a and \b depends on the output device.

Ascii art never needs cooking.  I would teach "Always prefix ascii art 
with r" in preference to "Don't bother prefixing ascii art with r unless 
you really have to because you use one of a memorized list of 
escapes, and promise yourself to recheck and add it if needed every time 
you edit and are able to keep that promise".
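
A minimal sketch of the effect, restricted to the first two lines of the slide so that it uses only recognized escapes and compiles without warnings:

```python
cooked = '''
\abcd
*\bcd
'''
raw = r'''
\abcd
*\bcd
'''

print(repr(cooked))   # '\n\x07bcd\n*\x08cd\n'
print(repr(raw))      # '\n\\abcd\n*\\bcd\n'

# A backslash at the very end of a line is cooked too: it joins the lines.
joined = '''one\
two'''
print(repr(joined))   # 'onetwo'
```

With the `r` prefix the art keeps every backslash, which is what the source layout suggests to the reader.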


vCard data item:


# Cut and pasted from:
# https://en.wikipedia.org/wiki/VCard#vCard_2.1
vcard = '''
BEGIN:VCARD
VERSION:2.1
N:Gump;Forrest;;Mr.
FN:Forrest Gump
ORG:Bubba Gump Shrimp Co.
TITLE:Shrimp Man
PHOTO;GIF:http://www.example.com/dir_photos/my_photo.gif
TEL;WORK;VOICE:(111) 555-1212
TEL;HOME;VOICE:(404) 555-1212
ADR;WORK;PREF:;;100 Waters Edge;Baytown;LA;30314;United States of America
LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:100 Waters Edge=0D=
  =0ABaytown\, LA 30314=0D=0AUnited States of America
ADR;HOME:;;42 Plantation St.;Baytown;LA;30314;United States of America
LABEL;HOME;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:42 Plantation St.=0D=0A=
  Baytown, LA 30314=0D=0AUnited States of America
EMAIL:forrestg...@example.com
REV:20080424T195243Z
END:VCARD
'''


Thank you for including the link so I could learn more.  In general, 
vCard representations should be raw.  The above uses the vCard 2.1 spec. 
 The more commonly used 3.0 and 4.0 specs replace "=0D=0A=" in the 2.1 
spec with a raw "\n".  If the above were updated, it might appear to 
'work', but would, I believe, fail if fed to a vCard processor.  This is 
what I mean by 'fragile'.


I would rather teach beginners the easily remembered "Always prefix 
vCard representations with 'r'" rather than "Only prefix vCard 
representations with 'r' if you use the more common newer specs and use 
'\n', as you often would."  (I don't know if raw '\t' is ever used; if 
so, add that.)


The above is based on the idea that while bytes and strings are 
'sequences of characters (codes)', they are usually used to represent 
instances of often undeclared types of data.  If the strings of a data 
type never need cooking, and may contain backslashes that could be 
cooked but must not be, the easiest rule is to always prefix with 'r'.
(Those with experience can refine it if they wish.)  If instances 
contain some backslashes that must be cooked, omit 'r' and double any 
backslashes that must be left alone.
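
The rule can be illustrated with a sketch. The LABEL line below is hypothetical, loosely modelled on the example above in the newer style where an embedded newline is spelled `\n`:

```python
# With the r prefix, "\n" and "\," reach the vCard processor as written:
# a backslash followed by a character.
raw_label = r"LABEL;TYPE=WORK:100 Waters Edge\nBaytown\, LA 30314"

# Without the r prefix, every backslash meant for the processor
# has to be doubled by hand.
doubled_label = "LABEL;TYPE=WORK:100 Waters Edge\\nBaytown\\, LA 30314"

# Forgetting to double the one before "n" silently bakes it into a newline.
baked_label = "LABEL;TYPE=WORK:100 Waters Edge\nBaytown\\, LA 30314"

print(raw_label == doubled_label)      # True
print(len(raw_label.splitlines()))     # 1
print(len(baked_label.splitlines()))   # 2
```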


--
Terry Jan Reedy
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TOV634XSDSM57ZZYGDOMBFNUT6VVI3P7/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Steve Holden
On Mon, Aug 12, 2019 at 6:26 PM Terry Reedy wrote:

> On 8/8/2019 5:31 AM, Dima Tisnek wrote:
> [...]
>
> To me, this is one of the major problems with the half-baked default.
> People who want string literals left as is sometimes get away with
> omitting explicit mention of that fact, but sometimes don't.
>
> Note: when we added '\u' and '\U' escapes, we broke working code that
> had Windows paths like "C:\Users\Terry".  But we did it anyway.
>

It might be helpful if there were some sort of declaration that the
ultimate goal, despite the backwards incompatibility it would entail, is
removing this wart from the language.

While practicality does indeed often beat purity, I feel this particular
case may be the exception that proves the rule. Onwards to 4.0!
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6AMS2N4O53RZ4BKTAB3GNPADZBGA4T7B/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Terry Reedy

On 8/8/2019 5:31 AM, Dima Tisnek wrote:

These two ought to be converted to raw strings, shouldn't they?


For the first example, yes or no. It depends ;-)  See below.

The problem is that string literals in python code are, by default, 
half-baked.  The interpretation of '\' by the python parser, and the 
resulting string object, depends on the next char.  I can see how this 
is sometimes a convenience, but I consider it a design bug.  There is no 
way for a user to say "I intend for this string to be fully baked, so if 
it cannot be, I goofed."  And the convenience gets used when it must not be.
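
A sketch of what "depends on the next char" means in practice. The helper name `bake` is hypothetical; each literal is compiled from its source text so that the warning, if any, can be counted instead of being printed:

```python
import warnings

def bake(literal_source):
    """Compile one string literal; return (value, number of warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # eval() is applied only to the fixed literals listed below.
        value = eval(compile(literal_source, "<literal>", "eval"))
    return value, len(caught)

# The source text of each literal is kept raw so this snippet stays warning-free.
for source in (r"'\n'", r"'\d'", r"'\\d'", r"r'\d'"):
    value, warned = bake(source)
    print(source, "->", len(value), "character(s),", warned, "warning(s)")
```

Only the second literal, where the backslash is followed by a character that is not a recognized escape, is expected to produce a warning; the kind of warning depends on the Python version.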



On Thu, 8 Aug 2019 at 08:04,  wrote:


For me, these warnings are continuing to arise almost daily.  See two recent 
examples below.  In both cases, the code previously had always worked without 
complaint.

- Example from yesterday's class 

''' How old-style formatting works with positional placeholders

print('The answer is %d today, but was %d yesterday' % (new, old))
  \o
   \o
'''

SyntaxWarning: invalid escape sequence \-


For true ascii-only character art, where one will never want '\' baked, 
an 'r' prefix is appropriate.  It is in fact mandatory when '\' may be 
followed by a legal escape code.



If one is making unicode art, with '\u' and '\U' escapes used, one must 
not use the 'r' prefix, but should instead use '\\' for unbaked 
backslashes.  The unicode escapes have already thrown off column alignments.



- Example from today's class 

# Cut and pasted from:
# https://en.wikipedia.org/wiki/VCard#vCard_2.1
vcard = '''
BEGIN:VCARD
VERSION:2.1
N:Gump;Forrest;;Mr.
FN:Forrest Gump
ORG:Bubba Gump Shrimp Co.
TITLE:Shrimp Man
PHOTO;GIF:http://www.example.com/dir_photos/my_photo.gif
TEL;WORK;VOICE:(111) 555-1212
TEL;HOME;VOICE:(404) 555-1212
ADR;WORK;PREF:;;100 Waters Edge;Baytown;LA;30314;United States of America
LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:100 Waters Edge=0D=
  =0ABaytown\, LA 30314=0D=0AUnited States of America
ADR;HOME:;;42 Plantation St.;Baytown;LA;30314;United States of America
LABEL;HOME;ENCODING=QUOTED-PRINTABLE;CHARSET=UTF-8:42 Plantation St.=0D=0A=
  Baytown, LA 30314=0D=0AUnited States of America
EMAIL:forrestg...@example.com
REV:20080424T195243Z
END:VCARD
'''

SyntaxWarning: invalid escape sequence \,


Based on my reading of the Wikipedia vCard page linked above,
the vCard protocol mandates use of '\' chars that must be passed through 
unbaked to a vCard processor.  (I don't know why '\,', but it does not 
matter.)  So vCard strings using '\' should generally have 'r' prefixes, 
just as for regex and latex strings.  For version 2.1, it appears that 
one can currently, in 3.7-, get away with omitting 'r'.  In versions 3.0 
and 4.0, embedded 'newline' is represented by '\n' instead of '=0D=0A'. 
It must not be baked by python, but passed on as is.  So omitting 'r' 
becomes a bug for those versions.


To me, this is one of the major problems with the half-baked default. 
People who want string literals left as is sometimes get away with 
omitting explicit mention of that fact, but sometimes don't.


Note: when we added '\u' and '\U' escapes, we broke working code that 
had Windows paths like "C:\Users\Terry".  But we did it anyway.
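
A sketch of that breakage as it appears in Python 3, with the path literal held as source text so that the failure can be caught:

```python
import ast

source = r'"C:\Users\Terry"'   # the literal exactly as typed in a file

try:
    compile(source, "<path>", "eval")
except SyntaxError as exc:
    outcome = "SyntaxError: " + exc.msg   # \U starts an 8-digit escape
else:
    outcome = "compiled"

print(outcome)
# With an r prefix the same text is accepted unchanged.
print(ast.literal_eval("r" + source))   # C:\Users\Terry
```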


--
Terry Jan Reedy
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NZZ32WFHUMQAKG6O3KDYV5J5NQMWGKSO/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Steve Dower

On 10Aug2019 1544, Glenn Linderman wrote:

On 8/10/2019 3:36 PM, Greg Ewing wrote:

It might be better to introduce a new string prefix, e.g.
'v' for 'verbatim':

   v"C:\Users\Fred\"

Which is why I suggested  rr"C:\directory\", but allowed as how there 
might be better spellings. I like your  v for verbatim !


The only new prefix I would support is 'p' to construct a pathlib.Path 
object directly from the string literal. But that doesn't change any of 
the existing discussion (apart from please take all the new prefix 
suggestions to python-ideas).


People have been solving the trailing backslash problem for a long time, 
and it's not a big enough burden to need a new fix.
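
As one illustration of an existing approach, sketched with the standard-library `ntpath` module (the Windows flavour of `os.path`, importable on any platform) and a hypothetical directory name:

```python
import ntpath

# Joining with an empty final component appends the trailing separator.
directory = ntpath.join(r"C:\directory", "")

print(directory)                  # C:\directory\
print(directory.endswith("\\"))   # True
```

This is just one of several spellings people use today; it is not presented here as the recommended one.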


Unintentional escapes in paths are a much bigger burden for new users 
and deserve a fix, but our current warning about the upcoming change is 
not targeted at the right people. Because we intend to fix the warning, 
delaying it by a release is not just "kicking the can down the road". 
But we need some agreement on what that looks like.


The bug is already at https://bugs.python.org/issue32912

Cheers,
Steve
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YVL3J7A4AM43NSUPUHMIMVZ7NT3WC2AZ/


[Python-Dev] Re: typing: how to use names in result-tuples?

2019-08-12 Thread Ivan Levkivskyi
On Mon, 12 Aug 2019 at 11:24, Christian Tismer wrote:

> On 12.08.19 10:52, Ivan Levkivskyi wrote:
> > On Thu, 8 Aug 2019 at 17:17, Christian Tismer wrote:
> >
> > Yes, that's what I mean.
> > Probably retval or whatever people prefer would be adequate,
> > with a special rule if that name is taken.
> >
> > I think btw. that using StructSequence(somename=sometype, ..., ) that
> > does a dict lookup is quite appealing. It returns a class like
> > stat_result, but the function call shows its arguments (see answer to
> > Ronald).
> >
> > Ciao -- Chris
> >
> >
> > Just a little comment: there is a (vague) plan to add a feature (key
> > types) to the type system that will allow user-defined constructs
> > similar to NamedTuple (for example StructSequence you mention).
> > However, taking into account current schedule it is unlikely it will be
> > added before mid-2020, so your best bet is indeed using ad-hoc named
> tuples.
>
> Interesting! Can I read something more about this? I'm curious ;-)
>

Some preliminary ideas were discussed at one of the recent typing summits, see
https://paper.dropbox.com/doc/Type-system-improvements--AfxnxPOd_hhYvtamiI9nOdZ9Ag-HHOkniMG9WcCgS0LzXZAe
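
In the meantime, the "ad-hoc named tuples" route can be sketched with `typing.NamedTuple`. The class and function names below are hypothetical and only illustrate naming the fields of a result tuple:

```python
from typing import NamedTuple

class DivModResult(NamedTuple):
    """A named result tuple, in the spirit of os.stat_result."""
    quotient: int
    remainder: int

def divide(a: int, b: int) -> DivModResult:
    q, r = divmod(a, b)
    return DivModResult(q, r)

result = divide(7, 2)
print(result.quotient, result.remainder)   # 3 1
print(result == (3, 1))                    # True: still an ordinary tuple
```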

--
Ivan
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4TSPDB7U7T5EJ6IWGD3UDI3F2WM3ZZDO/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Eric V. Smith

On 8/12/2019 2:52 AM, Greg Ewing wrote:

Eric V. Smith wrote:
I'm not in any way serious about this. I just want people to realize 
how many wacky combinations there would be.


It doesn't matter how many combinations there are, as long as
multiple prefixes combine in the way you would expect, which
they do as far as I can see.


In general I agree, although there's some cognitive overhead to which 
combinations are valid or not. There's no "fu" strings, for example.


But for reading code that doesn't matter, so your point stands.

Eric
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VD4BZYX2UV2GBU22PSZKDQAANQ43EZ54/


[Python-Dev] Re: typing: how to use names in result-tuples?

2019-08-12 Thread Christian Tismer
On 12.08.19 10:52, Ivan Levkivskyi wrote:
> On Thu, 8 Aug 2019 at 17:17, Christian Tismer wrote:
> 
> Yes, that's what I mean.
> Probably retval or whatever people prefer would be adequate,
> with a special rule if that name is taken.
> 
> I think btw. that using StructSequence(somename=sometype, ..., ) that
> does a dict lookup is quite appealing. It returns a class like
> stat_result, but the function call shows its arguments (see answer to
> Ronald).
> 
> Ciao -- Chris
> 
> 
> Just a little comment: there is a (vague) plan to add a feature (key
> types) to the type system that will allow user-defined constructs
> similar to NamedTuple (for example StructSequence you mention).
> However, taking into account current schedule it is unlikely it will be
> added before mid-2020, so your best bet is indeed using ad-hoc named tuples.

Interesting! Can I read something more about this? I'm curious ;-)

But I guess it will anyway take a little time until I can make that
transition. Maybe it's better to see what's coming up with typing,
mypy, typing_inspect and friends.
Well, I will probably start a prototype dev branch for that.

Cheers -- Chris
-- 
Christian Tismer :^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : https://github.com/PySide
14482 Potsdam: GPG key -> 0xFB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IZM2CKBMBUYH3QPOBTDKRBLNQDYACI75/


[Python-Dev] Re: typing: how to use names in result-tuples?

2019-08-12 Thread Ivan Levkivskyi
On Thu, 8 Aug 2019 at 17:17, Christian Tismer wrote:

> Yes, that's what I mean.
> Probably retval or whatever people prefer would be adequate,
> with a special rule if that name is taken.
>
> I think btw. that using StructSequence(somename=sometype, ..., ) that
> does a dict lookup is quite appealing. It returns a class like
> stat_result, but the function call shows its arguments (see answer to
> Ronald).
>
> Ciao -- Chris
>

Just a little comment: there is a (vague) plan to add a feature (key types)
to the type system that will allow user-defined constructs similar to
NamedTuple (for example StructSequence you mention).
However, taking into account the current schedule it is unlikely it will be
added before mid-2020, so your best bet is indeed using ad-hoc named tuples.
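
As an illustration of the ad-hoc named tuple approach, here is a minimal sketch using `typing.NamedTuple`. The names `DivmodResult` and `divide` are hypothetical and only show how a function can return a tuple whose fields are both named and typed.

```python
from typing import NamedTuple


class DivmodResult(NamedTuple):  # hypothetical result type
    quotient: int
    remainder: int


def divide(a: int, b: int) -> DivmodResult:
    # The annotation tells the reader and the type checker the field names.
    q, r = divmod(a, b)
    return DivmodResult(quotient=q, remainder=r)


result = divide(17, 5)
print(result.quotient, result.remainder)  # access by name
q, r = result                             # still unpacks like a plain tuple
```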

--
Ivan
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O2ZFIAUHR2Q4TUXQUOOOW4KE4T7BPEWM/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Serhiy Storchaka

11.08.19 23:07, Glenn Linderman wrote:
> On 8/11/2019 1:26 AM, Serhiy Storchaka wrote:
> > 10.08.19 22:10, Glenn Linderman wrote:
> > > I wonder how many raw strings actually use the \" escape
> > > productively? Maybe that should be deprecated too! I can't think
> > > of a good and necessary use for it, can anyone?
> >
> > This is an interesting question. I have performed some experiments. 15
> > files in the stdlib (not counting the tokenizer) use \' or \" in raw
> > strings. And one test (test_venv) fails because of their use in
> > third-party code. All cases are in regular expressions. It is possible
> > to rewrite them, but it is a less trivial task than fixing invalid
> > escape sequences. So changing this will require a much longer
> > deprecation period.
>
> Couldn't they be rewritten using the above idiom? Why would that be less
> trivial?
> Or by using triple quotes, so the \" could be written as " ? That seems
> trivial.


Yes, they could. You can use a different quote character, triple quotes, 
or string literal concatenation. There are many options, and you should 
choose what is applicable and optimal in each particular case. 
You need to analyze the whole string literal, and the code transformation 
is usually more complex than just doubling a backslash or adding the 
`r` prefix. For example, in many cases `\"` can be replaced with 
`"'"'r"`, but the result is not very readable.
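
The `\"` replacement mentioned above can be checked on a small, made-up pattern. This sketch assumes the literal is used as a regular expression, which is the case for all the stdlib occurrences described; the pattern `a\"b` is purely illustrative.

```python
import re

escaped = r"a\"b"          # backslash stays in the string: a\"b
rewritten = r"a"'"'r"b"    # three adjacent literals joined into: a"b

# The two strings differ by the backslash...
print(repr(escaped), repr(rewritten))

# ...but as regular expressions both match the same text.
text = 'a"b'
print(bool(re.fullmatch(escaped, text)), bool(re.fullmatch(rewritten, text)))
```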


See https://github.com/python/cpython/pull/15217.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KSOBUCTZITXAI3KG77DVST7U4DBPPKGR/


[Python-Dev] Raw string literals and trailing backslash

2019-08-12 Thread Serhiy Storchaka
Currently a raw literal cannot end in a single backslash (e.g. in 
r"C:\User\"). Although there are reasons for this, it is an old gotcha, 
and there are many closed issues about it. This question is even 
included in the FAQ.


The most common workarounds are:

r"C:\User" "\\"

and

r"C:\User\ "[:-1]
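
A minimal sketch confirming that both workarounds produce the same string, and that the direct spelling is rejected at compile time. The variable names are illustrative only.

```python
# Both workarounds build a path that ends in exactly one backslash.
concatenated = r"C:\User" "\\"   # raw literal joined with an ordinary literal
sliced = r"C:\User\ "[:-1]       # raw literal with a trailing space, sliced off

print(concatenated == sliced)

# The direct form, i.e. the source text  path = r"C:\User\"  does not compile,
# because the final \" does not terminate the literal.
source = 'path = r"C:\\User\\"'
try:
    compile(source, "<example>", "exec")
except SyntaxError:
    direct_form_compiles = False
else:
    direct_form_compiles = True
print(direct_form_compiles)
```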

I tried to experiment. It was easy to make the parser allow a 
trailing backslash character. It was more difficult to change the Python 
implementation in the tokenizer module. But this change breaks existing 
code in more places than I expected. 14 Python files in the stdlib (not 
counting tokenizer.py) will need to be fixed. In all cases it is a 
regular expression.


A few examples:

1.
r"([\"\\])"

If only one type of quote is used in a string, we can just use a 
different kind of quote for creating the string literal and remove the escaping.


r'(["\\])'
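
A short check of example 1, as a sketch: the two literals are different strings (the first keeps the backslash before the quote), but they behave identically as regular expressions. The sample inputs are made up for illustration.

```python
import re

escaped = r"([\"\\])"    # original spelling, quote escaped
unescaped = r'(["\\])'   # rewritten with single quotes, no escape needed

# The escaped form is one character longer because the backslash is kept.
print(len(escaped), len(unescaped))

# Both patterns accept a double quote or a backslash and nothing else.
for sample in ['"', "\\", "a", "'"]:
    same = bool(re.fullmatch(escaped, sample)) == bool(re.fullmatch(unescaped, sample))
    print(repr(sample), same)
```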

2.
r'(\'[^\']*\'|"[^"]*"|...'

If different types of quotes are used in different parts of a string, we 
can use implicit concatenation of string literals created with different 
quotes (in any case a regular expression is long and should be split on 
several lines on semantic boundaries).


r"('[^']*'|"
r'"[^"]*"|'
r'...'
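
Example 2 can be checked the same way. This sketch uses a shortened, hypothetical pattern with only the two quoted alternatives, since the tail of the original expression is elided.

```python
import re

# Shortened pattern for illustration; the elided tail is not reproduced.
escaped = r'(\'[^\']*\'|"[^"]*")'
split = (
    r"('[^']*'|"   # single-quoted alternative, written inside double quotes
    r'"[^"]*")'    # double-quoted alternative, written inside single quotes
)

samples = ["'abc'", '"abc"', "abc", "''", '"a\'b"']
for sample in samples:
    same = bool(re.fullmatch(escaped, sample)) == bool(re.fullmatch(split, sample))
    print(repr(sample), same)
```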

3.
r"([^.'\"\\#]\b|^)"

You can also use triple quotes if the string contains both types of quotes 
together.


r"""([^.'"\\#]\b|^)"""
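
A sketch comparing the two spellings of example 3 by the match spans they produce. The helper `spans` and the sample texts are hypothetical.

```python
import re

escaped = r"([^.'\"\\#]\b|^)"
triple = r"""([^.'"\\#]\b|^)"""


def spans(pattern, text):
    # Collect the (start, end) span of every match, including empty ones.
    return [m.span() for m in re.finditer(pattern, text)]


for text in ["a b", '"x" y', "#c.d"]:
    print(repr(text), spans(escaped, text) == spans(triple, text))
```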

4. In rare cases a multiline raw string literal can contain both `'''` 
and `"""`. In this case you can use implicit concatenation of string 
literals created with different triple quotes.
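
A minimal sketch of case 4, using made-up text that contains both kinds of triple quotes. Each part is written with the triple quote that does not occur inside it.

```python
combined = (
    r"""opens with ''' """    # this part contains ''' so it uses """
    r'''and closes with """'''  # this part contains """ so it uses '''
)
print(combined)
```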


See https://github.com/python/cpython/pull/15217 .

I do not think we are ready for such a breaking change. It will break more 
code than forbidding unrecognized escape sequences, and the required 
fixes are less trivial.

___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KFCYDKDTGSNWHEODLAWWA4YBYT6PTT6P/


[Python-Dev] Re: What to do about invalid escape sequences

2019-08-12 Thread Greg Ewing

Eric V. Smith wrote:
> I'm not in any way serious about this. I just want people to realize how
> many wacky combinations there would be.


It doesn't matter how many combinations there are, as long as
multiple prefixes combine in the way you would expect, which
they do as far as I can see.

--
Greg
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/CMMOYWF7DOX4K5CS2IONDXE4DEJGAUT4/