Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-28 Thread Cameron Simpson
On 28Aug2020 08:56, Chris Green wrote: >Stefan Ram wrote: >> Chris Angelico writes: >> >But this is a really good job for a list comprehension: >> >sss = [str(word) for word in bbb] >> >> Are you all sure that "str" is really what you all want? >> >Not absolutely, you no doubt have been

Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-28 Thread Chris Green
Cameron Simpson wrote: > On 27Aug2020 23:54, Marco Sulla wrote: > >Are you sure you want `str()`? > > > str(b'aaa') > >"b'aaa'" > > > >Probably you want: > > > >map(lambda x: x.decode(), bbb) > > _And_ you need to know the encoding of the text in the bytes. The above > _assumes_ UTF-8

Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-28 Thread Chris Green
Chris Angelico wrote: > On Fri, Aug 28, 2020 at 6:36 AM Chris Green wrote: > > > > This sounds quite an easy thing to do but I can't find how to do it > > elegantly. > > > > I have a list of bytes class objects (i.e. a list containing sequences > > of bytes, which are basically text) and I want

Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-28 Thread Chris Green
Stefan Ram wrote: > Chris Angelico writes: > >But this is a really good job for a list comprehension: > >sss = [str(word) for word in bbb] > > Are you all sure that "str" is really what you all want? > Not absolutely, you no doubt have been following other threads related to this one. :-)

Aw: Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-28 Thread Karsten Hilbert
> >Are you sure you want `str()`? > > > str(b'aaa') > >"b'aaa'" > > > >Probably you want: > > > >map(lambda x: x.decode(), bbb) > > _And_ you need to know the encoding of the text in the bytes. The above > _assumes_ UTF-8 because that is the default for bytes.decode, and if > that is _not_

Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-27 Thread Cameron Simpson
On 27Aug2020 23:54, Marco Sulla wrote: >Are you sure you want `str()`? > str(b'aaa') >"b'aaa'" > >Probably you want: > >map(lambda x: x.decode(), bbb) _And_ you need to know the encoding of the text in the bytes. The above _assumes_ UTF-8 because that is the default for bytes.decode, and
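
A minimal sketch of the point made in this reply: `str()` applied to a bytes object returns its repr, while `bytes.decode()` with a named encoding returns the text. The name `bbb` follows the thread; the sample data and the choice of UTF-8 are assumptions for illustration only.

```python
# Sample data; the real list and its encoding are assumptions.
bbb = [b"aaa", "caf\u00e9".encode("utf-8")]

# str() does not decode: it returns the repr of the bytes object.
as_repr = [str(word) for word in bbb]

# decode() turns bytes into text; naming the encoding explicitly
# avoids silently relying on the UTF-8 default.
sss = [word.decode("utf-8") for word in bbb]

print(as_repr)
print(sss)
```

If the bytes are in some other encoding, the same comprehension applies with that encoding name passed to `decode()`.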

Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-27 Thread Marco Sulla
Are you sure you want `str()`? >>> str(b'aaa') "b'aaa'" Probably you want: map(lambda x: x.decode(), bbb) -- https://mail.python.org/mailman/listinfo/python-list

Re: Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-27 Thread Chris Angelico
On Fri, Aug 28, 2020 at 6:36 AM Chris Green wrote: > > This sounds quite an easy thing to do but I can't find how to do it > elegantly. > > I have a list of bytes class objects (i.e. a list containing sequences > of bytes, which are basically text) and I want to convert it to a list > of string

Python 3 how to convert a list of bytes objects to a list of strings?

2020-08-27 Thread Chris Green
This sounds quite an easy thing to do but I can't find how to do it elegantly. I have a list of bytes class objects (i.e. a list containing sequences of bytes, which are basically text) and I want to convert it to a list of string objects. One of the difficulties of finding out how to do this is

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Vegard Stikbakke
Vegard Stikbakke added the comment: Great idea, thanks! It's open source, so I'll see if I can fix it. On Thu, 20 Aug 2020 at 17:28, Eric V. Smith wrote: > > > Eric V. Smith added the comment: > > > > Completely agree with paul j3. The calling tool is breaking the "argv" > conventions. If

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Eric V. Smith
Eric V. Smith added the comment: Completely agree with paul j3. The calling tool is breaking the "argv" conventions. If the OP can control the calling tool, it should be fixed there. -- ___ Python tracker

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread paul j3
paul j3 added the comment: I'd say the problem is with the deployment tool. Inputs like that should be split regardless of who's doing the commandline parsing. With normal shell input, quotes are used to prevent splitting, or to otherwise prevent substitutions and special character

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Eric V. Smith
Change by Eric V. Smith : -- resolution: -> not a bug stage: -> resolved status: open -> closed type: enhancement -> behavior

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Vegard Stikbakke
Vegard Stikbakke added the comment: I see! Thanks, had not heard about shlex. I also had not realized `parse_args` takes arguments. Doh. That makes sense. Thanks a lot!

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Eric V. Smith
-in-python You shouldn't need to mutate sys.argv. You can break the input up into multiple strings with shlex.split() (or whatever you decide to use) and pass those to ArgumentParser.parse_args(). -- nosy: +eric.smith
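
A small sketch of the approach suggested here, assuming a calling tool that hands over all options as a single string. The option names `--a` and `--b` follow the issue; the input string is a made-up example.

```python
import argparse
import shlex

parser = argparse.ArgumentParser()
parser.add_argument("--a")
parser.add_argument("--b")

# Hypothetical input: the calling tool passes everything as one string.
raw = "--a=1 --b=2"

# Split the way a POSIX shell would, then pass the list explicitly
# instead of mutating sys.argv.
args = parser.parse_args(shlex.split(raw))
print(args.a, args.b)
```

Passing the list to `parse_args()` keeps `sys.argv` untouched, which is the point made in the comment.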

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Karthikeyan Singaravelan
Change by Karthikeyan Singaravelan : -- nosy: +paul.j3, rhettinger

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Vegard Stikbakke
ython/blob/2ce39631f679e14132a54dc90ce764259d26e166/Lib/argparse.py#L2227 Here it says that if there's a space in the string, it was meant to be a positional, and so the function returns `None`, causing it to not find the argument. In conclusion, it seems to me that argparse is not, in fact, meant to handle quot

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Vegard Stikbakke
Vegard Stikbakke added the comment: It seems that I mixed up something in the post here. If the quoted string is `"--a=1 --b=2` as I said in the post, then the program will only complain about `b` missing. In this case, it sets `a` to be `1 --b=2`. Whereas if the quoted string is `"--a 1

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Vegard Stikbakke
Vegard Stikbakke added the comment: For what it's worth, I'd love to work on this if it's something that could be nice to have.

[issue41600] Expected behavior of argparse given quoted strings

2020-08-20 Thread Vegard Stikbakke
ary (Lib) messages: 375702 nosy: vegarsti priority: normal severity: normal status: open title: Expected behavior of argparse given quoted strings type: enhancement versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue41600> __

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-20 Thread Raphael Grewe
Raphael Grewe added the comment: Soo... I think I use the functions indeed incorrectly then. Thank you for your time. Regards Raphael -- resolution: -> not a bug stage: -> resolved status: open -> closed

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-19 Thread Irit Katriel
Irit Katriel added the comment: The get and put functions of Queue have the optional 'block' and 'timeout' args that you need, and SimpleQueue doesn't.
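
A rough sketch of the `block` and `timeout` arguments mentioned here, using `multiprocessing.Queue` in a single process. The timeout values and the payload are arbitrary assumptions, and this does not reproduce the hang reported in the issue.

```python
import multiprocessing
import queue

q = multiprocessing.Queue()

# get() on an empty queue gives up after the timeout instead of blocking forever.
try:
    q.get(timeout=0.1)
    timed_out = False
except queue.Empty:
    timed_out = True

# put() accepts the same optional arguments; they matter for a bounded queue.
q.put("hello", block=True, timeout=1)
item = q.get(timeout=5)
print(timed_out, item)
```

`SimpleQueue.get()` and `SimpleQueue.put()` take no such arguments, so a caller cannot bound how long they wait.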

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-19 Thread Raphael Grewe
Raphael Grewe added the comment: I used SimpleQueue on purpose because I only need the functions get and put. Of course I could also use Queue but that would be just a workaround for me.

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-14 Thread Irit Katriel
Irit Katriel added the comment: You probably need to use Queue instead of SimpleQueue: https://docs.python.org/3.8/library/multiprocessing.html#multiprocessing.Queue

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-14 Thread Irit Katriel
Irit Katriel added the comment: I've now reproduced it on osX, with string length of 65515. It takes a different code path than the windows version, and I was able to see more. This seems to be the sequence of events that leads to the hang: import multiprocessing.reduction import struct

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-14 Thread Irit Katriel
Irit Katriel added the comment: On windows 10 it's hanging for me from string length of 8175. Stepping through the code, the hang is in the call to _winapi.WaitForMultipleObjects, in PipeConnection._send_bytes, Lib/multiprocessing/connection.py:288 -- nosy: +iritkatriel

[issue41550] SimpleQueues put blocks if feeded with strings greater than 2**16-13 chars

2020-08-14 Thread Raphael Grewe
New submission from Raphael Grewe : Hi all, Maybe I am using the put function incorrectly, but it is not possible to add larger strings there. The program stays locked and does nothing. I tested it on latest Debian Buster. Python 3.7.3 (default, Jul 25 2020, 13:03:44) Please check

[issue41411] Improve and consolidate f-strings docs

2020-08-05 Thread Ama Aje My Fren
cation Mini-Language", > "Format examples" sections from string.rst to stdtypes.rst where they belong; > * integrate f-strings in these sections, and add a new section explaining > f-string-specific quirks; > * leave the printf-style string formatting in stdtypes.rst, after the

[issue28002] ast.unparse can't roundtrip some f-strings

2020-08-02 Thread Shantanu
Shantanu added the comment: Just bumping this issue, as per dev guide, since https://github.com/python/cpython/pull/19612 has been ready for about two months. Would be grateful for review :-)

[issue41411] Improve and consolidate f-strings docs

2020-07-30 Thread Ezio Melotti
rmatting, and when str.format() was added, it got renamed and the note at the top was added to link to the str.format() documentation. I guess that the str.format documentation ended up in string.rst because it was related to string.formatter, and the author wanted to keep them together. >

[issue41411] Improve and consolidate f-strings docs

2020-07-29 Thread Ama Aje My Fren
have information about f-strings. The tutorial[0] and the reference[1] I have done PR 21681 that adds index to the tutorial although searching[2][3] does not seem to be better now that the reference has an index. > The lexical analysis is probably fine as is. I agree. > The introduction in

[issue41411] Improve and consolidate f-strings docs

2020-07-29 Thread Ama Aje My Fren
Change by Ama Aje My Fren : -- pull_requests: +20825 pull_request: https://github.com/python/cpython/pull/21681

[issue41411] Improve and consolidate f-strings docs

2020-07-28 Thread Guido van Rossum
Guido van Rossum added the comment: Note that there already is something in the tutorial about f-strings (in inputoutput.rst, labeled tut-f-strings), and the intro has a link to their reference manual description in the "see also"

[issue41411] Improve and consolidate f-strings docs

2020-07-28 Thread Ezio Melotti
examples Template strings (string.Template) Helper functions (string.capwords) * stdtypes.rst[1] (about Python builtin types): Text Sequence Type — str (short intro about str) String Methods (all the str.* methods) printf-style String Formatting (old %-formatting

[issue41411] Improve and consolidate f-strings docs

2020-07-28 Thread Ama Aje My Fren
Ama Aje My Fren added the comment: Hi Ezio, Would you see this being resolved in part by a HOWTO document? -- nosy: +amaajemyfren

[issue41411] Improve and consolidate f-strings docs

2020-07-27 Thread Ezio Melotti
Change by Ezio Melotti : -- keywords: +patch pull_requests: +20788 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/21552

[issue41411] Improve and consolidate f-strings docs

2020-07-27 Thread Guido van Rossum
Guido van Rossum added the comment: It's basically an accident that the only f-strings docs are in the language reference. Yes, they should be there, and the text there is pretty good *for the reference*, but there isn't much about them elsewhere outside of the tutorial, so everything links

[issue41411] Improve and consolidate f-strings docs

2020-07-27 Thread Eric V. Smith
Eric V. Smith added the comment: I think this is an excellent idea. The main f-string docs being in a section titled "Lexical Analysis" never seemed very user-friendly.

[issue41411] Improve and consolidate f-strings docs

2020-07-27 Thread Ezio Melotti
New submission from Ezio Melotti : [Creating a new issue from #41045] I was just trying to link to someone the documentation for f-strings, but: 1) Searching "fstring" only returns two results about xdrlib[0]; 2) Searching "f-string" returns many unrelated resu

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-16 Thread Eric V. Smith
Eric V. Smith added the comment: I suggest you discuss this on python-ideas, since we'll need to reach consensus there, first. -- components: +Interpreter Core -IO, Library (Lib)

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-16 Thread Charles Machalow
Charles Machalow added the comment: In terms of multiple parameters, I propose adding a method to control the defaults used by !p. Though the defaults would work more than well enough for basic log and print usage.

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-16 Thread Eric V. Smith
Eric V. Smith added the comment: I agree with Raymond that it's unlikely that this will work, as a practical matter. In addition to the other problems mentioned, there's the issue of the many parameters to control pprint. And I agree with pprint, or a replacement, needing a redesign. I

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-16 Thread Charles Machalow
Charles Machalow added the comment: One of the key things for pformat here is to format long spanning dicts/lists to multiple lines, that look easy to read in a log. I feel as though that feature/usefulness outweighs potential indentation weirdness. A lot of the usage would probably be
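
For context, a sketch of what is already possible without the proposed `!p` conversion: calling `pprint.pformat()` inside an f-string replacement field. The data and the `width=40` setting are assumptions chosen to force wrapping.

```python
import pprint

# Hypothetical structure that is too long to read comfortably on one line.
data = {"name": "example", "values": list(range(12)), "nested": {"a": 1, "b": [1, 2, 3]}}

# No !p conversion exists; call pformat() in the replacement field instead.
# width=40 forces the multi-line layout described in the comment above.
message = f"state:\n{pprint.pformat(data, width=40)}"
print(message)
```

The proposal in this issue would have shortened the replacement field to something like `{data!p}`, at the cost of losing per-call parameters such as `width`.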

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-16 Thread Raymond Hettinger
Raymond Hettinger added the comment: If the python-ideas discussion is fruitful, go ahead and re-open this tracker item. Personally, I don't see how this would work. The pretty printing routines rely on knowing their current level of indentation. Also, much of the "prettiness" comes from

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-15 Thread Charles Machalow
Charles Machalow added the comment: Fair enough. Didn't really know that list existed. Sent this there. Awaiting moderator approval. Thanks.

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-15 Thread Karthikeyan Singaravelan
Karthikeyan Singaravelan added the comment: This needs discussion on python-ideas/ideas category on discourse similar to f-string debug notation. -- nosy: +eric.smith, xtreak

[issue41312] add !p to pprint.pformat() in str.format() an f-strings

2020-07-15 Thread Charles Machalow
ts: IO, Library (Lib) messages: 373738 nosy: Charles Machalow priority: normal severity: normal status: open title: add !p to pprint.pformat() in str.format() an f-strings type: enhancement versions: Python 3.10 ___ Python tracker <https://bug

[issue33754] f-strings should be part of the Grammar

2020-07-09 Thread Raymond Hettinger
Raymond Hettinger added the comment: I share Eric's concern about "unknowingly changing the behavior of f-strings." -- nosy: +rhettinger ___ Python tracker <https://bugs.python.o

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Wansoo Kim
Wansoo Kim added the comment: Well... to be honest, I'm a little confused. bpo-41244 and this issue are completely opposite. I'm not used to Python community yet because it hasn't been long since I joined it. You're saying that if a particular method is not dramatically good, we prefer to

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: In this particular case the number of concatenations is limited, the resulting string is usually short, and the code is not performance critical (it is the __repr__ implementation). So there is no significant advantage of one way over other, and no way is

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Andrew Svetlov
Andrew Svetlov added the comment: Remi is correct. Closing the issue.

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Andrew Svetlov
Change by Andrew Svetlov : -- resolution: -> wont fix stage: patch review -> resolved status: open -> closed

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Rémi Lapeyre
Rémi Lapeyre added the comment: Hi Wansoo, using += instead of str.join() is less performant. Concatenating n strings with + will create and allocate n new strings while str.join() will carefully look ahead and allocate the correct amount of memory and do all concatenation at one
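
A short sketch contrasting the two approaches discussed here. The list `parts` is made-up sample data, and the comments describe the general behaviour, not a measurement.

```python
# Hypothetical pieces to concatenate.
parts = [f"item{i}" for i in range(5)]

# Repeated += may build a new intermediate string on each step.
joined_plus = ""
for p in parts:
    joined_plus += p

# str.join() computes the total size first and copies each piece once.
joined_join = "".join(parts)
print(joined_plus == joined_join)
```

Both forms give the same result; the difference is in how much intermediate allocation happens, which only matters when the number of pieces is large.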

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Wansoo Kim
Change by Wansoo Kim : -- keywords: +patch pull_requests: +20545 stage: -> patch review pull_request: https://github.com/python/cpython/pull/21397

[issue41242] When concating strings, I think it is better to use += than join the list

2020-07-08 Thread Wansoo Kim
New submission from Wansoo Kim : Hello, I think it's better to use += than list.join() when concatenating strings. This is more intuitive than other methods. Also, I personally think it is not good for one variable to change to another type during runtime. https://github.com/python/cpython/blob

[issue33754] f-strings should be part of the Grammar

2020-07-08 Thread Eric V. Smith
Change by Eric V. Smith : -- versions: +Python 3.10 -Python 3.8

[issue41224] Document is_annotate() in symtable and update doc strings

2020-07-07 Thread Joannah Nanjekye
Change by Joannah Nanjekye : -- stage: patch review -> resolved status: open -> closed

[issue41224] Document is_annotate() in symtable and update doc strings

2020-07-07 Thread Joannah Nanjekye
Joannah Nanjekye added the comment: New changeset a95ac779e6bca0d87819969e361627182b83292c by Joannah Nanjekye in branch 'master': bpo-41224: Document is_annotated() in symtable module and update doc strings (GH-21369) https://github.com/python/cpython/commit

[issue41224] Document is_annotate() in symtable and update doc strings

2020-07-06 Thread Joannah Nanjekye
Change by Joannah Nanjekye : -- keywords: +patch pull_requests: +20514 stage: -> patch review pull_request: https://github.com/python/cpython/pull/21369

[issue41224] Document is_annotate() in symtable and update doc strings

2020-07-06 Thread Joannah Nanjekye
Change by Joannah Nanjekye : -- title: Document is_annotated() in the symtable module -> Document is_annotate() in symtable and update doc strings ___ Python tracker <https://bugs.python.org/issu

[issue41091] Remove recommendation in curses module documentation to initialize LC_ALL and encode strings

2020-06-25 Thread Manuel Jacob
Change by Manuel Jacob : -- keywords: +patch pull_requests: +20319 stage: -> patch review pull_request: https://github.com/python/cpython/pull/21159

[issue41091] Remove recommendation in curses module documentation to initialize LC_ALL and encode strings

2020-06-23 Thread Manuel Jacob
u have to call > locale.setlocale() in the application and encode Unicode strings using one of > the system’s available encodings. This example uses the system’s default > encoding: > > import locale > locale.setlocale(locale.LC_ALL, '') > code = locale.getpreferreden

[issue40980] group names of bytes regexes are strings

2020-06-17 Thread Quentin Wenger
Quentin Wenger added the comment: bytes are _not_ Unicode code points, not even in the 256 range. End of the story.

[issue40980] group names of bytes regexes are strings

2020-06-17 Thread Quentin Wenger
Quentin Wenger added the comment: If I don't have to think about the str -> bytes direction, re should first stop going in the other direction. When I have bytes regexes I actually don't care about strings and would happily receive group names as bytes. But no, re decides that lati

[issue40980] group names of bytes regexes are strings

2020-06-17 Thread Quentin Wenger
Quentin Wenger added the comment: Because utf-8 is Python's default encoding, e.g. in source files, decode() and encode(). Literally everywhere. If you ask around "I have a bytestring, I need a string, what do I do?", using latin-1 will not be the first answer (and moreover, the correct

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: Why do you always want to use a "utf-8" encoded identifier as the group name in a `bytes` pattern? The direction is: a group name written in a `bytes` pattern, which will be converted to `str`. Not this direction: `str` group name -(utf8)-> `bytes` pattern -> `str` group name

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
the (unicode) string with the same "graphical representation": ``` # consider the following bytestring pattern >>> p = b"(?P<\xc3\xba>)" # what character does the group name correspond to? # to discover it, we instead consider the string that "looks the same"

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
e code points is an implementation detail. If you want to keep it so, it ought (cf. the quote above) to be made clear in the docs that group names come out as latin-1-encoded strings, with all the restrictions that follow from that choice. But the more logical way would be to renounce this arbit

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: The problem can also be played in reverse, maybe it is more telling: ``` # consider the following bytestring pattern >>> p = b"(?P<\xc3\xba>)" # what character does the group name correspond to? # maybe we can try to infer it by decoding the bytestring? #

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: And there's no need for a cryptic encoding like cp1250 for this problem to arise. Here is a simple example with Python's default encoding utf-8: ``` >>> a = "ú" >>> b = list(re.match(b"(?P<" + a.encode() + b">)", b"").groupdict())[0] >>> a.isidentifier() True

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
] > >>> name > 'Ø' # '\xd8' > >>> name == orig_name > False > >>> name.encode("latin-1") > b'\xd8' > >>> name.encode("latin-1") == orig_ch > True > > "Ř" (\u0158) --cp1250

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: Please look at these: >>> orig_name = "Ř" >>> orig_ch = orig_name.encode("cp1250") # Because why not? >>> orig_ch b'\xd8' >>> name = list(re.match(b"(?P<" + orig_ch + b">)", b"").groupdict().keys())[0] >>> name 'Ø' # '\xd8' >>> name

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: > this limitation to the latin-1 subset is not compatible with the > documentation, which says that valid Python identifiers are valid group names. Not all latin-1 characters are valid identifier, for example: >>> '\x94'.encode('latin1') b'\x94' >>>

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: I prove my point that the decoding to string is arbitrary: ``` >>> import re >>> orig_name = "Ř" >>> orig_ch = orig_name.encode("cp1250") # Because why not? >>> name = list(re.match(b"(?P<" + orig_ch + b">)", b"").groupdict().keys())[0] >>> name == orig_name

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: > It seems you don't know some knowledge of encoding yet. I don't have to be ashamed of my knowledge of encoding. Yet you are right that I was missing a subtlety, which is that latin-1 is a strict subset of Unicode rather than a completely arbitrary

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: It seems you don't know some knowledge of encoding yet. Naturally, `bytes` cannot contain character which Unicode code point is greater than \u00ff. So you can only use "latin1" encoding, which map from character to byte (or reverse) directly. "utf-8", "utf-16"

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: The issue with the second variant is that utf-8 is an arbitrary (although default) choice. But: re is doing that same arbitrary choice already in decoding the group names into a string, which is my original complaint! --

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: Sorry, b"(?P<\xce\x94>)"

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: But Δ has no latin-1 representation. So Δ currently cannot be used as a group name in bytes regex, although it is a valid Python identifier. So that's a bug. I mean, if you insist of having group names as strings even for bytes regexes

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: In this case, you can only use 'latin1', which directly map one character (\u0000-\u00FF) to/from one byte. If use 'utf-8', it may map one character to multiple bytes, such as 'Δ' -> b'\xce\x94' '\x94' is an invalid identifier, it will raise an error: >>>

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: > So b'\xe9' is mapped to \u00e9, it is `é`. Yes but \xe9 is not strictly valid utf-8, or say not the canonical representation of "é". So there is no way to get \xe9 starting from é without leaving utf-8. So starting with é as group name, I cannot

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: `latin1` is the character set that Unicode code point from \u0000 to \u00ff, and the characters are directly mapped from/to bytes. So b'\xe9' is mapped to \u00e9, it is `é`. Of course, characters with Unicode code point greater than 0xff are impossible to appear in
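
A small sketch of the mapping described here, using only the characters that appear in the thread (`é` and `Δ`). It shows the one-to-one byte mapping of latin-1 next to the multi-byte UTF-8 encodings.

```python
# Every byte value 0..255 decodes under latin-1 to the code point with the same number.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")
print([ord(c) for c in text] == list(range(256)))

# The same character can need one byte in latin-1 and two in UTF-8.
print("é".encode("latin-1"), "é".encode("utf-8"))

# Characters above U+00FF have no latin-1 representation at all.
try:
    "Δ".encode("latin-1")
    delta_ok = True
except UnicodeEncodeError:
    delta_ok = False
print(delta_ok, "Δ".encode("utf-8"))
```

This is why a name such as `Δ` cannot be recovered from a bytes pattern that is decoded as latin-1, which is the objection raised elsewhere in this issue.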

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: Of course an inconvenience in my program is not per se the reason to change the language. I just wanted to motivate that the current situation gives unexpected results. "\xe9" doesn't look like proper utf-8 to me: ``` >>> "é".encode("latin-1") b'\xe9' >>>

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Ma Lin
Ma Lin added the comment: > a non-ascii group name will raise an error in bytes, even if encoded Looks like this is a language limitation: >>> b'é' File "<stdin>", line 1 SyntaxError: bytes can only contain ASCII literal characters. No problem if you use escaped character: >>>

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
Quentin Wenger added the comment: should *be a valid name

[issue40980] group names of bytes regexes are strings

2020-06-16 Thread Quentin Wenger
, with semi-dynamical group names. So it seems natural to have everything in bytes to concatenate the regular expression, incl. the group names. But then group names that I receive back are strings, so I cannot look them up directly into the set of group names that I used to create the expression

[issue40980] group names of bytes regexes are strings

2020-06-15 Thread Ma Lin
Ma Lin added the comment: Group name is `str` is very reasonable. Essentially it is just a name, it has nothing to do with `bytes`. Other names in Python are also `str` type, such as codec names, hashlib names. -- nosy: +Ma Lin

[issue40980] group names of bytes regexes are strings

2020-06-15 Thread Quentin Wenger
Quentin Wenger added the comment: This also affects functions/methods expecting a group name as parameter (e.g. match.group), the group name has to be passed as string.
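
A minimal sketch of the behaviour this issue is about. The pattern and the group name `word` are made up for illustration; an ASCII name is used to stay clear of the encoding question debated in the thread.

```python
import re

# Bytes pattern with an ASCII group name (hypothetical pattern).
m = re.match(rb"(?P<word>\w+)", b"hello world")

# The matched value is bytes, but the group name comes back as str.
groups = m.groupdict()
print(groups)

# Looking the group up also takes the str name.
print(m.group("word"))
```

Code that builds such patterns from bytes therefore has to convert names to `str` before comparing them with the keys of `groupdict()`.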

[issue40980] group names of bytes regexes are strings

2020-06-14 Thread Quentin Wenger
re are kind of two separate parts, cf. doc: > Both patterns and strings to be searched can be Unicode strings (str) as well > as 8-bit strings (bytes). However, Unicode strings and 8-bit strings cannot > be mixed: that is, you cannot match a Unicode string with a byte pattern or > vic

[issue40365] argparse: action "extend" with 1 parameter splits strings into characters

2020-06-09 Thread Jonathan Haigh
Jonathan Haigh added the comment: >> But I wonder, was this situation discussed in the original bug/issue? >Doesn't look like it: I was looking at the wrong PR link. This has more discussion: https://github.com/python/cpython/pull/13305. nargs is discussed but I'm not sure it was realized

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread miss-islington
miss-islington added the comment: New changeset 64409117361499058b1bf95e6efec31f7bb3c0d0 by Miss Islington (bot) in branch '3.9': bpo-40904: Fix segfault in the new parser with f-string containing yield statements with no value (GH-20701)

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread Pablo Galindo Salgado
Pablo Galindo Salgado added the comment: Thanks, Steve for the report! -- nosy: -miss-islington

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread miss-islington
Change by miss-islington : -- nosy: +miss-islington nosy_count: 4.0 -> 5.0 pull_requests: +19917 pull_request: https://github.com/python/cpython/pull/20702

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread Pablo Galindo Salgado
Change by Pablo Galindo Salgado : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread Pablo Galindo Salgado
Pablo Galindo Salgado added the comment: New changeset 972ab0327675e695373fc6272d5ac24e187579ad by Pablo Galindo in branch 'master': bpo-40904: Fix segfault in the new parser with f-string containing yield statements with no value (GH-20701)

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread Pablo Galindo Salgado
Change by Pablo Galindo Salgado : -- keywords: +patch pull_requests: +19916 stage: -> patch review pull_request: https://github.com/python/cpython/pull/20701

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread STINNER Victor
Change by STINNER Victor : -- keywords: +3.9regression versions: +Python 3.9

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread STINNER Victor
Change by STINNER Victor : -- nosy: +gvanrossum, lys.nikolaou, pablogsal

[issue40904] Segfault from new PEG parser handling yield withing f-strings

2020-06-07 Thread Steve Stagg
nents: Interpreter Core messages: 370923 nosy: stestagg priority: normal severity: normal status: open title: Segfault from new PEG parser handling yield withing f-strings type: crash versions: Python 3.10 ___ Python tracker <https://

[issue40643] Improve doc-strings for datetime.strftime & strptime

2020-06-07 Thread Edison Abahurire
Edison Abahurire added the comment: Update: I opened a PR for this.
