Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Tim Chase :

> On 2018-07-16 23:59, Marko Rauhamaa wrote:
>> Tim Chase :
>> > While the python world has moved its efforts into improving
>> > Python3, Python2 hasn't suddenly stopped working.  
>> 
>> The sword of Damocles is hanging on its head. Unless a consortium is
>> erected to support Python2, no vendor will be able to use it in the
>> medium term.
>
> Wait, but now you're talking about vendors. Much of the crux of this
> discussion has been about personal scripts that don't need to
> marshal Unicode strings in and out of various functions/objects.

In both personal and professional settings, you face the same issues.
But you don't want to build on something that will disappear from the
Linux distros.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread miss-islington


miss-islington  added the comment:


New changeset 8b5d191386350d28a0f20283dcb366cf50f82b97 by Miss Islington (bot) 
in branch '3.6':
bpo-33967: Fix wrong use of assertRaises (GH-8306)
https://github.com/python/cpython/commit/8b5d191386350d28a0f20283dcb366cf50f82b97


--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue32545] Unable to install Python 3.7.0a4 on Windows 10 - Error 0x80070643: Failed to install MSI package.

2018-07-16 Thread Amin Radjabov


Amin Radjabov  added the comment:

Yes, I tried to install for all users, but I have no other Python installations 
on my OS. I succeeded in installing it for just me.




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Steven D'Aprano :
> On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote:
>> UTF-8 bytes can only represent the first 128 code points of Unicode.
>
> This is DailyWTF material. Perhaps you want to rethink your wording
> and maybe even learn a bit more about Unicode and the UTF encodings
> before making such statements.
>
> The idea that UTF-8 bytes cannot represent the whole of Unicode is not
> even wrong. Of course a *single* byte cannot, but a single byte is not
> "UTF-8 bytes".

So I hope that by now you have understood my point and been able to
decide if you agree with it or not.


Marko


[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread miss-islington


miss-islington  added the comment:


New changeset 892df9d15aae08d4033faeb34698e3c550c85854 by Miss Islington (bot) 
in branch '3.7':
bpo-33967: Fix wrong use of assertRaises (GH-8306)
https://github.com/python/cpython/commit/892df9d15aae08d4033faeb34698e3c550c85854





[issue34134] multiprocessing memory huge usage

2018-07-16 Thread INADA Naoki


INADA Naoki  added the comment:

Did you try imap or imap_unordered?
They are intended for use with iterators, including generators.

--
nosy: +inada.naoki




[issue34135] The results of time.tzname print broken.

2018-07-16 Thread 김태환

New submission from 김태환 :

When I call time.tzname on Korean Windows (Microsoft Windows 10 Pro, 10.0.17134 
Build 17134), it prints the output below.

This problem occurs in both Python 2 and 3.
>>> import time
>>> time.tzname
('´ëÇѹα¹ Ç¥ÁؽÃ', '´ëÇѹα¹ Àϱ¤ Àý¾à ½Ã°£')

I used chardet to get the correct tzname.
>>> import chardet
>>> tzname = [tzn.encode('latin-1').decode('cp949') for tzn in time.tzname]
>>> tzname
['대한민국 표준시', '대한민국 일광 절약 시간']

I think the cause of this problem is that tzname is encoded as 'latin-1' on Windows.
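
The round trip in the report can be reproduced without a Korean Windows machine. What follows is a minimal sketch, assuming the garbling comes from cp949 bytes being read as latin-1; the helper name `fix_mojibake` is hypothetical and is not part of the time module.

```python
def fix_mojibake(text, wrong="latin-1", right="cp949"):
    """Re-encode with the codec that was wrongly applied, then decode properly."""
    return text.encode(wrong).decode(right)

original = "대한민국 표준시"
# Simulate the suspected fault: cp949 bytes decoded as latin-1.
garbled = original.encode("cp949").decode("latin-1")
repaired = fix_mojibake(garbled)
print(repaired == original)
```

This only works because latin-1 maps every byte value to a character, so no information is lost in the wrong decode.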

--
components: Windows
messages: 321790
nosy: paul.moore, steve.dower, tim.golden, zach.ware, 김태환
priority: normal
severity: normal
status: open
title: The results of time.tzname print broken.
type: behavior
versions: Python 2.7, Python 3.5




[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7844




[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7843




[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread INADA Naoki


INADA Naoki  added the comment:


New changeset 56d8f57b83a37b05a6f2fbc3e141bbc1ba6cb3a2 by INADA Naoki in branch 
'master':
bpo-33967: Fix wrong use of assertRaises (GH-8306)
https://github.com/python/cpython/commit/56d8f57b83a37b05a6f2fbc3e141bbc1ba6cb3a2





[issue34134] multiprocessing memory huge usage

2018-07-16 Thread Windson Yang


New submission from Windson Yang :

I'm using macOS and I get huge memory usage when using a generator with 
multiprocessing. (see file) 

I think this is because 
(https://github.com/python/cpython/blob/master/Lib/multiprocessing/pool.py#L383)

        if not hasattr(iterable, '__len__'):
            iterable = list(iterable)

        if chunksize is None:
            chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
            if extra:
                chunksize += 1

When we convert an iterable to list(iterable), we lose the advantage of using 
the generator. I'm not sure how to fix it; maybe we can set a default value for 
an object that doesn't have a '__len__' attr. Any ideas?
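
One way to picture a default that needs no len() is to cut the iterable into fixed-size chunks lazily. This is only a sketch of the idea, not the code in pool.py; `lazy_chunks` and the chunk size of 2 are hypothetical names and values chosen for illustration.

```python
from itertools import islice

def lazy_chunks(iterable, size):
    """Yield tuples of up to `size` items without materialising the input."""
    it = iter(iterable)
    while True:
        chunk = tuple(islice(it, size))
        if not chunk:
            return
        yield chunk

numbers = (n * n for n in range(5))   # a generator: it has no __len__
print(hasattr(numbers, "__len__"))
for chunk in lazy_chunks(numbers, 2):
    print(chunk)
```

Only one chunk is held in memory at a time, which is the property that list(iterable) gives up.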

--
files: test.py
messages: 321788
nosy: Windson Yang, zach.ware
priority: normal
severity: normal
status: open
title: multiprocessing memory huge usage
type: resource usage
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8
Added file: https://bugs.python.org/file47698/test.py




Re: can't install/run pip (Latest version of Python)

2018-07-16 Thread Gene Heskett
On Monday 16 July 2018 23:06:19 S Lea wrote:

> 'pip' not recognized as internal or external command, operable program
> or batch.
>
> And for some reason it's a 32 bit version
>
Huh? My ancient wet ram memory is probably out to lunch, but ISTR reading 
about someone else with the same problem at least a year ago on this 
list, unfortunately I have a 90 day expiry setup on this list so 
searching is just an exercise. My email corpus is nearly 20 gigabytes 
now.

pip has been around since it seems forever, but I would have thought by 
now it would have been rebuilt for 64 bit systems.  And that leads to a 
question: where did you get it, and how long ago?  Maybe it's an old 
version?  Head scratcher for sure.

> On Mon, Jul 16, 2018 at 8:03 PM, S Lea  wrote:
> > Thank you for reaching out.
> >
> > 1) Don't know what do you mean by the traceback.

What you see in the terminal screen you ran it in. You should be able to 
press the left mouse button and wipe the mouse from beginning to end so 
it's all highlighted, then position the cursor in an email and press the 
middle mouse button, which should copy the highlighted text into the email.

> > 2) In DOS, pip install pandas
> > 3) Yes, in DOS, Win 10
> > 4) 3.7
> > 5) Not getting much info
> >
> > On Sun, Jul 15, 2018 at 5:44 PM, boB Stepp wrote:
> >> On Sun, Jul 15, 2018 at 7:34 PM S Lea  wrote:
> >> >  I can't seem to install the pips, DOS gives me the syntex i
> >> > invalid, any thoughts?
> >>
> >> You provide insufficient information for anyone to be able to help
> >> you.
> >>
> >> 1) Copy and paste the entire traceback into a plain text email, no
> >> screen shots please.
> >> 2) Copy and paste exactly what you typed that generated the syntax
> >> error.
> >> 3) Did you do all of this in cmd.exe in what version of
> >> Windows?  Or did you use something else?  Hopefully not in the
> >> interactive Python interpreter or IDLE!
> >> 4) What Python version are you using?
> >> 5) Did you try searching for the exact error message and see what
> >> answers might already be out there?
> >>
> >> If you provide these things, or, better, search and find the answer
> >> yourself, I'm sure someone will be happy to assist.
> >>
> >> Good luck!
> >> --
> >> boB
> >> --
> >> https://mail.python.org/mailman/listinfo/python-list



-- 
Cheers, Gene Heskett
--
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
Genes Web page 


[issue34132] Obscure netrc parser "bug"

2018-07-16 Thread Xiang Zhang


Change by Xiang Zhang :


--
nosy: +xiang.zhang




Re: can't install/run pip (Latest version of Python)

2018-07-16 Thread S Lea
of python 3.7

On Mon, Jul 16, 2018 at 8:06 PM, S Lea  wrote:

> 'pip' not recognized as internal or external command, operable program or
> batch.
>
> And for some reason it's a 32 bit version
>
> On Mon, Jul 16, 2018 at 8:03 PM, S Lea  wrote:
>
>> Thank you for reaching out.
>>
>> 1) Don't know what do you mean by the traceback.
>> 2) In DOS, pip install pandas
>> 3) Yes, in DOS, Win 10
>> 4) 3.7
>> 5) Not getting much info
>>
>> On Sun, Jul 15, 2018 at 5:44 PM, boB Stepp 
>> wrote:
>>
>>> On Sun, Jul 15, 2018 at 7:34 PM S Lea  wrote:
>>> >
>>> >  I can't seem to install the pips, DOS gives me the syntex i invalid,
>>> any
>>> > thoughts?
>>>
>>> You provide insufficient information for anyone to be able to help you.
>>>
>>> 1) Copy and paste the entire traceback into a plain text email, no
>>> screen shots please.
>>> 2) Copy and paste exactly what you typed that generated the syntax error.
>>> 3) Did you do all of this in cmd.exe in what version of Windows?  Or
>>> did you use something else?  Hopefully not in the interactive Python
>>> interpreter or IDLE!
>>> 4) What Python version are you using?
>>> 5) Did you try searching for the exact error message and see what
>>> answers might already be out there?
>>>
>>> If you provide these things, or, better, search and find the answer
>>> yourself, I'm sure someone will be happy to assist.
>>>
>>> Good luck!
>>> --
>>> boB
>>> --
>>> https://mail.python.org/mailman/listinfo/python-list
>>>
>>
>>
>


Re: can't install/run pip (Latest version of Python)

2018-07-16 Thread S Lea
Also, I can't find the location of the Python installation; it refers
to C:\Users\Precision\PycharmProjects\my first project from
video\venv\Scripts
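
If the interpreter itself can be started (for example from inside PyCharm), it can report where it lives and whether it is a 32-bit build. A small sketch using only the standard library; the variable name `bits` is chosen here for illustration.

```python
import struct
import sys

bits = struct.calcsize("P") * 8   # pointer size in bits: 32 or 64

print(sys.executable)             # full path of the running interpreter
print(sys.version)                # version string, including build details
print(bits)
```

Once the path is known, running `python -m pip install pandas` with that same interpreter avoids relying on `pip` being on PATH.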

On Mon, Jul 16, 2018 at 8:06 PM, S Lea  wrote:

> of python 3.7
>
> On Mon, Jul 16, 2018 at 8:06 PM, S Lea  wrote:
>
>> 'pip' not recognized as internal or external command, operable program or
>> batch.
>>
>> And for some reason it's a 32 bit version
>>
>> On Mon, Jul 16, 2018 at 8:03 PM, S Lea  wrote:
>>
>>> Thank you for reaching out.
>>>
>>> 1) Don't know what do you mean by the traceback.
>>> 2) In DOS, pip install pandas
>>> 3) Yes, in DOS, Win 10
>>> 4) 3.7
>>> 5) Not getting much info
>>>
>>> On Sun, Jul 15, 2018 at 5:44 PM, boB Stepp 
>>> wrote:
>>>
>>>> On Sun, Jul 15, 2018 at 7:34 PM S Lea  wrote:
>>>> >
>>>> >  I can't seem to install the pips, DOS gives me the syntex i invalid,
>>>> any
>>>> > thoughts?
>>>>
>>>> You provide insufficient information for anyone to be able to help you.
>>>>
>>>> 1) Copy and paste the entire traceback into a plain text email, no
>>>> screen shots please.
>>>> 2) Copy and paste exactly what you typed that generated the syntax
>>>> error.
>>>> 3) Did you do all of this in cmd.exe in what version of Windows?  Or
>>>> did you use something else?  Hopefully not in the interactive Python
>>>> interpreter or IDLE!
>>>> 4) What Python version are you using?
>>>> 5) Did you try searching for the exact error message and see what
>>>> answers might already be out there?
>>>>
>>>> If you provide these things, or, better, search and find the answer
>>>> yourself, I'm sure someone will be happy to assist.
>>>>
>>>> Good luck!
>>>> --
>>>> boB
>>>> --
>>>> https://mail.python.org/mailman/listinfo/python-list
>>>>
>>>
>>>
>>
>


Re: can't install/run pip (Latest version of Python)

2018-07-16 Thread S Lea
'pip' not recognized as internal or external command, operable program or
batch.

And for some reason it's a 32 bit version

On Mon, Jul 16, 2018 at 8:03 PM, S Lea  wrote:

> Thank you for reaching out.
>
> 1) Don't know what do you mean by the traceback.
> 2) In DOS, pip install pandas
> 3) Yes, in DOS, Win 10
> 4) 3.7
> 5) Not getting much info
>
> On Sun, Jul 15, 2018 at 5:44 PM, boB Stepp  wrote:
>
>> On Sun, Jul 15, 2018 at 7:34 PM S Lea  wrote:
>> >
>> >  I can't seem to install the pips, DOS gives me the syntex i invalid,
>> any
>> > thoughts?
>>
>> You provide insufficient information for anyone to be able to help you.
>>
>> 1) Copy and paste the entire traceback into a plain text email, no
>> screen shots please.
>> 2) Copy and paste exactly what you typed that generated the syntax error.
>> 3) Did you do all of this in cmd.exe in what version of Windows?  Or
>> did you use something else?  Hopefully not in the interactive Python
>> interpreter or IDLE!
>> 4) What Python version are you using?
>> 5) Did you try searching for the exact error message and see what
>> answers might already be out there?
>>
>> If you provide these things, or, better, search and find the answer
>> yourself, I'm sure someone will be happy to assist.
>>
>> Good luck!
>> --
>> boB
>> --
>> https://mail.python.org/mailman/listinfo/python-list
>>
>
>


[issue34132] Obscure netrc parser "bug"

2018-07-16 Thread bbayles


Change by bbayles :


--
nosy: +bbayles




Re: can't install/run pip (Latest version of Python)

2018-07-16 Thread S Lea
Thank you for reaching out.

1) Don't know what do you mean by the traceback.
2) In DOS, pip install pandas
3) Yes, in DOS, Win 10
4) 3.7
5) Not getting much info

On Sun, Jul 15, 2018 at 5:44 PM, boB Stepp  wrote:

> On Sun, Jul 15, 2018 at 7:34 PM S Lea  wrote:
> >
> >  I can't seem to install the pips, DOS gives me the syntex i invalid, any
> > thoughts?
>
> You provide insufficient information for anyone to be able to help you.
>
> 1) Copy and paste the entire traceback into a plain text email, no
> screen shots please.
> 2) Copy and paste exactly what you typed that generated the syntax error.
> 3) Did you do all of this in cmd.exe in what version of Windows?  Or
> did you use something else?  Hopefully not in the interactive Python
> interpreter or IDLE!
> 4) What Python version are you using?
> 5) Did you try searching for the exact error message and see what
> answers might already be out there?
>
> If you provide these things, or, better, search and find the answer
> yourself, I'm sure someone will be happy to assist.
>
> Good luck!
> --
> boB
> --
> https://mail.python.org/mailman/listinfo/python-list
>


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-17 01:21, Steven D'Aprano wrote:
> > This doesn’t mean that UTF-32 is an awful system, just that it
> > isn’t the magical cure that some were hoping for.  
> 
> Nobody ever claimed it was, except for the people railing that
> since it isn't a magical system we ought to go back to the Good
> Old Days of code page hell, or even further back when everyone just
> used ASCII.

But even ed(1) on most systems is 8-bit clean so even there you're not
limited to ASCII.  I can't say I miss code-pages in the least.

-tkc





Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-17 01:08, Steven D'Aprano wrote:
> In English, I think most people would prefer to use a different
> term for whatever "sh" and "ch" represent than "character".

The term you may be reaching for is "consonant cluster"?

https://en.wikipedia.org/wiki/Consonant_cluster

-tkc





Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Python
On Mon, Jul 16, 2018 at 08:56:11PM +0100, Rhodri James wrote:
> The problem everyone is having with you, Marko, is that you are
> using the terminology incorrectly. [...] When you call UTF-32 a
> variable-width encoding, you are incorrect.

But please don't overlook that the "terminology" is in fact rather
specialized jargon, far less common than even most computer jargon.
Unless you're uncommonly familiar with the subject matter, you simply
don't have this vocabulary.  Under the circumstances it seems not
horribly unreasonable to expect such a person to consider the bytes
required to represent a glyph as an encoding's width, and you as
"experts" rightly should expect, let's call them lay people, to make
this mistake and adjust for it, or politely correct it, without the
condescension.

> You are of course welcome to use whatever terminology you personally
> like, like Humpty Dumpty.  However when you point to a duck and say
> "That's a gnu," people are likely to stop taking you seriously.

Shouldn't experts "be generous in what they accept, but conservative
in what they emit?"  If your goal here is to educate, and come to a
common understanding, rather than to simply prove how superior (the
generic) you are, then perhaps both you and the community would be
better served if you strived to understand Marko's points, rather than
just point out how horribly wrong he is?  The tone here is often
extremely adversarial, which I think mostly serves to incite others to
respond adversarially.  I certainly know I've fallen into that trap
more than once, myself.

I work primarily in Unix environments, and I daresay the way Unix
treats text as bytes--barring certain very specialized applications,
which require knowledge of what bytes correspond to what units of
linguistic representations, like reversing strings (which FWIW I've
never found a use for, other than academic ones)--works just fine.
You can--and I do (or have, at least)--write non-ASCII unicode strings
as bytes in your Python-2.7 code, or read them from a file, or
whatever other input your program desires, and send them to whatever
terminal or GUI program you want to, and they will appear as they
should to the user, provided the system is configured appropriately
(which these days mostly means configured to use UTF-8, and which
these days is generally the case).  

It's reasonable to assume users either know what encoding their
systems are using, or don't have a clue but won't change it, so it
will always be "right."  And if the system is configured correctly,
and you sensibly used UTF-8 encoded byte strings in your program, but
the system is configured in some other encoding, it's a fairly trivial
matter to use iconv to convert to the system's encoding (which I have
also done, but perhaps not in Python--I can't recall), assuming the
data can be converted (and if not you're kinda screwed anyway).  In
the overwhelming majority of cases, this gets you everything you need,
and the language internally understanding Unicode (especially if that
understanding requires more work from the programmer to deal with it)
mostly gets you very little.  Yes, of course there are specific
applications for which that intelligence is necessary, and in those
cases it should be made use of.  The rest of the time--the
overwhelming majority of the time--it's just superfluous complexity.
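
The iconv step described above has a direct counterpart in Python. A minimal sketch, assuming the data really is valid in the source encoding and representable in the target; `recode` is a hypothetical helper name, and latin-1 stands in for whatever the system's encoding happens to be.

```python
def recode(data, source="utf-8", target="latin-1"):
    """Convert encoded bytes from one encoding to another, like a tiny iconv."""
    return data.decode(source).encode(target)

utf8_bytes = b"caf\xc3\xa9"      # "café" as UTF-8 bytes
print(recode(utf8_bytes))
```

If the text cannot be represented in the target encoding, the encode step raises UnicodeEncodeError, which is the "kinda screwed anyway" case.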

So, sure, in uncommon cases knowing about Unicode may reduce (but not
eliminate) complications dealing with different languages, but in the
common cases it may only serve to make more work for the programmer.
I don't know about you, but I prefer to do less, if less is required.
If these features exist because Windows needs them in order to
reliably get the common cases right, then maybe, just maybe, Unix
really did get it right after all.





[issue34123] ambiguous documentation for dict.popitem

2018-07-16 Thread Raymond Hettinger


Change by Raymond Hettinger :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed




[issue34123] ambiguous documentation for dict.popitem

2018-07-16 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset bfa8a358e2cec40484c4655138ca3c6b10f8462a by Raymond Hettinger 
(Miss Islington (bot)) in branch '3.7':
bpo-34123: Fix missed documentation update for dict.popitem(). (GH-8292) 
(GH#8307)
https://github.com/python/cpython/commit/bfa8a358e2cec40484c4655138ca3c6b10f8462a





[issue34118] Fix some class entries in 'Built-in Functions'

2018-07-16 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

SO user abarnert, who I presume is bpo abarnert (Andrew Barnert) claims that 
"Create a new dictionary. The dict object is the dictionary class." sounds a 
bit like dict returns the dictionary class.  It is different from "Return a new 
set object, ... . set is a built-in class."  I like the latter better and will 
use it as the pattern.




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Richard Damon
> On Jul 16, 2018, at 9:21 PM, Steven D'Aprano 
>  wrote:
> 
>> On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote:
>> 
>> You are defining a variable/fixed width codepoint set. Many others want
>> to deal with CHARACTER sets.
> 
> Good luck coming up with a universal, objective, language-neutral, 
> consistent definition for a character.
> 
Who says there needs to be one? A good engineer will use the definition that is 
most appropriate to the task at hand. Some things need very solid definitions, 
and some things don’t. 

This goes back to my original point, where I said some people consider UTF-32 
as a variable width encoding. For very many things, practically, the 
‘codepoint’ isn’t the important thing, so the fact that every UTF-32 code point 
takes the same number of bytes or code words isn’t that important. They are 
dealing with something that needs to be rendered, and preserving larger units, 
like the grapheme, is important.
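
The point about larger units can be seen with a single accented letter. A short sketch using the standard unicodedata module; the choice of "â" as the sample is an assumption made for illustration.

```python
import unicodedata

composed = unicodedata.normalize("NFC", "a\u0302")    # one code point, U+00E2
decomposed = unicodedata.normalize("NFD", "\u00e2")   # 'a' plus U+0302 combining circumflex

# Same rendered letter, different numbers of code points.
print(len(composed), len(decomposed))
# UTF-32 is fixed width per code point, so the byte counts differ too.
print(len(composed.encode("utf-32-le")), len(decomposed.encode("utf-32-le")))
```

Both strings display as one letter, yet only the composed form fits in a single UTF-32 code unit.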

> 
>> This doesn’t mean that UTF-32 is an awful system, just that it isn’t the
>> magical cure that some were hoping for.
> 
> Nobody ever claimed it was, except for the people railing that since it 
> isn't a magical system we ought to go back to the Good Old Days of code 
> page hell, or even further back when everyone just used ASCII.
> 
Sometimes ASCII is good enough, especially on a small machine with limited 
resources. Sometimes you do need to use a ‘Code Page’ because of limited 
resources (and that unit will only be able to talk a single language because of 
that too). Sometimes you have the luxury of being able to use a somewhat 
complete Unicode implementation. Sometimes you are never going to be displaying 
anything, and you can mostly just treat everything as a bag of bytes. You use 
the tool that is right for the job.

> -- 
> Steven D'Aprano
> "Ever since I learned about confirmation bias, I've been seeing
> it everywhere." -- Jon Ronson
> 
> -- 
> https://mail.python.org/mailman/listinfo/python-list



[issue31342] test.bisect module causes tests to fail

2018-07-16 Thread Neil Schemenauer


Neil Schemenauer  added the comment:

Yes, it looks like the same issue as bpo-29512.  Renaming test.bisect is the 
simplest solution.  I have trained myself to run "python -m test.regrtest " so this issue doesn't affect me any more.  However, I think it was a 
trap that will catch some people.  So, thanks for fixing.

I considered adding a 'bisect' command to the test/__main__, e.g. you could run 
'python -m test --bisect ..'.  That looks not entirely simple to implement 
though.

--
resolution:  -> duplicate
stage: needs patch -> resolved
status: open -> closed




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 22:51:32 +0300, Marko Rauhamaa wrote:

> All UTF-8. No unicode strings.

That just means you are re-implementing the bits of Unicode you care 
about (which may be "nothing at all") as UTF-8. If your application is 
nothing but middleware squirting bytes from one layer to another layer, 
that might be all you need care about.

But then you're not processing text in your application, and why should 
your experience in not-processing-text be given any weight over the 
experiences of those who do process text?


And later, in another post:

> UTF-8 bytes can only represent the first 128 code points of Unicode.

This is DailyWTF material. Perhaps you want to rethink your wording and 
maybe even learn a bit more about Unicode and the UTF encodings before 
making such statements.

The idea that UTF-8 bytes cannot represent the whole of Unicode is not 
even wrong. Of course a *single* byte cannot, but a single byte is not 
"UTF-8 bytes".
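
A quick check makes the same point: code points well past the first 128 survive a round trip through UTF-8. This is a minimal sketch; `utf8_len` is a hypothetical helper, and the sample code points are chosen at the boundaries where the byte count changes.

```python
def utf8_len(code_point):
    """Return how many bytes UTF-8 uses for one code point, checking the round trip."""
    ch = chr(code_point)
    encoded = ch.encode("utf-8")
    assert encoded.decode("utf-8") == ch
    return len(encoded)

for cp in (0x7F, 0x80, 0x7FF, 0x800, 0xFFFF, 0x10000, 0x10FFFF):
    print("U+%06X -> %d byte(s)" % (cp, utf8_len(cp)))
```

The surrogate range U+D800 to U+DFFF is left out on purpose, since those are not encodable as standalone characters.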


-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 15:28:51 -0400, Terry Reedy wrote:

> On 7/16/2018 1:11 PM, Richard Damon wrote:
> 
>> Many consider that UTF-32 is a variable-width encoding because of the
>> combining characters. It can take multiple ‘codepoints’ to define what
>> should be a single ‘character’ for display.
> 
> I hope you realize that this is not the standard meaning of
> 'variable-width encoding', which is 'variable number of bytes for a
> codepoint'.

A minor correction Terry: it is the number of code units, not bytes.

UTF-8 uses 1-byte code units, and from 1 to 4 code units per code point;

UTF-16 uses 2-byte code units (a 16-bit word), and 1 or 2 words per code 
point;

UTF-32 uses 4-byte code units (a 32-bit word), and only ever a single 
code unit for every code point.
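
The three cases can be counted directly. A small sketch, assuming the little-endian codec variants so that no byte order mark is added to the output; `code_units` is a hypothetical helper name.

```python
def code_units(ch):
    """Return (UTF-8, UTF-16, UTF-32) code-unit counts for one code point."""
    return (
        len(ch.encode("utf-8")),            # 1-byte code units
        len(ch.encode("utf-16-le")) // 2,   # 2-byte code units
        len(ch.encode("utf-32-le")) // 4,   # 4-byte code units
    )

for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
    print("U+%04X" % ord(ch), code_units(ch))
```

Only the last sample, which lies outside the Basic Multilingual Plane, needs two UTF-16 code units.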



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 19:02:36 -0400, Richard Damon wrote:

> You are defining a variable/fixed width codepoint set. Many others want
> to deal with CHARACTER sets.

Good luck coming up with a universal, objective, language-neutral, 
consistent definition for a character.


> This doesn’t mean that UTF-32 is an awful system, just that it isn’t the
> magical cure that some were hoping for.

Nobody ever claimed it was, except for the people railing that since it 
isn't a magical system we ought to go back to the Good Old Days of code 
page hell, or even further back when everyone just used ASCII.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 23:50:12 +0200, Roel Schroeven wrote:

> There are times (encoding/decoding network protocols and other data
> formats) when I have a byte string and I want/need to process it like
> Python 2 does, and that is the one area where I feel Python 3 makes
> things a bit more difficult.

Ah yes, the unfortunate design error that iterating over byte-strings 
returns ints rather than single-byte strings.

That decision seemed to make sense at the time it was made, but turned 
out to be an annoyance. It's a wart on Python 3, but fortunately one 
which is fairly easily dealt with by a helper function.

That *is* a nice example of where byte strings in Python 3 aren't as nice 
as in Python 2.
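
The post does not show the helper it has in mind. One possible version, offered as a sketch with the hypothetical name `iterbytes`, relies on the fact that slicing a bytes object returns bytes while indexing returns an int.

```python
def iterbytes(data):
    """Yield each byte of `data` as a length-1 bytes object."""
    for i in range(len(data)):
        yield data[i:i + 1]

payload = b"\x01\x02abc"
print(list(payload))              # plain iteration gives ints in Python 3
print(list(iterbytes(payload)))   # the helper gives single-byte strings
```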



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Python 4000 was Re: [SUSPICIOUS MESSAGE] Re: Cult-like behaviour]

2018-07-16 Thread MRAB

On 2018-07-17 01:25, Steven D'Aprano wrote:
> On Mon, 16 Jul 2018 15:09:16 -0400, Terry Reedy wrote:
>
>> On 7/16/2018 11:50 AM, Dennis Lee Bieber wrote:
>>
>>> For Python 4000 maybe
>>
>> Please don't give people the idea that there is any current intention to
>> have a 'Python 4000' similar to 'Python 3000'.  Call it 'a mythical
>> Python 4000', if you must use such a term.
>
> I prefer to say Python 5000, to make it even more clear that should such
> a thing happen again, it will be a *REALLY* long time from now.

I think that Python 5000 would be more to do with an internal change, 
such as going GIL-less, than a change in, say, syntax.



Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-16 23:59, Marko Rauhamaa wrote:
> Tim Chase :
> > While the python world has moved its efforts into improving
> > Python3, Python2 hasn't suddenly stopped working.  
> 
> The sword of Damocles is hanging on its head. Unless a consortium is
> erected to support Python2, no vendor will be able to use it in the
> medium term.

Wait, but now you're talking about vendors. Much of the crux of this
discussion has been about personal scripts that don't need to
marshal Unicode strings in and out of various functions/objects.

If you have a py2 script that works with py2 and breaks with py3, and
you don't want to update to py3 unicode-strings-by-default, then
stick with py2.  They even coexist nicely on the same machine.

It doesn't have a self-destruct clause.  As long as py2 continues to
build, it will continue to run which is a long lifetime.  To point,
I still have the "joy" of maintaining some py2.4 code that's in
production.  Would I rather upgrade it to 3.x?  You bet.  But the
powers in place are willing to forego python updates in order to not
rock the boat.

-tkc




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Tue, 17 Jul 2018 06:15:25 +1000, Chris Angelico wrote:

> On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano
>  wrote:
>> There is nothing special about diacritics such that we ought to treat
>> some combinations like "Ch" (two code points = one character) as "fixed
>> width" while others like "â" (two code points = one character) as
>> "variable width".
> 
> When you reverse a word, do you treat "ch" and "sh" as one character or
> two? 

In English, "ch" is always two letters of the alphabet. In Welsh and 
Czech, they can be one or two letters. (I think they will be two letters 
only in loan words, but I'm not certain about that.) Whether that makes 
them one or two characters depends on how you define "character".

Good luck with finding a universal, objective, unambiguous definition.


> I'm of the opinion that they're single characters, and thus this
> should be "dalokosh":
> 
> https://wiki.teamfortress.com/wiki/Dalokohs_Bar
> 
> (It's the Russian for "chocolate" - "шоколад" - transliterated to
> English/Latin - "šokolad" or "shokolad" - and then reversed.)

In English, I think most people would prefer to use a different term for 
whatever "sh" and "ch" represent than "character". But you make a good 
point that even in English, we sometimes want to treat two letter 
combinations as a single unit.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Unicode is not UTF-32 [was Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 22:40:13 +0300, Marko Rauhamaa wrote:

> Terry Reedy :
> 
>> On 7/15/2018 5:28 PM, Marko Rauhamaa wrote:
>>> if your new system used Python3's UTF-32 strings as a foundation,
>>
>> Since 3.3, Python's strings are not (always) UTF-32 strings.
> 
> You are right. Python's strings are a superset of UTF-32. More
> accurately, Python's strings are UTF-32 plus surrogate characters.

The first thing you are doing wrong is conflating the semantics of the 
data type with one possible implementation of that data type. UTF-32 is 
implementation, not semantics: it specifies how to represent Unicode code 
points as bytes in memory, not what Unicode code points are.

Python 3 strings are sequences of abstract characters ("code points") 
with no mandatory implementation. In CPython, some string objects are 
encoded in Latin-1. Some are encoded in UTF-16. Some are encoded in 
UTF-32. Some implementations (MicroPython) use UTF-8.
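The point about representation being an implementation detail can be seen from inside Python. The following is a minimal sketch, assuming CPython 3.3 or later (other implementations may report different sizes, and the exact byte counts are not guaranteed): the three strings have the same length in code points, but CPython stores them with wider units as the widest code point grows.

```python
import sys

# Three strings of equal length whose widest code point differs.
ascii_text = "a" * 1000            # fits in 1 byte per code point
bmp_text = "\u20ac" * 1000         # EURO SIGN, needs 2 bytes per code point
astral_text = "\U0001F600" * 1000  # outside the BMP, needs 4 bytes per code point

lengths = [len(s) for s in (ascii_text, bmp_text, astral_text)]
sizes = [sys.getsizeof(s) for s in (ascii_text, bmp_text, astral_text)]

print(lengths)  # the same length three times: len() counts code points
print(sizes)    # on CPython, increasing memory use
```

Nothing about indexing, slicing, or len() changes between the three; only the memory footprint does.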

Your second error is a more minor point: it isn't clear (at least not to 
me) that "Unicode plus surrogates" is a superset of Unicode. Surrogates 
are part of Unicode. The only extension here is that Python strings are 
not necessarily well-formed surrogate-free Unicode strings, but they're 
still Unicode strings.
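A short sketch of what "not necessarily well-formed" means in practice: a Python str may hold a lone surrogate, which the strict UTF-8 codec refuses to encode. The 'surrogatepass' error handler used here is part of the standard codecs machinery; the variable names are only for illustration.

```python
lone = "\ud800"  # a lone high surrogate is a legal element of a Python str

try:
    lone.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# The 'surrogatepass' error handler lets the code point through anyway.
raw = lone.encode("utf-8", "surrogatepass")
round_trip = raw.decode("utf-8", "surrogatepass")

print(len(lone), strict_ok, raw, round_trip == lone)
```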


>> Nor are they always UCS-2 (or partly UTF-16) strings. Nor are they
>> always Latin-1 or Ascii strings. Python's Flexible String
>> Representation uses the narrowest possible internal code for any
>> particular string. This is all transparent to the user except for
>> memory size.
> 
> How CPython chooses to represent its strings internally is not what I'm
> talking about.

Then why do you repeatedly talk about the internal storage representation?

UTF-32 is not a character set, it is an encoding. It specifies how to 
implement a sequence of Unicode abstract characters.


>>> UTF-32, after all, is a variable-width encoding.
>>
>> Nope.  It is a fixed-width (32 bits, 4 bytes) encoding.
>>
>> Perhaps you should ask more questions before pontificating.
> 
> You mean each code point is one code point wide. But that's rather an
> irrelevant thing to state.

No, he means that each code point is one code unit wide.
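To make "code units per code point" concrete, here is a small sketch using only the standard codecs. The sample characters are arbitrary choices, and the "-le" codec variants are used so that no byte order mark is added to the output.

```python
samples = ["A", "\u00e2", "\u20ac", "\U0001F600"]

def units(ch, encoding, unit_size):
    """Number of code units used to encode one code point."""
    return len(ch.encode(encoding)) // unit_size

utf32 = [units(ch, "utf-32-le", 4) for ch in samples]
utf16 = [units(ch, "utf-16-le", 2) for ch in samples]
utf8 = [units(ch, "utf-8", 1) for ch in samples]

print(utf32)  # always one 32-bit unit: fixed width
print(utf16)  # one or two 16-bit units: variable width
print(utf8)   # one to four 8-bit units: variable width
```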


> The main point is that UTF-32 (aka Unicode)

UTF-32 is not a synonym for Unicode. Many legacy encodings don't 
distinguish between the character set and the mapping between bytes and 
characters, but Unicode is not one of those.


> uses one or more code points to represent what people would consider an
> individual character.

That's a reasonable observation to make. But that's not what fixed- and 
variable-width refer to.

ASCII does the same, and in both cases it is irrelevant, since the term of art 
is to define fixed- and variable-width in terms of *code points* not 
human meaningful characters. "Character" is context- and language-
dependent and frequently ambiguous. "LL" or "CH" (for example) could be a 
single character or a double character, depending on context and language.

Even in ASCII English, something as large as "ough" might be considered 
to be a single unit of language, which some people might choose to call a 
character. (But not a single letter, naturally.) If you don't like that 
example, "qu" is probably a better one: aside from acronyms and loan 
words, no modern English word can fail to follow a Q with a U.


> Code points are about as interesting as individual bytes in UTF-8.

That's your opinion. I see no justification for it.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



[issue34133] ValueError should not be documented as being restricted to only "a built-in operation or function"

2018-07-16 Thread Nathaniel Manista


New submission from Nathaniel Manista :

The documentation for ValueError currently describes it as being "Raised when a 
built-in operation or function receives an argument that has the right type but 
an inappropriate value, and the situation is not described by a more precise 
exception such as IndexError.", but the Python community has (quite rightly!) 
adopted it as the exception to raise in any system when that system is passed a 
value for a parameter that is type-correct but of an invalid value.

(Because what, is every library going to present a "my_library.ValueError" 
exception instead? That would be ridiculous.)

ValueError's documentation should drop the "a built-in operation or function" 
wording.

Perhaps go with something like "When raised indicates that a function or method 
was passed a value of the correct type but an invalid value"?
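A sketch of the usage being described, in ordinary (non-built-in) code. The function set_volume and its 0 to 10 range are hypothetical, made up purely to illustrate the convention:

```python
def set_volume(level):
    """Hypothetical library function: accepts an int from 0 to 10."""
    if not isinstance(level, int):
        # Wrong type altogether.
        raise TypeError("level must be an int")
    if not 0 <= level <= 10:
        # Right type, but a value the function cannot accept.
        raise ValueError("level must be between 0 and 10, got %r" % level)
    return level
```

Here set_volume(11) raises ValueError even though nothing built-in is involved, which is the case the current wording leaves out.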

--
assignee: docs@python
components: Documentation
messages: 321784
nosy: Nathaniel Manista, docs@python
priority: normal
severity: normal
status: open
title: ValueError should not be documented as being restricted to only "a 
built-in operation or function"
type: enhancement
versions: Python 2.7, Python 3.4, Python 3.5, Python 3.6, Python 3.7, Python 3.8




Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Mark Lawrence

On 16/07/18 21:16, Rhodri James wrote:
> On 16/07/18 20:58, Terry Reedy wrote:
>> On 7/16/2018 1:27 PM, Jim Lee wrote:
>>> 90% of the world *is* "beneath my notice" when it comes to
>>> programming for myself.   I really don't care if that's not PC enough
>>> for you.
>>>
>>> Had you actually read my words with *intent* rather than *reaction*,
>>> you would notice that I suggested the *option* of turning off
>>> Unicode.  I didn't say get *rid* of Unicode.  I didn't say make it
>>> *harder* to use Unicode.  Once again - reaction rather than reading.
>>>
>>> Obviously, the most vocal representatives of the Python community are
>>> too sensitive about their language to enable rational discussion.
>>
>> My empirical observation is that the more abrasive posters get
>> rewarded with more response, while my attempts to engage in rational
>> discussion, without ad hominems, gets less.
>
> I wouldn't disagree with you.  Fortunately Jim has pulled the "storming
> off in a huff rather than answer a question anyone actually asked"
> defence, so we can go back to debating about important things like how
> to spell assignment expressions.
>
> Oh wait... :-)

Cheeky :)

--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



[issue34130] test_signal: test_warn_on_full_buffer() failed on AppVeyor

2018-07-16 Thread Nathaniel Smith


Nathaniel Smith  added the comment:

Huh, that's weird. My first thought was some kind of race condition, but... 
raise_signal uses raise(), which on Windows should be invoking the signal 
handler synchronously, so the warning should definitely be printed before 
raise_signal() returns.

Could the warning be trapped in some buffer? That would be weird too, usually 
stderr and warnings should not be buffered.

Of course putting a retry loop around the test is an option if we can't figure 
out how to fix it properly.




Python 4000 was Re: [SUSPICIOUS MESSAGE] Re: Cult-like behaviour]

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 15:09:16 -0400, Terry Reedy wrote:

> On 7/16/2018 11:50 AM, Dennis Lee Bieber wrote:
> 
>>  For Python 4000 maybe
> 
> Please don't give people the idea that there is any current intention to
> have a 'Python 4000' similar to 'Python 3000'.  Call it 'a mythical
> Python 4000', if you must use such a term.

I prefer to say Python 5000, to make it even more clear that should such 
a thing happen again, it will be a *REALLY* long time from now.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



Re: Users banned

2018-07-16 Thread Steven D'Aprano
On Mon, 16 Jul 2018 20:03:39 +0100, Steve Simmons wrote:

> +1  Seems to me Bart is being banned for "being a dick" and "talking
> rubbish" (my words/interpretation) with irritating persistence.

I know that when I first started here, I often talked rubbish. The 
difference is, I was willing to listen and consider when people gave 
alternate viewpoints. Eventually.

And I know that some people think that I'm sometimes still being a dick. 
They're wrong, I'm just charmingly forthright *wink*

Bart is often frustratingly resistant to reasonable argument, and has 
been obnoxious in his habit of turning virtually every conversation into 
an opportunity to make a dig at Python.

But neither of these is prohibited by the CoC, neither should be a 
banning offence, and even if they were, he should have had a formal 
warning first.

Preferably TWO formal warnings: the first privately, the second publicly, 
and only on the third offence a ban.

And I question the fairness of a six month ban, rather than (let's say) 
an initial one month ban.

As for banning Rick, when he isn't even posting at the moment, I don't 
even have words for that. There's no statute of limitation for murder, 
but surely "being obnoxious on the internet" ought to come with a fairly 
short period of forgiveness.



-- 
Steven D'Aprano
"Ever since I learned about confirmation bias, I've been seeing
it everywhere." -- Jon Ronson



[issue34123] ambiguous documentation for dict.popitem

2018-07-16 Thread Raymond Hettinger


Raymond Hettinger  added the comment:


New changeset 01b7d5898262dbe0e9edb321b3be9a34da196f6f by Raymond Hettinger in 
branch 'master':
bpo-34123: Fix missed documentation update for dict.popitem(). (GH-8292)
https://github.com/python/cpython/commit/01b7d5898262dbe0e9edb321b3be9a34da196f6f





[issue34123] ambiguous documentation for dict.popitem

2018-07-16 Thread miss-islington


Change by miss-islington :


--
pull_requests: +7842




[issue34130] test_signal: test_warn_on_full_buffer() failed on AppVeyor

2018-07-16 Thread STINNER Victor


STINNER Victor  added the comment:

Traceback (most recent call last):
  File "", line 39, in 
AssertionError

According to the traceback, the captured stderr ('err' variable) is an empty 
string.

The test uses test.support.captured_stderr() which replaces sys.stderr.

The signal module calls PySys_WriteStderr(msg) which calls 
sys.stderr.write(msg). If the Python call fails, msg is supposed to be written 
into the C stderr stream. sys.stderr.flush() is not called, but it shouldn't be 
needed, since test.support.captured_stderr() replaces sys.stderr with an 
io.StringIO object.
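The "no flush needed" reasoning can be checked in isolation. This sketch uses contextlib.redirect_stderr from the standard library as a stand-in for the test helper (an assumption made to keep the example self-contained; it is not the helper the test itself uses), and the message text is invented:

```python
import contextlib
import io
import sys

buffer = io.StringIO()
with contextlib.redirect_stderr(buffer):
    sys.stderr.write("Exception ignored (example text)\n")
    # No flush() call: an in-memory StringIO has nothing to flush to.
    seen_before_exit = buffer.getvalue()

captured = buffer.getvalue()
print(repr(seen_before_exit), repr(captured))
```

The written text is visible through getvalue() immediately, so a missing flush would not by itself explain an empty capture.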




[issue34130] test_signal: test_warn_on_full_buffer() failed on AppVeyor

2018-07-16 Thread STINNER Victor


STINNER Victor  added the comment:

The test failed at:

# Fill the send buffer
try:
    while True:
        write.send(b"x")
except BlockingIOError:
    pass

# By default, we get a warning when a signal arrives
signal.set_wakeup_fd(write.fileno())

with captured_stderr() as err:
    _testcapi.raise_signal(signum)

err = err.getvalue()
if ('Exception ignored when trying to {action} to the signal wakeup fd'
        not in err):
    raise AssertionError(err) # <~~ HERE




Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 9:18 AM, Dan Sommers  wrote:
> Quick:  how long is the byte array that displays as '\xff'?  Too easy?
> What about '\0xff' and '0\xff'?

1, 4, 2 bytes respectively. Yep, easy... but then, I'm used to reading
backslash escapes. Nothing to do with text vs bytes.
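For anyone who wants to check those three answers, a quick sketch using bytes literals (the variable names are only for illustration):

```python
one = b'\xff'    # a single byte with value 255
four = b'\0xff'  # NUL byte, then the ASCII characters 'x', 'f', 'f'
two = b'0\xff'   # the ASCII digit '0', then a byte with value 255

lengths = [len(one), len(four), len(two)]
print(lengths)
print(list(four))  # the individual byte values of the four-byte case
```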

DNS, of course, is different.

ChrisA


[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread INADA Naoki


Change by INADA Naoki :


--
pull_requests: +7840, 7841
stage: needs patch -> patch review




[issue33967] functools.singledispatch: Misleading exception when calling without arguments

2018-07-16 Thread INADA Naoki


Change by INADA Naoki :


--
pull_requests: +7840
stage: needs patch -> patch review




[issue34126] Profiling certain invalid calls crashes Python

2018-07-16 Thread ppperry


Change by ppperry :


--
title: Profiling certain invalid calls crash Python -> Profiling certain 
invalid calls crashes Python




[issue26270] Support for read()/write()/select() on asyncio

2018-07-16 Thread Guido van Rossum


Change by Guido van Rossum :


--
nosy:  -gvanrossum




Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Dan Sommers
On Tue, 17 Jul 2018 08:48:55 +1000, Chris Angelico wrote:

> That said, though, the fact that indexing a byte string yields an int
> instead of a one-byte string is basically unable to be changed now ...

Agreed.

> ... and IMO it'd be better to be consistent with text strings than
> with bytearray ...

Disagreed.  Given an arbitrary byte string, you can't know whether it's
semantically text or semantically an array of bytes.  (Sometimes a good
byte array is just a byte array?)

In the past, I've done plenty of work with "strings" (in the generic
sense) of octets to/from wire-level protocols.  It would have been much
easier had Python *not* tried to pretend they were text, and *not*
rendered some of the bytes as their ASCII equivalent and some of the
bytes as hex escapes (especially in the cases that some of the bytes
happened to be 0x58, 0x78, 0x5c, or in range(0x30, 0x3a)).
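A small sketch of the mixed rendering being described, next to two uniform alternatives that Python 3 offers (bytes.hex() needs 3.5 or later). The payload is just the troublesome byte values named above plus two non-printable ones:

```python
payload = bytes([0x58, 0x78, 0x5c, 0x30, 0x00, 0xff])

mixed = repr(payload)    # printable bytes shown as ASCII, the rest escaped
as_hex = payload.hex()   # every byte shown as two hex digits
as_ints = list(payload)  # every byte shown as an integer

print(mixed)
print(as_hex)
print(as_ints)
```

The repr is where a literal 'x', a backslash, and a digit sit right next to real hex escapes; the other two forms avoid that ambiguity.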

> ... I'm not sure how many of the core devs agree that b'spam'[1] ought
> to be b'p' rather than 112, but I'd say they all agree that it's too
> late to change it.

Curmudgeonly C programmer that I am, b'p' *is* 112.  ;-)

Quick:  how long is the byte array that displays as '\xff'?  Too easy?
What about '\0xff' and '0\xff'?

FWIW, Erlang, a language all but designed to read/write wire level
protocols, prints any array of integers less than 256 as a(n ASCII) text
string.  It never *mixes* integers and characters, but often picks the
wrong one.



[issue26270] Support for read()/write()/select() on asyncio

2018-07-16 Thread smheidrich


Change by smheidrich :


--
nosy: +smheidrich




[issue34132] Obscure netrc parser "bug"

2018-07-16 Thread Skip Montanaro


Change by Skip Montanaro :


Added file: https://bugs.python.org/file47697/netrc-blank-comment




[issue34132] Obscure netrc parser "bug"

2018-07-16 Thread Skip Montanaro


New submission from Skip Montanaro :

Not sure I can really call this a bug, however there is a behavioral change 
between 2.7 and at least 3.6 and 3.7 (probably earlier versions of the 3.x 
series as well). There is no spec for .netrc files that I can find, certainly 
nothing which mentions comment or blank lines. Still, Python's netrc file 
parser seems happy with both.

However, in 3.x a blank line followed immediately by a comment line containing 
actual comment text causes the parser to raise a parse error. I've attached two 
netrc files, netrc-comment-blank, and netrc-blank-comment, identical save for 
the ordering of a blank line and a comment line. Here's what a 2.7.14 session 
looks like:

Python 2.7.14 |Anaconda, Inc.| (default, Mar 27 2018, 17:29:31) 
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import netrc
>>> rc = netrc.netrc(file="/home/skip/tmp/netrc-comment-blank")
>>> rc = netrc.netrc(file="/home/skip/tmp/netrc-blank-comment")

Here's 3.7.0:

Python 3.7.0 (default, Jun 28 2018, 13:15:42) 
[GCC 7.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import netrc
>>> rc = netrc.netrc(file="/home/skip/tmp/netrc-comment-blank")
>>> rc = netrc.netrc(file="/home/skip/tmp/netrc-blank-comment")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/skip/miniconda3/envs/python3/lib/python3.7/netrc.py", line 30, in __init__
    self._parse(file, fp, default_netrc)
  File "/home/skip/miniconda3/envs/python3/lib/python3.7/netrc.py", line 63, in _parse
    "bad toplevel token %r" % tt, file, lexer.lineno)
netrc.NetrcParseError: bad toplevel token 'Comment' (/home/skip/tmp/netrc-blank-comment, line 2)

--
components: Library (Lib)
files: netrc-comment-blank
messages: 321779
nosy: skip.montanaro
priority: normal
severity: normal
stage: needs patch
status: open
title: Obscure netrc parser "bug"
type: behavior
versions: Python 3.6, Python 3.7, Python 3.8
Added file: https://bugs.python.org/file47696/netrc-comment-blank




Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread MRAB

On 2018-07-16 21:59, Marko Rauhamaa wrote:
> Tim Chase :
>> While the python world has moved its efforts into improving Python3,
>> Python2 hasn't suddenly stopped working.
>
> The sword of Damocles is hanging on its head. Unless a consortium is
> erected to support Python2, no vendor will be able to use it in the
> medium term.
>
> Given the recent events, maybe that's exactly what's going to happen. A
> business consortium will take it on themselves to support and enhance
> Python2 ad infinitum. I wouldn't be surprised.
>
> (Although it might make me regret my knee-jerk porting effort.)
>
> In open source, it's up to those with the itch to scratch it.

Someone finally did, and it's called Tauthon.


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Richard Damon

> On Jul 16, 2018, at 3:28 PM, Terry Reedy  wrote:
> 
>> On 7/16/2018 1:11 PM, Richard Damon wrote:
>> 
>> Many consider that UTF-32 is a variable-width encoding because of the 
>> combining characters. It can take multiple ‘codepoints’ to define what 
>> should be a single ‘character’ for display.
> 
> I hope you realize that this is not the standard meaning of 'variable-width 
> encoding', which is 'variable number of bytes for a codepoint'.  UTF-16 and 
> UTF-8 are variable width.  If one expands the definition enough, Ascii is 
> 'variable width' because 'fi' is two bytes, or more realistically, because <= 
> and >= are two bytes instead of one (as they can be in Unicode!).
> 
> If one is using a broader definition than usual, it is clearer to say so.
> 
> -- 
> Terry Jan Reedy
> 

You are defining a variable/fixed width codepoint set. Many others want to deal 
with CHARACTER sets. The Unicode consortium agrees that a code point is not 
necessarily a character (which is one reason they came up with the term). When 
actually trying to do work with text strings, the fact that some codepoints are 
combining codes that need to ‘stick’ to their mate becomes important. One of 
the claimed advantages of fixed width character set encodings is that you 
aren’t supposed to need to worry about breaking strings in two, but that 
doesn’t work in Unicode, you need to make sure you aren’t breaking a combining 
sequence.
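A minimal sketch of the splitting problem, using the standard unicodedata module. The sample string is made up for illustration, and NFC normalization only helps where a precomposed code point exists, so it is not a general cure:

```python
import unicodedata

decomposed = "a\u0302bc"  # 'a' + COMBINING CIRCUMFLEX ACCENT, then 'bc'

# Slicing by code point can strand the accent at the start of the tail.
head, tail = decomposed[:1], decomposed[1:]
tail_starts_with_mark = unicodedata.combining(tail[0]) != 0

# NFC composes the pair where a precomposed code point exists.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), tail_starts_with_mark, len(composed))
```

A splitter that wants to be safe can check unicodedata.combining() on the first code point after the cut and move the cut if the result is nonzero.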

Even worse, Unicode really needs arbitrary look back to render substrings 
because it uses shift codes for things like left-to-right/right-to-left 
rendering control.

This doesn’t mean that UTF-32 is an awful system, just that it isn’t the 
magical cure that some were hoping for.


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 8:41 AM, Roel Schroeven  wrote:
> In any case, even though Python 3's byte strings are not quite unlike Python
> 2's strings, they're not exactly like them either. And I feel there are
> cases where that makes things somewhat harder, even though I can't prove it.

You're absolutely right, and some of those differences were repaired
in different 3.x versions (for instance, the ability to use
percent-formatting with byte strings was reinstated in 3.5). Some of
the differences are fundamental, but anything else should be
considered fair game for an enhancement request. So next time you go
"ugh, Python 3's byte strings are such a pain because XYZ", post here
or on python-ideas about a possible fix.

That said, though, the fact that indexing a byte string yields an int
instead of a one-byte string is basically unable to be changed now,
and IMO it'd be better to be consistent with text strings than with
bytearray. I'm not sure how many of the core devs agree that
b'spam'[1] ought to be b'p' rather than 112, but I'd say they all
agree that it's too late to change it.
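For reference, a short sketch of the behaviour under discussion in Python 3, together with the slicing idiom that does return a one-byte bytes object:

```python
data = b'spam'

by_index = data[1]           # an int in Python 3
by_slice = data[1:2]         # a length-1 bytes object
rebuilt = bytes([by_index])  # turning the int back into a one-byte bytes object

print(by_index, by_slice, rebuilt)
```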

ChrisA


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Roel Schroeven

Chris Angelico wrote on 16/07/2018 23:57:
> On Tue, Jul 17, 2018 at 7:50 AM, Roel Schroeven  wrote:
>> Steven D'Aprano wrote on 16/07/2018 2:18:
>>> On Sun, 15 Jul 2018 16:08:15 -0700, Jim Lee wrote:
>>>
>>>> Python3 is intrinsically tied to Unicode for string handling. Therefore,
>>>> the Python programmer is forced to deal with it (in all but trivial
>>>> cases), rather than given a choice.  So I don't understand how I can
>>>> illustrate my point with Python code since Python won't let me deal with
>>>> strings without also dealing with Unicode.
>>>
>>> Nonsense.
>>>
>>> b"Look ma, a Python 2 style ASCII string."
>>
>> Except for one little difference, which has bitten me a few times.
>> Consider this code:
>>
>> from __future__ import print_function
>> s = b"Look ma, a Python 2 style ASCII string."
>> print('First element:', s[0])
>>
>> Result in Python 2: First element: L
>> Result in Python 3: First element: 76
>>
>> Likewise this code:
>>
>> from __future__ import print_function
>> for e in b'hello':
>>   print(e, end=', ')
>> print()
>>
>> Result in Python 2: h, e, l, l, o,
>> Result in Python 3: 104, 101, 108, 108, 111,
>>
>> There are times (encoding/decoding network protocols and other data formats)
>> when I have a byte string and I want/need to process it like Python 2 does,
>> and that is the one area where I feel Python 3 makes things a bit more
>> difficult.
>
> For the most part, you probably want to decode it as ASCII, if you
> want to process it as text. Remember, bytes are simply numbers -
> octets, groups of eight bits. For it to mean the English word "hello",
> that byte sequence has to be interpreted as ASCII, which is accurately
> indicated as b'hello'.decode('ascii').
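The two readings of the same five bytes can be sketched side by side. The decode call is the one suggested above; the one-byte slicing loop is an additional idiom, offered here as one possible way to get Python 2 style elements while staying in bytes:

```python
raw = b'hello'

# Interpreted as ASCII text, iteration yields one-character strings.
as_text = list(raw.decode('ascii'))

# Kept as binary data, one-byte slices stand in for Python 2's elements.
as_chunks = [raw[i:i + 1] for i in range(len(raw))]

print(as_text)
print(as_chunks)
```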


I think I've had cases where that approach didn't work very well, but 
unfortunately I can't readily think of any concrete examples.


In any case, even though Python 3's byte strings are not quite unlike 
Python 2's strings, they're not exactly like them either. And I feel 
there are cases where that makes things somewhat harder, even though I 
can't prove it.


--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
-- Franklin P. Jones

Roel Schroeven



[issue34131] test_threading: BarrierTests.test_default_timeout() failed on AppVeyor

2018-07-16 Thread STINNER Victor


New submission from STINNER Victor :

test_threading.BarrierTests.test_default_timeout() failed on AppVeyor. Related 
issues:

* bpo-11871, commit d4d1d068dcf4e1aaf93772ccc0824207a21606e5: change timeout
* bpo-30316

https://ci.appveyor.com/project/python/cpython/build/3.8build19370

test_barrier (test.test_threading.BarrierTests) ... ok
test_barrier_10 (test.test_threading.BarrierTests) ... ok
test_default_timeout (test.test_threading.BarrierTests) ... ERROR
test_reset (test.test_threading.BarrierTests) ... Unhandled exception in thread started by .task at 0x03266728>
Traceback (most recent call last):
  File "C:\projects\cpython\lib\test\lock_tests.py", line 41, in task
    f()
  File "C:\projects\cpython\lib\test\lock_tests.py", line 938, in f
    i = barrier.wait()
  File "C:\projects\cpython\lib\threading.py", line 613, in wait
    self._wait(timeout)
  File "C:\projects\cpython\lib\threading.py", line 653, in _wait
    raise BrokenBarrierError
threading.BrokenBarrierError
Unhandled exception in thread started by .task at 0x03266728>
Traceback (most recent call last):
  File "C:\projects\cpython\lib\test\lock_tests.py", line 41, in task
    f()
  File "C:\projects\cpython\lib\test\lock_tests.py", line 938, in f
    i = barrier.wait()
  File "C:\projects\cpython\lib\threading.py", line 613, in wait
    self._wait(timeout)
  File "C:\projects\cpython\lib\threading.py", line 653, in _wait
    raise BrokenBarrierError
threading.BrokenBarrierError
Unhandled exception in thread started by .task at 0x03266728>
Traceback (most recent call last):
  File "C:\projects\cpython\lib\test\lock_tests.py", line 41, in task
    f()
  File "C:\projects\cpython\lib\test\lock_tests.py", line 938, in f
    i = barrier.wait()
  File "C:\projects\cpython\lib\threading.py", line 604, in wait
    self._enter() # Block while the barrier drains.
  File "C:\projects\cpython\lib\threading.py", line 628, in _enter
    raise BrokenBarrierError
threading.BrokenBarrierError
Unhandled exception in thread started by .task at 0x03266728>
Traceback (most recent call last):
  File "C:\projects\cpython\lib\test\lock_tests.py", line 41, in task
    f()
  File "C:\projects\cpython\lib\test\lock_tests.py", line 938, in f
    i = barrier.wait()
  File "C:\projects\cpython\lib\threading.py", line 604, in wait
    self._enter() # Block while the barrier drains.
  File "C:\projects\cpython\lib\threading.py", line 628, in _enter
    raise BrokenBarrierError
threading.BrokenBarrierError
ok
test_single_thread (test.test_threading.BarrierTests) ... ok
test_timeout (test.test_threading.BarrierTests) ... ok
test_wait_return (test.test_threading.BarrierTests) ... ok
test_acquire (test.test_threading.BoundedSemaphoreTests) ... ok

...

==
ERROR: test_default_timeout (test.test_threading.BarrierTests)
--
Traceback (most recent call last):
  File "C:\projects\cpython\lib\test\lock_tests.py", line 943, in test_default_timeout
    self.run_threads(f)
  File "C:\projects\cpython\lib\test\lock_tests.py", line 772, in run_threads
    f()
  File "C:\projects\cpython\lib\test\lock_tests.py", line 938, in f
    i = barrier.wait()
  File "C:\projects\cpython\lib\threading.py", line 613, in wait
    self._wait(timeout)
  File "C:\projects\cpython\lib\threading.py", line 651, in _wait
    raise BrokenBarrierError
threading.BrokenBarrierError


test_threading succeeded when it has been re-run in verbose mode.
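For context, the exception in the tracebacks is what threading.Barrier raises when a wait times out. This is a minimal, deliberately starved sketch of that mechanism, not a reproduction of the AppVeyor failure; the 0.05 second timeout is an arbitrary value chosen so the example finishes quickly:

```python
import threading

# Two parties are expected, but only this thread ever arrives.
barrier = threading.Barrier(2, timeout=0.05)

try:
    barrier.wait()  # no argument, so the barrier's default timeout applies
    outcome = "passed"
except threading.BrokenBarrierError:
    outcome = "broken"

is_broken = barrier.broken
print(outcome, is_broken)
```

A default timeout that is too short for a slow CI machine would produce the same exception in every waiting thread, which fits the pattern of repeated BrokenBarrierError tracebacks, though that is only a guess at the cause.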

--
components: Tests
messages: 321778
nosy: vstinner
priority: normal
severity: normal
status: open
title: test_threading: BarrierTests.test_default_timeout() failed on AppVeyor
versions: Python 3.8




[issue8799] Hang in lib/test/test_threading.py

2018-07-16 Thread STINNER Victor


STINNER Victor  added the comment:

This issue discussed different things:

* test_threading was unstable: I didn't see such failure the last 12 months, and 
I backported enhancements from master to 2.7, so I consider this issue as fixed
* documentation of threading locks: fixed
* Except on Windows, Python 2.7 uses its own implementation of TLS: this issue 
has been fixed in Python 3.

All discussed issues have been fixed, so I close the issue.

--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed




Re: doubling the number of tests, but not taking twice as long

2018-07-16 Thread Larry Martell
On Mon, Jul 16, 2018 at 6:01 PM, Gilmeh Serda
 wrote:
> On Mon, 16 Jul 2018 14:17:57 -0400, Larry Martell wrote:
>
>> This code needs to process many tens of 1000's of files, and it runs
>> often, so it needs to run very fast. Needless to say, my change has made
>> it take 2x as long. Can anyone see a way to improve that?
>
> Don't use RegEx search?
>
> My version 361, and a simple benchmarking thing, tells me it's about 2.7
> times slower than "if ... in ..." on 1,000,000 loops.

Without the regex how would you suggest I search for '_M\d+_' efficiently?
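One possible middle ground, sketched under assumptions: keep the regex, but compile it once and run it only on names that pass a cheap substring test. The helper name has_m_tag and the sample file names are made up, and whether the prefilter actually helps depends on how many names contain '_M', so it would need timing on the real data:

```python
import re

M_TAG = re.compile(r'_M\d+_')  # compiled once, reused for every file name

def has_m_tag(name):
    """True if name contains '_M', one or more digits, then '_'."""
    # Cheap substring test first; the regex only runs on likely candidates.
    return '_M' in name and M_TAG.search(name) is not None

names = ['scan_M12_a.txt', 'scan_Mx_a.txt', 'plain.txt', 'x_M7']  # made-up names
matches = [n for n in names if has_m_tag(n)]
print(matches)
```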


[issue18605] 2.7: test_threading hangs on Solaris 9

2018-07-16 Thread STINNER Victor


STINNER Victor  added the comment:

There is no more Solaris buildbot, there is no useful information to debug this 
issue, so I close the issue.

--
nosy: +vstinner
resolution:  -> out of date
stage:  -> resolved
status: open -> closed




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 7:02 AM, Ethan Furman  wrote:
> On 07/16/2018 01:15 PM, Chris Angelico wrote:
>>
>> On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano wrote:
>
>
>>> There is nothing special about diacritics such that we ought to treat
>>> some combinations like "Ch" (two code points = one character) as "fixed
>>> width" while others like "â" (two code points = one character) as
>>> "variable width".
>>
>>
>> When you reverse a word, do you treat "ch" and "sh" as one character
>> or two? I'm of the opinion that they're single characters, and thus
>> this should be "dalokosh":
>
>
> Depends on the language:  in Spanish, "ch" is its own letter (at least it
> was when I grew up), so any word containing it should still contain it when
> reversed:  "chica" would be "acich".
>

Yeah. In Russian, "sh" is the single character "ш". I'm of the opinion
that, even after being transliterated into English phonetics, that
should be treated as a unit. ISO-9 uses "š" rather than "sh", which is
an improvement in character correspondence, but your average English
speaker is more likely to be able to pronounce "dalokosh" correctly
than to figure out "dalokoš". In the same way, I created a magic item
in a D&D campaign called "Yasham Burda", even though the more correct
spelling would be "Yaşam Burda" or even "Yasam Burda", for the benefit
of my monolingual players. But I'd still treat the "sh" as one
character.

Ain't transliteration fun?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue34130] test_signal: test_warn_on_full_buffer() failed on AppVeyor

2018-07-16 Thread STINNER Victor


New submission from STINNER Victor :

test_signal.test_warn_on_full_buffer() failed on AppVeyor.

https://ci.appveyor.com/project/python/cpython/build/3.8build19372

==
FAIL: test_warn_on_full_buffer (test.test_signal.WakeupSocketSignalTests)
--
Traceback (most recent call last):
  File "C:\projects\cpython\lib\test\test_signal.py", line 538, in 
test_warn_on_full_buffer
assert_python_ok('-c', code)
  File "C:\projects\cpython\lib\test\support\script_helper.py", line 157, in 
assert_python_ok
return _assert_python(True, *args, **env_vars)
  File "C:\projects\cpython\lib\test\support\script_helper.py", line 143, in 
_assert_python
res.fail(cmd_line)
  File "C:\projects\cpython\lib\test\support\script_helper.py", line 84, in fail
err))
AssertionError: Process return code is 1
command line: ['C:\\projects\\cpython\\PCbuild\\win32\\python.exe', '-X', 
'faulthandler', '-I', '-c', 'if 1:\nimport errno\nimport 
signal\nimport socket\nimport sys\nimport time\n
import _testcapi\nfrom test.support import captured_stderr\n\n
signum = signal.SIGINT\n\n# This handler will be called, but we 
intentionally won\'t read from\n# the wakeup fd.\ndef 
handler(signum, frame):\npass\n\nsignal.signal(signum, 
handler)\n\nread, write = socket.socketpair()\n
read.setblocking(False)\nwrite.setblocking(False)\n\n# Fill the 
send buffer\ntry:\nwhile True:\n
write.send(b"x")\nexcept BlockingIOError:\npass\n\n
# By default, we get a warning when a signal arrives\n
signal.set_wakeup_fd(write.fileno())\n\nwith captured_stderr() as 
err:\n_testcapi.raise_signal(signum)\n
 \nerr = err.getvalue()\nif (\'Exception ignored when trying to 
send to the signal wakeup fd\'\nnot in err):\nraise 
AssertionError(err)\n\n# And also if warn_on_full_buffer=True\n
signal.set_wakeup_fd(write.fileno(), warn_on_full_buffer=True)\n\nwith 
captured_stderr() as err:\n_testcapi.raise_signal(signum)\n\n   
 err = err.getvalue()\nif (\'Exception ignored when trying to send to 
the signal wakeup fd\'\nnot in err):\nraise 
AssertionError(err)\n\n# But not if warn_on_full_buffer=False\n
signal.set_wakeup_fd(write.fileno(), warn_on_full_buffer=False)\n\nwith 
captured_stderr() as err:\n_testcapi.raise_signal(signum)\n\n   
 err = err.getvalue()\nif err != "":\nraise 
AssertionError("got unexpected output %r" % (err,))\n\n# And then check 
the default again, to make sure warn_on_full_buffer\n# settings don
 \'t leak across calls.\nsignal.set_wakeup_fd(write.fileno())\n\n   
 with captured_stderr() as err:\n_testcapi.raise_signal(signum)\n\n 
   err = err.getvalue()\nif (\'Exception ignored when trying to 
send to the signal wakeup fd\'\nnot in err):\nraise 
AssertionError(err)\n\n']
stdout:
---
---
stderr:
---
Traceback (most recent call last):
  File "", line 39, in 
AssertionError
---
--

The test passed when run again sequentially ("Re-running test 'test_signal' in 
verbose mode").

--
components: Tests
messages: 321775
nosy: njs, pitrou, vstinner
priority: normal
severity: normal
status: open
title: test_signal: test_warn_on_full_buffer() failed on AppVeyor
versions: Python 3.8




[issue34096] [2.7] test_audioop.test_max() failed: AssertionError: -2147483648 != 2147483648L

2018-07-16 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

yep.  i'm going to close this, it seems arch specific.  there isn't much we can 
realistically do to prevent people overriding things to their peril for 
configure or make. :)

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed




Re: Reading EmailMessage from file

2018-07-16 Thread Roel Schroeven

Skip Montanaro schreef op 16/07/2018 3:31:

So, problem solved. The example I originally referred to clearly
requires the caller know the encoding of the input file. When you
don't know the encoding, you need bytes. The BytesParser gave me that.

Also, I must admit to having not completely read the examples page,
where it described use of the BytesParser class a bit further down the
page, but I stopped when the simple example failed for me.


I've been there too some time ago.

I was going to complain about it here (or rather ask for a better way to 
do it); in order to do that, I did a bit more research than I had 
apparently done the first time around, and discovered the right way to 
create an email message from raw bytes.
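
For anyone landing here later, a minimal sketch of the bytes-based route, assuming the message is available as raw bytes. The sample message is made up; BytesParser and policy.default are from the standard library email package.

```python
from email import policy
from email.parser import BytesParser

# Made-up sample message; in practice the bytes would come from a file
# opened in binary mode, so no encoding has to be guessed up front.
raw = b"From: a@example.com\r\nSubject: Hello\r\n\r\nBody text\r\n"

# policy.default makes the parser return an EmailMessage object.
msg = BytesParser(policy=policy.default).parsebytes(raw)

print(msg["Subject"])
print(msg.get_content().strip())
```

For a file on disk, BytesParser(...).parse(fp) with fp opened in 'rb' mode does the same thing.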



--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
-- Franklin P. Jones

Roel Schroeven

--
https://mail.python.org/mailman/listinfo/python-list


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Grant Edwards
On 2018-07-16, Roel Schroeven  wrote:

> There are times (encoding/decoding network protocols and other data 
> formats) when I have a byte string and I want/need to process it like 
> Python 2 does, and that is the one area where I feel Python 3 makes 
> things a bit more difficult.

I use Python to work with various network and serial protocols a lot,
and using Python 3 certainly seems like more work.  The fact that
Python 2 and 3 both have a type called 'bytes' but the two types are
completely incompatible makes it especially hard to write portable
code that doesn't look like it's been intentionally obfuscated.
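
One idiom that tends to behave the same on both versions is to go through the struct module and slicing rather than indexing single bytes. The sketch below assumes a made-up wire format (1-byte type, 2-byte big-endian length, then payload); parse_frame is a hypothetical helper, not something from the thread.

```python
import struct


def parse_frame(frame):
    """Split a frame into (type, payload) for a hypothetical wire format."""
    # ">BH": one unsigned byte, then one big-endian unsigned 16-bit integer.
    msg_type, length = struct.unpack_from(">BH", frame, 0)
    # Slicing returns a bytes object on both Python 2 and Python 3.
    payload = frame[3:3 + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return msg_type, payload


frame = struct.pack(">BH", 7, 5) + b"hello"
print(parse_frame(frame))
```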

-- 
Grant Edwards   grant.b.edwardsYow! I represent a
  at   sardine!!
  gmail.com

-- 
https://mail.python.org/mailman/listinfo/python-list


[issue31342] test.bisect module causes tests to fail

2018-07-16 Thread STINNER Victor


STINNER Victor  added the comment:

> This is still broken, IMHO.  Either we should rename test.bisect (...)

I renamed it:

commit 823c295efa4efea93cadc640ed6122cd9d86cec4
Author: Victor Stinner 
Date:   Wed May 30 17:24:40 2018 +0200

bpo-29512: Rename Lib/test/bisect.py to bisect_cmd.py (#7229)

Can we close this issue as a duplicate of bpo-29512?

--




Re: Users banned

2018-07-16 Thread Terry Reedy

On 7/16/2018 3:27 PM, Grant Edwards wrote:

On 2018-07-16, Steve Simmons  wrote:


+1  Seems to me Bart is being banned for "being a dick" and "talking
rubbish" (my words/interpretation) with irritating persistence. Wonder
how many of the non-banned members have been guilty of the same thing in
one way or another.


I'm sure many of us have been guilty of one or both at some time or
another.  I think the level of "persistence" is the key.


What we have not and will not see on the list is the private interchange 
between Bart and the moderators before they took the next to most 
extreme step (of permaban).


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 7:50 AM, Roel Schroeven  wrote:
> Steven D'Aprano schreef op 16/07/2018 2:18:
>>
>> On Sun, 15 Jul 2018 16:08:15 -0700, Jim Lee wrote:
>>
>>> Python3 is intrinsically tied to Unicode for string handling. Therefore,
>>> the Python programmer is forced to deal with it (in all but trivial
>>> cases), rather than given a choice.  So I don't understand how I can
>>> illustrate my point with Python code since Python won't let me deal with
>>> strings without also dealing with Unicode.
>>
>>
>> Nonsense.
>>
>> b"Look ma, a Python 2 style ASCII string."
>
>
> Except for one little difference, which has bitten me a few times.
> Consider this code:
>
> from __future__ import print_function
> s = b"Look ma, a Python 2 style ASCII string."
> print('First element:', s[0])
>
> Result in Python 2: First element: L
> Result in Python 3: First element: 76
>
> Likewise this code:
>
> from __future__ import print_function
> for e in b'hello':
>   print(e, end=', ')
> print()
>
> Result in Python 2: h, e, l, l, o,
> Result in Python 3: 104, 101, 108, 108, 111,
>
> There are times (encoding/decoding network protocols and other data formats)
> when I have a byte string and I want/need to process it like Python 2 does,
> and that is the one area where I feel Python 3 makes things a bit more
> difficult.
>

For the most part, you probably want to decode it as ASCII, if you
want to process it as text. Remember, bytes are simply numbers -
octets, groups of eight bits. For it to mean the English word "hello",
that byte sequence has to be interpreted as ASCII, which is accurately
indicated as b'hello'.decode('ascii').
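
A short Python 3 illustration of that, using the b'hello' example from above: once decoded, iterating gives one-character strings again.

```python
data = b"hello"

# Interpret the octets as ASCII text; this raises UnicodeDecodeError
# if any byte is outside the 0-127 range.
text = data.decode("ascii")

# Iterating over a str yields one-character strings, not integers.
joined = ", ".join(text)
print(joined)  # h, e, l, l, o
```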

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue21622] ctypes.util incorrectly fails for libraries without DT_SONAME

2018-07-16 Thread Tianon


Tianon  added the comment:

This was reported on the Docker image for Python in 
https://github.com/docker-library/python/issues/111, with the note that it 
affects the Twisted inotify implementation, so it'd be really neat to see a 
proper patch in Python (instead of the very musl-/Alpine-assuming patch found 
in 
https://github.com/alpinelinux/aports/blob/202f4bea916b0cf974b38ced96ab8fca0b192e3f/main/python2/musl-find_library.patch).
 <3

--
nosy: +tianon




Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Roel Schroeven

Steven D'Aprano schreef op 16/07/2018 2:18:

On Sun, 15 Jul 2018 16:08:15 -0700, Jim Lee wrote:


Python3 is intrinsically tied to Unicode for string handling. Therefore,
the Python programmer is forced to deal with it (in all but trivial
cases), rather than given a choice.  So I don't understand how I can
illustrate my point with Python code since Python won't let me deal with
strings without also dealing with Unicode.


Nonsense.

b"Look ma, a Python 2 style ASCII string."


Except for one little difference, which has bitten me a few times. 
Consider this code:


from __future__ import print_function
s = b"Look ma, a Python 2 style ASCII string."
print('First element:', s[0])

Result in Python 2: First element: L
Result in Python 3: First element: 76

Likewise this code:

from __future__ import print_function
for e in b'hello':
  print(e, end=', ')
print()

Result in Python 2: h, e, l, l, o,
Result in Python 3: 104, 101, 108, 108, 111,

There are times (encoding/decoding network protocols and other data 
formats) when I have a byte string and I want/need to process it like 
Python 2 does, and that is the one area where I feel Python 3 makes 
things a bit more difficult.
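
A common workaround in Python 3, shown here only as a sketch, is to slice instead of index when the elements should stay bytes. The variable names are illustrative.

```python
s = b"Look ma, a Python 2 style ASCII string."

first_as_int = s[0]      # indexing yields an int in Python 3
first_as_bytes = s[0:1]  # slicing yields a length-1 bytes object

# Walk the data one element at a time while keeping each element as bytes.
word = b"hello"
pieces = [word[i:i + 1] for i in range(len(word))]

print(first_as_int, first_as_bytes)
print(b", ".join(pieces))
```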



--
"Honest criticism is hard to take, particularly from a relative, a
friend, an acquaintance, or a stranger."
-- Franklin P. Jones

Roel Schroeven

--
https://mail.python.org/mailman/listinfo/python-list


[issue22708] httplib/http.client in method _tunnel used HTTP/1.0 CONNECT method

2018-07-16 Thread Michael Handler


Change by Michael Handler :


--
pull_requests: +7839




[issue34118] Fix some class entries in 'Built-in Functions'

2018-07-16 Thread Terry J. Reedy


Terry J. Reedy  added the comment:

I was forgetting that this is a Python, not CPython doc.  So I agree to not tag 
the iterator classes as such.  For all I know, PyPy might use (compiled?) 
generator functions.  And if we were to allow use of Cython, say, for CPython, 
we might  try that.

How about a note under the index table:

Functions that must be classes are tagged *class*.  The iterator functions 
enumerate, filter, map, reversed, and zip are classes in CPython, but they are 
not tagged because they could be implemented as generator functions.

Or we could add an *iterator* tag.
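
For reference, the CPython behaviour described above can be checked with a few lines. This is only a sketch of the current CPython situation; as noted, another implementation would be free to answer differently.

```python
import builtins

ITERATOR_BUILTINS = ("enumerate", "filter", "map", "reversed", "zip")

# On CPython each of these names is bound to a class, so isinstance(..., type)
# is True; an implementation using generator functions would report False.
is_class = {name: isinstance(getattr(builtins, name), type)
            for name in ITERATOR_BUILTINS}

for name in ITERATOR_BUILTINS:
    print(name, is_class[name])
```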

--




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Ethan Furman :
> Depends on the language: in Spanish, "ch" is its own letter (at least
> it was when I grew up), so any word containing it should still contain
> it when reversed: "chica" would be "acich".

The Royal Academy broke "ch" and "ll" up into separate letters a decade
or so back. It had become accepted practice in dictionaries way before
that.

In Finnish, "v" and "w" are still orthographic variants of the same
letter. In practice, Finns don't have a problem with computers insisting
they are separate letters.

While the Royal Academy of the Spanish Language has now accepted that
"ñ" is an accented "n", no Finn would think that "ä" is an accented "a"
any more than an English-speaker would think that "G" is an accented "C"
(which it originally was).


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:32 AM, Tim Chase
 wrote:
> On 2018-07-16 18:31, Steven D'Aprano wrote:
>> You say that all you want is a switch to turn off Unicode (and
>> replace it with what? Kanji strings? Cyrillic? Shift_JIS? no of
>> course not, I'm being absurd -- replace it with ASCII, what else
>> could any right-thinking person want, right?).
>
> But we already have this.  If I want to turn off Unicode strings, I
> type "python2", and if I want to enable Unicode strings, I type
> "python3".
>
> While the python world has moved its efforts into improving Python3,
> Python2 hasn't suddenly stopped working.  It just stopped receiving
> improvements.  If the "old-man shakes-fist at progress" crowd
> doesn't like Unicode strings in Py3, just keep on using Py2.  You
> (generic) won't get arrested.  There are no church^WPython police.

Except that Python 2 still supports Unicode, and Python 3 still
supports bytes. Py3 just makes a stronger distinction between text and
bytes.

>>> b"Hello, %s!" % b"world"
b'Hello, world!'
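
To make the "stronger distinction" concrete, here is a small Python 3 sketch (bytes %-formatting needs 3.5 or later). The variable names are illustrative only.

```python
# Bytes can be formatted with bytes, as in the example above.
greeting = b"Hello, %s!" % b"world"

# Mixing text and bytes implicitly is refused instead of silently guessing
# an encoding, which is what Python 2 did with ASCII.
try:
    "text" + b"bytes"
except TypeError as exc:
    mixed_error = type(exc).__name__

# An explicit decode states which encoding is meant.
joined = "text" + b"bytes".decode("ascii")

print(greeting, mixed_error, joined)
```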

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:54 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>> Challenge: Reverse a string in UTF-8.
>
> Counter-challenge: Reverse a Unicode string:
>
>>>> s = "a\u0304e"
>>>> s
>'āe'
>>>> L = list(s)
>>>> L.reverse()
>>>> "".join(L)
>'ēa'
>
>> Challenge: Center text in UTF-8.
>
> Counter-challenge: Center a Unicode string:
>
>>>> t = s * 3
>>>> t
>'āeāeāe'
>>>> t.center(9)
>'āeāeāe'
>
>> Challenge: Given a (non-initial) character in a buffer of UTF-8 bytes,
>> find the immediately preceding character.
>
> The counter-challenge is left as an exercise for the reader.
>
>> All of these are fundamentally difficult by nature, but if you index
>> by code points, you eliminate one level of difficulty; indexing by
>> bytes retains all the existing difficulty and adds another layer.
>
> Oh, sorry. I thought you were suggesting Unicode strings would make the
> challenges somehow easy.

So now that you've actually read my entire post, you'll see that there
are fundamental difficulties, but that UTF-8 introduces more. Great.
Now go ahead and reply to my post, knowing my actual point.
Congratulations on posting something of no value.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:27 AM, Marko Rauhamaa  wrote:
> Rhodri James :
>
>> On 16/07/18 20:40, Marko Rauhamaa wrote:
>>> You mean each code point is one code point wide. But that's rather an
>>> irrelevant thing to state. The main point is that UTF-32 (aka Unicode)
>>> uses one or more code points to represent what people would consider an
>>> individual character.
>>
>> UTF-32 != Unicode, but that's a separate esoteric argument.
>>
>> The problem everyone
>
> "Everyone"!!!
>
>> is having with you, Marko, is that you are using the terminology
>> incorrectly. When you say that more than one codepoint can be used to
>> represent what people would consider an individual character, you are
>> correct (and would be more correct if you called "what people would
>> consider an individual character" a "glyph"). When you call UTF-32 a
>> variable-width encoding, you are incorrect.
>
> Unicode is one of the primary selling points of Python3

Here, have a look at the original plans for Python 3.0:

https://www.python.org/dev/peps/pep-3100/

The default string type becoming Unicode was just one bullet point
among many. Remember, Python 2 had Unicode strings for a long time;
the change is not "now we use Unicode" but "now the simple and obvious
string type is the text string rather than the byte sequence". Both
types had previously been available. Both types remained available.
This was not a "primary selling point". The main selling point was
cleanups and simplifications.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Ethan Furman

On 07/16/2018 01:15 PM, Chris Angelico wrote:

On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano wrote:



There is nothing special about diacritics such that we ought to treat
some combinations like "Ch" (two code points = one character) as "fixed
width" while others like "â" (two code points = one character) as
"variable width".


When you reverse a word, do you treat "ch" and "sh" as one character
or two? I'm of the opinion that they're single characters, and thus
this should be "dalokosh":


Depends on the language:  in Spanish, "ch" is its own letter (at least it was when I grew up), so any word containing 
it should still contain it when reversed:  "chica" would be "acich".


--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:36 AM, Marko Rauhamaa  wrote:
> Chris Angelico :
>
>> On Tue, Jul 17, 2018 at 5:40 AM, Marko Rauhamaa  wrote:
>>> You mean each code point is one code point wide. But that's rather an
>>> irrelevant thing to state. The main point is that UTF-32 (aka
>>> Unicode) uses one or more code points to represent what people would
>>> consider an individual character.
>>
>> No, each code point is one code unit wide. It's not irrelevant.
>
> Finally, we have reached the simple crux of the debate, and that's where
> you and I disagree.
>
> Unicode code points sure express many more things than UTF-8 bytes.
> UTF-8 bytes can only represent the first 128 code points of Unicode.
> However, even Unicode has given up trying to represent even basic
> everyday symbols with single codepoints, which leads back to the
> question of how Python3's Unicode strings are superior to Python2's
> UTF-8 strings. They have the same up and downsides.
>

You snipped my explanation of how what you just said is flat out false.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Tim Chase :
> While the python world has moved its efforts into improving Python3,
> Python2 hasn't suddenly stopped working.

The sword of Damocles is hanging on its head. Unless a consortium is
erected to support Python2, no vendor will be able to use it in the
medium term.

Given the recent events, maybe that's exactly what's going to happen. A
business consortium will take it on themselves to support and enhance
Python2 ad infinitum. I wouldn't be surprised.

(Although it might make me regret my knee-jerk porting effort.)


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue34129] CGITB does not mangle variables names

2018-07-16 Thread Pavel Jurkas


Change by Pavel Jurkas :


--
pull_requests: +7838




[issue34129] CGITB does not mangle variables names

2018-07-16 Thread Pavel Jurkas


Pavel Jurkas  added the comment:

https://github.com/python/cpython/pull/8304

--




Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Marko Rauhamaa
Chris Angelico :
> Challenge: Reverse a string in UTF-8.

Counter-challenge: Reverse a Unicode string:

   >>> s = "a\u0304e"
   >>> s
   'āe'
   >>> L = list(s)
   >>> L.reverse()
   >>> "".join(L)
   'ēa'
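
For what it is worth, the reversal above can be made to keep the macron on the "a" by grouping each base code point with the combining marks that follow it. This is only a rough sketch using unicodedata.combining; the helper names are made up, and it is not full grapheme-cluster segmentation (it does not handle emoji ZWJ sequences or Hangul jamo, for example).

```python
import unicodedata


def clusters(text):
    """Group each base code point with the combining marks that follow it."""
    out = []
    for ch in text:
        # combining() returns a non-zero class for combining marks.
        if out and unicodedata.combining(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out


def reverse_clusters(text):
    """Reverse text cluster by cluster instead of code point by code point."""
    return "".join(reversed(clusters(text)))


s = "a\u0304e"
print(clusters(s) == ["a\u0304", "e"])
print(reverse_clusters(s) == "ea\u0304")
```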

> Challenge: Center text in UTF-8.

Counter-challenge: Center a Unicode string:

   >>> t = s * 3
   >>> t
   'āeāeāe'
   >>> t.center(9)
   'āeāeāe'
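
The centering case can be approximated the same way: str.center counts code points, so the 9-code-point string above gets no padding even though it displays as 6 columns. The sketch below counts only non-combining code points; the helper names are made up, and it ignores double-width East Asian characters and anything the terminal renders unusually.

```python
import unicodedata


def visible_len(text):
    """Count code points that are not combining marks (rough display width)."""
    return sum(1 for ch in text if not unicodedata.combining(ch))


def center_visible(text, width, fill=" "):
    """Center text in width columns, padding by visible length."""
    pad = max(width - visible_len(text), 0)
    left = pad // 2
    return fill * left + text + fill * (pad - left)


t = "a\u0304e" * 3
print(len(t), visible_len(t))
print(repr(center_visible(t, 9)))
```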

> Challenge: Given a (non-initial) character in a buffer of UTF-8 bytes,
> find the immediately preceding character.

The counter-challenge is left as an exercise for the reader.

> All of these are fundamentally difficult by nature, but if you index
> by code points, you eliminate one level of difficulty; indexing by
> bytes retains all the existing difficulty and adds another layer.

Oh, sorry. I thought you were suggesting Unicode strings would make the
challenges somehow easy.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Tim Chase
On 2018-07-16 18:31, Steven D'Aprano wrote:
> You say that all you want is a switch to turn off Unicode (and
> replace it with what? Kanji strings? Cyrillic? Shift_JIS? no of
> course not, I'm being absurd -- replace it with ASCII, what else
> could any right-thinking person want, right?).

But we already have this.  If I want to turn off Unicode strings, I
type "python2", and if I want to enable Unicode strings, I type
"python3".

While the python world has moved its efforts into improving Python3,
Python2 hasn't suddenly stopped working.  It just stopped receiving
improvements.  If the "old-man shakes-fist at progress" crowd
doesn't like Unicode strings in Py3, just keep on using Py2.  You
(generic) won't get arrested.  There are no church^WPython police.

-tkc


-- 
https://mail.python.org/mailman/listinfo/python-list


[issue28909] Adding LTTng-UST tracing support

2018-07-16 Thread Francis Deslauriers


Francis Deslauriers  added the comment:

Hi all,

It seems that, as of right now, the thing blocking this patchset from
going forward is the name of the instrumentation point.

Two naming approaches were suggested:
- Keeping PyDtrace*
- Changing to PyProbe*

I prefer the PyProbe option, as it's a more generic name and is not misleading
about the underlying tracing engine, but if people prefer that we keep the PyDtrace
version, let's go with that.
So, what should we go with?

I can easily update and rebase this patchset.

As an example of how this feature could be used, a colleague of mine gave a
talk[1] at PyCon Canada 2017 about tracing Python applications using this
patchset. He built a tool to visualize Python Logging, Python function calls
and Linux syscalls all in the same view. This was done using the existing
Python logger tracing of LTTng-UST, the LTTng kernel tracer and the CPython
LTTng-UST instrumentation of this patchset. Here is an asciinema[2] recording
used in the talk, it shows the tool in action.

[1]: https://youtu.be/gKmtmPqr6H8
[2]: https://asciinema.org/a/v20Hxnoh3lpzzz3FPmF86fNDS

Cheers!
Francis

--




Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Marko Rauhamaa
Chris Angelico :

> On Tue, Jul 17, 2018 at 5:40 AM, Marko Rauhamaa  wrote:
>> You mean each code point is one code point wide. But that's rather an
>> irrelevant thing to state. The main point is that UTF-32 (aka
>> Unicode) uses one or more code points to represent what people would
>> consider an individual character.
>
> No, each code point is one code unit wide. It's not irrelevant.

Finally, we have reached the simple crux of the debate, and that's where
you and I disagree.

Unicode code points sure express many more things than UTF-8 bytes.
UTF-8 bytes can only represent the first 128 code points of Unicode.
However, even Unicode has given up trying to represent even basic
everyday symbols with single codepoints, which leads back to the
question of how Python3's Unicode strings are superior to Python2's
UTF-8 strings. They have the same up and downsides.
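
For reference, the sizes being argued about can be checked directly. The sketch below prints how many bytes a single code point takes in each encoding; the sample characters are arbitrary. UTF-8 uses one to four bytes per code point (the first 128 code points are the ones that fit in a single byte), UTF-16 uses two or four, and UTF-32 always uses four.

```python
def widths(ch):
    """Bytes needed for one code point in each encoding."""
    # The "-le" variants are used so that no byte order mark is added.
    return {enc: len(ch.encode(enc))
            for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for ch in ("A", "\u00e4", "\u0448", "\U0001f331"):
    print("U+%04X" % ord(ch), widths(ch))
```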


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 4:55 AM, Steven D'Aprano
 wrote:
> There is nothing special about diacritics such that we ought to treat
> some combinations like "Ch" (two code points = one character) as "fixed
> width" while others like "â" (two code points = one character) as
> "variable width".

When you reverse a word, do you treat "ch" and "sh" as one character
or two? I'm of the opinion that they're single characters, and thus
this should be "dalokosh":

https://wiki.teamfortress.com/wiki/Dalokohs_Bar

(It's the Russian for "chocolate" - "шоколад" - transliterated to
English/Latin - "šokolad" or "shokolad" - and then reversed.)

But that's an extremely difficult thing to explain to your average gamer...
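
In code, the "treat the digraph as one unit" reversal might look like the sketch below. The function name and the default digraph list are assumptions for illustration; which digraphs count as units depends on the language being transliterated.

```python
def reverse_with_digraphs(word, digraphs=("sh", "ch")):
    """Reverse word while keeping the listed two-letter units intact."""
    units = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in digraphs:
            units.append(pair)  # keep the digraph together as one unit
            i += 2
        else:
            units.append(word[i])
            i += 1
    return "".join(reversed(units))


print(reverse_with_digraphs("shokolad"))  # dalokosh
```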

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


[issue32430] Simplify Modules/Setup{,.dist,.local}

2018-07-16 Thread STINNER Victor


STINNER Victor  added the comment:

Thank you Antoine for fixing this issue! This issue annoyed me forever, but
I never tried to fix it. It was very annoying when using git bisect for
example.

--




[issue34096] [2.7] test_audioop.test_max() failed: AssertionError: -2147483648 != 2147483648L

2018-07-16 Thread Erich Eckner


Erich Eckner  added the comment:

ah, that would explain, why we don't get it set automatically on archlinux32 - 
there's "export OPT=$CFLAGS" right in front of ./configure ... - so $OPT is set 
and thus, -fwrapv is not appended

--




Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Terry Reedy

On 7/16/2018 2:01 PM, Chris Angelico wrote:


【 Stardew Valley Fanart 】*:・゚✧【 800 Subpoints = NEW EMOTE
】#devicat #anime #stardewvalley #fantasy


Just to be clear, 【 】・゚✧【 】,
\U0001f331, \u3010, \u3011, \uff65, \uff9f, \u2727 are the non-ascii 
chars in the above.


for c in """【 Stardew Valley Fanart 】*:・゚✧【 800 Subpoints = NEW 
EMOTE】#devicat #anime #stardewvalley #fantasy""": print(hex(ord(c)))


They look fine on Thunderbird, including the orange and green fruit.
They look ok in Notepad++ (the fruit is black and white and not 
recognizable without knowing what it is).



Ok, I'll bite. What font would be used to properly display the
above?



Oh! I just remembered. Try installing (through apt-get or equivalent)
the "unifont" package. It'll drag in a few fonts designed to provide
good coverage of all of Unicode, making them available as fallback
fonts. That way, when you use a font that doesn't have all the
characters, it'll use that for the bulk of the text, but instead of
the rectangles that you're seeing, you'll get the correct glyphs.


Windows 10 comes with good coverage of Unicode, fallback, and 
bidirectional (bidi) support.


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Marko Rauhamaa
Rhodri James :

> On 16/07/18 20:40, Marko Rauhamaa wrote:
>> You mean each code point is one code point wide. But that's rather an
>> irrelevant thing to state. The main point is that UTF-32 (aka Unicode)
>> uses one or more code points to represent what people would consider an
>> individual character.
>
> UTF-32 != Unicode, but that's a separate esoteric argument.
>
> The problem everyone

"Everyone"!!!

> is having with you, Marko, is that you are using the terminology
> incorrectly. When you say that more than one codepoint can be used to
> represent what people would consider an individual character, you are
> correct (and would be more correct if you called "what people would
> consider an individual character" a "glyph"). When you call UTF-32 a
> variable-width encoding, you are incorrect.

Unicode is one of the primary selling points of Python3, and the
uninitiated are led to believe the false dichotomy between

 1. The ugly American who believes the whole world runs with ASCII and
is happy with Python2.

 2. The refined cosmopolitan who can appreciate the ease with which
Python3 brings them the whole world.

People (including "everyone" and the uninitiated) need to understand
that Unicode strings are no better at cosmopolitan code than UTF-8
inside byte strings. In their time, Windows and Java believed UCS-2 is
the solution to the woes of 8-bitness. They were sorely disappointed.
Python3 thought it could benefit from hindsight and went directly to
21-bit Unicode code points (plus surrogate characters, which really have
no business in Unicode strings). Alas, even that didn't cut it -- even
for the Americans, who are abandoning English in droves for hieroglyphs,
i.e., emojis.

> You are of course welcome to use whatever terminology you personally
> like, like Humpty Dumpty. However when you point to a duck and say
> "That's a gnu," people are likely to stop taking you seriously.

Who hath ears to hear, let him hear.


Marko
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 5:51 AM, Marko Rauhamaa  wrote:
> Steven D'Aprano :
>> Under that standard definition, UTF-8 and UTF-16 are variable-width,
>> and UTF-32 is fixed-width.
>>
>> But I'll accept that UTF-32 is variable-width if Marko accepts that
>> ASCII is too.
>
> If that makes you happy, fine. The point is, UTF-32 has no advantages
> over UTF-8. And I'm referring to the text abstraction as seen by the
> programmer. It has nothing to do with the layout of bytes inside
> CPython.
>
> I use UTF-8 in my C programs and sense no disadvantage. I have never
> felt a need for wchar_t. Similarly, I had a small Python2 program that
> quizzed me about Hebrew vocabulary with Finnish translations and
> Esperanto pronunciation instructions. All UTF-8. No unicode strings. (I
> *have* converted that to Python3 just to be on the bleeding edge, but it
> didn't give me any advantages over Python2.)

Challenge: Reverse a string in UTF-8.

Challenge: Center text in UTF-8.

Challenge: Given a (non-initial) character in a buffer of UTF-8 bytes,
find the immediately preceding character.

All of these are fundamentally difficult by nature, but if you index
by code points, you eliminate one level of difficulty; indexing by
bytes retains all the existing difficulty and adds another layer.
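
A minimal sketch of the third challenge, assuming the buffer holds valid UTF-8 and that `pos` already sits on a code point boundary. The helper name `prev_char_start` is made up for illustration. It shows the extra scanning that byte indexing requires, next to the code-point-indexed equivalent.

```python
def prev_char_start(buf: bytes, pos: int) -> int:
    """Return the byte offset where the code point before `pos` starts.

    Hypothetical helper: assumes `buf` is valid UTF-8 and that `pos` is
    the start of a code point (or len(buf)) and greater than zero.
    """
    if pos <= 0:
        raise ValueError("no preceding character")
    i = pos - 1
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx; step back over them.
    while i > 0 and (buf[i] & 0xC0) == 0x80:
        i -= 1
    return i


text = "a\u0151\u20ac\U0001F600"      # 1-, 2-, 3- and 4-byte characters in UTF-8
data = text.encode("utf-8")

# Byte-indexed: walk backwards from the end, one code point at a time.
starts = []
pos = len(data)
while pos > 0:
    pos = prev_char_start(data, pos)
    starts.append(pos)

# Code-point-indexed: the preceding character is just text[i - 1],
# and reversing by code point is a plain slice.
reversed_by_code_point = text[::-1]

# Reversing the raw bytes scrambles every multi-byte sequence.
reversed_bytes_naive = data[::-1]
```

Reversing by code point still misorders combining sequences and multi-code-point emoji, which is the difficulty that remains at either level.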

ChrisA


Re: Glyphs and graphemes [was Re: Cult-like behaviour]

2018-07-16 Thread Rhodri James

On 16/07/18 20:51, Marko Rauhamaa wrote:
> I use UTF-8 in my C programs and sense no disadvantage. I have never
> felt a need for wchar_t.


That's not a good comparison, though, because wchar_t in C really 
doesn't give you much (if any) advantage over rolling your own UTF-8 
support, even when that means making sure you don't split characters 
across buffers.


--
Rhodri James *-* Kynesim Ltd


[issue34096] [2.7] test_audioop.test_max() failed: AssertionError: -2147483648 != 2147483648L

2018-07-16 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

https://github.com/python/cpython/blob/2.7/configure.ac#L1067

appears to add -fwrapv as desired if the gcc or clang version being used 
supports it.




Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 6:16 AM, Rhodri James  wrote:
> On 16/07/18 20:58, Terry Reedy wrote:
>>
>> On 7/16/2018 1:27 PM, Jim Lee wrote:
>>
>>> 90% of the world *is* "beneath my notice" when it comes to programming
>>> for myself.   I really don't care if that's not PC enough for you.
>>>
>>> Had you actually read my words with *intent* rather than *reaction*, you
>>> would notice that I suggested the *option* of turning off Unicode.  I didn't
>>> say get *rid* of Unicode.  I didn't say make it *harder* to use Unicode.
>>> Once again - reaction rather than reading.
>>>
>>> Obviously, the most vocal representatives of the Python community are too
>>> sensitive about their language to enable rational discussion.
>>
>>
>> My empirical observation is that the more abrasive posters get rewarded
>> with more response, while my attempts to engage in rational discussion,
>> without ad hominems, get less.
>
>
> I wouldn't disagree with you.  Fortunately Jim has pulled the "storming off
> in a huff rather than answer a question anyone actually asked" defence, so
> we can go back to debating about important things like how to spell
> assignment expressions.
>
> Oh wait... :-)
>

+1 QOTD.

ChrisA


Re: Cult-like behaviour [was Re: Kindness]

2018-07-16 Thread Chris Angelico
On Tue, Jul 17, 2018 at 5:40 AM, Marko Rauhamaa  wrote:
> Terry Reedy :
>
>> On 7/15/2018 5:28 PM, Marko Rauhamaa wrote:
>>> if your new system used Python3's UTF-32 strings as a foundation,
>>
>> Since 3.3, Python's strings are not (always) UTF-32 strings.
>
> You are right. Python's strings are a superset of UTF-32. More
> accurately, Python's strings are UTF-32 plus surrogate characters.
>
>> Nor are they always UCS-2 (or partly UTF-16) strings. Nor are they
>> always Latin-1 or ASCII strings. Python's Flexible String
>> Representation uses the narrowest possible internal code for any
>> particular string. This is all transparent to the user except for
>> memory size.
>
> How CPython chooses to represent its strings internally is not what I'm
> talking about.

Then don't talk about UTF-32, which is a representation format.
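
To illustrate the quoted point about the Flexible String Representation, here is a small sketch. It assumes CPython 3.3 or later; other implementations may store strings differently, and the exact byte counts vary by version, so only the ordering is compared.

```python
import sys

n = 1000
ascii_s = "a" * n              # every code point fits in one byte
bmp_s = "\u0151" * n           # needs two bytes per code point
astral_s = "\U0001F600" * n    # needs four bytes per code point

# Same length in code points for all three strings...
lengths = [len(s) for s in (ascii_s, bmp_s, astral_s)]

# ...but a different memory footprint, which is the only user-visible effect.
sizes = [sys.getsizeof(s) for s in (ascii_s, bmp_s, astral_s)]
```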

>>> UTF-32, after all, is a variable-width encoding.
>>
>> Nope.  It is a fixed-width (32 bits, 4 bytes) encoding.
>>
>> Perhaps you should ask more questions before pontificating.
>
> You mean each code point is one code point wide. But that's rather an
> irrelevant thing to state. The main point is that UTF-32 (aka Unicode)
> uses one or more code points to represent what people would consider an
> individual character.

No, each code point is one code unit wide. It's not irrelevant.
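
A short sketch of what "one code unit wide" means, using only the standard codecs. The sample characters are arbitrary picks for illustration.

```python
samples = ["a", "\u0151", "\u20ac", "\U0001F600"]

# Number of code units each single code point occupies in each encoding.
utf8_units = [len(ch.encode("utf-8")) for ch in samples]            # 8-bit units
utf16_units = [len(ch.encode("utf-16-le")) // 2 for ch in samples]  # 16-bit units
utf32_units = [len(ch.encode("utf-32-le")) // 4 for ch in samples]  # 32-bit units
```

UTF-8 and UTF-16 need a varying number of code units per code point; UTF-32 always needs exactly one, which is the standard sense of "fixed-width".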

> The letter "a" is encoded as a single code point, but 🇬🇧 (Flag, United
> Kingdom) is two code points wide and 🏴󠁧󠁢󠁥󠁮󠁧󠁿 (Flag, England) is seven (!)
> code points wide, not to forget 🧖‍♂️ (Man in Steamy Room) with four code
> points. <https://unicode.org/emoji/charts/full-emoji-list.html>
>
> And of course, regular West-European letters can be represented by
> multiple code points.
>
> Code points are about as interesting as individual bytes in UTF-8.

Individual bytes in UTF-8 do not have individual meaning. Individual
code points do, with the partial exception of the flag characters
(which are pretty poorly supported anyway). Otherwise, every code
point is either a base character with general meaning, or a combining
character (or variant selector) that represents a specific change.
They can be composed in different ways. For example:

U+006F U+0301 "ó" LATIN SMALL LETTER O WITH ACUTE
U+006F U+030B "ő" LATIN SMALL LETTER O WITH DOUBLE ACUTE
U+0075 U+0301 "ú" LATIN SMALL LETTER U WITH ACUTE
U+0075 U+030B "ű" LATIN SMALL LETTER U WITH DOUBLE ACUTE

The UTF-8 representations of the combined forms of these characters are:
C3 B3
C5 91
C3 BA
C5 B1

What does byte value C5 mean? What does 91 mean? Neither has
meaning on its own. The only way you can interpret them is as a full
set. In contrast, each code point has meaning of its own: it is either
a base character or a combining character.

So, no, individual code points are very interesting.
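
The contrast can be checked with the standard `unicodedata` module. This is only a sketch of the idea, using the "ő" row from the table above and the United Kingdom flag as examples.

```python
import unicodedata

decomposed = "o\u030b"                                # U+006F U+030B
composed = unicodedata.normalize("NFC", decomposed)   # one precomposed code point

# Each code point carries a name and a meaning of its own.
names = [unicodedata.name(cp) for cp in decomposed]

# The flag is two regional indicator code points, each individually named.
flag_uk = "\U0001F1EC\U0001F1E7"
flag_names = [unicodedata.name(cp) for cp in flag_uk]

# The UTF-8 bytes of the composed form only make sense as a complete set.
utf8_bytes = composed.encode("utf-8")
try:
    b"\xc5".decode("utf-8")
    lone_byte_decodes = True
except UnicodeDecodeError:
    lone_byte_decodes = False
```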

ChrisA


[issue34096] [2.7] test_audioop.test_max() failed: AssertionError: -2147483648 != 2147483648L

2018-07-16 Thread Gregory P. Smith


Gregory P. Smith  added the comment:

IIRC we decided that CPython and extension modules always require -fwrapv.




Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Rhodri James

On 16/07/18 20:58, Terry Reedy wrote:
> On 7/16/2018 1:27 PM, Jim Lee wrote:
>
>> 90% of the world *is* "beneath my notice" when it comes to programming
>> for myself.   I really don't care if that's not PC enough for you.
>>
>> Had you actually read my words with *intent* rather than *reaction*,
>> you would notice that I suggested the *option* of turning off
>> Unicode.  I didn't say get *rid* of Unicode.  I didn't say make it
>> *harder* to use Unicode.  Once again - reaction rather than reading.
>>
>> Obviously, the most vocal representatives of the Python community are
>> too sensitive about their language to enable rational discussion.
>
> My empirical observation is that the more abrasive posters get rewarded
> with more response, while my attempts to engage in rational discussion,
> without ad hominems, get less.

I wouldn't disagree with you.  Fortunately Jim has pulled the "storming
off in a huff rather than answer a question anyone actually asked"
defence, so we can go back to debating about important things like how
to spell assignment expressions.

Oh wait... :-)

--
Rhodri James *-* Kynesim Ltd


Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Anders Wegge Keller
On Mon, 16 Jul 2018 11:33:46 -0700
Jim Lee wrote:

> Go right ahead.  I find it surprising that Stephen isn't banned, 
> considering the fact that he ridicules anyone he doesn't agree with.  
> But I guess he's one of the 'good 'ol boys', and so exempt from the code 
> of conduct.

Well said!

-- 
//Wegge


Re: Unicode [was Re: Cult-like behaviour]

2018-07-16 Thread Terry Reedy

On 7/16/2018 1:27 PM, Jim Lee wrote:

> 90% of the world *is* "beneath my notice" when it comes to programming
> for myself.   I really don't care if that's not PC enough for you.
>
> Had you actually read my words with *intent* rather than *reaction*, you
> would notice that I suggested the *option* of turning off Unicode.  I
> didn't say get *rid* of Unicode.  I didn't say make it *harder* to use
> Unicode.  Once again - reaction rather than reading.
>
> Obviously, the most vocal representatives of the Python community are
> too sensitive about their language to enable rational discussion.

My empirical observation is that the more abrasive posters get rewarded
with more response, while my attempts to engage in rational discussion,
without ad hominems, get less.


--
Terry Jan Reedy




[issue34128] Do not block threads when pickle/unpickle

2018-07-16 Thread ppperry


ppperry  added the comment:

um, something doesn't make sense about this. the python implementation of 
pickle never released the GIL (it can't, by definition -- it's written in 
python). The C implementation releasing the GIL wouldn't make sense, as the 
pickle api involves calls into python everywhere (for example, `__reduce__`)
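
As a rough illustration of that last point, the sketch below shows `pickle.dumps` calling back into Python-level code. The `Temperature` class is hypothetical, and it reduces to a plain `float` only to keep the example independent of how the module is imported.

```python
import pickle

calls = []


class Temperature:
    """Hypothetical wrapper type that pickles itself as a plain float."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __reduce__(self):
        # This is ordinary Python code, run in the middle of pickling.
        calls.append("__reduce__")
        return (float, (self.degrees,))


blob = pickle.dumps(Temperature(21.5))
restored = pickle.loads(blob)
```

Because hooks like this can run arbitrary Python code at any point during pickling, the pickler needs the GIL whenever it reaches one.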

--
nosy: +ppperry



