Re: Py 3.3, unicode / upper()

2012-12-29 Thread wxjmfauth
On Wednesday, 19 December 2012 16:33:50 UTC+1, Christian Heimes wrote: > > I think Python 3.3+ is using uppercase mapping (uc) instead of simple > > upper case (suc). I think you are thinking correctly. This is a clever answer. Note: I do not care about the uc / suc choice. As long as there is consi

Re: Py 3.3, unicode / upper()

2012-12-27 Thread wxjmfauth
On Thursday, 27 December 2012 20:00:37 UTC+1, Serhiy Storchaka wrote: > On 19.12.12 17:40, Chris Angelico wrote: > > > Interestingly, IDLE on my Windows box can't handle the bolded > > > characters very well... > > > > > s="\U0001d407\U0001d41e\U0001d425\U0001d425\U0001d428, > \U0001d

Re: Py 3.3, unicode / upper()

2012-12-27 Thread Serhiy Storchaka
On 19.12.12 17:40, Chris Angelico wrote: Interestingly, IDLE on my Windows box can't handle the bolded characters very well... s="\U0001d407\U0001d41e\U0001d425\U0001d425\U0001d428, \U0001d430\U0001d428\U0001d42b\U0001d425\U0001d41d!" print(s) Traceback (most recent call last): File "", li
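A minimal sketch of what is special about the string in this post: most of its characters have code points above U+FFFF, i.e. outside the Basic Multilingual Plane. Whether that is the cause of the IDLE failure is an assumption here, since the traceback above is cut off. The helper names `non_bmp_chars` and `bmp_safe` are hypothetical, introduced only for illustration.

```python
# The string from the post: "Hello, world!" in mathematical bold letters.
s = ("\U0001d407\U0001d41e\U0001d425\U0001d425\U0001d428, "
     "\U0001d430\U0001d428\U0001d42b\U0001d425\U0001d41d!")


def non_bmp_chars(text):
    """Return the characters whose code point lies above U+FFFF."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


def bmp_safe(text, placeholder="\ufffd"):
    """Replace every non-BMP character with a placeholder (assumed workaround)."""
    return "".join(ch if ord(ch) <= 0xFFFF else placeholder for ch in text)


if __name__ == "__main__":
    print(len(s), len(non_bmp_chars(s)))
    print(bmp_safe(s))
```

Replacing the characters loses information, so a workaround like `bmp_safe` would only make sense for display in a front end that cannot render them.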

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Ian Kelly
On Thu, Dec 20, 2012 at 12:19 PM, wrote: > The first (and it should be quite obvious) consequence is that > you create bloated, unnecessary and useless code. I simplify > the flexible string representation (FSR) and will use an "ascii" / > "non-ascii" model/terminology. > > If you are an "ascii"

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Terry Reedy
On 12/20/2012 2:19 PM, [email protected] wrote: My feeling is that most of the people are defending this FSR simply because it exists, not because of its intrinsic quality. The fact, contrary to your feeling, is that I was initially dubious that it could be made to work as well as it does. I

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Steven D'Aprano
On Thu, 20 Dec 2012 11:40:21 -0800, wxjmfauth wrote: > I do not care > about this optimization. I'm not an ascii user. As a non ascii user, > this optimization is just irrelevant. WRONG. Every Python user is an ASCII user. Every Python program has hundreds or thousands of ASCII strings. # ===

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Terry Reedy
On 12/20/2012 2:40 PM, [email protected] wrote: What should a Python user think, if he sees his strings are consuming more memory just because he uses non ascii characters What should a Python user think, if he (or she) sees his (or her) strings sometimes or often consuming less memory than

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Terry Reedy
On 12/20/2012 2:57 PM, [email protected] wrote: I shew a case where the Py33 works 10 times slower than Py32, "replace". You the devs spend your time to correct that case. I discovered that it is the 'find' part of find and replace that is slower. The comparison is worse on Windows than on
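A claim such as "find is slower" can be checked on one's own interpreter with the standard `timeit` module. The following is only a sketch: the haystack string, the needle, and the helper name `time_find` are assumptions for illustration and are not the benchmark discussed in the thread. Absolute numbers depend heavily on platform and Python version, so none are given.

```python
import timeit

# Hypothetical test data: a long ASCII run followed by one non-ASCII character.
haystack = "a" * 10000 + "\xe9"


def time_find(text, needle, number=1000):
    """Return the total seconds taken by `number` calls to text.find(needle)."""
    return timeit.timeit(lambda: text.find(needle), number=number)


if __name__ == "__main__":
    print("find:", time_find(haystack, "\xe9"))
```

Running the same script under two interpreter versions gives a like-for-like comparison for this one case only; it says nothing about string operations in general.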

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Terry Reedy
On 12/20/2012 2:19 PM, [email protected] wrote: If you are an "ascii" user, a FSR model has no sense. An "ascii" user will use, per definition, only "ascii characters". If you are a "non-ascii" user, the FSR model is also a non sense, because you are per definition a "non-ascii" user of "non

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Chris Angelico
On Fri, Dec 21, 2012 at 7:20 AM, MRAB wrote: > On 2012-12-20 19:19, [email protected] wrote: >> The rule is to treat every character of a unique set of characters >> of a coding scheme in, how to say, an "equal way". The problematic >> can be seen the other way, every coding scheme has been buil

Re: Py 3.3, unicode / upper()

2012-12-20 Thread MRAB
On 2012-12-20 19:19, [email protected] wrote: Fact. In order to work comfortably and with efficiency with a "scheme for the coding of the characters", can be unicode or any coding scheme, one has to take into account two things: 1) work with a unique set of characters and 2) work with a contigu

Re: Py 3.3, unicode / upper()

2012-12-20 Thread wxjmfauth
On Thursday, 20 December 2012 06:32:42 UTC+1, Terry Reedy wrote: > On 12/19/2012 10:12 PM, Westley Martínez wrote: > > > On Wed, Dec 19, 2012 at 09:54:20PM -0500, Terry Reedy wrote: > > >> On 12/19/2012 9:03 PM, Chris Angelico wrote: > > >>> On Thu, Dec 20, 2012 at 5:27 AM, Ian Kelly wrote: > >

Re: Py 3.3, unicode / upper()

2012-12-20 Thread wxjmfauth
On Wednesday, 19 December 2012 22:23:15 UTC+1, Ian wrote: > On Wed, Dec 19, 2012 at 1:55 PM, wrote: > > > Yes, it is correct (or can be considered as correct). > > > I do not wish to discuss the typographical problematic > > > of "Das Grosse Eszett". The web is full of pages on the > > > sub

Re: Py 3.3, unicode / upper()

2012-12-20 Thread wxjmfauth
On Wednesday, 19 December 2012 22:31:42 UTC+1, Ian wrote: > On Wed, Dec 19, 2012 at 2:18 PM, wrote: > > > latin-1 (iso-8859-1) ? are you sure ? > > > > Yes. > > > > sys.getsizeof('a') > > > 26 > > sys.getsizeof('ab') > > > 27 > > sys.getsizeof('aé') > > > 39 > > > >

Re: Py 3.3, unicode / upper()

2012-12-20 Thread wxjmfauth
Fact. In order to work comfortably and with efficiency with a "scheme for the coding of the characters", can be unicode or any coding scheme, one has to take into account two things: 1) work with a unique set of characters and 2) work with a contiguous block of code points. At this point, it shoul

Re: Py 3.3, unicode / upper()

2012-12-20 Thread Johannes Bauer
On 19.12.2012 16:40, Chris Angelico wrote: > You may not be familiar with jmf. He's one of our resident trolls, and > he has a bee in his bonnet about PEP 393 strings, on the basis that > they take up more space in memory than a narrow build of Python 3.2 > would, for a string with lots of BMP cha

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Steven D'Aprano
On Thu, 20 Dec 2012 00:32:42 -0500, Terry Reedy wrote: > In the unicode case, Jim discovered that find was several times slower > in 3.3 than 3.2 and claimed that that was a reason to not use 3.3. I ran > the complete stringbench.py and discovered that find (and consequently > find and replace) ar

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Terry Reedy
On 12/19/2012 10:12 PM, Westley Martínez wrote: On Wed, Dec 19, 2012 at 09:54:20PM -0500, Terry Reedy wrote: On 12/19/2012 9:03 PM, Chris Angelico wrote: On Thu, Dec 20, 2012 at 5:27 AM, Ian Kelly wrote: From what I've been able to discern, [jmf's] actual complaint about PEP 393 stems from m

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Chris Angelico
On Thu, Dec 20, 2012 at 2:12 PM, Westley Martínez wrote: > Really, why should we be so obsessed with speed anyways? Isn't > improving the language and fixing bugs far more important? Because speed is very important in certain areas. Python can be used in many ways: * Command-line calculator wit

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Westley Martínez
On Wed, Dec 19, 2012 at 09:54:20PM -0500, Terry Reedy wrote: > On 12/19/2012 9:03 PM, Chris Angelico wrote: > >On Thu, Dec 20, 2012 at 5:27 AM, Ian Kelly wrote: > >> From what I've been able to discern, [jmf's] actual complaint about PEP > >>393 stems from misguided moral concerns. With PEP-393,

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Terry Reedy
On 12/19/2012 9:03 PM, Chris Angelico wrote: On Thu, Dec 20, 2012 at 5:27 AM, Ian Kelly wrote: From what I've been able to discern, [jmf's] actual complaint about PEP 393 stems from misguided moral concerns. With PEP-393, strings that can be fully represented in Latin-1 can be stored in half

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Westley Martínez
On Wed, Dec 19, 2012 at 02:23:15PM -0700, Ian Kelly wrote: > On Wed, Dec 19, 2012 at 1:55 PM, wrote: > > If "wrong", this can be considered as programmatically correct > > or logically acceptable (Py3.2) > > > 'Straße'.upper().lower().capitalize() == 'Straße' > > True > > > > while this wil

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Chris Angelico
On Thu, Dec 20, 2012 at 5:27 AM, Ian Kelly wrote: > From what I've been able to discern, [jmf's] actual complaint about PEP > 393 stems from misguided moral concerns. With PEP-393, strings that > can be fully represented in Latin-1 can be stored in half the space > (ignoring fixed overhead) compa

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Chris Angelico
On Thu, Dec 20, 2012 at 8:23 AM, Ian Kelly wrote: > On Wed, Dec 19, 2012 at 1:55 PM, wrote: >> Yes, it is correct (or can be considered as correct). >> I do not wish to discuss the typographical problematic >> of "Das Grosse Eszett". The web is full of pages on the >> subject. However, I never s

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Terry Reedy
On 12/19/2012 10:40 AM, Chris Angelico wrote: Interestingly, IDLE on my Windows box can't handle the bolded characters very well... s="\U0001d407\U0001d41e\U0001d425\U0001d425\U0001d428, \U0001d430\U0001d428\U0001d42b\U0001d425\U0001d41d!" print(s) Traceback (most recent call last): File

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Ian Kelly
On Wed, Dec 19, 2012 at 2:18 PM, wrote: > latin-1 (iso-8859-1) ? are you sure ? Yes. sys.getsizeof('a') > 26 sys.getsizeof('ab') > 27 sys.getsizeof('aé') > 39 Compare to: >>> sys.getsizeof('a\u0100') 42 The reason for the difference you posted is that pure ASCII strings have a
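The fixed per-string overhead can be separated from the per-character cost by differencing the sizes of two strings built from the same character. This is a sketch under the assumption of CPython 3.3 or later (PEP 393); `bytes_per_char` is a hypothetical helper, and `sys.getsizeof` results are implementation details that other interpreters need not share.

```python
import sys


def bytes_per_char(ch, n=100):
    """Estimate storage per character by differencing two string sizes.

    Both strings consist only of `ch`, so they use the same internal
    representation and the fixed header cancels out in the subtraction.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) / n


if __name__ == "__main__":
    for label, ch in [("ASCII", "a"), ("Latin-1", "\xe9"),
                      ("BMP", "\u0100"), ("non-BMP", "\U0001d407")]:
        print(label, bytes_per_char(ch))
```

On a PEP 393 build one would expect 1, 1, 2 and 4 bytes per character respectively, which matches the point made above: 'é' still fits the one-byte Latin-1 representation, and only the fixed header differs from the pure-ASCII case.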

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Ian Kelly
On Wed, Dec 19, 2012 at 1:55 PM, wrote: > Yes, it is correct (or can be considered as correct). > I do not wish to discuss the typographical problematic > of "Das Grosse Eszett". The web is full of pages on the > subject. However, I never succeeded to find an "official > position" from Unicode. T

Re: Py 3.3, unicode / upper()

2012-12-19 Thread wxjmfauth
On Wednesday, 19 December 2012 19:27:38 UTC+1, Ian wrote: > On Wed, Dec 19, 2012 at 8:40 AM, Chris Angelico wrote: > > > You may not be familiar with jmf. He's one of our resident trolls, and > > > he has a bee in his bonnet about PEP 393 strings, on the basis that > > > they take up more spac

Re: Py 3.3, unicode / upper()

2012-12-19 Thread wxjmfauth
On Wednesday, 19 December 2012 15:52:23 UTC+1, Christian Heimes wrote: > On 19.12.2012 15:23, [email protected] wrote: > > > But, this is not the problem. > > > I was surprised to discover this: > > > > > 'Straße'.upper() > > > 'STRASSE' > > > > > > I really, really do not know wh

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Benjamin Peterson
wxjmfauth writes: > I really, really do not know what I should think about that. > (It is a complex subject.) And the real question is why? Because that's what the Unicode spec says to do.

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Ian Kelly
On Wed, Dec 19, 2012 at 8:40 AM, Chris Angelico wrote: > You may not be familiar with jmf. He's one of our resident trolls, and > he has a bee in his bonnet about PEP 393 strings, on the basis that > they take up more space in memory than a narrow build of Python 3.2 > would, for a string with lot

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Chris Angelico
On Thu, Dec 20, 2012 at 2:18 AM, Johannes Bauer wrote: > On 19.12.2012 15:23, [email protected] wrote: >> I was using the German word "Straße" (Strasse) — German >> translation from "street" — to illustrate the catastrophic and >> completely wrong-by-design Unicode handling in Py3.3, this >> tim

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Christian Heimes
On 19.12.2012 16:01, Stefan Krah wrote: > The uppercase ß isn't really needed, since ß does not occur at the beginning > of a word. As far as I know, most Germans wouldn't even know that it has > existed at some point or how to write it. I think Python 3.3+ is using uppercase mapping (uc) instea

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Johannes Bauer
On 19.12.2012 16:18, Johannes Bauer wrote: > How do those arbitrary numbers prove anything at all? Why do you draw > the conclusion that it's broken by design? What do you expect? You're > very vague here. Just to show how ridiculously pointless your numbers > are, your example gives 84 on Python3.

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Johannes Bauer
On 19.12.2012 15:23, [email protected] wrote: > I was using the German word "Straße" (Strasse) — German > translation from "street" — to illustrate the catastrophic and > completely wrong-by-design Unicode handling in Py3.3, this > time from a memory point of view (not speed): > sys.getsiz

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Chris Angelico
On Thu, Dec 20, 2012 at 1:23 AM, wrote: > But, this is not the problem. > I was surprised to discover this: > 'Straße'.upper() > 'STRASSE' > > I really, really do not know what I should think about that. > (It is a complex subject.) And the real question is why? Not all strings can be upperc

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Stefan Krah
[email protected] wrote: > But, this is not the problem. > I was surprised to discover this: > > >>> 'Straße'.upper() > 'STRASSE' > > I really, really do not know what I should think about that. > (It is a complex subject.) And the real question is why? http://de.wikipedia.org/wiki/Gro%C3%9Fes

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Christian Heimes
On 19.12.2012 15:23, [email protected] wrote: > But, this is not the problem. > I was surprised to discover this: > 'Straße'.upper() > 'STRASSE' > > I really, really do not know what I should think about that. > (It is a complex subject.) And the real question is why? It's correct. LATIN
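The behaviour under discussion can be reproduced in a few lines. This is a small sketch assuming Python 3.3 or later, where `str.upper()` applies the full Unicode case mapping and `str.casefold()` is available; the variable names are illustrative only.

```python
word = "Stra\xdfe"               # 'Straße'
upper = word.upper()             # full case mapping expands ß to SS
round_trip = upper.lower().capitalize()

capital_sharp_s = "\u1e9e"       # LATIN CAPITAL LETTER SHARP S

if __name__ == "__main__":
    print(upper, len(word), len(upper))
    # The round trip does not restore the original spelling.
    print(round_trip, round_trip == word)
    # The capital sharp s lowercases to ß, but ß does not uppercase to it.
    print(capital_sharp_s.lower() == "\xdf", "\xdf".upper() == capital_sharp_s)
    # casefold() is the tool for caseless comparison.
    print(word.casefold() == upper.casefold())
```

Because uppercasing is not reversible here, comparing `casefold()` results is the safer choice when the goal is a case-insensitive match rather than a display form.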

Re: Py 3.3, unicode / upper()

2012-12-19 Thread Thomas Bach
On Wed, Dec 19, 2012 at 06:23:00AM -0800, [email protected] wrote: > I was surprised to discover this: > > >>> 'Straße'.upper() > 'STRASSE' > > I really, really do not know what I should think about that. > (It is a complex subject.) And the real question is why? Because there is no definition

Py 3.3, unicode / upper()

2012-12-19 Thread wxjmfauth
I was using the German word "Straße" (Strasse) — German translation from "street" — to illustrate the catastrophic and completely wrong-by-design Unicode handling in Py3.3, this time from a memory point of view (not speed): >>> sys.getsizeof('Straße') 43 >>> sys.getsizeof('STRAẞE') 50 instead of