Re: [Python-Dev] [Python-checkins] cpython: Close #4966: revamp the sequence docs in order to better explain the state of
On Tue, Aug 21, 2012 at 11:55 AM, Ezio Melotti wrote:
>> +Sequence Types --- :class:`list`, :class:`tuple`, :class:`range`
>> +
>> +
>
>
> These 3 links in the section title redirect to the functions.html page. I
> think it would be better if they linked to the appropriate subsection
> instead, and in the case of the subsections (e.g. "Text Sequence Type ---
> str") they shouldn't be links. The same comment can be applied to other
> titles as well.
I made a start on moving the info out of functions.html and adding
appropriate noindex entries. str, bytes and bytearray haven't been
consolidated at all yet.
>> ++--++--+
>> +| ``s * n, n * s`` | *n* shallow copies of *s* | (2)(7) |
>
>
> I would use '``s * n`` or ``n * s``' here.
Done.
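To illustrate why the table entry says *shallow* copies, here is a minimal sketch. The names `nested` and `independent` are made up for this example and do not come from the patch.

```python
# Repetition copies references to the inner object, not the object itself.
nested = [[]] * 3
nested[0].append(1)      # mutates the single inner list shared by all slots
print(nested)            # [[1], [1], [1]]

# Building each inner list separately avoids the sharing.
independent = [[] for _ in range(3)]
independent[0].append(1)
print(independent)       # [[1], [], []]
```

The list comprehension is the usual way to get distinct inner lists when that is what is wanted.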
>> +| ``s.index(x, [i[, j]])`` | index of the first occurence | \(8) |
>
>
> This should be ``s.index(x[, i[, j]])``
Done.
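As a small sketch of what the corrected signature ``s.index(x[, i[, j]])`` means in practice (the sample list is an assumption chosen for illustration):

```python
s = ['a', 'b', 'c', 'a', 'b', 'c']

first = s.index('b')       # search the whole sequence -> 1
later = s.index('b', 2)    # start searching at position 2 -> 4

# With both i and j, only s[2:4] is searched; 'b' is not in that range.
try:
    s.index('b', 2, 4)
except ValueError:
    found_in_range = False
else:
    found_in_range = True

print(first, later, found_in_range)
```

Note that the returned index is relative to the start of the whole sequence, not to the start of the searched range.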
>> +
>> + * if concatenating :class:`tuple` objects, extend a :class:`list`
>> instead.
>> +
>> + * for other types, investigate the relevant class documentation
>> +
>
>
> The trailing punctuation of the elements in this list is inconsistent.
Just removed all trailing punctuation from these bullet points for now.
>
> You missed clear() from this list.
The problem was actually index() and count() were missing from the
index for the "common sequence operations" table. Added them there,
and moved that index above the table.
copy() was missing from the index list for the mutable sequence
methods, so I added that.
> Also in the "Result" column the descriptions in prose are OK, but I find
> some of the "same as ..." ones not very readable (or even fairly obscure).
> (I think I saw something similar in the doc of list.append() too.)
These are all rather old - much of this patch was just moving things
around rather than fixing the prose, although there was plenty of the
latter, too :)
I tried to improve them a bit.
> Is it worth mentioning a function call as an example of syntactic ambiguity?
> Someone might wonder if foo(a, b, c) is actually passing a 3-elements tuple
> or 3 distinct values.
Done.
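A minimal sketch of the ambiguity being discussed, using a hypothetical helper `count_args` that is not part of the docs patch:

```python
def count_args(*args):
    """Return how many positional arguments were passed."""
    return len(args)

a, b, c = 1, 2, 3
three_values = count_args(a, b, c)     # three separate arguments
one_tuple = count_args((a, b, c))      # one argument that is a 3-tuple

print(three_values, one_tuple)         # 3 1
```

The extra parentheses are what turn the comma-separated values into a single tuple argument.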
> This claim is maybe a bit too strong. I think the main reason to use
> namedtuples is being able to access the elements via t.name, rather than
> t[pos], and while this can be useful for basically every heterogeneous
> tuple, I think that plain tuples are still preferred.
Reworded.
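For readers who want to see the difference between ``t.name`` and ``t[pos]``, here is a short sketch. The `Point` type is a made-up example, not something from the patch.

```python
from collections import namedtuple

# A hypothetical record type with two named fields.
Point = namedtuple('Point', ['x', 'y'])

p = Point(x=3, y=4)
plain = (3, 4)

print(p.x, p.y)        # access by field name
print(p[0], p[1])      # positional access still works
print(p == plain)      # a namedtuple compares equal to the plain tuple
```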
> On a separate note, should tuple unpacking be mentioned here? (a link to a
> separate section of the doc is enough.)
Not really - despite the name, tuple unpacking isn't especially
closely related to tuples these days.
> I would mention explicitly "in :keyword:`for` loops" -- ranges don't loop on
> their own (I think people familiar with Ruby and/or JQuery might get
> confused here).
Done.
> I thought that these two paragraphs were talking about positive and negative
> start/stop/step until I reached the middle of the second paragraph (the word
> "indices" wasn't enough to realize that these paragraphs are about
> indexing/slicing, probably because they are rarely used and I wasn't
> expecting to find them at this point of the doc). Maybe it's better to move
> the paragraphs at the bottom of the section.
For the moment, I've just dumped the old range builtin docs into this
section. They need a pass to remove the duplication and ensure
everything makes sense in context.
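Since the indexing and slicing paragraphs were easy to misread, a short sketch of what they describe may help. The particular range is an assumption chosen for illustration, and slicing a range requires Python 3.2 or later.

```python
r = range(0, 20, 2)        # 0, 2, 4, ..., 18

third = r[3]               # indexing -> 6
last = r[-1]               # negative indices count from the end -> 18
middle = r[1:4]            # slicing returns another range object

print(third, last, middle, list(middle))
print(10 in r, 11 in r)    # membership tests work without a loop
```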
>> +String literals that are part of a single expression and have only
>> whitespace
>> +between them will be implicitly converted to a single string literal.
>> +
>
>
> Is it a string /literal/ they are converted to?
Yup:
>>> ast.dump(compile('"hello world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
"Expression(body=Str(s='hello world'))"
>>> ast.dump(compile('"hello" " world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
"Expression(body=Str(s='hello world'))"
> Anyway a simple ('foo' 'bar') == 'foobar' example might make this sentence
> more understandable.
Added.
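The check above can also be written so that it does not depend on the exact node name printed by ``ast.dump`` (the thread shows ``Str``; later Python versions print a different node name). This is only a sketch of the same comparison:

```python
import ast

# Adjacent literals separated only by whitespace form one string.
assert ('foo' 'bar') == 'foobar'

# Compare the parse trees instead of their printed form.
joined = ast.dump(ast.parse('"hello" " world"', mode='eval'))
single = ast.dump(ast.parse('"hello world"', mode='eval'))
same_tree = (joined == single)
print(same_tree)
```

If the two dumps are equal, the parser has already merged the adjacent literals before any code runs.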
>> +There is also no mutable string type, but :meth:`str.join` or
>> +:class:`io.StringIO` can be used to efficiently construct strings from
>> +multiple fragments.
>> +
>
> str.format() deserves to be mentioned here too.
For the kinds of strings where quadratic growth is a problem,
str.format is unlikely to be appropriate.
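A minimal sketch of the two approaches named in the quoted doc text, assuming a hypothetical list of fragments:

```python
import io

fragments = [str(i) for i in range(5)]   # stand-in for many small pieces

# Approach 1: collect the pieces and join them once at the end.
joined = ''.join(fragments)

# Approach 2: write the pieces into an in-memory text buffer.
buffer = io.StringIO()
for fragment in fragments:
    buffer.write(fragment)
built = buffer.getvalue()

print(joined, built)
```

Both avoid repeated ``result += fragment`` on an immutable string, which is the pattern that can degrade to quadratic time.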
> I noticed that here there's this fairly long section about the "old" string
> formatting and nothing about the "new" formatting. Maybe this should be
> moved together with the new formatting doc, so that all the detailed
> formatting docs are in the same place. (This would also help making this
> less noticeable)
Probably. There are a lot of structural problems in the current docs,
because the layout hasn't previously changed to suit the language
design changes.
>> +While bytes literals and representations are based on ASCII text, bytes
>> +objects actually beha
[Python-Dev] Python 2.7: only Visual Studio 2008?
Greetings,
it is my understanding that the patches floating around the net to support
Visual Studio 2010 to compile the Python core and for distutils will never be
accepted and therefore that the 2.7 line is stuck to VS 2008 for the
remainder of its life. Could you please confirm that?
Best wishes,
Luc
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-checkins] cpython: Close #4966: revamp the sequence docs in order to better explain the state of
On Tue, 21 Aug 2012 17:47:28 +1000, Nick Coghlan wrote:
> On Tue, Aug 21, 2012 at 11:55 AM, Ezio Melotti wrote:
> >> +String literals that are part of a single expression and have only
> >> whitespace
> >> +between them will be implicitly converted to a single string literal.
> >> +
> >
> >
> > Is it a string /literal/ they are converted to?
> Yup:
>
> >>> ast.dump(compile('"hello world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
> "Expression(body=Str(s='hello world'))"
> >>> ast.dump(compile('"hello" " world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
> "Expression(body=Str(s='hello world'))"
>
> > Anyway a simple ('foo' 'bar') == 'foobar' example might make this sentence
> > more understandable.
>
> Added.
I think it is an important and subtle point that this happens at "compile
time" rather than "run time". Subtle in that it is not at all obvious
(as this question demonstrates), and important in that it does have
performance implications (even if those are trivial in most cases).
So I think it would be worth saying "implicitly converted to a single
string literal when the source is parsed", or something like that.
--David
Re: [Python-Dev] [Python-checkins] cpython: Close #4966: revamp the sequence docs in order to better explain the state of
On Tue, Aug 21, 2012 at 10:01 PM, R. David Murray wrote:
> I think it is an important and subtle point that this happens at "compile
> time" rather than "run time". Subtle in that it is not at all obvious
> (as this question demonstrates), and important in that it does have
> performance implications (even if those are trivial in most cases).
> So I think it would be worth saying "implicitly converted to a single
> string literal when the source is parsed", or something like that.
That kind of fine detail is what the language reference is for - the
distinction really doesn't matter most of the time.
Cheers,
Nick.
--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] 3.3 str timings
2012/8/18 Terry Reedy :
> The issue came up in python-list about string operations being slower in
> 3.3. (The categorical claim is false as some things are actually faster.)
Yes, some operations are slower, but others are faster :-) There was
an important effort to limit the overhead of the PEP 393 (when the
branch was merged, most operations were slower). I tried to fix all
performance regressions. If you find cases where Python 3.3 is slower,
I can investigate and try to optimize it (in Python 3.4) or at least
explain why it is slower :-)
As said by Antoine, use the stringbench tool if you would like to get
a first overview of string performances.
> Some things I understand, this one I do not.
>
> Win7-64, 3.3.0b2 versus 3.2.3
> print(timeit("c in a", "c = '…'; a = 'a'*1000+c")) # ord(c) = 8230
> # .6 in 3.2, 1.2 in 3.3
On Linux with narrow build (UTF-16), I get:
$ python3.2 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
10 loops, best of 3: 4.25 usec per loop
$ python3.3 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
10 loops, best of 3: 3.21 usec per loop
Linux-2.6.30.10-105.2.23.fc11.i586-i686-with-fedora-11-Leonidas
Python 3.2.2+ (3.2:1453d2fe05bf, Aug 21 2012, 14:21:05)
Python 3.3.0b2+ (default:b36ce0a3a844, Aug 21 2012, 14:05:23)
I'm not sure that I read your benchmark correctly: you write c='...'
and then ord(c)=8230. Algorithms to find a substring are different if
the substring is a single character or if the substring is longer. For
1 character, Antoine Pitrou modified the code to use memchr() and
memrchr(), even if the string is not UCS1 (in this benchmark, the
string uses a UCS2 storage): it may find false positives.
> Why is searching for a two-byte char in a two-bytes per char string so much
> faster in 3.2?
Can you reproduce your benchmark on other Windows platforms? Do you
run the benchmark more than once? I always run a benchmark 3 times.
I don't like the timeit module for micro benchmarks, it is really
unstable (default settings are not written for micro benchmarks).
Example of 4 runs on the same platform:
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 2.79 usec per loop
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 2.61 usec per loop
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 3.16 usec per loop
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 2.76 usec per loop
I wrote my own benchmark tool, based on timeit, to have more stable
results on micro benchmarks:
https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
Example of 4 runs:
3.18 us: c=chr(8230); a='a'*1000+c; c in a
3.18 us: c=chr(8230); a='a'*1000+c; c in a
3.21 us: c=chr(8230); a='a'*1000+c; c in a
3.18 us: c=chr(8230); a='a'*1000+c; c in a
My benchmark.py script automatically calibrates the number of loops to
take at least 100 ms, and then repeats the test for at least 1.0
second.
Using time instead of a fixed number of loops is more reliable because
the test is less dependent on the system activity.
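The time-based approach described above can be sketched roughly as follows. This is not the actual benchmark.py script; the function name `bench` and its parameters are assumptions made for this illustration, built only on the standard `timeit.Timer` class.

```python
import timeit

def bench(stmt, setup="pass", min_loop_time=0.1, min_total_time=1.0):
    """Return the best per-loop time in seconds for stmt."""
    timer = timeit.Timer(stmt, setup)

    # Calibration: double the loop count until one run takes long enough.
    loops = 1
    while timer.timeit(loops) < min_loop_time:
        loops *= 2

    # Measurement: keep repeating until the total time budget is spent.
    best = float("inf")
    spent = 0.0
    while spent < min_total_time:
        elapsed = timer.timeit(loops)
        spent += elapsed
        best = min(best, elapsed / loops)
    return best

per_loop = bench("c in a", "c = chr(8230); a = 'a' * 1000 + c")
print("%.2f us per loop" % (per_loop * 1e6))
```

Reporting the minimum is one common choice for micro benchmarks; other summaries (median, mean) are equally possible in such a sketch.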
> print(timeit("a.encode()", "a = 'a'*1000"))
> # 1.5 in 3.2, .26 in 3.3
>
> print(timeit("a.encode(encoding='utf-8')", "a = 'a'*1000"))
> # 1.7 in 3.2, .51 in 3.3
This test doesn't compare performances of the UTF-8 encoder: "encode"
an ASCII string to UTF-8 in Python 3.3 is a no-op, it just duplicates
the memory (ASCII is compatible with UTF-8)...
So your benchmark just measures the performances of
PyArg_ParseTupleAndKeywords()... Try also str.encode('utf-8').
If you want to benchmark the UTF-8 encoder, use at least a non-ASCII
character like "\x80".
At least, your benchmark shows that Python 3.3 is *much* faster than
Python 3.2 to "encode" pure ASCII strings to UTF-8 :-)
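Following the suggestion to include at least one non-ASCII character, a comparison could look like the sketch below. The variable names and the loop count are assumptions; no timings are claimed here, since they depend on the machine and the Python version. The `globals` argument of `timeit.timeit` needs Python 3.5 or later.

```python
import timeit

ascii_text = 'a' * 1000             # pure ASCII: the fast path discussed above
mixed_text = 'a' * 999 + '\x80'     # one non-ASCII character forces real encoding

timings = {}
for label, text in (('ascii', ascii_text), ('non-ascii', mixed_text)):
    timings[label] = timeit.timeit(
        "text.encode('utf-8')", globals={'text': text}, number=10000)
    print(label, timings[label])

# '\x80' needs two bytes in UTF-8, so the encoded form is one byte longer.
print(len(ascii_text.encode('utf-8')), len(mixed_text.encode('utf-8')))
```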
Victor
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
On Tue, Aug 21, 2012 at 5:24 AM, Luc Bourhis wrote:
> Greetings,
>
> it is my understanding that the patches floating around the net to support
> Visual Studio 2010 to compile the Python core and for distutils will never be
> accepted and therefore that the 2.7 line is stuck to VS 2008 for the
> remainder of its life. Could you please confirm that?
This is correct. A compiler upgrade is a feature, so the change to
VS2010 could only be applied to the version actively receiving new
features, which at the time was 3.3.
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
2012/8/21 Brian Curtin:
> On Tue, Aug 21, 2012 at 5:24 AM, Luc Bourhis wrote:
>> Greetings,
>>
>> it is my understanding that the patches floating around the net to support
>> Visual Studio 2010 to compile the Python core and for distutils will never
>> be accepted and therefore that the 2.7 line is stuck to VS 2008 for the
>> remainder of its life. Could you please confirm that?
>
> This is correct. A compiler upgrade is a feature, so the change to
> VS2010 could only be applied to the version actively receiving new
> features, which at the time was 3.3.
But this does not prevent anyone from creating and maintaining such a
patch, outside of the official python.org repository.
--
Amaury Forgeot d'Arc
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
Quoting Luc Bourhis:
> it is my understanding that the patches floating around the net to support
> Visual Studio 2010 to compile the Python core and for distutils will never be
> accepted and therefore that the 2.7 line is stuck to VS 2008 for the
> remainder of its life. Could you please confirm that?
That is correct, yes.
OTOH, Python is free software, so people are free to maintain such
patches, and even make binary releases out of them. These just won't
be available from python.org.
Regards,
Martin
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
Quoting Brian Curtin:
> On Tue, Aug 21, 2012 at 5:24 AM, Luc Bourhis wrote:
>> Greetings,
>>
>> it is my understanding that the patches floating around the net to support
>> Visual Studio 2010 to compile the Python core and for distutils will never
>> be accepted and therefore that the 2.7 line is stuck to VS 2008 for the
>> remainder of its life. Could you please confirm that?
>
> This is correct. A compiler upgrade is a feature
In the specific case, this isn't actually the limiting factor.
Instead, it's binary compatibility: binaries compiled with VS 2010
are incompatible (in some cases) with those compiled with VS 2008.
So if the python.org binaries were released as compiler outputs
from VS 2010, existing extension modules might crash Python. Therefore,
we cannot switch.
Maintaining a VS 2010 build process along with the VS 2008 process
would be a new feature, indeed.
Fortunately, Mercurial makes it easy enough to maintain such patches
in a way that allows simple tracking of changes applied to 2.7
itself, for anybody with enough interest to do so.
Regards,
Martin
Re: [Python-Dev] 3.3 str timings
> My benchmark.py script automatically calibrates the number of loops to
> take at least 100 ms, and then repeats the test for at least 1.0
> second.
>
> Using time instead of a fixed number of loops is more reliable because
> the test is less dependent on the system activity.
I've also been bitten in the past by something that is probably quite
obvious but that I didn't think of: dynamic CPU frequency. Many
modern CPUs can dynamically change the frequency depending on the load
and temperature, and the switch can take more than one second.
When doing benchmarks now I've a small script (based on cpufreq-set)
that just blocks all the cores into fast mode.
Re: [Python-Dev] 3.3 str timings
>> print(timeit("c in a", "c = '…'; a = 'a'*1000+c")) # ord(c) = 8230
> I'm not sure that I read your benchmark correctly: you write c='...'
Apparently you didn't - or your MUA was not able to display it
correctly. He didn't say
'...' # U+002E U+002E U+002E, 3x FULL STOP
but
'…' # U+2026, HORIZONTAL ELLIPSIS
Regards,
Martin
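The distinction can be checked directly in an interpreter. This is a small sketch; the variable names are chosen for illustration only.

```python
import unicodedata

three_dots = '...'          # three separate FULL STOP characters
ellipsis_char = '\u2026'    # one HORIZONTAL ELLIPSIS character

for text in (three_dots, ellipsis_char):
    code_points = [hex(ord(ch)) for ch in text]
    names = [unicodedata.name(ch) for ch in text]
    print(len(text), code_points, names)

print(ord(ellipsis_char))   # 8230, the value quoted in the benchmark
```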
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
On 21.08.2012 17:01, [email protected] wrote:
> In the specific case, this isn't actually the limiting factor.
> Instead, it's binary compatibility: binaries compiled with VS 2010
> are incompatible (in some cases) with those compiled with VS 2008.
> So if the python.org binaries were released as compiler outputs
> from VS 2010, existing extension modules might crash Python. Therefore,
> we cannot switch.
Compatibility issues may lead to other strange bugs, too. IIRC each
msvcrt has its own thread local storage and therefore its own errno
handling. An extension compiled with VS 2010 won't be able to use the
PyErr_SetFromErrno*() function correctly. That's much harder to debug
than a FILE pointer mismatch because it usually doesn't cause a
segfault.
Christian
Re: [Python-Dev] 3.3 str timings
On Tue, 21 Aug 2012 17:20:14 +0200
Andrea Griffini wrote:
> > My benchmark.py script automatically calibrates the number of loops to
> > take at least 100 ms, and then repeats the test for at least 1.0
> > second.
> >
> > Using time instead of a fixed number of loops is more reliable because
> > the test is less dependent on the system activity.
>
> I've also been bitten in the past by something that is probably quite
> obvious but that I didn't think of: dynamic CPU frequency. Many
> modern CPUs can dynamically change the frequency depending on the load
> and temperature, and the switch can take more than one second.
>
> When doing benchmarks now I've a small script (based on cpufreq-set)
> that just blocks all the cores into fast mode.
For the record, under Linux, the following command:
$ sudo cpufreq-set -rg performance
should do the trick.
Regards
Antoine.
--
Software development and contracting: http://pro.pitrou.net
Re: [Python-Dev] 3.3 str timings
On 21/08/12 23:04, Victor Stinner wrote:
> I don't like the timeit module for micro benchmarks, it is really
> unstable (default settings are not written for micro benchmarks).
[...]
> I wrote my own benchmark tool, based on timeit, to have more stable
> results on micro benchmarks:
> https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
I am surprised, because the whole purpose of timeit is to time micro
code snippets.
If it is as unstable as you suggest, and if you have an alternative
which is more stable and accurate, I would love to see it in the
standard library.
--
Steven
Re: [Python-Dev] 3.3 str timings
On 21 Aug 2012, at 19:25, Steven D'Aprano wrote:
> On 21/08/12 23:04, Victor Stinner wrote:
>
>> I don't like the timeit module for micro benchmarks, it is really
>> unstable (default settings are not written for micro benchmarks).
> [...]
>> I wrote my own benchmark tool, based on timeit, to have more stable
>> results on micro benchmarks:
>> https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
>
> I am surprised, because the whole purpose of timeit is to time micro
> code snippets.
And when invoked from the command-line, it is already time-based: unless
-n is specified, python guesstimates the number of iterations to be a
power of 10 resulting in at least 0.2s per test (the repeat defaults to
3 though)
As a side-note, every time I use timeit programmatically, it annoys me
that this behavior is not available and has to be implemented manually.
> If it is as unstable as you suggest, and if you have an alternative
> which is more stable and accurate, I would love to see it in the
> standard library.
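For readers on Python 3.6 or later, this calibration is available programmatically as `timeit.Timer.autorange()`. The sketch below assumes such a version; it does not apply to the 3.2 and 3.3 interpreters discussed in this thread.

```python
import timeit

timer = timeit.Timer("c in a", setup="c = chr(8230); a = 'a' * 1000 + c")

# autorange() picks a loop count so that the run takes at least 0.2 seconds
# and returns that count together with the measured total time.
loops, total = timer.autorange()
per_loop = total / loops
print(loops, "loops,", "%.2f us per loop" % (per_loop * 1e6))
```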
Re: [Python-Dev] 3.3 str timings
Xavier Morel, 21.08.2012 19:56:
> On 21 Aug 2012, at 19:25, Steven D'Aprano wrote:
>> On 21/08/12 23:04, Victor Stinner wrote:
>>> I don't like the timeit module for micro benchmarks, it is really
>>> unstable (default settings are not written for micro benchmarks).
>> [...]
>>> I wrote my own benchmark tool, based on timeit, to have more stable
>>> results on micro benchmarks:
>>> https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
>>
>> I am surprised, because the whole purpose of timeit is to time micro
>> code snippets.
>
> And when invoked from the command-line, it is already time-based: unless
> -n is specified, python guesstimates the number of iterations to be a
> power of 10 resulting in at least 0.2s per test (the repeat defaults to
> 3 though)
>
> As a side-note, every time I use timeit programmatically, it annoys me
> that this behavior is not available and has to be implemented manually.
+100, sounds like someone should contribute a patch for this.
Stefan
Re: [Python-Dev] 3.3 str timings
On Tue, Aug 21, 2012 at 1:56 PM, Xavier Morel wrote:
> As a side-note, every time I use timeit programmatically, it annoys me that
> this behavior is not available and has to be implemented manually.
You are not alone:
http://bugs.python.org/issue6422
Re: [Python-Dev] 3.3 str timings
On Wed, 22 Aug 2012 03:25:21 +1000
Steven D'Aprano wrote:
> On 21/08/12 23:04, Victor Stinner wrote:
>
> > I don't like the timeit module for micro benchmarks, it is really
> > unstable (default settings are not written for micro benchmarks).
> [...]
> > I wrote my own benchmark tool, based on timeit, to have more stable
> > results on micro benchmarks:
> > https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
>
> I am surprised, because the whole purpose of timeit is to time micro
> code snippets.
>
> If it is as unstable as you suggest, and if you have an alternative
> which is more stable and accurate, I would love to see it in the
> standard library.
In my experience timeit is stable enough to know whether a change is
significant or not. No need for three-digit precision when the question
is whether there is at least a 10% performance difference between two
approaches.
Regards
Antoine.
--
Software development and contracting: http://pro.pitrou.net
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
Thanks for the quick response.
>> [...] A compiler upgrade is a feature, so the change to
>> VS2010 could only be applied to the version actively receiving new
>> features, which at the time was 3.3.
>
> But this does not prevent anyone from creating and maintaining such a
> patch, outside of the official python.org repository.
I was contemplating that option indeed. Sébastien Sablé seemed to have
the same aim. Would you know any other such efforts? I would prefer
to contribute back to the community.
Best wishes,
Luc Bourhis
Re: [Python-Dev] 3.3 str timings
On 19.08.12 00:17, Terry Reedy wrote:
> This is one of the 3.3 improvements. But since the results are equal:
> ('a'*1000).encode() == ('a'*1000).encode(encoding='utf-8')
> and 3.3 should know that for an all-ascii string, I do not see why
> adding the parameter should double the time. Another issue or known
> and un-fixable?
This is a cost of argument packing/unpacking.
Re: [Python-Dev] 3.3 str timings
On 8/21/2012 9:04 AM, Victor Stinner wrote:
> 2012/8/18 Terry Reedy :
>> The issue came up in python-list about string operations being slower in
>> 3.3. (The categorical claim is false as some things are actually faster.)
> Yes, some operations are slower, but others are faster :-)
Yes, that is what I wrote, showed, and posted to python-list :-)
I was and am posting here in response to a certain French writer who
dislikes the fact that 3.3 unicode favors text written with the first
256 code points, which do not include all the characters needed for
French, and do not include the euro symbol invented years after that set
was established. His opinion aside, his search for 'evidence' did turn
up a version of the example below.
> an important effort to limit the overhead of the PEP 393 (when the
> branch was merged, most operations were slower). I tried to fix all
> performance regressions.
Yes, I read and appreciated the speed-up patches by you and others.
> If you find cases where Python 3.3 is slower,
> I can investigate and try to optimize it (in Python 3.4) or at least
> explain why it is slower :-)
Replacement appears to be as much as 6.5 times slower on some Win 7
machines. (I factored out the setup part, which increased the ratio
since it takes the same time on both machines.)
import timeit
ttr = timeit.repeat
# 3.2.3
>>> ttr("euroreplace('€', 'œ')", "euroreplace = ('€'*100).replace")
[0.385043233078477, 0.35294282203631155, 0.3468394370770511]
# 3.3.0b2
>>> ttr("euroreplace('€', 'œ')", "euroreplace = ('€'*100).replace")
[2.2624885911213823, 2.245330314124203, 2.2531118686461014]
How does this compare on *nix?
> As said by Antoine, use the stringbench tool if you would like to get
> a first overview of string performances.
I found it, ran it on 3.2 and 3.3, and posted to python-list that 3.3
unicode looks quite good. It is overall comparable to both byte
operations and 3.2 unicode operations. Replace operations were
relatively the slowest, though I do not remember any as bad as the
example above.
>> Some things I understand, this one I do not.
>>
>> Win7-64, 3.3.0b2 versus 3.2.3
>> print(timeit("c in a", "c = '…'; a = 'a'*1000+c")) # ord(c) = 8230
>> # .6 in 3.2, 1.2 in 3.3
> On Linux with narrow build (UTF-16), I get:
> $ python3.2 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
> 10 loops, best of 3: 4.25 usec per loop
> $ python3.3 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
> 10 loops, best of 3: 3.21 usec per loop
The slowdown seems to be specific to (some?) Windows systems. Perhaps we
are hitting a difference in the VC2008 and VC2010 compilers or runtimes.
Someone on python-list wondered whether the 3.3.0 betas have the same
compile optimization settings as 3.2.3 final. Martin?
> Can you reproduce your benchmark on other Windows platforms? Do you
> run the benchmark more than once? I always run a benchmark 3 times.
Always, and now I see the repeat does this for me.
> I don't like the timeit module for micro benchmarks, it is really
> unstable (default settings are not written for micro benchmarks).
I am reporting rounded lowest times. As others said, make timeit better
if you can.
>> print(timeit("a.encode()", "a = 'a'*1000"))
>> # 1.5 in 3.2, .26 in 3.3
>>
>> print(timeit("a.encode(encoding='utf-8')", "a = 'a'*1000"))
>> # 1.7 in 3.2, .51 in 3.3
> This test doesn't compare performances of the UTF-8 encoder: "encode"
> an ASCII string to UTF-8 in Python 3.3 is a no-op, it just duplicates
> the memory (ASCII is compatible with UTF-8)...
That is what I thought, and why I was puzzled, ...
> So your benchmark just measures the performances of
> PyArg_ParseTupleAndKeywords()...,
having forgotten about arg processing. I should have factored out the
.encode lookup (as I did with .replace). The following suggests that you
are correct. The difference, about .3, is independent of the length of
string being copied.
>>> ttr("aenc()", "aenc = ('a'*1).encode")
[0.588499543029684, 0.5760222493490801, 0.5757037691037112]
>>> ttr("aenc(encoding='utf-8')", "aenc = ('a'*1).encode")
[0.8973955632254729, 0.887000380270365, 0.884113153942053]
>>> ttr("aenc()", "aenc = ('a'*5).encode")
[3.6618914099180984, 3.650091040467487, 3.6542183723140624]
>>> ttr("aenc(encoding='utf-8')", "aenc = ('a'*5).encode")
[3.964849740958016, 3.9363826484832316, 3.937290440151628]
--
Terry Jan Reedy
Re: [Python-Dev] Python 2.7: only Visual Studio 2008?
> I was contemplating that option indeed. Sébastien Sablé seemed to have
> the same aim. Would you know any other such efforts?
I believe Kristjan Jonsson has a port as well.
Regards,
Martin
Re: [Python-Dev] 3.3 str timings
Quoting Terry Reedy:
> I was and am posting here in response to a certain French writer who
> dislikes the fact that 3.3 unicode favors text written with the first
> 256 code points, which do not include all the characters needed for
> French, and do not include the euro symbol invented years after that set
> was established. His opinion aside, his search for 'evidence' did turn
> up a version of the example below.
I personally don't see a need to "defend" this or any other deliberate
change. There is a need to defend changes before they are made, to
convince co-contributors and other Python users; this is what the PEP
process is good for. One point of the PEP process is that once the PEP
is accepted, discussion ought to stop - or anybody continuing in
discussion doesn't deserve an answer by anybody not interested.
Anybody who doesn't like the change is free not to use Python 3.3, or
stay at 2.7, use PyPy, or switch to Ruby altogether. Neither bothers
me in the slightest.
If people find proper bugs, they are encouraged to report them; if they
contribute patches along, the better. If they merely want to complain -
let them complain. If they want to see an agreed-upon patch reverted,
they can try to lobby a BDFL pronouncement.
I certainly think the performance of str in 3.3 is fine, and thought so
even before Serhiy or Victor submitted their patches. I actually dislike
some of the code complication that these improvements brought, but I
can accept that a certain loss of maintainability that gives better
performance makes a lot of people happy. But I will continue to object
to further complications that support irrelevant special cases.
Regards,
Martin
Re: [Python-Dev] Jython roadmap
On 21/08/2012 06:34, [email protected] wrote:
> Quoting "Juancarlo Añez (Apalala)":
>> It seems that Jython is under the Python Foundation, but I can't find a
>> roadmap, a plan, or instructions about how to contribute to it reaching
>> 2.7 and 3.3.
>>
>> Are there any pages that describe the process?
>
> Hi Juanca,
>
> These questions are best asked on the jython-dev mailing list, see
Hi Juancarlo:
I'm cross-posting this for you on jython-dev as Martin is right. Let's
continue there.
Jython does need new helpers and I agree it isn't very easy to get
started. And we could do with a published roadmap.
I began by fixing a few bugs (about a year ago now), as that seemed to
be the suggestion on-line and patches can be offered unilaterally.
(After a bit of nagging) some of these got reviewed and I'd won my
spurs. I found the main difficulty to be understanding the source, or
rather the architecture: there is too little documentation and some of
what you can find is out of date (svn?). A lot of basic stuff is still a
complete mystery to me. As I've discovered things I've put them on the
Jython Wiki ( http://wiki.python.org/jython/JythonDeveloperGuide ) in
the hope of speeding others' entry, including up-to-date description of
how to get the code to build in Eclipse.
One place to look, that may not occur to you immediately, is Frank
Wierzbicki's blog ( http://fwierzbicki.blogspot.co.uk/ ). Frank is the
project manager for Jython, an author of the Jython book, and has worked
like a Trojan (the good kind, not the horse) over the last 6 months.
Although Frank has shared inklings of a roadmap, it must be difficult to
put dates to things that depend on a small pool of volunteers working in
their spare time -- especially perfectionist volunteers who write more
Javadoc than actual code, then delete it all because they've had a
better idea :-). Direction of travel is easier: 2.5.3 is out, we're
trying to get to 2.7b, but with an eye on 3.3.
I haven't seen anything systematic on what's still to do, who's doing
it, and where the gaps are, which is probably what you're looking for.
... Frank?
Jeff Allen
