Re: [Python-Dev] [Python-checkins] cpython: Close #4966: revamp the sequence docs in order to better explain the state of

2012-08-21 Thread Nick Coghlan
On Tue, Aug 21, 2012 at 11:55 AM, Ezio Melotti  wrote:
>> +Sequence Types --- :class:`list`, :class:`tuple`, :class:`range`
>> +
>> +
>
>
> These 3 links in the section title redirect to the functions.html page.  I
> think it would be better if they linked to the appropriate subsection
> instead, and in the case of the subsections (e.g. "Text Sequence Type ---
> str") they shouldn't be links.  The same comment can be applied to other
> titles as well.

I made a start on moving the info out of functions.html and adding
appropriate noindex entries. str, bytes and bytearray haven't been
consolidated at all yet.

>> ++--++--+
>> +| ``s * n, n * s`` | *n* shallow copies of *s*  | (2)(7)   |
>
>
> I would use '``s * n`` or ``n * s``' here.

Done.

>> +| ``s.index(x, [i[, j]])`` | index of the first occurence   | \(8) |
>
>
> This should be ``s.index(x[, i[, j]])``

Done.

>> +
>> +   * if concatenating :class:`tuple` objects, extend a :class:`list`
>> instead.
>> +
>> +   * for other types, investigate the relevant class documentation
>> +
>
>
> The trailing punctuation of the elements in this list is inconsistent.

Just removed all trailing punctuation from these bullet points for now.

>
> You missed clear() from this list.

The problem was actually index() and count() were missing from the
index for the "common sequence operations" table. Added them there,
and moved that index above the table.

copy() was missing from the index list for the mutable sequence
methods, so I added that.

> Also in the "Result" column the descriptions in prose are OK, but I find
> some of the "same as ..." ones not very readable (or even fairly obscure).
> (I think I saw something similar in the doc of list.append() too.)

These are all rather old - much of this patch was just moving things
around rather than fixing the prose, although there was plenty of the
latter, too :)

I tried to improve them a bit.

> Is it worth mentioning a function call as an example of syntactic ambiguity?
> Someone might wonder if foo(a, b, c) is actually passing a 3-elements tuple
> or 3 distinct values.

Done.
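
The ambiguity Ezio describes can be seen in a minimal sketch. The function `foo` below is hypothetical and exists only to echo back what it receives:

```python
def foo(*args):
    """Return the positional arguments exactly as received."""
    return args


a, b, c = 1, 2, 3

# Commas inside a call separate arguments; they do not build a tuple.
three_values = foo(a, b, c)

# An extra pair of parentheses is needed to pass one 3-element tuple.
one_tuple = foo((a, b, c))
```

Here `three_values` holds three separate arguments, while `one_tuple` holds a single argument that is itself a tuple.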

> This claim is maybe a bit too strong.  I think the main reason to use
> namedtuples is being able to access the elements via t.name, rather than
> t[pos], and while this can be useful for basically every heterogeneous
> tuple, I think that plain tuples are still preferred.

Reworded.

> On a separate note, should tuple unpacking be mentioned here? (a link to a
> separate section of the doc is enough.)

Not really - despite the name, tuple unpacking isn't especially
closely related to tuples these days.
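
As a small illustration of that point, unpacking accepts any iterable on the right-hand side, not just tuples. The variable names are arbitrary:

```python
# A list, a string and a range all unpack the same way a tuple does.
first, second = [1, 2]
x, y, z = "abc"
head, *rest = range(5)
```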

> I would mention explicitly "in :keyword:`for` loops" -- ranges don't loop on
> their own (I think people familiar with Ruby and/or JQuery might get
> confused here).

Done.

> I thought that these two paragraphs were talking about positive and negative
> start/stop/step until I reached the middle of the second paragraph (the word
> "indices" wasn't enough to realize that these paragraphs are about
> indexing/slicing, probably because they are rarely used and I wasn't
> expecting to find them at this point of the doc).  Maybe it's better to move
> the paragraphs at the bottom of the section.

For the moment, I've just dumped the old range builtin docs into this
section. They need a pass to remove the duplication and ensure
everything makes sense in context.

>> +String literals that are part of a single expression and have only
>> whitespace
>> +between them will be implicitly converted to a single string literal.
>> +
>
>
> Is it a string /literal/ they are converted to?
Yup:

>>> ast.dump(compile('"hello world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
"Expression(body=Str(s='hello world'))"
>>> ast.dump(compile('"hello" " world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
"Expression(body=Str(s='hello world'))"

> Anyway a simple ('foo' 'bar') == 'foobar' example might make this sentence
> more understandable.

Added.

>> +There is also no mutable string type, but :meth:`str.join` or
>> +:class:`io.StringIO` can be used to efficiently construct strings from
>> +multiple fragments.
>> +
>
> str.format() deserves to be mentioned here too.

For the kinds of strings where quadratic growth is a problem,
str.format is unlikely to be appropriate.
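
For reference, a minimal sketch of the two approaches named in the quoted documentation text. The fragment list is made up for illustration:

```python
import io

fragments = ["spam", "eggs", "ham"]

# str.join builds the result in one pass over the fragments.
joined = ", ".join(fragments)

# io.StringIO accumulates writes in a buffer; useful when the fragments
# arrive incrementally rather than as a ready-made list.
buffer = io.StringIO()
for i, fragment in enumerate(fragments):
    if i:
        buffer.write(", ")
    buffer.write(fragment)
streamed = buffer.getvalue()
```

Both avoid the repeated copying that can occur when a long string is grown with `+=` in a loop.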

> I noticed that here there's this fairly long section about the "old" string
> formatting and nothing about the "new" formatting.  Maybe this should be
> moved together with the new formatting doc, so that all the detailed
> formatting docs are in the same place. (This would also help making this
> less noticeable)

Probably. There are a lot of structural problems in the current docs,
because the layout hasn't previously changed to suit the language
design changes.

>> +While bytes literals and representations are based on ASCII text, bytes
>> +objects actually beha

[Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread Luc Bourhis
Greetings,

it is my understanding that the patches floating around the net to support
Visual Studio 2010 to compile the Python core and for distutils will never be
accepted and therefore that the 2.7 line is stuck with VS 2008 for the
remainder of its life. Could you please confirm that?

Best wishes,

Luc

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] [Python-checkins] cpython: Close #4966: revamp the sequence docs in order to better explain the state of

2012-08-21 Thread R. David Murray
On Tue, 21 Aug 2012 17:47:28 +1000, Nick Coghlan  wrote:
> On Tue, Aug 21, 2012 at 11:55 AM, Ezio Melotti  wrote:
> >> +String literals that are part of a single expression and have only
> >> whitespace
> >> +between them will be implicitly converted to a single string literal.
> >> +
> >
> >
> > Is it a string /literal/ they are converted to?
> Yup:
> 
> >>> ast.dump(compile('"hello world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
> "Expression(body=Str(s='hello world'))"
> >>> ast.dump(compile('"hello" " world"', '', 'eval', flags=ast.PyCF_ONLY_AST))
> "Expression(body=Str(s='hello world'))"
> 
> > Anyway a simple ('foo' 'bar') == 'foobar' example might make this sentence
> > more understandable.
> 
> Added.

I think it is an important and subtle point that this happens at "compile
time" rather than "run time".  Subtle in that it is not at all obvious
(as this question demonstrates), and important in that it does have
performance implications (even if those are trivial in most cases).
So I think it would be worth saying "implicitly converted to a single
string literal when the source is parsed", or something like that.
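
A sketch of how the parse-time behaviour can be checked, assuming Python 3.8 or later, where the AST node is `ast.Constant` rather than the `Str` node shown in the session quoted above:

```python
import ast

# Parse both spellings without compiling or running them.
single = ast.parse('"hello world"', mode="eval").body
adjacent = ast.parse('"hello" " world"', mode="eval").body

# If the literals are joined when the source is parsed, both trees
# contain one constant and no concatenation operator.
same_tree = ast.dump(single) == ast.dump(adjacent)
```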

--David


Re: [Python-Dev] [Python-checkins] cpython: Close #4966: revamp the sequence docs in order to better explain the state of

2012-08-21 Thread Nick Coghlan
On Tue, Aug 21, 2012 at 10:01 PM, R. David Murray  wrote:
> I think it is an important and subtle point that this happens at "compile
> time" rather than "run time".  Subtle in that it is not at all obvious
> (as this question demonstrates), and important in that it does have
> performance implications (even if those are trivial in most cases).
> So I think it would be worth saying "implicitly converted to a single
> string literal when the source is parsed", or something like that.

That kind of fine detail is what the language reference is for - the
distinction really doesn't matter most of the time.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Victor Stinner
2012/8/18 Terry Reedy :
> The issue came up in python-list about string operations being slower in
> 3.3. (The categorical claim is false as some things are actually faster.)

Yes, some operations are slower, but others are faster :-) There was
an important effort to limit the overhead of the PEP 393 (when the
branch was merged, most operations were slower). I tried to fix all
performance regressions. If you find cases where Python 3.3 is slower,
I can investigate and try to optimize it (in Python 3.4) or at least
explain why it is slower :-)

As said by Antoine, use the stringbench tool if you would like to get
a first overview of string performances.

> Some things I understand, this one I do not.
>
> Win7-64, 3.3.0b2 versus 3.2.3
> print(timeit("c in a", "c  = '…'; a = 'a'*1000+c")) # ord(c) = 8230
> # .6 in 3.2, 1.2 in 3.3

On Linux with narrow build (UTF-16), I get:

$ python3.2 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
10 loops, best of 3: 4.25 usec per loop
$ python3.3 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
10 loops, best of 3: 3.21 usec per loop

Linux-2.6.30.10-105.2.23.fc11.i586-i686-with-fedora-11-Leonidas
Python 3.2.2+ (3.2:1453d2fe05bf, Aug 21 2012, 14:21:05)
Python 3.3.0b2+ (default:b36ce0a3a844, Aug 21 2012, 14:05:23)

I'm not sure that I read your benchmark correctly: you write c='...'
and then ord(c)=8230. The algorithm used to find a substring depends on
whether the substring is a single character or longer. For
1 character, Antoine Pitrou modified the code to use memchr() and
memrchr(), even if the string is not UCS1 (in this benchmark, the
string uses a UCS2 storage): it may find false positives.

> Why is searching for a two-byte char in a two-bytes per char string so much
> faster in 3.2?

Can you reproduce your benchmark on other Windows platforms? Do you
run the benchmark more than once? I always run a benchmark 3 times.

I don't like the timeit module for micro benchmarks, it is really
unstable (default settings are not written for micro benchmarks).
Example of 4 runs on the same platform:

$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 2.79 usec per loop
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 2.61 usec per loop
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 3.16 usec per loop
$ ./python -m timeit -s "a='a'*1000" "a.encode()"
10 loops, best of 3: 2.76 usec per loop

I wrote my own benchmark tool, based on timeit, to have more stable
results on micro benchmarks:
https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py

Example of 4 runs:

3.18 us: c=chr(8230); a='a'*1000+c; c in a
3.18 us: c=chr(8230); a='a'*1000+c; c in a
3.21 us: c=chr(8230); a='a'*1000+c; c in a
3.18 us: c=chr(8230); a='a'*1000+c; c in a

My benchmark.py script calibrates automatically the number of loops to
take at least 100 ms, and then repeat the test during at least 1.0
second.

Using time instead of a fixed number of loops is more reliable because
the test is less dependent on the system activity.
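
The idea of calibrating by time can be sketched roughly as follows. This is not the code of benchmark.py; the function names `calibrate` and `bench` and the doubling strategy are assumptions made for illustration, with the 100 ms and 1.0 second thresholds taken from the description above:

```python
import timeit


def calibrate(stmt, setup="pass", min_time=0.1):
    """Double the loop count until one run takes at least min_time seconds."""
    timer = timeit.Timer(stmt, setup)
    loops = 1
    while True:
        elapsed = timer.timeit(loops)
        if elapsed >= min_time:
            return loops, elapsed
        loops *= 2


def bench(stmt, setup="pass", min_time=0.1, total_time=1.0):
    """Return the best per-loop time in seconds.

    Runs are repeated until about total_time seconds have been spent.
    """
    loops, elapsed = calibrate(stmt, setup, min_time)
    timer = timeit.Timer(stmt, setup)
    best = elapsed / loops
    spent = elapsed
    while spent < total_time:
        elapsed = timer.timeit(loops)
        spent += elapsed
        best = min(best, elapsed / loops)
    return best


if __name__ == "__main__":
    usec = bench("c in a", "c=chr(8230); a='a'*1000+c") * 1e6
    print("%.2f us: c in a" % usec)
```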

> print(timeit("a.encode()", "a = 'a'*1000"))
> # 1.5 in 3.2, .26 in 3.3
>
> print(timeit("a.encode(encoding='utf-8')", "a = 'a'*1000"))
> # 1.7 in 3.2, .51 in 3.3

This test doesn't compare performances of the UTF-8 encoder: "encode"
an ASCII string to UTF-8 in Python 3.3 is a no-op, it just duplicates
the memory (ASCII is compatible with UTF-8)...

So your benchmark just measures the performances of
PyArg_ParseTupleAndKeywords()... Try also str.encode('utf-8').

If you want to benchmark the UTF-8 encoder, use at least a non-ASCII
character like "\x80".

At least, your benchmark shows that Python 3.3 is *much* faster than
Python 3.2 to "encode" pure ASCII strings to UTF-8 :-)
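
A sketch of the suggested comparison, with arbitrary variable names and loop count. Only the non-ASCII string forces the encoder to produce multi-byte sequences:

```python
import timeit

ascii_text = "a" * 1000
non_ascii_text = "\x80" * 1000

# Pure ASCII: the UTF-8 result has the same bytes as the text.
ascii_bytes = ascii_text.encode("utf-8")

# U+0080 needs two bytes in UTF-8, so the encoder has real work to do.
non_ascii_bytes = non_ascii_text.encode("utf-8")

# Timings are machine-dependent, so no expected values are given here.
ascii_time = timeit.timeit(lambda: ascii_text.encode("utf-8"), number=10000)
non_ascii_time = timeit.timeit(lambda: non_ascii_text.encode("utf-8"), number=10000)
```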

Victor


Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread Brian Curtin
On Tue, Aug 21, 2012 at 5:24 AM, Luc Bourhis  wrote:
> Greetings,
>
> it is my understanding that the patches floating around the net to support
> Visual Studio 2010 to compile the Python core and for distutils will never be
> accepted and therefore that the 2.7 line is stuck with VS 2008 for the
> remainder of its life. Could you please confirm that?

This is correct. A compiler upgrade is a feature, so the change to
VS2010 could only be applied to the version actively receiving new
features, which at the time was 3.3.


Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread Amaury Forgeot d'Arc
2012/8/21 Brian Curtin :
> On Tue, Aug 21, 2012 at 5:24 AM, Luc Bourhis  wrote:
>> Greetings,
>>
>> it is my understanding that the patches floating around the net to support
>> Visual Studio 2010 to compile the Python core and for distutils will never
>> be accepted and therefore that the 2.7 line is stuck with VS 2008 for the
>> remainder of its life. Could you please confirm that?
>
> This is correct. A compiler upgrade is a feature, so the change to
> VS2010 could only be applied to the version actively receiving new
> features, which at the time was 3.3.

But this does not prevent anyone from creating and maintaining such a
patch, outside of the official python.org repository.

-- 
Amaury Forgeot d'Arc


Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread martin


Quoting Luc Bourhis:

> it is my understanding that the patches floating around the net to
> support Visual Studio 2010 to compile the Python core and for
> distutils will never be accepted and therefore that the 2.7 line is
> stuck with VS 2008 for the remainder of its life. Could you please
> confirm that?

That is correct, yes.

OTOH, Python is free software, so people are free to maintain such
patches, and even make binary releases out of them. These just won't
be available from python.org.

Regards,
Martin



Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread martin


Quoting Brian Curtin:

> On Tue, Aug 21, 2012 at 5:24 AM, Luc Bourhis wrote:
>> Greetings,
>>
>> it is my understanding that the patches floating around the net to
>> support Visual Studio 2010 to compile the Python core and for
>> distutils will never be accepted and therefore that the 2.7 line is
>> stuck with VS 2008 for the remainder of its life. Could you please
>> confirm that?
>
> This is correct. A compiler upgrade is a feature


In this specific case, this isn't actually the limiting factor.
Instead, it's binary compatibility: binaries compiled with VS 2010
are incompatible (in some cases) with those compiled with VS 2008.
So if the python.org binaries were released as compiler outputs
from VS 2010, existing extension modules might crash Python. Therefore,
we cannot switch.

Maintaining a VS 2010 build process along with the VS 2008 process
would be a new feature, indeed. Fortunately, Mercurial makes it easy
enough to maintain such patches in a way that allows simple tracking of
changes applied to 2.7 itself, for anybody with enough interest to do
so.

Regards,
Martin




Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Andrea Griffini
> My benchmark.py script calibrates automatically the number of loops to
> take at least 100 ms, and then repeat the test during at least 1.0
> second.
>
> Using time instead of a fixed number of loops is more reliable because
> the test is less dependent on the system activity.

I've also been bitten in the past by something that is probably quite
obvious but that I didn't think of: dynamic CPU frequency. Many
modern CPUs can dynamically change their frequency depending on the load
and temperature, and the switch can take more than one second.

When doing benchmarks now, I use a small script (based on cpufreq-set)
that just locks all the cores into fast mode.


Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread martin

>> print(timeit("c in a", "c  = '…'; a = 'a'*1000+c")) # ord(c) = 8230
>
> I'm not sure that I read your benchmark correctly: you write c='...'

Apparently you didn't - or your MUA was not able to display it
correctly. He didn't say

'...' # U+002E U+002E U+002E, 3x FULL STOP

but

'…' # U+2026, HORIZONTAL ELLIPSIS
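
The difference between the two strings can be checked with the standard `unicodedata` module. A short sketch, with arbitrary variable names:

```python
import unicodedata

three_dots = "..."        # three separate characters
ellipsis_char = "\u2026"  # one character

# Look up the official Unicode name of every character involved.
names = [unicodedata.name(ch) for ch in three_dots + ellipsis_char]
code_point = ord(ellipsis_char)
```

`code_point` matches the `ord(c) = 8230` comment in the original benchmark, since 0x2026 is 8230 in decimal.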

Regards,
Martin




Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread Christian Heimes
On 21.08.2012 17:01, [email protected] wrote:
> In this specific case, this isn't actually the limiting factor.
> Instead, it's binary compatibility: binaries compiled with VS 2010
> are incompatible (in some cases) with those compiled with VS 2008.
> So if the python.org binaries were released as compiler outputs
> from VS 2010, existing extension modules might crash Python. Therefore,
> we cannot switch.

Compatibility issues may lead to other strange bugs, too. IIRC each
msvcrt has its own thread local storage and therefore its own errno
handling. An extension compiled with VS 2010 won't be able to use the
PyErr_SetFromErrno*() function correctly. That's much harder to debug
than a FILE pointer mismatch because it usually doesn't cause a segfault.

Christian



Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Antoine Pitrou
On Tue, 21 Aug 2012 17:20:14 +0200
Andrea Griffini  wrote:
> > My benchmark.py script calibrates automatically the number of loops to
> > take at least 100 ms, and then repeat the test during at least 1.0
> > second.
> >
> > Using time instead of a fixed number of loops is more reliable because
> > the test is less dependent on the system activity.
> 
> I've also been bitten in the past by something that is probably quite
> obvious but that I didn't think of: dynamic CPU frequency. Many
> modern CPUs can dynamically change their frequency depending on the load
> and temperature, and the switch can take more than one second.
>
> When doing benchmarks now, I use a small script (based on cpufreq-set)
> that just locks all the cores into fast mode.

For the record, under Linux, the following command:

$ sudo cpufreq-set -rg performance

should do the trick.

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Steven D'Aprano

On 21/08/12 23:04, Victor Stinner wrote:

> I don't like the timeit module for micro benchmarks, it is really
> unstable (default settings are not written for micro benchmarks).

[...]

> I wrote my own benchmark tool, based on timeit, to have more stable
> results on micro benchmarks:
> https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py


I am surprised, because the whole purpose of timeit is to time micro
code snippets.

If it is as unstable as you suggest, and if you have an alternative
which is more stable and accurate, I would love to see it in the
standard library.



--
Steven


Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Xavier Morel
On 21 Aug 2012, at 19:25, Steven D'Aprano wrote:
> On 21/08/12 23:04, Victor Stinner wrote:
> 
>> I don't like the timeit module for micro benchmarks, it is really
>> unstable (default settings are not written for micro benchmarks).
> [...]
>> I wrote my own benchmark tool, based on timeit, to have more stable
>> results on micro benchmarks:
>> https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
> 
> I am surprised, because the whole purpose of timeit is to time micro
> code snippets.

And when invoked from the command-line, it is already time-based: unless -n is 
specified, python guesstimates the number of iterations to be a power of 10 
resulting in at least 0.2s per test (the repeat defaults to 3 though)

As a side-note, every time I use timeit programmatically, it annoys me that 
this behavior is not available and has to be implemented manually. 
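
For readers on later Python versions: `timeit.Timer.autorange()`, which postdates this thread (it was added in Python 3.6), exposes the command-line calibration programmatically. A minimal sketch, reusing the statement from the benchmarks above:

```python
import timeit

timer = timeit.Timer("c in a", setup="c = chr(8230); a = 'a' * 1000 + c")

# autorange() increases the loop count until a run takes at least 0.2 s
# and returns (number_of_loops, total_time_for_that_run).
loops, total = timer.autorange()
per_loop_usec = total / loops * 1e6
```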

> If it is as unstable as you suggest, and if you have an alternative
> which is more stable and accurate, I would love to see it in the
> standard library.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Stefan Behnel
Xavier Morel, 21.08.2012 19:56:
> On 21 Aug 2012, at 19:25, Steven D'Aprano wrote:
>> On 21/08/12 23:04, Victor Stinner wrote:
>>> I don't like the timeit module for micro benchmarks, it is really 
>>> unstable (default settings are not written for micro benchmarks).
>> [...]
>>> I wrote my own benchmark tool, based on timeit, to have more stable 
>>> results on micro benchmarks: 
>>> https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
>> 
>> I am surprised, because the whole purpose of timeit is to time micro 
>> code snippets.
> 
> And when invoked from the command-line, it is already time-based: unless
> -n is specified, python guesstimates the number of iterations to be a
> power of 10 resulting in at least 0.2s per test (the repeat defaults to
> 3 though)
> 
> As a side-note, every time I use timeit programmatically, it annoys me
> that this behavior is not available and has to be implemented manually.

+100, sounds like someone should contribute a patch for this.

Stefan




Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Alexander Belopolsky
On Tue, Aug 21, 2012 at 1:56 PM, Xavier Morel  wrote:
> As a side-note, every time I use timeit programmatically, it annoys me that 
> this behavior is not available and has to be implemented manually.

You are not alone:

http://bugs.python.org/issue6422


Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Antoine Pitrou
On Wed, 22 Aug 2012 03:25:21 +1000
Steven D'Aprano  wrote:
> On 21/08/12 23:04, Victor Stinner wrote:
> 
> > I don't like the timeit module for micro benchmarks, it is really
> > unstable (default settings are not written for micro benchmarks).
> [...]
> > I wrote my own benchmark tool, based on timeit, to have more stable
> > results on micro benchmarks:
> > https://bitbucket.org/haypo/misc/src/tip/python/benchmark.py
> 
> I am surprised, because the whole purpose of timeit is to time micro
> code snippets.
> 
> If it is as unstable as you suggest, and if you have an alternative
> which is more stable and accurate, I would love to see it in the
> standard library.

In my experience timeit is stable enough to know whether a change is
significant or not.  No need for three-digit precision when the
question is whether there is at least a 10% performance difference
between two approaches.
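
One way to phrase that check in code, as a sketch only. The helper name `differs_by_at_least` and the choice to compare the lowest timings are assumptions, not something proposed in the thread; the sample numbers are the usec figures Victor posted earlier:

```python
def differs_by_at_least(times_a, times_b, threshold=0.10):
    """Compare the best (lowest) timings of two sets of runs.

    Returns True when the two minimums differ by at least `threshold`,
    expressed as a fraction of the faster one. Timings must be positive.
    """
    best_a, best_b = min(times_a), min(times_b)
    faster, slower = sorted((best_a, best_b))
    return (slower - faster) / faster >= threshold


# Repeated runs of the same a.encode() benchmark: run-to-run noise only.
same_code = differs_by_at_least([2.79, 2.61], [3.16, 2.76])

# "c in a" on 3.2 versus 3.3 in the Linux measurements above.
versions = differs_by_at_least([4.25], [3.21])
```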

Regards

Antoine.


-- 
Software development and contracting: http://pro.pitrou.net




Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread Luc Bourhis
Thanks for the quick response.

>> [...] A compiler upgrade is a feature, so the change to
>> VS2010 could only be applied to the version actively receiving new
>> features, which at the time was 3.3.
> 
> But this does not prevent anyone from creating and maintaining such a
> patch, outside of the official python.org repository.

I was contemplating that option indeed. Sébastien Sablé seemed to have the same 
aim. Would you know any other such efforts? I would rather prefer to contribute 
back to the community.

Best wishes,

Luc Bourhis



Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Serhiy Storchaka

On 19.08.12 00:17, Terry Reedy wrote:

> This is one of the 3.3 improvements. But since the results are equal:
> ('a'*1000).encode() == ('a'*1000).encode(encoding='utf-8')
> and 3.3 should know that for an all-ascii string, I do not see why
> adding the parameter should double the time. Another issue or known
> and un-fixable?

This is a cost of argument packing/unpacking.
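
A sketch for reproducing the comparison locally. The loop count is arbitrary and the timings depend on the machine and Python version, so no expected numbers are given:

```python
import timeit

# Bind the method once in setup so attribute lookup is not measured.
setup = "aenc = ('a' * 1000).encode"

positional = timeit.repeat("aenc()", setup, repeat=3, number=100000)
keyword = timeit.repeat("aenc(encoding='utf-8')", setup, repeat=3, number=100000)

# The two calls produce identical bytes; only the call overhead differs.
same_result = ("a" * 1000).encode() == ("a" * 1000).encode(encoding="utf-8")
```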



Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread Terry Reedy

On 8/21/2012 9:04 AM, Victor Stinner wrote:
> 2012/8/18 Terry Reedy :
>> The issue came up in python-list about string operations being slower in
>> 3.3. (The categorical claim is false as some things are actually faster.)
>
> Yes, some operations are slower, but others are faster :-)

Yes, that is what I wrote, showed, and posted to python-list :-)

I was and am posting here in response to a certain French writer who
dislikes the fact that 3.3 unicode favors text written with the first
256 code points, which do not include all the characters needed for
French, and do not include the euro symbol invented years after that set
was established. His opinion aside, his search for 'evidence' did turn
up a version of the example below.

> an important effort to limit the overhead of the PEP 393 (when the
> branch was merged, most operations were slower). I tried to fix all
> performance regressions.

Yes, I read and appreciated the speed-up patches by you and others.

> If you find cases where Python 3.3 is slower,
> I can investigate and try to optimize it (in Python 3.4) or at least
> explain why it is slower :-)


Replacement appears to be as much as 6.5 times slower on some Win 7 
machines. (I factored out the setup part, which increased the ratio 
since it takes the same time on both machines.)


ttr = timeit.repeat
# 3.2.3
>>> ttr("euroreplace('€', 'œ')", "euroreplace = ('€'*100).replace")
[0.385043233078477, 0.35294282203631155, 0.3468394370770511]

# 3.3.0b2
>>> ttr("euroreplace('€', 'œ')", "euroreplace = ('€'*100).replace")
[2.2624885911213823, 2.245330314124203, 2.2531118686461014]

How does this compare on *nix?

> As said by Antoine, use the stringbench tool if you would like to get
> a first overview of string performances.

I found it, ran it on 3.2 and 3.3, and posted to python-list that 3.3
unicode looks quite good. It is overall comparable to both byte
operations and 3.2 unicode operations. Replace operations were
relatively the slowest, though I do not remember any as bad as the
example above.

>> Some things I understand, this one I do not.
>>
>> Win7-64, 3.3.0b2 versus 3.2.3
>> print(timeit("c in a", "c  = '…'; a = 'a'*1000+c")) # ord(c) = 8230
>> # .6 in 3.2, 1.2 in 3.3
>
> On Linux with narrow build (UTF-16), I get:
>
> $ python3.2 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
> 10 loops, best of 3: 4.25 usec per loop
> $ python3.3 -m timeit -s "c=chr(8230); a='a'*1000+c" "c in a"
> 10 loops, best of 3: 3.21 usec per loop

The slowdown seems to be specific to (some?) Windows systems. Perhaps we
are hitting a difference in the VC2008 and VC2010 compilers or runtimes.
Someone on python-list wondered whether the 3.3.0 betas have the same
compile optimization settings as 3.2.3 final. Martin?

> Can you reproduce your benchmark on other Windows platforms? Do you
> run the benchmark more than once? I always run a benchmark 3 times.

Always, and now I see the repeat does this for me.

> I don't like the timeit module for micro benchmarks, it is really
> unstable (default settings are not written for micro benchmarks).

I am reporting rounded lowest times. As others said, make timeit better
if you can.



>> print(timeit("a.encode()", "a = 'a'*1000"))
>> # 1.5 in 3.2, .26 in 3.3
>>
>> print(timeit("a.encode(encoding='utf-8')", "a = 'a'*1000"))
>> # 1.7 in 3.2, .51 in 3.3
>
> This test doesn't compare performances of the UTF-8 encoder: "encode"
> an ASCII string to UTF-8 in Python 3.3 is a no-op, it just duplicates
> the memory (ASCII is compatible with UTF-8)...

That is what I thought, and why I was puzzled, ...

> So your benchmark just measures the performances of
> PyArg_ParseTupleAndKeywords()...,

having forgotten about arg processing. I should have factored out the
.encode lookup (as I did with .replace). The following suggests that you
are correct. The difference, about .3, is independent of the length of
string being copied.


>>> ttr("aenc()", "aenc = ('a'*1).encode")
[0.588499543029684, 0.5760222493490801, 0.5757037691037112]
>>> ttr("aenc(encoding='utf-8')", "aenc = ('a'*1).encode")
[0.8973955632254729, 0.887000380270365, 0.884113153942053]

>>> ttr("aenc()", "aenc = ('a'*5).encode")
[3.6618914099180984, 3.650091040467487, 3.6542183723140624]
>>> ttr("aenc(encoding='utf-8')", "aenc = ('a'*5).encode")
[3.964849740958016, 3.9363826484832316, 3.937290440151628]

--
Terry Jan Reedy




Re: [Python-Dev] Python 2.7: only Visual Studio 2008?

2012-08-21 Thread martin
> I was contemplating that option indeed. Sébastien Sablé seemed to
> have the same aim. Would you know any other such efforts?

I believe Kristjan Jonsson has a port as well.

Regards,
Martin




Re: [Python-Dev] 3.3 str timings

2012-08-21 Thread martin


Quoting Terry Reedy:

> I was and am posting here in response to a certain French writer who
> dislikes the fact that 3.3 unicode favors text written with the
> first 256 code points, which do not include all the characters
> needed for French, and do not include the euro symbol invented years
> after that set was established. His opinion aside, his search for
> 'evidence' did turn up a version of the example below.


I personally don't see a need to "defend" this or any other deliberate
change. There is a need to defend changes before they are made, to convince
co-contributors and other Python users; this is what the PEP process is
good for. One point of the PEP process is that once the PEP is accepted,
discussion ought to stop - or anybody continuing the discussion doesn't
deserve an answer from anybody not interested.

Anybody who doesn't like the change is free not to use Python 3.3, or
stay at 2.7, use PyPy, or switch to Ruby altogether. None of these bothers
me in the slightest. If people find proper bugs, they are encouraged
to report them; if they contribute patches as well, all the better. If they
merely want to complain - let them complain. If they want to see an
agreed-upon patch reverted, they can try to lobby for a BDFL pronouncement.

I certainly think the performance of str in 3.3 is fine, and thought
so even before Serhiy or Victor submitted their patches. I actually
dislike some of the code complication that these improvements brought,
but I can accept that a certain loss of maintainability that gives
better performance makes a lot of people happy. But I will continue
to object to further complications that support irrelevant special
cases.

Regards,
Martin




Re: [Python-Dev] Jython roadmap

2012-08-21 Thread Jeff Allen


On 21/08/2012 06:34, [email protected] wrote:
> Quoting "Juancarlo Añez (Apalala)":
>
>> It seems that Jython is under the Python Foundation, but I can't find
>> a roadmap, a plan, or instructions about how to contribute to it
>> reaching 2.7 and 3.3.
>>
>> Are there any pages that describe the process?
>
> Hi Juanca,
>
> These questions are best asked on the jython-dev mailing list, see

Hi Juancarlo:

I'm cross-posting this for you on jython-dev as Martin is right. Let's
continue there.


Jython does need new helpers and I agree it isn't very easy to get 
started. And we could do with a published roadmap.


I began by fixing a few bugs (about a year ago now), as that seemed to 
be the suggestion on-line and patches can be offered unilaterally. 
(After a bit of nagging) some of these got reviewed and I'd won my spurs.


I found the main difficulty to be understanding the source, or rather 
the architecture: there is too little documentation and some of what you 
can find is out of date (svn?). A lot of basic stuff is still a complete 
mystery to me. As I've discovered things I've put them on the Jython 
Wiki ( http://wiki.python.org/jython/JythonDeveloperGuide ) in the hope 
of speeding others' entry, including up-to-date description of how to 
get the code to build in Eclipse.


One place to look, that may not occur to you immediately, is Frank 
Wierzbicki's blog ( http://fwierzbicki.blogspot.co.uk/ ). Frank is the 
project manager for Jython, an author of the Jython book, and has worked 
like a Trojan (the good kind, not the horse) over the last 6 months. 
Although Frank has shared inklings of a roadmap, it must be difficult to 
put dates to things that depend on a small pool of volunteers working in 
their spare time -- especially perfectionist volunteers who write more 
Javadoc than actual code, then delete it all because they've had a 
better idea :-). Direction of travel is easier: 2.5.3 is out, we're 
trying to get to 2.7b, but with an eye on 3.3. I haven't seen anything 
systematic on what's still to do, who's doing it, and where the gaps 
are, which is probably what you're looking for. ... Frank?


Jeff Allen

