Re: [Python-Dev] Should standard library modules optimize for CPython?

2014-06-03 Thread Sturla Molden
Stefan Behnel wrote:

> Thus my proposal to compile the modules in CPython with Cython, rather than
> duplicating their code or making/keeping them CPython specific. I think
> reducing the urge to reimplement something in C is a good thing.

For algorithmic and numerical code, Numba has already shown that Python
can be JIT-compiled to speeds comparable to C built with -O2. For
non-algorithmic code, the speed determinants are usually outside Python
(e.g. the network connection). Numba is becoming what the "dead swallow"
should have been. The question is rather: should the standard library use
a JIT compiler like Numba? Cython is great for writing C extensions while
avoiding all the details of the Python C API. But for speeding up
algorithmic code, Numba is easier to use.

Sturla

___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Should standard library modules optimize for CPython?

2014-06-03 Thread Stefan Behnel
Sturla Molden, 03.06.2014 17:13:
> Stefan Behnel wrote:
> 
>> Thus my proposal to compile the modules in CPython with Cython, rather than
>> duplicating their code or making/keeping them CPython specific. I think
>> reducing the urge to reimplement something in C is a good thing.
> 
> For algorithmic and numerical code, Numba has already shown that Python
> can be JIT-compiled to speeds comparable to C built with -O2. For
> non-algorithmic code, the speed determinants are usually outside Python
> (e.g. the network connection). Numba is becoming what the "dead swallow"
> should have been. The question is rather: should the standard library use
> a JIT compiler like Numba? Cython is great for writing C extensions while
> avoiding all the details of the Python C API. But for speeding up
> algorithmic code, Numba is easier to use.

I certainly agree that a JIT compiler can do much better optimisations on
Python code than a static compiler, especially data driven optimisations.
However, Numba comes with major dependencies, even runtime dependencies.
From previous discussions on this list, I gathered that there are major
objections against adding such a large dependency to CPython since it can
also just be installed as an external package if users want to have it.

Static compilation, on the other hand, is a build time thing that adds no
dependencies that CPython doesn't have already. Distributions can even
package up the compiled .so files separately from the original .py/.pyc
files, if they feel like it, to make them selectively installable. So the
argument in favour is mostly a pragmatic one. If you can have 2-5x faster
code essentially for free, why not just go for it?

Stefan




Re: [Python-Dev] Should standard library modules optimize for CPython?

2014-06-03 Thread Sturla Molden
Stefan Behnel wrote:

> So the
> argument in favour is mostly a pragmatic one. If you can have 2-5x faster
> code essentially for free, why not just go for it?

It would be easier if the GIL, or Cython's use of it, were redesigned. Cython
just grabs the GIL and holds on to it until it is manually released. The
standard lib cannot have packages that hold the GIL forever, as a
Cython-compiled module would do. Cython has to start sharing access to the
GIL like the interpreter does.

Sturla



[Python-Dev] %x formatting of floats - behaviour change since 3.4

2014-06-03 Thread Chris Angelico
I'm helping out with the micropython project and am finding that one
of their tests fails on CPython 3.5 (fresh build from Mercurial this
morning). It comes down to this:

Python 3.4.1rc1 (default, May  5 2014, 14:28:34)
[GCC 4.8.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> "%x"%16.0
'10'

Python 3.5.0a0 (default:88814d1f8c32, Jun  4 2014, 07:29:32)
[GCC 4.7.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> "%x"%16.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: %x format: an integer is required, not float

Is this an intentional change? And if so, is it formally documented
somewhere? I don't recall seeing anything about it, but my
recollection doesn't mean much.
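
The difference can be reproduced with a few lines. This is only a sketch: `hex_digits` and `percent_x_accepts` are hypothetical helper names, not part of CPython or micropython. Converting with int() explicitly gives the same result on interpreters from before and after the change.

```python
def hex_digits(value):
    """Format a number as lowercase hex, truncating floats explicitly.

    Hypothetical helper: the int() call makes visible the truncation
    that "%x" used to perform implicitly on older interpreters.
    """
    return "%x" % int(value)


def percent_x_accepts(value):
    """Report whether this interpreter lets "%x" format the given value."""
    try:
        "%x" % value
    except TypeError:
        return False
    return True


print(hex_digits(16.0))         # explicit conversion works everywhere
print(percent_x_accepts(16.0))  # depends on the interpreter version
print(percent_x_accepts(16))    # integers are always accepted
```

Code that relied on the old behaviour can switch to the explicit int() form without changing its output.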

ChrisA


Re: [Python-Dev] %x formatting of floats - behaviour change since 3.4

2014-06-03 Thread Victor Stinner
Hi,

2014-06-03 23:38 GMT+02:00 Chris Angelico:
> Is this an intentional change? And if so, is it formally documented
> somewhere? I don't recall seeing anything about it, but my
> recollection doesn't mean much.

Yes, it's intentional. See the issue for the rationale:
http://bugs.python.org/issue19995

Victor


Re: [Python-Dev] %x formatting of floats - behaviour change since 3.4

2014-06-03 Thread Eric V. Smith
On 6/3/2014 5:38 PM, Chris Angelico wrote:
> I'm helping out with the micropython project and am finding that one
> of their tests fails on CPython 3.5 (fresh build from Mercurial this
> morning). It comes down to this:
> 
> Python 3.4.1rc1 (default, May  5 2014, 14:28:34)
> [GCC 4.8.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> "%x"%16.0
> '10'
> 
> Python 3.5.0a0 (default:88814d1f8c32, Jun  4 2014, 07:29:32)
> [GCC 4.7.2] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> "%x"%16.0
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: %x format: an integer is required, not float
> 
> Is this an intentional change? And if so, is it formally documented
> somewhere? I don't recall seeing anything about it, but my
> recollection doesn't mean much.

http://bugs.python.org/issue19995



Re: [Python-Dev] %x formatting of floats - behaviour change since 3.4

2014-06-03 Thread Chris Angelico
On Wed, Jun 4, 2014 at 8:03 AM, Victor Stinner wrote:
> 2014-06-03 23:38 GMT+02:00 Chris Angelico:
>> Is this an intentional change? And if so, is it formally documented
>> somewhere? I don't recall seeing anything about it, but my
>> recollection doesn't mean much.
>
> Yes, it's intentional. See the issue for the rationale:
> http://bugs.python.org/issue19995

Thanks! I'll fix (in this case, simply remove) the test and cite that issue.

ChrisA


Re: [Python-Dev] %x formatting of floats - behaviour change since 3.4

2014-06-03 Thread Glenn Linderman

On 6/3/2014 3:05 PM, Chris Angelico wrote:
> On Wed, Jun 4, 2014 at 8:03 AM, Victor Stinner wrote:
>> 2014-06-03 23:38 GMT+02:00 Chris Angelico:
>>> Is this an intentional change? And if so, is it formally documented
>>> somewhere? I don't recall seeing anything about it, but my
>>> recollection doesn't mean much.
>>
>> Yes, it's intentional. See the issue for the rationale:
>> http://bugs.python.org/issue19995
>
> Thanks! I'll fix (in this case, simply remove) the test and cite that issue.

Wouldn't it be better to keep the test, but expect the operation to fail?


Re: [Python-Dev] %x formatting of floats - behaviour change since 3.4

2014-06-03 Thread Chris Angelico
On Wed, Jun 4, 2014 at 8:26 AM, Glenn Linderman wrote:
> On 6/3/2014 3:05 PM, Chris Angelico wrote:
>> On Wed, Jun 4, 2014 at 8:03 AM, Victor Stinner wrote:
>>> 2014-06-03 23:38 GMT+02:00 Chris Angelico:
>>>> Is this an intentional change? And if so, is it formally documented
>>>> somewhere? I don't recall seeing anything about it, but my
>>>> recollection doesn't mean much.
>>>
>>> Yes, it's intentional. See the issue for the rationale:
>>> http://bugs.python.org/issue19995
>>
>> Thanks! I'll fix (in this case, simply remove) the test and cite that issue.
>
> Wouldn't it be better to keep the test, but expect the operation to fail?

The way micropython does its tests is: Run CPython on a script, then
run micropython on the same script. If the output differs, it's an
error. The problem is, CPython 3.3 and CPython 3.5 give different
output (one gives an exception, the other works as if int(x) had been
given), so it's impossible for the test to be done right.
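
A minimal sketch of that style of differential testing is below. It is not micropython's actual test runner; `run_script` and `outputs_match` are hypothetical names, and the running interpreter is compared against itself only so the example is self-contained.

```python
import subprocess
import sys


def run_script(interpreter, source):
    """Run source under the given interpreter and return what it printed."""
    proc = subprocess.run(
        [interpreter, "-c", source],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    return proc.stdout


def outputs_match(reference, candidate, source):
    """True when both interpreters print exactly the same output."""
    return run_script(reference, source) == run_script(candidate, source)


# Stand-in: in practice the two paths would be a CPython binary and a
# micropython binary rather than the same executable twice.
print(outputs_match(sys.executable, sys.executable, 'print("%x" % 16)'))
```

With this scheme the reference interpreter's behaviour is the expected result, so a script whose output differs between two CPython versions has no single correct answer.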

My question was mainly to ascertain whether it's the tests or my
system that needed fixing.

ChrisA


Re: [Python-Dev] Should standard library modules optimize for CPython?

2014-06-03 Thread Steven D'Aprano
On Sun, Jun 01, 2014 at 06:11:39PM +1000, Steven D'Aprano wrote:
> I think I know the answer to this, but I'm going to ask it anyway...
> 
> I know that there is a general policy of trying to write code in the 
> standard library that does not disadvantage other implementations. How 
> far does that go the other way? Should the standard library accept 
> slower code because it will be much faster in other implementations?
[...]


Thanks to everyone who replied! I just wanted to make a brief note to 
say that although I haven't been very chatty in this thread, I have been 
reading it, so thanks for the advice, it is appreciated.


-- 
Steven


[Python-Dev] Internal representation of strings and Micropython

2014-06-03 Thread Steven D'Aprano
There is a discussion over at MicroPython about the internal 
representation of Unicode strings. Micropython is aimed at embedded 
devices, and so minimizing memory use is important, possibly even 
more important than performance.

(I'm not speaking on their behalf, just commenting as an interested 
outsider.)

At the moment, their Unicode support is patchy. They are talking about 
either:

* Having a build-time option to restrict all strings to ASCII-only.

  (I think what they mean by that is that strings will be like Python 2 
  strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)

* Implementing Unicode internally as UTF-8, and giving up O(1) 
  indexing operations.

https://github.com/micropython/micropython/issues/657


Would either of these trade-offs be acceptable while still claiming 
"Python 3.4 compatibility"?

My own feeling is that O(1) string indexing operations are a quality of 
implementation issue, not a deal breaker to call it a Python. I can't 
see any requirement in the docs that str[n] must take O(1) time, but 
perhaps I have missed something.
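
To make the cost of the UTF-8 option concrete, here is a minimal sketch of indexing by code point into a UTF-8 buffer. `utf8_char_at` is a hypothetical name, and this is not how micropython implements or plans to implement it; it only shows why the lookup has to scan from the start.

```python
def utf8_char_at(buf, index):
    """Return the character at code point position `index` of UTF-8 `buf`.

    The buffer is walked from the beginning because each character
    occupies one to four bytes, so the cost grows with `index`.
    """
    if index < 0:
        raise IndexError("negative indexes are not handled in this sketch")
    seen = -1
    start = None
    for pos, byte in enumerate(buf):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts
        # a new character.
        if byte & 0xC0 != 0x80:
            if start is not None:
                # The next character begins here, so the wanted one ends.
                return buf[start:pos].decode("utf-8")
            seen += 1
            if seen == index:
                start = pos
    if start is not None:
        # The wanted character was the last one in the buffer.
        return buf[start:].decode("utf-8")
    raise IndexError("string index out of range")


encoded = "a\u00e9\u20ac\U0001F600z".encode("utf-8")
print([utf8_char_at(encoded, i) for i in range(5)])
```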




-- 
Steven


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-03 Thread Donald Stufft
I think UTF-8 is the best option.

> On Jun 3, 2014, at 9:17 PM, Steven D'Aprano wrote:
> 
> There is a discussion over at MicroPython about the internal 
> representation of Unicode strings. Micropython is aimed at embedded 
> devices, and so minimizing memory use is important, possibly even 
> more important than performance.
> 
> (I'm not speaking on their behalf, just commenting as an interested 
> outsider.)
> 
> At the moment, their Unicode support is patchy. They are talking about 
> either:
> 
> * Having a build-time option to restrict all strings to ASCII-only.
> 
>  (I think what they mean by that is that strings will be like Python 2 
>  strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
> 
> * Implementing Unicode internally as UTF-8, and giving up O(1) 
>  indexing operations.
> 
> https://github.com/micropython/micropython/issues/657
> 
> 
> Would either of these trade-offs be acceptable while still claiming 
> "Python 3.4 compatibility"?
> 
> My own feeling is that O(1) string indexing operations are a quality of 
> implementation issue, not a deal breaker to call it a Python. I can't 
> see any requirement in the docs that str[n] must take O(1) time, but 
> perhaps I have missed something.
> 
> 
> 
> 
> -- 
> Steven


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-03 Thread Chris Angelico
On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano wrote:
> * Having a build-time option to restrict all strings to ASCII-only.
>
>   (I think what they mean by that is that strings will be like Python 2
>   strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)

What I was actually suggesting along those lines was that the str type
still be notionally a Unicode string, but that any codepoints >127
would either raise an exception or blow an assertion, and all the code
to handle multibyte representations would be compiled out. So there'd
still be a difference between strings of text and streams of bytes,
but all encoding and decoding to/from ASCII-compatible encodings would
just point to the same bytes in RAM.

Risk: Someone would implement that with assertions, then compile with
assertions disabled, test only with ASCII, and have lurking bugs.
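
As a rough model of the idea in Python rather than C, the sketch below uses a hypothetical `AsciiText` class (not micropython code). It raises an explicit exception instead of using an assertion, since a check written with `assert` disappears when assertions are disabled.

```python
class AsciiText:
    """Text type restricted to code points 0..127 (hypothetical sketch)."""

    def __init__(self, data):
        # An explicit check survives `python -O`; an `assert` would not.
        if any(byte > 127 for byte in data):
            raise ValueError("non-ASCII code point in ASCII-only build")
        self._data = bytes(data)

    def __getitem__(self, index):
        # One byte per character, so integer indexing stays O(1).
        return chr(self._data[index])

    def __len__(self):
        return len(self._data)

    def encode(self, encoding="utf-8"):
        # For ASCII-compatible encodings the stored bytes already are the
        # encoded form; `encoding` is ignored in this sketch.
        return self._data


text = AsciiText(b"spam")
print(text[0], len(text), text.encode())
```

The type still separates text from bytes at the API level even though both share one representation underneath.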

ChrisA


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-03 Thread Nick Coghlan
On 4 June 2014 11:17, Steven D'Aprano wrote:
> My own feeling is that O(1) string indexing operations are a quality of
> implementation issue, not a deal breaker to call it a Python.

If string indexing & iteration is still presented to the user as "an
array of code points", it should still avoid the bugs that plagued
both Python 2 narrow builds and direct use of UTF-8 encoded Py2
strings.
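
The class of bug in question comes from exposing storage units instead of code points. A small illustration in Python 3, where the UTF-16 and UTF-8 encodings stand in for what narrow builds and encoded Py2 strings exposed:

```python
text = "a\U0001F600b"  # three code points; the middle one is outside the BMP

# What a Python 3 str reports: one item per code point.
code_points = len(text)

# What a Python 2 narrow build effectively exposed: UTF-16 code units,
# so the non-BMP character counted twice (as a surrogate pair).
utf16_units = len(text.encode("utf-16-le")) // 2

# What indexing a UTF-8 encoded byte string exposes: raw bytes.
utf8_bytes = len(text.encode("utf-8"))

print(code_points, utf16_units, utf8_bytes)  # prints: 3 4 6
```

Any implementation that keeps len(), indexing and iteration in terms of the first number avoids those bugs, whatever it stores internally.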

If they don't try to offer C API compatibility, it should be feasible
to do it that way. If they *do* try to offer C API compatibility, they
may have a problem.

> I can't
> see any requirement in the docs that str[n] must take O(1) time, but
> perhaps I have missed something.

There's a general expectation that indexing will be O(1) because all
the builtin containers that support that syntax use it for O(1) lookup
operations.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-03 Thread Guido van Rossum
On Tue, Jun 3, 2014 at 7:32 PM, Chris Angelico wrote:

> On Wed, Jun 4, 2014 at 11:17 AM, Steven D'Aprano wrote:
> > * Having a build-time option to restrict all strings to ASCII-only.
> >
> >   (I think what they mean by that is that strings will be like Python 2
> >   strings, ASCII-plus-arbitrary-bytes, not actually ASCII.)
>
> What I was actually suggesting along those lines was that the str type
> still be notionally a Unicode string, but that any codepoints >127
> would either raise an exception or blow an assertion, and all the code
> to handle multibyte representations would be compiled out.


That would be a pretty lousy option.

> So there'd
> still be a difference between strings of text and streams of bytes,
> but all encoding and decoding to/from ASCII-compatible encodings would
> just point to the same bytes in RAM.
>

I suppose this is why you propose to reject 128-255?


> Risk: Someone would implement that with assertions, then compile with
> assertions disabled, test only with ASCII, and have lurking bugs.
>

Never mind disabling assertions -- even with enabled assertions you'd have
to expect most Python programs to fail with non-ASCII input.

Then again the UTF-8 option would be pretty devastating too for anything
manipulating strings (especially since many Python APIs are defined using
indexes, e.g. the re module).

Why not support variable-width strings like CPython 3.4?
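
The variable-width representation can be observed from Python itself. The sketch below is specific to CPython 3.3 and later and relies on sys.getsizeof, so treat it as an illustration rather than a portable measurement; `bytes_per_char` is a hypothetical helper name. On those versions it should report 1, 2 and 4 bytes per character for the three cases.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by growing a string by n characters."""
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


samples = [("ASCII", "a"), ("BMP", "\u0394"), ("non-BMP", "\U0001F600")]
for label, ch in samples:
    print(label, bytes_per_char(ch))
```

Each string uses the narrowest width that fits its widest character, which keeps indexing O(1) while ASCII-only text costs one byte per character.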

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-03 Thread Chris Angelico
On Wed, Jun 4, 2014 at 3:17 PM, Nick Coghlan wrote:
> On 4 June 2014 11:17, Steven D'Aprano wrote:
>> My own feeling is that O(1) string indexing operations are a quality of
>> implementation issue, not a deal breaker to call it a Python.
>
> If string indexing & iteration is still presented to the user as "an
> array of code points", it should still avoid the bugs that plagued
> both Python 2 narrow builds and direct use of UTF-8 encoded Py2
> strings.

It would. The downsides of a UTF-8 representation would be slower
iteration and much slower (O(N)) indexing/slicing.

ChrisA