Re: [Python-Dev] Should we move to replace re with regex?

2011-08-28 Thread Nick Coghlan
On Sun, Aug 28, 2011 at 2:28 PM, Guido van Rossum  wrote:
> On Sat, Aug 27, 2011 at 8:59 PM, Ezio Melotti  wrote:
>> I think it would be good to:
>>   1) have some document that explains the general design and main (internal)
>> functions of the module (e.g. a PEP);
>
> I don't think that such a document needs to be a PEP; PEPs are usually
> intended where there is significant discussion expected, not just to
> explain things. A README file or a Wiki page would be fine, as long as
> it's sufficiently comprehensive.

timsort.txt and dictnotes.txt may be useful precedents for the kind of
thing that is useful on that front. IIRC, the pymalloc stuff has a
massive embedded comment, which can also work.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Simon Cross
On Sun, Aug 28, 2011 at 6:58 AM, Terry Reedy  wrote:
> 2) It is not trivial to use it correctly. I think it needs a SWIG-like
> companion script that can write at least first-pass ctypes code from the .h
> header files. Or maybe it could/should use header info at runtime (with the
> .h bundled with a module).

This is sort of already available:

-- http://starship.python.net/crew/theller/ctypes/old/codegen.html
-- http://svn.python.org/projects/ctypes/trunk/ctypeslib/

It just appears to have never made it into CPython. I've used it
successfully on a small project.
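
For readers who have not seen what such generated code looks like, the
following is a minimal hand-written sketch of the kind of first-pass
declaration a header-driven generator would be expected to emit. It is
not taken from codegen's actual output. It assumes a POSIX system where
ctypes.CDLL(None) exposes the C library, and strlen is only a stand-in
for a function from the library being wrapped.

```python
import ctypes

# Assumption: POSIX. CDLL(None) gives access to the symbols already
# loaded into the process, which includes the C library.
libc = ctypes.CDLL(None)

# Hand-written equivalent of the header line:
#     size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"python-dev"))
```

Without the argtypes/restype lines ctypes falls back to int-based
defaults, which is exactly the information a generator would pull from
the .h file.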

Schiavo
Simon


Re: [Python-Dev] Software Transactional Memory for Python

2011-08-28 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 6:08 AM, Armin Rigo  wrote:
> Hi Nick,
>
> On Sat, Aug 27, 2011 at 2:40 PM, Nick Coghlan  wrote:
>> 1. How does the patch interact with C code that explicitly releases
>> the GIL? (e.g. IO commands inside a "with atomic:" block)
>
> As implemented, any code in a "with atomic" is prevented from
> explicitly releasing and reacquiring the GIL: the GIL remains acquired
> until the end of the "with" block.  In other words
> Py_BEGIN_ALLOW_THREADS has no effect in a "with" block.  This gives
> semantics that, in a full multi-core STM world, would be implementable
> by saying that if, in the middle of a transaction, you need to do I/O,
> then from this point onwards the transaction is not allowed to abort
> any more.  Such "inevitable" transactions are already supported e.g.
> by RSTM, the C++ framework I used to prototype a C version
> (https://bitbucket.org/arigo/arigo/raw/default/hack/stm/c ).
>
>> 2. Whether or not Jython and IronPython could implement something like
>> that, since they're free threaded with fine-grained locks. If they
>> can't then I don't see how we could justify making it part of the
>> standard library.
>
> Yes, I can imagine some solutions.  I am no Jython or IronPython
> expert, but let us assume that they have a way to check synchronously
> for external events from time to time (i.e. if there is some
> equivalent to sys.setcheckinterval()).  If they do, then all you need
> is the right synchronization: the thread that wants to start a "with
> atomic" has to wait until all other threads are paused in the external
> check code.  (Again, like CPython's, this is not a properly multi-core
> STM-ish solution, but it would give the right semantics.  (And if it
> turns out that STM is successful in the future, Java will grow more
> direct support for it ))
>
>
> A bientôt,
>
> Armin.

This sounds like a very interesting idea to pursue, even if it's late,
and even if it's experimental, and even if it's possible to cause
deadlocks (no news there). I propose that we offer a C API in Python
3.3 as well as an extension module that offers the proposed decorator.
The C API could then be used to implement alternative APIs purely as
extension modules (e.g. would a deadlock-detecting API be possible?).

I don't think this needs a PEP, it's not a very pervasive change. We
can even document the API as experimental. But (if I may trust Armin's
reasoning) it's important to add support directly to CPython, as
currently it cannot be done as a pure extension module.
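
To make the intended usage concrete, here is a rough pure-Python
approximation. The name "atomic" follows the thread; the lock-based
implementation is purely an assumption for illustration and is weaker
than what Armin describes, because it only excludes threads that also
enter "atomic" rather than pausing every other thread.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in: one process-wide re-entrant lock. Unlike the
# proposed patch, this does not stop threads that never use `atomic`.
_atomic_lock = threading.RLock()

@contextmanager
def atomic():
    with _atomic_lock:
        yield

counter = 0

def bump(times):
    global counter
    for _ in range(times):
        with atomic():
            counter += 1  # read-modify-write kept inside one block

threads = [threading.Thread(target=bump, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)
```

That this much can be written as a library, while the "no other thread
runs at all" guarantee cannot, is the reason support has to live in the
interpreter itself.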

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] Should we move to replace re with regex?

2011-08-28 Thread Guido van Rossum
Someone asked me off-line what I wanted besides talk. Here's the list
I came up with:

You could, for instance, volunteer to do a thorough code review of
the regex code, trying to think of ways to break it (e.g. bad syntax
or extreme use of nesting etc., or bad data). Or you could volunteer
to maintain it in the future. Or you could try to port it to PEP 393.
Or you could systematically go over the given list of differences
between re and regex and decide whether they are likely to be
backwards incompatibilities that will break existing code. Or you
could try to add some of the functionality requested by Tom C in one
of his several bugs.
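
As a starting point for the "try to break it" item, a small harness
along these lines might help. It is only a sketch: it uses the stdlib
re module as a stand-in so that it runs anywhere, and the same function
could be pointed at regex by changing the import, assuming that module
is installed. The pattern list is illustrative, not a real test plan.

```python
import re

def try_pattern(pattern, text="aaaa"):
    """Return 'ok' if the pattern compiles and runs, or 'rejected' if
    the engine refuses it with an ordinary Python exception."""
    try:
        re.compile(pattern).search(text)
        return "ok"
    except (re.error, RecursionError, OverflowError, MemoryError):
        return "rejected"

candidates = {
    "unbalanced paren": "(a",
    "bad repeat": "*a",
    "deep nesting": "(?:" * 200 + "a" + ")" * 200,
    "plain": "a+",
}

for name, pattern in candidates.items():
    print(name, "->", try_pattern(pattern))
```

Either outcome is acceptable for a hostile pattern; what a review would
be looking for is a crash or a hang instead of one of the two.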

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] LZMA compression support in 3.3

2011-08-28 Thread Guido van Rossum
On Sat, Aug 27, 2011 at 10:36 PM, Dan Stromberg  wrote:
>
> On Sat, Aug 27, 2011 at 8:57 PM, Guido van Rossum  wrote:
>>
>> On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg 
>> wrote:
>> > IMO, we really, really need some common way of accessing C libraries
>> > that
>> > works for all major Python variants.
>>
>> We have one. It's called writing an extension module.
>
> And yet Cext's are full of CPython-isms.

I have to apologize, I somehow misread your "all Python variants" as a
mixture of "all CPython versions" and "all platforms where CPython
runs".

While I have no desire to continue this discussion, you are most
welcome to do so.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
On 26.08.2011 16:56, Guido van Rossum wrote:
> Also, please add the table (and the reasoning that led to it) to the PEP.

Done!

Martin


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Stefan Behnel

Hi,

Sorry for hooking in here with my usual Cython bias and promotion. When the
question comes up of what a good FFI for Python should look like, it's an
obvious reaction on my part to throw Cython into the game.


Terry Reedy, 28.08.2011 06:58:
> Dan, I once had more or less the same opinion/question as you with
> regard to ctypes, but I now see at least 3 problems.
>
> 1) It seems hard to write it correctly. There are currently 47 open ctypes
> issues, with 9 being feature requests, leaving 38 behavior-related issues.
> Tom Heller has not been able to work on it since the beginning of 2010 and
> has formally withdrawn as maintainer. No one else that I know of has taken
> his place.


Cython has an active set of developers and a rather large and growing user 
base.


It certainly has lots of open issues in its bug tracker, but most of them 
are there because we *know* where the development needs to go, not so much 
because we don't know how to get there. After all, the semantics of Python 
and C/C++, between which Cython sits, are pretty much established.


Cython compiles to C code for CPython, (hopefully soon [1]) to 
Python+ctypes for PyPy and (mostly [2]) C++/CLI code for IronPython, which 
boils down to the same build time and runtime kind of dependencies that the 
supported Python runtimes have anyway. It does not add dependencies on any 
external libraries by itself, such as the libffi in CPython's ctypes 
implementation.


For the CPython backend, the generated code is very portable and is 
self-contained when compiled against the CPython runtime (plus, obviously, 
libraries that the user code explicitly uses). It generates efficient code 
for all existing CPython versions starting with Python 2.4, with several 
optimisations also for recent CPython versions (including the upcoming 3.3).




> 2) It is not trivial to use it correctly.


Cython is basically Python, so Python developers with some C or C++ 
knowledge tend to get along with it quickly.


I can't say yet how easy it is (or will be) to write code that is portable 
across independent Python implementations, but given that that field is 
still young, there's certainly a lot that can be done to aid this.




> I think it needs a SWIG-like
> companion script that can write at least first-pass ctypes code from the .h
> header files. Or maybe it could/should use header info at runtime (with the
> .h bundled with a module).


From my experience, this is a "nice to have" more than a requirement. It 
has been requested for Cython a couple of times, especially by new users, 
and there are a couple of scripts out there that do this to some extent. 
But the usual problem is that Cython users (and, similarly, ctypes users) 
do not want a 1:1 mapping of a library API to a Python API (there's SWIG 
for that), and you can't easily get more than a trivial mapping out of a 
script. But, yes, a one-shot generator for the necessary declarations would 
at least help in cases where the API to be wrapped is somewhat large.




> 3) It seems to be slower than compiled C extension wrappers. That, at
> least, was the discovery of someone who re-wrote pygame using ctypes. (The
> hope was that using ctypes would aid porting to 3.x, but the time penalty
> was apparently too much for time-critical code.)


Cython code can be as fast as C code, and in some cases, especially when 
developer time is limited, even faster than hand written C extensions. It 
allows for a straightforward optimisation path from regular Python code 
down to the speed of C, and trivial interaction with C code itself, if the 
need arises.


Stefan


[1] The PyPy port of Cython is currently being written as a GSoC project.

[2] The IronPython port of Cython was written to facilitate a NumPy port to 
the .NET environment. It's currently not a complete port of all Cython 
features.





Re: [Python-Dev] peps: Add memory consumption table.

2011-08-28 Thread Antoine Pitrou
On Sun, 28 Aug 2011 20:13:11 +0200
martin.v.loewis  wrote:
>  
> +Performance
> +---
> +
> +Performance of this patch must be considered for both memory
> +consumption and runtime efficiency. For memory consumption, the
> +expectation is that applications that have many large strings will see
> +a reduction in memory usage. For small strings, the effects depend on
> +the pointer size of the system, and the size of the Py_UNICODE/wchar_t
> +type. The following table demonstrates this for various small string
> +sizes and platforms.

The table is for ASCII-only strings, right? Perhaps that should be
mentioned somewhere.
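
One way to see why the ASCII-only assumption matters is to measure how
much a string object grows per extra character for different character
ranges. This is a sketch that assumes it is run on a CPython that
implements PEP 393 (3.3 or later); the helper name and the sample
characters are arbitrary choices for illustration.

```python
import sys

def bytes_per_char(ch):
    # Growth in object size when one more copy of `ch` is appended.
    return sys.getsizeof(ch * 1001) - sys.getsizeof(ch * 1000)

samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),
    ("BMP", "\u0394"),
    ("non-BMP", "\U00010348"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On such an interpreter the per-character cost should come out as 1, 1,
2 and 4 bytes respectively, so a table computed for ASCII-only strings
shows the best case.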

Regards

Antoine.




Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
> I would say no more than a 15% slowdown on each of the following
> benchmarks:
> 
> - stringbench.py -u
>   (http://svn.python.org/view/sandbox/trunk/stringbench/)
> - iobench.py -t
>   (in Tools/iobench/)
> - the json_dump, json_load and regex_v8 tests from
>   http://hg.python.org/benchmarks/

I now have benchmark results for these; numbers are for revision
c10bcab2aac7, comparing to 1ea72da11724 (wide unicode), on 64-bit
Linux with gcc 4.6.1 running on Core i7 2.8GHz.

- stringbench gives 10% slowdown on total time; the tests take
  between 78% and 220%. The cost is typically not in performing
  the string operations themselves, but in the creation of the
  result strings. In PEP 393, a buffer must be scanned for the
  highest code point, which means that each byte must be inspected
  twice (a second time when the copying occurs).
- the iobench results are between 2% acceleration (seek operations),
  16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
  37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
  difference is probably in the UTF-8 decoder; I have already
  restored the "runs of ASCII" optimization and am out of ideas for
  further speedups. Again, having to scan the UTF-8 string twice
  is probably one cause of slowdown.
- the json and regex_v8 tests see a slowdown of below 1%.

The slowdown is larger when compared with a narrow Unicode build.
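
The double inspection mentioned above can be sketched in a few lines of
Python. This is only an illustration of the two-pass idea, not the C
implementation: the function names are made up, and little-endian
packing is an arbitrary choice where the real code would use native
byte order.

```python
def choose_kind(code_points):
    """First pass: find the highest code point, pick 1, 2 or 4 bytes."""
    highest = max(code_points, default=0)
    if highest < 0x100:
        return 1
    if highest < 0x10000:
        return 2
    return 4

def pack(code_points):
    """Second pass: copy every code point into the chosen fixed width."""
    kind = choose_kind(code_points)
    out = bytearray()
    for cp in code_points:
        out += cp.to_bytes(kind, "little")
    return kind, bytes(out)

kind, data = pack([ord(c) for c in "caf\u00e9"])
print(kind, len(data))
```

Every input element is visited once in choose_kind and once more in
pack, which is the extra cost when creating result strings.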

> Additionally, it would be nice if you could run at least some of the
> test_bigmem tests, according to your system's available RAM.

Running only StrTest with 4.5G allows me to run 2 tests
(test_encode_raw_unicode_escape and test_encode_utf7); this sees
a slowdown of 37% in Linux user time.

Regards,
Martin


Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Antoine Pitrou

> - the iobench results are between 2% acceleration (seek operations),
>   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
>   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
>   difference is probably in the UTF-8 decoder; I have already
>   restored the "runs of ASCII" optimization and am out of ideas for
>   further speedups. Again, having to scan the UTF-8 string twice
>   is probably one cause of slowdown.

I don't think it's the UTF-8 decoder because I see an even larger
slowdown with simpler encodings (e.g. "-E latin1" or "-E utf-16le").

Thanks

Antoine.




Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
On 28.08.2011 22:01, Antoine Pitrou wrote:
> 
>> - the iobench results are between 2% acceleration (seek operations),
>>   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
>>   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
>>   difference is probably in the UTF-8 decoder; I have already
>>   restored the "runs of ASCII" optimization and am out of ideas for
>>   further speedups. Again, having to scan the UTF-8 string twice
>>   is probably one cause of slowdown.
> 
> I don't think it's the UTF-8 decoder because I see an even larger
> slowdown with simpler encodings (e.g. "-E latin1" or "-E utf-16le").

But those aren't used in iobench, are they?

Regards,
Martin


Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Antoine Pitrou
On Sunday 28 August 2011 at 22:23 +0200, "Martin v. Löwis" wrote:
> On 28.08.2011 22:01, Antoine Pitrou wrote:
> > 
> >> - the iobench results are between 2% acceleration (seek operations),
> >>   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
> >>   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
> >>   difference is probably in the UTF-8 decoder; I have already
> >>   restored the "runs of ASCII" optimization and am out of ideas for
> >>   further speedups. Again, having to scan the UTF-8 string twice
> >>   is probably one cause of slowdown.
> > 
> > I don't think it's the UTF-8 decoder because I see an even larger
> > slowdown with simpler encodings (e.g. "-E latin1" or "-E utf-16le").
> 
> But those aren't used in iobench, are they?

I was not very clear, but you can change the encoding used in iobench by
using the "-E" command-line option (while UTF-8 is the default if you
don't specify anything).

For example:

$ ./python Tools/iobench/iobench.py -t -E latin1
Preparing files...
Text unit = one character (latin1-decoded)

** Text input **

[ 400KB ] read one unit at a time...   5.17 MB/s
[ 400KB ] read 20 units at a time...   77.6 MB/s
[ 400KB ] read one line at a time...209 MB/s
[ 400KB ] read 4096 units at a time...  509 MB/s

[  20KB ] read whole contents at once...885 MB/s
[ 400KB ] read whole contents at once...730 MB/s
[  10MB ] read whole contents at once...726 MB/s

(etc.)

Regards

Antoine.




Re: [Python-Dev] PEP 393 review

2011-08-28 Thread Martin v. Löwis
On 28.08.2011 22:01, Antoine Pitrou wrote:
> 
>> - the iobench results are between 2% acceleration (seek operations),
>>   16% slowdown for small-sized reads (4.31MB/s vs. 5.22 MB/s) and
>>   37% for large sized reads (154 MB/s vs. 235 MB/s). The speed
>>   difference is probably in the UTF-8 decoder; I have already
>>   restored the "runs of ASCII" optimization and am out of ideas for
>>   further speedups. Again, having to scan the UTF-8 string twice
>>   is probably one cause of slowdown.
> 
> I don't think it's the UTF-8 decoder because I see an even larger
> slowdown with simpler encodings (e.g. "-E latin1" or "-E utf-16le").

Those haven't been ported to the new API, yet. Consider, for example,
d9821affc9ee. Before that, I got 253 MB/s on the 4096 units read test;
with that change, I get 610 MB/s. The trunk gives me 488 MB/s, so this
is a 25% speedup for PEP 393.
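
For anyone wanting to check the percentages, the arithmetic behind
these figures is just a relative change against the trunk number. The
helper below is a throwaway sketch; only the three MB/s values come
from the measurements quoted above.

```python
def percent_change(new, old):
    """Relative change of `new` against `old`, in percent."""
    return (new - old) / old * 100.0

# MB/s on the "read 4096 units at a time" test, from the message above.
trunk, pep393_before, pep393_after = 488.0, 253.0, 610.0

print(round(percent_change(pep393_after, trunk)))
print(round(percent_change(pep393_before, trunk)))
```

The first value is the 25% speedup; the second shows that, before the
port to the new API, the same test ran at roughly half the trunk speed.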

Regards,
Martin




Re: [Python-Dev] LZMA compression support in 3.3

2011-08-28 Thread Greg Ewing

Guido van Rossum wrote:
> On Sat, Aug 27, 2011 at 3:14 PM, Dan Stromberg  wrote:
>> IMO, we really, really need some common way of accessing C libraries that
>> works for all major Python variants.
>
> We have one. It's called writing an extension module.


I think Dan means some way of doing this without having
to hand-craft a different one for each Python implementation.

If we're really serious about the idea that "Python is not
CPython", this seems like a reasonable thing to want. Currently
the Python universe is very much centred around CPython, with
the other implementations perpetually in catch-up mode.

My suggestion on how to address this would be something akin
to Pyrex or Cython. I gather that there has been some work
recently on adding different back-ends to Cython to generate
code for different Python implementations.

--
Greg


Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-28 Thread Stephen J. Turnbull
Paul Moore writes:

 > IronPython and Jython can retain UTF-16 as their native form if that
 > makes interop cleaner, but in doing so they need to ensure that basic
 > operations like indexing and len work in terms of code points, not
 > code units, if they are to conform.

[...]

 > They lose the O(1) guarantee, but that's easily defensible as a
 > tradeoff to conform to underlying runtime semantics.

Unfortunately, I don't think it's all that easy to defend.  Absent PEP
393 or a restriction to the characters in the BMP, this is a very
expensive change, easily visible to interactive users, let alone
performance-hungry applications.

I personally do advocate the "array of code points" definition, but I
don't use IronPython or Jython so PEP 393 is as close to heaven as I
expect to get.  OTOH, I also use Emacsen with Mule, and I have to
admit that there is a perceptible performance hit in any large (>1 MB)
buffer containing non-ASCII characters vs. pure ASCII (the code unit
in Mule is 1 byte).  I expect that if IronPython and Jython really
want to retain native, code-unit-based representations, it's going to
be painful to conform to an "array of code points" specification.

There may need to be a compromise of the form "Implementations SHOULD
provide an implementation of str that is both O(1) in indexing and an
array of code points.  Code that is Unicode-ly correct in Python
implementing PEP 393 will need to be ported with some effort to
implementations that do not satisfy this requirement, perhaps using
different algorithms or extra libraries."
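
To illustrate where the cost comes from, here is a sketch of code-point
indexing over a UTF-16 code-unit array. It is not how IronPython or
Jython implement anything; the function names are invented for the
example, and a real implementation could add an index cache. The point
is only that, without extra bookkeeping, finding the n-th code point
requires walking the units before it.

```python
def utf16_units(s):
    """Return the UTF-16 code units of `s` as a list of ints."""
    data = s.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

def code_point_at(units, index):
    """Linear scan: return the code point at position `index`."""
    i = 0
    position = 0
    while i < len(units):
        unit = units[i]
        is_pair = (0xD800 <= unit <= 0xDBFF and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        if position == index:
            if is_pair:
                low = units[i + 1]
                return 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00)
            return unit
        i += 2 if is_pair else 1
        position += 1
    raise IndexError("code point index out of range")

units = utf16_units("a\U0001F600b")
print(len(units), hex(code_point_at(units, 1)), chr(code_point_at(units, 2)))
```

Three code points occupy four code units here, so units[2] and the
third code point are different things, and every lookup past a
surrogate pair has to account for that.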


Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Guido van Rossum
On Sun, Aug 28, 2011 at 11:23 AM, Stefan Behnel  wrote:
> Hi,
>
> Sorry for hooking in here with my usual Cython bias and promotion. When the
> question comes up of what a good FFI for Python should look like, it's an
> obvious reaction on my part to throw Cython into the game.
>
> Terry Reedy, 28.08.2011 06:58:
>>
>> Dan, I once had more or less the same opinion/question as you with
>> regard to ctypes, but I now see at least 3 problems.
>>
>> 1) It seems hard to write it correctly. There are currently 47 open ctypes
>> issues, with 9 being feature requests, leaving 38 behavior-related issues.
>> Tom Heller has not been able to work on it since the beginning of 2010 and
>> has formally withdrawn as maintainer. No one else that I know of has taken
>> his place.
>
> Cython has an active set of developers and a rather large and growing user
> base.
>
> It certainly has lots of open issues in its bug tracker, but most of them
> are there because we *know* where the development needs to go, not so much
> because we don't know how to get there. After all, the semantics of Python
> and C/C++, between which Cython sits, are pretty much established.
>
> Cython compiles to C code for CPython, (hopefully soon [1]) to Python+ctypes
> for PyPy and (mostly [2]) C++/CLI code for IronPython, which boils down to
> the same build time and runtime kind of dependencies that the supported
> Python runtimes have anyway. It does not add dependencies on any external
> libraries by itself, such as the libffi in CPython's ctypes implementation.
>
> For the CPython backend, the generated code is very portable and is
> self-contained when compiled against the CPython runtime (plus, obviously,
> libraries that the user code explicitly uses). It generates efficient code
> for all existing CPython versions starting with Python 2.4, with several
> optimisations also for recent CPython versions (including the upcoming 3.3).
>
>
>> 2) It is not trivial to use it correctly.
>
> Cython is basically Python, so Python developers with some C or C++
> knowledge tend to get along with it quickly.
>
> I can't say yet how easy it is (or will be) to write code that is portable
> across independent Python implementations, but given that that field is
> still young, there's certainly a lot that can be done to aid this.

Cython does sound attractive for cross-Python-implementation use. This
is exciting.

>> I think it needs a SWIG-like
>> companion script that can write at least first-pass ctypes code from the .h
>> header files. Or maybe it could/should use header info at runtime (with the
>> .h bundled with a module).
>
> From my experience, this is a "nice to have" more than a requirement. It has
> been requested for Cython a couple of times, especially by new users, and
> there are a couple of scripts out there that do this to some extent. But the
> usual problem is that Cython users (and, similarly, ctypes users) do not
> want a 1:1 mapping of a library API to a Python API (there's SWIG for that),
> and you can't easily get more than a trivial mapping out of a script. But,
> yes, a one-shot generator for the necessary declarations would at least help
> in cases where the API to be wrapped is somewhat large.

Hm, the main use that was proposed here for ctypes is to wrap existing
libraries (not to create nicer APIs, that can be done in pure Python
on top of this). In general, an existing library cannot be called
without access to its .h files -- there are probably struct and
constant definitions, platform-specific #ifdefs and #defines, and
other things in there that affect the linker-level calling conventions
for the functions in the library. (Just like Python's own .h files --
e.g. the extensive renaming of the Unicode APIs depending on
narrow/wide build) How does Cython deal with these? I wonder if for
this particular purpose SWIG isn't the better match. (If SWIG weren't
universally hated, even by its original author. :-)
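
A small ctypes example shows the dependency on header information. It
is a sketch under two assumptions: a POSIX system where
ctypes.CDLL(None) exposes the C library, and a div_t whose fields are
laid out as quot followed by rem, which holds for common C libraries
but is ultimately decided by <stdlib.h>, not by the Python side.

```python
import ctypes

class DivT(ctypes.Structure):
    # Mirrors `div_t` from <stdlib.h>. The field order is an assumption;
    # the header is the authority, and a wrong guess fails silently.
    _fields_ = [("quot", ctypes.c_int), ("rem", ctypes.c_int)]

libc = ctypes.CDLL(None)  # assumption: POSIX

# Hand-copied from the header: div_t div(int numer, int denom);
libc.div.argtypes = [ctypes.c_int, ctypes.c_int]
libc.div.restype = DivT

result = libc.div(17, 5)
print(result.quot, result.rem)
```

Nothing in the shared library itself records the struct layout or the
return type, so the wrapper has to restate what the .h file says.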

>> 3) It seems to be slower than compiled C extension wrappers. That, at
>> least, was the discovery of someone who re-wrote pygame using ctypes. (The
>> hope was that using ctypes would aid porting to 3.x, but the time penalty
>> was apparently too much for time-critical code.)
>
> Cython code can be as fast as C code, and in some cases, especially when
> developer time is limited, even faster than hand written C extensions. It
> allows for a straightforward optimisation path from regular Python code
> down to the speed of C, and trivial interaction with C code itself, if the
> need arises.
>
> Stefan
>
>
> [1] The PyPy port of Cython is currently being written as a GSoC project.
>
> [2] The IronPython port of Cython was written to facilitate a NumPy port to
> the .NET environment. It's currently not a complete port of all Cython
> features.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-28 Thread Stephen J. Turnbull
Guido van Rossum writes:

 > I don't think anyone else has that impression. Please cite chapter and
 > verse if you really think this is important. IIUC, UCS-2 does not
 > allow surrogate pairs,

In the original definition of UCS-2 in draft ISO 10646 (1990),
everything in the BMP except for 0x and 0xFFFE was a character,
and there was no concept of "surrogate" at all.  Later in ISO 10646
(1993)[1], the Surrogate Area was carved out of the Private Area, but
UCS-2 implementations simply treat them as (single) characters with
special properties.  This was more or less backward compatible as all
corporate uses of the private area used the lower code points and
didn't conflict with the surrogates.  Finally (in 2000 or 2003) the
definition of UCS-2 in ISO 10646 was revised in a backward-
incompatible way to exclude surrogates entirely, ie, nowadays it is a
range-restricted version of UTF-16.

Footnotes: 
[1]  IIRC, strictly speaking this was done slightly later (1993 or
1994) in an official Amendment to ISO 10646; the Amendment was
incorporated into the standard in 2000.



Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Nick Coghlan
On Mon, Aug 29, 2011 at 12:27 PM, Guido van Rossum  wrote:
> I wonder if for
> this particular purpose SWIG isn't the better match. (If SWIG weren't
> universally hated, even by its original author. :-)

SWIG is nice when you control the C/C++ side of the API as well and
can tweak it to be SWIG-friendly. I shudder at the idea of using it to
wrap arbitrary C++ code, though.

That said, the idea of using SWIG to emit Cython code rather than
C/API code may be one well worth exploring.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] PEP 393 Summer of Code Project

2011-08-28 Thread Stephen J. Turnbull
Raymond Hettinger writes:

 > The naming convention for codecs is that the UTF prefix is used for
 > lossless encodings that cover the entire range of Unicode.

Sure.  The operative word here is "codec", not "str", though.

 > "The first amendment to the original edition of the UCS defined
 > UTF-16, an extension of UCS-2, to represent code points outside the
 > BMP."

Since when can s[0] represent a code point outside the BMP, for s a
Unicode string in a narrow build?

Remember, the UCS-2/narrow vs. UCS-4/wide distinction is *not* about
what Python supports vs. the outside world.  It's about what the str/
unicode type is an array of.
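
For concreteness, this is the standard UTF-16 arithmetic that turns one
code point outside the BMP into two array elements. The function name
is made up for the example; on a narrow build, s[0] for such a
character would be the first value returned here, not the code point.

```python
def to_surrogates(code_point):
    """Split a non-BMP code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need a pair")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

high, low = to_surrogates(0x1D11E)  # MUSICAL SYMBOL G CLEF
print(hex(high), hex(low))
```

Each half on its own is a code unit in the surrogate range rather than
a character, which is the distinction being drawn above.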




Re: [Python-Dev] Ctypes and the stdlib (was Re: LZMA compression support in 3.3)

2011-08-28 Thread Glyph Lefkowitz

On Aug 28, 2011, at 7:27 PM, Guido van Rossum wrote:

> In general, an existing library cannot be called
> without access to its .h files -- there are probably struct and
> constant definitions, platform-specific #ifdefs and #defines, and
> other things in there that affect the linker-level calling conventions
> for the functions in the library.

Unfortunately I don't know a lot about this, but I keep hearing about something 
called "rffi" that PyPy uses to call C from RPython: 
.  This has some 
shortcomings currently, most notably the fact that it needs those .h files (and 
therefore a C compiler) at runtime, so it's currently a non-starter for code 
distributed to users.  Not to mention the fact that, as you can see, it's not 
terribly thoroughly documented.  But, that "ExternalCompilationInfo" object 
looks very promising, since it has fields like "includes", "libraries", etc.

Nevertheless it seems like it's a bit more type-safe than ctypes or cython, and 
it seems to me that it could cache some of that information that it extracts 
from header files and store it for later when a compiler might not be around.

Perhaps someone with more PyPy knowledge than I could explain whether this is a 
realistic contender for other Python runtimes?
