Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Dirkjan Ochtman
On Thu, Jan 21, 2010 at 02:56, Collin Winter  wrote:
> Agreed. We are actively working to improve the startup time penalty.
> We're interested in getting guidance from the CPython community as to
> what kind of a startup slow down would be sufficient in exchange for
> greater runtime performance.

For some apps (like Mercurial, which I happen to sometimes hack on),
increased startup time really sucks. We already have our demandimport
code (I believe bzr has something similar) to try and delay imports,
to prevent us spending time on imports we don't need. Maybe it would
be possible to do something like that in u-s? It could possibly also
keep track of the thorny issues, like imports where there's an except
ImportError that can do fallbacks.
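
A minimal sketch of the delayed-import idea, assuming a modern Python 3 where `importlib.util.LazyLoader` is available. This is not Mercurial's demandimport code; the helper name `lazy_import` is made up for illustration. It also shows part of the thorny issue: a missing module is reported immediately, but errors raised inside the module body are deferred until first attribute access, so an `except ImportError` fallback around the import may not see them.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose body runs only on first attribute access."""
    if name in sys.modules:
        # Already imported (eagerly or lazily): reuse it.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        # A missing module is detected right away, so fallbacks still work.
        raise ImportError("no module named %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only arms the module; the real execution
    # happens when an attribute is first looked up.
    loader.exec_module(module)
    return module


colorsys = lazy_import("colorsys")   # cheap: module body has not run yet
print(colorsys.rgb_to_hsv(1, 0, 0))  # first attribute access triggers the import
```

Whether this helps startup depends on how many of the delayed modules a given run actually touches.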

Cheers,

Dirkjan
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Gregory P. Smith
+1

My biggest concern is memory usage but it sounds like addressing that is
already on your mind.  I don't so much mind an additional up-front constant
and per-line-of-code hit for instrumentation, but leaks are unacceptable.
Any instrumentation data or JIT caches should be managed (and tunable at
run time when possible and when it makes sense).

I think having a run time flag (or environment variable for those who like
that) to disable the use of JIT at python3 execution time would be a good
idea.

-gps

disclaimer: I work for Google but not on unladen-swallow.  My motivation is
to improve the future of CPython for the entire world in the long term.


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Martin v. Löwis
> It's true that as Martin said, we can rebase our code to Py3K in a
> branch on python.org any time we like, the question is more "if we do
> the work, will the Python community accept it".

I've stated my personal preference already. Let me add an observation on
top of that: even if the core committers in general might only give
a lukewarm welcome to LLVM usage in the trunk - once it actually is
accepted and committed to the trunk, they will certainly start
supporting it. I would view it as similar to cyclic garbage collection:
when this was first proposed, I wondered myself "who would need that".
When it then was added, I started looking at how it actually works,
and liked the implementation approach very much. People, including
myself, then started fixing remaining types to support GC over the
next year, and would consider it a bug if it didn't work - whether
they were initially pro or con addition of GC.

Regards,
Martin


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Tres Seaver
Jack Diederich wrote:

> Does disabling the LLVM change binary compatibility between modules
> targeted at the same version?  At tonight's Boston PIG we had some
> binary package maintainers but most people (including myself) only
> cared about source compatibility.  I assume Linux distros care about
> binary compatibility _a lot_.

Nope:  they (rightly) only support using binary modules either compiled
by themselves, or compiled against their version of Python.  See the
UCS4 vs. UCS2 troubles which show up routinely when folks try to reuse
binaries across incompatible Pythons on Linux (it's *much* worse on
MacOS).  Source compatibility is all that matters for FLOSS developers,
really;  binary distributions are just an optimization, unless you don't
distribute sources at all.



Tres.
-- 
Tres Seaver  +1 540-429-0999  tsea...@palladion.com
Palladion Software   "Excellence by Design"   http://palladion.com


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Andrew McNabb
On Wed, Jan 20, 2010 at 10:13:56PM -0500, Reid Kleckner wrote:
> 2) As a command line option, you can pass -j never.  If you have a
> short-lived script, you can just stick this in your #! line and forget
> about it.  This has more overhead, since all of the JIT machinery is
> loaded into memory but never used.  Right now we record feedback that
> will never be used, but we could easily make that conditional on the
> jit control flag.

Shebang lines are much less useful than they appear because they only
split on the first space.  Consider the following script:

#!/usr/bin/env python -tt
print "hello, world"

Running it gives the following error, because env is given a single argument
("python -tt") instead of two arguments ("python" and "-tt"):

/usr/bin/env: python -tt: No such file or directory
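
The splitting rule can be mimicked in a few lines. This is only a sketch of the Linux behaviour described above (other systems may split differently), and `split_shebang` is a hypothetical helper, not a real API:

```python
def split_shebang(line):
    """Mimic Linux '#!' parsing: an interpreter plus at most ONE argument."""
    if not line.startswith("#!"):
        raise ValueError("not a shebang line")
    rest = line[2:].strip()
    if not rest:
        raise ValueError("shebang line names no interpreter")
    # Split on the first run of whitespace only; everything after it
    # is handed to the interpreter as a single argument.
    parts = rest.split(None, 1)
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else None
    return interpreter, argument


print(split_shebang("#!/usr/bin/env python -tt"))
# env is asked to run a program literally named "python -tt"
```

So a flag such as -j never could only go in the #! line when the interpreter is named directly, not via env.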


-- 
Andrew McNabb
http://www.mcnabbs.org/andrew/
PGP Fingerprint: 8A17 B57C 6879 1863 DE55  8012 AB4D 6098 8826 6868


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Brett Cannon
On Wed, Jan 20, 2010 at 19:57, Bill Janssen  wrote:

> Reid Kleckner  wrote:
>
> > On Wed, Jan 20, 2010 at 8:14 PM, Terry Reedy  wrote:
> > > If CPython development moves to distributed hg, the notion of 'blessed'
> > > branches (other than the PSF release branch) will, as I understand it,
> > > become somewhat obsolete. If you make a branch publicly available, anyone
> > > can grab it and merge it with their branch, just as they can with anyone
> > > else's.
> >
> > It's true that as Martin said, we can rebase our code to Py3K in a
> > branch on python.org any time we like, the question is more "if we do
> > the work, will the Python community accept it".
>
> Of course!  The Python community accepts all optional stuff.
>
> Personally, I think you've done a great job getting this far, and the
> fixes to LLVM alone are worth the effort, IMO.  That PEP is a great
> interim project report.
>
> If what you're really asking is, "will the Python community accept it
> joyfully and enthusiastically, and embrace it to their hearts?", you'll
> have to put in more work to demonstrate real advantages before the
> answer is "yes", I'd think.


I think the question the PEP is posing is "will python-dev accept the
Unladen Swallow work to add an LLVM JIT into the py3k trunk?" My personal
answer is "yes", as it's a nice speed improvement for code that runs more
than once, all while being an optional enhancement for those who don't want
it (whether because of memory, startup time, or the C++ dependency).

And as for the whole Hg branch thing, this is to merge all of this into the
main branch and not relegate it to some side branch.

-Brett


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Benjamin Peterson
2010/1/20 Jack Diederich :
> On Wed, Jan 20, 2010 at 5:27 PM, Collin Winter wrote:
> [big snip]
>> In order to support hardware and software platforms where LLVM's JIT does not
>> work, Unladen Swallow provides a ``./configure --without-llvm`` option. This
>> flag carves out any part of Unladen Swallow that depends on LLVM, yielding a
>> Python binary that works and passes its tests, but has no performance
>> advantages. This configuration is recommended for hardware unsupported by
>> LLVM, or systems that care more about memory usage than performance.
>
> Does disabling the LLVM change binary compatibility between modules
> targeted at the same version?  At tonight's Boston PIG we had some
> binary package maintainers but most people (including myself) only
> cared about source compatibility.  I assume Linux distros care about
> binary compatibility _a lot_.

We've traditionally broken binary compatibility between major releases
anyway, so I don't see much of an issue there. However, this might
change should PEP 384 be implemented.



-- 
Regards,
Benjamin


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Jack Diederich
On Wed, Jan 20, 2010 at 5:27 PM, Collin Winter  wrote:
[big snip]
> In order to support hardware and software platforms where LLVM's JIT does not
> work, Unladen Swallow provides a ``./configure --without-llvm`` option. This
> flag carves out any part of Unladen Swallow that depends on LLVM, yielding a
> Python binary that works and passes its tests, but has no performance
> advantages. This configuration is recommended for hardware unsupported by
> LLVM, or systems that care more about memory usage than performance.

Does disabling the LLVM change binary compatibility between modules
targeted at the same version?  At tonight's Boston PIG we had some
binary package maintainers but most people (including myself) only
cared about source compatibility.  I assume Linux distros care about
binary compatibility _a lot_.

[snip]
> Managing LLVM Releases, C++ API Changes
> ---
> LLVM is released regularly every six months. This means that LLVM may be
> released two or three times during the course of development of a CPython 3.x
> release. Each LLVM release brings newer and more powerful optimizations,
> improved platform support and more sophisticated code generation.

I don't think this will be a problem in practice as long as the
current rules hold - namely that if someone has already committed a
patch that patch wins unless the later commit is clearly better.  That
puts the onus on people working out-of-sight to incorporate the public
mainline.  I'm sure many patches from internal Googlers (and Ubuntu'ers, and
whoever else) have already been developed on that timeline and
were integrated into the core without remark or incident.

[snip]
> Open Issues
> ===
>
> - *Code review policy for the ``py3k-jit`` branch.* How does the CPython
>  community want us to proceed with respect to checkins on the ``py3k-jit``
>  branch? Pre-commit reviews? Post-commit reviews?
>
>  Unladen Swallow has enforced pre-commit reviews in our trunk, but we realize
>  this may lead to long review/checkin cycles in a purely-volunteer
>  organization. We would like a non-Google-affiliated member of the CPython
>  development team to review our work for correctness and compatibility, but we
>  realize this may not be possible for every commit.

As above, I don't think this will be a problem in practice -- how
often do two people work on the same part of the core?  So long as the
current "firstest with the mostest" practice holds for public commits
it doesn't matter what googlers do in private.

I like it,

-Jack


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Bill Janssen
Reid Kleckner  wrote:

> On Wed, Jan 20, 2010 at 8:14 PM, Terry Reedy  wrote:
> > If CPython development moves to distributed hg, the notion of 'blessed'
> > branches (other than the PSF release branch) will, as I understand it,
> > become somewhat obsolete. If you make a branch publicly available, anyone
> > can grab it and merge it with their branch, just as they can with anyone
> > else's.
> 
> It's true that as Martin said, we can rebase our code to Py3K in a
> branch on python.org any time we like, the question is more "if we do
> the work, will the Python community accept it".

Of course!  The Python community accepts all optional stuff.  

Personally, I think you've done a great job getting this far, and the
fixes to LLVM alone are worth the effort, IMO.  That PEP is a great
interim project report.

If what you're really asking is, "will the Python community accept it
joyfully and enthusiastically, and embrace it to their hearts?", you'll
have to put in more work to demonstrate real advantages before the
answer is "yes", I'd think.

Bill


Re: [Python-Dev] Bazaar branches available (again) on Launchpad

2010-01-20 Thread Ben Finney
Georg Brandl  writes:

> But I've no intention to restrict feature releases to "every 18-24
> months". What now?

Now we take further discussion to the ‘python-ideas’ forum.

-- 
 \   “We must respect the other fellow's religion, but only in the |
  `\   sense and to the extent that we respect his theory that his |
_o__) wife is beautiful and his children smart.” —Henry L. Mencken |
Ben Finney



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Reid Kleckner
On Wed, Jan 20, 2010 at 8:14 PM, Terry Reedy  wrote:
> If CPython development moves to distributed hg, the notion of 'blessed'
> branches (other than the PSF release branch) will, as I understand it,
> become somewhat obsolete. If you make a branch publicly available, anyone
> can grab it and merge it with their branch, just as they can with anyone
> else's.

It's true that as Martin said, we can rebase our code to Py3K in a
branch on python.org any time we like, the question is more "if we do
the work, will the Python community accept it".

> Given the slight benefits compared to the costs, I think this, in its
> current state, should be optional, such as is psyco.

How optional would you want it to be?  I'll point out that there are
two ways you can turn off the JIT right now:
1) As a configure time option, pass --without-llvm.  Obviously, this
is really only useful to people who are building their own binaries,
or for embedded platforms.
2) As a command line option, you can pass -j never.  If you have a
short-lived script, you can just stick this in your #! line and forget
about it.  This has more overhead, since all of the JIT machinery is
loaded into memory but never used.  Right now we record feedback that
will never be used, but we could easily make that conditional on the
jit control flag.

> Your results suggest that speeding up garden-variety Python code is harder
> than it sometimes seems. I wonder how your results from fancy codework
> compare, for instance, with simply making built-in names reserved, so that,
> for instance, len =  is illegal, and all such names get
> dereferenced at compile time.

That's cheating.  :)

Reid


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Jeffrey Yasskin
On Wed, Jan 20, 2010 at 5:56 PM, Collin Winter  wrote:
> On Wed, Jan 20, 2010 at 5:14 PM, Terry Reedy  wrote:
>> Given the slight benefits compared to the costs, I think this, in its
>> current state, should be optional, such as is psyco.
>>
>> Psyco has a similar time-space tradeoff, except that the benefit is much
>> greater (10x+ has been reported by happy users) for certain types of code
>> (lots of integer arithmetic, which psyco unboxes). Why should your changes
>> be blessed and psyco not? While now 2.x only, like UnSw, there are
>> apparently people working on a 3.x version. Psyco makes its tradeoffs
>> voluntary, and easy to switch on and off where the benefits are worth the
>> cost to the particular user.

I think you're right that users should be able to turn off the JIT at
runtime (not just configure time), and get pretty close to no slowdown
or memory overhead compared to pre-Unladen CPython. We currently have
a command-line switch to do just that, but I suspect it doesn't get as
close to pre-Unladen performance/memory use as it could. I've filed
http://code.google.com/p/unladen-swallow/issues/detail?id=123 to keep
track of this.

>> I do not think that standard CPython should be
>> made incompatible with a module that greatly benefits certain users who have
>> been using it for years.

Standard CPython is made incompatible with psyco at every release. ;)
Someone had to update psyco to support CPython 2.5 and 2.6, and
they'll have to update it to support each 3.x as well. I don't think
there's anything we've done in Unladen that will make it impossible to
update psyco to coexist with it; it's just that psyco turned out to be
too complicated for us to keep it working through our bytecode
changes.

> - While Psyco provides large benefits to numerical workloads, the
> benefits to other systems we have at Google are much, much smaller, in
> the 15-30% range.

I don't think it's just Google workloads that Psyco provides a
smaller-than-advertised benefit to.

>> Your results suggest that speeding up garden-variety Python code is harder
>> than it sometimes seems. I wonder how your results from fancy codework
>> compare, for instance, with simply making built-in names reserved, so that,
>> for instance, len =  is illegal, and all such names get
>> dereferenced at compile time.
>
> Yes, if you change the language, certain optimizations become simpler
> or some code becomes faster. However, changing the language in subtle
> ways -- like changing when builtins are bound -- increases the barrier
> to adoption, which is why we chose to implement the language as
> specified. Auditing Google's entire Python codebase to correct for
> these subtle changes would be prohibitive, as would the need to
> retrain all our engineers who use Python to educate them on the
> differences between the two languages.

Again, this isn't just Google. Do you really want to audit your code
and all the libraries you use to make sure they don't depend on late
builtin binding?

We have considered making language changes like this at some -O level
to support people who want every ounce of speed (in the same way that
gcc supports a -ffast-math option for people who don't care about
floating point accuracy), and we think it'll be a good thing to
investigate in the future, but it's not the kind of change I'd want to
make to the default language.
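
For readers unfamiliar with late builtin binding, here is a small sketch in modern Python 3 (where the module is called `builtins`; in the 2.x series discussed here it was `__builtin__`). The function names are invented for the example. It shows that `len` is looked up each time the function runs, so code may legally observe a rebound builtin, which is exactly what compile-time dereferencing would break:

```python
import builtins


def count_items(items):
    # `len` is resolved at call time: module globals first, then builtins.
    return len(items)


def demo():
    before = count_items([1, 2, 3])
    original_len = builtins.len
    builtins.len = lambda obj: 42      # rebind the builtin at runtime
    try:
        during = count_items([1, 2, 3])
    finally:
        builtins.len = original_len    # always restore the real builtin
    after = count_items([1, 2, 3])
    return before, during, after


print(demo())
```

Real code rarely rebinds `len`, but test suites and debugging tools do patch builtins such as `open`, which is the kind of dependency an audit would have to find.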

>> I guess what I am mainly saying is that there are several possible ways to
>> speed up Python 3 execution (including others not mentioned here) and it is
>> not at all clear to me that this particular one is in any sense 'best of
>> breed'. If it disables other approaches, I think it should be optional for
>> the standard PSF distribution.

As far as I can tell, it does not disable other approaches.

Thanks for the comments!
Jeffrey


Re: [Python-Dev] newgil for python 2.5.4

2010-01-20 Thread Benjamin Peterson
2010/1/20 Ross Cohen :
> Comments? Suggestions? I'm going to continue fixing this up, but was
> wondering if this could possibly make it into python 2.7.

Yes, it could, but please post it to the tracker instead of attaching patches.



-- 
Regards,
Benjamin


[Python-Dev] newgil for python 2.5.4

2010-01-20 Thread Ross Cohen
I put together this patch which switches 2.5.4 over to use the newgil.
This was generated by diffing change 76193 (last in the newgil branch)
against change 76189 and applying that on top of the changes listed in
issue 4293 (http://bugs.python.org/issue4293), specifically 68460, 68461
and 68722. There were only a couple of rejects, mostly in docs and tests
plus some irrelevant bits. I had to fix up one or two places by hand, which
was pretty straightforward.

Only 2 tests are failing. The test_capi failure looks to be a problem with
the test itself, because it came from the py3k branch; test_command is also
failing for me, which I need to look into.

Some performance tests (taken from
http://www.mail-archive.com/python-dev@python.org/msg43407.html):
Processor: Intel(R) Core(TM)2 Quad  CPU   Q9300  @ 2.50GHz
-j0
2.5.4 : 20.380s
newgil: 16.590s

-j4
2.5.4 : 27.440s
newgil: 20.120s

Comments? Suggestions? I'm going to continue fixing this up, but was
wondering if this could possibly make it into python 2.7.

Ross
diff --git a/Include/ceval.h b/Include/ceval.h
--- a/Include/ceval.h
+++ b/Include/ceval.h
@@ -69,10 +69,6 @@
 PyAPI_FUNC(PyObject *) PyEval_EvalFrame(struct _frame *);
 PyAPI_FUNC(PyObject *) PyEval_EvalFrameEx(struct _frame *f, int exc);
 
-/* this used to be handled on a per-thread basis - now just two globals */
-PyAPI_DATA(volatile int) _Py_Ticker;
-PyAPI_DATA(int) _Py_CheckInterval;
-
 /* Interface for threads.
 
A module that plans to do a blocking system call (or something else
@@ -131,6 +127,9 @@
 PyAPI_FUNC(void) PyEval_ReleaseThread(PyThreadState *tstate);
 PyAPI_FUNC(void) PyEval_ReInitThreads(void);
 
+PyAPI_FUNC(void) _PyEval_SetSwitchInterval(unsigned long microseconds);
+PyAPI_FUNC(unsigned long) _PyEval_GetSwitchInterval(void);
+
 #define Py_BEGIN_ALLOW_THREADS { \
 			PyThreadState *_save; \
 			_save = PyEval_SaveThread();
@@ -149,6 +148,7 @@
 #endif /* !WITH_THREAD */
 
 PyAPI_FUNC(int) _PyEval_SliceIndex(PyObject *, Py_ssize_t *);
+PyAPI_FUNC(void) _PyEval_SignalAsyncExc(void);
 
 
 #ifdef __cplusplus
diff --git a/Include/pystate.h b/Include/pystate.h
--- a/Include/pystate.h
+++ b/Include/pystate.h
@@ -82,6 +82,8 @@
 
 PyObject *dict;  /* Stores per-thread state */
 
+/* XXX doesn't mean anything anymore (the comment below is obsolete)
+   => deprecate or remove? */
 /* tick_counter is incremented whenever the check_interval ticker
  * reaches zero. The purpose is to give a useful measure of the number
  * of interpreted bytecode instructions in a given thread.  This
diff --git a/Include/sysmodule.h b/Include/sysmodule.h
--- a/Include/sysmodule.h
+++ b/Include/sysmodule.h
@@ -19,7 +19,6 @@
 			Py_GCC_ATTRIBUTE((format(printf, 1, 2)));
 
 PyAPI_DATA(PyObject *) _PySys_TraceFunc, *_PySys_ProfileFunc;
-PyAPI_DATA(int) _PySys_CheckInterval;
 
 PyAPI_FUNC(void) PySys_ResetWarnOptions(void);
 PyAPI_FUNC(void) PySys_AddWarnOption(char *);
diff --git a/Lib/test/test_capi.py b/Lib/test/test_capi.py
--- a/Lib/test/test_capi.py
+++ b/Lib/test/test_capi.py
@@ -1,10 +1,98 @@
 # Run the _testcapi module tests (tests for the Python/C API):  by defn,
 # these are all functions _testcapi exports whose name begins with 'test_'.
 
+from __future__ import with_statement
 import sys
+import time
+import random
+import unittest
+import threading
 from test import test_support
 import _testcapi
 
+class TestPendingCalls(unittest.TestCase):
+
+    def pendingcalls_submit(self, l, n):
+        def callback():
+            #this function can be interrupted by thread switching so let's
+            #use an atomic operation
+            l.append(None)
+
+        for i in range(n):
+            time.sleep(random.random()*0.02) #0.01 secs on average
+            #try submitting callback until successful.
+            #rely on regular interrupt to flush queue if we are
+            #unsuccessful.
+            while True:
+                if _testcapi._pending_threadfunc(callback):
+                    break;
+
+    def pendingcalls_wait(self, l, n, context = None):
+        #now, stick around until l[0] has grown to 10
+        count = 0;
+        while len(l) != n:
+            #this busy loop is where we expect to be interrupted to
+            #run our callbacks.  Note that callbacks are only run on the
+            #main thread
+            if False and test_support.verbose:
+                print "(%i)"%(len(l),),
+            for i in xrange(1000):
+                a = i*i
+            if context and not context.event.is_set():
+                continue
+            count += 1
+            self.failUnless(count < 1,
+                "timeout waiting for %i callbacks, got %i"%(n, len(l)))
+        if False and test_support.verbose:
+            print "(%i)"%(len(l),)
+
+    def test_pendingcalls_threaded(self):
+
+        #do every callback on a separate thread
+        n = 32 #total callbacks
+        threads = []
+        class foo(object):pass
+        context = foo()
+        context.l = []
+  

Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Collin Winter
Hi Terry,

On Wed, Jan 20, 2010 at 5:14 PM, Terry Reedy  wrote:
> Some comments from a non-developer:
>
> The proposal to add this to 3.x seems a bit premature until you have a
> version that runs with 3.x. Not that I expect that to be a problem though.
> If CPython development moves to distributed hg, the notion of 'blessed'
> branches (other than the PSF release branch) will, as I understand it,
> become somewhat obsolete. If you make a branch publicly available, anyone
> can grab it and merge it with their branch, just as they can with anyone
> else's.

Beyond inclusion in the mainline CPython source tree, we are also
interested in gauging python-dev's level of interest in an LLVM-based
JIT. While anyone can currently grab our source (as some companies
already have), we don't want to waste our time fixing the remaining
issues if python-dev is not interested in incorporating an LLVM-based
JIT into the mainline roadmap.

> 3.x adds optional type annotations. It seems to me that a 3.x proposal should
> make use of those.

PEP 3107 explicitly rejected having Python-the-language decide on any
semantics for function annotations, as the exact nature of those
semantics was sure to be controversial. Reversing that decision is a
separate issue requiring a separate PEP and input from the other
implementations, I think, and in any case, would be an optimization
within the wider JIT framework.
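
To illustrate why annotations cannot simply be trusted as type information: in Python 3 they are stored on the function but not enforced by the language. A short sketch (the function `scale` is invented for the example):

```python
def scale(value: int, factor: int = 2) -> int:
    # The annotations are recorded, but nothing checks them at call time.
    return value * factor


# The annotations are just a mapping attached to the function object.
print(scale.__annotations__)

# Passing a str where "int" is annotated works fine: str * int repeats.
print(scale("ab"))
```

Any JIT that wanted to exploit annotations would therefore still need guards or runtime feedback, unless the language itself assigned them meaning.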

> All the info being collected for every byte code execution *must* take extra
> time, which will slow down certain types of programs.

That is correct. For example, it slows down startup, but for most
applications, you make up the difference (and more!) once the
JIT-compiled functions kick in. We have ideas for addressing the
degradation for short-lived programs, and one of our developers is
actively working on this area.

> Given the slight benefits compared to the costs, I think this, in its
> current state, should be optional, such as is psyco.
>
> Psyco has a similar time-space tradeoff, except that the benefit is much
> greater (10x+ has been reported by happy users) for certain types of code
> (lots of integer arithmetic, which psyco unboxes). Why should your changes
> be blessed and psyco not? While now 2.x only, like UnSw, there are
> apparently people working on a 3.x version. Psyco makes its tradeoffs
> voluntary, and easy to switch on and off where the benefits are worth the
> cost to the particular user. I do not think that standard CPython should be
> made incompatible with a module that greatly benefits certain users who have
> been using it for years.

We considered extending Psyco when we began the project, but found it
to be an unsuitable baseline for a number of reasons:
- Psyco is 32-bit x86 only (as the Psyco website now prominently
notes), making it unsuitable for 64-bit environments like Google. The
module's primary maintainer, Christian Tismer, has publicly indicated
that he has no interest in porting Psyco to work on 64-bit systems.
- Psyco is a tremendously complicated system with no test suite to
verify that it is working correctly. The core development team is very
small (I believe it is solely Christian Tismer), and other people who
have tried to modify Psyco have found it not worth their time (myself,
Jeffrey Yasskin, Raymond Hettinger). Jeffrey and I originally intended
to make Psyco compatible with Unladen Swallow, but the overhead of
doing so distracted from our main goal.
- While Psyco provides large benefits to numerical workloads, the
benefits to other systems we have at Google are much, much smaller, in
the 15-30% range.

Do you have pointers to the in-development Python 3 version of Psyco
you mentioned? Google doesn't find any such project, except for
forum comments that it doesn't exist.

> Your results suggest that speeding up garden-variety Python code is harder
> than it sometimes seems. I wonder how your results from fancy codework
> compare, for instance, with simply making built-in names reserved, so that,
> for instance, len =  is illegal, and all such names get
> dereferenced at compile time.

Yes, if you change the language, certain optimizations become simpler
or some code becomes faster. However, changing the language in subtle
ways -- like changing when builtins are bound -- increases the barrier
to adoption, which is why we chose to implement the language as
specified. Auditing Google's entire Python codebase to correct for
these subtle changes would be prohibitive, as would the need to
retrain all our engineers who use Python to educate them on the
differences between the two languages.

> Increasing startup time on a hot new machine from .2 to .3 sec may not be a
> big deal, but increases from 2 to 3 secs on an older machine will be for
> short scripts that execute in 2 secs anyway.

Agreed. We are actively working to improve the startup time penalty.
We're interested in getting guidance from the CPython community as to
what kind of a startup slow down would be sufficient in exchange for
greater runtime performance.

Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Terry Reedy

Some comments from a non-developer:

The proposal to add this to 3.x seems a bit premature until you have a 
version that runs with 3.x. Not that I expect that to be a problem though.


If CPython development moves to distributed hg, the notion of 'blessed' 
branches (other than the PSF release branch) will, as I understand it, 
become somewhat obsolete. If you make a branch publicly available, 
anyone can grab it and merge it with their branch, just as they can with 
anyone else's.


3.x adds optional type annotations. It seems to me that a 3.x proposal 
should make use of those.


All the info being collected for every byte code execution *must* take 
extra time, which will slow down certain types of programs.


Given the slight benefits compared to the costs, I think this, in its 
current state, should be optional, such as is psyco.


Psyco has a similar time-space tradeoff, except that the benefit is much 
greater (10x+ has been reported by happy users) for certain types of 
code (lots of integer arithmetic, which psyco unboxes). Why should your 
changes be blessed and psyco not? While now 2.x only, like UnSw, there 
are apparently people working on a 3.x version. Psyco makes its 
tradeoffs voluntary, and easy to switch on and off where the benefits 
are worth the cost to the particular user. I do not think that standard 
CPython should be made incompatible with a module that greatly benefits 
certain users who have been using it for years.


Your results suggest that speeding up garden-variety Python code is 
harder than it sometimes seems. I wonder how your results from fancy 
codework compare, for instance, with simply making built-in names 
reserved, so that, for instance, len =  is illegal, and all 
such names get dereferenced at compile time.
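
As a rough illustration of why built-in names cannot simply be resolved at compile time today, the following Python 3 sketch shows that a name such as len is looked up on every call, first in the module globals and then in builtins. The namespace dict and the count_items helper are made up for this example; they are not part of any proposal in this thread.

```python
import builtins

# A stand-in for a module's global namespace (hypothetical, for illustration).
namespace = {"__builtins__": builtins}

# Define a function whose globals are that namespace.
exec("def count_items(items):\n    return len(items)\n", namespace)
count_items = namespace["count_items"]

before = count_items([1, 2, 3])      # resolves to the built-in len

# Rebinding 'len' in the module namespace changes what the function calls,
# because the name is resolved on every call, not at compile time.
namespace["len"] = lambda obj: -1
shadowed = count_items([1, 2, 3])

del namespace["len"]
restored = count_items([1, 2, 3])    # falls back to the built-in again

print(before, shadowed, restored)
```

Making such names reserved would turn the rebinding step into an error, which is what would allow the compiler to bind the call directly.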


Increasing startup time on a hot new machine from .2 to .3 sec may not 
be a big deal, but an increase from 2 to 3 secs on an older machine will 
be one for short scripts that execute in 2 secs anyway.


I guess what I am mainly saying is that there are several possible ways 
to speed up Python 3 execution (including others not mentioned here) and 
it is not at all clear to me that this particular one is in any sense 
'best of breed'. If it disables other approaches, I think it should be 
optional for the standard PSF distribution.


Terry Jan Reedy

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread MRAB

Martin v. Löwis wrote:
>>> The only supported default encodings in Python are:
>>>
>>>   Python 2.x: ASCII
>>>   Python 3.x: UTF-8
>>>
>> Is this true?
>
> For 3.x: yes. However, the default encoding is much less relevant in
> 3.x, since Python will never implicitly use the default encoding, except
> when some C module asks for a char*. In particular, ordering between
> bytes and unicodes causes a type error always.
>
>> I thought the default encoding in Python 3 was platform
>> specific (i.e. cp1252 on Windows).
>
> Not at all. You are confusing this with the IO encoding of text
> files, which indeed defaults to the locale encoding (and CP_ACP
> on Windows specifically - which may or may not be cp1252).
>
> The default encoding (i.e. the one you could theoretically set
> with sys.setdefaultencoding) in 3.x is UTF-8.
>
It's UTF-8 precisely to avoid cross-platform encoding problems,
especially important now that 'normal' strings are Unicode.


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-01-20 Thread Martin v. Löwis
> We're looking forward to discussing this with everyone.

I think the PEP is asking too much (although I can
understand how marketing may have influenced that),
and also asks for permission where none is needed.

It is too broad: it asks (in its title) for the integration of Unladen
Swallow, when it actually only asks for integration of the JIT compiler
(maybe Unladen Swallow actually only *is* the JIT compiler, but given
the contributions that have already been integrated, this wouldn't be
my understanding).

It asks for permission where none is needed: Permission to continue
working on a feature you already contributed is actually granted to
every contributor - in fact, contributors are *expected* to continue
to work on their feature, at least for fixing bugs in it. It also
asks for the creation and merging of branches. Basically, you can
create any branches you want - the PEP only needs to worry whether
it is ok to have the feature on the trunk. The section on the proposed
merge plan is but one implementation detail.

And I don't even talk about the pony.

These aside, I'm certainly +1 on the PEP.

I would like to point out that the PEP still *could* target Python
2.7 if you wanted to, until beta 1. Having missed alpha 1 is no
real limitation. OTOH, I support the idea of not adding it to 2.7,
e.g. to reduce the maintenance effort, and not introduce a new risk
for the last 2.x release.

Regards,
Martin




Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread Martin v. Löwis
>> The only supported default encodings in Python are:
>>
>>   Python 2.x: ASCII
>>   Python 3.x: UTF-8
>>
> 
> Is this true?

For 3.x: yes. However, the default encoding is much less relevant in
3.x, since Python will never implicitly use the default encoding, except
when some C module asks for a char*. In particular, ordering between
bytes and unicodes causes a type error always.

> I thought the default encoding in Python 3 was platform
> specific (i.e. cp1252 on Windows).

Not at all. You are confusing this with the IO encoding of text
files, which indeed defaults to the locale encoding (and CP_ACP
on Windows specifically - which may or may not be cp1252).

The default encoding (i.e. the one you could theoretically set
with sys.setdefaultencoding) in 3.x is UTF-8.
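
A small Python 3 sketch of the distinction drawn above, under the assumption that it is run on a stock CPython 3 interpreter. The variable names are only illustrative.

```python
import locale
import sys

# The interpreter-wide default encoding; in Python 3 this is always UTF-8.
default_encoding = sys.getdefaultencoding()

# The locale-derived encoding that text-file I/O typically falls back to
# when no encoding is passed; this one varies by platform and locale.
io_encoding = locale.getpreferredencoding(False)

# Python 3 never coerces between bytes and str implicitly, so an ordering
# comparison between them raises TypeError.
try:
    b"abc" < "abc"
    comparison_error = None
except TypeError as exc:
    comparison_error = type(exc).__name__

print(default_encoding, io_encoding, comparison_error)
```

The first value should be the same everywhere, while the second depends on the machine the script runs on.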

Regards,
Martin


Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread Martin v. Löwis
> Why only set an encoding on these streams when they're directly
> connected to a tty?

If you are sending data to the terminal, you can be fairly certain
that the locale's encoding should be used. It's a convenience feature
for the interactive mode, so that Unicode strings print correctly.

When sending data to a pipe or to a file, God knows what encoding
should have been used. If it's any XML file (for example), using the
locale's encoding would be incorrect, and the encoding declared
in the XML declaration should be used (or UTF-8 if no declaration
is included). If it's a HTTP socket, it really should be restricted
to ASCII in the headers, and then to the content-type. And so on.

So in general, applications should arrange to set the encoding,
or encode the data themselves, when they write to some output stream.
If they fail to do so, it's a bug in the application, not in Python.
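
One way an application might take on that responsibility, sketched in Python 3 syntax: wrap the binary stream in a text layer with an explicitly chosen encoding instead of relying on whatever the interpreter guessed. The open_text_output helper is a made-up name for this example, and the BytesIO object merely stands in for a pipe or file.

```python
import io

def open_text_output(binary_stream, encoding):
    """Wrap a binary stream so that text written to it is encoded explicitly."""
    # newline="\n" turns off platform-specific newline translation.
    return io.TextIOWrapper(binary_stream, encoding=encoding, newline="\n")

# A BytesIO stands in for a pipe or a file opened in binary mode.
sink = io.BytesIO()
out = open_text_output(sink, "utf-8")
out.write("na\u00efve caf\u00e9\n")
out.flush()

written = sink.getvalue()
print(written)
```

The encoding passed in would come from the protocol or file format in question (an XML declaration, a Content-Type header, and so on), not from the locale.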

> I'll patch things to remove the isatty conditional if that's acceptable.

It will make your Python release incompatible with everybody else's,
and will probably lead to mojibake. Otherwise, it's fine.

Regards,
Martin


Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread Michael Foord

On 20/01/2010 21:37, M.-A. Lemburg wrote:
> David Malcolm wrote:
>> I'm thinking of making this downstream change to Fedora's site.py (and
>> possibly in future RHEL releases) so that the default encoding
>> automatically picks up the encoding from the locale:
>>
>>  def setencoding():
>>      """Set the string encoding used by the Unicode implementation.  The
>>      default is 'ascii', but if you're willing to experiment, you can
>>      change this."""
>>      encoding = "ascii" # Default value set by _PyUnicode_Init()
>> -    if 0:
>> +    if 1:
>>          # Enable to support locale aware default string encodings.
>>          import locale
>>          loc = locale.getdefaultlocale()
>>          if loc[1]:
>>              encoding = loc[1]
>>      if 0:
>>          # Enable to switch off string to Unicode coercion and implicit
>>          # Unicode to string conversion.
>>          encoding = "undefined"
>>      if encoding != "ascii":
>>          # On Non-Unicode builds this will raise an AttributeError...
>>          sys.setdefaultencoding(encoding) # Needs Python Unicode build !
>>
>> I've written up extensive notes on the change and the history of the
>> issue here:
>> https://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale
>>
>> Please let me know if there are any errors on that page!
>>
>> The aim is to avoid strange behavior changes when running a script
>> within a shell pipeline/cronjob as opposed to at a tty (and to capture
>> some of the bizarre cornercases, for example, I found the behavior of
>> the pango/pygtk modules particularly surprising).
>>
>> I mention it here as a "heads-up" about the change:
>>   - in case other distributions may want to do the same (or already do
>>     so, though in my very brief survey no-one else seemed to), and
>>   - in case doing so breaks things in a way I'm not expecting; can
>>     anyone see any flaws in my arguments?
>>   - in case other people find my notes on the issue useful
>>
>> Hope this is helpful; can anyone see any potential problems with this
>> change?
>
> Yes: such a change is unsupported by Python. The code you are
> changing should really have been removed many releases ago -
> it was originally only intended to serve as basis for experimentation
> on choosing the "right" default encoding.
>
> The only supported default encodings in Python are:
>
>   Python 2.x: ASCII
>   Python 3.x: UTF-8

Is this true? I thought the default encoding in Python 3 was platform 
specific (i.e. cp1252 on Windows). That means files written using the 
default encoding on one platform may not be read correctly on another 
platform. Slightly off topic for this discussion I realise.

Michael

> If you change these, you are on your own and strange things will
> start to happen. The default encoding does not only affect
> the translation between Python and the outside world, but also
> all internal conversions between 8-bit strings and Unicode.
>
> Hacks like what's happening in the pango module (setting the
> default encoding to 'utf-8' by reloading the site module in
> order to get the sys.setdefaultencoding() API back) are just
> downright wrong and will cause serious problems since Unicode
> objects cache their default encoded representation.
>
> Please don't enable the use of a locale based default encoding.
>
> If all you want to achieve is getting the encodings of
> stdout and stdin correctly setup for pipes, you should
> instead change the .encoding attribute of those (only).



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog




Re: [Python-Dev] Bazaar branches available (again) on Launchpad

2010-01-20 Thread Georg Brandl
On 20.01.2010 03:43, David Lyon wrote:

>> Barry was talking about mirrors of the python code. It is true a
>> "package manager" could be developed based on a SCM, however you need
>> to implement this far away from the stdlib and get traction with it
>> within the community long before inclusion would be considered.
> 
> I think I'll have better chances with PEPs.
> 
> Being honest, if wonderful libraries like Sphinx and Mercurial
> and Git and BZR can't make it into the stdlib, then there is
> no hope for even newer code to get in there.

But I've no intention to restrict feature releases to "every 18-24
months". What now?

Georg



Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread M.-A. Lemburg
David Malcolm wrote:
> On Wed, 2010-01-20 at 22:37 +0100, M.-A. Lemburg wrote:
> Note that pango isn't even doing the module reload hack; it's written in
> C, and going in directly through the C API:
>PyUnicode_SetDefaultEncoding("utf-8");
> 
> I should mention that I've seen at least one C module in the wild that
> exists merely to do this:
> 
>   #include <Python.h>
>   void initutf8_please(void) {
>  PyUnicode_SetDefaultEncoding("utf-8");
>   }
> 
> so that the user could do "import utf8_please" at the top of their
> scripts.

We should have made that a private C API... oh well. At the time
these APIs were written it wasn't yet clear which default encoding
to choose and even after the decision there were a few different camps:

 * Latin-1
 * UTF-8
 * locale dependent

Sometime later Guido (AFAIR) then proposed ASCII as the GCD of
all of these.

>> If all you want to achieve is getting the encodings of
>> stdout and stdin correctly setup for pipes, you should
>> instead change the .encoding attribute of those (only).
> Currently they are set up, but only when connected to a tty, which leads
> to surprising behavior changes inside pipes/cronjobs (e.g. piping a
> unicode string to "less" immediately breaks for code points above 127:
> less is expecting locale-encoded bytes, but sys.stdout has encoding
> "ASCII").
> 
> Similarly:
> [da...@brick ~]$ python -c "import sys; print sys.stdout.encoding"
> UTF-8
> [da...@brick ~]$ python -c "import sys; print sys.stdout.encoding" | cat
> None
> 
> Why only set an encoding on these streams when they're directly
> connected to a tty?  I'll patch things to remove the isatty conditional
> if that's acceptable.
> 
> (the tty-logic to do it appeared with the initial commit that added
> locale-encoding support to sys.std[in|out], in sysmodule.c:
> http://svn.python.org/view?view=rev&revision=32719
> and was later moved from sysmodule.c to pythonrun.c:
> http://svn.python.org/view?view=rev&revision=33817 
> it later grew to affect stderr:
> http://svn.python.org/view?view=rev&revision=43581
> again, only if directly connected to a tty)

For TTYs the process locale will be a reasonable source of
information about the encoding used for stdin and stdout
(since the TTY will use those settings as well).

For pipes the situation is not all that clear, e.g. you
could have a Java application creating some text in UTF-8
which then gets passed to another application expecting Latin-1,
and all that running in a CP1252 shell on Windows.

However, removing the isatty() check will certainly not cause
as many problems as changing the default encoding altogether.
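
A minimal sketch (Python 3 syntax) of what goes wrong in the pipe scenario described above when producer and consumer disagree about the encoding. The variable names are only for illustration.

```python
text = "caf\u00e9"                       # the string 'café'

produced = text.encode("utf-8")          # what the producer writes to the pipe
misread = produced.decode("latin-1")     # consumer wrongly assumes Latin-1
roundtrip = produced.decode("utf-8")     # consumer agrees with the producer

print(repr(misread))
print(roundtrip == text)
```

The misread string has five characters instead of four: the two UTF-8 bytes of the accented letter are each turned into a separate Latin-1 character.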

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread David Malcolm
On Wed, 2010-01-20 at 22:37 +0100, M.-A. Lemburg wrote:
> David Malcolm wrote:
> > I'm thinking of making this downstream change to Fedora's site.py (and
> > possibly in future RHEL releases) so that the default encoding
> > automatically picks up the encoding from the locale:
> > 
> >  def setencoding():
> >      """Set the string encoding used by the Unicode implementation.  The
> >      default is 'ascii', but if you're willing to experiment, you can
> >      change this."""
> >      encoding = "ascii" # Default value set by _PyUnicode_Init()
> > -    if 0:
> > +    if 1:
> >          # Enable to support locale aware default string encodings.
> >          import locale
> >          loc = locale.getdefaultlocale()
> >          if loc[1]:
> >              encoding = loc[1]
> >      if 0:
> >          # Enable to switch off string to Unicode coercion and implicit
> >          # Unicode to string conversion.
> >          encoding = "undefined"
> >      if encoding != "ascii":
> >          # On Non-Unicode builds this will raise an AttributeError...
> >          sys.setdefaultencoding(encoding) # Needs Python Unicode build !
> > 
> > I've written up extensive notes on the change and the history of the
> > issue here:
> > https://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale
> > 
> > Please let me know if there are any errors on that page!
> > 
> > The aim is to avoid strange behavior changes when running a script
> > within a shell pipeline/cronjob as opposed to at a tty (and to capture
> > some of the bizarre cornercases, for example, I found the behavior of
> > the pango/pygtk modules particularly surprising).
> > 
> > I mention it here as a "heads-up" about the change:
> >   - in case other distributions may want to do the same (or already do
> > so, though in my very brief survey no-one else seemed to), and
> >   - in case doing so breaks things in a way I'm not expecting; can
> > anyone see any flaws in my arguments?
> >   - in case other people find my notes on the issue useful
> > 
> > Hope this is helpful; can anyone see any potential problems with this
> > change?
> 
> Yes: such a change is unsupported by Python. The code you are
> changing should really have been removed many releases ago -
> it was originally only intended to serve as basis for experimentation
> on choosing the "right" default encoding.
> 
> The only supported default encodings in Python are:
> 
>  Python 2.x: ASCII
>  Python 3.x: UTF-8
> 
> If you change these, you are on your own and strange things will
> start to happen. The default encoding does not only affect
> the translation between Python and the outside world, but also
> all internal conversions between 8-bit strings and Unicode.
>
> Hacks like what's happening in the pango module (setting the
> default encoding to 'utf-8' by reloading the site module in
> order to get the sys.setdefaultencoding() API back) are just
> downright wrong and will cause serious problems since Unicode
> objects cache their default encoded representation.

Thanks for the feedback.

Note that pango isn't even doing the module reload hack; it's written in
C, and going in directly through the C API:
   PyUnicode_SetDefaultEncoding("utf-8");

I should mention that I've seen at least one C module in the wild that
exists merely to do this:

  #include <Python.h>
  void initutf8_please(void) {
 PyUnicode_SetDefaultEncoding("utf-8");
  }

so that the user could do "import utf8_please" at the top of their
scripts.

> If all you want to achieve is getting the encodings of
> stdout and stdin correctly setup for pipes, you should
> instead change the .encoding attribute of those (only).
Currently they are set up, but only when connected to a tty, which leads
to surprising behavior changes inside pipes/cronjobs (e.g. piping a
unicode string to "less" immediately breaks for code points above 127:
less is expecting locale-encoded bytes, but sys.stdout has encoding
"ASCII").

Similarly:
[da...@brick ~]$ python -c "import sys; print sys.stdout.encoding"
UTF-8
[da...@brick ~]$ python -c "import sys; print sys.stdout.encoding" | cat
None

Why only set an encoding on these streams when they're directly
connected to a tty?  I'll patch things to remove the isatty conditional
if that's acceptable.

(the tty-logic to do it appeared with the initial commit that added
locale-encoding support to sys.std[in|out], in sysmodule.c:
http://svn.python.org/view?view=rev&revision=32719
and was later moved from sysmodule.c to pythonrun.c:
http://svn.python.org/view?view=rev&revision=33817 
it later grew to affect stderr:
http://svn.python.org/view?view=rev&revision=43581
again, only if directly connected to a tty)
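
The check being discussed can be sketched as follows in Python 3 syntax. Note that this is an illustration only: describe_stream is a made-up helper, and in Python 3 the standard streams are given an encoding whether or not they are attached to a terminal, so the None shown above is specific to 2.x.

```python
import io
import sys

def describe_stream(stream):
    """Return (is_a_tty, encoding) for a file-like object."""
    is_tty = bool(getattr(stream, "isatty", lambda: False)())
    return is_tty, getattr(stream, "encoding", None)

# Varies depending on whether the script runs at a terminal or in a pipe.
print(describe_stream(sys.stdout))

# An in-memory text stream is never a tty and carries no encoding.
in_memory = describe_stream(io.StringIO())
print(in_memory)
```

Running the script once directly and once piped through cat shows how the first value of the tuple changes with the environment.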

Dave



Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread Martin v. Löwis
> Hope this is helpful; can anyone see any potential problems with this
> change?

As Marc-Andre says: such a change is unsupported, and *will* break Python.

It's not true that the only supported encoding in 2.x is 'ascii',
'iso-8859-1' is also supported. 'utf-8' is not, neither is anything
else.

The key problem is that objects that compare equal should also hash
equal. String and Unicode hashing has been constructed so that byte
strings hash the same as if interpreted as latin-1. If, say, utf-8
would be the system encoding, then, for some values of S,

   S == unicode(S) and hash(S) != hash(unicode(S))

That, in turn, *will* break dictionaries.
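
The argument can be modelled with a toy example in Python 3 syntax. The toy_hash function below is NOT CPython's hash algorithm; it is a made-up stand-in that depends only on the sequence of integer element values, which is the property that matters here.

```python
def toy_hash(values):
    """Toy polynomial hash over a sequence of integers (not CPython's algorithm)."""
    h = 0
    for v in values:
        h = h * 31 + v
    return h

raw = b"caf\xc3\xa9"                     # UTF-8 bytes for 'café'

as_latin1 = raw.decode("latin-1")        # one code point per byte, same values
as_utf8 = raw.decode("utf-8")            # 0xC3 0xA9 collapse into U+00E9

byte_hash = toy_hash(raw)                # iterating bytes yields ints
latin1_hash = toy_hash(map(ord, as_latin1))
utf8_hash = toy_hash(map(ord, as_utf8))

print(byte_hash == latin1_hash)
print(byte_hash == utf8_hash)
```

Under a Latin-1 interpretation the byte string and its Unicode counterpart contain the same numbers and so hash alike; under a UTF-8 interpretation the two would compare equal while containing different numbers, so a hash of this kind cannot agree for both.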

Regards,
Martin


Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread M.-A. Lemburg
David Malcolm wrote:
> I'm thinking of making this downstream change to Fedora's site.py (and
> possibly in future RHEL releases) so that the default encoding
> automatically picks up the encoding from the locale:
> 
>  def setencoding():
>      """Set the string encoding used by the Unicode implementation.  The
>      default is 'ascii', but if you're willing to experiment, you can
>      change this."""
>      encoding = "ascii" # Default value set by _PyUnicode_Init()
> -    if 0:
> +    if 1:
>          # Enable to support locale aware default string encodings.
>          import locale
>          loc = locale.getdefaultlocale()
>          if loc[1]:
>              encoding = loc[1]
>      if 0:
>          # Enable to switch off string to Unicode coercion and implicit
>          # Unicode to string conversion.
>          encoding = "undefined"
>      if encoding != "ascii":
>          # On Non-Unicode builds this will raise an AttributeError...
>          sys.setdefaultencoding(encoding) # Needs Python Unicode build !
> 
> I've written up extensive notes on the change and the history of the
> issue here:
> https://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale
> 
> Please let me know if there are any errors on that page!
> 
> The aim is to avoid strange behavior changes when running a script
> within a shell pipeline/cronjob as opposed to at a tty (and to capture
> some of the bizarre cornercases, for example, I found the behavior of
> the pango/pygtk modules particularly surprising).
> 
> I mention it here as a "heads-up" about the change:
>   - in case other distributions may want to do the same (or already do
> so, though in my very brief survey no-one else seemed to), and
>   - in case doing so breaks things in a way I'm not expecting; can
> anyone see any flaws in my arguments?
>   - in case other people find my notes on the issue useful
> 
> Hope this is helpful; can anyone see any potential problems with this
> change?

Yes: such a change is unsupported by Python. The code you are
changing should really have been removed many releases ago -
it was originally only intended to serve as basis for experimentation
on choosing the "right" default encoding.

The only supported default encodings in Python are:

 Python 2.x: ASCII
 Python 3.x: UTF-8

If you change these, you are on your own and strange things will
start to happen. The default encoding does not only affect
the translation between Python and the outside world, but also
all internal conversions between 8-bit strings and Unicode.

Hacks like what's happening in the pango module (setting the
default encoding to 'utf-8' by reloading the site module in
order to get the sys.setdefaultencoding() API back) are just
downright wrong and will cause serious problems since Unicode
objects cache their default encoded representation.

Please don't enable the use of a locale based default encoding.

If all you want to achieve is getting the encodings of
stdout and stdin correctly setup for pipes, you should
instead change the .encoding attribute of those (only).

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread Nick Coghlan
David Malcolm wrote:
> I've written up extensive notes on the change and the history of the
> issue here:
> https://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale
> 
> Please let me know if there are any errors on that page!

That discussion appears incomplete without any mention of
sys.getfilesystemencoding().

There is also no analysis to say why you believe it is better to change
the default encoding for all applications as you have done rather than
changing the calculation of encoding of the sys.stdin/out/err streams
when not running from a terminal.

Hopefully MvL and other folks more Unicode-savvy than I will comment,
but in the meantime see previous comments such as this one from MvL:
http://mail.python.org/pipermail/python-dev/2004-August/048496.html

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


[Python-Dev] Proposed downstream change to site.py in Fedora (sys.defaultencoding)

2010-01-20 Thread David Malcolm
I'm thinking of making this downstream change to Fedora's site.py (and
possibly in future RHEL releases) so that the default encoding
automatically picks up the encoding from the locale:

 def setencoding():
     """Set the string encoding used by the Unicode implementation.  The
     default is 'ascii', but if you're willing to experiment, you can
     change this."""
     encoding = "ascii" # Default value set by _PyUnicode_Init()
-    if 0:
+    if 1:
         # Enable to support locale aware default string encodings.
         import locale
         loc = locale.getdefaultlocale()
         if loc[1]:
             encoding = loc[1]
     if 0:
         # Enable to switch off string to Unicode coercion and implicit
         # Unicode to string conversion.
         encoding = "undefined"
     if encoding != "ascii":
         # On Non-Unicode builds this will raise an AttributeError...
         sys.setdefaultencoding(encoding) # Needs Python Unicode build !

I've written up extensive notes on the change and the history of the
issue here:
https://fedoraproject.org/wiki/Features/PythonEncodingUsesSystemLocale

Please let me know if there are any errors on that page!

The aim is to avoid strange behavior changes when running a script
within a shell pipeline/cronjob as opposed to at a tty (and to capture
some of the bizarre cornercases, for example, I found the behavior of
the pango/pygtk modules particularly surprising).

I mention it here as a "heads-up" about the change:
  - in case other distributions may want to do the same (or already do
so, though in my very brief survey no-one else seemed to), and
  - in case doing so breaks things in a way I'm not expecting; can
anyone see any flaws in my arguments?
  - in case other people find my notes on the issue useful

Hope this is helpful; can anyone see any potential problems with this
change?

Dave




Re: [Python-Dev] Bazaar branches available (again) on Launchpad

2010-01-20 Thread Nick Coghlan
David Lyon wrote:
> On 2; Who knows what their life cycle is. CVS is pretty much
>   dead, and svn looks like it is on the way out.
>   I can't think of how anything could be better than
>   mercurial or bzr but I know I will be proved wrong.

I believe you misunderstood what Matthieu meant by life cycle there:
think "release cycle". If a project pushes out new releases
significantly more often than every 18-24 months (as is currently true
for all of the major SCM tools), then that fact alone makes it a very
bad fit for the Python standard library.

And centralised source control will be going strong for years. The DVCS
approach may be great for the open source world, but the gains are far
more limited in a closed source shop (especially a group writing
internal corporate applications which doesn't need to keep many, if any,
maintenance branches going).

If we weren't dealing with 4 active branches, the DVCS discussion would
have got a lot less traction with the core developers - aside from
better handling of multiple lines of development, most of the benefits
of the switch to a DVCS accrue to people without commit access to the
SVN repository.

Anyway, we've wandered far afield from legit python-dev topics now. Any
further ideas about super_mega_easy_install functionality that can pull
code from source control systems and build it rather than requiring
prebuild source tarballs should be directed to python-ideas (they
probably need to bake more even before they make an appearance on
distutils-sig).

Cheers,
Nick.

P.S. As Jesse said... your enthusiasm is great, but please don't assume
that some inherent conservatism on the part of other developers is
automatically evil or the result of a failure to see your point. A lot
of people around the world rely on our stuff every day. We owe it to
them to be measured in our actions and to put serious thought into any
major changes or additions we make to the language and the standard
library. For the current stage of its development, Python 3 is in a good
place from our point of view - its major carrot has really always been
the better Unicode support it offers, and the ever-increasing
globalisation of the web will create more and more pressure pushing
developers in that direction as the years go by. Sure, Python 3 cleans
up assorted other things as well, but the change to the text processing
model is the big one that is fundamentally incompatible with the
architecture of the 2.x series. Compared to that change, everything else
is just tinkering.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
---


Re: [Python-Dev] Bazaar branches available (again) on Launchpad

2010-01-20 Thread Antoine Pitrou
David Lyon writes:
> 
> I think I'll have better chances with PEPs.
> 
> Being honest, if wonderful libraries like Sphinx and Mercurial
> and Git and BZR can't make it into the stdlib, then there is
> no hope for even newer code to get in there.
> [snip]

This is python-ideas material, can you take it there? Thank you.

Antoine.




Re: [Python-Dev] Bazaar branches available (again) on Launchpad

2010-01-20 Thread Stephen J. Turnbull
Barry Warsaw writes:

 > (Besides, git in the stdlib doesn't make much sense :).

"Dulwich."


Re: [Python-Dev] E3 BEFEHLE

2010-01-20 Thread Anand Balachandran Pillai
On Sun, Jan 17, 2010 at 9:42 AM, Dj Gilcrease  wrote:

> 2010/1/16 Jack Diederich :
> > Good lord, did this make it past other people's spam filters too?  I
> > especially liked the reference to "REGION -2,0 ; Rlyeh".  Ph'nglui
> > mglw'nafh Cthulhu R'lyeh wgah'nagl fhtagn to you too sir.
>
> Ya, made it past mine too; it looks like a debug dump of a macro for
> some German game based on either LOTR or Cthulhu
>

I initially thought it was a Python disassembler trace of some sequence
of operations which failed, converted to German.

In fact, I was looking for a question at the end regarding REPL.
How very optimistic...





-- 
--Anand


Re: [Python-Dev] Bazaar branches available (again) on Launchpad

2010-01-20 Thread Matthieu Brucher
> That's only two points. :-)

In French, we say that several starts with 2 ;)

> On 1; If that's true, I won't mention git again.

It is; you can check the git repository (it's a mix of C, Perl,
shell scripts, Python, ...)

> On 2; Who knows what their life cycle is.

You can check on their websites; their release cycles are far shorter
than Python minor releases (several months vs several years).

Matthieu
-- 
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher