Re: [Python-Dev] [Python-3000] What should the focus for 2.6 be?

2006-08-21 Thread Jean-Paul Calderone
On Mon, 21 Aug 2006 14:21:30 -0700, Josiah Carlson <[EMAIL PROTECTED]> wrote:
>
>Talin <[EMAIL PROTECTED]> wrote:
>[snip]
>> I've been thinking about the transition to unicode strings, and I want
>> to put forward a notion that might allow the transition to be done
>> gradually instead of all at once.
>>
>> The idea would be to temporarily introduce a new name for 8-bit strings
>> - let's call it "ascii". An "ascii" object would be exactly the same as
>> today's 8-bit strings.
>
>There are two parts to the unicode conversion; all literals are unicode,
>and we don't have strings anymore, we have bytes.  Without offering the
>bytes object, then people can't really convert their code.  String
>literals can be handled with the -U command line option (and perhaps
>having the interpreter do the str=unicode assignment during startup).
>

A third step would ease this transition significantly: a unicode_literals 
__future__ import.
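
As a rough illustration of what such an import involves: the standard `__future__` module records, for each feature, the release in which it became optional and the release in which it becomes mandatory. A `unicode_literals` entry did later land in that module, so the sketch below simply inspects it. The helper name `describe_feature` is made up for this example.

```python
import __future__


def describe_feature(name):
    """Return (optional_release, mandatory_release) as (major, minor) pairs."""
    feature = getattr(__future__, name)
    optional = feature.getOptionalRelease()[:2]
    mandatory = feature.getMandatoryRelease()[:2]
    return optional, mandatory


if __name__ == "__main__":
    # Shows when unicode_literals became available and when it became the default.
    print(describe_feature("unicode_literals"))
```

In a module, the feature itself would be enabled with `from __future__ import unicode_literals` as the first statement of the file.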

>
>Here's my suggestion: every feature, syntax, etc., that is slated for
>Py3k, let us release bit by bit in the 2.x series.  That lets the 2.x
>series evolve into the 3.x series in a somewhat more natural way than
>the currently proposed *everything breaks*.  If it takes 1, 2, 3, or 10
>more releases in the 2.x series to get to all of the 3.x features, great.
>At least people will have a chance to convert, or at least write correct
>code for the future.

This really seems like the right idea.  "Shoot the moon" upgrades are
almost always worse than incremental upgrades.

The incremental path is better for everyone involved.  For developers of
Python, it gets more people using and providing feedback on the new
features being developed.  For developers with Python, it keeps the scope
of a particular upgrade more manageable, letting them focus on a
much smaller set of changes to be made to their application.

Jean-Paul
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What should the focus for 2.6 be?

2006-08-21 Thread Josiah Carlson

Talin <[EMAIL PROTECTED]> wrote:
[snip]
> I've been thinking about the transition to unicode strings, and I want 
> to put forward a notion that might allow the transition to be done 
> gradually instead of all at once.
> 
> The idea would be to temporarily introduce a new name for 8-bit strings 
> - let's call it "ascii". An "ascii" object would be exactly the same as 
> today's 8-bit strings.

There are two parts to the unicode conversion; all literals are unicode,
and we don't have strings anymore, we have bytes.  Without offering the
bytes object, then people can't really convert their code.  String
literals can be handled with the -U command line option (and perhaps
having the interpreter do the str=unicode assignment during startup).


In any case, as I look at Py3k and the future of Python, in each release,
I ask "what are the compelling features that make me want to upgrade?"
In each of the 1.5-2.5 series that I've looked at, each has had some
compelling feature or another that has basically required that I upgrade,
or seriously consider upgrading (bugfixes for stuff that has bitten me,
new syntax that I use, significant increases in speed, etc.).

As we approach Py3k, I again ask, "what are the compelling features?"
Wholesale breakage of anything that uses ascii strings as text or binary
data? A completely changed IO stack (requiring re-learning of everything
known about Python IO)?  Dictionary .keys(), .values(), and .items()
being their .iter*() equivalents (making it just about impossible to
optimize for Py3k dictionary behavior now)?

I understand getting rid of the cruft, really I do (you should see some
cruft I've been replacing lately). But some of that cruft is useful, or
really, some of that cruft has no alternative currently, which will
require significant rewrites of user code when Py3k is released.  When
everyone has to rewrite their code, they are going to ask, "Why don't I
just stick with the maintenance 2.x? It's going to be maintained for a
few more years yet, and I don't need to rewrite all of my disk IO,
strings in dictionary code, etc."  I will be right along with them (no
offense intended to those currently working towards py3k).

I can code defensively against buffer-saturating DOS attacks with my
socket code, but I can't code defensively to handle some (never mind all)
of the changes and incompatibilities that Py3k will bring.

Here's my suggestion: every feature, syntax, etc., that is slated for
Py3k, let us release bit by bit in the 2.x series.  That lets the 2.x
series evolve into the 3.x series in a somewhat more natural way than
the currently proposed *everything breaks*.  If it takes 1, 2, 3, or 10
more releases in the 2.x series to get to all of the 3.x features, great.
At least people will have a chance to convert, or at least write correct
code for the future.

Say 2.6 gets bytes and special factories (or a special encoding argument)
for file/socket to return bytes instead of strings, and only accept
bytes objects to .write() methods (unless an encoding on the file, etc.,
was previously given). Given these bytes objects, it may even make sense
to offer the .readinto() method that Alex B has been asking for (which
would make 3 built-in objects that could reasonably support readinto:
bytes, array, mmap).
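
To make the readinto() idea concrete, here is a minimal sketch. It assumes the `io` module and the mutable `bytearray` type from later Python versions (neither existed when this was written), and `fill_buffers` is a hypothetical helper, not a proposed API. It fills the three kinds of buffer mentioned above from the same byte stream.

```python
import array
import io
import mmap


def fill_buffers(payload):
    """Read `payload` into three preallocated buffers via readinto()."""
    results = {}

    # A mutable bytes-like buffer.
    buf = bytearray(len(payload))
    n = io.BytesIO(payload).readinto(buf)
    results["bytearray"] = bytes(buf[:n])

    # An array of unsigned bytes, zero-filled to the right length.
    arr = array.array("B", bytes(len(payload)))
    n = io.BytesIO(payload).readinto(arr)
    results["array"] = arr.tobytes()[:n]

    # An anonymous memory map (not backed by a file).
    with mmap.mmap(-1, len(payload)) as mm:
        n = io.BytesIO(payload).readinto(mm)
        results["mmap"] = mm[:n]

    return results


if __name__ == "__main__":
    print(fill_buffers(b"hello"))
```

The point of readinto() is that the caller owns the buffer, so repeated reads can reuse it instead of allocating a new string each time.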

If the IO library is available for 2.6, toss that in there, or offer it
in PyPI as an evolving library.

I would suggest pushing off the dict changes until 2.7 or later, as
there are 340+ examples of dict.keys() in the Python 2.5b2 standard
library, at least half of which are going to need to be changed to
list(dict.keys()) or otherwise.  The breakage in user code will likely
be at least as substantial.
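
A small sketch of why those call sites need touching, assuming Py3k-style semantics in which keys() no longer returns a list. The function name `snapshot_vs_view` is invented for this example.

```python
def snapshot_vs_view():
    """Contrast a live keys() result with an explicit list() snapshot."""
    inventory = {"pear": 3, "apple": 5}

    view = inventory.keys()            # no longer a list under Py3k semantics
    snapshot = list(inventory.keys())  # the spelling old list-based code needs

    inventory["fig"] = 1               # mutate after taking both

    # The view tracks the dict; the snapshot does not. List-only methods
    # such as sort() are missing from the view.
    return sorted(view), sorted(snapshot), hasattr(view, "sort")


if __name__ == "__main__":
    print(snapshot_vs_view())
```

Code that only iterates over keys() keeps working unchanged; code that sorts, indexes, or mutates the result in place is what needs the list() wrapper.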


Those are just examples that come to mind now, but I'm sure there are
other changes with similar issues.

 - Josiah



Re: [Python-Dev] What should the focus for 2.6 be?

2006-08-21 Thread Brett Cannon
On 8/20/06, Barry Warsaw <[EMAIL PROTECTED]> wrote:
> On Aug 20, 2006, at 11:24 AM, Guido van Rossum wrote:
>
> > I wonder if it would make sense to focus in 2.6 on making porting of
> > 2.6 code to 3.0 easier, rather than trying to introduce new features
> > in 2.6. We've done releases without new language features before;
> > notable 2.3 didn't add anything new (except making a few __future__
> > imports redundant) and concentrated on bugfixes, performance, and
> > library additions.
>
> +1, and there are other benefits to this approach too.
>
> First, the pace of change appears to slow, which addresses another
> source of complaints.  Because instead of a slew of new features
> every 18 months, we really see that slew only every three years, with
> a stabilizing and bug fixing release in between.  Another benefit is
> that with a de-emphasis on new features, we can spend more time
> improving the library and documentation.

I think fixing tests and documentation would be a great thing to focus
2.6 on.  Not glamorous, I know, but it is needed.

For tests, I hope to get some decorators and such written that will help
classify tests.  Also adding a function to denote what module is being
tested would be good (to avoid the issue of a dependent import for
testing failing and then everyone just thinking the test was skipped).
Lastly, testing the C API using ctypes would be really good since it is
not thoroughly tested.

As for the docs, they just need a thorough updating.  Whether we should
come up with some other format for Py3K, one with better semantic
information that is easier to read, is another question entirely.

-Brett


Re: [Python-Dev] What should the focus for 2.6 be?

2006-08-21 Thread Guido van Rossum
I'll keep this in mind -- with the caveat that Georg mentioned.

For the next 96 hours I'm going to be severely limited in bandwidth
due to the physical requirements of the sprint at Google. I'd
appreciate not receiving too much email during this period...

--Guido

On 8/20/06, Talin <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> > I've been thinking a bit about a focus for the 2.6 release.
> >
> > We are now officially starting parallel development of 2.6 and 3.0. I
> > really don't expect that we'll be able to merge the easily into the
> > 3.0 branch much longer, so effectively 3.0 will be a fork of 2.5.
> >
> > I wonder if it would make sense to focus in 2.6 on making porting of
> > 2.6 code to 3.0 easier, rather than trying to introduce new features
> > in 2.6. We've done releases without new language features before;
> > notable 2.3 didn't add anything new (except making a few __future__
> > imports redundant) and concentrated on bugfixes, performance, and
> > library additions.
>
> I've been thinking about the transition to unicode strings, and I want
> to put forward a notion that might allow the transition to be done
> gradually instead of all at once.
>
> The idea would be to temporarily introduce a new name for 8-bit strings
> - let's call it "ascii". An "ascii" object would be exactly the same as
> today's 8-bit strings.
>
> The 'str' builtin symbol would be assigned to 'ascii' by default, but
> you could assign it to 'unicode' if you wanted to default to wide strings:
>
> str = ascii   # Selects 8-bit strings by default
> str = unicode # Selects unicode strings by default
>
> In order to make the transition, what you would do is to temporarily
> undefine the 'str' symbol from the code base - in other words, remove
> 'str' from the builtin namespace, and then migrate all of the code --
> replacing any library reference to 'str' with a reference to 'ascii'
> *or* updating that function to deal with unicode strings. Once you get
> all of the unit tests running again, you can re-introduce 'str', but now
> you know that since none of the libraries refer to 'str' directly, you
> can safely change its definition.
>
> All of this could be done while retaining compatibility with existing
> 3rd party code - as long as 'str = ascii' is defined. So you turn it on
> to run your Python programs, and turn it off when you want to work on
> 3.0 migration.
>
> The next step (which would not be backwards compatible) would be to
> gradually remove 'ascii' from the code base -- wherever that name
> occurs, it would be a signal that the function needs to be updated to
> use 'unicode' instead.
>
> Finally, once the last occurrence of 'ascii' is removed, the final step
> is to do a search and replace of all occurrences of 'unicode' with 'str'.
>
> I know this seems round-about, and is more work than doing it all in one
> shot. However, I know from past experience that the trickiest part of
> doing a pervasive change to a code base like this is just keeping track
> of what parts have been migrated and what parts have not. Many times in
> the past I've changed the definition of a ubiquitous type by temporarily
> renaming it, thus vacating the old name so that it can be defined anew,
> without conflict.
>
> -- Talin


-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] A cast from Py_ssize_t to long

2006-08-21 Thread Thomas Wouters
On 8/21/06, Alexander Belopolsky <[EMAIL PROTECTED]> wrote:
> There are also cases of implicit casts like this that were not
> caught so far:
>
> static Py_ssize_t
> mmap_buffer_getreadbuf(mmap_object *self, Py_ssize_t index, const void **ptr)
> {
>     ...
>     return self->size;
> }
>
> static Py_ssize_t
> mmap_buffer_getwritebuf(mmap_object *self, Py_ssize_t index, const void **ptr)
> {
>     ...
>     return self->size;
> }
>
> I don't have any system with sizeof(size_t) != sizeof(long), but it
> may be worth the effort to review the warnings on such a system.

GCC on a LP64 machine does not generate warnings for the above code. It
doesn't have anything to do with 64-bit or 32-bit anyway, since
Py_ssize_t and size_t are supposed to be the same size. They just differ
in signedness, and that's the case on any system. Even if those
functions were defined to return longs instead of Py_ssize_t's, GCC
wouldn't generate a warning; it's a valid (implicit) downcast that might
lose bits. I believe there's a Windows compiler that does warn for such
cases, but if so, I'm sure it generates a whole lot of spurious ones at
the moment.

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


Re: [Python-Dev] A cast from Py_ssize_t to long

2006-08-21 Thread Alexander Belopolsky

On Aug 21, 2006, at 10:34 AM, Thomas Wouters wrote:

>
> Is there a simple automated way to detect situations like this? Maybe
> there is a win64 compiler that would generate a warning.
>
> I doubt it. Explicit casts are meant to silence warnings (among  
> other things.) Warning for all casts is bound to generate quite a  
> lot of warnings.
>

There are also cases of implicit casts like this that were not
caught so far:

static Py_ssize_t
mmap_buffer_getreadbuf(mmap_object *self, Py_ssize_t index, const void **ptr)
{
    ...
    return self->size;
}

static Py_ssize_t
mmap_buffer_getwritebuf(mmap_object *self, Py_ssize_t index, const void **ptr)
{
    ...
    return self->size;
}

I don't have any system with sizeof(size_t) != sizeof(long), but it
may be worth the effort to review the warnings on such a system.




Re: [Python-Dev] A cast from Py_ssize_t to long

2006-08-21 Thread Thomas Wouters
On 8/21/06, Alexander Belopolsky <[EMAIL PROTECTED]> wrote:
> Here is a similar problem:
>
> typedef struct {
>    ...
>    size_t  pos;
>    ...
> } mmap_object;
> ...
> mmap_tell_method(mmap_object *self, PyObject *unused)
> {
>     CHECK_VALID(NULL);
>     return PyInt_FromLong((long) self->pos);
> }
>
> See Modules/mmapmodule.c .
>
> Here a cast to ssize_t would, technically speaking, not be safe
> either, but it may be worth using ssize_t anyways.

It should call PyInt_FromSize_t, without any casting. That will make it
a PyLong if it's bigger than a Py_ssize_t, too.

> Is there a simple automated way to detect situations like this? Maybe
> there is a win64 compiler that would generate a warning.

I doubt it. Explicit casts are meant to silence warnings (among other
things.) Warning for all casts is bound to generate quite a lot of
warnings.

-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


Re: [Python-Dev] A cast from Py_ssize_t to long

2006-08-21 Thread Alexander Belopolsky
On 8/21/06, Thomas Wouters <[EMAIL PROTECTED]> wrote:
[snip]
> > Is this a bug?
[snap]
> Yes. Py_ssize_t can be bigger than a long (on LLP64 systems, such as Win64).

Here is a similar problem:
typedef struct {
   ...
   size_t  pos;
   ...
} mmap_object;
...
mmap_tell_method(mmap_object *self, PyObject *unused)
{
    CHECK_VALID(NULL);
    return PyInt_FromLong((long) self->pos);
}

See Modules/mmapmodule.c .

Here a cast to ssize_t would, technically speaking, not be safe
either, but it may be worth using ssize_t anyways.
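
To see why the cast to ssize_t is not strictly safe, the reinterpretation can be simulated with ctypes, which performs no overflow checking on its integer types. This is only a sketch: `as_ssize_t` is a hypothetical helper, and the limits are derived from whatever platform runs it.

```python
import ctypes

# Largest values representable in the platform's size_t and ssize_t.
SIZE_MAX = ctypes.c_size_t(-1).value
SSIZE_MAX = SIZE_MAX // 2


def as_ssize_t(value):
    """Reinterpret an unsigned size_t value as ssize_t, as a C cast would."""
    unsigned = ctypes.c_size_t(value).value
    return ctypes.c_ssize_t(unsigned).value


if __name__ == "__main__":
    print(as_ssize_t(SSIZE_MAX))      # still fits
    print(as_ssize_t(SSIZE_MAX + 1))  # wraps to a negative number
```

Any position above SSIZE_MAX comes out negative, which is why the cast only helps for values in the lower half of the size_t range.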

Is there a simple automated way to detect situations like this? Maybe
there is a win64 compiler that would generate a warning.


[Python-Dev] Questions on unittest behaviour

2006-08-21 Thread Jim Jewett
Collin Winter:

> [ improvements to unittest]

These would all have been (mostly) reasonable when the module was
first added to the standard library.  But it has been there as an ugly
fragile unchanging beast for several years, and is ironically not well
endowed with unit tests.  I therefore think it might be too late for
these changes.

> 2) The TestLoader.testMethodPrefix attribute currently allows anything
> to be assigned to it, including invalid objects and the empty string.
>  While the former will cause errors ... you get things like assertEqual(),
> failUnlessEqual(), etc, when TestLoader.loadTestsFromTestCase() is run.

It makes the interface ugly, and learning harder, but for code this
old, either people catch it with the first actual run of their test
suites, or a subclass is handling it (perhaps to trigger a default, or
to call functions through a foreign interface).

> 3) TestLoader.loadTestsFromTestCase() accepts objects that are not
> test cases and will happily look for appropriately-named methods on
> any object you give it. This flexibility should be documented, or
> proper input validation should be done (a bug fix for 2.5).

People are probably using this, so I think a deprecation in 2.6 is the
fastest it should change.

> 4) TestLoader.loadTestsFromName() (and by extension,
> loadTestsFromNames(), too) raises an AttributeError if the name is the
> empty string because -- as it correctly asserts -- the object does not
> contain an attribute named ''. I recommend that this be tested for and
> ValueError be raised (bug fix for 2.5).

This seems reasonable for new code, but not this late in 2.5 -- if
people are going through an auto-generated list of attributes, they
may well be catching AttributeError and not ValueError.

> 5) When TestLoader.loadTestsFrom{Name,Names}() are given a name that
> resolves to a classmethod on a TestCase subclass, the method is not
> invoked.  ...
> It is not documented which of these tests takes priority: is the
> classmethod "a test method within a test case class" or is it a
> callable? The same issue applies to staticmethods as well.

Documentation probably should get added.

-jJ


Re: [Python-Dev] A cast from Py_ssize_t to long

2006-08-21 Thread Thomas Wouters
On 8/21/06, Alexander Belopolsky <[EMAIL PROTECTED]> wrote:
> On Aug 15, 2006, at 3:16 AM, Martin v. Löwis wrote:
> >> Where does it assume that it is safe to cast ssize_t -> long?
> > That would be a bug.
>
> Is this a bug?
>
> file_readinto(PyFileObject *f, PyObject *args)
> {
>     ...
>     Py_ssize_t ndone, nnow;
>     ...
>     return PyInt_FromLong((long)ndone);
> }
>
> See Objects/fileobject.c (revision 51420).

Yes. Py_ssize_t can be bigger than a long (on LLP64 systems, such as
Win64). It doesn't matter on other systems, and you have to read more
than 31 bits worth of data to detect it even on Win64, but it's still a
bug. file_readinto should be using PyInt_FromSsize_t() instead. (There
is the SAFE_DOWNCAST macro for cases where we know the *value* of the
(s)size_t will always fit in a long, on any supported system, but that
isn't the case here.)

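
The truncation can be simulated from Python with ctypes, assuming a 32-bit C long as on Win64 (LLP64). This is a sketch of the arithmetic only, not of the CPython code path; `cast_to_32bit_long` is a hypothetical helper.

```python
import ctypes


def cast_to_32bit_long(n):
    """Mimic `(long)n` on a platform where long is 32 bits wide."""
    # ctypes integer types do no overflow checking; excess bits are dropped.
    return ctypes.c_int32(n).value


if __name__ == "__main__":
    print(cast_to_32bit_long(2**31 - 1))  # largest byte count that survives
    print(cast_to_32bit_long(2**31))      # one more byte: the sign flips
    print(cast_to_32bit_long(2**32 + 5))  # high bits are silently lost
```

Counts up to 2**31 - 1 pass through unchanged, which matches the remark that more than 31 bits worth of data is needed before the bug shows up.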
-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!


Re: [Python-Dev] String formatting / unicode 2.5 bug?

2006-08-21 Thread Nick Coghlan
John J Lee wrote:
>> And once the result has been promoted to unicode, __unicode__ is used 
>> directly:
>>
>> >>> print repr("%s%s" % (a(), a()))
>> __str__
>> accessing <__main__.a object at 0x00AF66F0>.__unicode__
>> __str__
>> accessing <__main__.a object at 0x00AF6390>.__unicode__
>> __str__
>> u'hihi'
> 
> I don't understand this part.  Why is __unicode__ called?  Your example 
> doesn't appear to show this happening "once [i.e., because?] the result 
> has been promoted to unicode" -- if that were true, it would "stand to 
> reason"  that the interpreter would then conclude it should call
> __unicode__ for all remaining %s, and not bother with __str__.

It does try to call unicode directly, but because the example object doesn't 
supply __unicode__ it ends up falling back to __str__ instead. The behaviour 
is clearer when the example object provides both methods:

 >>> # Example (2.5b3)
 ... class a(object):
 ...     def __str__(self):
 ...         print "running __str__"
 ...         return u'hi'
 ...     def __unicode__(self):
 ...         print "running __unicode__"
 ...         return u'hi'
 ...
 >>> print repr("%s%s" % (a(), a()))
running __str__
running __unicode__
running __unicode__
u'hihi'

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] What should the focus for 2.6 be?

2006-08-21 Thread Georg Brandl
Talin wrote:
> Guido van Rossum wrote:
>> I've been thinking a bit about a focus for the 2.6 release.
>> 
>> We are now officially starting parallel development of 2.6 and 3.0. I
>> really don't expect that we'll be able to merge the easily into the
>> 3.0 branch much longer, so effectively 3.0 will be a fork of 2.5.
>> 
>> I wonder if it would make sense to focus in 2.6 on making porting of
>> 2.6 code to 3.0 easier, rather than trying to introduce new features
>> in 2.6. We've done releases without new language features before;
>> notable 2.3 didn't add anything new (except making a few __future__
>> imports redundant) and concentrated on bugfixes, performance, and
>> library additions.
> 
> I've been thinking about the transition to unicode strings, and I want 
> to put forward a notion that might allow the transition to be done 
> gradually instead of all at once.
> 
> The idea would be to temporarily introduce a new name for 8-bit strings 
> - let's call it "ascii". An "ascii" object would be exactly the same as 
> today's 8-bit strings.
> 
> The 'str' builtin symbol would be assigned to 'ascii' by default, but 
> you could assign it to 'unicode' if you wanted to default to wide strings:
> 
> str = ascii   # Selects 8-bit strings by default
> str = unicode # Selects unicode strings by default

This doesn't change the type of string literals.
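
A minimal sketch of the point, run under Python 3 semantics with `bytes` standing in for the proposed "ascii" type (both are assumptions made for illustration). Rebinding the name only changes what `str` looks up to; the type of a literal is decided when the code is compiled.

```python
import builtins


def literal_type_after_rebinding():
    """Rebind the name `str` locally and report what type a literal has."""
    str = bytes      # stand-in for "str = ascii"; affects name lookup only
    literal = "abc"  # the literal's type was fixed when this was compiled
    return type(literal), str


if __name__ == "__main__":
    literal_type, rebound = literal_type_after_rebinding()
    print(literal_type is builtins.str)  # the literal ignores the rebinding
    print(rebound is bytes)              # only the name changed
```

So an assignment like `str = unicode` would cover explicit `str(...)` calls and isinstance checks, but literals need a separate switch such as -U or a __future__ import.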

Georg
