Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Greg Ewing
Raymond Hettinger wrote:

 Like autodict could mean anything.

Everything is meaningless until you know something
about it. If you'd never seen Python before,
would you know what 'dict' meant?

If I were seeing defaultdict for the first time,
I would need to look up the docs before I was
confident I knew exactly what it did -- as I've
mentioned before, my initial guess would have
been wrong. The same procedure would lead me to
an understanding of 'autodict' just as quickly.

Maybe 'autodict' isn't the best term either --
I'm open to suggestions. But my instincts still
tell me that 'defaultdict' is the best term
for something *else* that we might want to add
one day as well, so I'm just trying to make
sure we don't squander it lightly.

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Greg Ewing
Josiah Carlson wrote:

 In this particular example, there is no net reduction in line use. The
 execution speed of your algorithm would be reduced due to function
 calling overhead.

If there were more uses of the function, the line count
reduction would be greater.

In any case, line count and execution speed aren't the
only issues -- there is DRY to consider.

--
Greg


Re: [Python-Dev] buildbot vs. Windows

2006-02-22 Thread Neal Norwitz
On 2/21/06, Neal Norwitz [EMAIL PROTECTED] wrote:

 I agree with this, but don't know a clean way to do 2 builds.  I
 modified buildbot to:

  - Stop doing the second (without deleting .py[co]) run.
  - Do one run with a debug build.
  - Use -uall -r for both.

I screwed it up, so now it does:

  - Do one run with a debug build.
  - Use -uall -r for both.
  - Still does the second (deleting .py[co]) run

I couldn't think of a simple way to figure out that on most unixes the
program is called python, but on Mac OS X, it's called python.exe.  So
I reverted back to using make testall.  We can make a new test target
to only run once.

I also think I know how to do the double builds (one release and one
debug).  But it's too late for me to change it tonight without
screwing it up.

The good/bad news after this change is:

http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/145/step-test/0

A seg fault on Mac OS when running with -r. :-(

n


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Steve Holden
Greg Ewing wrote:
 Raymond Hettinger wrote:
 
 
Like autodict could mean anything.
 
 
 Everything is meaningless until you know something
 about it. If you'd never seen Python before,
 would you know what 'dict' meant?
 
 If I were seeing defaultdict for the first time,
 I would need to look up the docs before I was
 confident I knew exactly what it did -- as I've
 mentioned before, my initial guess would have
 been wrong. The same procedure would lead me to
 an understanding of 'autodict' just as quickly.
 
 Maybe 'autodict' isn't the best term either --
 I'm open to suggestions. But my instincts still
 tell me that 'defaultdict' is the best term
 for something *else* that we might want to add
 one day as well, so I'm just trying to make
 sure we don't squander it lightly.
 
Given that the default entries behind the non-existent keys don't 
actually exist, something like virtual_dict might be appropriate.

Or phantom_dict, or ghost_dict.

I agree that the naming of things is important.

regards
  Steve
-- 
Steve Holden   +44 150 684 7255  +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006  www.python.org/pycon/



Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Greg Ewing
Steve Holden wrote:

 Given that the default entries behind the non-existent keys don't 
 actually exist, something like virtual_dict might be appropriate.

No, that would suggest to me something like
a wrapper object that delegates most of the
mapping protocol to something else. That's
even less like what we're discussing.

In our case the default values are only
virtual until you use them, upon which they
become real. Sort of like a wave function
collapse... hmmm... I suppose 'heisendict'
wouldn't fly, would it?

--
Greg


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Fuzzyman




Greg Ewing wrote:

  Fuzzyman wrote:

  
  
I've had problems in code that needs to treat strings, lists and
dictionaries differently (assigning values to a container where all
three need different handling); telling the difference while allowing
duck typing is *problematic*.

  
  
You need to rethink your design so that you don't
have to make that kind of distinction.


Well... to *briefly* explain the use case, it's for value assignment in
ConfigObj.

It basically accepts as valid values strings and lists of strings [#]_.
You can also create new subsections by assigning a dictionary.

It needs to be able to recognise lists in order to check that each list
member is a string. (See the note below: it still needs to be able to
recognise lists when writing, even if it is not doing type checking on
assignment.)

It needs to be able to recognise dictionaries in order to create a new
section instance (rather than directly assigning the dictionary).

This is *terribly* convenient for the user (trivial example of creating
a new config file programmatically) :

from configobj import ConfigObj
cfg = ConfigObj(newfilename)
cfg['key'] = 'value'
cfg['key2'] = ['value1', 'value2', 'value3']
cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2',
'value3']}
cfg.write()

Writes out :

key = value
key2 = value1, value2, value3
[section]
key = value
key2 = value1, value2, value3

(Note none of those values needed quoting, so they aren't.)

Obviously I could force the creation of sections and the assignment of
list values to use separate methods, but it's much less readable and
unnecessary.

The code as is works and has a nice API. It still needs to be able to
tell what *type* of value is being assigned.

Mapping and sequence protocols are so loosely defined that in order to
support 'list like objects' and 'dictionary like objects' some
arbitrary decision about what methods they should support has to be
made. (For example a read only mapping container is unlikely to
implement __setitem__ or methods like update).

At first we defined a mapping object as one that defines __getitem__
and keys (not update as I previously said), and list like objects as
ones that define __getitem__ and *not* keys. For strings we required a
basestring subclass. In the end I think we ripped this out and just
settled on isinstance tests.
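
The duck-typed discrimination Michael describes could be sketched as
follows (the function name and the raised error are illustrative, not
ConfigObj's actual API; the heuristic is the one from the thread:
strings first, then "has keys()" means mapping, then "__getitem__
without keys()" means sequence):

```python
def classify(value):
    # illustrative heuristic, not ConfigObj's real code:
    # strings are checked first, since str also has __getitem__
    if isinstance(value, str):
        return "string"
    if hasattr(value, "__getitem__"):
        # mapping-like objects expose keys(); sequence-like ones don't
        if hasattr(value, "keys"):
            return "mapping"
        return "sequence"
    raise TypeError("unsupported value: %r" % (value,))

print(classify("abc"))       # string
print(classify({"a": 1}))    # mapping
print(classify(["a", "b"]))  # sequence
```

As the thread notes, this breaks down for read-only mappings that lack
keys(), which is why ConfigObj eventually fell back to isinstance tests.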

All the best,

Michael Foord


.. [#] Although it has two modes. In the 'default' mode you can assign
any object as a value and a string representation is written out. A
more strict mode checks values at the point you assign them - so
errors will be raised at that point rather than propagating into the
config file. When writing you still need to be able to recognise lists
because each element is properly quoted.




Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Fredrik Lundh
Raymond Hettinger wrote:

 Like autodict could mean anything.

fwiw, the first google hit for autodict appears to be part of someone's
link farm

At this website we have assistance with autodict. In addition to
information for autodict we also have the best web sites concerning
dictionary, non profit and new york. This makes autodict.com the
most reliable guide for autodict on the Internet.

and the second is a description of a self-initializing dictionary data type
for Python.

/F 





Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
>>>>> "Greg" == Greg Ewing [EMAIL PROTECTED] writes:

Greg> Stephen J. Turnbull wrote:

 What I advocate for Python is to require that the standard
 base64 codec be defined only on bytes, and always produce
 bytes.

Greg> I don't understand that. It seems quite clear to me that
Greg> base64 encoding (in the general sense of encoding, not the
Greg> unicode sense) takes binary data (bytes) and produces
Greg> characters.

Base64 is a (family of) wire protocol(s).  It's not clear to me that
it makes sense to say that the alphabets used by baseNN encodings
are composed of characters, but suppose we stipulate that.

Greg> So in Py3k the correct usage would be [bytes -> unicode].

IMHO, as a wire protocol, base64 simply doesn't care what Python's
internal representation of characters is.  I don't see any case for
correctness here, only for convenience, both for programmers on the
job and students in the classroom.  We can choose the character set
that works best for us.  I think that's 8-bit US ASCII.

My belief is that bytes -> bytes is going to be the dominant use case,
although I don't use binary representation in XML.  However, AFAIK for
on-the-wire use UTF-8 is strongly recommended for XML, and in that
case it's also efficient to use bytes -> bytes for XML, since
conversion of base64 bytes to UTF-8 characters is simply a matter of
"Simon says, be UTF-8!"

And in the classroom, you're just going to confuse students by telling
them that UTF-8 --[Unicode codec]--> Python string is decoding but
UTF-8 --[base64 codec]--> Python string is encoding, when MAL is
telling them that -> Python string is always decoding.

Sure, it all makes sense if you already know what's going on.  But I
have trouble remembering, especially in cases like UTF-8 vs UTF-16
where Perl and Python have opposite internal representations, and
glibc has a third which isn't either.  If base64 (and gzip, etc.) are
all considered bytes -> bytes, there just isn't an issue any more.  The
simple rule wins: -> Python string is always decoding.

Why fight it when we can run away with efficiency gains? <wink>

(In the above, "Python string" means the unicode type, not str.)

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
   Ask not how you can do free software business;
  ask what your business can do for free software.


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Greg Ewing
Fuzzyman wrote:

 cfg = ConfigObj(newfilename)
 cfg['key'] = 'value'
 cfg['key2'] = ['value1', 'value2', 'value3']
 cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']}

If the main purpose is to support this kind of notational
convenience, then I'd be inclined to require all the values
used with this API to be concrete strings, lists or dicts.
If you're going to make types part of the API, I think it's
better to do so with a firm hand rather than being half-
hearted and wishy-washy about it.

Then, if it's really necessary to support a wider variety
of types, provide an alternative API that separates the
different cases and isn't type-dependent at all. If someone
has a need for this API, using it isn't going to be much
of an inconvenience, since he won't be able to write out
constructors for his types using notation as compact as
the above anyway.

--
Greg


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Jeremy Hylton
On 2/22/06, Greg Ewing [EMAIL PROTECTED] wrote:
 Mark Russell wrote:

  PEP 227 mentions using := as a rebinding operator, but rejects the
  idea as it would encourage the use of closures.

 Well, anything that facilitates rebinding in outer scopes
 is going to encourage the use of closures, so I can't
 see that as being a reason to reject a particular means
 of rebinding. You either think such rebinding is a good
 idea or not -- and that seems to be a matter of highly
 individual taste.

At the time PEP 227 was written, nested scopes were contentious.  (I
recall one developer who said he'd be embarrassed to tell his
co-workers he worked on Python if it had this feature :-).  Rebinding
was more contentious, so the feature was left out.  I don't think any
particular syntax or spelling for rebinding was favored more or less.

 On this particular idea, I tend to think it's too obscure
 as well. Python generally avoids attaching randomly-chosen
 semantics to punctuation, and I'd like to see it stay
 that way.

I agree.

Jeremy


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Fuzzyman




Greg Ewing wrote:

  Fuzzyman wrote:

  
  
cfg = ConfigObj(newfilename)
cfg['key'] = 'value'
cfg['key2'] = ['value1', 'value2', 'value3']
cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']}

  
  
If the main purpose is to support this kind of notational
convenience, then I'd be inclined to require all the values
used with this API to be concrete strings, lists or dicts.
If you're going to make types part of the API, I think it's
better to do so with a firm hand rather than being half-
hearted and wishy-washy about it.
[snip..]
  

Thanks, that's the solution we settled on. We use ``isinstance`` tests
to determine types.

The user can always do something like :

 cfg['section'] = dict(dict_like_object)

Which isn't so horrible.

All the best,

Michael

  


[Python-Dev] operator.is*Type

2006-02-22 Thread Fuzzyman
Hello all,

Feel free to shoot this down, but a suggestion.

The operator module defines two functions :

isMappingType
isSequenceType


These return a guesstimation as to whether an object passed in supports 
the mapping and sequence protocols.

These protocols are loosely defined. Any object which has a 
``__getitem__`` method defined could support either protocol.

Therefore :

>>> from operator import isSequenceType, isMappingType
>>> class anything(object):
...     def __getitem__(self, index):
...         pass
...
>>> something = anything()
>>> isMappingType(something)
True
>>> isSequenceType(something)
True

I suggest we either deprecate these functions as worthless, *or* we 
define the protocols slightly more clearly for user defined classes.

An object prima facie supports the mapping protocol if it defines a 
``__getitem__`` method, and a ``keys`` method.

An object prima facie supports the sequence protocol if it defines a 
``__getitem__`` method, and *not* a ``keys`` method.

As a result code which needs to be able to tell the difference can use 
these functions and can sensibly refer to the definition of the mapping 
and sequence protocols when documenting what sort of objects an API call 
can accept.
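
The proposed convention could be sketched as follows (the function
names here are illustrative, not part of the operator module):

```python
def is_mapping_like(obj):
    # proposed convention: supports __getitem__ *and* keys()
    return hasattr(obj, "__getitem__") and hasattr(obj, "keys")

def is_sequence_like(obj):
    # proposed convention: supports __getitem__ but *not* keys()
    return hasattr(obj, "__getitem__") and not hasattr(obj, "keys")

class Anything(object):
    # a user-defined class with only __getitem__, as in the example above
    def __getitem__(self, index):
        pass

print(is_mapping_like(Anything()))   # False
print(is_sequence_like(Anything()))  # True
print(is_mapping_like({}))           # True
print(is_sequence_like([]))          # True
```

Unlike the current functions, this classifies the bare-__getitem__ class
as a sequence rather than as both at once.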

All the best,

Michael Foord


[Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Raymond Hettinger



I'm concerned that the on_missing() part of the
proposal is gratuitous. The main use cases for defaultdict have a simple
factory that supplies a zero, empty list, or empty set. The on_missing()
hook is only there to support the rarer case of needing a key to
compute a default value. The hook is not needed for the main use
cases.

As it stands, we're adding a method to regular
dicts that cannot be usefully called directly. Essentially, it is a
framework method meant to be overridden in a subclass. So, it only makes
sense in the context of subclassing. In the meantime, we've added an
oddball method to the main dict API, arguably the most important object API in
Python.

To use the hook, you write something like
this:

    class D(dict):
        def on_missing(self, key):
            return somefunc(key)

However, we can already do something like that
without the hook:

    class D(dict):
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                self[key] = value = somefunc(key)
                return value

The latter form is already possible, doesn't
require modifying a basic API, and is arguably clearer about when it is called
and what it does (the former doesn't explicitly show that the returned value
gets saved in the dictionary).

Since we can already do the latter form,
we can get some insight into whether the need has ever actually arisen in
real code. I scanned the usual sources (my own code, the standard library,
and my most commonly used third-party libraries) and found no instances of code
like that. The closest approximation was safe_substitute() in
string.Template, where missing keys returned themselves as a default value.
Other than that, I conclude that there isn't sufficient need to warrant adding a
funky method to the API for regular dicts.

I wondered why the safe_substitute() example was unique. I think the answer is that we
normally handle default computations through simple in-line code ("if k in d:
do1() else do2()" or a try/except pair). Overriding on_missing() is then
really only useful when you need to create a type that can be passed to a client
function that was expecting a regular dictionary. So it does come up, but
not much.
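
A minimal runnable sketch of that situation (somefunc and the client
function are stand-ins, not anything from the proposal):

```python
def somefunc(key):
    # stand-in for any computation that derives a default from the key
    return len(key)

class D(dict):
    # dict subclass that fills in missing keys on lookup, no hook needed
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            self[key] = value = somefunc(key)
            return value

def client(mapping):
    # client code written against a plain dict's lookup interface
    return mapping["spam"] + mapping["egg"]

print(client(D()))  # 4 + 3 = 7
```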

Aside: Why on_missing() is an oddball among
dict methods. When teaching dicts to beginners, all the methods are easily
explainable except this one. You don't call this method directly, you only
use it when subclassing, you have to override it to do anything useful, it hooks
KeyError but only when raised by __getitem__ and not other methods,
etc. I'm concerned that even having this method in the regular
dict API will create confusion about when to use dict.get(), when to
use dict.setdefault(), when to catch a KeyError, or when to LBYL. Adding
this one extra choice makes the choice more difficult.

My recommendation: Dump the on_missing()
hook. That leaves the dict API unmolested and allows a more
straightforward implementation/explanation of collections.default_dict or
whatever it ends up being named. The result is delightfully simple and
easy to understand/explain.


Raymond









Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Stephen J. Turnbull wrote:

 Base64 is a (family of) wire protocol(s).  It's not clear to me that
 it makes sense to say that the alphabets used by baseNN encodings
 are composed of characters,

Take a look at

   http://en.wikipedia.org/wiki/Base64

where it says

   ...base64 is a binary to text encoding scheme whereby an
   arbitrary sequence of bytes is converted to a sequence of
   printable ASCII characters.

Also see RFC 2045 (http://www.ietf.org/rfc/rfc2045.txt) which
defines base64 in terms of an encoding from octets to characters,
and also says

   A 65-character subset of US-ASCII is used ... This subset has
   the important property that it is represented identically in
   all versions of ISO 646 ... and all characters in the subset
   are also represented identically in all versions of EBCDIC.

Which seems to make it perfectly clear that the result
of the encoding is to be considered as characters, which
are not necessarily going to be encoded using ascii.

So base64 on its own is *not* a wire protocol. Only after
encoding the characters do you have a wire protocol.
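
In today's Python terms, the two-step view Greg describes (bytes to
characters, then characters to a wire encoding) can be sketched as:

```python
import base64

raw = b"\x00\x01hello"

# step 1: binary -> text; the base64 alphabet is pure ASCII, so
# decoding the encoder's output as ASCII is always safe
chars = base64.b64encode(raw).decode("ascii")
print(chars)  # AAFoZWxsbw==

# step 2: only now is a wire encoding chosen; any unicode encoding works
wire_utf8 = chars.encode("utf-8")
wire_utf16 = chars.encode("utf-16")  # even a non-ASCII-superset one

# round trip through either encoding recovers the original bytes
assert base64.b64decode(wire_utf8.decode("utf-8")) == raw
assert base64.b64decode(wire_utf16.decode("utf-16")) == raw
```

(Present-day Python collapses the two steps: b64encode goes straight
from bytes to bytes, which is exactly the design Stephen advocates.)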

 I don't see any case for
 correctness here, only for convenience,

I'm thinking of convenience, too. Keep in mind that in Py3k,
'unicode' will be called 'str' (or something equally neutral
like 'text') and you will rarely have to deal explicitly with
unicode codings, this being done mostly for you by the I/O
objects. So most of the time, using base64 will be just as
convenient as it is today: base64_encode(my_bytes) and write
the result out somewhere.

The reason I say it's *correct* is that if you go straight
from bytes to bytes, you're *assuming* the eventual encoding
is going to be an ascii superset. The programmer is going to
have to know about this assumption and understand all its
consequences and decide whether it's right, and if not, do
something to change it.

Whereas if the result is text, the right thing happens
automatically whatever the ultimate encoding turns out to
be. You can take the text from your base64 encoding, combine
it with other text from any other source to form a complete
mail message or xml document or whatever, and write it out
through a file object that's using any unicode encoding
at all, and the result will be correct.

  it's also efficient to use bytes -> bytes for XML, since
 conversion of base64 bytes to UTF-8 characters is simply a matter of
 "Simon says, be UTF-8!"

Efficiency is an implementation concern. In Py3k, strings
which contain only ascii or latin-1 might be stored as
1 byte per character, in which case this would not be an
issue.

 And in the classroom, you're just going to confuse students by telling
 them that UTF-8 --[Unicode codec]--> Python string is decoding but
 UTF-8 --[base64 codec]--> Python string is encoding, when MAL is
 telling them that -> Python string is always decoding.

Which is why I think that only *unicode* codings should be
available through the .encode and .decode interface. Or
alternatively there should be something more explicit like
.unicode_encode and .unicode_decode that is thus restricted.

Also, if most unicode coding is done in the I/O objects, there
will be far less need for programmers to do explicit unicode
coding in the first place, so likely it will become more of
an advanced topic, rather than something you need to come to
grips with on day one of using unicode, like it is now.

--
Greg


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Fredrik Lundh
Raymond Hettinger wrote:

 Aside:  Why on_missing() is an oddball among dict methods.  When
 teaching dicts to beginners, all the methods are easily explainable
 except this one.  You don't call this method directly, you only use it
 when subclassing, you have to override it to do anything useful, it
 hooks KeyError but only when raised by __getitem__ and not
 other methods, etc.

agreed.

 My recommendation:  Dump the on_missing() hook.  That leaves
 the dict API unmolested and allows a more straightforward
 implementation/explanation of collections.default_dict or whatever
 it ends up being named.  The result is delightfully simple and easy
 to understand/explain.

agreed.

a separate type in collections, a template object (or factory) passed to
the constructor, and implementation inheritance, is more than good
enough.  and if I recall correctly, pretty much what Guido first proposed.
I trust his intuition a lot more than I trust the
design-by-committee-without-use-cases process.
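
The design Fredrik describes - a separate collections type with the
factory passed to the constructor - is essentially what eventually
shipped as collections.defaultdict:

```python
from collections import defaultdict

d = defaultdict(list)   # the factory is passed to the constructor
d["k"].append(1)        # missing key: factory supplies a fresh []
d["k"].append(2)        # existing key: normal lookup
print(dict(d))          # {'k': [1, 2]}
```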

/F 





Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Greg Ewing
Raymond Hettinger wrote:
 I'm concerned that the on_missing() part of the proposal is gratuitous.  

I second all that. A clear case of YAGNI.

--
Greg


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Raymond Hettinger
 >>> from operator import isSequenceType, isMappingType
 >>> class anything(object):
 ...     def __getitem__(self, index):
 ...         pass
 ...
 >>> something = anything()
 >>> isMappingType(something)
 True
 >>> isSequenceType(something)
 True

 I suggest we either deprecate these functions as worthless, *or* we
 define the protocols slightly more clearly for user defined classes.

They are not worthless.  They do a damned good job of differentiating anything 
that CAN be differentiated.

Your example simply highlights the consequences of one of Python's most basic, 
original design choices (using getitem for both sequences and mappings).  That 
choice is now so fundamental to the language that it cannot possibly change. 
Get used to it.

In your example, the results are correct.  The anything class can be viewed 
as 
either a sequence or a mapping.

In this and other posts, you seem to be focusing your design around notions of 
strong typing and mandatory interfaces.  I would suggest that that approach is 
futile unless you control all of the code being run.


Raymond




Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Thomas Heller
Fuzzyman wrote:
 Hello all,
 
 Feel free to shoot this down, but a suggestion.
 
 The operator module defines two functions :
 
 isMappingType
 isSequenceType
 
 
 These return a guesstimation as to whether an object passed in supports 
 the mapping and sequence protocols.
 
 These protocols are loosely defined. Any object which has a 
 ``__getitem__`` method defined could support either protocol.

The docs contain clear warnings about that.

 I suggest we either deprecate these functions as worthless, *or* we 
 define the protocols slightly more clearly for user defined classes.

I have no problems deprecating them since I've never used one of these
functions.  If I want to know if something is a string I use isinstance(),
for string-like objects I would use

  try: obj + ""
  except TypeError:

and so on.

 
 An object prima facie supports the mapping protocol if it defines a 
 ``__getitem__`` method, and a ``keys`` method.
 
 An object prima facie supports the sequence protocol if it defines a 
 ``__getitem__`` method, and *not* a ``keys`` method.
 
 As a result code which needs to be able to tell the difference can use 
 these functions and can sensibly refer to the definition of the mapping 
 and sequence protocols when documenting what sort of objects an API call 
 can accept.

Thomas



Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Fuzzyman
Raymond Hettinger wrote:
 >>> from operator import isSequenceType, isMappingType
 >>> class anything(object):
 ...     def __getitem__(self, index):
 ...         pass
 ...
 >>> something = anything()
 >>> isMappingType(something)
 True
 >>> isSequenceType(something)
 True

 I suggest we either deprecate these functions as worthless, *or* we
 define the protocols slightly more clearly for user defined classes.

 They are not worthless.  They do a damned good job of differentiating 
 anything that CAN be differentiated.

But as far as I can tell (and I may be wrong), they only work if the 
object is a subclass of a built in type, otherwise they're broken. So 
you'd have to do a type check as well, unless you document that an API 
call *only* works with a builtin type or subclass.

In which case - an isinstance call does the same, with the advantage of 
not being broken if the object is a user-defined class.

At the very least the function would be better renamed 
``MightBeMappingType``  ;-)

 Your example simply highlights the consequences of one of Python's 
 most basic, original design choices (using getitem for both sequences 
 and mappings).  That choice is now so fundamental to the language that 
 it cannot possibly change. Get used to it.

I have no problem with it - it's useful.

 In your example, the results are correct.  The anything class can be 
 viewed as either a sequence or a mapping.

But in practice an object is *unlikely* to be both. (Although
conceivably a mapping container *could* implement integer indexing and
thus be both - but that is *very* rare.) Therefore the current behaviour is
not really useful in any conceivable situation - not that I can think of anyway.

 In this and other posts, you seem to be focusing your design around 
 notions of strong typing and mandatory interfaces.  I would suggest 
 that that approach is futile unless you control all of the code being 
 run.

Not directly. I'm suggesting that the loosely defined protocol (used 
with duck typing) can be made quite a bit more useful by making the 
definition *slightly* more specific.

A preference for strong typing would require subclassing, surely ?

The approach I suggest would allow a *less* 'strongly typed' approach to 
code, because it establishes a convention to decide whether a user 
defined class supports the mapping and sequence protocols.

The simple alternative (which we took in ConfigObj) is to require a 
'strongly typed' interface, because there is currently no useful way to 
determine whether an object that implements __getitem__ supports mapping 
or sequence. (Other than *assuming* that a mapping container implements 
a random choice from the other common mapping methods.)

All the best,

Michael

 Raymond






Re: [Python-Dev] buildbot vs. Windows

2006-02-22 Thread Michael Hudson
Martin v. Löwis [EMAIL PROTECTED] writes:

 Tim Peters wrote:
 Speaking of which, a number of test failures over the past few weeks
 were provoked here only under -r (run tests in random order) or under
 a debug build, and didn't look like those were specific to Windows. 
 Adding -r to the buildbot test recipe is a decent idea.  Getting
 _some_ debug-build test runs would also be good (or do we do that
 already?).

 So what is your recipe: Add -r to all buildbots? Only to those which
 have an 'a' in their name? Only to every third build? Duplicating
 the number of builders?

 Same question for --with-pydebug. Combining this with -r would multiply
 the number of builders by 4 already.

Instead of running release and debug builds, why not just run debug
builds?  They catch more problems, earlier.

Cheers,
mwh

-- 
  This song is for anyone ... fuck it.  Shut up and listen.
 -- Eminem, The Way I Am


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Raymond Hettinger
[Alex]
 I'd love to remove setdefault in 3.0 -- but I don't think it can be  done 
 before that: default_factory won't cover the occasional use  cases where 
 setdefault is called with different defaults at different  locations, and, 
 rare as those cases may be, any 2.* should not break  any existing code that 
 uses that approach.

I'm not too concerned about this one.  Whenever setdefault gets deprecated, 
then ALL code that used it would have to be changed.  If there were cases with 
different defaults, a regular try/except would do the job just fine (heck, it 
might even be faster because there won't be a wasted instantiation in the cases 
where the key already exists).

There may be other reasons to delay removing setdefault(), but the 
multiple-defaults use case isn't one of them.
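The try/except replacement alluded to above can be sketched directly for two call sites that want different defaults (the keys and defaults are illustrative):

```python
d = {'a': [1]}

# Different call sites, different defaults -- no setdefault() needed,
# and no default object is built when the key is already present.
try:
    bucket = d['a']            # hit: nothing is instantiated
except KeyError:
    bucket = d['a'] = []
bucket.append(2)

try:
    names = d['b']
except KeyError:
    names = d['b'] = set()     # a different default at this call site
names.add('x')
```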



 An alternative is to have two possible attributes:
   d.default_factory = list
 or
   d.default_value = 0
 with an exception being raised when both are defined (the test is  done when 
 the
 attribute is created, not when the lookup is performed).

 I see default_value as a way to get exactly the same beginner's error  we 
 already have with function defaults:

That makes sense.


I'm somewhat happy with the patch as it stands now.  The only part that needs 
serious rethinking is putting on_missing() in regular dicts.  See my other 
email 
on that subject.



Raymond 



Re: [Python-Dev] Copying zlib compression objects

2006-02-22 Thread Chris AtLee
On 2/17/06, Guido van Rossum [EMAIL PROTECTED] wrote:
 Please submit your patch to SourceForge.

I've submitted the zlib patch as patch #1435422. I added some test cases to 
test_zlib.py and documented the new methods. I'd like to test my gzip / 
tarfile changes more before creating a patch for it, but I'm interested in 
any feedback about the idea of adding snapshot() / restore() methods to the 
GzipFile and TarFile classes.

It doesn't look like the underlying bz2 library supports copying compression 
/ decompression streams, so for now it's impossible to make corresponding 
changes to the bz2 module.

I also noticed that the tarfile reimplements the gzip file format when 
dealing with streams. Would it make sense to refactor some of the gzip.py 
code to expose the methods that read/write the gzip file header, and have 
the tarfile module use those methods?

Cheers,
Chris


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Alex Martelli

On Feb 22, 2006, at 7:21 AM, Raymond Hettinger wrote:
...
 I'm somewhat happy with the patch as it stands now.  The only part  
 that needs serious rethinking is putting on_missing() in regular  
 dicts.  See my other email on that subject.

What if we named it _on_missing? Hook methods intended only to be  
overridden in subclasses are sometimes spelled that way, and it  
removes the need to teach about it to beginners -- it looks private  
so we don't explain it at that point.

My favorite example is Queue.Queue: I teach it (and in fact  
evangelize for it as the one sane way to do threading;-) in Python  
101, *without* ever mentioning _get, _put etc -- THOSE I teach in  
Patterns with Python as the very best example of the Gof4's classic  
Template Method design pattern. If dict had _on_missing I'd have  
another wonderful example to teach from!  (I believe the Library  
Reference avoids teaching about _get, _put etc, too, though I haven't  
checked it for a while).
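The Template Method structure described here is easy to see in the queue module itself: ``get()``/``put()`` keep all the locking logic, and subclasses override only the storage hooks. A minimal sketch (``_get``/``_put`` are undocumented internal hooks; this thread predates ``queue.LifoQueue``, which is built exactly this way):

```python
import queue

class LIFOQueue(queue.Queue):
    # Only the storage hooks are overridden; the thread-safe framework
    # methods get()/put() are inherited unchanged.
    def _put(self, item):
        self.queue.append(item)

    def _get(self):
        return self.queue.pop()   # take from the end: last in, first out

q = LIFOQueue()
for n in (1, 2, 3):
    q.put(n)
order = [q.get() for _ in range(3)]   # drains in LIFO order
```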

TM is my favorite DP, so I'm biased in favor of Guido's design, and I  
think that by giving the hook method (not meant to be called, only  
overridden) a private name we're meeting enough of your and /F's  
concerns to let _on_missing remain. Its existence does simplify the  
implementation of defaultdict (and some other dict subclasses), and  
if the implementation is easy to explain, it may be a good idea,  
after all;-)


Alex



Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Guido van Rossum
On 2/22/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 I'm concerned that the on_missing() part of the proposal is gratuitous.  The
 main use cases for defaultdict have a simple factory that supplies a zero,
 empty list, or empty set.  The on_missing() hook is only there to support
 the rarer case of needing a key to compute a default value.  The hook is not
 needed for the main use cases.

The on_missing() hook is there to take the action of inserting the
default value into the dict. For this it needs the key.

It seems attractive to collapse default_factory and on_missing into a
single attribute (my first attempt did this, and I was halfway posting
about it before I realized the mistake). But on_missing() really needs
the key, and at the same time you don't want to lose the convenience
of being able to specify set, list, int etc. as default factories, so
default_factory() must be called without the key.

If you don't have on_missing, then the functionality of inserting the
key produced by default_factory would have to be in-lined in
__getitem__, which means the machinery put in place can't be reused
for other use cases -- several people have claimed to have a use case
for returning a value *without* inserting it into the dict.

 As it stands, we're adding a method to regular dicts that cannot be usefully
 called directly.  Essentially, it is a framework method meant to be
 overridden in a subclass.  So, it only makes sense in the context of
 subclassing.  In the meantime, we've added an oddball method to the main
 dict API, arguably the most important object API in Python.

Which to me actually means it's a *good* place to put the hook
functionality, since it allows for maximum reuse.

 To use the hook, you write something like this:

 class D(dict):
     def on_missing(self, key):
         return somefunc(key)

Or, more likely,

def on_missing(self, key):
    self[key] = value = somefunc(key)
    return value

 However, we can already do something like that without the hook:

 class D(dict):
     def __getitem__(self, key):
         try:
             return dict.__getitem__(self, key)
         except KeyError:
             self[key] = value = somefunc(key)
             return value

 The latter form is already possible, doesn't require modifying a basic API,
 and is arguably clearer about when it is called and what it does (the former
 doesn't explicitly show that the returned value gets saved in the
 dictionary).

This is exactly what Google's internal DefaultDict does. But it is
also its downfall, because now *all* __getitem__ calls are weighed
down by going through Python code; in a particular case that came up
at Google I had to recommend against using it for performance reasons.

 Since we can already do the latter form, we can get some insight into
 whether the need has ever actually arisen in real code.  I scanned the usual
 sources (my own code, the standard library, and my most commonly used
 third-party libraries) and found no instances of code like that.   The
 closest approximation was safe_substitute() in string.Template where missing
 keys returned themselves as a default value.  Other than that, I conclude
 that there isn't sufficient need to warrant adding a funky method to the API
 for regular dicts.

In this case I don't believe that the absence of real-life examples
says much (and BTW Google's DefaultDict *is* such a real life example;
it is used in other code). There is not much incentive for subclassing
dict and overriding __getitem__ if the alternative is that in a few
places you have to write two lines of code instead of one:

if key not in d: d[key] = set()    # this line would be unneeded
d[key].add(value)
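For reference, this is essentially the interface that eventually shipped in Python 2.5 as ``collections.defaultdict`` (with the hook renamed ``__missing__``), where the two-line pattern does collapse to one line:

```python
from collections import defaultdict

d = defaultdict(set)     # default_factory, called on missing keys
d['low'].add(1)          # no membership test needed
d['low'].add(2)
d['high'].add(9)
```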

 I wondered why the safe_substitute() example was unique.  I think the answer
 is that we normally handle default computations through simple in-line code
 (if k in d: do1() else do2() or a try/except pair).  Overriding
 on_missing() then is really only useful when you need to create a type that
 can be passed to a client function that was expecting a regular dictionary.
 So it does come-up but not much.

I think the pattern hasn't been commonly known; people have been
struggling with setdefault() all these years.

 Aside:  Why on_missing() is an oddball among dict methods.  When teaching
 dicts to beginner, all the methods are easily explainable except this one.

You don't seriously teach beginners all dict methods do you?
setdefault(), update(), copy() are all advanced material, and so are
iteritems(), itervalues() and iterkeys() (*especially* the last since
it's redundant through for i in d:).

 You don't call this method directly, you only use it when subclassing, you
 have to override it to do anything useful, it hooks KeyError but only when
 raised by __getitem__ and not other methods, etc.

The only other methods that raise KeyError are __delitem__, pop() and
popitem(). I don't see how these could use the same hook as

Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread James Y Knight

On Feb 22, 2006, at 6:35 AM, Greg Ewing wrote:

 I'm thinking of convenience, too. Keep in mind that in Py3k,
 'unicode' will be called 'str' (or something equally neutral
 like 'text') and you will rarely have to deal explicitly with
 unicode codings, this being done mostly for you by the I/O
 objects. So most of the time, using base64 will be just as
 convenient as it is today: base64_encode(my_bytes) and write
 the result out somewhere.

 The reason I say it's *correct* is that if you go straight
 from bytes to bytes, you're *assuming* the eventual encoding
 is going to be an ascii superset. The programmer is going to
 have to know about this assumption and understand all its
 consequences and decide whether it's right, and if not, do
 something to change it.

 Whereas if the result is text, the right thing happens
 automatically whatever the ultimate encoding turns out to
 be. You can take the text from your base64 encoding, combine
 it with other text from any other source to form a complete
 mail message or xml document or whatever, and write it out
 through a file object that's using any unicode encoding
 at all, and the result will be correct.

This makes little sense for mail. You combine *bytes*, in various and  
possibly different encodings to form a mail message. Some MIME  
sections might have a base64 Content-Transfer-Encoding, others might  
be 8bit encoded, others might be 7bit encoded, others might be quoted- 
printable encoded. Before the C-T-E encoding, you will have had to do  
the Content-Type encoding, converting your text into bytes with the  
desired character encoding: utf-8, iso-8859-1, etc. Having the final  
mail message be made up of characters, right before transmission to  
the socket would be crazy.

James
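(As it turned out, Python 3's base64 module sides with this view for the transfer-encoding layer: it maps bytes to bytes, and any interpretation of the ASCII result as text is a separate, explicit step. A small sketch of the two layers:)

```python
import base64

body = 'héllo, wörld'.encode('utf-8')   # Content-Type step: text -> bytes
wire = base64.b64encode(body)           # C-T-E step: bytes -> bytes
assert isinstance(wire, bytes)          # never leaves the byte domain

# Undoing both layers, innermost last.
round_trip = base64.b64decode(wire).decode('utf-8')
```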


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Raymond Hettinger
[Guido van Rossum]
 If we removed on_missing() from dict, we'd have to override
 __getitem__ in defaultdict (regardless of whether we give
defaultdict an on_missing() hook or in-line it).

You have another option.  Keep your current modifications to
dict.__getitem__ but do not include dict.on_missing().  Let it only
be called in a subclass IF it is defined; otherwise, raise KeyError.

That keeps me happy since the basic dict API won't show on_missing(),
but it still allows a user to attach an on_missing method to a dict subclass 
when or if needed.  I think all your test cases would still pass without 
modification.
This approach is not much different than for other magic methods which
kick-in if defined or revert to a default behavior if not.

My core concern is to keep the dict API clean as a whistle.


Raymond 


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Bob Ippolito

On Feb 22, 2006, at 4:18 AM, Fuzzyman wrote:

 Raymond Hettinger wrote:
 >>> from operator import isSequenceType, isMappingType
 >>> class anything(object):
 ...     def __getitem__(self, index):
 ...         pass
 ...
 >>> something = anything()
 >>> isMappingType(something)
 True
 >>> isSequenceType(something)
 True

 I suggest we either deprecate these functions as worthless, *or* we
 define the protocols slightly more clearly for user defined classes.

 They are not worthless.  They do a damned good job of differentiating
 anything that CAN be differentiated.

 But as far as I can tell (and I may be wrong), they only work if the
 object is a subclass of a built in type, otherwise they're broken. So
 you'd have to do a type check as well, unless you document that an API
 call *only* works with a builtin type or subclass.

If you really cared, you could check hasattr(something, 'get') and  
hasattr(something, '__getitem__'), which is a pretty good indicator  
that it's a mapping and not a sequence (in a dict-like sense, anyway).

-bob



Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Ian Bicking
Raymond Hettinger wrote:
>>> from operator import isSequenceType, isMappingType
>>> class anything(object):
...     def __getitem__(self, index):
...         pass
...
>>> something = anything()
>>> isMappingType(something)
True
>>> isSequenceType(something)
True

I suggest we either deprecate these functions as worthless, *or* we
define the protocols slightly more clearly for user defined classes.
 
 
 They are not worthless.  They do a damned good job of differentiating 
 anything 
 that CAN be differentiated.

But they are just identical...?  They seem terribly pointless to me. 
Deprecation is one option, of course.  I think Michael's suggestion also 
makes sense.  *If* we distinguish between sequences and mapping types 
with two functions, *then* those two functions should be distinct.  It 
seems kind of obvious, doesn't it?

I think hasattr(obj, 'keys') is the simplest distinction of the two 
kinds of collections.

 Your example simply highlights the consequences of one of Python's most 
 basic, 
 original design choices (using getitem for both sequences and mappings).  
 That 
 choice is now so fundamental to the language that it cannot possibly change. 
 Get used to it.
 
 In your example, the results are correct.  The anything class can be viewed 
 as 
 either a sequence or a mapping.
 
 In this and other posts, you seem to be focusing your design around notions 
 of 
 strong typing and mandatory interfaces.  I would suggest that that approach 
 is 
 futile unless you control all of the code being run.

I think you are reading too much into it.  If the functions exist, they 
should be useful.  That's all I see in Michael's suggestion.


-- 
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Guido van Rossum
On 2/22/06, Raymond Hettinger [EMAIL PROTECTED] wrote:
 [Guido van Rossum]
  If we removed on_missing() from dict, we'd have to override
  __getitem__ in defaultdict (regardless of whether we give
 defaultdict an on_missing() hook or in-line it).

 You have another option.  Keep your current modifications to
 dict.__getitem__ but do not include dict.on_missing().  Let it only
 be called in a subclass IF it is defined; otherwise, raise KeyError.

OK. I don't have time right now for another round of patches -- if you
do, please go ahead. The dict docs in my latest patch must be updated
somewhat (since they document on_missing()).

 That keeps me happy since the basic dict API won't show on_missing(),
 but it still allows a user to attach an on_missing method to a dict subclass
 when
 or if needed.  I think all your test cases would still pass without
 modification.

Except the ones that explicitly test for dict.on_missing()'s presence
and behavior. :-)

  This approach is not much different than for other magic methods which
 kick-in if defined or revert to a default behavior if not.

Right. Plenty of precedent there.

 My core concern is to keep the dict API clean as a whistle.

Understood.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Path PEP: some comments (equality)

2006-02-22 Thread Jason Orendorff
On 2/20/06, Mark Mc Mahon [EMAIL PROTECTED] wrote:
 It seems that the Path module as currently defined leaves equality
 testing up to the underlying string comparison. My guess is that this
 is fine for Unix (maybe not even) but it is a bit lacking for Windows.

 Should the path class implement an __eq__ method that might do some of
 the following things:
  - Get the absolute path of both self and the other path
  - normcase both
  - now see if they are equal

This has been suggested to me many times. Unfortunately, since Path is a 
subclass of string, this breaks stuff in weird ways. For example:

'x.py' == path('x.py') == path('X.PY') == 'X.PY', but 'x.py' != 'X.PY'

And hashing needs to be consistent with __eq__:

hash('x.py') == hash(path('X.PY')) == hash('X.PY') ???

Granted these problems would only pop up in code where people are mixing 
Path and string objects. But they would cause really obscure bugs in 
practice, very difficult for a non-expert to figure out and fix. It's safer 
for Paths to behave just like strings.
-j
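The transitivity and hashing breakage described above can be demonstrated with a toy ``str`` subclass (purely illustrative — not the proposed Path class):

```python
class CasePath(str):
    # Case-insensitive equality, as a normcase-based __eq__ would give.
    def __eq__(self, other):
        return self.lower() == str(other).lower()
    def __hash__(self):
        return hash(self.lower())

a = CasePath('x.py')
b = CasePath('X.PY')

# Mixing plain strings in breaks transitivity: every adjacent pair in
# 'x.py' == a == b == 'X.PY' compares equal, yet the two ends differ.
chain_holds = ('x.py' == a) and (a == b) and (b == 'X.PY')
ends_differ = 'x.py' != 'X.PY'
```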


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Phillip J. Eby
At 06:14 AM 2/22/2006 -0500, Jeremy Hylton wrote:
On 2/22/06, Greg Ewing [EMAIL PROTECTED] wrote:
  Mark Russell wrote:
 
   PEP 227 mentions using := as a rebinding operator, but rejects the
   idea as it would encourage the use of closures.
 
  Well, anything that facilitates rebinding in outer scopes
  is going to encourage the use of closures, so I can't
  see that as being a reason to reject a particular means
  of rebinding. You either think such rebinding is a good
  idea or not -- and that seems to be a matter of highly
  individual taste.

At the time PEP 227 was written, nested scopes were contentious.  (I
recall one developer who said he'd be embarassed to tell his
co-workers he worked on Python if it had this feature :-).

Was this because of the implicit inheritance of variables from the 
enclosing scope?


   Rebinding
was more contentious, so the feature was left out.  I don't think any
particular syntax or spelling for rebinding was favored more or less.

  On this particular idea, I tend to think it's too obscure
  as well. Python generally avoids attaching randomly-chosen
  semantics to punctuation, and I'd like to see it stay
  that way.

I agree.

Note that '.' for relative naming already exists (attribute access), and 
Python 2.5 is already introducing the use of a leading '.' (with no name 
before it) to mean parent of the current namespace.  So, using that 
approach to reference variables in outer scopes wouldn't be without precedents.

IOW, I propose no new syntax for rebinding, but instead making variables' 
context explicit.  This would also fix the issue where right now you have 
to inspect a function and its context to find out whether there's a closure 
and what's in it.  The leading dots will be quite visible.
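(For comparison: the rebinding behaviour under discussion is what Python 3 eventually provided via the ``nonlocal`` statement of PEP 3104; the leading-dot spelling itself was never adopted. The canonical incrementer example, in that later syntax:)

```python
def incrementer(val):
    def inc():
        nonlocal val     # rebinds val in the enclosing function's scope
        val += 1
        return val
    return inc

count = incrementer(10)
first, second = count(), count()   # 11, then 12
```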



Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Terry Reedy

Almann T. Goo [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 IMO, Having properly nested scopes in Python in a sense made having
 closures a natural idiom to the language and part of its user
 interface.  By not allowing the name re-binding it almost seems like
 that user interface has a rough edge that is almost too easy to get
 cut on.

I can see now how it would look that way to someone who has experience with 
fully functional nested scopes in other languages and who learns Python 
after no-write nested scoping was added.  What is not mentioned in the ref 
manual and what I suppose may not be obvious even reading the PEP is that 
Python added nesting to solve two particular problems.  First was the 
inability to write nested recursive functions without the hack of stuffing 
its name in the global namespace (or of patching the byte code).  Second 
was the need to misuse the default arg mechanism in nested functions.  What 
we have now pretty well fixes both.

Terry Jan Reedy
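(The first fix mentioned above can be made concrete with a short sketch: a nested recursive function now finds its own name through the enclosing scope, with no need to plant it in the global namespace.)

```python
def make_factorial():
    def fact(n):
        # 'fact' is resolved through the enclosing scope -- before
        # nested scopes it had to live in the global namespace instead.
        return 1 if n < 2 else n * fact(n - 1)
    return fact

fact = make_factorial()
```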





Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Terry Reedy

Greg Ewing [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Efficiency is an implementation concern.

It is also a user concern, especially if inefficiency overruns memory 
limits.

 In Py3k, strings
 which contain only ascii or latin-1 might be stored as
 1 byte per character, in which case this would not be an
 issue.

If 'might' becomes 'will', I and I suspect others will be happier with the 
change.  And I would be happy if the choice of physical storage was pretty 
much handled behind the scenes, as with the direction int/long unification 
is going.

 Which is why I think that only *unicode* codings should be
 available through the .encode and .decode interface. Or
 alternatively there should be something more explicit like
 .unicode_encode and .unicode_decode that is thus restricted.

I prefer the shorter names and using recode, for instance, for bytes to 
bytes.

Terry Jan Reedy





Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)

2006-02-22 Thread Steven Bethard
On 2/21/06, Phillip J. Eby [EMAIL PROTECTED] wrote:
 Here's a crazy idea, that AFAIK has not been suggested before and could
 work for both globals and closures: using  a leading dot, ala the new
 relative import feature.  e.g.:

  def incrementer(val):
      def inc():
          .val += 1
          return .val
      return inc

 The '.' would mean this name, but in the nearest outer scope that defines
 it.  Note that this could include the global scope, so the 'global'
 keyword could go away in 2.5.  And in Python 3.0, the '.' could become
 *required* for use in closures, so that it's not necessary for the reader
 to check a function's outer scope to see whether closure is taking
 place.  EIBTI.

FWIW, I think this is nice.  Since it uses the same dot-notation that
normal attribute access uses, it's clearly accessing the attribute of
*some* namespace.  It's not perfectly intuitive that the accessed
namespace is the enclosing one, but I do think it's at least more
intuitive than the suggested := operator, and at least as intuitive as
a ``global``-like declaration.  And, as you mention, it's consistent
with the relative import feature.

I'm a little worried that this proposal will get lost amid the mass of
other suggestions being thrown out right now.  Any chance of turning
this into a PEP?

Steve
--
Grammar am for people who can't think for myself.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Raymond Hettinger
[Ian Bicking]
 They seem terribly pointless to me.

FWIW, here is the script that I had used while updating and improving the two 
functions (can't remember whether it was for Py2.3 or Py2.4).  It lists 
comparative results for many different types of inputs.  Since perfection was 
not possible, the goal was to have no false negatives and mostly accurate 
positives.  IMO, they do a pretty good job and are able to access information 
not otherwise visible to pure Python code.  With respect to user defined 
instances, I don't care that they can't draw a distinction where none exists in 
the first place -- at some point you have to either fallback on duck-typing or 
be in control of what kind of arguments you submit to your functions. 
Practicality beats purity -- especially when a pure solution doesn't exist 
(i.e. given a user defined class that defines just __getitem__, both mapping 
and sequence behavior is a possibility).


 Analysis Script 

from collections import deque
from UserList import UserList
from UserDict import UserDict
from operator import *

types = (set,
         int, float, complex, long, bool,
         str, unicode,
         list, UserList, tuple, deque,
         )

for t in types:
    print isMappingType(t()), isSequenceType(t()), repr(t()), repr(t)

class c:
    def __repr__(self):
        return 'Instance w/o getitem'

class cn(object):
    def __repr__(self):
        return 'NewStyle Instance w/o getitem'

class cg:
    def __repr__(self):
        return 'Instance w getitem'
    def __getitem__(self, index):
        return 10

class cng(object):
    def __repr__(self):
        return 'NewStyle Instance w getitem'
    def __getitem__(self, index):
        return 10

def f():
    return 1

def g():
    yield 1

for i in (None, NotImplemented, g(), c(), cn()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)

for i in (cg(), cng(), dict(), UserDict()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)



 Output 

False False set([]) <type 'set'>
False False 0 <type 'int'>
False False 0.0 <type 'float'>
False False 0j <type 'complex'>
False False 0L <type 'long'>
False False False <type 'bool'>
False True '' <type 'str'>
False True u'' <type 'unicode'>
False True [] <type 'list'>
True True [] <class UserList.UserList at 0x00F11B70>
False True () <type 'tuple'>
False True deque([]) <type 'collections.deque'>
False False None <type 'NoneType'>
False False NotImplemented <type 'NotImplementedType'>
False False <generator object at 0x00F230A8> <type 'generator'>
False False Instance w/o getitem <type 'instance'>
False False NewStyle Instance w/o getitem <class '__main__.cn'>
True True Instance w getitem <type 'instance'>
True True NewStyle Instance w getitem <class '__main__.cng'>
True False {} <type 'dict'>
True True {} <type 'instance'>



Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Edward C. Jones
Guido van Rossum wrote:
 I think the pattern hasn't been commonly known; people have been
 struggling with setdefault() all these years.

I use setdefault _only_ to speed up the following code pattern:

if akey not in somedict:
    somedict[akey] = list()
somedict[akey].append(avalue)

These lines of simple Python are much easier to read and write than

somedict.setdefault(akey, list()).append(avalue)


Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Ron Adam
Terry Reedy wrote:

 Greg Ewing [EMAIL PROTECTED] wrote in message 

 Which is why I think that only *unicode* codings should be
 available through the .encode and .decode interface. Or
 alternatively there should be something more explicit like
 .unicode_encode and .unicode_decode that is thus restricted.
 
 I prefer the shorter names and using recode, for instance, for bytes to 
 bytes.

I prefer constructors with an explicit encode argument, and a 
recode() method for 'like to like' coding.  Then the whole encode/decode 
confusion goes away.




Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Michael Foord
Raymond Hettinger wrote:
 [Ian Bicking]
 They seem terribly pointless to me.

 FWIW, here is the script that I had used while updating and improving 
 the two functions (can't remember whether it was for Py2.3 or Py2.4).  
 It lists comparative results for many different types of inputs.  
 Since perfection was not possible, the goal was to have no false 
 negatives and mostly accurate positives.  IMO, they do a pretty good 
 job and are able to access information not otherwise visible to 
 pure Python code.  With respect to user defined instances, I don't 
 care that they can't draw a distinction where none exists in the first 
 place -- at some point you have to either fallback on duck-typing or 
 be in control of what kind of arguments you submit to your functions. 
 Practicality beats purity -- especially when a pure solution doesn't 
 exist (i.e. given a user defined class that defines just __getitem__, 
 both mapping and sequence behavior is a possibility).

But given :

True True Instance w getitem <type 'instance'>
True True NewStyle Instance w getitem <class '__main__.cng'>
True True [] <class UserList.UserList at 0x00F11B70>
True True {} <type 'instance'>

(Last one is UserDict)

I can't conceive of circumstances where this is useful without duck 
typing *as well*.

The tests seem roughly analogous to :

def isMappingType(obj):
    return isinstance(obj, dict) or hasattr(obj, '__getitem__')

def isSequenceType(obj):
    return (isinstance(obj, (basestring, list, tuple, collections.deque))
            or hasattr(obj, '__getitem__'))

If you want to allow sequence access you could either just use the 
isinstance or you *have* to trap an exception in the case of a mapping 
object being passed in.

Redefining (effectively) as :

def isMappingType(obj):
    return isinstance(obj, dict) or (hasattr(obj, '__getitem__')
                                     and hasattr(obj, 'keys'))

def isSequenceType(obj):
    return (isinstance(obj, (basestring, list, tuple, collections.deque))
            or (hasattr(obj, '__getitem__')
                and not hasattr(obj, 'keys')))

Makes the test useful where you want to know you can safely treat an 
object as a mapping (or sequence) *and* where you want to tell the 
difference.

The only code that would break is use of mapping objects that don't 
define ``keys`` and sequences that do. I imagine these must be very rare 
and *would* be interested in seeing real code that does break. 
Especially if that code cannot be trivially rewritten to use the first 
example.
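A sketch of the redefined predicates in modern Python (the old operator.isMappingType/isSequenceType helpers no longer exist in Python 3, so the names here, and the DuckMapping example class, are this sketch's own):

```python
from collections import deque

def is_mapping_like(obj):
    # Heuristic from the proposal: a dict, or anything indexable that
    # also exposes keys() -- the trait sequences normally lack.
    return isinstance(obj, dict) or (
        hasattr(obj, '__getitem__') and hasattr(obj, 'keys'))

def is_sequence_like(obj):
    # Known sequence types, or anything indexable *without* keys().
    return isinstance(obj, (str, bytes, list, tuple, deque)) or (
        hasattr(obj, '__getitem__') and not hasattr(obj, 'keys'))

class DuckMapping:
    # Hypothetical user-defined type: only __getitem__ and keys().
    def __getitem__(self, key):
        return 10
    def keys(self):
        return []
```

With these definitions a dict-like user class tests as a mapping and not as a sequence, which is the distinction the message above is after.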

All the best,

Michael Foord

  Analysis Script 

from collections import deque
from UserList import UserList
from UserDict import UserDict
from operator import *

types = (set,
         int, float, complex, long, bool,
         str, unicode,
         list, UserList, tuple, deque,
         )

for t in types:
    print isMappingType(t()), isSequenceType(t()), repr(t()), repr(t)

class c:
    def __repr__(self):
        return 'Instance w/o getitem'

class cn(object):
    def __repr__(self):
        return 'NewStyle Instance w/o getitem'

class cg:
    def __repr__(self):
        return 'Instance w getitem'
    def __getitem__(self):
        return 10

class cng(object):
    def __repr__(self):
        return 'NewStyle Instance w getitem'
    def __getitem__(self):
        return 10

def f():
    return 1

def g():
    yield 1

for i in (None, NotImplemented, g(), c(), cn()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)

for i in (cg(), cng(), dict(), UserDict()):
    print isMappingType(i), isSequenceType(i), repr(i), type(i)



  Output 

False False set([]) <type 'set'>
False False 0 <type 'int'>
False False 0.0 <type 'float'>
False False 0j <type 'complex'>
False False 0L <type 'long'>
False False False <type 'bool'>
False True '' <type 'str'>
False True u'' <type 'unicode'>
False True [] <type 'list'>
True True [] <class UserList.UserList at 0x00F11B70>
False True () <type 'tuple'>
False True deque([]) <type 'collections.deque'>
False False None <type 'NoneType'>
False False NotImplemented <type 'NotImplementedType'>
False False <generator object at 0x00F230A8> <type 'generator'>
False False Instance w/o getitem <type 'instance'>
False False NewStyle Instance w/o getitem <class '__main__.cn'>
True True Instance w getitem <type 'instance'>
True True NewStyle Instance w getitem <class '__main__.cng'>
True False {} <type 'dict'>
True True {} <type 'instance'>



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] defaultdict and on_missing()

2006-02-22 Thread Michael Chermside
A minor related point about on_missing():

Haven't we learned from regrets over the .next() method of iterators
that all magically invoked methods should be named using the __xxx__
pattern? Shouldn't it be named __on_missing__() instead?
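As it turned out, the dunder convention is what shipped: a dict subclass can define __missing__(), which dict.__getitem__ invokes for absent keys. A minimal sketch:

```python
class CountingDict(dict):
    # dict.__getitem__ on a subclass calls __missing__(key) for an
    # absent key instead of raising KeyError directly.
    def __missing__(self, key):
        self[key] = 0
        return 0

d = CountingDict()
d['spam'] += 1
d['spam'] += 1
```

This is also the hook collections.defaultdict is built on.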

-- Michael Chermside

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Raymond Hettinger
 But  given :

 True True Instance w getitem <type 'instance'>
 True True NewStyle Instance w getitem <class '__main__.cng'>
 True True [] <class UserList.UserList at 0x00F11B70>
 True True {} <type 'instance'>

 (Last one is UserDict)

 I can't conceive of circumstances where this is useful without duck
 typing *as well*.

Yawn.  Give it up.  For user defined instances, these functions can only 
discriminate between the presence or absence of __getitem__.  If you're trying 
to distinguish between sequences and mappings for instances, you're on your own 
with duck-typing.  Since there is no mandatory mapping or sequence API, the 
operator module functions cannot add more checks without getting some false 
negatives (your original example is a case in point).

Use the function as-is and add your own isinstance checks for your own personal 
definition of what makes a mapping a mapping and what makes a sequence a 
sequence.  Or better yet, stop designing APIs that require you to differentiate 
things that aren't really different ;-)


Raymond 
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Samuele Pedroni
Almann T. Goo wrote:
As far as I remember, Guido wasn't particularly opposed
to the idea, but the discussion fizzled out after having
failed to reach a consensus on an obviously right way
to go about it.
 
 
 My apologies for bringing this debated topic again to the
 front-lines--that said, I think there has been good, constructive
 things said again and sometimes it doesn't hurt to kick up an old
 topic.  After poring through some of the list archive threads and
 reading through this thread, it seems clear to me that the community
 doesn't seem all that keen on fixing the issue--which was my goal to
 ferret out.
 
 For me this is one of those things where the Pythonic thing to do is
 not so clear--and that mysterious, enigmatic definition of what it
 means to be Pythonic can be quite individual so I definitely don't
 want to waste my time arguing what that means.
 
 The most compelling argument for not doing anything about it is that
 the use cases are probably not that many--that in itself makes me less
 apt to push much harder--especially since my pragmatic side agrees
 with a lot of what has been said to this regard.
 
 IMO, Having properly nested scopes in Python in a sense made having
 closures a natural idiom to the language and part of its user
 interface.  By not allowing the name re-binding it almost seems like
 that user interface has a rough edge that is almost too easy to get
 cut on.  This in-elegance seems very un-Pythonic to me.
 

If you are looking for rough edges about nested scopes in Python
this is probably worse:

>>> x = []
>>> for i in range(10):
...   x.append(lambda : i)
...
>>> [y() for y in x]
[9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

although experienced people can live with it. The fact is that when
nested scopes were imported from the likes of Scheme, nobody considered
that in Scheme, for example, looping constructs introduce new scopes,
so this works more as expected there. There were long threads
about this at some point too.

Idioms and features mostly never port straightforwardly from language
to language.
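The usual workaround for the loop-variable gotcha above is to capture the current value as a default argument, e.g.:

```python
x = []
for i in range(10):
    # Bind the *current* value of i via a default argument; a plain
    # "lambda: i" would close over the variable and see its final value.
    x.append(lambda i=i: i)

values = [f() for f in x]  # each lambda remembers its own i
```

The default is evaluated once, at lambda-definition time, which is exactly the per-iteration binding the closure otherwise lacks.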

For example, Python has nothing like the explicit context introduction
and grouping of a Scheme 'let', so it is arguable that nested-scope
code, especially with rebindings, would be less clear and readable than
in Scheme (tastes in parentheses kept aside).




 Anyhow, good discussion.
 
 Cheers,
 Almann
 
 --
 Almann T. Goo
 [EMAIL PROTECTED]
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe: 
 http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 358 (bytes type) comments

2006-02-22 Thread Brett Cannon
First off, thanks to Neil for writing this all down.  The whole thread
of discussion on the bytes type was rather long and thus hard to
follow.  Nice to finally have it written down in a PEP.

Anyway, a few comments on the PEP.  One, should the hex() method
instead be an attribute, implemented as a property?  Seems like static
data that is entirely based on the value of the bytes object and thus
is not properly represented by a method.
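A toy illustration of hex-as-property (the Bytes class here is a hypothetical stand-in, not the PEP's type):

```python
import binascii

class Bytes:
    # Minimal stand-in type, just enough to show the property idea.
    def __init__(self, data):
        self._data = bytes(data)

    @property
    def hex(self):
        # Read-only data derived entirely from the value: a natural
        # fit for a property rather than a method call.
        return binascii.hexlify(self._data).decode('ascii')
```

Accessing `b.hex` then reads like static data, which is the point being argued.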

Next, why are the __*slice__ methods to be defined?  Docs say they are
deprecated.

And for the open-ended questions, I don't think sort() is needed.

Lastly, maybe I am just dense, but it took me a second to realize that
it will most likely return the ASCII string for __str__() for use in
something like socket.send(), but it isn't explicitly stated anywhere.
 There is a chance someone might think that __str__ will somehow
return the sequence of integers as a string.

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Samuele Pedroni
Greg Ewing wrote:
 Jeremy Hylton wrote:
 
 
The names of naming statements are quite hard to get right, I fear.
 
 
 My vote goes for 'outer'.
 
 And if this gets accepted, remove 'global' in 3.0.
 

In 3.0 we could remove 'global' even without 'outer',
and make module global scopes read-only, not rebindable
after the top-level code has run (i.e. more like function
body scopes). The only free-for-all namespaces would be
class and instance ones. I can think of some
gains from this.  <0.3 wink>
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pre-PEP: The bytes object

2006-02-22 Thread Neil Schemenauer
On Thu, Feb 16, 2006 at 12:47:22PM -0800, Guido van Rossum wrote:
 BTW, for folks who want to experiment, it's quite simple to create a
 working bytes implementation by inheriting from array.array. Here's a
 quick draft (which only takes str instance arguments):

Here's a more complete prototype.  Also, I checked in the PEP as
#358 after making changes suggested by Guido.

  Neil



import sys
from array import array
import re
import binascii

class bytes(array):

    __slots__ = []

    def __new__(cls, initialiser=None, encoding=None):
        b = array.__new__(cls, "B")
        if isinstance(initialiser, basestring):
            if isinstance(initialiser, unicode):
                if encoding is None:
                    encoding = sys.getdefaultencoding()
                initialiser = initialiser.encode(encoding)
            initialiser = [ord(c) for c in initialiser]
        elif encoding is not None:
            raise TypeError("explicit encoding invalid for non-string "
                            "initialiser")
        if initialiser is not None:
            b.extend(initialiser)
        return b

    @classmethod
    def fromhex(cls, data):
        data = re.sub(r'\s+', '', data)
        return bytes(binascii.unhexlify(data))

    def __str__(self):
        return self.tostring()

    def __repr__(self):
        return "bytes(%r)" % self.tolist()

    def __add__(self, other):
        if isinstance(other, array):
            return bytes(super(bytes, self).__add__(other))
        return NotImplemented

    def __mul__(self, n):
        return bytes(super(bytes, self).__mul__(n))

    __rmul__ = __mul__

    def __getslice__(self, i, j):
        return bytes(super(bytes, self).__getslice__(i, j))

    def hex(self):
        return binascii.hexlify(self.tostring())

    def decode(self, encoding):
        return self.tostring().decode(encoding)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] release plan for 2.5 ?

2006-02-22 Thread Anthony Baxter
On Sunday 12 February 2006 21:51, Thomas Wouters wrote:
 Well, in the past, features -- even syntax changes -- have gone in
 between the last beta and the final release (but reminding Guido
 might bring him to tears of regret. ;) Features have also gone into
 what would have been 'bugfix releases' if you looked at the
 numbering alone (1.5 -> 1.5.1 -> 1.5.2, for instance.) The past
 doesn't have a very impressive track record... 

*cough* Go on. Try slipping a feature into a bugfix release now, see 
how loudly you can make an Australian swear...

See also PEP 006. Do I need to add a bad language caveat in it?

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 358 (bytes type) comments

2006-02-22 Thread Bob Ippolito

On Feb 22, 2006, at 1:22 PM, Brett Cannon wrote:

 First off, thanks to Neil for writing this all down.  The whole thread
 of discussion on the bytes type was rather long and thus hard to
 follow.  Nice to finally have it written down in a PEP.

 Anyway, a few comments on the PEP.  One, should the hex() method
 instead be an attribute, implemented as a property?  Seems like static
 data that is entirely based on the value of the bytes object and thus
 is not properly represented by a method.

 Next, why are the __*slice__ methods to be defined?  Docs say they are
 deprecated.

 And for the open-ended questions, I don't think sort() is needed.

sort would be totally useless for bytes.  array.array doesn't have  
sort either.

 Lastly, maybe I am just dense, but it took me a second to realize that
 it will most likely return the ASCII string for __str__() for use in
 something like socket.send(), but it isn't explicitly stated anywhere.
  There is a chance someone might think that __str__ will somehow
 return the sequence of integers as a string does exist.

That would be a bad idea given that bytes are supposed to make the str  
type go away.  It's probably better to make __str__ return __repr__  
like it does for most types.  If bytes type supports the buffer API  
(one would hope so), functions like socket.send should do the right  
thing as-is.

http://docs.python.org/api/bufferObjects.html

-bob

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] release plan for 2.5 ?

2006-02-22 Thread Anthony Baxter
On Thursday 23 February 2006 09:19, Guido van Rossum wrote:
 However the definition of feature vs. bugfix isn't always
 crystal clear.

 Some things that went into 2.4 recently felt like small features to
 me; but others may disagree:

 - fixing chunk.py to allow chunk size to be > 2GB
 - supporting Unicode filenames in fileinput.py

 Are these features or bugfixes?

Sure, the line isn't so clear sometimes. I consider both of these 
bugfixes, but others could disagree. True/False, on the other hand, I 
don't think anyone disagrees about <wink/duck>

This stuff is always open for discussion, of course. 

Anthony
-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Delaney, Timothy (Tim)
Raymond Hettinger wrote:

 Your example simply highlights the consequences of one of Python's
 most basic, original design choices (using getitem for both sequences
 and mappings).  That choice is now so fundamental to the language
 that it cannot possibly change. 

Hmm - just a thought ...

Since we're adding the __index__ magic method, why not have a
__getindexed__ method for sequences.

Then semantics of indexing operations would be something like:

if hasattr(obj, '__getindexed__'):
    return obj.__getindexed__(val.__index__())
else:
    return obj.__getitem__(val)

Similarly __setindexed__ and __delindexed__.

This would allow distinguishing between sequences and mappings in a
fairly backwards-compatible way. It would also enforce that only indexes
can be used for sequences.

The backwards-incompatibility comes in when you have a type that
implements __getindexed__, and a subclass that implements __getitem__
e.g. if `list` implemented __getindexed__ then any `list` subclass that
overrode __getitem__ would fail. However, I think we could make it 100%
backwards-compatible for the builtin sequence types if they just had
__getindexed__ delegate to __getitem__. Effectively:

class list(object):

    def __getindexed__(self, index):
        return self.__getitem__(index)
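The __getindexed__ protocol above is hypothetical (it was never adopted), but the proposed dispatch can be sketched with stand-in classes:

```python
def getitem_dispatch(obj, val):
    # Proposed semantics: sequences receive the __index__()-converted
    # value; mappings receive the raw key via __getitem__.
    if hasattr(obj, '__getindexed__'):
        return obj.__getindexed__(val.__index__())
    return obj.__getitem__(val)

class Seq:
    # Toy "sequence" that opts in to the hypothetical protocol.
    def __init__(self, items):
        self.items = list(items)
    def __getindexed__(self, index):
        return self.items[index]

class Map:
    # Toy "mapping": only __getitem__, so it is dispatched as a mapping.
    def __init__(self, data):
        self.data = dict(data)
    def __getitem__(self, key):
        return self.data[key]
```

The presence or absence of __getindexed__ is what would let code tell sequences from mappings, which is the distinction the earlier thread wanted.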

Tim Delaney
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] buildbot, and test failures

2006-02-22 Thread Anthony Baxter
It took 2 hours, but I caught up on Python-dev email. Hoorah.

So, couple of things - the trunk has test failures for me, right now. 
test test_email failed -- Traceback (most recent call last):
  File 
/home/anthony/src/py/pytrunk/python/Lib/email/test/test_email.py, 
line 2111, in test_parsedate_acceptable_to_time_functions
eq(time.localtime(t)[:6], timetup[:6])
AssertionError: (2003, 2, 5, 14, 47, 26) != (2003, 2, 5, 13, 47, 26)

Right now, Australia's in daylight savings, I suspect that's the 
problem here.

I also see intermittent failures from test_socketserver:
test_socketserver
test test_socketserver crashed -- socket.error: (111, 'Connection 
refused')
is the only error message. When it fails, regrtest fails to exit - it
just sits there after printing out the summary. This suggests that 
there's a threaded server not getting cleaned up correctly.  
test_socketserver could probably do with a rewrite. 

Who's the person who hands out buildbot username/password pairs? I 
have an Ubuntu x86 box here that can become one (I think the only 
linux, currently, is Gentoo...)

Anthony
-- 
Anthony Baxter [EMAIL PROTECTED]
It's never too late to have a happy childhood.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] buildbot vs. Windows

2006-02-22 Thread Tim Peters
[Neal Norwitz]
 ...
 I also think I know how to do the double builds (one release and one
 debug).  But it's too late for me to change it tonight without
 screwing it up.

I'm not mad :-).  The debug build is more fruitful than the release
build for finding problems, so doing two debug-build runs is an
improvement (keeping in mind that some bugs only show up in release
builds, though -- for example, subtly incorrect C code that works
differently depending on whether compiler optimization is in effect).

 The good/bad news after this change is:

 http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/145/step-test/0

 A seg fault on Mac OS when running with -r. :-(

Yay!  That's certainly good/bad news.  Since I always run with -r,
I've had the fun of tracking most of these down.  Sometimes it's very
hard, sometimes not.  regrtest's -f option is usually needed, to force
running the tests in exactly the same order, then commenting test
names out in binary-search fashion to get a minimal subset.  Alas,
half the time the cause for a -r segfault turns out to be an error in
refcounting or in setting up gc'able containers, and has nothing in
particular to do with the specific tests being run.  Those are the
very hard ones ;-)  Setting the gc threshold to 1 (do a full
collection on every allocation) can sometimes provoke such problems
easily.
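For instance, a sketch of the threshold trick (the suspect code goes where the placeholder comment sits):

```python
import gc

# Force a collection pass on (nearly) every tracked allocation; this
# greatly raises the odds of tripping over a container that was exposed
# to the collector before being fully initialised.
old = gc.get_threshold()
gc.set_threshold(1)
# ... run the suspect code here ...
gc.set_threshold(*old)  # restore the normal thresholds
```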
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Greg Ewing
Josiah Carlson wrote:
 However, I believe global was and is necessary for the
 same reasons for globals in any other language.

Oddly, in Python, 'global' isn't actually necessary,
since the module can always import itself and use
attribute access.

Clearly, though, Guido must have thought at the time
that it was worth providing an alternative way.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] defaultdict proposal round three

2006-02-22 Thread Greg Ewing
Fredrik Lundh wrote:

 fwiw, the first google hit for autodict appears to be part of someone's
 link farm
 
 At this website we have assistance with autodict. In addition to
 information for autodict we also have the best web sites concerning
 dictionary, non profit and new york.

Hmmm, looks like some sort of bot that takes the words in
your search and stuffs them into its response. I wonder
if they realise how silly the results end up sounding?

I've seen these sorts of things before, but I haven't
quite figured out yet how they manage to get into Google's
database if they're auto-generated. Anyone have any clues
what goes on?

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Terry Reedy wrote:
 Greg Ewing [EMAIL PROTECTED] wrote in message 

Efficiency is an implementation concern.

 It is also a user concern, especially if inefficiency overruns memory 
 limits.

Sure, but what I mean is that it's better to find what's
conceptually right and then look for an efficient way
of implementing it, rather than letting the implementation
drive the design.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Brendan Simons
On 22-Feb-06, at 9:28 PM, [EMAIL PROTECTED] wrote:

 On 21-Feb-06, at 11:21 AM, "Almann T. Goo" [EMAIL PROTECTED] wrote:

  Why not just use a class?

  def incgen(start=0, inc=1):
      class incrementer(object):
          a = start - inc
          def __call__(self):
              self.a += inc
              return self.a
      return incrementer()

  a = incgen(7, 5)
  for n in range(10):
      print a(),

 Because I think that this is a workaround for a concept that the
 language doesn't support elegantly with its lexically nested scopes.
 IMO, you are emulating name rebinding in a closure by creating an
 object to encapsulate the name you want to rebind--you don't need this
 workaround if you only need to access free variables in an enclosing
 scope.  I provided a "lighter" example that didn't need a callable
 object but could use any mutable such as a list.

 This kind of workaround is needed as soon as you want to re-bind a
 parent scope's name, except in the case when the parent scope is the
 global scope (since there is the "global" keyword to handle this).
 It's this dichotomy that concerns me, since it seems to be against the
 elegance of Python--at least in my opinion.

 It seems artificially limiting that enclosing scope name rebinds are
 not provided for by the language especially since the behavior with
 the global scope is not so.  In a nutshell I am proposing a solution
 to make nested lexical scopes to be orthogonal with the global scope
 and removing a "wart," as Jeremy put it, in the language.

 -Almann

 --
 Almann T. Goo
 [EMAIL PROTECTED]

If I may be so bold, couldn't this be addressed by introducing a
"rebinding" operator?  So the ' = ' operator would continue to create a
new name in the current scope, and the (say) ' := ' operator would
rebind an existing name.

The two operators would highlight the special way Python handles
variable / name assignment, which many newbies miss.

(from someone who was surprised by this quirk of Python before:
http://www.thescripts.com/forum/thread43418.html)

-Brendan

--
Brendan Simons

Sorry, this got hung up in my email outbox.  I see the thread has
touched on this idea in the meantime.  So, yeah.  Go team.

Brendan

--
Brendan Simons
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)

2006-02-22 Thread Greg Ewing
Steven Bethard wrote:
  And, as you mention, it's consistent
 with the relative import feature.

Only rather vaguely -- it's really somewhat different.

With imports, .foo is an abbreviation for myself.foo,
where myself is the absolute name for the current module,
and you could replace all instances of .foo with that.
But in the suggested scheme, .foo wouldn't have any
such interpretation -- there would be no other way of
spelling it.

Also, with imports, the dot refers to a single well-
defined point in the module-name hierarchy, but here it
would imply a search upwards through the scope hierarchy.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
Ron Adam wrote:

 While I prefer constructors with an explicit encode argument, and use a 
 recode() method for 'like to like' coding.  Then the whole encode/decode 
 confusion goes away.

I'd be happy with that, too.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)

2006-02-22 Thread Phillip J. Eby
At 03:49 PM 2/23/2006 +1300, Greg Ewing wrote:
Steven Bethard wrote:
   And, as you mention, it's consistent
  with the relative import feature.

Only rather vaguely -- it's really somewhat different.

With imports, .foo is an abbreviation for myself.foo,
where myself is the absolute name for the current module,
and you could replace all instances of .foo with that.

Actually, import .foo is an abbreviation for import myparent.foo, not 
import myparent.myself.foo.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] getdefault(), the real replacement for setdefault()

2006-02-22 Thread Barry Warsaw
Guido's on_missing() proposal is pretty good for what it is, but it is
not a replacement for setdefault().  The use cases for a derivable,
definition or instantiation time framework is different than the
call-site based decision being made with setdefault().  The difference
is that in the former case, the class designer or instantiator gets to
decide what the default is, and in the latter (i.e. current) case, the
user gets to decide.

Going back to first principles, the two biggest problems with today's
setdefault() are 1) the default object gets instantiated whether you need
it or not, and 2) the idiom is not very readable.

To directly address these two problems, I propose a new method called
getdefault() with the following signature:

def getdefault(self, key, factory)

This yields the following idiom:

d.getdefault('foo', list).append('bar')

Clearly this completely addresses problem #1.  The implementation is
simple and obvious, and there's no default object instantiated unless
the key is missing.

I think #2 is addressed nicely too because getdefault() shifts the
focus to what the method returns rather than the effect of the method on
the target dict.  Perhaps that's enough to make the chained operation on
the returned value feel more natural.  getdefault() also looks more
like get() so maybe that helps it be less jarring.

This approach also seems to address Raymond's objections because
getdefault() isn't special the way on_missing() would be.

Anyway, I don't think it's an either/or choice with Guido's subclass.
Instead I think they are different use cases.  I would add getdefault()
to the standard dict API, remove (eventually) setdefault(), and add
Guido's subclass in a separate module.  But I /wouldn't/ clutter the
built-in dict's API with on_missing().

-Barry

P.S.

_missing = object()

def getdefault(self, key, factory):
    value = self.get(key, _missing)
    if value is _missing:
        value = self[key] = factory()
    return value
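A runnable sketch of the proposal as a dict subclass (the DefaultingDict name is illustrative only; the proposal adds the method to dict itself):

```python
_missing = object()  # unique sentinel so None can be a stored value

class DefaultingDict(dict):
    def getdefault(self, key, factory):
        # The factory is only called -- and the value only stored --
        # when the key is actually absent.
        value = self.get(key, _missing)
        if value is _missing:
            value = self[key] = factory()
        return value

d = DefaultingDict()
d.getdefault('foo', list).append('bar')
d.getdefault('foo', list).append('baz')
```

The second call finds the key present, so `list` is not invoked again and the same stored object is returned.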



___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)

2006-02-22 Thread Steven Bethard
Steven Bethard wrote:
 And, as you mention, it's consistent with the relative import feature.

Greg Ewing wrote:
 With imports, .foo is an abbreviation for myself.foo,
 where myself is the absolute name for the current module,
 and you could replace all instances of .foo with that.

Phillip J. Eby wrote:
 Actually, import .foo is an abbreviation for import myparent.foo, not
 import myparent.myself.foo.

If we wanted to be fully consistent with the relative import
mechanism, we would require as many dots as nested scopes.  So:

    def incrementer(val):
        def inc():
            .val += 1
            return .val
        return inc

but also:

    def incrementer_getter(val):
        def incrementer():
            def inc():
                ..val += 1
                return ..val
            return inc
        return incrementer

(Yes, I know the example is silly.  It's not meant as a use case, just
to demonstrate the usage of dots.)  I actually don't care which way it
goes here, but if you want to make the semantics as close to the
relative import semantics as possible, then this is the way to go.
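For the record, the rebinding half of this discussion eventually landed in Python 3.0 as the nonlocal statement (PEP 3104) rather than as leading dots; the incrementer example then reads:

```python
def incrementer(val):
    def inc():
        nonlocal val   # rebind 'val' in the enclosing function's scope
        val += 1
        return val
    return inc

bump = incrementer(5)
```

Each call to `bump()` rebinds the enclosing `val`, with no dots and no scope-counting.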

STeVe
--
Grammar am for people who can't think for myself.
--- Bucky Katt, Get Fuzzy
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)

2006-02-22 Thread Greg Ewing
Steven Bethard wrote:

 Phillip J. Eby wrote:
 
Actually, import .foo is an abbreviation for import myparent.foo, not
import myparent.myself.foo.

Oops, sorry, you're right.

s/myself/myparent/g

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] buildbot, and test failures

2006-02-22 Thread Martin v. Löwis
Anthony Baxter wrote:
 Who's the person who hands out buildbot username/password pairs?

That's me.

 I 
 have an Ubuntu x86 box here that can become one (I think the only 
 linux, currently, is Gentoo...)

How different are the Linuxes, though? How many of them do we need?

Regards,
Martin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Greg Ewing
James Y Knight wrote:

 Some MIME sections
 might have a base64 Content-Transfer-Encoding, others might be 8bit
 encoded, others might be 7bit encoded, others might be quoted-printable
 encoded.

I stand corrected -- in that situation you would have to encode
the characters before combining them with other material.

However, this doesn't change my view that the result of base64
encoding by itself is characters, not bytes. To go straight
to bytes would require assuming an encoding, and that would
make it *harder* to use in cases where you wanted a different
encoding, because you'd first have to undo the default
encoding and then re-encode it using the one you wanted.

It may be reasonable to provide an easy way to go straight
from raw bytes to ascii-encoded-base64 bytes, but that should
be a different codec. The plain base64 codec should produce
text.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Greg Ewing
Samuele Pedroni wrote:

 If you are looking for rough edges about nested scopes in Python
 this is probably worse:
 
 >>> x = []
 >>> for i in range(10):
 ...   x.append(lambda : i)
 ...
 >>> [y() for y in x]
 [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

As an aside, is there any chance that this could be
changed in 3.0? I.e. have the for-loop create a new
binding for the loop variable on each iteration.

I know Guido seems to be attached to the idea of
being able to use the value of the loop variable
after the loop exits, but I find that to be a dubious
practice readability-wise, and I can't remember ever
using it. There are other ways of getting the same
effect, e.g. assigning it to another variable before
breaking out of the loop, or putting the loop in a
function and using return.
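[For completeness, the usual workaround in today's Python is to freeze the loop value with a default argument -- a sketch, separate from the per-iteration-binding change proposed above:]

```python
x = []
for i in range(10):
    # Default arguments are evaluated at definition time, so each
    # lambda captures the value of i from its own iteration.
    x.append(lambda i=i: i)

print([y() for y in x])  # [0, 1, 2, ..., 9] rather than ten 9s
```
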

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Greg Ewing
Delaney, Timothy (Tim) wrote:

 Since we're adding the __index__ magic method, why not have a
 __getindexed__ method for sequences.

I don't think this is a good idea, since it would be
re-introducing all the confusion that the existence of
two C-level indexing slots has led to, this time for
user-defined types.

 The backwards-incompatibility comes in when you have a type that
 implements __getindexed__, and a subclass that implements __getitem__

I don't think this is just a backwards-incompatibility
issue. Having a single syntax that can correspond to more
than one special method is inherently ambiguous. What do
you do if both are defined? Sure you can come up with
some rule to handle it, but it's better to avoid the
situation in the first place.

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes

2006-02-22 Thread Greg Ewing
Almann T. Goo wrote:

  (although rebinding a name in the global scope from a
 local scope is really just a specific case of that).

That's what rankles people about this, I think -- there
doesn't seem to be a good reason for treating the global
scope so specially, given that all scopes could be
treated uniformly if only there were an 'outer' statement.
All the arguments I've seen in favour of the status quo
seem like rationalisations after the fact.

  Since there were no nested lexical scopes back
 then, there was no need to have a construct for arbitrary enclosing
 scopes.

However, if nested scopes *had* existed back then, I
rather suspect we would have had an 'outer' statement
from the beginning, or else 'global' would have been
given the semantics we are now considering for 'outer'.

Of all the suggestions so far, it seems to me that
'outer' is the least radical and most consistent with
what we already have. How about we bung it in and see
how it goes? We can always yank it out in 3.0 if it
turns out to be a horrid mistake and we get swamped
with a terabyte of grievously abusive nested scope
code. :-)

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] Path PEP: some comments (equality)

2006-02-22 Thread Greg Ewing
Mark Mc Mahon wrote:

 Should the path class implement an __eq__ method that might do some of
 the following things:
  - Get the absolute path of both self and the other path

I don't think that any path operations should implicitly
touch the file system like this. The paths may not
represent real files or may be for a system other than
the one the program is running on.

  - normcase both

Not sure about this one either. When dealing with remote
file systems, it can be hard to know whether a path will
be interpreted as case-sensitive or not. This can be a
problem even with local filesystems, e.g. on MacOSX
where you can have both HFS (case-insensitive) and
Unix (case-sensitive) filesystems mounted.
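[The stdlib itself illustrates the problem: ntpath and posixpath disagree about normcase, so a generic path __eq__ cannot pick one safely. A small sketch:]

```python
import ntpath
import posixpath

# Windows path rules fold case (and flip slashes); POSIX rules do neither.
assert ntpath.normcase("Foo\\Bar") == "foo\\bar"
assert posixpath.normcase("Foo/Bar") == "Foo/Bar"

# So whether two paths are "equal" depends on which filesystem's rules
# you assume -- something a path object cannot know in general.
assert ntpath.normcase("FOO") == ntpath.normcase("foo")
assert posixpath.normcase("FOO") != posixpath.normcase("foo")
```
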

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] operator.is*Type

2006-02-22 Thread Greg Ewing
Fuzzyman wrote:

 The operator module defines two functions :
 
 isMappingType
 isSequenceType
 
  These protocols are loosely defined. Any object which has a
  ``__getitem__`` method defined could support either protocol.

These functions are actually testing for the presence
of two different __getitem__ methods at the C level, one
in the mapping substructure of the type object, and the
other in the sequence substructure. This only works
for types implemented in C which make use of this distinction.
It's not much use for user-defined classes, where the
presence of a __getitem__ method causes both of these
slots to become populated.

Having two different slots for __getitem__ seems to have
been an ill-considered feature in the first place and
would probably best be removed in 3.0. I wouldn't mind if
these two functions went away.
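[The ambiguity is easy to demonstrate in pure Python: a single __getitem__ serves both protocols, so there is nothing for such a test to latch onto. A sketch (the class name is made up for illustration):]

```python
class Doubler:
    """Mapping or sequence?  A lone __getitem__ can't say."""
    def __getitem__(self, key):
        return key * 2

d = Doubler()
assert d["ab"] == "abab"   # looks like a mapping lookup
assert d[3] == 6           # looks like sequence indexing
```
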

-- 
Greg Ewing, Computer Science Dept, +--+
University of Canterbury,  | Carpe post meridiam! |
Christchurch, New Zealand  | (I'm not a morning person.)  |
[EMAIL PROTECTED]  +--+


Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
>>>>> "Greg" == Greg Ewing [EMAIL PROTECTED] writes:

Greg> Stephen J. Turnbull wrote:

 Base64 is a (family of) wire protocol(s).  It's not clear to me
 that it makes sense to say that the alphabets used by baseNN
 encodings are composed of characters,

Greg> Take a look at [this that the other]

Those references use "character" in an ambiguous and ill-defined way.
Trying to impose Python unicode object semantics on vague "characters"
is a bad idea IMO.

Greg> Which seems to make it perfectly clear that the result of
Greg> the encoding is to be considered as characters, which are
Greg> not necessarily going to be encoded using ascii.

Please define "character", and explain how its semantics map to
Python's unicode objects.

Greg> So base64 on its own is *not* a wire protocol. Only after
Greg> encoding the characters do you have a wire protocol.

No, base64 isn't a wire protocol.  Rather, it's a schema for a family
of wire protocols, whose alphabets are heuristically chosen on the
assumption that code units which happen to correspond to alpha-numeric
code points in a commonly-used coded character set are more likely to
pass through a communication channel without corruption.

Note that I have _precisely_ defined what I mean.  You still have the
problem that you haven't defined "character", and that is a real
problem, see below.

 I don't see any case for correctness here, only for
 convenience,

Greg> I'm thinking of convenience, too. Keep in mind that in Py3k,
Greg> 'unicode' will be called 'str' (or something equally neutral
Greg> like 'text') and you will rarely have to deal explicitly
Greg> with unicode codings, this being done mostly for you by the
Greg> I/O objects. So most of the time, using base64 will be just
Greg> as convenient as it is today: base64_encode(my_bytes) and
Greg> write the result out somewhere.

Convenient, yes, but incorrect.  Once you mix those bytes with the
Python string type, they become subject to all the usual operations on
characters, and there's no way for Python to tell you that you didn't
want to do that.  Ie,

Greg> Whereas if the result is text, the right thing happens
Greg> automatically whatever the ultimate encoding turns out to
Greg> be. You can take the text from your base64 encoding, combine
Greg> it with other text from any other source to form a complete
Greg> mail message or xml document or whatever, and write it out
Greg> through a file object that's using any unicode encoding at
Greg> all, and the result will be correct.

Only if you do no transformations that will harm the base64-encoding.
This is why I say base64 is _not_ based on characters, at least not in
the way they are used in Python strings.  It doesn't allow any of the
usual transformations on characters that might be applied globally to
a mail composition buffer, for example.

In other words, you don't escape from the programmer having to know
what he's doing.  EIBTI, and the setup I advocate forces the
programmer to explicitly decide where to convert base64 objects to a
textual representation.  This reminds him that he'd better not touch
that text.

Greg> The reason I say it's *correct* is that if you go straight
Greg> from bytes to bytes, you're *assuming* the eventual encoding
Greg> is going to be an ascii superset.  The programmer is going
Greg> to have to know about this assumption and understand all its
Greg> consequences and decide whether it's right, and if not, do
Greg> something to change it.

I'm not assuming any such thing, except in the context of analysis of
implementation efficiency.  And the programmer needs to know about the
semantics of text that is actually a base64-encoded object, and that
they are different from string semantics.

This is something that programmers are used to dealing with in the
case of Python 2.x str and C char[]; the whole point of the unicode
type is to allow the programmer to abstract from that when dealing
with human-readable text.  Why confuse the issue?

 And in the classroom, you're just going to confuse students by
 telling them that UTF-8 --[Unicode codec]--> Python string is
 decoding but UTF-8 --[base64 codec]--> Python string is
 encoding, when MAL is telling them that --> Python string is
 always decoding.

Greg> Which is why I think that only *unicode* codings should be
Greg> available through the .encode and .decode interface. Or
Greg> alternatively there should be something more explicit like
Greg> .unicode_encode and .unicode_decode that is thus restricted.

Greg> Also, if most unicode coding is done in the I/O objects,
Greg> there will be far less need for programmers to do explicit
Greg> unicode coding in the first place, so likely it will become
Greg> more of an advanced topic, rather than something you need to
Greg> come to grips with on day one of using unicode, like it is
Greg> now.

So then you bring it 

Re: [Python-Dev] Unifying trace and profile

2006-02-22 Thread Nicholas Bastin
On 2/21/06, Robert Brewer [EMAIL PROTECTED] wrote:
 1. Allow trace hooks to receive c_call, c_return, and c_exception events
 (like profile does).

I can easily make this modification.  You can also register the same
bound method for trace and profile, which sort of eliminates this
problem.

 2. Allow profile hooks to receive line events (like trace does).

You really don't want this in the general case.  Line events make
profiling *really* slow, and they're not that accurate (although many
thanks to Armin last year for helping me make them much more
accurate).  I guess what you require is to be able to selectively turn
on events, thus eliminating the notion of 'trace' or 'profile'
entirely, but I don't have a good idea of how to implement that at
least as efficiently as the current system at the moment - I'm sure it
could be done, I just haven't put any thought into it.

 3. Expose new sys.gettrace() and getprofile() methods, so trace and
 profile functions that want to play nice can call
 sys.settrace/setprofile(None) only if they are the current hook.

Not a bad idea, although are you really running into this problem a lot?
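[A sketch of the play-nice pattern item 3 describes, assuming the proposed sys.gettrace() accessor is available:]

```python
import sys

def my_trace(frame, event, arg):
    # Minimal trace hook: keep tracing nested frames.
    return my_trace

prev = sys.gettrace()       # whoever was installed before us (may be None)
sys.settrace(my_trace)
try:
    sum(range(10))          # ...traced work...
finally:
    # Uninstall only if we are still the current hook.
    if sys.gettrace() is my_trace:
        sys.settrace(prev)
```
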

 4. Make the same move that sys.exitfunc - atexit made (from a single
 function to multiple functions via registration), so multiple
 tracers/profilers can play nice together.

It seems very unlikely that you'll want to have a trace hook and
profile hook installed at the same time, given the extreme
unreliability this will introduce into the profiler.

 5. Allow the core to filter on the event arg before hook(frame, event,
 arg) is called.

What do you mean by this, exactly?  How would you use this feature?

 6. Unify tracing and profiling, which would remove a lot of redundant
 code in ceval and sysmodule and free up some space in the PyThreadState
 struct to boot.

The more events you add to profiling, the slower it gets, however.
Line events, while a nice thing to have in theory, would probably make
a profiler useless.  If you want to create line-by-line timing data,
we're going to have to look for a more efficient way (like sampling).

 7. As if the above isn't enough of a dream, it would be nice to have a
 bytecode tracer, which didn't bother with the f_lineno logic in
 maybe_call_line_trace, but just called the hook on every instruction.

I'm working on one, but given how much time I've had to work on my
profiler in the last year, I'm not even going to guess when I'll get a
real shot at looking at that.

My long-term goal is to eliminate profiling and tracing from the core
interpreter entirely and implement the functionality in such a way
that they don't cost you when not in use (i.e., implement profilers
and debuggers which poke into the process from the outside, rather
than be supported natively through events).  This isn't impossible,
but it's difficult because of the large variety of platforms.  I have
access to most of them, but again, my time is hugely constrained right
now for python development work.

--
Nick


Re: [Python-Dev] bytes.from_hex()

2006-02-22 Thread Stephen J. Turnbull
>>>>> "Ron" == Ron Adam [EMAIL PROTECTED] writes:

Ron> Terry Reedy wrote:

 I prefer the shorter names and using recode, for instance, for
 bytes to bytes.

Ron> While I prefer constructors with an explicit encode argument,
Ron> and use a recode() method for 'like to like' coding.

'Recode' is a great name for the conceptual process, but the methods
are directional.  Also, in internationalization work, recode
strongly connotes encodingA -> original -> encodingB, as in iconv.

I do prefer constructors, as it's generally not a good idea to do
encoding/decoding in-place for human-readable text, since the codecs
are often lossy.

Ron> Then the whole encode/decode confusion goes away.

Unlikely.  Errors like "A string".encode("base64").encode("base64")
are all too easy to commit in practice.
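[With a bytes-to-bytes API the slip is silent, as a quick sketch against the current base64 module shows:]

```python
import base64

raw = b"data"
once = base64.b64encode(raw)      # b'ZGF0YQ=='
twice = base64.b64encode(once)    # oops -- the output is bytes too, so this "works"
# One decode is no longer enough, and nothing flags the mistake.
assert base64.b64decode(twice) == once
assert base64.b64decode(base64.b64decode(twice)) == raw
```
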

-- 
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of TsukubaTennodai 1-1-1 Tsukuba 305-8573 JAPAN
   Ask not how you can do free software business;
  ask what your business can do for free software.


Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)

2006-02-22 Thread Almann T. Goo
 If we wanted to be fully consistent with the relative import
 mechanism, we would require as many dots as nested scopes.

At first I was a bit taken aback by the syntax, but after reading
PEP 328 (re: Relative Import) I think I can stomach the syntax a bit
better ; ).

That said, -1 because I believe it adds more problems than the one it
is designed to fix.

Part of me can appreciate using the prefixing dot as a way to spell
"my parent's scope", since it does not add a new keyword and in this
regard would appear to be just as backwards compatible as the ":="
proposal (of which I am not a particularly big fan either, but could
probably get used to it).

Since the current semantics allow *evaluation* of an enclosing scope's
name via an un-punctuated name, "var" is a synonym for ".var" (if
"var" is bound in the immediately enclosing scope).  However, for
*re-binding* an enclosing scope's name, the punctuated form is the
only one we can use, so the semantics become more cluttered.

This can make a problem that I would say is akin to the dangling else problem.

def incrementer_getter(val):
    def incrementer():
        val = 5
        def inc():
            ..val += 1
            return val
        return inc
    return incrementer

Building on an example that Steve wrote to demonstrate the proposed
syntax, you can see that a user may inadvertently pick up the
enclosing scope's binding for the return value instead of what would
presumably be the outermost bound parameter.  Now remove the binding
in the incrementer function and it works the way the user probably
intended.

Because of this, I think adding the dot as a way to resolve a name
explicitly hurts the language by introducing a new gotcha into the
existing name-binding semantics.

I would be okay with this if all name access for enclosing scopes
(binding and evaluation) required the dot syntax (as I believe Steve
suggests for Python 3K)--thus keeping the semantics cleaner--but that
would be incredibly backwards incompatible for what I would guess is
*a lot* of code.  This is where the case for the re-bind operator
(i.e. ":=") or an "outer"-type keyword is stronger--the semantics in
the language today are not adversely affected.

-Almann
--
Almann T. Goo
[EMAIL PROTECTED]