Re: [Python-Dev] defaultdict proposal round three
Raymond Hettinger wrote: Like autodict could mean anything. Everything is meaningless until you know something about it. If you'd never seen Python before, would you know what 'dict' meant? If I were seeing defaultdict for the first time, I would need to look up the docs before I was confident I knew exactly what it did -- as I've mentioned before, my initial guess would have been wrong. The same procedure would lead me to an understanding of 'autodict' just as quickly. Maybe 'autodict' isn't the best term either -- I'm open to suggestions. But my instincts still tell me that 'defaultdict' is the best term for something *else* that we might want to add one day as well, so I'm just trying to make sure we don't squander it lightly. -- Greg ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Josiah Carlson wrote: In this particular example, there is no net reduction in line use. The execution speed of your algorithm would be reduced due to function calling overhead. If there were more uses of the function, the line count reduction would be greater. In any case, line count and execution speed aren't the only issues -- there is DRY to consider. -- Greg
Re: [Python-Dev] buildbot vs. Windows
On 2/21/06, Neal Norwitz [EMAIL PROTECTED] wrote: I agree with this, but don't know a clean way to do 2 builds. I modified buildbot to: - Stop doing the second without deleting .py[co] run. - Do one run with a debug build. - Use -uall -r for both. I screwed it up, so now it does: - Do one run with a debug build. - Use -uall -r for both. - Still does the second deleting .py[co] run I couldn't think of a simple way to figure out that on most unixes the program is called python, but on Mac OS X, it's called python.exe. So I reverted back to using make testall. We can make a new test target to only run once. I also think I know how to do the double builds (one release and one debug). But it's too late for me to change it tonight without screwing it up. The good/bad news after this change is: http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/145/step-test/0 A seg fault on Mac OS when running with -r. :-( n
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote: Raymond Hettinger wrote: Like autodict could mean anything. Everything is meaningless until you know something about it. If you'd never seen Python before, would you know what 'dict' meant? If I were seeing defaultdict for the first time, I would need to look up the docs before I was confident I knew exactly what it did -- as I've mentioned before, my initial guess would have been wrong. The same procedure would lead me to an understanding of 'autodict' just as quickly. Maybe 'autodict' isn't the best term either -- I'm open to suggestions. But my instincts still tell me that 'defaultdict' is the best term for something *else* that we might want to add one day as well, so I'm just trying to make sure we don't squander it lightly. Given that the default entries behind the non-existent keys don't actually exist, something like virtual_dict might be appropriate. Or phantom_dict, or ghost_dict. I agree that the naming of things is important. regards Steve -- Steve Holden +44 150 684 7255 +1 800 494 3119 Holden Web LLC www.holdenweb.com PyCon TX 2006 www.python.org/pycon/
Re: [Python-Dev] defaultdict proposal round three
Steve Holden wrote: Given that the default entries behind the non-existent keys don't actually exist, something like virtual_dict might be appropriate. No, that would suggest to me something like a wrapper object that delegates most of the mapping protocol to something else. That's even less like what we're discussing. In our case the default values are only virtual until you use them, upon which they become real. Sort of like a wave function collapse... hmmm... I suppose 'heisendict' wouldn't fly, would it? -- Greg
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote: Fuzzyman wrote: I've had problems in code that needs to treat strings, lists and dictionaries differently (assigning values to a container where all three need different handling) and telling the difference but allowing duck typing is *problematic*. You need to rethink your design so that you don't have to make that kind of distinction. Well... to *briefly* explain the use case, it's for value assignment in ConfigObj. It basically accepts as valid values strings and lists of strings [#]_. You can also create new subsections by assigning a dictionary. It needs to be able to recognise lists in order to check each list member is a string. (See note below, it still needs to be able to recognise lists when writing, even if it is not doing type checking on assignment.) It needs to be able to recognise dictionaries in order to create a new section instance (rather than directly assigning the dictionary). This is *terribly* convenient for the user (trivial example of creating a new config file programmatically): from configobj import ConfigObj cfg = ConfigObj(newfilename) cfg['key'] = 'value' cfg['key2'] = ['value1', 'value2', 'value3'] cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']} cfg.write() Writes out: key = value key2 = value1, value2, value3 [section] key = value key2 = value1, value2, value3 (Note none of those values needed quoting, so they aren't.) Obviously I could force the creation of sections and the assignment of list values to use separate methods, but it's much less readable and unnecessary. The code as is works and has a nice API. It still needs to be able to tell what *type* of value is being assigned. Mapping and sequence protocols are so loosely defined that in order to support 'list like objects' and 'dictionary like objects' some arbitrary decision about what methods they should support has to be made. (For example a read only mapping container is unlikely to implement __setitem__ or methods like update). 
At first we defined a mapping object as one that defines __getitem__ and keys (not update as I previously said), and list like objects as ones that define __getitem__ and *not* keys. For strings we required a basestring subclass. In the end I think we ripped this out and just settled on isinstance tests. All the best, Michael Foord .. [#] Although it has two modes. In the 'default' mode you can assign any object as a value and a string representation is written out. A more strict mode checks values at the point you assign them - so errors will be raised at that point rather than propagating into the config file. When writing you still need to be able to recognise lists because each element is properly quoted.
Re: [Python-Dev] defaultdict proposal round three
Raymond Hettinger wrote: Like autodict could mean anything. fwiw, the first google hit for autodict appears to be part of someone's link farm At this website we have assistance with autodict. In addition to information for autodict we also have the best web sites concerning dictionary, non profit and new york. This makes autodict.com the most reliable guide for autodict on the Internet. and the second is a description of a self-initializing dictionary data type for Python. /F
Re: [Python-Dev] bytes.from_hex()
Greg == Greg Ewing [EMAIL PROTECTED] writes: Greg Stephen J. Turnbull wrote: What I advocate for Python is to require that the standard base64 codec be defined only on bytes, and always produce bytes. Greg I don't understand that. It seems quite clear to me that Greg base64 encoding (in the general sense of encoding, not the Greg unicode sense) takes binary data (bytes) and produces Greg characters. Base64 is a (family of) wire protocol(s). It's not clear to me that it makes sense to say that the alphabets used by baseNN encodings are composed of characters, but suppose we stipulate that. Greg So in Py3k the correct usage would be [bytes-unicode]. IMHO, as a wire protocol, base64 simply doesn't care what Python's internal representation of characters is. I don't see any case for correctness here, only for convenience, both for programmers on the job and students in the classroom. We can choose the character set that works best for us. I think that's 8-bit US ASCII. My belief is that bytes-bytes is going to be the dominant use case, although I don't use binary representation in XML. However, AFAIK for on the wire use UTF-8 is strongly recommended for XML, and in that case it's also efficient to use bytes-bytes for XML, since conversion of base64 bytes to UTF-8 characters is simply a matter of Simon says, be UTF-8! And in the classroom, you're just going to confuse students by telling them that UTF-8 --[Unicode codec]-- Python string is decoding but UTF-8 --[base64 codec]-- Python string is encoding, when MAL is telling them that -- Python string is always decoding. Sure, it all makes sense if you already know what's going on. But I have trouble remembering, especially in cases like UTF-8 vs UTF-16 where Perl and Python have opposite internal representations, and glibc has a third which isn't either. If base64 (and gzip, etc) are all considered bytes-bytes, there just isn't an issue any more. The simple rule wins: to Python string is always decoding. 
Why fight it when we can run away with efficiency gains? wink (In the above, Python string means the unicode type, not str.) -- School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp University of Tsukuba Tennodai 1-1-1 Tsukuba 305-8573 JAPAN Ask not how you can do free software business; ask what your business can do for free software.
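Stephen's "Simon says, be UTF-8!" point is easy to check directly: the base64 alphabet is a pure-ASCII subset, so re-encoding the character form as UTF-8 reproduces the byte form exactly. A minimal sketch in today's Python, where the stdlib b64encode is bytes-to-bytes:

```python
import base64

raw = bytes(range(256))            # arbitrary binary data
encoded = base64.b64encode(raw)    # bytes -> bytes drawn from the base64 alphabet
text = encoded.decode("ascii")     # viewing those bytes as characters

# The base64 alphabet is a subset of ASCII, so its UTF-8 encoding is the
# identity: character form and byte form coincide, byte for byte.
assert text.encode("utf-8") == encoded
```

This is why the bytes-to-bytes view costs nothing for UTF-8 output: the conversion is a no-op.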
Re: [Python-Dev] defaultdict proposal round three
Fuzzyman wrote: cfg = ConfigObj(newfilename) cfg['key'] = 'value' cfg['key2'] = ['value1', 'value2', 'value3'] cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']} If the main purpose is to support this kind of notational convenience, then I'd be inclined to require all the values used with this API to be concrete strings, lists or dicts. If you're going to make types part of the API, I think it's better to do so with a firm hand rather than being half-hearted and wishy-washy about it. Then, if it's really necessary to support a wider variety of types, provide an alternative API that separates the different cases and isn't type-dependent at all. If someone has a need for this API, using it isn't going to be much of an inconvenience, since he won't be able to write out constructors for his types using notation as compact as the above anyway. -- Greg
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
On 2/22/06, Greg Ewing [EMAIL PROTECTED] wrote: Mark Russell wrote: PEP 227 mentions using := as a rebinding operator, but rejects the idea as it would encourage the use of closures. Well, anything that facilitates rebinding in outer scopes is going to encourage the use of closures, so I can't see that as being a reason to reject a particular means of rebinding. You either think such rebinding is a good idea or not -- and that seems to be a matter of highly individual taste. At the time PEP 227 was written, nested scopes were contentious. (I recall one developer who said he'd be embarrassed to tell his co-workers he worked on Python if it had this feature :-). Rebinding was more contentious, so the feature was left out. I don't think any particular syntax or spelling for rebinding was favored more or less. On this particular idea, I tend to think it's too obscure as well. Python generally avoids attaching randomly-chosen semantics to punctuation, and I'd like to see it stay that way. I agree. Jeremy
Re: [Python-Dev] defaultdict proposal round three
Greg Ewing wrote: Fuzzyman wrote: cfg = ConfigObj(newfilename) cfg['key'] = 'value' cfg['key2'] = ['value1', 'value2', 'value3'] cfg['section'] = {'key': 'value', 'key2': ['value1', 'value2', 'value3']} If the main purpose is to support this kind of notational convenience, then I'd be inclined to require all the values used with this API to be concrete strings, lists or dicts. If you're going to make types part of the API, I think it's better to do so with a firm hand rather than being half-hearted and wishy-washy about it. [snip..] Thanks, that's the solution we settled on. We use ``isinstance`` tests to determine types. The user can always do something like: cfg['section'] = dict(dict_like_object) Which isn't so horrible. All the best, Michael
[Python-Dev] operator.is*Type
Hello all, Feel free to shoot this down, but a suggestion. The operator module defines two functions: isMappingType isSequenceType These return a guesstimation as to whether an object passed in supports the mapping and sequence protocols. These protocols are loosely defined. Any object which has a ``__getitem__`` method defined could support either protocol. Therefore: from operator import isSequenceType, isMappingType class anything(object): ... def __getitem__(self, index): ... pass ... something = anything() isMappingType(something) True isSequenceType(something) True I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. An object prima facie supports the mapping protocol if it defines a ``__getitem__`` method, and a ``keys`` method. An object prima facie supports the sequence protocol if it defines a ``__getitem__`` method, and *not* a ``keys`` method. As a result code which needs to be able to tell the difference can use these functions and can sensibly refer to the definition of the mapping and sequence protocols when documenting what sort of objects an API call can accept. All the best, Michael Foord
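The prima facie rules proposed above are simple to sketch. The helper names below are hypothetical (they are not part of the actual operator module); the point is only that, unlike isMappingType/isSequenceType, the two answers now disagree for a bare __getitem__ class:

```python
def prima_facie_mapping(obj):
    # proposed rule: supports mapping if it has __getitem__ AND keys
    return hasattr(obj, "__getitem__") and hasattr(obj, "keys")

def prima_facie_sequence(obj):
    # proposed rule: supports sequence if it has __getitem__ and NOT keys
    return hasattr(obj, "__getitem__") and not hasattr(obj, "keys")

class Anything(object):
    def __getitem__(self, index):
        pass

something = Anything()
assert prima_facie_sequence(something)       # classified as a sequence...
assert not prima_facie_mapping(something)    # ...and no longer also a mapping
```

Real dicts still test as mappings and real lists as sequences under these rules, so the convention only sharpens the user-defined-class case.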
[Python-Dev] defaultdict and on_missing()
I'm concerned that the on_missing() part of the proposal is gratuitous. The main use cases for defaultdict have a simple factory that supplies a zero, empty list, or empty set. The on_missing() hook is only there to support the rarer case of needing a key to compute a default value. The hook is not needed for the main use cases. As it stands, we're adding a method to regular dicts that cannot be usefully called directly. Essentially, it is a framework method meant to be overridden in a subclass. So, it only makes sense in the context of subclassing. In the meantime, we've added an oddball method to the main dict API, arguably the most important object API in Python. To use the hook, you write something like this: class D(dict): def on_missing(self, key): return somefunc(key) However, we can already do something like that without the hook: class D(dict): def __getitem__(self, key): try: return dict.__getitem__(self, key) except KeyError: self[key] = value = somefunc(key) return value The latter form is already possible, doesn't require modifying a basic API, and is arguably clearer about when it is called and what it does (the former doesn't explicitly show that the returned value gets saved in the dictionary). Since we can already do the latter form, we can get some insight into whether the need has ever actually arisen in real code. I scanned the usual sources (my own code, the standard library, and my most commonly used third-party libraries) and found no instances of code like that. The closest approximation was safe_substitute() in string.Template where missing keys returned themselves as a default value. Other than that, I conclude that there isn't sufficient need to warrant adding a funky method to the API for regular dicts. I wondered why the safe_substitute() example was unique. I think the answer is that we normally handle default computations through simple in-line code ("if k in d: do1() else do2()" or a try/except pair). 
Overriding on_missing() then is really only useful when you need to create a type that can be passed to a client function that was expecting a regular dictionary. So it does come up, but not much. Aside: Why on_missing() is an oddball among dict methods. When teaching dicts to beginners, all the methods are easily explainable except this one. You don't call this method directly, you only use it when subclassing, you have to override it to do anything useful, it hooks KeyError but only when raised by __getitem__ and not other methods, etc. I'm concerned that even having this method in the regular dict API will create confusion about when to use dict.get(), when to use dict.setdefault(), when to catch a KeyError, or when to LBYL. Adding this one extra choice makes the choice more difficult. My recommendation: Dump the on_missing() hook. That leaves the dict API unmolested and allows a more straightforward implementation/explanation of collections.default_dict or whatever it ends up being named. The result is delightfully simple and easy to understand/explain. Raymond
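For reference, the subclassing alternative Raymond quotes runs as-is once somefunc is made concrete (the somefunc below is a hypothetical stand-in, chosen only so the sketch is self-contained):

```python
def somefunc(key):
    # hypothetical default-computing function: derive a value from the key
    return len(key)

class D(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            # compute the default from the key and store it back
            self[key] = value = somefunc(key)
            return value

d = D()
assert d["spam"] == 4     # default computed from the missing key...
assert "spam" in d        # ...and, visibly, saved into the dictionary
```

As the message notes, this form makes the store-back explicit, which the on_missing() spelling hides.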
Re: [Python-Dev] bytes.from_hex()
Stephen J. Turnbull wrote: Base64 is a (family of) wire protocol(s). It's not clear to me that it makes sense to say that the alphabets used by baseNN encodings are composed of characters, Take a look at http://en.wikipedia.org/wiki/Base64 where it says ...base64 is a binary to text encoding scheme whereby an arbitrary sequence of bytes is converted to a sequence of printable ASCII characters. Also see RFC 2045 (http://www.ietf.org/rfc/rfc2045.txt) which defines base64 in terms of an encoding from octets to characters, and also says A 65-character subset of US-ASCII is used ... This subset has the important property that it is represented identically in all versions of ISO 646 ... and all characters in the subset are also represented identically in all versions of EBCDIC. Which seems to make it perfectly clear that the result of the encoding is to be considered as characters, which are not necessarily going to be encoded using ascii. So base64 on its own is *not* a wire protocol. Only after encoding the characters do you have a wire protocol. I don't see any case for correctness here, only for convenience, I'm thinking of convenience, too. Keep in mind that in Py3k, 'unicode' will be called 'str' (or something equally neutral like 'text') and you will rarely have to deal explicitly with unicode codings, this being done mostly for you by the I/O objects. So most of the time, using base64 will be just as convenient as it is today: base64_encode(my_bytes) and write the result out somewhere. The reason I say it's *correct* is that if you go straight from bytes to bytes, you're *assuming* the eventual encoding is going to be an ascii superset. The programmer is going to have to know about this assumption and understand all its consequences and decide whether it's right, and if not, do something to change it. Whereas if the result is text, the right thing happens automatically whatever the ultimate encoding turns out to be. 
You can take the text from your base64 encoding, combine it with other text from any other source to form a complete mail message or xml document or whatever, and write it out through a file object that's using any unicode encoding at all, and the result will be correct. it's also efficient to use bytes-bytes for XML, since conversion of base64 bytes to UTF-8 characters is simply a matter of Simon says, be UTF-8! Efficiency is an implementation concern. In Py3k, strings which contain only ascii or latin-1 might be stored as 1 byte per character, in which case this would not be an issue. And in the classroom, you're just going to confuse students by telling them that UTF-8 --[Unicode codec]-- Python string is decoding but UTF-8 --[base64 codec]-- Python string is encoding, when MAL is telling them that -- Python string is always decoding. Which is why I think that only *unicode* codings should be available through the .encode and .decode interface. Or alternatively there should be something more explicit like .unicode_encode and .unicode_decode that is thus restricted. Also, if most unicode coding is done in the I/O objects, there will be far less need for programmers to do explicit unicode coding in the first place, so likely it will become more of an advanced topic, rather than something you need to come to grips with on day one of using unicode, like it is now. -- Greg
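Greg's bytes-to-text usage pattern — encode the payload, treat the result as text, mix it with other text, and let the eventual I/O encoding be chosen independently — can be sketched with ordinary stdlib calls (nothing here is a proposed API):

```python
import base64

payload = b"\x00\x01\xfe\xff"                       # arbitrary binary data
text = base64.b64encode(payload).decode("ascii")    # bytes in, characters out

# The character form combines freely with other text from any source;
# the whole document can later be written through any Unicode encoding.
doc = "Content-Transfer-Encoding: base64\n\n" + text + "\n"

# Round trip: decoding the text recovers the original bytes exactly.
assert base64.b64decode(text) == payload
```

The point of contention in the thread is only whether that intermediate .decode("ascii") step should exist at all, or whether base64 output should stay bytes.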
Re: [Python-Dev] defaultdict and on_missing()
Raymond Hettinger wrote: Aside: Why on_missing() is an oddball among dict methods. When teaching dicts to beginners, all the methods are easily explainable except this one. You don't call this method directly, you only use it when subclassing, you have to override it to do anything useful, it hooks KeyError but only when raised by __getitem__ and not other methods, etc. agreed. My recommendation: Dump the on_missing() hook. That leaves the dict API unmolested and allows a more straightforward implementation/explanation of collections.default_dict or whatever it ends up being named. The result is delightfully simple and easy to understand/explain. agreed. a separate type in collections, a template object (or factory) passed to the constructor, and implementation inheritance, is more than good enough. and if I recall correctly, pretty much what Guido first proposed. I trust his intuition a lot more than I trust the design-by-committee-without-use-cases process. /F
Re: [Python-Dev] defaultdict and on_missing()
Raymond Hettinger wrote: I'm concerned that the on_missing() part of the proposal is gratuitous. I second all that. A clear case of YAGNI. -- Greg
Re: [Python-Dev] operator.is*Type
from operator import isSequenceType, isMappingType class anything(object): ... def __getitem__(self, index): ... pass ... something = anything() isMappingType(something) True isSequenceType(something) True I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated. Your example simply highlights the consequences of one of Python's most basic, original design choices (using getitem for both sequences and mappings). That choice is now so fundamental to the language that it cannot possibly change. Get used to it. In your example, the results are correct. The anything class can be viewed as either a sequence or a mapping. In this and other posts, you seem to be focusing your design around notions of strong typing and mandatory interfaces. I would suggest that that approach is futile unless you control all of the code being run. Raymond
Re: [Python-Dev] operator.is*Type
Fuzzyman wrote: Hello all, Feel free to shoot this down, but a suggestion. The operator module defines two functions: isMappingType isSequenceType These return a guesstimation as to whether an object passed in supports the mapping and sequence protocols. These protocols are loosely defined. Any object which has a ``__getitem__`` method defined could support either protocol. The docs contain clear warnings about that. I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. I have no problems deprecating them since I've never used one of these functions. If I want to know if something is a string I use isinstance(), for string-like objects I would use try: obj + '' except TypeError: and so on. An object prima facie supports the mapping protocol if it defines a ``__getitem__`` method, and a ``keys`` method. An object prima facie supports the sequence protocol if it defines a ``__getitem__`` method, and *not* a ``keys`` method. As a result code which needs to be able to tell the difference can use these functions and can sensibly refer to the definition of the mapping and sequence protocols when documenting what sort of objects an API call can accept. Thomas
Re: [Python-Dev] operator.is*Type
Raymond Hettinger wrote: from operator import isSequenceType, isMappingType class anything(object): ... def __getitem__(self, index): ... pass ... something = anything() isMappingType(something) True isSequenceType(something) True I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated. But as far as I can tell (and I may be wrong), they only work if the object is a subclass of a built in type, otherwise they're broken. So you'd have to do a type check as well, unless you document that an API call *only* works with a builtin type or subclass. In which case - an isinstance call does the same, with the advantage of not being broken if the object is a user-defined class. At the very least the function would be better renamed ``MightBeMappingType`` ;-) Your example simply highlights the consequences of one of Python's most basic, original design choices (using getitem for both sequences and mappings). That choice is now so fundamental to the language that it cannot possibly change. Get used to it. I have no problem with it - it's useful. In your example, the results are correct. The anything class can be viewed as either a sequence or a mapping. But in practice an object is *unlikely* to be both. (Although conceivably a mapping container *could* implement integer indexing and thus be both - but *very* rare.) Therefore the current behaviour is not really useful in any conceivable situation - not that I can think of anyway. In this and other posts, you seem to be focusing your design around notions of strong typing and mandatory interfaces. I would suggest that that approach is futile unless you control all of the code being run. Not directly. 
I'm suggesting that the loosely defined protocol (used with duck typing) can be made quite a bit more useful by making the definition *slightly* more specific. A preference for strong typing would require subclassing, surely? The approach I suggest would allow a *less* 'strongly typed' approach to code, because it establishes a convention to decide whether a user defined class supports the mapping and sequence protocols. The simple alternative (which we took in ConfigObj) is to require a 'strongly typed' interface, because there is currently no useful way to determine whether an object that implements __getitem__ supports mapping or sequence. (Other than *assuming* that a mapping container implements a random choice from the other common mapping methods.) All the best, Michael
Re: [Python-Dev] buildbot vs. Windows
Martin v. Löwis [EMAIL PROTECTED] writes: Tim Peters wrote: Speaking of which, a number of test failures over the past few weeks were provoked here only under -r (run tests in random order) or under a debug build, and didn't look like those were specific to Windows. Adding -r to the buildbot test recipe is a decent idea. Getting _some_ debug-build test runs would also be good (or do we do that already?). So what is your recipe: Add -r to all buildbots? Only to those which have an 'a' in their name? Only to every third build? Duplicating the number of builders? Same question for --with-pydebug. Combining this with -r would multiply the number of builders by 4 already. Instead of running release and debug builds, why not just run debug builds? They catch more problems, earlier. Cheers, mwh -- This song is for anyone ... fuck it. Shut up and listen. -- Eminem, The Way I Am
Re: [Python-Dev] defaultdict proposal round three
[Alex] I'd love to remove setdefault in 3.0 -- but I don't think it can be done before that: default_factory won't cover the occasional use cases where setdefault is called with different defaults at different locations, and, rare as those cases may be, any 2.* should not break any existing code that uses that approach. I'm not too concerned about this one. Whenever setdefault gets deprecated, then ALL code that used it would have to be changed. If there were cases with different defaults, a regular try/except would do the job just fine (heck, it might even be faster because there won't be a wasted instantiation in the cases where the key already exists). There may be other reasons to delay removing setdefault(), but the multiple-default use case isn't one of them. An alternative is to have two possible attributes: d.default_factory = list or d.default_value = 0 with an exception being raised when both are defined (the test is done when the attribute is created, not when the lookup is performed). I see default_value as a way to get exactly the same beginner's error we already have with function defaults: That makes sense. I'm somewhat happy with the patch as it stands now. The only part that needs serious rethinking is putting on_missing() in regular dicts. See my other email on that subject. Raymond
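Raymond's point about per-call defaults — a plain try/except only builds the default when the key is actually missing, and each call site can choose its own — can be sketched as (get_or_make is a hypothetical helper name):

```python
def get_or_make(d, key, factory):
    # only invoke the (per-call-site) factory when the key is missing,
    # so no default object is wasted on keys that already exist
    try:
        return d[key]
    except KeyError:
        value = d[key] = factory()
        return value

d = {}
assert get_or_make(d, "a", list) == []   # one call site defaults to a list
assert get_or_make(d, "b", dict) == {}   # another defaults to a dict
assert get_or_make(d, "a", dict) == []   # key exists: factory never runs
```

A single default_factory on the dict cannot express this, which is the use case Alex flags.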
Re: [Python-Dev] Copying zlib compression objects
On 2/17/06, Guido van Rossum [EMAIL PROTECTED] wrote: Please submit your patch to SourceForge. I've submitted the zlib patch as patch #1435422. I added some test cases to test_zlib.py and documented the new methods. I'd like to test my gzip / tarfile changes more before creating a patch for it, but I'm interested in any feedback about the idea of adding snapshot() / restore() methods to the GzipFile and TarFile classes. It doesn't look like the underlying bz2 library supports copying compression / decompression streams, so for now it's impossible to make corresponding changes to the bz2 module. I also noticed that the tarfile module reimplements the gzip file format when dealing with streams. Would it make sense to refactor some of the gzip.py code to expose the methods that read/write the gzip file header, and have the tarfile module use those methods? Cheers, Chris ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
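For what it's worth, the zlib side of this work eventually shipped (in Python 2.5) as a copy() method on compression objects rather than a snapshot()/restore() pair; a sketch of the same idea in today's spelling:

```python
import zlib

c = zlib.compressobj()
head = c.compress(b'spam ' * 100)

fork = c.copy()                       # snapshot of the stream state
full = head + c.compress(b'eggs ' * 100) + c.flush()
short = head + fork.flush()           # the forked stream ends here
```

Both outputs are complete, independently decompressible streams that share a common compressed prefix.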
Re: [Python-Dev] defaultdict proposal round three
On Feb 22, 2006, at 7:21 AM, Raymond Hettinger wrote: ... I'm somewhat happy with the patch as it stands now. The only part that needs serious rethinking is putting on_missing() in regular dicts. See my other email on that subject. What if we named it _on_missing? Hook methods intended only to be overridden in subclasses are sometimes spelled that way, and it removes the need to teach about it to beginners -- it looks private so we don't explain it at that point. My favorite example is Queue.Queue: I teach it (and in fact evangelize for it as the one sane way to do threading;-) in Python 101, *without* ever mentioning _get, _put etc -- THOSE I teach in Patterns with Python as the very best example of the Gof4's classic Template Method design pattern. If dict had _on_missing I'd have another wonderful example to teach from! (I believe the Library Reference avoids teaching about _get, _put etc, too, though I haven't checked it for a while). TM is my favorite DP, so I'm biased in favor of Guido's design, and I think that by giving the hook method (not meant to be called, only overridden) a private name we're meeting enough of your and /F's concerns to let _on_missing remain. Its existence does simplify the implementation of defaultdict (and some other dict subclasses), and if the implementation is easy to explain, it may be a good idea, after all;-) Alex ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict and on_missing()
On 2/22/06, Raymond Hettinger [EMAIL PROTECTED] wrote: I'm concerned that the on_missing() part of the proposal is gratuitous. The main use cases for defaultdict have a simple factory that supplies a zero, empty list, or empty set. The on_missing() hook is only there to support the rarer case of needing a key to compute a default value. The hook is not needed for the main use cases. The on_missing() hook is there to take the action of inserting the default value into the dict. For this it needs the key. It seems attractive to collapse default_factory and on_missing into a single attribute (my first attempt did this, and I was halfway through posting about it before I realized the mistake). But on_missing() really needs the key, and at the same time you don't want to lose the convenience of being able to specify set, list, int etc. as default factories, so default_factory() must be called without the key. If you don't have on_missing, then the functionality of inserting the key produced by default_factory would have to be in-lined in __getitem__, which means the machinery put in place can't be reused for other use cases -- several people have claimed to have a use case for returning a value *without* inserting it into the dict. As it stands, we're adding a method to regular dicts that cannot be usefully called directly. Essentially, it is a framework method meant to be overridden in a subclass. So, it only makes sense in the context of subclassing. In the meantime, we've added an oddball method to the main dict API, arguably the most important object API in Python. Which to me actually means it's a *good* place to put the hook functionality, since it allows for maximum reuse. 
To use the hook, you write something like this:

    class D(dict):
        def on_missing(self, key):
            return somefunc(key)

Or, more likely,

    def on_missing(self, key):
        self[key] = value = somefunc(key)
        return value

However, we can already do something like that without the hook:

    class D(dict):
        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                self[key] = value = somefunc(key)
                return value

The latter form is already possible, doesn't require modifying a basic API, and is arguably clearer about when it is called and what it does (the former doesn't explicitly show that the returned value gets saved in the dictionary). This is exactly what Google's internal DefaultDict does. But it is also its downfall, because now *all* __getitem__ calls are weighed down by going through Python code; in a particular case that came up at Google I had to recommend against using it for performance reasons. Since we can already do the latter form, we can get some insight into whether the need has ever actually arisen in real code. I scanned the usual sources (my own code, the standard library, and my most commonly used third-party libraries) and found no instances of code like that. The closest approximation was safe_substitute() in string.Template where missing keys returned themselves as a default value. Other than that, I conclude that there isn't sufficient need to warrant adding a funky method to the API for regular dicts. In this case I don't believe that the absence of real-life examples says much (and BTW Google's DefaultDict *is* such a real-life example; it is used in other code). There is not much incentive for subclassing dict and overriding __getitem__ if the alternative is that in a few places you have to write two lines of code instead of one:

    if key not in d:
        d[key] = set()    # this line would be unneeded
    d[key].add(value)

I wondered why the safe_substitute() example was unique. 
I think the answer is that we normally handle default computations through simple in-line code (if k in d: do1() else: do2(), or a try/except pair). Overriding on_missing() then is really only useful when you need to create a type that can be passed to a client function that was expecting a regular dictionary. So it does come up, but not much. I think the pattern hasn't been commonly known; people have been struggling with setdefault() all these years. Aside: Why on_missing() is an oddball among dict methods. When teaching dicts to beginners, all the methods are easily explainable except this one. You don't seriously teach beginners all dict methods, do you? setdefault(), update(), copy() are all advanced material, and so are iteritems(), itervalues() and iterkeys() (*especially* the last since it's redundant through for i in d:). You don't call this method directly, you only use it when subclassing, you have to override it to do anything useful, it hooks KeyError but only when raised by __getitem__ and not other methods, etc. The only other methods that raise KeyError are __delitem__, pop() and popitem(). I don't see how these could use the same hook as
Re: [Python-Dev] bytes.from_hex()
On Feb 22, 2006, at 6:35 AM, Greg Ewing wrote: I'm thinking of convenience, too. Keep in mind that in Py3k, 'unicode' will be called 'str' (or something equally neutral like 'text') and you will rarely have to deal explicitly with unicode codings, this being done mostly for you by the I/O objects. So most of the time, using base64 will be just as convenient as it is today: base64_encode(my_bytes) and write the result out somewhere. The reason I say it's *correct* is that if you go straight from bytes to bytes, you're *assuming* the eventual encoding is going to be an ascii superset. The programmer is going to have to know about this assumption and understand all its consequences and decide whether it's right, and if not, do something to change it. Whereas if the result is text, the right thing happens automatically whatever the ultimate encoding turns out to be. You can take the text from your base64 encoding, combine it with other text from any other source to form a complete mail message or xml document or whatever, and write it out through a file object that's using any unicode encoding at all, and the result will be correct. This makes little sense for mail. You combine *bytes*, in various and possibly different encodings to form a mail message. Some MIME sections might have a base64 Content-Transfer-Encoding, others might be 8bit encoded, others might be 7bit encoded, others might be quoted-printable encoded. Before the C-T-E encoding, you will have had to do the Content-Type encoding, converting your text into bytes with the desired character encoding: utf-8, iso-8859-1, etc. Having the final mail message be made up of characters, right before transmission to the socket would be crazy. James ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
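As it turned out, Python 3's base64 module sided with the bytes-to-bytes view James argues for: b64encode() takes bytes and returns bytes, and producing text is a separate, explicit ASCII decode. A small sketch:

```python
import base64

payload = 'héllo'.encode('utf-8')    # Content-Type step: text -> bytes
encoded = base64.b64encode(payload)  # C-T-E step: bytes -> bytes
as_text = encoded.decode('ascii')    # only if text is actually wanted
```

The two encoding layers stay distinct, just as in the mail pipeline described above.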
Re: [Python-Dev] defaultdict and on_missing()
[Guido van Rossum] If we removed on_missing() from dict, we'd have to override __getitem__ in defaultdict (regardless of whether we give defaultdict an on_missing() hook or in-line it). You have another option. Keep your current modifications to dict.__getitem__ but do not include dict.on_missing(). Let it only be called in a subclass IF it is defined; otherwise, raise KeyError. That keeps me happy since the basic dict API won't show on_missing(), but it still allows a user to attach an on_missing method to a dict subclass when or if needed. I think all your test cases would still pass without modification. This approach is not much different than for other magic methods which kick in if defined or revert to a default behavior if not. My core concern is to keep the dict API clean as a whistle. Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] operator.is*Type
On Feb 22, 2006, at 4:18 AM, Fuzzyman wrote: Raymond Hettinger wrote: from operator import isSequenceType, isMappingType class anything(object): ... def __getitem__(self, index): ... pass ... something = anything() isMappingType(something) True isSequenceType(something) True I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated. But as far as I can tell (and I may be wrong), they only work if the object is a subclass of a built in type, otherwise they're broken. So you'd have to do a type check as well, unless you document that an API call *only* works with a builtin type or subclass. If you really cared, you could check hasattr(something, 'get') and hasattr(something, '__getitem__'), which is a pretty good indicator that it's a mapping and not a sequence (in a dict-like sense, anyway). -bob ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
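A sketch of the hasattr-based test Bob suggests (the helper names here are invented; note that Python 3 ultimately dropped operator.isMappingType()/isSequenceType() entirely in favor of abstract base classes):

```python
def looks_like_mapping(obj):
    # Mappings grow dict-ish methods such as keys() alongside __getitem__.
    return hasattr(obj, '__getitem__') and hasattr(obj, 'keys')

def looks_like_sequence(obj):
    # Sequences index by position and have no keys() method.
    return hasattr(obj, '__getitem__') and not hasattr(obj, 'keys')
```

As with any duck-typing test, this only distinguishes classes that actually advertise the mapping API; a class defining only __getitem__ is still ambiguous, which is Raymond's point.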
Re: [Python-Dev] operator.is*Type
Raymond Hettinger wrote: from operator import isSequenceType, isMappingType class anything(object): ... def __getitem__(self, index): ... pass ... something = anything() isMappingType(something) True isSequenceType(something) True I suggest we either deprecate these functions as worthless, *or* we define the protocols slightly more clearly for user defined classes. They are not worthless. They do a damned good job of differentiating anything that CAN be differentiated. But they are just identical...? They seem terribly pointless to me. Deprecation is one option, of course. I think Michael's suggestion also makes sense. *If* we distinguish between sequences and mapping types with two functions, *then* those two functions should be distinct. It seems kind of obvious, doesn't it? I think hasattr(obj, 'keys') is the simplest distinction of the two kinds of collections. Your example simply highlights the consequences of one of Python's most basic, original design choices (using getitem for both sequences and mappings). That choice is now so fundamental to the language that it cannot possibly change. Get used to it. In your example, the results are correct. The anything class can be viewed as either a sequence or a mapping. In this and other posts, you seem to be focusing your design around notions of strong typing and mandatory interfaces. I would suggest that that approach is futile unless you control all of the code being run. I think you are reading too much into it. If the functions exist, they should be useful. That's all I see in Michael's suggestion. -- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict and on_missing()
On 2/22/06, Raymond Hettinger [EMAIL PROTECTED] wrote: [Guido van Rossum] If we removed on_missing() from dict, we'd have to override __getitem__ in defaultdict (regardless of whether we give defaultdict an on_missing() hook or in-line it). You have another option. Keep your current modifications to dict.__getitem__ but do not include dict.on_missing(). Let it only be called in a subclass IF it is defined; otherwise, raise KeyError. OK. I don't have time right now for another round of patches -- if you do, please go ahead. The dict docs in my latest patch must be updated somewhat (since they document on_missing()). That keeps me happy since the basic dict API won't show on_missing(), but it still allows a user to attach an on_missing method to a dict subclass when or if needed. I think all your test cases would still pass without modification. Except the ones that explicitly test for dict.on_missing()'s presence and behavior. :-) This approach is not much different than for other magic methods which kick in if defined or revert to a default behavior if not. Right. Plenty of precedent there. My core concern is to keep the dict API clean as a whistle. Understood. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
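The design agreed on here -- a hook consulted only if a subclass defines it, with plain dicts raising KeyError as before -- is essentially what shipped in Python 2.5, under the name __missing__ rather than on_missing(). A minimal sketch:

```python
class AutoList(dict):
    def __missing__(self, key):   # called by dict.__getitem__ on a miss
        self[key] = value = []    # insert and return the default
        return value

d = AutoList()
d['x'].append(1)                  # no KeyError; the hook fills in []

plain = {}
try:
    plain['x']
    raised = False
except KeyError:
    raised = True                 # no hook defined: KeyError as always
```

The hook stays off the plain-dict API, exactly as Raymond asked.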
Re: [Python-Dev] Path PEP: some comments (equality)
On 2/20/06, Mark Mc Mahon [EMAIL PROTECTED] wrote: It seems that the Path module as currently defined leaves equality testing up to the underlying string comparison. My guess is that this is fine for Unix (maybe not even) but it is a bit lacking for Windows. Should the path class implement an __eq__ method that might do some of the following things:

- Get the absolute path of both self and the other path
- normcase both
- now see if they are equal

This has been suggested to me many times. Unfortunately, since Path is a subclass of string, this breaks stuff in weird ways. For example: 'x.py' == path('x.py') == path('X.PY') == 'X.PY', but 'x.py' != 'X.PY'. And hashing needs to be consistent with __eq__: hash('x.py') == hash(path('X.PY')) == hash('X.PY') ??? Granted these problems would only pop up in code where people are mixing Path and string objects. But they would cause really obscure bugs in practice, very difficult for a non-expert to figure out and fix. It's safer for Paths to behave just like strings. -j ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
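The obscure-bug warning above is easy to reproduce with a toy str subclass (the class name and the lowercasing, standing in for a Windows-style normcase comparison, are invented for this sketch):

```python
class CasePath(str):
    # Case-insensitive equality, as a Windows Path.__eq__ might do.
    # Hashing is deliberately left as plain str hashing, so it
    # disagrees with __eq__ -- which is exactly the problem.
    def __eq__(self, other):
        return str(self).lower() == str(other).lower()
    __hash__ = str.__hash__

equal = CasePath('X.PY') == 'x.py'         # True: the paths "match"
found = 'x.py' in {CasePath('X.PY'): 1}    # dict lookup silently misses
```

The two strings compare equal but hash differently, so dict and set membership quietly breaks -- the kind of bug a non-expert would struggle to track down.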
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
At 06:14 AM 2/22/2006 -0500, Jeremy Hylton wrote: On 2/22/06, Greg Ewing [EMAIL PROTECTED] wrote: Mark Russell wrote: PEP 227 mentions using := as a rebinding operator, but rejects the idea as it would encourage the use of closures. Well, anything that facilitates rebinding in outer scopes is going to encourage the use of closures, so I can't see that as being a reason to reject a particular means of rebinding. You either think such rebinding is a good idea or not -- and that seems to be a matter of highly individual taste. At the time PEP 227 was written, nested scopes were contentious. (I recall one developer who said he'd be embarassed to tell his co-workers he worked on Python if it had this feature :-). Was this because of the implicit inheritance of variables from the enclosing scope? Rebinding was more contentious, so the feature was left out. I don't think any particular syntax or spelling for rebinding was favored more or less. On this particular idea, I tend to think it's too obscure as well. Python generally avoids attaching randomly-chosen semantics to punctuation, and I'd like to see it stay that way. I agree. Note that '.' for relative naming already exists (attribute access), and Python 2.5 is already introducing the use of a leading '.' (with no name before it) to mean parent of the current namespace. So, using that approach to reference variables in outer scopes wouldn't be without precedents. IOW, I propose no new syntax for rebinding, but instead making variables' context explicit. This would also fix the issue where right now you have to inspect a function and its context to find out whether there's a closure and what's in it. The leading dots will be quite visible. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Almann T. Goo [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] IMO, Having properly nested scopes in Python in a sense made having closures a natural idiom to the language and part of its user interface. By not allowing the name re-binding it almost seems like that user interface has a rough edge that is almost too easy to get cut on. I can see now how it would look that way to someone who has experience with fully functional nested scopes in other languages and who learns Python after no-write nested scoping was added. What is not mentioned in the ref manual and what I suppose may not be obvious even reading the PEP is that Python added nesting to solve two particular problems. First was the inability to write nested recursive functions without the hack of stuffing its name in the global namespace (or of patching the byte code). Second was the need to misuse the default arg mechanism in nested functions. What we have now pretty well fixes both. Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bytes.from_hex()
Greg Ewing [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Efficiency is an implementation concern. It is also a user concern, especially if inefficiency overruns memory limits. In Py3k, strings which contain only ascii or latin-1 might be stored as 1 byte per character, in which case this would not be an issue. If 'might' becomes 'will', I and I suspect others will be happier with the change. And I would be happy if the choice of physical storage was pretty much handled behind the scenes, as with the direction int/long unification is going. Which is why I think that only *unicode* codings should be available through the .encode and .decode interface. Or alternatively there should be something more explicit like .unicode_encode and .unicode_decode that is thus restricted. I prefer the shorter names and using recode, for instance, for bytes to bytes. Terry Jan Reedy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
On 2/21/06, Phillip J. Eby [EMAIL PROTECTED] wrote: Here's a crazy idea, that AFAIK has not been suggested before and could work for both globals and closures: using a leading dot, ala the new relative import feature. e.g.: def incrementer(val): def inc(): .val += 1 return .val return inc The '.' would mean this name, but in the nearest outer scope that defines it. Note that this could include the global scope, so the 'global' keyword could go away in 2.5. And in Python 3.0, the '.' could become *required* for use in closures, so that it's not necessary for the reader to check a function's outer scope to see whether closure is taking place. EIBTI. FWIW, I think this is nice. Since it uses the same dot-notation that normal attribute access uses, it's clearly accessing the attribute of *some* namespace. It's not perfectly intuitive that the accessed namespace is the enclosing one, but I do think it's at least more intuitive than the suggested := operator, and at least as intuitive as a ``global``-like declaration. And, as you mention, it's consistent with the relative import feature. I'm a little worried that this proposal will get lost amid the mass of other suggestions being thrown out right now. Any chance of turning this into a PEP? Steve -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
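Python eventually solved this rebinding problem in 3.0 with the nonlocal statement (PEP 3104) rather than a leading dot; Phillip's incrementer example in that spelling:

```python
def incrementer(val):
    def inc():
        nonlocal val   # rebind val in the nearest enclosing scope
        val += 1
        return val
    return inc

bump = incrementer(10)
results = [bump(), bump()]
```

The enclosing-scope write is declared explicitly, much as the leading-dot proposal intended, though with a keyword instead of punctuation.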
Re: [Python-Dev] operator.is*Type
[Ian Bicking] They seem terribly pointless to me. FWIW, here is the script that I had used while updating and improving the two functions (can't remember whether it was for Py2.3 or Py2.4). It lists comparative results for many different types of inputs. Since perfection was not possible, the goal was to have no false negatives and mostly accurate positives. IMO, they do a pretty good job and are able to access information not otherwise visible to pure Python code. With respect to user defined instances, I don't care that they can't draw a distinction where none exists in the first place -- at some point you have to either fall back on duck-typing or be in control of what kind of arguments you submit to your functions. Practicality beats purity -- especially when a pure solution doesn't exist (i.e. given a user defined class that defines just __getitem__, both mapping or sequence behavior is a possibility).

Analysis Script

    from collections import deque
    from UserList import UserList
    from UserDict import UserDict
    from operator import *

    types = (set, int, float, complex, long, bool, str, unicode,
             list, UserList, tuple, deque, )
    for t in types:
        print isMappingType(t()), isSequenceType(t()), repr(t()), repr(t)

    class c:
        def __repr__(self):
            return 'Instance w/o getitem'

    class cn(object):
        def __repr__(self):
            return 'NewStyle Instance w/o getitem'

    class cg:
        def __repr__(self):
            return 'Instance w getitem'
        def __getitem__(self):
            return 10

    class cng(object):
        def __repr__(self):
            return 'NewStyle Instance w getitem'
        def __getitem__(self):
            return 10

    def f():
        return 1

    def g():
        yield 1

    for i in (None, NotImplemented, g(), c(), cn()):
        print isMappingType(i), isSequenceType(i), repr(i), type(i)
    for i in (cg(), cng(), dict(), UserDict()):
        print isMappingType(i), isSequenceType(i), repr(i), type(i)

Output

    False False set([]) <type 'set'>
    False False 0 <type 'int'>
    False False 0.0 <type 'float'>
    False False 0j <type 'complex'>
    False False 0L <type 'long'>
    False False False <type 'bool'>
    False True '' <type 'str'>
    False True u'' <type 'unicode'>
    False True [] <type 'list'>
    True True [] <class UserList.UserList at 0x00F11B70>
    False True () <type 'tuple'>
    False True deque([]) <type 'collections.deque'>
    False False None <type 'NoneType'>
    False False NotImplemented <type 'NotImplementedType'>
    False False <generator object at 0x00F230A8> <type 'generator'>
    False False Instance w/o getitem <type 'instance'>
    False False NewStyle Instance w/o getitem <class '__main__.cn'>
    True True Instance w getitem <type 'instance'>
    True True NewStyle Instance w getitem <class '__main__.cng'>
    True False {} <type 'dict'>
    True True {} <type 'instance'>

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict and on_missing()
Guido van Rossum wrote: I think the pattern hasn't been commonly known; people have been struggling with setdefault() all these years. I use setdefault _only_ to speed up the following code pattern:

    if akey not in somedict:
        somedict[akey] = list()
    somedict[akey].append(avalue)

These lines of simple Python are much easier to read and write than

    somedict.setdefault(akey, list()).append(avalue)

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
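The defaultdict under discussion makes both spellings unnecessary for this pattern; a quick sketch with the collections.defaultdict that ultimately shipped in Python 2.5 (the sample data is invented):

```python
from collections import defaultdict

somedict = defaultdict(list)          # default_factory = list
for akey, avalue in [('a', 1), ('a', 2), ('b', 3)]:
    somedict[akey].append(avalue)     # no explicit check, no wasted list()
```

The membership test disappears, and unlike setdefault() no throwaway list() is built when the key already exists.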
Re: [Python-Dev] bytes.from_hex()
Terry Reedy wrote: Greg Ewing [EMAIL PROTECTED] wrote in message Which is why I think that only *unicode* codings should be available through the .encode and .decode interface. Or alternatively there should be something more explicit like .unicode_encode and .unicode_decode that is thus restricted. I prefer the shorter names and using recode, for instance, for bytes to bytes. While I prefer constructors with an explicit encode argument, and use a recode() method for 'like to like' coding. Then the whole encode/decode confusion goes away. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] operator.is*Type
Raymond Hettinger wrote: [Ian Bicking] They seem terribly pointless to me. FWIW, here is the script that I had used while updating and improving the two functions (can't remember whether it was for Py2.3 or Py2.4). It lists comparative results for many different types of inputs. Since perfection was not possible, the goal was to have no false negatives and mostly accurate positives. IMO, they do a pretty good job and are able to access information not otherwise visible to pure Python code. With respect to user defined instances, I don't care that they can't draw a distinction where none exists in the first place -- at some point you have to either fall back on duck-typing or be in control of what kind of arguments you submit to your functions. Practicality beats purity -- especially when a pure solution doesn't exist (i.e. given a user defined class that defines just __getitem__, both mapping or sequence behavior is a possibility). But given:

    True True Instance w getitem <type 'instance'>
    True True NewStyle Instance w getitem <class '__main__.cng'>
    True True [] <class UserList.UserList at 0x00F11B70>
    True True {} <type 'instance'>

(Last one is UserDict.) I can't conceive of circumstances where this is useful without duck typing *as well*. The tests seem roughly analogous to:

    def isMappingType(obj):
        return isinstance(obj, dict) or hasattr(obj, '__getitem__')

    def isSequenceType(obj):
        return isinstance(obj, (basestring, list, tuple, collections.deque)) or \
               hasattr(obj, '__getitem__')

If you want to allow sequence access you could either just use the isinstance or you *have* to trap an exception in the case of a mapping object being passed in. 
Redefining (effectively) as:

    def isMappingType(obj):
        return isinstance(obj, dict) or \
               (hasattr(obj, '__getitem__') and hasattr(obj, 'keys'))

    def isSequenceType(obj):
        return isinstance(obj, (basestring, list, tuple, collections.deque)) or \
               (hasattr(obj, '__getitem__') and not hasattr(obj, 'keys'))

makes the test useful where you want to know you can safely treat an object as a mapping (or sequence) *and* where you want to tell the difference. The only code that would break is use of mapping objects that don't define ``keys`` and sequences that do. I imagine these must be very rare and *would* be interested in seeing real code that does break. Especially if that code cannot be trivially rewritten to use the first example. All the best, Michael Foord ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] defaultdict and on_missing()
A minor related point about on_missing(): Haven't we learned from regrets over the .next() method of iterators that all magically invoked methods should be named using the __xxx__ pattern? Shouldn't it be named __on_missing__() instead? -- Michael Chermside ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] operator.is*Type
But given:

    True True Instance w getitem <type 'instance'>
    True True NewStyle Instance w getitem <class '__main__.cng'>
    True True [] <class UserList.UserList at 0x00F11B70>
    True True {} <type 'instance'>

(Last one is UserDict.) I can't conceive of circumstances where this is useful without duck typing *as well*. Yawn. Give it up. For user defined instances, these functions can only discriminate between the presence or absence of __getitem__. If you're trying to distinguish between sequences and mappings for instances, you're on your own with duck-typing. Since there is no mandatory mapping or sequence API, the operator module functions cannot add more checks without getting some false negatives (your original example is a case in point). Use the functions as-is and add your own isinstance checks for your own personal definition of what makes a mapping a mapping and what makes a sequence a sequence. Or better yet, stop designing APIs that require you to differentiate things that aren't really different ;-) Raymond ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Almann T. Goo wrote: As far as I remember, Guido wasn't particularly opposed to the idea, but the discussion fizzled out after having failed to reach a consensus on an obviously right way to go about it. My apologies for bringing this debated topic to the front lines again--that said, I think there have been good, constructive things said again, and sometimes it doesn't hurt to kick up an old topic. After poring through some of the list archive threads and reading through this thread, it seems clear to me that the community isn't all that keen on fixing this issue--which was my goal to ferret out. For me this is one of those things where the Pythonic thing to do is not so clear--and that mysterious, enigmatic definition of what it means to be Pythonic can be quite individual, so I definitely don't want to waste my time arguing what that means. The most compelling argument for not doing anything about it is that the use cases are probably not that many--that in itself makes me less apt to push much harder--especially since my pragmatic side agrees with a lot of what has been said in this regard. IMO, having properly nested scopes in Python in a sense made closures a natural idiom of the language and part of its user interface. By not allowing the name re-binding, it almost seems like that user interface has a rough edge that is almost too easy to get cut on. This inelegance seems very un-Pythonic to me.

If you are looking for rough edges about nested scopes in Python, this is probably worse:

    >>> x = []
    >>> for i in range(10):
    ...     x.append(lambda : i)
    ...
    >>> [y() for y in x]
    [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

although experienced people can live with it. The fact is that when nested scopes were imported from the likes of Scheme, it was not considered that in Scheme, for example, looping constructs introduce new scopes, so this works more as expected there. There were long threads about this at some point too.
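The usual workaround for the lambda example above is to capture the loop variable as a default argument, so each function gets its own binding at definition time:

```python
x = []
for i in range(10):
    # i=i evaluates i now and stores it as a per-function default,
    # instead of closing over the single shared loop variable.
    x.append(lambda i=i: i)

print([y() for y in x])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```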
Idioms and features mostly never port straightforwardly from language to language. For example, Python has nothing with the explicit context introduction and grouping of a Scheme 'let', so it is arguable that nested-scope code, especially with rebindings, would be less clear and readable than in Scheme (tastes in parentheses kept aside). Anyhow, good discussion. Cheers, Almann -- Almann T. Goo [EMAIL PROTECTED]
[Python-Dev] PEP 358 (bytes type) comments
First off, thanks to Neil for writing this all down. The whole thread of discussion on the bytes type was rather long and thus hard to follow. Nice to finally have it written down in a PEP. Anyway, a few comments on the PEP. One, should the hex() method instead be an attribute, implemented as a property? It seems like static data that is entirely based on the value of the bytes object and thus is not properly represented by a method. Next, why are the __*slice__ methods to be defined? The docs say they are deprecated. And for the open-ended questions, I don't think sort() is needed. Lastly, maybe I am just dense, but it took me a second to realize that it will most likely return the ASCII string for __str__() for use in something like socket.send(), but it isn't explicitly stated anywhere. There is a chance someone might think that __str__ will somehow return the sequence of integers as a string. -Brett
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Greg Ewing wrote: Jeremy Hylton wrote: The names of naming statements are quite hard to get right, I fear. My vote goes for 'outer'. And if this gets accepted, remove 'global' in 3.0. In 3.0 we could remove 'global' even without 'outer', and make module global scopes read-only, not rebindable after the top-level code has run (i.e. more like function body scopes). The only free-for-all namespaces would be class and instance ones. I can think of some gains from this. .3 wink
Re: [Python-Dev] Pre-PEP: The bytes object
On Thu, Feb 16, 2006 at 12:47:22PM -0800, Guido van Rossum wrote: BTW, for folks who want to experiment, it's quite simple to create a working bytes implementation by inheriting from array.array. Here's a quick draft (which only takes str instance arguments): Here's a more complete prototype. Also, I checked in the PEP as #358 after making changes suggested by Guido. Neil

    import sys
    from array import array
    import re
    import binascii

    class bytes(array):
        __slots__ = []

        def __new__(cls, initialiser=None, encoding=None):
            b = array.__new__(cls, 'B')
            if isinstance(initialiser, basestring):
                if isinstance(initialiser, unicode):
                    if encoding is None:
                        encoding = sys.getdefaultencoding()
                    initialiser = initialiser.encode(encoding)
                initialiser = [ord(c) for c in initialiser]
            elif encoding is not None:
                raise TypeError("explicit encoding invalid for "
                                "non-string initialiser")
            if initialiser is not None:
                b.extend(initialiser)
            return b

        @classmethod
        def fromhex(cls, data):
            data = re.sub(r'\s+', '', data)
            return bytes(binascii.unhexlify(data))

        def __str__(self):
            return self.tostring()

        def __repr__(self):
            return "bytes(%r)" % self.tolist()

        def __add__(self, other):
            if isinstance(other, array):
                return bytes(super(bytes, self).__add__(other))
            return NotImplemented

        def __mul__(self, n):
            return bytes(super(bytes, self).__mul__(n))

        __rmul__ = __mul__

        def __getslice__(self, i, j):
            return bytes(super(bytes, self).__getslice__(i, j))

        def hex(self):
            return binascii.hexlify(self.tostring())

        def decode(self, encoding):
            return self.tostring().decode(encoding)

___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] release plan for 2.5 ?
On Sunday 12 February 2006 21:51, Thomas Wouters wrote: Well, in the past, features -- even syntax changes -- have gone in between the last beta and the final release (but reminding Guido might bring him to tears of regret. ;) Features have also gone into what would have been 'bugfix releases' if you looked at the numbering alone (1.5 - 1.5.1 - 1.5.2, for instance.) The past doesn't have a very impressive track record... *cough* Go on. Try slipping a feature into a bugfix release now, see how loudly you can make an Australian swear... See also PEP 006. Do I need to add a bad language caveat in it?
Re: [Python-Dev] PEP 358 (bytes type) comments
On Feb 22, 2006, at 1:22 PM, Brett Cannon wrote: First off, thanks to Neil for writing this all down. The whole thread of discussion on the bytes type was rather long and thus hard to follow. Nice to finally have it written down in a PEP. Anyway, a few comments on the PEP. One, should the hex() method instead be an attribute, implemented as a property? Seems like static data that is entirely based on the value of the bytes object and thus is not properly represented by a method. Next, why are the __*slice__ methods to be defined? Docs say they are deprecated. And for the open-ended questions, I don't think sort() is needed. sort would be totally useless for bytes. array.array doesn't have sort either. Lastly, maybe I am just dense, but it took me a second to realize that it will most likely return the ASCII string for __str__() for use in something like socket.send(), but it isn't explicitly stated anywhere. There is a chance someone might think that __str__ will somehow return the sequence of integers as a string. That would be a bad idea given that bytes are supposed to make the str type go away. It's probably better to make __str__ return __repr__ like it does for most types. If the bytes type supports the buffer API (one would hope so), functions like socket.send should do the right thing as-is. http://docs.python.org/api/bufferObjects.html -bob
Re: [Python-Dev] release plan for 2.5 ?
On Thursday 23 February 2006 09:19, Guido van Rossum wrote: However the definition of feature vs. bugfix isn't always crystal clear. Some things that went into 2.4 recently felt like small features to me; but others may disagree:

- fixing chunk.py to allow chunk size to be 2GB
- supporting Unicode filenames in fileinput.py

Are these features or bugfixes? Sure, the line isn't so clear sometimes. I consider both of these bugfixes, but others could disagree. True/False, on the other hand, I don't think anyone disagrees about wink/duck This stuff is always open for discussion, of course. Anthony -- Anthony Baxter [EMAIL PROTECTED] It's never too late to have a happy childhood.
Re: [Python-Dev] operator.is*Type
Raymond Hettinger wrote: Your example simply highlights the consequences of one of Python's most basic, original design choices (using getitem for both sequences and mappings). That choice is now so fundamental to the language that it cannot possibly change. Hmm - just a thought ... Since we're adding the __index__ magic method, why not have a __getindexed__ method for sequences. Then the semantics of indexing operations would be something like:

    if hasattr(obj, '__getindexed__'):
        return obj.__getindexed__(val.__index__())
    else:
        return obj.__getitem__(val)

Similarly __setindexed__ and __delindexed__. This would allow distinguishing between sequences and mappings in a fairly backwards-compatible way. It would also enforce that only indexes can be used for sequences. The backwards-incompatibility comes in when you have a type that implements __getindexed__, and a subclass that implements __getitem__ e.g. if `list` implemented __getindexed__ then any `list` subclass that overrode __getitem__ would fail. However, I think we could make it 100% backwards-compatible for the builtin sequence types if they just had __getindexed__ delegate to __getitem__. Effectively:

    class list(object):
        def __getindexed__(self, index):
            return self.__getitem__(index)

Tim Delaney
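Since __getindexed__ never became part of the language, the proposed dispatch can only be simulated in plain Python. A sketch of the rule quoted above, with all names hypothetical:

```python
# Hypothetical: neither __getindexed__ nor this dispatch function exist
# in Python itself; this only simulates the proposed semantics.
def getindexed_dispatch(obj, val):
    if hasattr(obj, "__getindexed__"):
        return obj.__getindexed__(val.__index__())
    return obj.__getitem__(val)

class Seq:
    """A 'sequence' under the proposed protocol."""
    def __init__(self, items):
        self._items = list(items)
    def __getindexed__(self, index):
        return self._items[index]

print(getindexed_dispatch(Seq("abc"), 1))   # b
print(getindexed_dispatch({"k": 42}, "k"))  # 42
```

Note how the dict falls through to __getitem__, while the sequence path forces its argument through __index__() first, which is what would "enforce that only indexes can be used for sequences."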
[Python-Dev] buildbot, and test failures
It took 2 hours, but I caught up on Python-dev email. Hoorah. So, a couple of things - the trunk has test failures for me, right now.

    test test_email failed -- Traceback (most recent call last):
      File "/home/anthony/src/py/pytrunk/python/Lib/email/test/test_email.py", line 2111, in test_parsedate_acceptable_to_time_functions
        eq(time.localtime(t)[:6], timetup[:6])
    AssertionError: (2003, 2, 5, 14, 47, 26) != (2003, 2, 5, 13, 47, 26)

Right now, Australia's in daylight savings; I suspect that's the problem here. I also see intermittent failures from test_socketserver: "test test_socketserver crashed -- socket.error: (111, 'Connection refused')" is the only error message. When it fails, regrtest fails to exit - it just sits there after printing out the summary. This suggests that there's a threaded server not getting cleaned up correctly. test_socketserver could probably do with a rewrite. Who's the person who hands out buildbot username/password pairs? I have an Ubuntu x86 box here that can become one (I think the only Linux, currently, is Gentoo...) Anthony -- Anthony Baxter [EMAIL PROTECTED] It's never too late to have a happy childhood.
Re: [Python-Dev] buildbot vs. Windows
[Neal Norwitz] ... I also think I know how to do the double builds (one release and one debug). But it's too late for me to change it tonight without screwing it up. I'm not mad :-). The debug build is more fruitful than the release build for finding problems, so doing two debug-build runs is an improvement (keeping in mind that some bugs only show up in release builds, though -- for example, subtly incorrect C code that works differently depending on whether compiler optimization is in effect). The good/bad news after this change is: http://www.python.org/dev/buildbot/all/g4%20osx.4%20trunk/builds/145/step-test/0 A seg fault on Mac OS when running with -r. :-( Yay! That's certainly good/bad news. Since I always run with -r, I've had the fun of tracking most of these down. Sometimes it's very hard, sometimes not. regrtest's -f option is usually needed, to force running the tests in exactly the same order, then commenting test names out in binary-search fashion to get a minimal subset. Alas, half the time the cause for a -r segfault turns out to be an error in refcounting or in setting up gc'able containers, and has nothing in particular to do with the specific tests being run. Those are the very hard ones ;-) Setting the gc threshold to 1 (do a full collection on every allocation) can sometimes provoke such problems easily.
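The gc trick Tim mentions is a one-liner from Python code (the default thresholds shown in the comment are typical for CPython and may vary by version):

```python
import gc

print(gc.get_threshold())  # typically (700, 10, 10)
# Collect on (almost) every allocation, so errors in refcounting or in
# setting up gc'able containers surface as early as possible.
gc.set_threshold(1)
print(gc.get_threshold()[0])  # 1
```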
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Josiah Carlson wrote: However, I believe global was and is necessary for the same reasons for globals in any other language. Oddly, in Python, 'global' isn't actually necessary, since the module can always import itself and use attribute access. Clearly, though, Guido must have thought at the time that it was worth providing an alternative way. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
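Greg's observation can be demonstrated without touching any real module by building a throwaway module object; the sys.modules lookup stands in for the module importing itself. The module name and function below are made up for the demo:

```python
import sys
import types

# Sketch: a module can rebind its own globals through attribute access
# on itself, making 'global' strictly redundant.
mod = types.ModuleType("selfbind_demo")
mod.counter = 0
sys.modules["selfbind_demo"] = mod

code = """
import sys

def bump():
    this = sys.modules[__name__]   # the module looking itself up
    this.counter += 1              # rebinds a module global, no 'global' needed
"""
exec(code, mod.__dict__)

mod.bump()
print(mod.counter)  # 1
```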
Re: [Python-Dev] defaultdict proposal round three
Fredrik Lundh wrote: fwiw, the first google hit for autodict appears to be part of someone's link farm At this website we have assistance with autodict. In addition to information for autodict we also have the best web sites concerning dictionary, non profit and new york. Hmmm, looks like some sort of bot that takes the words in your search and stuffs them into its response. I wonder if they realise how silly the results end up sounding? I've seen these sorts of things before, but I haven't quite figured out yet how they manage to get into Google's database if they're auto-generated. Anyone have any clues what goes on? -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] bytes.from_hex()
Terry Reedy wrote: Greg Ewing [EMAIL PROTECTED] wrote in message Efficiency is an implementation concern. It is also a user concern, especially if inefficiency overruns memory limits. Sure, but what I mean is that it's better to find what's conceptually right and then look for an efficient way of implementing it, rather than letting the implementation drive the design. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
On 22-Feb-06, at 9:28 PM, [EMAIL PROTECTED] wrote: On 21-Feb-06, at 11:21 AM, Almann T. Goo [EMAIL PROTECTED] wrote: Why not just use a class?

    def incgen(start=0, inc=1):
        class incrementer(object):
            a = start - inc
            def __call__(self):
                self.a += inc
                return self.a
        return incrementer()

    a = incgen(7, 5)
    for n in range(10):
        print a(),

Because I think that this is a workaround for a concept that the language doesn't support elegantly with its lexically nested scopes. IMO, you are emulating name rebinding in a closure by creating an object to encapsulate the name you want to rebind--you don't need this workaround if you only need to access free variables in an enclosing scope. I provided a "lighter" example that didn't need a callable object but could use any mutable such as a list. This kind of workaround is needed as soon as you want to re-bind a parent scope's name, except in the case when the parent scope is the global scope (since there is the "global" keyword to handle this). It's this dichotomy that concerns me, since it seems to be against the elegance of Python--at least in my opinion. It seems artificially limiting that enclosing-scope name rebinds are not provided for by the language, especially since the behavior with the global scope is not so. In a nutshell, I am proposing a solution to make nested lexical scopes orthogonal with the global scope, removing a "wart," as Jeremy put it, in the language. -Almann -- Almann T. Goo [EMAIL PROTECTED]

If I may be so bold, couldn't this be addressed by introducing a "rebinding" operator? So the ' = ' operator would continue to create a new name in the current scope, and the (say) ' := ' operator would require an existing name to rebind. The two operators would highlight the special way Python handles variable / name assignment, which many newbies miss. (from someone who was surprised by this quirk of Python before: http://www.thescripts.com/forum/thread43418.html) -Brendan -- Brendan Simons

Sorry, this got hung up in my email outbox.
I see the thread has touched on this idea in the meantime. So, yeah. Go team. Brendan -- Brendan Simons
Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
Steven Bethard wrote: And, as you mention, it's consistent with the relative import feature. Only rather vaguely -- it's really somewhat different. With imports, .foo is an abbreviation for myself.foo, where myself is the absolute name for the current module, and you could replace all instances of .foo with that. But in the suggested scheme, .foo wouldn't have any such interpretation -- there would be no other way of spelling it. Also, with imports, the dot refers to a single well-defined point in the module-name hierarchy, but here it would imply a search upwards through the scope hierarchy. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] bytes.from_hex()
Ron Adam wrote: While I prefer constructors with an explicit encode argument, and use a recode() method for 'like to like' coding. Then the whole encode/decode confusion goes away. I'd be happy with that, too. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
At 03:49 PM 2/23/2006 +1300, Greg Ewing wrote: Steven Bethard wrote: And, as you mention, it's consistent with the relative import feature. Only rather vaguely -- it's really somewhat different. With imports, .foo is an abbreviation for myself.foo, where myself is the absolute name for the current module, and you could replace all instances of .foo with that. Actually, import .foo is an abbreviation for import myparent.foo, not import myparent.myself.foo.
[Python-Dev] getdefault(), the real replacement for setdefault()
Guido's on_missing() proposal is pretty good for what it is, but it is not a replacement for setdefault(). The use cases for a derivable, definition- or instantiation-time framework are different from the call-site-based decision being made with setdefault(). The difference is that in the former case the class designer or instantiator gets to decide what the default is, and in the latter (i.e. current) case the user gets to decide. Going back to first principles, the two biggest problems with today's setdefault() are 1) the default object gets instantiated whether you need it or not, and 2) the idiom is not very readable. To directly address these two problems, I propose a new method called getdefault() with the following signature:

    def getdefault(self, key, factory)

This yields the following idiom:

    d.getdefault('foo', list).append('bar')

Clearly this completely addresses problem #1. The implementation is simple and obvious, and there's no default object instantiated unless the key is missing. I think #2 is addressed nicely too, because getdefault() shifts the focus onto what the method returns rather than the effect of the method on the target dict. Perhaps that's enough to make the chained operation on the returned value feel more natural. getdefault() also looks more like get(), so maybe that helps it be less jarring. This approach also seems to address Raymond's objections, because getdefault() isn't special the way on_missing() would be. Anyway, I don't think it's an either/or choice with Guido's subclass. Instead I think they are different use cases. I would add getdefault() to the standard dict API, remove (eventually) setdefault(), and add Guido's subclass in a separate module. But I /wouldn't/ clutter the built-in dict's API with on_missing(). -Barry P.S.
    _missing = object()

    def getdefault(self, key, factory):
        value = self.get(key, _missing)
        if value is _missing:
            value = self[key] = factory()
        return value
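Wrapped in a dict subclass so it can actually be exercised (getdefault() was never added to dict itself; Python 2.5 added collections.defaultdict instead), the proposal behaves like this:

```python
_missing = object()  # sentinel: distinguishes "absent" from a stored None

class GetDefaultDict(dict):
    # Sketch of Barry's proposed method as a subclass.
    def getdefault(self, key, factory):
        value = self.get(key, _missing)
        if value is _missing:
            value = self[key] = factory()
        return value

d = GetDefaultDict()
d.getdefault("foo", list).append("bar")
print(d)  # {'foo': ['bar']}
# Unlike d.setdefault("foo", []), no empty list is built once the key exists.
```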
Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
Steven Bethard wrote: And, as you mention, it's consistent with the relative import feature. Greg Ewing wrote: With imports, .foo is an abbreviation for myself.foo, where myself is the absolute name for the current module, and you could replace all instances of .foo with that. Phillip J. Eby wrote: Actually, import .foo is an abbreviation for import myparent.foo, not import myparent.myself.foo. If we wanted to be fully consistent with the relative import mechanism, we would require as many dots as nested scopes. So:

    def incrementer(val):
        def inc():
            .val += 1
            return .val
        return inc

but also:

    def incrementer_getter(val):
        def incrementer():
            def inc():
                ..val += 1
                return ..val
            return inc
        return incrementer

(Yes, I know the example is silly. It's not meant as a use case, just to demonstrate the usage of dots.) I actually don't care which way it goes here, but if you want to make the semantics as close to the relative import semantics as possible, then this is the way to go. STeVe -- Grammar am for people who can't think for myself. --- Bucky Katt, Get Fuzzy
Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
Steven Bethard wrote: Phillip J. Eby wrote: Actually, import .foo is an abbreviation for import myparent.foo, not import myparent.myself.foo. Oops, sorry, you're right. s/myself/myparent/g -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] buildbot, and test failures
Anthony Baxter wrote: Who's the person who hands out buildbot username/password pairs? That's me. I have an Ubuntu x86 box here that can become one (I think the only linux, currently, is Gentoo...) How different are the Linuxes, though? How many of them do we need? Regards, Martin
Re: [Python-Dev] bytes.from_hex()
James Y Knight wrote: Some MIME sections might have a base64 Content-Transfer-Encoding, others might be 8bit encoded, others might be 7bit encoded, others might be quoted-printable encoded. I stand corrected -- in that situation you would have to encode the characters before combining them with other material. However, this doesn't change my view that the result of base64 encoding by itself is characters, not bytes. To go straight to bytes would require assuming an encoding, and that would make it *harder* to use in cases where you wanted a different encoding, because you'd first have to undo the default encoding and then re-encode it using the one you wanted. It may be reasonable to provide an easy way to go straight from raw bytes to ascii-encoded-base64 bytes, but that should be a different codec. The plain base64 codec should produce text. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
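For reference, Python 3 eventually settled on bytes-to-bytes for base64.b64encode, but the output is guaranteed to be pure ASCII, so the "characters" view Greg argues for is always one decode away:

```python
import base64

raw = b"hello"
encoded = base64.b64encode(raw)     # bytes in, ASCII-only bytes out
text = encoded.decode("ascii")      # the character view of the encoding
print(text)  # aGVsbG8=

# Round-trips back to the original bytes.
print(base64.b64decode(text.encode("ascii")))  # b'hello'
```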
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Samuele Pedroni wrote: If you are looking for rough edges about nested scopes in Python this is probably worse:

    >>> x = []
    >>> for i in range(10):
    ...     x.append(lambda : i)
    ...
    >>> [y() for y in x]
    [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]

As an aside, is there any chance that this could be changed in 3.0? I.e. have the for-loop create a new binding for the loop variable on each iteration. I know Guido seems to be attached to the idea of being able to use the value of the loop variable after the loop exits, but I find that to be a dubious practice readability-wise, and I can't remember ever using it. There are other ways of getting the same effect, e.g. assigning it to another variable before breaking out of the loop, or putting the loop in a function and using return. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
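Short of changing the for-loop itself, the standard way to get a fresh binding per iteration is to route each value through a function call, since every call creates a new scope (and hence a new closure cell):

```python
def make_getter(value):
    # Each call creates a fresh scope, so each closure
    # captures its own `value` rather than a shared loop variable.
    def get():
        return value
    return get

x = [make_getter(i) for i in range(10)]
print([y() for y in x])  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```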
Re: [Python-Dev] operator.is*Type
Delaney, Timothy (Tim) wrote: Since we're adding the __index__ magic method, why not have a __getindexed__ method for sequences. I don't think this is a good idea, since it would be re-introducing all the confusion that the existence of two C-level indexing slots has led to, this time for user-defined types. The backwards-incompatibility comes in when you have a type that implements __getindexed__, and a subclass that implements __getitem__ I don't think this is just a backwards-incompatibility issue. Having a single syntax that can correspond to more than one special method is inherently ambiguous. What do you do if both are defined? Sure you can come up with some rule to handle it, but it's better to avoid the situation in the first place. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] PEP for Better Control of Nested Lexical Scopes
Almann T. Goo wrote: (although rebinding a name in the global scope from a local scope is really just a specific case of that). That's what rankles people about this, I think -- there doesn't seem to be a good reason for treating the global scope so specially, given that all scopes could be treated uniformly if only there were an 'outer' statement. All the arguments I've seen in favour of the status quo seem like rationalisations after the fact. Since there were no nested lexical scopes back then, there was no need to have a construct for arbitrary enclosing scopes. However, if nested scopes *had* existed back then, I rather suspect we would have had an 'outer' statement from the beginning, or else 'global' would have been given the semantics we are now considering for 'outer'. Of all the suggestions so far, it seems to me that 'outer' is the least radical and most consistent with what we already have. How about we bung it in and see how it goes? We can always yank it out in 3.0 if it turns out to be a horrid mistake and we get swamped with a terabyte of grievously abusive nested scope code. :-) -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
Re: [Python-Dev] Path PEP: some comments (equality)
Mark Mc Mahon wrote: Should the path class implement an __eq__ method that might do some of the following things:

- Get the absolute path of both self and the other path

I don't think that any path operations should implicitly touch the file system like this. The paths may not represent real files or may be for a system other than the one the program is running on.

- normcase both

Not sure about this one either. When dealing with remote file systems, it can be hard to know whether a path will be interpreted as case-sensitive or not. This can be a problem even with local filesystems, e.g. on MacOSX where you can have both HFS (case-insensitive) and Unix (case-sensitive) filesystems mounted. -- Greg Ewing, Computer Science Dept, +--+ University of Canterbury, | Carpe post meridiam! | Christchurch, New Zealand | (I'm not a morning person.) | [EMAIL PROTECTED] +--+
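A quick illustration of why normcase-based equality gives platform-dependent answers (the path below is made up):

```python
import os.path

p = "SomeDir/README.TXT"
# normcase folds case (and forward slashes) only on case-insensitive
# platforms such as Windows; on POSIX it returns the string unchanged,
# so an __eq__ built on it would behave differently per platform.
print(os.path.normcase(p))
```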
Re: [Python-Dev] operator.is*Type
Fuzzyman wrote:
> The operator module defines two functions: isMappingType and isSequenceType. These protocols are loosely defined. Any object which has a ``__getitem__`` method defined could support either protocol.

These functions are actually testing for the presence of two different __getitem__ methods at the C level: one in the mapping substructure of the type object, and the other in the sequence substructure. This only works for types implemented in C which make use of this distinction. It's not much use for user-defined classes, where the presence of a __getitem__ method causes both of these slots to become populated.

Having two different slots for __getitem__ seems to have been an ill-considered feature in the first place, and would probably best be removed in 3.0. I wouldn't mind if these two functions went away.

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
Carpe post meridiam! (I'm not a morning person.)
[EMAIL PROTECTED]
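[Editor's sketch of the approach the stdlib later adopted: the ABCs in collections.abc (added after this thread, in Python 2.6/3.0) test declared interfaces rather than C-level slots, so a bare __getitem__ matches neither protocol.]

```python
from collections.abc import Mapping, Sequence

class JustGetItem:
    # A single __getitem__ populates both C-level slots in the old
    # scheme, but declares neither protocol to the ABC machinery.
    def __getitem__(self, key):
        return key

obj = JustGetItem()
assert not isinstance(obj, Mapping)
assert not isinstance(obj, Sequence)

# Real mappings and sequences are registered with the ABCs:
assert isinstance({}, Mapping)
assert isinstance([], Sequence)
```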
Re: [Python-Dev] bytes.from_hex()
>>>>> "Greg" == Greg Ewing <[EMAIL PROTECTED]> writes:

Greg> Stephen J. Turnbull wrote:
Greg> > Base64 is a (family of) wire protocol(s). It's not clear to me that it makes sense to say that the alphabets used by baseNN encodings are composed of characters,

Greg> Take a look at [this that the other]

Those references use "character" in an ambiguous and ill-defined way. Trying to impose Python unicode object semantics on vague "characters" is a bad idea IMO.

Greg> Which seems to make it perfectly clear that the result of the encoding is to be considered as characters, which are not necessarily going to be encoded using ascii.

Please define "character", and explain how its semantics map to Python's unicode objects.

Greg> So base64 on its own is *not* a wire protocol. Only after encoding the characters do you have a wire protocol.

No, base64 isn't a wire protocol. Rather, it's a schema for a family of wire protocols, whose alphabets are heuristically chosen on the assumption that code units which happen to correspond to alpha-numeric code points in a commonly-used coded character set are more likely to pass through a communication channel without corruption. Note that I have _precisely_ defined what I mean. You still have the problem that you haven't defined "character", and that is a real problem, see below.

> I don't see any case for correctness here, only for convenience,

Greg> I'm thinking of convenience, too. Keep in mind that in Py3k, 'unicode' will be called 'str' (or something equally neutral like 'text') and you will rarely have to deal explicitly with unicode codings, this being done mostly for you by the I/O objects. So most of the time, using base64 will be just as convenient as it is today: base64_encode(my_bytes) and write the result out somewhere.

Convenient, yes, but incorrect. Once you mix those bytes with the Python string type, they become subject to all the usual operations on characters, and there's no way for Python to tell you that you didn't want to do that. I.e.,

Greg> Whereas if the result is text, the right thing happens automatically whatever the ultimate encoding turns out to be. You can take the text from your base64 encoding, combine it with other text from any other source to form a complete mail message or xml document or whatever, and write it out through a file object that's using any unicode encoding at all, and the result will be correct.

Only if you do no transformations that will harm the base64-encoding. This is why I say base64 is _not_ based on characters, at least not in the way they are used in Python strings. It doesn't allow any of the usual transformations on characters that might be applied globally to a mail composition buffer, for example. In other words, you don't escape from the programmer having to know what he's doing. EIBTI, and the setup I advocate forces the programmer to explicitly decide where to convert base64 objects to a textual representation. This reminds him that he'd better not touch that text.

Greg> The reason I say it's *correct* is that if you go straight from bytes to bytes, you're *assuming* the eventual encoding is going to be an ascii superset. The programmer is going to have to know about this assumption and understand all its consequences and decide whether it's right, and if not, do something to change it.

I'm not assuming any such thing, except in the context of analysis of implementation efficiency. And the programmer needs to know about the semantics of text that is actually a base64-encoded object, and that they are different from string semantics. This is something that programmers are used to dealing with in the case of Python 2.x str and C char[]; the whole point of the unicode type is to allow the programmer to abstract from that when dealing with human-readable text. Why confuse the issue? And in the classroom, you're just going to confuse students by telling them that UTF-8 --[Unicode codec]--> Python string is *decoding* but UTF-8 --[base64 codec]--> Python string is *encoding*, when MAL is telling them that --> Python string is always decoding.

Greg> Which is why I think that only *unicode* codings should be available through the .encode and .decode interface. Or alternatively there should be something more explicit like .unicode_encode and .unicode_decode that is thus restricted. Also, if most unicode coding is done in the I/O objects, there will be far less need for programmers to do explicit unicode coding in the first place, so likely it will become more of an advanced topic, rather than something you need to come to grips with on day one of using unicode, like it is now.

So then you bring it
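[Editor's note: the design Python 3 eventually shipped matches the bytes-to-bytes position argued above. base64.b64encode maps bytes to bytes, and joining the result with text requires an explicit decode, which is exactly the visible decision point Turnbull advocates.]

```python
import base64

payload = b'\x00\xff\x10 binary payload'
encoded = base64.b64encode(payload)   # bytes in, bytes out
assert isinstance(encoded, bytes)

# Mixing with text is an explicit step, not an implicit promotion:
text = encoded.decode('ascii')
assert base64.b64decode(text) == payload
```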
Re: [Python-Dev] Unifying trace and profile
On 2/21/06, Robert Brewer [EMAIL PROTECTED] wrote:
> 1. Allow trace hooks to receive c_call, c_return, and c_exception events (like profile does).

I can easily make this modification. You can also register the same bound method for trace and profile, which sort of eliminates this problem.

> 2. Allow profile hooks to receive line events (like trace does).

You really don't want this in the general case. Line events make profiling *really* slow, and they're not that accurate (although many thanks to Armin last year for helping me make them much more accurate). I guess what you require is to be able to selectively turn on events, thus eliminating the notion of 'trace' or 'profile' entirely, but I don't have a good idea of how to implement that at least as efficiently as the current system at the moment - I'm sure it could be done, I just haven't put any thought into it.

> 3. Expose new sys.gettrace() and getprofile() methods, so trace and profile functions that want to play nice can call sys.settrace/setprofile(None) only if they are the current hook.

Not a bad idea, although are you really running into this problem a lot?

> 4. Make the same move that sys.exitfunc -> atexit made (from a single function to multiple functions via registration), so multiple tracers/profilers can play nice together.

It seems very unlikely that you'll want to have a trace hook and profile hook installed at the same time, given the extreme unreliability this will introduce into the profiler.

> 5. Allow the core to filter on the event arg before hook(frame, event, arg) is called.

What do you mean by this, exactly? How would you use this feature?

> 6. Unify tracing and profiling, which would remove a lot of redundant code in ceval and sysmodule and free up some space in the PyThreadState struct to boot.

The more events you throw into profiling, the slower it gets, however. Line events, while a nice thing to have theoretically, would probably make a profiler useless. If you want to create line-by-line timing data, we're going to have to look for a more efficient way (like sampling).

> 7. As if the above isn't enough of a dream, it would be nice to have a bytecode tracer, which didn't bother with the f_lineno logic in maybe_call_line_trace, but just called the hook on every instruction.

I'm working on one, but given how much time I've had to work on my profiler in the last year, I'm not even going to guess when I'll get a real shot at looking at that. My long-term goal is to eliminate profiling and tracing from the core interpreter entirely and implement the functionality in such a way that they don't cost you when not in use (i.e., implement profilers and debuggers which poke into the process from the outside, rather than be supported natively through events). This isn't impossible, but it's difficult because of the large variety of platforms. I have access to most of them, but again, my time is hugely constrained right now for python development work.

--
Nick
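[Editor's sketch of the "play nice" pattern from point 3, using sys.gettrace(), which did later land in Python 2.6: a hook uninstalls itself only if it is still the installed hook.]

```python
import sys

calls = []

def my_trace(frame, event, arg):
    # Record only call events; return None so no per-line tracing occurs.
    if event == 'call':
        calls.append(frame.f_code.co_name)
    return None

sys.settrace(my_trace)

def traced_function():
    return 42

traced_function()

# Cooperative teardown: only remove the hook if it is still ours.
if sys.gettrace() is my_trace:
    sys.settrace(None)

assert 'traced_function' in calls
```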
Re: [Python-Dev] bytes.from_hex()
>>>>> "Ron" == Ron Adam <[EMAIL PROTECTED]> writes:

Ron> Terry Reedy wrote:
Ron> > I prefer the shorter names and using recode, for instance, for bytes to bytes.

Ron> While I prefer constructors with an explicit encode argument, and use a recode() method for 'like to like' coding.

'Recode' is a great name for the conceptual process, but the methods are directional. Also, in internationalization work, "recode" strongly connotes encodingA -> original -> encodingB, as in iconv. I do prefer constructors, as it's generally not a good idea to do encoding/decoding in-place for human-readable text, since the codecs are often lossy.

Ron> Then the whole encode/decode confusion goes away.

Unlikely. Errors like a_string.encode('base64').encode('base64') are all too easy to commit in practice.

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba, Tennodai 1-1-1, Tsukuba 305-8573 JAPAN
Ask not how you can do free software business; ask what your business can do for free software.
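[Editor's illustration of the double-encoding slip mentioned above, reproduced with the base64 module: nothing in the types prevents a second encode, and a single decode then recovers the wrong bytes.]

```python
import base64

data = b'payload'
once = base64.b64encode(data)
twice = base64.b64encode(once)   # the accidental second encode

assert base64.b64decode(once) == data    # correct round trip
assert base64.b64decode(twice) == once   # one decode only unwraps one layer
assert base64.b64decode(twice) != data
```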
Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
If we wanted to be fully consistent with the relative import mechanism, we would require as many dots as nested scopes. At first I was a bit taken aback by the syntax, but after reading PEP 328 (re: Relative Import) I think I can stomach the syntax a bit better ;). That said, -1, because I believe it adds more problems than the one it is designed to fix.

Part of me can appreciate using the prefixing dot as a way to spell "my parent's scope", since it does not add a new keyword and in this regard would appear to be equally as backwards compatible as the := proposal (of which I am not a particularly big fan either, but could probably get used to it). Since the current semantics allow *evaluation* of an enclosing scope's name by an un-punctuated name, var is a synonym for .var (if var is bound in the immediately enclosing scope). However, for *re-binding* an enclosing scope's name, the punctuated name is the only one we can use, so the semantics become more cluttered. This creates a problem that I would say is akin to the "dangling else" problem.

    def incrementer_getter(val):
        def incrementer():
            val = 5
            def inc():
                ..val += 1
                return val
            return inc
        return incrementer

Building on an example that Steve wrote to demonstrate the proposed syntax, you can see that a user may inadvertently pick up the enclosing scope's binding (val = 5) for the return value instead of what was presumably intended, the outermost bound parameter. Now remove the binding in the incrementer function, and it works the way the user probably thought. Because of this, I think that adding the dot to allow resolving a name in an explicit way hurts the language by adding a new gotcha to the existing name-binding semantics. I would be okay with this if all name access for enclosing scopes (binding and evaluation) required the dot syntax (as I believe Steve suggests for Python 3K) -- thus keeping the semantics cleaner -- but that would be incredibly backwards incompatible for what I would guess is *a lot* of code.
This is where the case for the re-bind operator (i.e. :=) or an 'outer'-style keyword is stronger -- the semantics in the language today are not adversely affected.

-Almann

--
Almann T. Goo [EMAIL PROTECTED]
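[Editor's note: Python 3 ultimately resolved this debate with neither dot-prefixes nor ':=' rebinding, but with the 'nonlocal' keyword (PEP 3104). The incrementer from the example reads like this under it.]

```python
def incrementer_getter(val):
    def inc():
        nonlocal val  # rebinds the enclosing function's 'val'
        val += 1
        return val
    return inc

inc = incrementer_getter(5)
assert inc() == 6
assert inc() == 7
```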