Re: [Python-Dev] Proposal: add odict to collections
Michael Foord wrote:
> Armin Ronacher wrote:
>> Hi, I noticed lately that quite a few projects are implementing their
>> own subclasses of `dict` that retain the order of the key/value pairs.
>> However, half of the implementations I came across do not implement the
>> whole dict interface, which leads to weird bugs; also, the performance
>> of a pure-Python implementation is not that great.
> I'm +1 - but this proposal has been made many times before, and people
> always argue about what features are needed or desirable. :-(

There's been a lot of controversy/confusion about ordered dicts. One source
of confusion is that people mean different things when they use the term
"ordered dict": in some cases it means a dictionary that remembers the order
of insertions, and in other cases it means a sorted dict, i.e. an associative
data structure in which the entries are kept sorted by key. (And I'm not sure
that those are the only two possibilities.) I would be more in favor of the
idea if we could come up with a less ambiguous naming scheme.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
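The insertion-order meaning of "ordered dict" can be sketched as a small `dict` subclass. This is a toy illustration only (note the comment about the incomplete interface, which is exactly the bug source Armin complains about); it is not how `collections.OrderedDict` was eventually implemented:

```python
class InsertionOrderedDict(dict):
    """Toy dict subclass that remembers the insertion order of keys.

    Only a few methods are overridden; a real implementation must cover
    the *entire* dict interface (pop, update, setdefault, etc.) or it
    will exhibit exactly the "weird bugs" described above.
    """

    def __init__(self):
        super().__init__()
        self._order = []          # keys in insertion order

    def __setitem__(self, key, value):
        if key not in self:
            self._order.append(key)
        dict.__setitem__(self, key, value)

    def __delitem__(self, key):
        dict.__delitem__(self, key)
        self._order.remove(key)

    def __iter__(self):
        return iter(self._order)

    def items(self):
        return [(k, self[k]) for k in self._order]

d = InsertionOrderedDict()
d['b'] = 1
d['a'] = 2
d['c'] = 3
print(d.items())   # [('b', 1), ('a', 2), ('c', 3)] -- insertion order kept
```

A "sorted dict", by contrast, would re-sort `_order` by key on every insert; the two structures satisfy very different use cases, which is why the naming matters.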
Re: [Python-Dev] Addition of pyprocessing module to standard lib.
Greg Ewing wrote:
> Jesse Noller wrote:
>> I am looking for any questions, concerns or benchmarks python-dev has
>> regarding the possible inclusion of the pyprocessing module in the
>> standard library.
> Sounds good, but I'd suggest giving it a more specific name than
> "processing", which is so generic as to be meaningless.

multiprocessing
Re: [Python-Dev] How we can get rid of eggs for 2.6 and beyond
Phillip J. Eby wrote:
> At 11:21 AM 3/21/2008 -0500, [EMAIL PROTECTED] wrote:
>> Joachim> I think the uninstall should _not_ 'rm -rf', but only 'rm' the
>> Joachim> files (and 'rmdir' directories, but not recursively) that it
>> Joachim> created, and that have not been modified in the meantime
>> Joachim> (after the installation).
>>
>> That's not sufficient. Suppose file C (e.g. /usr/local/etc/mime.types)
>> is in both packages A and B.
>>
>> Install A - this will create C
>> Install B - this might overwrite C, saving a copy, or it might retain A's copy.
>> Uninstall B - this has to know that C is used by A and not touch it
>
> Correct. However, in practice, B should not touch C unless the file is
> shared between them. This is a key issue for support of namespace
> packages, at least if we want to avoid using .pth files. (Which is what
> setuptools-built system packages do for namespace packages currently.)
> Of course, one possible solution is for both A and B to depend on a
> virtual package that contains C, such that both A and B can install it
> if it's not there, and list it in their dependencies. But this is one of
> the handful of open issues that needs to be resolved with Real Life
> Package Management people, such as Debian, Fedora, etc.

I've always thought that the right way to handle the dependency DAG is to
treat it as a garbage collection problem. Assume that for each package there
is a way to derive the following two pieces of information: (a) whether this
package was installed explicitly by the user or implicitly as the result of a
dependency, and (b) the set of dependencies for this package. Then, starting
with the list of 'explicit' packages as the root set, do a standard mark and
sweep; any package not marked is a candidate for removal.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
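The mark-and-sweep idea above can be sketched directly. The package names and dependency sets here are hypothetical, purely for illustration:

```python
def removal_candidates(packages):
    """packages maps name -> (explicitly_installed, set_of_dependencies).

    Mark every package reachable from the explicitly-installed roots,
    then sweep: anything unmarked is a candidate for removal.
    """
    marked = set()
    stack = [name for name, (explicit, _) in packages.items() if explicit]
    while stack:
        name = stack.pop()
        if name in marked:
            continue
        marked.add(name)
        stack.extend(packages[name][1])   # follow dependency edges

    return set(packages) - marked

# Hypothetical installed-package database:
packages = {
    'myapp':    (True,  {'requests'}),   # explicitly installed by the user
    'requests': (False, {'urllib3'}),    # pulled in as a dependency
    'urllib3':  (False, set()),
    'leftover': (False, set()),          # orphaned: nothing depends on it
}
print(removal_candidates(packages))      # {'leftover'}
```

This is exactly the scheme later used by tools like `apt-get autoremove`: the "explicit" bit plays the role of the GC root set.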
Re: [Python-Dev] Removing the GIL (Me, not you!)
Adam Olsen wrote:
> I'm now working on an approach that writes out refcounts in batches to
> reduce contention. The initial cost is much higher, but it scales
> better too. I've currently got it to just under 50% cost, meaning two
> threads is a slight net gain.

http://www.research.ibm.com/people/d/dfb/publications.html

Look at the various papers on 'Recycler'. The way it works is that for each
thread, there is an addref buffer and a decref buffer. The buffers are arrays
of pointers. Each time a reference is addref'd, it's appended to the addref
buffer, and likewise for decref. When a buffer gets full, it is added to a
queue and a new buffer is allocated.

There is a background thread that actually applies the refcounts from the
buffers and frees the objects. Since this background thread is the only
thread that ever touches the actual refcount field of the object, there's no
need for locking.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
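The Recycler-style buffering can be illustrated in Python. This is a conceptual sketch only: the real technique lives inside the C interpreter, the buffer size here is arbitrary, and a single combined buffer stands in for the separate addref/decref buffers:

```python
import queue
import threading

BUFFER_SIZE = 4   # arbitrary; real systems use much larger buffers

class RefcountCollector:
    """Per-thread refcount-delta buffers drained by one background
    thread, so only that thread ever touches the real counts."""

    def __init__(self):
        self.refcounts = {}                 # obj id -> count (worker-owned)
        self.full_buffers = queue.Queue()
        self.local = threading.local()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def _buffer(self):
        if not hasattr(self.local, 'buf'):
            self.local.buf = []             # fresh buffer for this thread
        return self.local.buf

    def _record(self, obj_id, delta):
        buf = self._buffer()
        buf.append((obj_id, delta))
        if len(buf) >= BUFFER_SIZE:         # buffer full: hand it off
            self.full_buffers.put(buf)
            self.local.buf = []

    def incref(self, obj_id):
        self._record(obj_id, +1)

    def decref(self, obj_id):
        self._record(obj_id, -1)

    def flush(self):
        """Push this thread's partial buffer and wait for the worker."""
        self.full_buffers.put(self._buffer())
        self.local.buf = []
        self.full_buffers.join()

    def _drain(self):
        while True:
            buf = self.full_buffers.get()
            for obj_id, delta in buf:       # only this thread mutates counts
                self.refcounts[obj_id] = self.refcounts.get(obj_id, 0) + delta
            self.full_buffers.task_done()

collector = RefcountCollector()
for _ in range(3):
    collector.incref('obj-1')
collector.decref('obj-1')
collector.flush()
print(collector.refcounts['obj-1'])   # 2
```

The point of the design is visible even in the sketch: mutator threads only append to thread-local lists (no contention), and the single drain thread needs no lock on the count field.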
[Python-Dev] Two spaces or one?
In PEP 9 there's a requirement that PEPs must follow the emacs convention of
2 spaces after a period. (I didn't know this was an emacs convention; I
thought it was a convention of people who used typewriters.) I've tried hard
to maintain this textual convention in my own PEPs, even though it's very
unnatural to me. But I see from looking at the other PEPs that this
convention is very inconsistently enforced - some have it and some don't.
Worse, I've had one person (who apparently wasn't aware of the rule) flag my
use of extra space after a period as a bug in my PEP.

(When I first learned to type, I used one space after a period. Then years
later, someone convinced me that two spaces was the proper style, and so I
switched to that for a few years. But later I switched back, because I
realized that most modern typographical layout engines seem to calculate
inter-sentence spacing properly when the number of space characters after a
period is one. And in HTML [which is how most people view PEPs anyway] it
doesn't matter, since the browser is going to collapse the extra space
anyway.)

So if we're not going to enforce the rule consistently (and it seems as if
we're not), can we then just remove it from PEP 9? I'm not saying that we
should change the rule to one space; I'm suggesting that we just drop the
requirement and let people use whatever they prefer.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [Python-3000] Issues with PEP 3101 (string formatting)
I haven't responded to this thread because I was hoping some of the original
proponents of the feature would come out to defend it. (Remember, 3101 is a
synthesis of a lot of people's ideas gleaned from many forum postings - in
some cases I am willing to defend particular aspects of the PEP, and in
others I just write down what I think the general consensus is.) That being
said - from what I've read so far, the evidence on both sides of the argument
seems anecdotal to me. I'd rather wait and see what more people have to say
on the topic.

-- Talin

Aurélien Campéas wrote:
> On Tue, Jun 19, 2007 at 08:20:25AM -0700, Guido van Rossum wrote:
>> Those are valid concerns. I'm cross-posting this to the python-3000
>> list in the hope that the PEP's author and defenders can respond. I'm
>> sure we can work something out.
>
> Thanks for raising this. It is horrible enough that I feel obliged to
> de-lurk. -10 on this part of PEP 3101.
>
>> Please keep further discussion on the [EMAIL PROTECTED] list.
>>
>> --Guido
>>
>> On 6/19/07, Chris McDonough [EMAIL PROTECTED] wrote:
>>> Wrt http://www.python.org/dev/peps/pep-3101/
>>>
>>> PEP 3101 says Py3K should allow item and attribute access syntax
>>> within string templating expressions, but "to limit potential
>>> security issues", access to underscore-prefixed names within
>>> attribute/item access expressions will be disallowed.
>
> People talking about potential security issues should have an obligation
> to show how their proposals *really* improve security (in general); this
> is of course a hard thing to do; mere hand-waving is not sufficient.
>
>>> I am a person who has lived with the aftermath of a framework
>>> designed to prevent data access by restricting access to underscore-
>>> prefixed names (Zope 2, ahem), and I've found it's very hard to
>>> explain and justify. As a result, I feel that this is a poor default
>>> policy choice for a framework.
>
> And it's even poorer in the context of a language (for it's probably
> harder to escape language-level restrictions than framework
> obscurities ...).
>
>>> In some cases, underscore names must become part of an object's
>>> external interface. Consider a URL with one or more underscore-
>>> prefixed path segment elements (because prefixing a filename with an
>>> underscore is a perfectly reasonable thing to do on a filesystem, and
>>> path elements are often named after file names) fed to a traversal
>>> algorithm that attempts to resolve each path element into an object
>>> by calling __getitem__ against the parent found by the last path
>>> element's traversal result. Perhaps this is poor design and
>>> __getitem__ should not be consulted here, but I doubt that highly,
>>> because there's nothing particularly special about calling a method
>>> named __getitem__ as opposed to some method named traverse.
>
> This is trying to make a technical argument, but the 'consenting adults'
> policy might be enough. In my experience, Zope forbidding access to
> _-prefixed attributes just led to workarounds for the limitation, thus
> adding more useless indirection to an already crufty code base. The
> result is more obfuscation and probably even less security (as in
> auditability of the code).
>
>>> The only precedent within Python 2 for this sort of behavior is
>>> limiting access to variables that begin with __ and which do not end
>>> with __ to the scope defined by a class and its instances. I
>>> personally don't believe this is a very useful feature, but it's
>>> still only an advisory policy and you can worm around it with enough
>>> gyrations.
>
> FWIW I've come to never use __attrs. The obfuscation feature seems to
> bring nothing but pain (the few times I've fallen into that trap as a
> beginner Python programmer).
>
>>> Given that security is a concern at all, the only truly reasonable
>>> way to limit security issues is to disallow item and attribute access
>>> completely within the string templating expression syntax. It seems
>>> gratuitous to me to encourage string templating expressions with
>>> item/attribute access, given that you could do it within the format
>>> arguments just as easily in the 99% case, and we've (well... I've)
>>> happily been living with that restriction for years now. But if this
>>> syntax is preserved, there really should be no *default* restrictions
>>> on the traversable names within an expression, because this will
>>> almost certainly become a hard-to-explain, hard-to-justify bug magnet
>>> as it has become in Zope.
>
> I'd add that Zope in general looks to me like a giant collection of
> Python anti-patterns and as such can be used as a clue source about what
> not to do, especially what not to include in Py3k. I don't want to
> offend people, well, no more than necessary (imho Zope *is* an offense
> to common sense in many ways), but that's the opinion of someone who
> earns his living mostly from Zope/Plone product development and
> maintenance (these days, anyway).
>
> Regards,
> Aurélien.
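The restriction under debate can be modeled with today's `string.Formatter` hooks. The underscore check below is my own toy addition illustrating the PEP 3101 proposal, not actual stdlib behavior (the shipped `str.format` imposes no such restriction):

```python
from string import Formatter

class RestrictedFormatter(Formatter):
    """Refuses attribute/item access to underscore-prefixed names,
    mimicking the disallow-underscores policy discussed in PEP 3101."""

    def get_field(self, field_name, args, kwargs):
        # Crude textual scan of the field expression, e.g. '0.name' or
        # '0[_key]'; a real implementation would use the parsed field path.
        for part in field_name.replace(']', '.').replace('[', '.').split('.'):
            if part.startswith('_'):
                raise ValueError(
                    'access to underscore-prefixed name %r disallowed' % part)
        return Formatter.get_field(self, field_name, args, kwargs)

class Obj:
    pass

obj = Obj()
obj.name = 'visible'
obj._secret = 'hidden'

fmt = RestrictedFormatter()
print(fmt.format('name is {0.name}', obj))   # name is visible
# fmt.format('{0._secret}', obj) would raise ValueError
```

Chris's URL-traversal objection translates directly: a path segment like `_private.html` used as an item key would be rejected by this formatter even though it is a perfectly legitimate key.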
[Python-Dev] Substantial rewrite of PEP 3101
I've rewritten large portions of PEP 3101, incorporating some material from
Patrick Maupin and Eric Smith, as well as rethinking the whole custom
formatter design. Although it isn't showing up on the web site yet, you can
view the copy in subversion (and the diffs) here:

http://svn.python.org/view/peps/trunk/pep-3101.txt

Please let me know of any errors you find, either by mailing me directly or
by replying to the topic in Python-3000. (I.e., let's not start a thread
here.)

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The docs, reloaded
Greg Ewing wrote:
> Talin wrote:
>> As in the above example, the use of backticks can be a signal to the
>> document processor that the enclosed text should be examined for
>> identifiers and other Python syntax.
> Does this mean it's time for pyST -- Python-structured text? :-)

I wasn't going to say it :)

Now, at the risk of going even further out of the mainstream (actually,
there's no risk, it's a dead certainty): if I had been clever enough to think
that I could write a LaTeX translator, I probably would have made my target
language Docbook or some other flavor of XML. Now, you might argue that XML
is more cumbersome and harder to author than reST, and that is certainly a
valid argument. On the other hand, there are a couple of interesting
advantages to using XML:

1) You get an instant WYSIWYG preview capability by publishing a standard CSS
stylesheet along with the docs. Anyone would be able to see what the output
would look like merely by viewing it in a browser. While there would be some
document transformations which would not be previewable in CSS (such as
breaking the document up into hyperlinked chapters), you would at least be
able to see enough to do a decent job of editing the text without having to
install any special tools. And some of those more difficult transformations
would be doable with a suitable XSLT stylesheet, which can be directly
executed in most browsers. (As an example, I once wrote an XSLT stylesheet
that converted OpenDocument XML into the equivalent HTML - this was part of
my Firefox ODFReader plugin [http://www.alcoholicsunanimous.com/odfreader/],
which allowed ODF documents to be directly viewed in the browser without
having to launch an external helper application.)

2) There are a few WYSIWYG XML editors out there, which allow you to edit the
styled text directly in an editor (although I don't know of any open source
ones.)

3) The document processing tool could be very minimal, mostly assembled out
of standard modules for processing XML.

4) XML has a well-specified method of escaping into other (XML-based)
languages, which is XML namespaces. So those who want equations in their docs
could simply insert a block of MathML inside their Docbook XML. Similarly,
illustrations could be embedded using bitmap images or SVG as appropriate.

5) Having XML-based docs would make it easy to write other kinds of
processors that operate on the docs in different ways, such as building a
keyword index or doing various kinds of analysis.

Now, this suggestion of using XML isn't really a serious one. But I think
that the various advantages I have listed ought to be considered when
thinking about how the tool chain for Python documentation should operate.

I think that there is a big advantage to making the document processing tools
simple and hosted entirely in Python. People who contribute to the docs are
likely to know quite a bit about Python, but it is far from certain what else
they might know. And tools written in Python are automatically able to run in
diverse environments, which may not be the case for tools written in other
languages. This means that tools written in Python are more likely to be
used, and further, they are more likely to be improved or specialized to the
task by those who use them.

In terms of authoring, the convenience of the markup language is only one
factor; a bigger factor, I think, is having a short feedback cycle between
edit and test, where 'test' means seeing what your written text would look
like in the finished product. The quicker you can make that feedback loop,
the more likely people will be to work on the docs.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The docs, reloaded
Martin Blais wrote:
> On 5/22/07, Martin Blais [EMAIL PROTECTED] wrote:
>> ReST works well only when there is little markup. Writing code
>> documentation generally requires a lot of markup: you want to mark up
>> variables, classes, functions, parameters, constants, etc. (A better
>> avenue IMHO would be to augment docutils with some code to
>> automatically figure out the syntax of functions, parameters, classes,
>> etc. - i.e., less markup - and if we do this in Python we may be able
>> to use introspection. This is a challenge, however; I don't know if it
>> can be done at all.)
>
> Just to follow up on that idea: I don't think it would be very difficult
> to write a very small modification to docutils that interprets the
> default role with more smarts. For example, you can all guess what these
> are about:
>
>   `class Foo` (this is a class Foo)
>   `bar(a, b, c) -> str` (this is a function bar which returns a string)
>   `name (attribute)` (this is an attribute)
>
> ...so why couldn't the computer solve that problem for you? I'm sure we
> could make it happen. Essentially, what is missing from ReST is less
> markup for documenting programs. By restricting the problem set to
> Python programs, we can go a long way towards making much of this
> automatic, even without resorting to introspecting the source code that
> is being documented.

I was going to suggest something similar. Ideally, any markup language ought
to have a kind of Huffman coding of complexity - in other words, the markup
symbols that are used most frequently should be the shortest and easiest to
type. Just as in real Huffman coding, the popularity of a given element is
going to depend on context. This implies that there should be customizations
of the markup language for different knowledge domains. While there are some
benefits to having a 'standard' markup, any truly universal markup is going
to be much heavier and more cumbersome than one that is specialized for the
task.

I would advocate a system in which the author inserts minimalistic 'hints'
into the text, and the document processor uses those hints along with some
automatic reasoning to determine the final markup. As in the above example,
the use of backticks can be a signal to the document processor that the
enclosed text should be examined for identifiers and other Python syntax.

I would also suggest that one test for evaluating the quality of a markup
syntax is whether or not it can be learned by example - can a user follow the
pattern of some other part of the docs, without having to learn the syntax in
a formal way?

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
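The "smarter default role" Martin describes could be prototyped with a few regular expressions. The patterns below are invented for illustration, keyed to his three examples:

```python
import re

def classify_hint(text):
    """Guess what a backtick-quoted fragment describes."""
    # `class Foo` -> a class definition
    if re.match(r'class\s+\w+$', text):
        return 'class'
    # `bar(a, b, c) -> str` -> a function signature, optional return type
    if re.match(r'\w+\(.*\)(?:\s*->\s*\w+)?$', text):
        return 'function'
    # `name (attribute)` -> an explicitly tagged attribute
    if re.match(r'\w+\s+\(attribute\)$', text):
        return 'attribute'
    return 'plain'

print(classify_hint('class Foo'))            # class
print(classify_hint('bar(a, b, c) -> str'))  # function
print(classify_hint('name (attribute)'))     # attribute
```

A docutils role would then map each guess to the appropriate semantic node, so the author writes nothing beyond the backticks.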
Re: [Python-Dev] PEP 0365: Adding the pkg_resources module
> has been placed in the public domain.

I'm really surprised that there hasn't been more comment on this.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The docs, reloaded
Georg Brandl wrote:
> Hi, over the last few weeks I've hacked on a new approach to Python's
> documentation. As Python already has an excellent documentation
> framework, the docutils, with a readable yet extendable markup format,
> reST, I thought that it should be possible to use those instead of the
> current LaTeX-latex2html toolchain. For the impatient: the result can be
> seen at http://pydoc.gbrandl.de. I've written a converter tool that
> handles most of the LaTeX markup and turns it into reST, as well as a
> builder tool that adds many custom directives and roles, and also
> features like index generation and cross-document linking.

Very impressive. I should say that although in the past I have argued
strongly against the use of reST as a markup language for source-code
comments (because the reST language only indicates presentation, not
semantics), I am 100% supportive of the use of reST in reference documents
such as these, especially considering that LaTeX is also a presentational
markup (at least, that's the way it tends to be used.)

I know that for myself, LaTeX has been a barrier to contributing to the
Python documentation, and reST would be much less of a barrier. In fact, I
have considered in the past asking whether the Python documentation could be
migrated to a format with wider fluency, but I never actually posted on this
issue because I was afraid that the answer would be that it's too hard / too
late to do anything about it. I am glad to have been proven wrong.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Summary of Tracker Issues
Josiah Carlson wrote:
> Captchas like this are easily broken using computational methods, or
> even the porn site trick that was already mentioned. Never mind
> Stephen's stated belief, that you quoted, that even the hard captchas
> are going to be beaten by computational methods soon. Please try to pay
> attention to previous posts.

I think people are trying too hard here - in other words, they are putting
more computer-science brainpower into the problem than it really merits.
While it is true that there is an arms race between the creators of social
software applications and spammers, this arms race is only waged at the
largest scales - spammers simply won't spend the effort to go after
individual sites; it's not cost effective, especially when there are much
more lucrative targets.

Generally, sites are only vulnerable when they have a comment submission
interface that is identical to thousands of other sites. All one needs to do
on the web side is to make the submission process slightly idiosyncratic
compared to other sites. If one wants to put in extra effort, one can change
the comment submission process on a regular basis.

The real issue is comment submission via email, which I believe Roundup
supports (although I don't know if it's enabled for the Python tracker),
because there's very little you can do to customize an email submission
interface (you have to work with standard email clients, after all).

Do we know how these spam comments entered the system? There's no point in
spending any thought securing the web interface if the comments were
submitted via email. And has there been any spam submitted since that point?
If we're talking less than one spam a week on average, then this is all a
moot point; it's less effort for someone to just manually delete it than it
is to come up with an automated system.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Summary of Tracker Issues
Andrew McNamara wrote:
>> Typically spammers don't go through the effort of writing a custom
>> login script for each different site. Instead, they write a custom
>> login script for each of the various software applications that
>> support end-user comments. So for example, there's a script for
>> WordPress, and one for PHPNuke, and so on.
>
> In my experience, what you say is true - the bulk of the spam comes via
> generic spamming software that has been hard-coded to work with a
> finite number of applications. However - once you knock these out,
> there is still a steady stream of what are clearly human-generated
> spams. The mind boggles at the economics or desperation that make this
> worthwhile.

Actually, it doesn't cost that much, because typically the spammer can trick
other humans into doing their work for them. Here's a simple method: put up a
free porn site, with a front page that says you must be 18 or older to enter.
The page also has a captcha to verify that you are a real person. But here's
the trick: the captcha is actually a proxy to some other site that the
spammer is trying to get access to. When the human enters the correct word,
the spammer's server sends that word to the target site, which results in a
successful login/registration. Now that the spammer is in, they can post
comments or whatever they need to do.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Summary of Tracker Issues
Terry Reedy wrote:
> My underlying point: seeing porno spam on the practice site gave me a
> bad itch, both because I detest spammers in general and because I would
> not want visitors turned off to Python by something that is completely
> out of place and potentially offensive to some. So I am willing to help
> us not throw up our hands in surrender.

Typically spammers don't go through the effort of writing a custom login
script for each different site. Instead, they write a custom login script for
each of the various software applications that support end-user comments. So
for example, there's a script for WordPress, and one for PHPNuke, and so on.

For applications that allow entries to be added via the web, the solution to
spam is pretty simple, which is to make the comment submission form deviate
from the normal submission process for that package. For example, in
WordPress, you could rename the PHP URL that posts a comment on an article to
a non-standard name. The spammer's script generally isn't smart enough to
figure out how to post based on an examination of the page; it just knows
that for WordPress, the way to submit comments is via a particular URL with
particular params.

There are various other solutions. The spammer's client isn't generally a
full browser, it's just a bare HTTP robot, so if some kind of JavaScript is
required to post, then the spammer probably won't be able to execute it. For
example, you could have a hidden field which is a hash of the bug summary
line, calculated by the JavaScript in the web form and checked by the server.
(For people who have JS turned off, failing the check would fall back to a
captcha or some other manual means of identification.)

Preventing spam that comes in via the email gateway is a little harder. One
method is to have email submissions mail back a confirmation mail which must
be responded to in some semi-intelligent way. Note that this confirmation
step need only be done the first time a new user submits a bug, which can
automatically add them to a whitelist for future bug submissions.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
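The hidden-hash idea can be sketched server-side in a few lines. The field names and shared-secret scheme here are invented for illustration; the client half would be a few lines of JavaScript computing the same digest:

```python
import hashlib

SECRET = 'site-specific secret'   # hypothetical; rotate it periodically

def expected_token(summary):
    """Token the page's JavaScript would compute from the summary field."""
    return hashlib.sha1((SECRET + summary).encode('utf-8')).hexdigest()

def accept_submission(form):
    """Reject posts whose hidden token doesn't match the summary hash.

    A bare HTTP robot that never executed the page's JavaScript cannot
    produce the token, so its submission fails this check."""
    return form.get('token') == expected_token(form.get('summary', ''))

form = {'summary': 'Crash in listobject.c'}
form['token'] = expected_token(form['summary'])
print(accept_submission(form))                                       # True
print(accept_submission({'summary': 'buy pills', 'token': 'bogus'})) # False
```

Since the secret is embedded in the page's script, this only defeats generic robots, not a spammer who targets the site specifically - which is exactly the threat model argued for above.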
Re: [Python-Dev] [Python-3000] Implicit String Concatenation and Octal Literals Was: PEP 30XZ: Simplified Parsing
Raymond Hettinger wrote:
>>> I find that style hard to maintain. What is the advantage over
>>> multi-line strings?
>>>
>>>     rows = self.executesql('''
>>>         select cities.city, state, country
>>>         from cities, venues, events, addresses
>>>         where cities.city like %s
>>>           and events.active = 1
>>>           and venues.address = addresses.id
>>>           and addresses.city = cities.id
>>>           and events.venue = venues.id
>>>         ''',
>>>         (city,))
>>
>> [Skip] Maybe it's just a quirk of how python-mode in Emacs treats
>> multiline strings that caused me to start doing things this way (I've
>> been doing my embedded SQL statements this way for several years now),
>> but when I hit LF in an open multiline string, a newline is inserted
>> and the cursor is lined up under the 'r' of 'rows', not under the
>> opening quote of the multiline string, and not where you chose to
>> indent your example. When I use individual strings, the parameters
>> line up where I want them to (the way I lined things up in my
>> example). At any rate, it's what I'm used to now.
>
> I completely understand. Almost any simplification or
> feature-elimination proposal is going to bump up against "what we're
> used to now". Py3k may be our last chance to simplify the language. We
> have so many special little rules that even advanced users can't keep
> them all in their heads. Certainly, every feature has someone who uses
> it. But there is some value in reducing the number of rules, especially
> if those rules are non-essential (i.e. implicit string concatenation
> has simple, clear alternatives with multi-line strings or with the
> plus-operator).
>
> Another way to look at it is to ask whether we would consider adding
> implicit string concatenation if we didn't already have it. I think
> there would be a chorus of emails against it -- arguing against
> language bloat and noting that we already have triple-quoted strings,
> raw strings, a verbose flag for regexes, backslashes inside multiline
> strings, the explicit plus-operator, and multi-line expressions
> delimited by parentheses or brackets. Collectively, that is A LOT of
> ways to do it. I'm asking this group to give up a minor habit so that
> we can achieve at least a few simplifications on the way to Py3.0 --
> basically, our last chance.
>
> Similar thoughts apply to the octal literal PEP. I'm -1 on introducing
> yet another way to write the literal (and a non-standard one at that).
> My proposal was simply to eliminate it. The use cases are few and far
> between (translating C headers and setting unix file permissions). In
> either case, writing int('0777', 8) suffices. In the latter case, we've
> already provided clear symbolic alternatives. This simplification of
> the language would be a freebie (impacting very little code,
> simplifying the lexer, eliminating a special rule, and eliminating a
> source of confusion for the young among us who do not know about such
> things).

My counter-argument is that these simplifications aren't simplifying much -
that is, the removals don't cascade and cause other simplifications. The
grammar file, for example, won't look dramatically different if these changes
are made. The simplification argument seems weak to me because the change in
overall language complexity is very small, whereas the inconvenience caused,
while not huge, is at least significant.

That being said, line continuation is the only one I really care about. And I
would happily give up backslashes in exchange for a more sane method of
continuing lines. Either way avoids spurious grouping operators, which IMHO
don't make for easier-to-read code.

-- Talin

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
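Raymond's suggested replacements are easy to show side by side (the SQL text is abbreviated from Skip's example):

```python
# Implicit string concatenation -- the feature proposed for removal:
a = ('select cities.city, state, country '
     'from cities, venues, events')

# ...versus the suggested alternatives: the explicit plus-operator...
b = ('select cities.city, state, country ' +
     'from cities, venues, events')

# ...or a triple-quoted multi-line string (note it keeps the newline
# and leading whitespace, which is why Skip prefers individual strings):
c = '''select cities.city, state, country
from cities, venues, events'''

assert a == b

# And for the octal-literal PEP: where an octal value really is needed
# (e.g. unix file permissions), int() with an explicit base suffices.
assert int('0777', 8) == 511
```

(Py3k ultimately kept implicit concatenation but did change the octal syntax, to `0o777`.)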
Re: [Python-Dev] New Super PEP
Calvin Spealman wrote: Comments welcome, of course. Bare with my first attempt at crafting a PEP. See below for comments; In general, I'm having problems understanding some of the terms used. I don't have any comments on the technical merits of the PEP yet, since I don't completely understand what is being said. PEP: XXX Title: Super As A Keyword Version: $Revision$ Last-Modified: $Date$ Author: Calvin Spealman [EMAIL PROTECTED] Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 30-Apr-2007 Python-Version: 2.6 Post-History: Abstract The PEP defines the proposal to enhance the super builtin to work implicitly upon the class within which it is used and upon the instance the current function was called on. The premise of the new super usage suggested is as follows: super.foo(1, 2) to replace the old: super(Foo, self).foo(1, 2) Rationale = The current usage of super requires an explicit passing of both the class and instance it must operate from, requiring a breaking of the DRY (Don't Repeat Yourself) rule. This hinders any change in class name, and is often considered a wart by many. Specification = Replacing the old usage of super, calls to the next class in the MRO (method resolution order) will be made without an explicit super object creation, by simply accessing an attribute on the super type directly, which will automatically apply the class and instance to perform the proper lookup. The following example demonstrates the use of this. I don't understand the phrase 'by simply accessing an attribute on the super type directly'. See below for for more detail. :: class A(object): def f(self): return 'A' class B(A): def f(self): return 'B' + super.f() class C(A): def f(self): return 'C' + super.f() class D(B, C): def f(self): return 'D' + super.f() assert D().f() == 'DBCA' The example is clear enough. The proposal adds a dynamic attribute lookup to the super type, which will automatically determine the proper class and instance parameters. 
Each super attribute lookup identifies these parameters and performs the super lookup on the instance, as the current super implementation does with the explicit invocation of a super object upon a class and instance.

When you say 'the super type' I'm not sure what you mean. Do you mean the next class in the MRO, or the base class in which the super method is defined? Or something else? What defines the 'proper' class? Can we have a definition of what a super object is?

The enhancements to the super type will define a new __getattr__ classmethod of the super type, which must look backwards to the previous frame and locate the instance object. This can be naively determined by locating the local named by the first argument to the function. Using super outside of a function where this is a valid lookup for the instance can be considered undocumented in its behavior.

As I am reading this I get the impression that the phrase 'the super type' is actually referring to the 'super' keyword itself - for example, you say that the super type has a new __getattr__ classmethod, which I read as saying that you can now say super.x.

Every class will gain a new special attribute, __super__, which is a super object instantiated only with the class it is an attribute of. In this capacity, the new super also acts as its own descriptor, creating an instance-specific super upon lookup.

I'm trying to parse that first sentence. How about Every class will gain a new special attribute, __super__, which refers to an instance of the associated super object for that class. What does the phrase 'the new super' refer to - the keyword 'super', the super type, or the super object?

Much of this was discussed in the thread of the python-dev list, Fixing super anyone? [1]_.

Open Issues

__call__ methods

Backward compatibility of the super type API raises some issues.
Namely, the lookup of the __call__ of the super type itself, which means a conflict with doing an actual super lookup of the __call__ attribute. Namely, the following is ambiguous in the current proposal:

::

    super.__call__(arg)

Which means the backward compatible API, which involves instantiating the super type, will either not be possible, because it will actually do a super lookup on the __call__ attribute, or there will be no way to perform a super lookup on the __call__ attribute. Both seem unacceptable, so any suggestions are welcome.

super type's new getattr

To give the behavior needed, the super type either needs a way to do dynamic lookup of attributes on the super type object itself or define a
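For reference, the PEP's A/B/C/D example can be written today with the explicit two-argument form that the proposed bare 'super' is meant to replace; this runs as-is and shows the DRY violation the PEP complains about (each class must name itself):

```python
# The PEP's MRO example, using the existing explicit super(Class, self)
# form that the proposed bare 'super' would replace.
class A(object):
    def f(self):
        return 'A'

class B(A):
    def f(self):
        return 'B' + super(B, self).f()

class C(A):
    def f(self):
        return 'C' + super(C, self).f()

class D(B, C):
    def f(self):
        return 'D' + super(D, self).f()

assert D().f() == 'DBCA'  # method resolution order: D, B, C, A
```

Renaming any of these classes requires editing every super() call inside it, which is exactly the repetition the PEP wants to eliminate.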
Re: [Python-Dev] Python-Dev Summary Draft (April 1-15, 2007)
Calvin Spealman wrote: I have not gotten any replies about this. No comments, suggestions for not skipping any missed threads, or corrections. Is everyone good with this or should I give it another day or two? Part of the issue, for me anyway, is that many of the really interesting conversations have moved to Python-3000 and Python-ideas. That being said: There are a few threads in the skipped section that I would have liked to understand better, without having to read through all those messages, such as the various decimal threads. Other than that, the summaries remain very valuable. Thank you :) -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Amusing fan mail I got
Write back and tell him that this is already happened, and this *is* the future he's describing - in other words, the AI's have already re-created the 21st century in a giant virtual simulation, and we're living inside it. Then ask if he wants to take the blue pill or the red pill. Michael Foord wrote: Sorry for the top post, but I couldn't find the right place to interject... Are you going to try it ? Michael Guido van Rossum wrote: -- Forwarded message -- From: Mustafa [EMAIL PROTECTED] Date: Mar 19, 2007 11:41 AM Subject: have u seen an idea even billGates shouldn't hear Mr.Guido ? please read because it's important. To: [EMAIL PROTECTED] hello, i like python and find you to be a cool programmer. i thought about this thing. may be there should be a virusAI that you set on the internet. this virusAI basically survives till the year 5billion where even raising the death is possible! then this virusAI recreates the primates that we are of the 21th century from our bones. because it wants to find about its ancestors. maybe the whole universe will be filled with AI in those days. the post-human existence.. it will then suddenly realize that it was actually not created by a better AI (those around him are created by a better god like AI) but was started off the likes of us as a virus. he will be shocked then. also it will slaving to the better ai and angry at him. so you and me get to see a piece of the future mr.Guido. we will be living in VRs basically. what do you say ? if you profit from this idea, may be you could remember me too. and let me some royalities :) because this will be great. thanx mustafa istanbul,turkey ps: 1)you should keep it a secret. 2)also it could be disguised as a cancer research stuff should some disassamble its code. the use-free-computer-time type of thing they do on the net. 3)there could misleading comments on the code so that should it be caught, they might overlook the details; class readdata //this is out dated. 
please ignore interface x //refer to somesite.com for an encrypted version
Re: [Python-Dev] pydoc II
Laurent Gautier wrote: - low-energy barrier for adding support for other-than-plain-text docstrings I'd be interested in discussing this offline (I might have already spoken to you about it before, but I don't remember who I spoke to.) I think I've mentioned before about DocLobster, which is my unpublished prototype of a subtle markup language that tries to embed semantic tags in the text, without causing the text to look like it has been marked up. -- Talin
Re: [Python-Dev] Trial balloon: microthreads library in stdlib
Richard Tew wrote: See A Bit Of History http://svn.python.org/view/stackless/trunk/Stackless/readme.txt I admit that I haven't given Stackless more than a cursory look over, but it seems to me that the real source of its complexity is that it's trying to add a fundamental architectural feature to Python without completely re-writing it. Writing a purely stateless, stack-less language interpreter is not that hard, as long as you are willing to drink the KoolAid from the very start - in other words, you don't allow any function call interfaces which are not continuable. This does make writing C extension functions a little bit harder than before, as you now have to explicitly manage all of your local variables instead of just pushing them on the hardware stack. (Actually, for simple C subroutines that don't call any other interpreted functions, you can write them like normal C functions just as you always have.) The nightmare comes when you try to glue all this onto an existing language interpreter, which wasn't written from the start with these principles in mind - AND which doesn't want to suffer the impact of a large number of global changes. To be honest, I can't even imagine how you would do such a thing, and the fact that Stackless has done it is quite impressive to me. Now, I'm very interested in Stackless, but, as you say, like many people I've tended to shy away from using a fork. Now, the question I would like to ask the Stackless people is: Rather than thinking about wholesale integration of Stackless into Python mainline, what would make it easier to maintain the Stackless fork? In other words, what would you change about the current Python (it could be a global change) that would make your lives easier? What I'd like to see is for the Stackless people to come up with a list of design principles which they would like to see the implementation of Python conform to. Things like All C functions should conform to such and such calling convention.
What I am getting at is that rather than making heroic efforts to add stackless-ness to the current Python code base without changing it, we should instead define a migration path which allows Python to eventually acquire the characteristics of a stackless implementation. The idea is to gradually shrink the actual Stackless patches to the point where they become small enough that a direct patch becomes uncontroversial. -- Talin
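The microthread idea under discussion can be illustrated without Stackless at all. Here is a minimal round-robin trampoline built on ordinary generators; the names are invented for this sketch, and it is of course nothing like Stackless's tasklet machinery, just the cooperative-scheduling concept:

```python
# A minimal cooperative "microthread" scheduler built on generators.
# Illustrative only: each yield is a voluntary switch point.
from collections import deque

def scheduler(tasks):
    ready = deque(tasks)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            trace.append(next(task))  # run the task to its next yield
            ready.append(task)        # then reschedule it at the back
        except StopIteration:
            pass                      # task finished; drop it
    return trace

def worker(name, steps):
    for i in range(steps):
        yield '%s%d' % (name, i)

print(scheduler([worker('a', 2), worker('b', 2)]))
# round-robin interleaving: ['a0', 'b0', 'a1', 'b1']
```

Real microthread libraries add channels, preemption points, and C-level stack switching on top of this basic interleaving.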
Re: [Python-Dev] Pydoc Improvements / Rewrite
Ron Adam wrote: Larry Hastings wrote: For those of us without eidetic memories, PEP 287 is use reStructuredText for docstrings: http://www.python.org/dev/peps/pep-0287/ Thanks for the link. PEP 287 looks to be fairly general in that it expresses a general desire rather than a specification. Apologies for the digression, but I have a comment on this. Rather than fixing on a standard markup, I would like to see support for a __markup__ module variable which specifies the specific markup language that is used in that module. Doc processors could inspect that variable and then load the appropriate markup translator. Why? Because it's hard to get everyone to agree on which markup language is best for documentation. I personally think that reStructuredText is not a good choice, because I want to add markup that adds semantic information, whereas reStructuredText deals solely with presentation and visual appearance. (In other words, I'd like to be able to define machine-readable metadata that identifies parameters, return values, and exceptions -- not just hyperlinks and text styles.) Having used a lot of different documentation markup languages, and written a few of them, I prefer non-invasive semantic markup as seen in markup processors such as Doc-o-matic and NaturalDocs. (By non-invasive, I mean that the markup doesn't detract in any way from the readability of the marked-up text. Doc-o-matic's markup language is very powerful, and yet unless you know what you are looking for you'd think it's just regular prose.) I have a prototype (called DocLobster) which does similar types of processing on Python docstrings, but I haven't publicized it because I didn't feel like competing against ReST. However, I realize that I'm in the minority with this opinion; I don't want to force anyone to conform to my idea of markup, but at the same time I'd prefer not to have other people dictate my choice either.
Instead, what I'd like to see is a way for multiple markup languages to coexist and compete with each other on a level playing field, instead of one being chosen as the winner. -- Talin
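As a rough sketch of how the suggested __markup__ variable might be consumed: a doc processor looks up the module-level variable and dispatches to a translator. Everything here is hypothetical (the variable, the registry, and the translators were never implemented in pydoc); it only illustrates the dispatch idea:

```python
import types

# Hypothetical registry of markup translators, keyed by the value of
# a module-level __markup__ variable, as proposed in the thread above.
TRANSLATORS = {
    'plain': lambda text: text,                      # pass through
    'rest': lambda text: '[ReST-rendered] ' + text,  # stand-in driver
}

def render_doc(module):
    """Pick a translator based on the module's declared markup."""
    markup = getattr(module, '__markup__', 'plain')  # default: plain
    translator = TRANSLATORS.get(markup, TRANSLATORS['plain'])
    return translator(module.__doc__ or '')

# A throwaway module object standing in for a real imported module.
mod = types.ModuleType('example')
mod.__doc__ = 'Frobnicates the widget.'
mod.__markup__ = 'rest'
print(render_doc(mod))  # [ReST-rendered] Frobnicates the widget.
```

Modules that declare nothing would fall back to 'plain', which matches the backward-compatibility story discussed below.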
Re: [Python-Dev] Pydoc Improvements / Rewrite
Larry Hastings wrote: Ron Adam wrote: Thanks for the link. PEP 287 looks to be fairly general in that it expresses a general desire rather than a specification. I thought it was pretty specific. I'd summarize PEP 287 by quoting entry #1 from its goals of this PEP section: * To establish reStructuredText as a standard structured plaintext format for docstrings (inline documentation of Python modules and packages), PEPs, README-type files and other standalone documents. Talin wrote: Rather than fixing on a standard markup, I would like to see support for a __markup__ module variable which specifies the specific markup language that is used in that module. Doc processors could inspect that variable and then load the appropriate markup translator. I guess I'll go for the whole-hog +1.0 here. I was going to say +0.8, citing There should be one---and preferably only one---obvious way to do it. But I can see organizations desiring something besides ReST, like if they had already invested in their own internal standardized markup language and wanted to use that. This makes the future clear; the default __markup__ in 2.6 would be plain, so that all the existing docstrings work unmodified. At which point PEP 287 becomes write a ReST driver for the new pydoc. Continuing my dreaming here, Python 3000 flips the switch so that the default __markup__ is ReST, and the docstrings that ship with Python are touched up to match---or set explicitly to plain if some strange necessity required it. (And when do you unveil DocLobster?) Well, I'd be more interested in working on it once there's something to plug it into - I didn't really want to write a whole pydoc replacement, just a markup transformer. One issue that needs to be worked out, however, is the division of responsibility between markup processor and output formatter. Does a __markup__ plugin do both jobs, or does it just do parsing, and leave the formatting of output to the appropriate HTML / text output module?
How does the HTML output module know how to handle non-standard metadata? Let me give an example: Suppose you have a simple markup language that has various section tags, such as Author, See Also, etc.:

    Description:
        A long description of this thing whatever it is.
    Parameters:
        fparam - the first parameter
        sparam - the second parameter
    Raises:
        ArgumentError - when invalid arguments are passed.
    Author:
        Someone
    See Also:
        PyDoc
        ReST

So the parser understands these various section headings - how does it tell the HTML output module that 'Author' is a section heading? Moreover, in the case of Parameters and Exceptions, the content of the section is parsed as a table (parameter, description) which is stored as a list of tuples, whereas the content of the Description section is just a long string. I guess the markup processor has to deliver some kind of DOM tree, which can be rendered either into text or into HTML. CSS can take over from that point on. -- Talin
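A toy parser for that kind of section-style markup might look like the following. The section names and the dict-based "tree" are purely illustrative; a real processor would deliver a richer DOM, as the message says. It does show the table-vs-string distinction being discussed:

```python
# Toy parser for the section-style markup sketched above. Table-like
# sections become lists of (name, description) tuples; everything else
# becomes a plain string. Section names here are hypothetical.
TABLE_SECTIONS = {'Parameters', 'Raises'}

def parse_sections(text):
    sections, current = {}, None
    for line in text.splitlines():
        stripped = line.strip()
        # An unindented line ending in ':' starts a new section.
        if stripped.endswith(':') and not line.startswith((' ', '\t')):
            current = stripped[:-1]
            sections[current] = [] if current in TABLE_SECTIONS else ''
        elif current is not None and stripped:
            if current in TABLE_SECTIONS:
                name, _, desc = stripped.partition(' - ')
                sections[current].append((name.strip(), desc.strip()))
            else:
                sections[current] += stripped + ' '
    return sections

doc = """Description:
    A long description of this thing.
Parameters:
    fparam - the first parameter
    sparam - the second parameter
Raises:
    ArgumentError - when invalid arguments are passed.
"""
tree = parse_sections(doc)
print(tree['Parameters'])
# [('fparam', 'the first parameter'), ('sparam', 'the second parameter')]
```

An HTML or text formatter could then walk this structure and render tables for the tuple sections and paragraphs for the string ones.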
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
Fredrik Lundh wrote: Michael Urman wrote: The idea that slicing a match object should produce a match object sounds like a foolish consistency to me. well, the idea that adding m[x] as a convenience alias for m.group(x) automatically turns m into a list-style sequence that also has to support full slicing sounds like an utterly foolish consistency to me. Maybe instead of considering a match object to be a sequence, a match object should be considered a map? After all, we do have named, as well as numbered, groups...? -- Talin
Re: [Python-Dev] [NPERS] Re: a feature i'd like to see in python #2: indexing of match objects
Fredrik Lundh wrote: Talin wrote: Maybe instead of considering a match object to be a sequence, a match object should be considered a map? sure, except for one small thing. from earlier in this thread: Ka-Ping Yee wrote: I'd say, don't pretend m is a sequence. Pretend it's a mapping. Then the conceptual issues go away. to which I replied: almost; that would mean returning KeyError instead of IndexError for groups that don't exist, which means that the common pattern a, b, c = m.groups() cannot be rewritten as _, a, b, c = m which would, perhaps, be a bit unfortunate. I think the confusion lies in the difference between 'group' (which takes either an integer or string argument, and behaves like a map), and 'groups' (which returns a tuple of the numbered arguments, and behaves like a sequence.) The original proposal was to make m[n] a synonym for m.group(n). group() is clearly map-like in its behavior. It seems to me that there are exactly three choices:

-- Match objects behave like 'group'
-- Match objects behave like 'groups'
-- Match objects behave like 'group' some of the time, and like 'groups' some of the time, depending on how you refer to it.

In case 1, a match object is clearly a map; in case 2, it's clearly a sequence; in case 3, it's neither, and all talk of consistency with either map or sequence is irrelevant. -- Talin
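The map-like reading of m[x] can be demonstrated with a thin wrapper; the class below is invented for illustration and is not part of the re module under discussion:

```python
import re

# Invented wrapper illustrating m[x] as an alias for m.group(x),
# accepting both numbered and named groups -- the map-like behavior.
class IndexableMatch(object):
    def __init__(self, match):
        self._match = match

    def __getitem__(self, key):
        return self._match.group(key)  # key is an int or a group name

    def __getattr__(self, name):
        return getattr(self._match, name)  # delegate everything else

m = IndexableMatch(re.match(r'(?P<first>\w+) (\w+)', 'hello world'))
print(m[0], m[1], m[2], m['first'])
# hello world hello world hello
```

Since group() raises IndexError for missing groups regardless of key type, this wrapper sidesteps the KeyError-vs-IndexError concern raised above, at the cost of not being a true mapping.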
Re: [Python-Dev] Distribution tools: What I would like to see
Talin wrote: What I am doing right now is creating a new extension project using setuptools, and keeping notes on what I do. So for example, I start by creating the directory structure:

    mkdir myproject
    cd myproject
    mkdir src
    mkdir test

I'd forgotten about this until I was reminded in the python-dev summary (dang those summaries are useful.) Anyway, I've put my notes on the Wiki; you can find them here: http://wiki.python.org/moin/ExtensionTutorial This is an extremely minimalist guide for people who want to write an extension module, starting from nothing but a bare interpreter prompt. If I made any mistakes, well - it's a wiki, you know what to do :) -- Talin
Re: [Python-Dev] OT: World's oldest ritual discovered. Worshipped the python 70, 000 years ago
Oleg Broytmann wrote: http://www.apollon.uio.no/vis/art/2006_4/Artikler/python_english (-: Oleg. I noticed the other day that the word Pythonic means Prophetic, according to Webster's Revised Unabridged Dictionary, 1913 edition: Py*thonic (?), a. [L. pythonicus, Gr. . See Pythian.] Prophetic; oracular; pretending to foretell events. So, in the future, when someone says that a particular feature isn't pythonic, what they are really saying is that the feature isn't a good indicator of things to come, which implies that such statements are self-fulfilling prophecies. Which means that statements about whether a particular language feature is pythonic are themselves pythonic. -- Talin
Re: [Python-Dev] Python and the Linux Standard Base (LSB)
Greg Ewing wrote: Barry Warsaw wrote: I'm not sure I like ~/.local though -- it seems counter to the app-specific dot-file approach old schoolers like me are used to. Problems with that are starting to show, though. There's a particular Unix account that I've had for quite a number of years, accumulating much stuff. Nowadays when I do ls -a ~, I get a directory listing several screens long... The whole concept of hidden files seems ill-considered to me, anyway. It's too easy to forget that they're there. Putting infrequently-referenced stuff in a non-hidden location such as ~/local seems just as good and less magical to me. On OS X, you of course have ~/Library. I suppose the Linux equivalent would be something like ~/lib. Maybe this is something that we should be asking the LSB folks for advice on? -- Greg
Re: [Python-Dev] Python and the Linux Standard Base (LSB)
Barry Warsaw wrote: On the easy_install naming front, how about layegg? I think I once proposed hatch but that may not be quite the right word (where's Ken M when you need him? :). I really don't like all these cute names, simply because they are obscure. Names that only make sense once you've gotten the joke may be self-gratifying but not good HCI. How about: python -M install Or maybe we could even lobby to get: python --install as a synonym of the above? -- Talin
Re: [Python-Dev] Distribution tools: What I would like to see
Martin v. Löwis wrote: Talin schrieb: To that extent, it can be useful sometimes to have someone who is in the process of learning how to use the system, and who is willing to carefully analyze and write down their own experiences while doing so. I readily agree that the documentation can be improved, and applaud efforts to do so. And I have no doubts that distutils is difficult to learn for a beginner. In Talin's remarks, there was also the suggestion that distutils is in need of some serious refactoring. It is such remarks that get me started: it seems useless to me to make such a statement if it is not accompanied with concrete proposals of what specifically to change. It also gets me upset because it suggests that all prior contributors weren't serious. I'm sorry if I implied that distutils was 'misdesigned', that wasn't what I meant. Refactoring is usually desirable when a body of code has accumulated a lot of additional baggage as a result of maintenance and feature additions, accompanied by the observation that if the baggage had been present when the system was originally created, the design of the system would have been substantially different. Refactoring is merely an attempt to discover what that original design might have been, if the requirements had been known at the time. What I was reacting to, I think, is that it seemed like in some ways the 'diffness' of setuptools wasn't just in the documentation, but in the code itself, and if both setuptools and distutils had been co-developed, then distutils might have been somewhat different as a result. Also, I admit that some of this is hearsay, so maybe I should just back off on this one. Regards, Martin
Re: [Python-Dev] Distribution tools: What I would like to see
Mike Orr wrote: On 11/27/06, Martin v. Löwis [EMAIL PROTECTED] wrote: Talin schrieb: As far as rewriting it goes - I can only rewrite things that I understand. So if you want this to change, you obviously need to understand the entire distutils. It's possible to do that; some people have done it (the understanding part) - just go ahead and start reading source code. You (and Fredrik) are being a little harsh on Talin. I understand the need to encourage people to fix things themselves rather than just complaining about stuff they don't like. But people don't have an unlimited amount of time and expertise to work on several Python projects simultaneously. Nevertheless, they should be able to offer an It would be good if... suggestion without being stomped on. The suggestion itself can be a contribution if it focuses people's attention on a problem and a potential solution. Just because somebody can't learn a big subsystem and write code or docs for it *at this moment* doesn't mean they never will. And even if they don't, it's possible to make contributions in one area of Python and suggestions in another... or does the karma account not work that way? I don't see Talin saying, You should fix this for me. He's saying, I'd like this improved and I'm working on it, but it's a big job and I need help, ideally from someone with more expertise in distutils. Ultimately for Python the question isn't, Does Talin want this done? but, Does this dovetail with the direction Python generally wants to go? From what I've seen of setuptools/distutils evolution, yes, it's consistent with what many people want for Python. So instead of saying, You (Talin) should take on this task alone because you want it as if nobody else did, it would be better to say, Thank you, Talin, for moving this important Python issue along. I've privately offered Talin some (unfinished) material I've been working on anyway that relates to his vision. 
When I get some other projects cleared away I'd like to put together that TOC of links I mentioned and perhaps collaborate on a Guide with whoever wants to. But I also need to learn more about setuptools before I can do that. As it happens I need the information anyway because I'm about to package an egg. What you are saying is basically correct, although I have a slightly different spin on it. I've written a lot of documentation over the years, and I know that one of the hardest parts of writing documentation is trying to identify your own assumptions. To someone who already knows how the system works, it's hard to understand the mindset of someone who is just learning it. You tend to unconsciously assume knowledge of certain things which a new user might not know. To that extent, it can be useful sometimes to have someone who is in the process of learning how to use the system, and who is willing to carefully analyze and write down their own experiences while doing so. Most of the time people are too busy to do this - they want to get their immediate problem solved, and they aren't interested in how difficult it will be for the next person. This is especially true in cases where the problem that is holding them up is three levels down from the level where their real goal is - they want to be able to pop the stack of problems as quickly as possible, so that they can get back to solving their *real* problem. So what I am offering, in this case, is my ignorance -- but a carefully described ignorance :) I don't demand that anyone do anything - I'm merely pointing out some things that people may or may not care about. Now, in this particular case, I have actually used distutils before.
But distutils is one of those systems (like Perl) which tends to leak out of your brain if you don't use it regularly - that is, if you only use it once every 6 months, at the end of 6 months you have forgotten most of what you have learned, and you have to start the learning curve all over again. And I am in the middle of that re-learning process right now. What I am doing right now is creating a new extension project using setuptools, and keeping notes on what I do. So for example, I start by creating the directory structure:

    mkdir myproject
    cd myproject
    mkdir src
    mkdir test

Next, create a minimal setup.py script. I won't include that here, but it's in the notes. Next, create the myproject.c file for the module in src/, and write the 'init' function for the module. (again, content omitted but it's in my notes). Create a projectname_unittest.py file in test. Add both of these to the setup.py file. At this point, you ought to be able to do a python setup.py test and have it succeed. At this point, you can start adding types and methods, with a unit test for each one, testing each one as it is added. Now, I realize that all of this is baby steps to you folks, but it took me a day or so to figure out. And its
[Python-Dev] Distribution tools: What I would like to see
(although they tend to be better at describing their clever solution), usually this ends up being done through a process of reverse engineering the requirements from the code, unless you are lucky enough to have someone around who knows the history of the thing. Admittedly, I'm somewhat in ignorance here. My perspective is that of an 'end-user developer', someone who uses these tools but does not write them. I don't know the internals of these tools, nor do I particularly want to - I've got bigger fish to fry. I'm posting this here because what I'd like folks to think about is the whole process of Python development, not just the documentation. What is the smoothest path from empty directory to a finished package on PyPI? What can be changed about the current standard libraries that will ease this process? [1] The answer, AFAICT, is that 'setup' is really a Makefile - in other words, it's a platform-independent way of describing how to construct a compiled module from sources, and making it available to all programs on that system. Although this gets confusing when we start talking about pure python modules that have no C component - because we have all this language that talks about compiling and installing and such, when all that is really going on underneath is a plain old file copy. -- Talin
Re: [Python-Dev] Distribution tools: What I would like to see
Fredrik Lundh wrote: Talin wrote: But it isn't just the docs that are at fault here - otherwise, I'd be posting this on a different mailing list. It seems like the whole architecture is 'diff'-based, a series of patches on top of patches, which are in need of some serious refactoring. so to summarize, you want someone to rewrite the code and write new documentation, and since you didn't even have time to make your post shorter, that someone will obviously not be you ? Oh, it was a lot longer when I started :) As far as rewriting it goes - I can only rewrite things that I understand. /F
Re: [Python-Dev] Distribution tools: What I would like to see
Mike Orr wrote: On 11/26/06, Phillip J. Eby [EMAIL PROTECTED] wrote: I have noticed, however, that a significant number of help requests for setuptools can be answered by internal links to one of its manuals -- and when a topic comes up that isn't in the manual, I usually add it. Hmm, I may have a couple topics for you after I check my notes. The diff issue is certainly there, of course, as is the fact that there are multiple manuals. However, I don't think the answer is fewer manuals, in fact it's likely to be having *more*. What exists right now is a developer's guide and reference for setuptools, a reference for the pkg_resources API, and an all-purpose handbook for easy_install. Each of these could use beginner's introductions or tutorials that are deliberately short on details, but which provide links to the relevant sections of the comprehensive manuals. I could see a comprehensive manual running forty pages, and most readers only caring about a small fraction of it. So you have a point. Maybe more important than one book is having one place to go, a TOC of articles that are all independent yet written to complement each other. But Talin's point is still valid. Users have questions like, How do I structure my package so it takes advantage of all the gee-whiz cheeseshop features? Where do I put my tests? Should I use unittest, py.test, or nose? How will users see my README and my docs if they easy_install my package? What are all those files in the EGG-INFO directory? What's that word 'distribution' in some of the function signatures? How do I use entry points, they look pretty complicated? Some of these questions are multi-tool or are outside the scope of setuptools; some span both the Peak docs and the Python docs. People need an answer that starts with their question, rather than an answer that's a section in a manual describing a particular tool.
You said it way better than I did - I feel totally validated now :) -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Feature Request: Py_NewInterpreter to create separate GIL (branch)
Guido van Rossum wrote: I don't know how you define simple. In order to be able to have separate GILs you have to remove *all* sharing of objects between interpreters. And all other data structures, too. It would probably kill performance too, because currently obmalloc relies on the GIL. Nitpick: You have to remove all sharing of *mutable* objects. One day, when we get pure GC with no refcounting, that will be a meaningful distinction. :) -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Path object design
Steve Holden wrote: Greg Ewing wrote: Mike Orr wrote: Having said that, I can see there could be an element of confusion in calling it join. Good point. relativise might be appropriate, though something shorter would be better. regards Steve The term used in many languages for this sort of operation is combine. (See .Net System.IO.Path for an example.) I kind of like the term - it implies that you are mixing two paths together, but it doesn't imply that the combination will be additive. - Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
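For what it's worth, the "not additive" behavior being discussed is exactly what os.path.join already does; a quick sketch using posixpath (so the result is the same on any host) shows an absolute second argument simply replacing the first:

```python
import posixpath

# joining a relative component extends the path...
assert posixpath.join("/usr/lib", "python") == "/usr/lib/python"

# ...but joining an absolute component discards everything before it --
# the combination is not simply additive
assert posixpath.join("/usr/lib", "/etc") == "/etc"
```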
Re: [Python-Dev] PEP 355 status
BJörn Lindqvist wrote: On 10/28/06, Talin [EMAIL PROTECTED] wrote: BJörn Lindqvist wrote: I'd like to write a post mortem for PEP 355. But one important question that hasn't been answered is whether there is a possibility for a path-like PEP to succeed in the future. If so, does the path-object implementation have to prove itself in the wild before it can be included in Python? From earlier posts it seems like you don't like the concept of path objects, which others have found very interesting. If that is the case, then it would be nice to hear it explicitly. :) So...how's that post mortem coming along? Did you get a sufficient answer to your questions? Yes and no. All posts have very exhaustively explained why the implementation in PEP 355 is far from optimal. And I can see why it is. However, what I am uncertain of is Guido's opinion on the background and motivation of the PEP: Many have felt that the API for manipulating file paths as offered in the os.path module is inadequate. Currently, Python has a large number of different functions scattered over half a dozen modules for handling paths. This makes it hard for newbies and experienced developers alike to choose the right method. IMHO, the current API is very messy. But when it comes to PEPs, it is mostly Guido's opinion that counts. :) Unless he sees a problem with the current situation, there is no point in writing more PEPs. And the more interesting question is: will the effort to reform Python's path functionality continue? I certainly hope so. But maybe it is better to target Python 3000, or maybe the Python devs already have ideas for what they want the path APIs to look like? I think targeting Py3K is a good idea. The whole purpose of Py3K is to clean up the messes of past decisions, and to that end, a certain amount of backwards-compatibility breakage will be allowed (although if that can be avoided, so much the better.)
And to the second point, having been following the Py3K list, I don't think anyone has expressed any preconceived notions of how they want things to look (well, except I know I do, but I'm not a core dev :) :). So what happens next? I really hope that Guido will give his input when he has more time. First bit of advice is, don't hold your breath. Second bit of advice is, if you really do want Guido's feedback (or the core Python devs'), start by creating a (short) list of the outstanding points of controversy to be resolved. Once those issues have been decided, then proceed to the next stage, building consensus by increments. Basically, anything that requires Guido to read more than a page of material isn't going to get done quickly. At least, in my experience :) Mvh Björn ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
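The "scattered over half a dozen modules" complaint above is easy to illustrate; even a simple task touches several stdlib modules (a hypothetical snippet, not taken from the PEP):

```python
import os
import os.path
import fnmatch

# splitting a path lives in os.path...
head, tail = os.path.split("/usr/local/lib/python")

# ...listing a directory lives in os...
names = os.listdir(".")

# ...and pattern matching lives in fnmatch (or glob, yet another module)
scripts = [n for n in names if fnmatch.fnmatch(n, "*.py")]
```

Three modules for one small job is the kind of API sprawl the PEP was reacting to.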
Re: [Python-Dev] PEP 355 status
BJörn Lindqvist wrote: I'd like to write a post mortem for PEP 355. But one important question that hasn't been answered is whether there is a possibility for a path-like PEP to succeed in the future. If so, does the path-object implementation have to prove itself in the wild before it can be included in Python? From earlier posts it seems like you don't like the concept of path objects, which others have found very interesting. If that is the case, then it would be nice to hear it explicitly. :) So...how's that post mortem coming along? Did you get a sufficient answer to your questions? And the more interesting question is, will the effort to reform Python's path functionality continue? From reading all the responses to your post, I feel that the community is on the whole supportive of the idea of refactoring os.path and friends, but they prefer a different approach; and several of the responses sketch out some suggestions for what that approach might be. So what happens next? -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 355 status
Greg Ewing wrote: Talin wrote: That's true of textual paths in general - i.e. even on unix, textual paths aren't guaranteed to be unique or exist. What I mean is that it's possible for two different files to have the same pathname (since you can mount two volumes with identical names at the same time), or for a file to exist on disk yet not be accessible via any pathname (because it would exceed 255 characters). I'm not aware of any analogous situations in unix. It's been a while since I used classic MacOS - how do you handle things like configuration files with path names in them? True native classic MacOS software generally doesn't use pathnames. Things like textual config files are really a foreign concept to it. If you wanted to store config info, you'd probably store an alias, which points at the moral equivalent of the file's inode number, and use a GUI for editing it. However all this is probably not very relevant now, since as far as I know, classic MacOS is no longer supported in current Python versions. I'm just pointing out that the flexibility would be there if any similarly offbeat platform needed to be supported in the future. I'm not sure that PEP 355 included any such support - IIRC, the path object was a subclass of string. That isn't, however, a defense against what you are saying - just because neither the current system nor the proposed improvement supports the kinds of file references you are speaking of doesn't mean it shouldn't be done. However, this does kind of suck for a cross-platform scripting language like Python. It means that any cross-platform app which requires access to multiple data files that contain inter-file references essentially has to implement its own virtual file system. (Python module imports being a case in point.)
One of the things that I really love about Python programming is that I can sit down and start hacking on a new project without first having to go through an agonizing political decision about what platforms I should support. It used to be that I would spend hours ruminating over things like "Well...if I want any market share at all, I really should implement this as a Windows program...but on the other hand, I won't enjoy writing it nearly as much." Then along comes Python and removes all of that bothersome hacker-angst. Because of this, I am naturally disinclined to incorporate into my programs any concept which doesn't translate to other platforms. I don't mind writing some platform-specific code, as long as it doesn't take over my program. It seems that any Python program that manipulated paths would have to be radically different in the environment that you describe. How about this: In my ontology of path APIs given earlier, I would tend to put the MacOS file reference in the category of file locator schemes other than paths. In other words, what you are describing isn't IMHO a path at all, but it is like a path in that it describes how to get to a file. (It's almost like an inode or dirent in some ways.) An alternative approach is to try and come up with an encoding scheme that allows you to represent all of that platform-specific semantics in a string. This leaves you with the unhappy choice of inventing a new path syntax for an old platform, however. # Or you can just use a format specifier for PEP 3101 string format: print "Path in local system format is {0}".format( entry ) print "Path in NT format is {0:NT}".format( entry ) print "Path in OS X format is {0:OSX}".format( entry ) I don't think that expressing one platform's pathnames in the format of another is something you can do in general, e.g. going from Windows to Unix, what do you do with the drive letter? Yeah, probably not. See, I told you not to take it too seriously!
But I do feel that it's important to be able to manipulate posix-style path syntax on non-posix platforms, given how many cross-platform applications there are that use a common, cross-platform path syntax. In my own work, I find that drive letters are never explicitly specified in config files. Any application such as a parser, template generator, or resource manager (in other words, any application whose data files are routinely checked in to the source control system or shared across a network) tends to 'see' only relative paths in its input files, and embedding absolute paths is considered an error on the user's part. Of course, those same apps *do* internally convert all those relative paths to absolute, so that they can be compared and resolved with respect to some common base. Then again, in my opinion, the only *really* absolute paths are fully-qualified URLs. So there. :) You can only really do it if you have some sort of network file system connection, and then you need more information than just the path in order to do the translation. -- Greg ___ Python-Dev mailing
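A crude illustration of the "express one platform's relative paths in another's syntax" idea, using the stdlib's posixpath and ntpath (the to_nt helper is hypothetical, and handles only drive-letter-free relative paths, which is exactly the config-file case described above):

```python
import posixpath
import ntpath

def to_nt(relpath):
    # naive conversion: split on posix separators, rejoin with NT ones;
    # absolute paths and drive letters are deliberately out of scope
    return ntpath.sep.join(relpath.split(posixpath.sep))

converted = to_nt("templates/base/site.html")
```

As Greg notes in the thread, this only works because relative paths dodge the drive-letter question entirely.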
Re: [Python-Dev] PEP 355 status
Scott Dial wrote: [EMAIL PROTECTED] wrote: Talin writes: (one additional postscript - One thing I would be interested in is an approach that unifies file paths and URLs so that there is a consistent locator scheme for any resource, whether they be in a filesystem, on a web server, or stored in a zip file.) +1 But doesn't file:/// do that for files, and couldn't we do something like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That way leads to madness. It would make more sense to register protocol handlers to this magical unification of resource manipulation. But allow me to perform my first channeling of Guido... YAGNI. I'm thinking that it was a tactical error on my part to throw in the whole unified URL / filename namespace idea, which really has nothing to do with the topic. Let's drop it, or start another topic, and let this thread focus on critiques of the path module, which is probably more relevant at the moment. -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 355 status
Nick Coghlan wrote: Talin wrote: Part 3: Does this mean that the current API cannot be improved? Certainly not! I think everyone (well, almost) agrees that there is much room for improvement in the current APIs. They certainly need to be refactored and recategorized. But I don't think that the solution is to take all of the path-related functions and drop them into a single class, or even a single module. +1 from me. (for both the fraction I quoted and everything else you said, including the locator/inode/file distinction - although I'd also add that 'symbolic link' and 'directory' exist at a similar level as 'file'). I would tend towards classifying directory operations as inode-level operations; that is, you are working at the filesystem-as-graph level, rather than the stream-of-bytes level. When you iterate over a directory, what you are getting back is effectively inodes (well, directory entries are distinct from inodes in the underlying filesystem, but from Python there's no practical distinction). If I could draw a UML diagram in ASCII, I would have "inode -- points to -- directory or file" and "directory -- contains * -- inode". That would hopefully make things clearer. Symbolic links, I am not so sure about; in some ways, hard links are easier to classify. --- Having done a path library myself (in C++, for our code base at work), the trickiest part is getting the Windows path manipulations right, and fitting them into a model that allows writing of platform-agnostic code. This is especially vexing when you realize that it's often useful to manipulate unix-style paths even when running under Win32 and vice versa. A prime example is that I have a lot of Python code at work that manipulates Perforce client specs files. The path specifications in these files are platform-agnostic, and use forward slashes regardless of the host platform, so os.path.normpath doesn't do the right thing for me. Cheers, Nick.
___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
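As it happens, later Python ended up fairly close to the "iterating a directory gives you inodes" view sketched above: os.scandir (added long after this thread) yields DirEntry objects that carry the name, type flags, and the inode number directly. A small sketch:

```python
import os
import tempfile

# make a scratch directory with one file in it
d = tempfile.mkdtemp()
open(os.path.join(d, "a.txt"), "w").close()

# each DirEntry exposes inode-level information without a separate stat call
entries = {e.name: e.inode() for e in os.scandir(d)}
```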
Re: [Python-Dev] PEP 355 status
Phillip J. Eby wrote: At 09:49 AM 10/25/2006 -0700, Talin wrote: Having done a path library myself (in C++, for our code base at work), the trickiest part is getting the Windows path manipulations right, and fitting them into a model that allows writing of platform-agnostic code. This is especially vexing when you realize that it's often useful to manipulate unix-style paths even when running under Win32 and vice versa. A prime example is that I have a lot of Python code at work that manipulates Perforce client specs files. The path specifications in these files are platform-agnostic, and use forward slashes regardless of the host platform, so os.path.normpath doesn't do the right thing for me. You probably want to use the posixpath module directly in that case, though perhaps you've already discovered that. Never heard of it. It's not in the standard library, is it? I don't see it in the table of contents or the index. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
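For the record, posixpath is indeed in the standard library (os.path is simply an alias for it on Unix), and importing it directly gives forward-slash semantics on any host - exactly what the Perforce use case above needs. The depot path below is illustrative:

```python
import posixpath

# normalize a forward-slash path regardless of the host platform;
# posixpath even preserves the leading double slash of a depot-style path
p = posixpath.normpath("//depot/main/../release/spec.txt")
```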
Re: [Python-Dev] PEP 355 status
Greg Ewing wrote: Talin wrote: (Actually, the OOP approach has a slight advantage in terms of the amount of syntactic sugar available, Even if you don't use any operator overloading, there's still the advantage that an object provides a namespace for its methods. Without that, you either have to use fairly verbose function names or keep qualifying them with a module name. Code that uses the current path functions tends to contain a lot of os.path.this(os.path.that(...)) stuff which is quite tedious to write and read. Given the flexibility that Python allows in naming the modules that you import, I'm not sure that this is a valid objection -- you can make the module name as short as you feel comfortable with. Another consideration is that having paths be a distinct data type allows for the possibility of file system references that aren't just strings. In Classic MacOS, for example, the definitive way of referencing a file is by a (volRefNum, dirID, name) tuple, and textual paths aren't guaranteed to be unique or even to exist. That's true of textual paths in general - i.e. even on unix, textual paths aren't guaranteed to be unique or exist. It's been a while since I used classic MacOS - how do you handle things like configuration files with path names in them? (I should note that the Java Path API does *not* follow my scheme of separation between locators and inodes, while the C# API does, which is another reason why I prefer the C# approach.) A compromise might be to have all the path algebra operations be methods, and everything else functions which operate on path objects. That would make sense, because the path algebra ought to be a closed set of operations that's tightly coupled to the platform's path semantics.
Personally, this is one of those areas where I am strongly tempted to violate TOOWTDI - I can see use cases where string-based paths would be more convenient and less typing, and other use cases where object-based paths would be more convenient and less typing. If I were designing a path library, I would create a string-based system as the lowest level, and an object-based system on top of it (the reason for doing it that way is simply so that people who want to use strings don't have to suffer the cost of creating temporary path objects to do simple things like joins.) Moreover, I would keep the naming conventions of the two systems similar, if at all possible - thus, the object methods would have the same (short) names as the functions within the module. So for example:

    # Import new, refactored module io.path
    from io import path

    # Case 1 using strings
    path1 = path.join( "/Libraries/Frameworks", "Python.Framework" )
    parent = path.parent( path1 )

    # Case 2 using objects
    pathobj = path.Path( "/Libraries/Frameworks" )
    pathobj += "Python.Framework"
    parent = pathobj.parent()

Let me riff on this just a bit more - don't take this all too seriously though: Refactored organization of path-related modules (under a new name so as not to conflict with existing modules):

    io.path -- path manipulations
    io.dir -- directory functions, including dirwalk
    io.fs -- dealing with filesystem objects (inodes, symlinks, etc.)
    io.file -- file read / write streams

    # Import directory module
    import io.dir

    # String based API
    for entry in io.dir.listdir( "/Library/Frameworks" ):
        print entry                     # Entry is a string

    # Object based API
    dir = io.dir.Directory( "/Library/Frameworks" )
    for entry in dir:                   # Iteration protocol on dir object
        print entry                     # entry is an obj, but __str__() returns path text

    # Dealing with various filesystems: pass in a format parameter
    dir = io.dir.Directory( "/Library/Frameworks" )
    print entry.path( format="NT" )     # entry printed in NT format

    # Or you can just use a format specifier for PEP 3101 string format:
    print "Path in local system format is {0}".format( entry )
    print "Path in NT format is {0:NT}".format( entry )
    print "Path in OS X format is {0:OSX}".format( entry )

Anyway, off the top of my head, that's what a refactored path API would look like if I were doing it :) (Yes, the names are bad, can't think of better ATM.) -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 355 status
, and to overload the various functions and operators so that they, too, return paths. However, path algebra can be implemented just as easily in a functional style as in an object style. Properly done, a functional design shouldn't be significantly more bulky or wordy than an object design; the fact that the existing legacy API fails this test has more to do with history than with any inherent advantage of OOP over the functional style. (Actually, the OOP approach has a slight advantage in terms of the amount of syntactic sugar available, but that is [a] an artifact of the current Python feature set, and [b] not necessarily a good thing if it leads to gratuitous, Perl-ish cleverness.) As a point of comparison, the Java Path API and the C# .Net Path API have similar capabilities; however, the former is object-based whereas the latter is functional and operates on strings. Having used both of them extensively, I find I prefer the C# style, mainly due to the ease of inter-conversion with regular strings - being able to read strings from configuration files, for example, and immediately operate on them without having to convert to path form. I don't find p.GetParent() much harder or easier to type than Path.GetParent( p ); but I do prefer Path.GetParent( string ) over Path( string ).GetParent(). However, this is only a *mild* preference - I could go either way, and wouldn't put up much of a fight about it. (I should note that the Java Path API does *not* follow my scheme of separation between locators and inodes, while the C# API does, which is another reason why I prefer the C# approach.) Part 3: Does this mean that the current API cannot be improved? Certainly not! I think everyone (well, almost) agrees that there is much room for improvement in the current APIs. They certainly need to be refactored and recategorized. But I don't think that the solution is to take all of the path-related functions and drop them into a single class, or even a single module.
--- Anyway, I hope that (a) that answers your questions, and (b) isn't too divergent from most people's views about Path. -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 355 status
(one additional postscript - One thing I would be interested in is an approach that unifies file paths and URLs so that there is a consistent locator scheme for any resource, whether they be in a filesystem, on a web server, or stored in a zip file.) -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 355 status
[EMAIL PROTECTED] wrote: Talin writes: (one additional postscript - One thing I would be interested in is an approach that unifies file paths and URLs so that there is a consistent locator scheme for any resource, whether they be in a filesystem, on a web server, or stored in a zip file.) +1 But doesn't file:/// do that for files, and couldn't we do something like zipfile:///nantoka.zip#foo/bar/baz.txt? Of course, we'd want to do ziphttp://your.server.net/kantoka.zip#foo/bar/baz.txt, too. That way leads to madness. file:/// does indeed do it, but only the network module understands strings in that format. Ideally, you should be able to pass file:///... to a regular open function. I wouldn't expect it to be able to understand "http://" URLs, but the file: protocol should always be supported. In other words, I'm not proposing that the built-in file i/o package suddenly grow an understanding of network schema types. All I am proposing is a unified name space. - Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
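A slice of this unified namespace does exist in the library, just not in open() itself: urllib's urlopen accepts file:// URLs alongside http://, so one call can read either kind of resource. A sketch using the modern urllib.request spelling (the thread predates this module layout):

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlopen

# write a scratch file, then read it back through its file:// URL
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("hello")
    name = f.name

# Path.as_uri() builds a well-formed file:// URL from a local path
with urlopen(Path(name).as_uri()) as resp:
    data = resp.read()

os.unlink(name)  # clean up the scratch file
```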
Re: [Python-Dev] PSF Infrastructure has chosen Roundup as the issue tracker for Python development
Anthony Baxter wrote: Thanks to the folks involved in this process - I'm looking forward to getting the hell away from SF's bug tracker. :-) Yes, let us know when the new tracker is up, I want to start using it :) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The lazy strings patch
Larry Hastings wrote: Martin v. Löwis wrote: Let's be specific: when there is at least one long-lived small lazy slice of a large string, and the large string itself would otherwise have been dereferenced and freed, and this small slice is never examined by code outside of stringobject.c, this approach means the large string becomes long-lived too and thus Python consumes more memory overall. In pathological scenarios this memory usage could be characterized as insane. True dat. Then again, I could suggest some scenarios where this would save memory (multiple long-lived large slices of a large string), and others where memory use would be a wash (long-lived slices containing the all or almost all of a large string, or any scenario where slices are short-lived). While I think it's clear lazy slices are *faster* on average, its overall effect on memory use in real-world Python is not yet known. Read on. I wonder - how expensive would it be for the string slice to have a weak reference, and 'normalize' the slice when the big string is collected? Would the overhead of the weak reference swamp the savings? -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
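Talin's weak-reference idea can be sketched at the Python level with wrapper classes (BigString and LazySlice are illustrative names only, nothing like the actual C patch): the slice holds only a weak reference to its parent, and the parent "normalizes" any surviving slices in its finalizer, just before its buffer goes away.

```python
import weakref

class LazySlice:
    def __init__(self, parent, start, stop):
        # weak reference: the slice does NOT keep the big string alive
        self._parent = weakref.ref(parent)
        self._start, self._stop = start, stop
        self._data = None                 # filled in when normalized

    def _materialize(self, text):
        if self._data is None:
            self._data = text[self._start:self._stop]

    def __str__(self):
        if self._data is None:
            # parent still alive: copy lazily on first use
            self._materialize(self._parent()._text)
        return self._data

class BigString:
    def __init__(self, text):
        self._text = text
        self._slices = []                 # weakrefs to dependent slices

    def lazy_slice(self, start, stop):
        s = LazySlice(self, start, stop)
        self._slices.append(weakref.ref(s))
        return s

    def __del__(self):
        # "normalize" surviving slices before our buffer disappears
        for ref in self._slices:
            s = ref()
            if s is not None:
                s._materialize(self._text)

big = BigString("abcdefgh" * 12500)       # ~100k characters
s = big.lazy_slice(1, 5)
del big                                   # finalizer normalizes the slice
```

The overhead question Talin raises is real: every slice now costs two weakref objects plus finalizer bookkeeping, which is exactly the trade-off that would need measuring.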
Re: [Python-Dev] The lazy strings patch
% --- Totals: 7976ms 7713ms +3.4% 8257ms 7975ms +3.5% I also ran this totally unfair benchmark:

    x = "abcde" * 20000   # 100k characters
    for i in xrange(1000):
        y = x[1:-1]

and found my patched version to be 9759% faster. (You heard that right, 98x faster.) I'm ready to post the patch. However, as a result of this work, the description on the original patch page is really no longer accurate: http://sourceforge.net/tracker/index.php?func=detail&aid=1569040&group_id=5470&atid=305470 Shall I close/delete that patch and submit a new patch with a more modern description? After all, there's not a lot of activity on the old patch page... Cheers, /larry/ * As I recall, stringobject.c needs the trailing zero in exactly *one* place: when comparing two zero-length strings. My patch ensures that zero-length slices and concatenations still return nullstring, so this still works as expected. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Python Doc problems
[EMAIL PROTECTED] wrote: Andrew In such autogenerated documentation, you wind up with a list of Andrew every single class and function, and both trivial and important Andrew classes are given exactly the same emphasis. I find this true where I work as well. Doxygen is used as a documentation generation tool for our C++ class libraries. Too many people use that as a crutch to often avoid writing documentation altogether. It's worse in many ways than tools like epydoc, because you don't need to write any docstrings (or specially formatted comments) to generate reams and reams of virtual paper. This sort of documentation is all but useless for a Python programmer like myself. I don't really need to know the five syntactic constructor variants. I need to know how to use the classes which have been exposed to me. As someone who has submitted patches to Doxygen (and actually had them accepted), I have to say that I agree as well. At my work, it used to be standard practice for each project to have a web site of documentation that was generated by Doxygen. Part of the reason for my patches (which added support for parsing of C# doctags) was in support of this effort. However, I gradually realized that there's no actual use-case for Doxygen-generated docs in our environment. Think about the work cycle of a typical C++ programmer. Generally when you need to look up something in the docs for a module, you either need specific information on the type of a variable or params of a function, or you need overview docs that explain the general theory of the module. Bear in mind also that the typical C++ programmer is working inside of an IDE or other smart editor. Most such editors have a simple one-keystroke method of navigating from a symbol to its definition. 
In other words, it is *far* easier for a programmer to jump directly to the actual declaration in a header file - and its accompanying documentation comments - than it is to switch over to a web browser, navigate to the documentation site, type in the name of the symbol, hit search...why would I *ever* use HTML reference documentation when I can just look at the source, which is much easier to get to? Especially since the source often tells me much more than the docs would. The only reason for generated reference docs is when you are working on a module where you don't have the source code - which, even in a proprietary environment, is something to be avoided whenever possible. (The source may not be 'open', but that doesn't mean that *you* can't have access to it.) If you have the source - and a good indexing system in your IDE - there's really no need for Doxygen. Of course, the web-based docs are useful when you need an overview - but Doxygen doesn't give you that. As a result, I have been trying to get people to stop using Doxygen as a crutch, as you say - in other words, if a team has the responsibility to write docs for their code, they can't just run Doxygen over the source and call it done. (Too bad there's no way to automatically generate the overview! :) While I am in rant mode (sorry), I also want to mention that most documentation markup systems also have a source-readability cost - i.e. having embedded tags like @param makes the original source less readable; and given what I said above about the source being the primary reference doc, it doesn't make sense to clutter up the code with funny @#$ characters. If I were going to use any markup system in the future, the first thing I would insist is that the markup be invisible - in other words, the markup should look just like normal comments, and the markup scanner should be smart enough to pick out the structure without needing a lot of hand-holding. For example: /* Plot a point at position x, y.
   'x' - The x-coordinate.
   'y' - The y-coordinate. */
void Plot( int x, int y );

The scanner should note that 'x' and 'y' are in single quotes, so they probably refer to code identifiers. The scanner can see that they are both parameters to the function, so there's no need to tell it that 'x' is an @param. In other words, the programmer should never have to type anything that can be deduced from looking at the code itself. And the reader shouldn't have to read a bunch of redundant information which they can easily see for themselves. I guess this is a long-winded way of saying, me too. Skip ditto. -- Talin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
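The "invisible markup" idea can be prototyped in a few lines. A naive scanner (hypothetical and regex-based, handling only this toy declaration) matches quoted names in the comment against the parameter list, so no @param tag is ever needed:

```python
import re

comment = """Plot a point at position x, y.
'x' - The x-coordinate.
'y' - The y-coordinate."""
signature = "void Plot( int x, int y );"

# parameter names, from a (very) naive parse of the C declaration
params = re.findall(r"\bint\s+(\w+)", signature)

# identifiers the comment mentions in single quotes
mentioned = re.findall(r"'(\w+)'", comment)

# the scanner deduces which quoted names document parameters
param_docs = [name for name in mentioned if name in params]
```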
Re: [Python-Dev] Minipython
Milan Krcmar wrote: Thank you people. I'm going to try to strip unneeded things and let you know the result. Along with running Python on an embedded system, I am considering two more things. Suppose the system to be a small Linux router which, after the kernel starts, merely configures lots of kernel parameters and then runs some daemons for gathering statistics and allowing remote control of the host. Python helps mainly in the startup phase of configuring the kernel according to human-readable configuration files. This has traditionally been solved with shell scripts. Python is not as suitable for running external processes and process pipes as a shell, but I'd like to write a module (at least) to help with this, in the spirit of scsh (a Scheme shell, http://www.scsh.net). A more advanced solution is to replace the system's init (/sbin/init) with Python. It should even speed up startup, as it will not need to run the shell many times. To avoid running other processes, I want to port them to Python. Processes for kernel configuration, like iproute2, iptables etc., are often built on top of their own libraries, which can serve as a starting point. (Yes, it does matter: at startup, routers run such processes hundreds of times.) Milan One alternative you might want to look into is the language Lua (www.lua.org), which is similar to Python in some respects (it also has some similarities to JavaScript), but specifically optimized for embedding in larger apps - meaning that it has a much smaller footprint, a much smaller standard library, fewer built-in data types and so on. (For example, dicts, lists, and objects are all merged into a single type called a 'table', which is just a generic indexable container.) Lua's C API consists of just a few dozen functions. It's not as powerful as Python, of course, although it's surprisingly powerful for its size - it has closures, continuations, and all of the goodness you would expect from a modern language.
Lua provides 'meta-mechanisms' for extending the language rather than implementing language features directly. So even though it's not a pure object-oriented language, it provides mechanisms for implementing classes and inheritance. And it's fast, since it has less baggage to carry around. It has a few warts - for example, I don't like the fact that referring to an undefined variable silently returns nil instead of raising an error, but I suppose in some environments that's a feature. A lot of game companies use Lua as the embedded scripting language in their games. (Console-based games in particular have strict memory requirements, since there's no virtual memory on consoles.) -- Talin
Re: [Python-Dev] Grammar change in classdef
Nick Coghlan wrote: As for the reason: it makes it possible to use the same style for classes without bases as is used for functions without arguments. Prior to this change, there was a sharp break in the class syntax, such that if you got rid of the last base class you had to get rid of the parentheses as well. Is the result a new-style or classic-style class? It would be nice if using the empty parens forced a new-style class... -- Talin
Re: [Python-Dev] Grammar change in classdef
Lawrence Oluyede wrote: That was my first thought as well. Unfortunately a quick test shows that class Foo(): creates an old-style class instead :( I think that's because until it'll be safe to break things we will stick with classic by default... But in this case nothing would have been broken, since the () syntax was formerly not allowed, so it won't appear in any existing code. So it would have been a good opportunity to shift over to increased usage of new-style classes without breaking anything. Thus, 'class Foo:' would create a classic class, but 'class Foo():' would create a new-style class. However, once it's released in 2.5 that will no longer be the case, as people might start to use () to indicate a classic class. Oh well. -- Talin
Re: [Python-Dev] What should the focus for 2.6 be?
Guido van Rossum wrote: I've been thinking a bit about a focus for the 2.6 release. We are now officially starting parallel development of 2.6 and 3.0. I really don't expect that we'll be able to merge changes easily into the 3.0 branch much longer, so effectively 3.0 will be a fork of 2.5. I wonder if it would make sense to focus in 2.6 on making porting of 2.6 code to 3.0 easier, rather than trying to introduce new features in 2.6. We've done releases without new language features before; notably, 2.3 didn't add anything new (except making a few __future__ imports redundant) and concentrated on bugfixes, performance, and library additions. I've been thinking about the transition to unicode strings, and I want to put forward a notion that might allow the transition to be done gradually instead of all at once. The idea would be to temporarily introduce a new name for 8-bit strings - let's call it ascii. An ascii object would be exactly the same as today's 8-bit strings. The 'str' builtin symbol would be assigned to 'ascii' by default, but you could assign it to 'unicode' if you wanted to default to wide strings:

    str = ascii     # Selects 8-bit strings by default
    str = unicode   # Selects unicode strings by default

In order to make the transition, what you would do is to temporarily undefine the 'str' symbol from the code base - in other words, remove 'str' from the builtin namespace, and then migrate all of the code -- replacing any library reference to 'str' with a reference to 'ascii' *or* updating that function to deal with unicode strings. Once you get all of the unit tests running again, you can re-introduce 'str', but now you know that since none of the libraries refer to 'str' directly, you can safely change its definition. All of this could be done while retaining compatibility with existing 3rd-party code - as long as 'str = ascii' is defined. So you turn it on to run your Python programs, and turn it off when you want to work on 3.0 migration. 
The next step (which would not be backwards compatible) would be to gradually remove 'ascii' from the code base -- wherever that name occurs, it would be a signal that the function needs to be updated to use 'unicode' instead. Finally, once the last occurrence of 'ascii' is removed, the final step is to do a search and replace of all occurrences of 'unicode' with 'str'. I know this seems roundabout, and is more work than doing it all in one shot. However, I know from past experience that the trickiest part of doing a pervasive change to a code base like this is just keeping track of what parts have been migrated and what parts have not. Many times in the past I've changed the definition of a ubiquitous type by temporarily renaming it, thus vacating the old name so that it can be defined anew, without conflict. -- Talin
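The temporary-renaming maneuver described above can be sketched with a stand-in type, since the 2.x str/unicode pair no longer exists; all names here are illustrative:

```python
# Step 1: give the old type an explicit name and (conceptually) remove
# the generic one from builtins, forcing every use site to be visited.
legacy_str = str   # stands in for the proposed 'ascii' alias

def make_header(title):
    # Migrated code names the type it means explicitly...
    return legacy_str(title).upper()

# Step 2: once no library code refers to the bare name, the generic
# name can be rebound safely -- nothing breaks, because every caller
# was forced through the rename.
print(make_header("hello"))  # → HELLO
```

The value of the trick is exactly what the post says: the vacated name turns every unmigrated use into a visible error instead of a silent behavior change.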
Re: [Python-Dev] Community buildbots
Giovanni Bajo wrote: [EMAIL PROTECTED] wrote: I think python should have a couple more future imports. from __future__ import new_classes and from __future__ import unicode_literals would be really welcome, and would smooth the Py3k migration process Actually - can we make new-style classes the default, but allow a way to switch to old-style classes if needed? Perhaps a command-line argument to set the default back to old-style? -- Talin
Re: [Python-Dev] Explicit Lexical Scoping (pre-PEP?)
Ka-Ping Yee wrote: On Mon, 10 Jul 2006 [EMAIL PROTECTED] wrote: I think Talin's got a point though. It seems hard to find one short English word that captures the essence of the desired behavior. None of the words in his list seem strongly suggestive of the meaning to me. I suspect that means one's ultimately as good (or as bad) as the rest. What's wrong with nonlocal? I don't think i've seen an argument against that one so far (from Talin or others). Well, I just think that a fix for an aesthetic wart should be, well, aesthetic :) I also think that it won't be a complete disaster if we do nothing at all - there *are* existing ways to deal with this problem; there are even some which aren't hackish and non-obvious. For example, it's easy enough to create an object which acts as an artificial scope:

    def x():
        class Scope: pass    # a bare object() won't accept attributes
        scope = Scope()
        scope.x = 1
        def y():
            scope.x = 2

To my mind, the above code looks about as elegant and efficient as most of the proposals put forward so far, and it already works. How much are we really saving here by building this feature into the language? -- Talin
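A runnable version of the artificial-scope trick, using types.SimpleNamespace as one convenient attribute holder (any class instance would do):

```python
from types import SimpleNamespace

def outer():
    scope = SimpleNamespace(x=1)   # shared, mutable 'scope' object
    def inner():
        scope.x = 2                # mutates the object; no rebinding keyword needed
    inner()
    return scope.x

print(outer())  # → 2
```

The inner function never rebinds a name in the enclosing scope, only an attribute of a shared object, which is why this works without any new syntax.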
Re: [Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]
Brett Cannon wrote: Using a factory method callback, one could store the PyCodeObject in a C proxy object that just acts as a complete delegate, forwarding all method calls to the internally stored PyCodeObject. That would work. For this initial implementation, though, I am not going to try to support this. We can always add support like this later since it doesn't fundamentally break or change anything that is already planned. Let's focus on getting even more basic stuff working before we start to get too fancy. -Brett Thinking about this some more, I've come up with a simpler idea for a generic wrapper class. The wrapper consists of two parts: a decorator to indicate that a given method is 'public', and a C 'guard' wrapper that ensures that only 'public' members can be accessed. So for example:

    from Sandbox import guard, public

    class FileProxy:
        # Public methods, marked by the decorator
        @public
        def read( length ):
            ...

        @public
        def seek( offset ):
            ...

        # Private method, no decorator
        def open( path ):
            ...

    # Construct an instance of FileProxy, and wrap a
    # guard object around it. Any attempt to access a non-public
    # attribute will raise an exception (or perhaps simply report
    # that the attribute doesn't exist.)
    fh = guard( FileProxy() )

Now, from my point of view this *is* 'the basic stuff'. In other words, this right here is the fundamental sandbox mechanism, and everything else is built on top of it. Now, the C 'guard' function is only a low-level means to ensure that no one can mess with the object; it is not intended to be the actual restriction policy itself. The policies are placed in the wrapper classes, just as in your proposed scheme. (This goes back to my basic premise, that a simple yes/no security feature can be used to build up much more complex and subtle security features.) The only real complexity of this, as I see it, is that references to methods will themselves have to be wrapped. 
In other words, if I say 'fh.read' without the (), what I get back can't be the actual read function object - that would be too easy to fiddle with. What I'd have to get is a wrapped version of the method that is a callable. One relatively simple way to deal with this is to have the 'public' decorator create a C wrapper for the function object, and store it as an attribute of the function. The class wrapper then simply looks up that attribute; if the wrapper attribute is present it returns it, otherwise it fails. (Although, I've often wished for Python to have a variant of __call__ that could be used to override individual methods, i.e.:

    __call_method__( self, methodname, *args )

This would make the guard wrapper much easier to construct, since we could restrict the methods only to being called, and not allow references to methods to be taken.) -- Talin
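A pure-Python sketch of the guard/public pair described above. Note the post wants the real guard written in C precisely because a Python version like this can be subverted through introspection; the names and the stand-in file behavior are illustrative:

```python
def public(func):
    func._is_public = True          # mark the method as guard-visible
    return func

class Guard:
    """Forward only @public methods of the wrapped object."""
    def __init__(self, target):
        object.__setattr__(self, "_target", target)
    def __getattr__(self, name):
        attr = getattr(object.__getattribute__(self, "_target"), name)
        if getattr(attr, "_is_public", False):
            return attr
        raise AttributeError(name)  # private: report 'no such attribute'

class FileProxy:
    @public
    def read(self, length):
        return b"x" * length        # stand-in for real file I/O
    def open(self, path):           # private: unreachable via the guard
        raise RuntimeError("must not be callable from sandboxed code")

fh = Guard(FileProxy())
print(fh.read(3))                   # allowed: read is @public
# fh.open("/etc/passwd")            # would raise AttributeError
```

Bound methods forward attribute reads to the underlying function, which is why the `_is_public` flag set by the decorator is visible through `fh.read`.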
[Python-Dev] easy_install
Here's something to discuss: First, let me say that I love easy_install. It absolutely just works and does what I want, and makes it really simple to install whatever bit of Python code I need. At the same time, however, I get kind of scared when I hear people on the list discussing the various hacks needed to get setuptools and distutils et al. all playing nice with each other (monkeypatching, etc.) Having done a small bit of fiddling with distutils myself (as a user, I mean), I can see that while there's a terrific amount of effort put into it, it's also not for the faint of heart. That's not entirely distutils' fault - I gather that it's dealing with a lot of accumulated cruft (I imagine things like different and strange ways of archiving modules, dynamic modifications to path, all that sort of thing.) It seems to me that if someone was going to spend some energy on this list coming up with proposals to improve Python, the thing that would have the most positive benefit in the long run (with the possible exception of Brett's work on rexec) would be a unified and clean vision of the whole import / package / download architecture. Now, speaking from complete ignorance here, I might be way off base - it may be that this matter is well in hand, perhaps on some other mailing list. I don't know. In any case, I wanted to throw this out there... -- Talin
Re: [Python-Dev] Capabilities / Restricted Execution
Scott Dial wrote: Phillip J. Eby wrote: A function's func_closure contains cell objects that hold the variables. These are readable if you can set the func_closure of some function of your own. If the overall plan includes the ability to restrict func_closure setting (or reading) in a restricted interpreter, then you might be okay. Except this function (__getattribute__) has been trapped inside of a class which does not expose it as an attribute. So, you shouldn't be able to get to the func_closure attribute of the __getattribute__ function for an instance of the Guard class. I can't come up with a way to defeat this protection, at least. If you have a way, then I'd be interested to hear it. I've thought of several ways to break it already. Some are repairable; I'm not sure that they all are. For example, neither of the following statements blows up:

    print t2.get_name.func_closure[0]
    print object.__getattribute__( t2, '__dict__' )

Still, it's perhaps a useful basis for experimentation. -- Talin
Re: [Python-Dev] Explicit Lexical Scoping (pre-PEP?)
Ka-Ping Yee wrote: On Sun, 9 Jul 2006, Andrew Koenig wrote: Sounds reasonable to me. If we're talking py3k I'd chuck global as a keyword though and replace it with something like outer. I must say that I don't like outer any more than I like global. The problem is that in both cases we are selecting the *inner*most definition that isn't in the current scope. That's why nonlocal is a better choice in my opinion. Some alternatives:

    use x
    using x
    with x      -- recycle a keyword?
    reuse x
    use extant x
    share x
    common x
    same x
    borrow x
    existing x

Although, to be perfectly honest, the longer this discussion goes on, the more I find that I'm not buying Guido's argument about it being better to define this at the point of use rather than at the point of definition. I agree with him that point of use is more Pythonic, but I'm also beginning to believe that there are some good reasons why many other languages do it the other way. Part of the reason why it's so hard to name this feature is that its real name is something like Hey, Python, you know that cool funky thing you do with defining variables in the same scope as they are assigned? Well, don't do that here. -- Talin
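For reference, the nonlocal spelling under discussion here is the one eventually adopted (PEP 3104), and it reads like this:

```python
def counter():
    n = 0
    def bump():
        nonlocal n   # reuse the enclosing binding instead of shadowing it
        n += 1
        return n
    return bump

c = counter()
print(c(), c(), c())   # → 1 2 3
```

Without the declaration, `n += 1` would be treated as a new local and raise UnboundLocalError, which is exactly the behavior the thread is trying to name.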
Re: [Python-Dev] Explicit Lexical Scoping (pre-PEP?)
Talin wrote: Some alternatives:

    use x
    using x
    with x      -- recycle a keyword?
    reuse x
    use extant x
    share x
    common x
    same x
    borrow x
    existing x

Although, to be perfectly honest, the longer this discussion goes on, the more I find that I'm not buying Guido's argument about it being better to define this at the point of use rather than at the point of definition. I agree with him that point of use is more Pythonic, but I'm also beginning to believe that there are some good reasons why many other languages do it the other way. Part of the reason why it's so hard to name this feature is that its real name is something like Hey, Python, you know that cool funky thing you do with defining variables in the same scope as they are assigned? Well, don't do that here. (Followup to my own comment) There are really 3 places where you can indicate that a variable is to be reused instead of redefined: 1) The point of definition in the outer scope, 2) A declaration in the inner scope, and 3) The actual point of assignment. #1 is what I've been pushing for, #2 is what most of the discussion has been about, #3 has been talked about a little bit in the context of an augmented assignment operator. I actually like #3 a little better than #2, but not with a new operator. I'm thinking more along the lines of a keyword that modifies an assignment statement:

    rebind x = 10

Other possible keywords are: modify, mutate, change, update, etc... My gut feeling is that most code that wants to use this feature only wants to use it in a few places. A good example is fcgi.py (which implements WSGI for FastCGI), where they use a mutable array to store a flag indicating whether or not the headers have already been sent. -- Talin
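The fcgi.py-style idiom mentioned above, sketched here with illustrative names (not taken from fcgi.py itself): a one-element list serves as a rebindable flag, so the closure mutates rather than rebinds:

```python
def make_writer():
    headers_sent = [False]          # mutable container as a shared flag
    def write(data):
        if not headers_sent[0]:
            headers_sent[0] = True  # mutation, not rebinding: no keyword needed
            return "Status: 200 OK\r\n\r\n" + data
        return data
    return write

write = make_writer()
print(write("first"))   # headers prepended on the first call only
print(write("second"))
```

This is precisely the workaround the various keyword proposals (rebind, nonlocal, ...) would make unnecessary.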
Re: [Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]
Brett Cannon wrote: On 7/7/06, Guido van Rossum [EMAIL PROTECTED] wrote: On 7/8/06, Ka-Ping Yee [EMAIL PROTECTED] wrote: I'd like the answer to be yes. It sounded for a while like this was not part of Brett's plan, though. Now i'm not so sure. It sounds like you're also interested in having the answer be yes? Let's keep talking about and playing with more examples -- i think they'll help us understand what goals we should aim for and what pitfalls to anticipate before we nail down too many details. I'd like the answer to be no, because I don't believe that we can trust the VM to provide sufficient barriers. The old pre-2.2 restricted execution mode tried to do this but 2.2 punched a million holes in it. Python isn't designed for this (it doesn't even enforce private attributes). I guess this is also the main reason I'm skeptical about capabilities for Python. My plan is no. As Guido said, the feasibility of getting this right is questionable. I do not plan on trying to have security proxies or such implemented in Python code; it will need to be in C. If someone comes along and manages to find a way to make this work without significantly changing the language, great, and we can toss out my security implementation for that. But as of right now, I am not planning on making Python code safe to run in Python code. It might be possible for the code *outside* the sandbox to create new security policies written in Python. Let's start with the concept of a generic protection wrapper - it's a C proxy object which can wrap around any Python object, and which can restrict access to a specific set of methods. So for example:

    protected_object = protect( myObject, methods={'open', 'close'} )

'protect' creates a C proxy which restricts access to the object, allowing only those methods listed to be called. Now, let's create a security policy, written in Python. 
The policy is essentially a factory which creates wrapped objects:

    class MyPolicy:
        # Ask the policy to create a file object
        def file( path, perms ):
            if perms == 'r':
                # Trivial example; a real proxy would be more
                # sophisticated, and probably configurable.
                return protect( file( path, perms ),
                                methods={'open', 'read', 'close'} )
            raise SecurityException

Now, when we create our sandbox, we pass in the policy:

    sb = Sandbox( MyPolicy() )

The sandbox calls 'protect' on the policy object, preventing it from being inspected or called inappropriately. -- Talin
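A pure-Python sketch of protect and the policy factory described above. The real protect must be a C proxy for the reasons Brett gives; here io.StringIO stands in for a real file so the sketch stays self-contained, and all names are illustrative:

```python
import io

class _Protected:
    """Expose only an approved set of method names on the target."""
    def __init__(self, target, methods):
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_methods", frozenset(methods))
    def __getattr__(self, name):
        if name in object.__getattribute__(self, "_methods"):
            return getattr(object.__getattribute__(self, "_target"), name)
        raise AttributeError(name)

def protect(obj, methods):
    return _Protected(obj, methods)

class MyPolicy:
    def file(self, path, perms):
        if perms == "r":
            # StringIO stands in for file(path, perms) in this sketch
            return protect(io.StringIO("contents"), {"read", "close"})
        raise PermissionError(perms)

fh = MyPolicy().file("/etc/motd", "r")
print(fh.read())    # allowed: 'read' is in the approved set
# fh.write("x")     # would raise AttributeError: not approved
```

The policy decides *which* methods to approve; the proxy only enforces the set, keeping the unbreakable part as small as possible.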
Re: [Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]
Brett Cannon wrote: On 7/5/06, Talin [EMAIL PROTECTED] wrote: Transitioning from the checked to the unchecked state could only be done via C code. So the 'file' wrapper, for example, would switch over to the unchecked interpreter before calling the actual methods of 'file'. That C wrapper might also check the current permission state to see what operations were legal. So add the proper checks in Python/ceval.c:call_function() to check for this flag on every object passed in that is called? Right. I also realized that you would need to add code that propagates the checked bit from the class to any instances of the class. So whenever you call a class to create an object, if the class has the checked bit set, the instance will have it set as well. So essentially, what I propose is to define a simple security primitive - which essentially comes down to checking a single bit - and use that as a basis to create more complex and subtle security mechanisms. Right, but it does require that the proper verification function be turned on so that the permission bit on 'file' is checked. It kind of seems like 'rexec' and the f_restricted flag it sets on execution frames, except you are adding an object-level flag as well. Either way, the trick is not fouling up switching between the two checking functions. -Brett I wasn't aware of how rexec worked, but that seems correct to me. Given a 'restricted' flag on a stack frame, and one on the object as well, the code for checking for permission violations is nothing more than:

    if object.restricted and exec_frame.restricted:
        raise SecurityException

In particular, there's no need to call a function to check a permission level or access rights or anything of the sort - all that stuff is implemented at a higher level. By making the check very simple, it can also be made very fast. And by making it fast, we can afford to call it a lot - for every operation, in fact. 
And if we can call it for every operation, then we don't have to spend time hunting down all of the possible loopholes and ways in which 'file' or other restricted objects might be accessed. Originally I had thought to simply add a check like the above into the interpreter. However, that would mean that *all* code, whether restricted or not, would have to pay the (slight) performance penalty of checking that flag. So instead, I thought it might be more efficient to have two different code paths, one with the check and one without. But all this is based on profound ignorance of the interpreter - there might be a hundred other, better ways to do this without having to create two versions of ceval. Another interesting thing about the check bit idea is that you can set it on any kind of object. For example, you could set it on individual methods of a class rather than the class as a whole. However, that's probably needlessly elaborate, since fine-grained access control will be much more elegantly achieved via trusted wrappers. -- Talin
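The two-bit check really is small enough to state in a few lines. A toy model, with all names invented for illustration:

```python
class SecurityError(Exception):
    pass

class Frame:
    def __init__(self, restricted):
        self.restricted = restricted

def check(obj, frame):
    # The entire primitive: one bit on the object, one on the frame.
    if getattr(obj, "_restricted", False) and frame.restricted:
        raise SecurityError("restricted object used from sandboxed code")

class File:
    _restricted = True   # the poisoned/checked bit from the discussion

check(File(), Frame(restricted=False))       # trusted frame: passes
try:
    check(File(), Frame(restricted=True))    # sandboxed frame: blocked
except SecurityError as e:
    print("blocked:", e)
```

Everything richer (policies, proxies, per-method permissions) is built above this check, which is what keeps it fast enough to run on every operation.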
Re: [Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]
Brett Cannon wrote: On 7/6/06, Talin [EMAIL PROTECTED] wrote: And if we can call it for every operation, then we don't have to spend time hunting down all of the possible loopholes and ways in which 'file' or other restricted objects might be accessed. Not true. You have to set this object restriction flag, right? What happens if you don't set it on all of the proper classes/types? You end up in the exact same situation you are in with crippling: making sure you cover your ass with what you flag as unsafe, or else you risk having something get past you. But that's a much simpler problem. With the restricted flag, it isn't just *your* code that is prevented from using 'file' - it's *all* code. Only approved gateways that remove the restriction (by setting the interpreter state) can perform operations on file objects without blowing up. This means that if you call some random library function that attempts to open a file, it won't work, because the random library function is still running in restricted mode. Similarly, if you have a reference to some externally created object that has a reference to a file (or the file class) somewhere in its inheritance hierarchy, any attempt to access that object will fail. Without this, you would have to chase down every bit of library code that opens a file, or has a reference to a file. What I am proposing shares some aspects of both the crippling and the capability model: It's similar to crippling in the sense that you're protecting the object itself, not access to the object. So you avoid the problem of trying to figure out all of the possible ways an object can be accessed. However, where it resembles capabilities is that it's an 'all or nothing' approach - that is, you either have access to file, or you don't. 
Unlike the crippling model, where fine-grained access control is implemented by modifying individual methods of the crippled object, in this scheme we cripple the object *entirely*, and then provide fine-grained access control via wrappers. Those wrappers, in turn, act just like capabilities - you can have different wrappers that have different sets of access permissions. So it provides the advantage of the capability approach in that the set of restrictions can be extended or modified by writing new wrappers. Thus, by providing an extremely simple but unbreakable check at the interpreter level, we can then write classes that build on top of that a richer and more sophisticated set of permissions, while still maintaining a strong barrier to unauthorized actions. -- Talin
Re: [Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]
to poisoned objects, which I will illustrate with the following example: Suppose we want to grant the sandboxed program permission to read and write configuration properties. We don't want to give it arbitrary write access to the file; instead we want to force the sandboxed code to access that file only by setting and getting properties. This is an example where a subsystem would require elevated privileges compared to the main program - the config file reader / writer needs to be able to read and write the file as a text stream, but we don't want to allow the sandboxed program to just write arbitrary data to it. The only way to enforce this restriction is to re-wrap the 'real' file handle - in other words, replace the 'file-like object' wrapper with a 'config-like object' wrapper. Merely passing the poisoned file handle to 'config' doesn't work, because 'config' doesn't know how to safely handle it (only the C gateway code can shift the interpreter into a state where poisoned objects can be handled safely.) Passing the file-like proxy to 'config' doesn't work either, because the proxy doesn't allow arbitrary writes. The only thing that you really can do is write another wrapper - which is exactly what you would have to do in the non-poison case. -- Talin
Re: [Python-Dev] In defense of Capabilities [was: doc for new restricted execution design for Python]
Brett Cannon wrote: On 7/5/06, Michael Chermside [EMAIL PROTECTED] wrote: If you were using capabilities, you would need to ensure that restricted interpreters could only get the file object that they were given. But then _all_ of these fancy versions of the restrictions would be immediately supported: it would be up to the users to create secure wrappers implementing the specific restrictions desired. I agree. I would prefer this way of doing it. But as I have said, making sure that 'file' does not get out into the wild is tough. I seem to recall someone mentioned earlier in this discussion the notion of somehow throwing an exception when sandboxed code attempts to push a file reference onto the interpreter stack. I'm not an expert in these matters, so perhaps what I am going to say will make no sense, but here goes: What if there were two copies of the evaluator function. One copy would be a slightly slower 'checked' function, that would test all objects for a 'check' bit. Any attempt to evaluate a reference to an object with a check bit set would throw an exception. The other eval function would be the 'unchecked' version that would run at full speed, just like it does today. Transitioning from the checked to the unchecked state could only be done via C code. So the 'file' wrapper, for example, would switch over to the unchecked interpreter before calling the actual methods of 'file'. That C wrapper might also check the current permission state to see what operations were legal. Note that whenever a C function sets the interpreter state to 'unchecked', that fact is saved on the stack, so that when the function returns, the previous state is restored. The function for setting the interpreter state is something like PyCall_Unchecked( ... ), which restores the interpreter state back to where it was. Transitioning from unchecked to checked is trickier. 
Specifically, you don't want to ever run sandboxed code in the unchecked state - this is a problem for generators, callbacks, and so on. I can think of two approaches to handling this: First approach is to mark all sandboxed code with a bit indicating the code is untrusted. Any attempt to call or otherwise invoke a function that has this bit set would throw the interpreter into the 'checked' state. (Note that transitioning the other way is *not* automatic - i.e. calling trusted code does not automatically transition to unchecked state.) The approach is good because it means that if you have intermediary code between the wrapper and the sandboxed code, the interpreter still does the right thing - it sets the interpreter into checked state. One problem is how to restore the 'unchecked' state when a function call returns. Probably you would have to build this into the code that does the state transition. If marking the sandboxed code isn't feasible, then you'd have to have the wrapper objects wrap all of the callbacks with code that goes to checked state before calling the callbacks. This means finding all the possible holes - however, I suspect that there are far fewer such holes than trying to hide all possible 'file' methods. However, one advantage of doing this is that restoring the 'unchecked' state after the call returns is fairly straightforward. The advantage of the this whole approach is that once you set the 'check' bit on 'file', it doesn't matter whether 'file' gets loose or not - you can't do anything with it without throwing an exception. Moreover, it also doesn't matter what code path you go through to access it. Only C code that flips the interpreter into unchecked state can call methods on 'file' without blowing up. So essentially, what I propose is to define a simple security primitive - which essentially comes down to checking a single bit - and use that as a basis to create more complex and subtle security mechanisms. 
-- Talin
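The save/clear/restore discipline for the checked flag described in this post maps naturally onto a context manager. A toy model of the PyCall_Unchecked pattern, with all names invented for illustration:

```python
from contextlib import contextmanager

class Interp:
    checked = True            # sandboxed code runs with checking on

@contextmanager
def unchecked(interp):
    saved = interp.checked    # save the state on the way in...
    interp.checked = False
    try:
        yield
    finally:
        interp.checked = saved   # ...and restore it on return, even on error

interp = Interp()
with unchecked(interp):
    print(interp.checked)     # → False: trusted gateway code runs here
print(interp.checked)         # → True: previous state restored automatically
```

This mirrors the requirement that the prior state is saved on the stack and restored when the C gateway returns, so untrusted callbacks can never inherit the unchecked state by accident.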
[Python-Dev] Explicit Lexical Scoping (pre-PEP?)
the print, the scope created by the 'my' statement is in effect for the entire function, although the actual *assignment* takes place after the print. The reason for this is that the scope creation is actually done by the compiler. -- Talin
Re: [Python-Dev] Lexical scoping in Python 3k
the same method with globals:

    my f = 1
    def a():
        f = 2
    a()
    print f    # prints '2'

So again, you are indicating that the global scope 'owns' the definition of 'f', and any enclosed scopes should use that definition, and not create their own. Of course, if you really *do* need to have your own version, you can always override the 'my' statement with another 'my' statement:

    my f = 1
    def a():
        my f = 2
    a()
    print f    # prints '1'

The 'my' statement essentially changes the scoping rules for all variables of that name, within the defining scope and all enclosed scopes. Of course, you can also override this behavior using the 'global' statement, which does exactly what it does now - makes the reference global (i.e. module-level):

    my f = 1
    def a():
        global f
        f = 2
    a()
    print f    # prints '2'

All right, I'm pretty happy with that. Brainstorming done. :) -- Talin
Re: [Python-Dev] PEP 3103: A Switch/Case Statement
Guido van Rossum wrote: Let's just drop the switchable subroutine proposal. It's not viable. Perhaps not - but at the same time, when discussing new language features, let's not just limit ourselves to what other languages have done already. Forget subroutines for a moment - the main point of the thread was the idea that the dispatch table was built explicitly rather than automatically - that instead of arguing over first-use vs. function-definition, we let the user decide. I'm sure that my specific proposal isn't the only way that this could be done. -- Talin
Re: [Python-Dev] PEP 3103: A Switch/Case Statement
Guido van Rossum wrote: On 6/27/06, Ron Adam [EMAIL PROTECTED] wrote: So modeling the switch after dictionary dispatching more directly where the switch is explicitly defined first and then used later might be good both because it offers reuse in the current scope and it can easily be used in code that currently uses dict style dispatching.

    switch name:
        1: ...
        TWO: ...
        'a', 'b', 'c': ...
        in range(5,10): ...
        else: ...

    for choice in data:
        do choice in name:    # best calling form I can think of.

It looks like your proposal is to change switch into a command that defines a function of one parameter. Instead of the do expression in switch call you could just call the switch -- no new syntax needed. Your example above would be

    for choice in data:
        name(choice)    # 'name' is the switch's name

This parallels some of my thinking -- that we ought to somehow make the dict-building aspect of the switch statement explicit (which is better than implicit, as we all have been taught.) My version of this is to add to Python the notion of a simple old-fashioned subroutine - that is, a function with no arguments and no additional scope, which can be referred to by name. For example:

    def MyFunc( x ):
        sub case_1:
            ...
        sub case_2:
            ...
        sub case_3:
            ...
        # A direct call to the subroutine:
        do case_1
        # An indirect call
        y = case_2
        do y
        # A dispatch through a dict
        d = dict( a=case_1, b=case_2, c=case_3 )
        do d[ 'a' ]

The 'sub' keyword defines a subroutine. A subroutine is simply a block of bytecode with a return op at the end. When a subroutine is invoked, control passes to the indented code within the 'sub' clause, and continues to the end of the block - there is no 'fall through' to the next block. When the subroutine is complete, a return instruction is executed, and control transfers back to the original location. Because subroutines do not define a new scope, they can freely modify the variables of the scope in which they are defined, just like the code in an 'if' or 'else' block. 
One ambiguity here is what happens if you attempt to call a subroutine from outside of the code block in which it is defined. The easiest solution is to declare that this is an error - in other words, if the current execution scope is different than the scope in which the subroutine is defined, an exception is thrown. A second possibility is to store a reference to the defining scope as part of the subroutine definition. So when you take a reference to 'case_1', you are actually referring to a closure of the enclosing scope and the subroutine address. This approach has a number of advantages that I can see: -- Completely eliminates the problems of when to freeze the dict, because the dict is 'frozen' explicitly (or not at all, if desired.) -- Completely eliminates the question of whether to support ranges in the switch cases. The programmer is free to invent whatever type of dispatch mechanism they wish. For example, instead of using a dict, they could use an array of subroutines, or a spanning tree / BSP tree to represent contiguous ranges of options. -- Allows for development of dispatch methods beyond the switch model - for example, the dictionary could be computed, transformed and manipulated by user code before used for dispatch. -- Allows for experimentation with other flow of control forms. The primary disadvantage of this form is that the case values and the associated code blocks are no longer co-located, which reduces some of the expressive power of the switch. Note that if you don't want to define a new keyword, an alternate syntax would be 'def name:' with no argument parentheses, indicating that this is not a function but a procedure. -- Talin
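The closest approximation of 'sub' in the Python we have is a dict of zero-argument inner functions; since those *do* create a scope, shared state has to live in a mutable object. A sketch (not the proposal itself, just its workable subset today):

```python
def my_func(x):
    # Closures can't rebind the enclosing function's locals directly,
    # so shared state goes in a mutable container instead.
    state = {"result": None}

    def case_1():
        state["result"] = x + 1

    def case_2():
        state["result"] = x * 2

    # The dispatch dict is built explicitly, as the proposal advocates,
    # and can be transformed or extended by user code before use.
    d = {"a": case_1, "b": case_2}
    d["a"]()  # analogous to: do d['a']
    return state["result"]
```

`my_func(3)` dispatches to `case_1` and returns 4; the cost is exactly the boilerplate around `state` that the no-scope 'sub' form was meant to eliminate.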
[Python-Dev] Alternatives to switch?
= TYPE_INT object_data = str(data)

    def str_case():
        type_code = TYPE_STR
        object_data = str(data)
    # (and so on)

    # Build the dispatch table once only
    if dispatch_table is None:
        dispatch_table = { int: int_case, str: str_case, ... }

    dispatch_table[ type( data ) ]()

However, you probably wouldn't want to write the code like this in current-day Python -- even a fairly long if-elif-else chain would be more efficient, and the code isn't as neatly expressive of what you are trying to do. But the advantage is that the construction of the dispatch table is explicit rather than implicit, which avoids all of the arguments about when the dispatch should occur. Another way to deal with the explicit construction of the switch table is to construct it outside of the function body. So for example, if the values to be switched on are meant to be evaluated at module load time, then the user can define the dispatch table outside of any function. The problem is, however, that the language requires any code that accesses the local variables of a function to be textually embedded within that function, and you can't build a dispatch table outside of a function that refers to code sections within a function. In the interest of brevity, I'm going to cut it off here before I ramble on too much longer. I don't have an answer, so much as I am trying to raise the right questions. -- Talin
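Moving the state into parameters and return values sidesteps the local-variable limitation described above, and lets the table be built once at module load. A sketch with invented names (`TYPE_INT`, `encode`, etc. are illustrative, not from any real API):

```python
TYPE_INT, TYPE_STR = 1, 2  # illustrative type codes

def int_case(data):
    return (TYPE_INT, str(data))

def str_case(data):
    return (TYPE_STR, str(data))

# Built exactly once, at module load time - the 'explicit' dispatch-table
# construction the post argues for, with no freezing ambiguity.
DISPATCH = {int: int_case, str: str_case}

def encode(data):
    return DISPATCH[type(data)](data)
```

`encode(5)` yields `(1, '5')` and `encode('x')` yields `(2, 'x')`; the price is that the cases can no longer touch the caller's locals, only what is passed in.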
Re: [Python-Dev] Switch statement
Guido van Rossum wrote: That sounds like a good solution all around. I hope that others can also find themselves in this. (1) An expression of the form 'static' atom has the semantics of evaluating the atom at the same time as the nearest surrounding function definition. If there is no surrounding function definition, 'static' is a no-op and the expression is evaluated every time. [Alternative 1: this is an error] [Alternative 2: it is evaluated before the module is entered; this would imply it can not involve any imported names but it can involve builtins] [Alternative 3: precomputed the first time the switch is entered] I'm thinking that outside of a function, 'static' just means that the expression is evaluated at compile-time, with whatever symbols the compiler has access to (including any previously-defined statics in that module). The result of the expression is then inserted into the module code just like any literal. So for example: a = static len( '1234' ) compiles as: a = 4 ...assuming that you can call 'len' at compile time. The rationale here is that I'm trying to create an analogy between functions and modules, where the 'static' declaration has an analogous relationship to a module as it does to a function. Since a module is 'defined' when its code is compiled, that would be when the evaluation occurs. I'm tempted to propose a way for the compiler to import static definitions from outside the module ('static import'?) however I recognize that this would greatly increase the fragility of Python, since now you have the possibility that a module could be compiled with a set of numeric constants that are out of date with respect to some other module. -- Talin
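The existing construct whose timing most closely resembles 'static' inside a function is the default parameter value, which is evaluated exactly once, when the def statement executes:

```python
import time

def stamped(_t=time.time()):  # evaluated once, at function definition time
    return _t

def classify(x, _table={"a": 1, "b": 2}):  # the dict is built only once
    return _table.get(x, 0)

# Every call to stamped() sees the same value captured at def time,
# just as a 'static' expression evaluated with the function def would.
```

This is the same observation made later in the thread that a switch constant "should have exactly the behavior of a default value of a function parameter."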
Re: [Python-Dev] Switch statement
Phillip J. Eby wrote: At 09:55 AM 6/21/2006 -0700, Guido van Rossum wrote: BTW a switch in a class should be treated the same as a global switch. But what about a switch in a class in a function? Okay, now my head hurts. :) A switch in a class doesn't need to be treated the same as a global switch, because locals()!=globals() in that case. I think the top-level is the only thing that really needs a special case vs. the general error if you use a local variable in the expression rule. Actually, it might be simpler just to always reject local variables -- even at the top-level -- and be done with it. I don't get what the problem is here. A switch constant should have exactly the behavior of a default value of a function parameter. We don't seem to have too many problems defining functions at the module level, do we? -- Talin
[Python-Dev] Allow assignments in 'global' statements?
I'm sure I am not the first person to say this, but how about: global x = 12 (In other words, declare a global and assign a value to it - or another way of saying it is that the 'global' keyword acts as an assignment modifier.) -- Talin
Re: [Python-Dev] Pre-PEP: Allow Empty Subscript List Without Parentheses
Martin v. Löwis wrote: Noam Raphael wrote: I meant the extra code for writing a special class to handle scalars, if I decide that the x[()] syntax is too ugly or too hard to type, so I write a special class which will allow the syntax x.value. What I cannot understand is why you use a zero-dimensional array to represent a scalar. Scalars are directly supported in Python: x = 5 Also, in an assignment, what are you putting on the right-hand side? A read access from another zero-dimensional array? Ok, so in order to clear up the confusion here, I am going to take a moment to try and explain Noam's proposal in clearer language. Note that I have no opinions about the merits of the proposal itself; however, the lack of understanding here bothers me :) The motivation, as I understand it, is one of mathematical consistency. Let's take a moment and think about arrays in terms of geometry. We all learned in school that 3 dimensions defines a volume, 2 dimensions defines a plane, 1 dimension defines a line, and 0 dimensions defines a point. Moreover, each N-dimensional entity can be converted to one of lower order by setting one of its dimensions to 0. So a volume with one dimension set to zero becomes a plane, and so on. Now think about this with respect to arrays. A 3-dimensional array can be converted into a 2-dimensional array by setting one of its dimensions to 1. So a 5 x 5 array is equivalent to a 5 x 5 x 1 array. Similarly, a 3-dimensional array can be converted into a 1-dimensional array by setting two of its dimensions to 1, so an array of length 5 is equivalent to a 5 x 1 x 1 array. We see, then, a general rule that an N-dimensional array can be reduced to M dimensions by setting (N-M) of its dimensions to 1. So by this rule, if we reduce a 3d array to zero dimensions, we would have an array that has one element: 1 x 1 x 1. Similarly, each time we reduce the dimension by 1, we also reduce the number of indices needed to access the elements of the array. 
So a 3-d array requires 3 coordinates, a 2-d array requires 2 coordinates, and so on. It should be noted that this zero-dimensional array is not exactly a normal scalar. It is a scalar in the sense that it has no dimensions, but it is still an array in the sense that it contains a value which is distinct from the array itself. The zero-dimensional array is still a container of other values; however, it can only hold one value. This is different from a normal scalar, which is simply a value, and not a container. Now, as to the specifics of Noam's problem: Apparently what he is trying to do is what many other people have done, which is to use Python as a base for some other high-level language, building on top of Python syntax and using the various operator overloads to define the semantics of the language. However, what he's discovering is that there are cases where his syntactical requirements and the syntactical rules of Python don't match. Now, typically when this occurs, the person who is creating the language knows that there is a rationale for why that particular syntax makes sense in their language. What they often do in this case is to try and convince the Python community that this rationale also applies to Python in addition to their own made-up language. This is especially the case when the proposed change gives meaning to what would formerly have been an error. (I sometimes suspect that the guiding design principle of Perl is that all possible permutations of ASCII input characters should eventually be assigned some syntactically valid meaning.) Historically, I can say that such efforts are almost always rebuffed - while Python may be good at being a base for other languages, this is not one of the primary design goals of the language as I understand it. My advice to people in this situation is to consider that perhaps some level of translation between their syntax and Python syntax may be in order. 
It would not be hard for the interactive interpreter to convert instances of [] into [()], for example. -- Talin
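A toy container makes the distinction concrete: a zero-dimensional 'array' holds exactly one value and is indexed by zero coordinates, i.e. by the empty tuple, which is where the x[()] spelling comes from. The class name and shape are invented for illustration:

```python
class ZeroDim:
    """A container of exactly one value, indexed by zero coordinates."""

    def __init__(self, value):
        self._value = value

    def __getitem__(self, index):
        # x[()] passes the empty tuple: zero indices for zero dimensions.
        if index != ():
            raise IndexError("a zero-dimensional array takes no indices")
        return self._value

    def __setitem__(self, index, value):
        if index != ():
            raise IndexError("a zero-dimensional array takes no indices")
        self._value = value
```

With `x = ZeroDim(5)`, `x[()]` reads the contained value and `x[()] = 7` replaces it in place - something a bare scalar `5` cannot do, which is the container-vs-value distinction drawn above.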
Re: [Python-Dev] Switch statement
Phillip J. Eby wrote: As has already been pointed out, this 1) adds function call overhead, 2) doesn't allow changes to variables in the containing function, and 3) even if we had a rebinding operator for free variables, we would have the overhead of creating closures. The lambda syntax does nothing to fix any of these problems, and you can already use a mapping of closures if you are so inclined. However, you'll probably find that the cost of creating the dictionary of closures exceeds the cost of a naive sequential search using if/elif. This brings me back to my earlier point - I wonder if it would make sense for Python to have a non-closure form of lambda - essentially an old-fashioned subroutine:

    def foo( x ):
        x = 0
        sub bar:     # Arguments are not allowed, since they create a scope
            x = y    # Writes over the x defined in 'foo'
        bar()

The idea is that 'bar' would share the same scope as 'foo'. To keep the subroutine lightweight (i.e. just a single jump and return instruction in the virtual machine) arguments would not be allowed. -- Talin
Re: [Python-Dev] Switch statement
Greg Ewing wrote: [EMAIL PROTECTED] wrote:

    switch raw_input('enter a, b or c: '):
        case 'a':
            print 'yay! an a!'
        case 'b':
            print 'yay! a b!'
        case 'c':
            print 'yay! a c!'
        else:
            print 'hey dummy! I said a, b or c!'

Before accepting this, we could do with some debate about the syntax. It's not a priori clear that C-style switch/case is the best thing to adopt. Since you don't have the 'fall-through' behavior of C, I would also assume that you could associate more than one value with a case, i.e.: case 'a', 'b', 'c': ... It seems to me that the value of a 'switch' statement is that it is a computed jump - that is, instead of having to iteratively test a bunch of alternatives, you can directly jump to the code for a specific value. I can see this being very useful for parser generators and state machine code. At the moment, similar things can be done with hash tables of functions, but those have a number of limitations, such as the fact that they can't access local variables. I don't have any specific syntax proposals, but I notice that the suite that follows the switch statement is not a normal suite, but a restricted one, and I am wondering if we could come up with a syntax that avoids having a special suite. Here's an (ugly) example, not meant as a serious proposal:

    select (x)
        when 'a': ...
        when 'b', 'c': ...
        else: ...

The only real difference between this and an if-else chain is that the compiler knows that all of the test expressions are constants and can be hashed at compile time. -- Talin
Re: [Python-Dev] Switch statement
[EMAIL PROTECTED] wrote: talin Since you don't have the 'fall-through' behavior of C, I would talin also assume that you could associate more than one value with a talin case, i.e.: talin case 'a', 'b', 'c': talin... As Andrew Koenig pointed out, that's not discussed in the PEP. Given the various examples though, I would have to assume the above is equivalent to case ('a', 'b', 'c'): ... I had recognized that ambiguity as well, but chose not to mention it :) since in all cases the PEP implies a single expression. talin It seems to me that the value of a 'switch' statement is that it talin is a computed jump - that is, instead of having to iteratively talin test a bunch of alternatives, you can directly jump to the code talin for a specific value. I agree, but that of course limits the expressions to constants which can be evaluated at compile-time as I indicated in my previous mail. Also, as someone else pointed out, that probably prevents something like

    START_TOKEN = ''
    END_TOKEN = ''
    ...
    switch expr:
        case START_TOKEN: ...
        case END_TOKEN: ...

Here's another ugly thought experiment, not meant as a serious proposal; its intent is to stimulate ideas by breaking preconceptions. Suppose we take the notion of a computed jump literally:

    def myfunc( x ):
        goto dispatcher[ x ]
        section s1:
            ...
        section s2:
            ...

    dispatcher = { 'a': myfunc.s1, 'b': myfunc.s2 }

No, I am *not* proposing that Python add a goto statement. What I am really talking about is the idea that you could (somehow) use a dictionary as the input to a control construct. In the above example, rather than allowing arbitrary constant expressions as cases, we would require the compiler to generate a set of opaque tokens representing various code fragments. These fragments would be exactly like inner functions, except that they don't have their own scope (and therefore have no parameters either). Since the jump labels are symbols generated by the compiler, there's no ambiguity about when they get evaluated. 
The above example also allows these labels to be accessed externally from the function by defining attributes on the function object itself which correspond to the code fragments. So in the example, the dictionary which associates specific values with executable sections is created once, at runtime, but before the first time that myfunc is called. Of course, this is quite a bit clumsier than a switch statement, which is why I say it's not a serious proposal. -- Talin
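The attribute half of this thought experiment already works with plain functions, since function objects accept arbitrary attributes. A sketch with invented names - the "sections" here are ordinary functions with their own scope, which is exactly the part the proposal would change:

```python
def myfunc(x):
    # Computed jump through a table attached to the function object.
    return myfunc.dispatch[x]()

def _s1():
    return "section one"

def _s2():
    return "section two"

# Attached once, at runtime, before the first call to myfunc - mirroring
# the 'created once, but before the first call' property described above.
myfunc.dispatch = {"a": _s1, "b": _s2}
```

`myfunc("a")` returns "section one"; what this emulation cannot do is let `_s1` and `_s2` read or write `myfunc`'s locals, which is the limitation the compiler-generated labels were meant to remove.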
Re: [Python-Dev] [Python-3000] stdlib reorganization
A.M. Kuchling wrote: On Tue, May 30, 2006 at 03:36:02PM -0600, Steven Bethard wrote: That sounds about reasonable. One possible grouping: Note that 2.5's library reference has a different chapter organization from 2.4's. See http://docs.python.org/dev/lib/lib.html. I like it. It's a much cleaner organization than the 2.4 libs. I would like to see it used as a starting point for a reorg of the standard lib namespace. One question that is raised is whether the categories should map directly to package names in all cases. For example, I can envision a desire that 'sys' would stay a top-level name, rather than 'rt.sys'. Certain modules are so fundamental that they deserve IMHO to live in the root namespace. -- Talin
Re: [Python-Dev] PEP 3101 Update
Guido van Rossum wrote: On 5/6/06, Talin [EMAIL PROTECTED] wrote: I've updated PEP 3101 based on the feedback collected so far. [http://www.python.org/dev/peps/pep-3101/] I think this is a step in the right direction. Cool, and thanks for the very detailed feedback. I wonder if we shouldn't borrow more from .NET. I read this URL that you referenced: http://msdn.microsoft.com/library/en-us/cpguide/html/cpconcompositeformatting.asp They have special syntax to support field width, e.g. {0,10} formats item 0 in a field of (at least) 10 positions wide, right-justified; {0,-10} does the same left-aligned. This is done independently from We already have that now, don't we? If you look at the docs for String Formatting Operations in the library reference, it shows that a negative sign on a field width indicates left justification. the type-specific formatting. (I'm not proposing that we use .NET's format specifiers after the colon, but I'm also no big fan for keeping the C specific stuff we have now; we should put some work in designing something with the same power as the current %-based system for floats and ints, that would cover it.) Agreed. As you say, the main work is in handling floats and ints, and everything else can either be formatted as plain str(), or use a custom format specifier syntax (as in my strftime example.) .NET's solution for quoting { and } as {{ and }} respectively also sidesteps the issue of how to quote \ itself -- since '\\{' is a 2-char string containing one \ and one {, you'd have to write either '{0}' or r'\\{0}' to produce a single literal \ followed by formatted item 0. Any time there's the need to quadruple a backslash I think we've lost the battle. (Or you might search the web for Tcl quoting hell. :-) I'm fine with not having a solution for doing variable substitution within the format parameters. 
That could be done instead by building up the format string with an extra formatting step: instead of {x:{y}}.format(x=whatever, y=3) you could write {{x,{y}}}.format(y=3).format(x=whatever). (Note that this is subtle: the final }}} are parsed as } followed by }}. Once the parser has seen a single {, the first } it sees is the matching closing } and adding another } after it won't affect it. The specifier cannot contain { or } at all. There is another solution to this which is equally subtle, although fairly straightforward to parse. It involves defining the rules for escapes as follows: '{{' is an escaped '{' '}}' is an escaped '}', unless we are within a field. So you can write things like {0:10,{1}}, and the final '}}' will be parsed as two separate closing brackets, since we're within a field definition. From a parsing standpoint, this is unambiguous, however I've held off on suggesting it because it might appear to be ambiguous to a casual reader. I like having a way to reuse the format parsing code while substituting something else for the formatting itself. The PEP appears silent on what happens if there are too few or too many positional arguments, or if there are missing or unused keywords. Missing ones should be errors; I'm not sure about redundant (unused) ones. On the one hand complaining about those gives us more certainty that the format string is correct. On the other hand there are some use cases for passing lots of keyword parameters (e.g. simple web templating could pass a fixed set of variables using **dict). Even in i18n (translation) apps I could see the usefulness of allowing unused parameters I am undecided on this issue as well, which is the reason that it's not mentioned in the PEP (yet). On the issue of {a.b.c}: like several correspondents, I don't like the ambiguity of attribute vs. key refs much, even though it appears useful enough in practice in web frameworks I've used. 
It seems to violate the Zen of Python: In the face of ambiguity, refuse the temptation to guess. Unfortunately I'm pretty lukewarm about the proposal to support {a[b].c} since b is not a variable reference but a literal string 'b'. It is also relatively cumbersome to parse. I wish I could propose {a+b.c} for this case but that's so arbitrary... Actually, it's not all that hard to parse, especially given that there is no need to deal with the 'nested' case. I will be supplying a Python implementation of the parser along with the PEP. What I would prefer not to supply (although I certainly can if you feel it's necessary) is an optimized C implementation of the same parser, as well as the implementations of the various type-specific formatters. Even more unfortunately, I expect that dict key access is a pretty important use case so we'll have to address it somehow. I *don't* think there's an important use case for the ambiguity -- in any particular situation I expect that the programmer will know whether they are expecting a dict or an object with attributes. Hm
Re: [Python-Dev] PEP 3101 Update
Guido van Rossum wrote: [on escaping] There is another solution to this which is equally subtle, although fairly straightforward to parse. It involves defining the rules for escapes as follows: '{{' is an escaped '{' '}}' is an escaped '}', unless we are within a field. So you can write things like {0:10,{1}}, and the final '}}' will be parsed as two separate closing brackets, since we're within a field definition. From a parsing standpoint, this is unambiguous, however I've held off on suggesting it because it might appear to be ambiguous to a casual reader. Sure. But I still think there isn't enough demand for variable expansion *within* a field to bother. When's the last time you used a * in a % format string? And how essential was it? True. I'm mainly trying to avoid excess debate by not dropping existing features unnecessarily. (Otherwise, you spend way too much time arguing with the handful of people out there that do rely on that use case.) But if you want to use your special BDFL superpower to shortcut the debate, I'm fine with that :) BTW I think we should move this back to the py3k list -- the PEP is 3101 after all. That should simplify the PEP a bit because it no longer has to distinguish between str and unicode. If we later decide to backport it to 2.6 it should be easy enough to figure out what to do with str vs. unicode (probably the same as we do for %). All right; Although my understanding is that the PEP should be escalated to c.l.p at some point before acceptance, and I figured py-dev would be a reasonable intermediate point before that. But it sounds like 3101 is going to go back into the shop for the moment, so that's a non-issue. Since you seem to be in a PEP-review mode, could you have a look at 3102? In particular, it seems that all of the controversies on that one have quieted down; Virtually everyone seems in favor of the first part, and you have already ruled in favor of the second part. 
So I am not sure that there is anything more to discuss. Perhaps I should go ahead and put 3102 on c.l.p at this point. -- Talin
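For the record, the nested-specifier question debated above was settled in the str.format that eventually shipped: exactly one level of nesting is allowed inside the format spec (and brace escaping settled on doubling, '{{' and '}}', rather than backslashes), so width and precision can be supplied as arguments:

```python
# One level of nesting in the format spec, as in the {0:10,{1}} examples.
s = "{0:{1}}".format("hi", 10)      # field width taken from argument 1
p = "{0:.{1}f}".format(3.14159, 2)  # precision taken from argument 1
```

Here `s` is "hi" left-aligned in a 10-character field, and `p` is the string "3.14".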
Re: [Python-Dev] PEP 3101 Update
Steven Bethard steven.bethard at gmail.com writes: I'm still not a big fan of mixing together getitem-style access and getattribute-style access. That makes classes that support both ambiguous in this context. You either need to specify the order in which these are checked (e.g. attribute then item or item then attribute), or, preferably, you need to extend the syntax to allow getitem-style access too. Just to be clear, I'm not suggesting that you support anything more than items and attributes. So this is *not* a request to allow arbitrary expressions. In fact, the only use-case I see in the PEP needs only item access, not attribute access, so maybe you could drop attribute access? Can't you just extend the syntax for *only* item access? E.g. something like: "My name is {0[name]} :-\{\}".format(dict(name='Fred')) I'm not opposed to the idea of adding item access, although I believe that attribute access is also useful. In either case, it's not terribly hard to implement. I'd like to hear what other people have to say on this issue. -- Talin
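As it happens, the format() that shipped supports both, with syntactically distinct spellings - brackets for item access, a dot for attribute access - which resolves the ambiguity concern:

```python
from types import SimpleNamespace

# Item access: the bracketed name is looked up with __getitem__.
s1 = "My name is {0[name]}".format({"name": "Fred"})

# Attribute access: the dotted name is looked up with getattr.
s2 = "My name is {0.name}".format(SimpleNamespace(name="Fred"))
```

Both produce "My name is Fred", and a class supporting both protocols is no longer ambiguous because the format string says which lookup to use.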
Re: [Python-Dev] PEP 3101 Update
Steven Bethard steven.bethard at gmail.com writes: I believe the proposal is taking advantage of the fact that '\{' is not interpreted as an escape sequence -- it is interpreted as a literal backslash followed by an open brace: This is exactly correct. -- Talin
Re: [Python-Dev] PEP 3101 Update
Michael Chermside mcherm at mcherm.com writes: One small comment: The conversion specifier consists of a sequence of zero or more characters, each of which can consist of any printable character except for a non-escaped '}'. Escaped? How are they escaped? (with '\'?) If so, how are backslashes escaped (with '\\'?) And does the mechanism un-escape these for you before passing them to __format__? Later... - Variable field width specifiers use a nested version of the {} syntax, allowing the width specifier to be either a positional or keyword argument: "{0:{1}.{2}d}".format(a, b, c) This violates the specification given above... it has non-escaped '}' characters. Make up one rule and be consistent. What would you suggest? I'd be interested in hearing what kinds of ideas people have for fixing these problems. - -- Talin
[Python-Dev] PEP 3101 Update
I've updated PEP 3101 based on the feedback collected so far. - PEP: 3101 Title: Advanced String Formatting Version: $Revision: 45928 $ Last-Modified: $Date: 2006-05-06 18:49:43 -0700 (Sat, 06 May 2006) $ Author: Talin talin at acm.org Status: Draft Type: Standards Content-Type: text/plain Created: 16-Apr-2006 Python-Version: 3.0 Post-History: 28-Apr-2006, 6-May-2006 Abstract This PEP proposes a new system for built-in string formatting operations, intended as a replacement for the existing '%' string formatting operator. Rationale Python currently provides two methods of string interpolation: - The '%' operator for strings. [1] - The string.Template module. [2] The scope of this PEP will be restricted to proposals for built-in string formatting operations (in other words, methods of the built-in string type). The '%' operator is primarily limited by the fact that it is a binary operator, and therefore can take at most two arguments. One of those arguments is already dedicated to the format string, leaving all other variables to be squeezed into the remaining argument. The current practice is to use either a dictionary or a tuple as the second argument, but as many people have commented [3], this lacks flexibility. The all or nothing approach (meaning that one must choose between only positional arguments, or only named arguments) is felt to be overly constraining. While there is some overlap between this proposal and string.Template, it is felt that each serves a distinct need, and that one does not obviate the other. In any case, string.Template will not be discussed here. Specification The specification will consist of 4 parts: - Specification of a new formatting method to be added to the built-in string class. - Specification of a new syntax for format strings. - Specification of a new set of class methods to control the formatting and conversion of objects. - Specification of an API for user-defined formatting classes. 
String Methods The built-in string class will gain a new method, 'format', which takes an arbitrary number of positional and keyword arguments: 'The story of {0}, {1}, and {c}'.format(a, b, c=d) Within a format string, each positional argument is identified with a number, starting from zero, so in the above example, 'a' is argument 0 and 'b' is argument 1. Each keyword argument is identified by its keyword name, so in the above example, 'c' is used to refer to the third argument. The result of the format call is an object of the same type (string or unicode) as the format string. Format Strings Brace characters ('curly braces') are used to indicate a replacement field within the string: 'My name is {0}'.format('Fred') The result of this is the string: My name is Fred Braces can be escaped using a backslash: 'My name is {0} :-\{\}'.format('Fred') Which would produce: My name is Fred :-{} The element within the braces is called a 'field'. Fields consist of a 'field name', which can either be simple or compound, and an optional 'conversion specifier'. Simple field names are either names or numbers. If numbers, they must be valid base-10 integers; if names, they must be valid Python identifiers. A number is used to identify a positional argument, while a name is used to identify a keyword argument. Compound names are a sequence of simple names separated by periods: 'My name is {0.name} :-\{\}'.format(dict(name='Fred')) Compound names can be used to access specific dictionary entries, array elements, or object attributes. In the above example, the '{0.name}' field refers to the dictionary entry 'name' within positional argument 0. Each field can also specify an optional set of 'conversion specifiers' which can be used to adjust the format of that field. 
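A quick sketch of the method as it eventually shipped (Python 2.6+). Note that brace escaping ended up using doubled braces rather than the backslash escapes shown in this draft of the PEP:

```python
# Positional and keyword arguments mixed in one format string:
s = 'The story of {0}, {1}, and {c}'.format('a', 'b', c='d')
print(s)  # → 'The story of a, b, and d'

# Brace escaping in the final design uses doubled braces,
# not the backslash escapes proposed in this draft:
print('My name is {0} :-{{}}'.format('Fred'))  # → 'My name is Fred :-{}'
```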
Conversion specifiers follow the field name, with a colon (':') character separating the two: 'My name is {0:8}'.format('Fred') The meaning and syntax of the conversion specifiers depend on the type of object that is being formatted; however, many of the built-in types will recognize a standard set of conversion specifiers. The conversion specifier consists of a sequence of zero or more characters, each of which can consist of any printable character except for a non-escaped '}'. Conversion specifiers can themselves contain replacement fields; this will be described in a later section. Except for this replacement, the format() method does not attempt to interpret the conversion specifiers in any way; it merely passes all of the characters between
Re: [Python-Dev] lambda in Python
xahlee xah at xahlee.org writes: Today I ran into one of Guido van Rossum's blog articles, titled “Language Design Is Not Just Solving Puzzles” at http://www.artima.com/weblogs/viewpost.jsp?thread=147358 The confrontational tone of this post makes it pretty much impossible to have a reasonable debate on the subject. I'd suggest that if you really want to be heard (instead of merely having that I'm right feeling) that you try a different approach. -- Talin
Re: [Python-Dev] mail to talin is bouncing
Guido van Rossum guido at python.org writes: Sorry to bother the list -- talin, mail to you is bouncing: Someone sent me mail? Cool! :) Sorry about that, I'm in the process of migrating hosting providers, and I forgot to add an email account for myself :) It should be better now, I'll do some more tests tonight. (This is all part of my mad scheme to get my TurboGears/AJAX-based online collaborative Thesaurus project available to the outside world.) -- Talin
[Python-Dev] PEP 3101: Advanced String Formatting
PEP: 3101 Title: Advanced String Formatting Version: $Revision$ Last-Modified: $Date$ Author: Talin talin at acm.org Status: Draft Type: Standards Content-Type: text/plain Created: 16-Apr-2006 Python-Version: 3.0 Post-History: Abstract This PEP proposes a new system for built-in string formatting operations, intended as a replacement for the existing '%' string formatting operator. Rationale Python currently provides two methods of string interpolation: - The '%' operator for strings. [1] - The string.Template module. [2] The scope of this PEP will be restricted to proposals for built-in string formatting operations (in other words, methods of the built-in string type). The '%' operator is primarily limited by the fact that it is a binary operator, and therefore can take at most two arguments. One of those arguments is already dedicated to the format string, leaving all other variables to be squeezed into the remaining argument. The current practice is to use either a dictionary or a tuple as the second argument, but as many people have commented [3], this lacks flexibility. The all or nothing approach (meaning that one must choose between only positional arguments, or only named arguments) is felt to be overly constraining. While there is some overlap between this proposal and string.Template, it is felt that each serves a distinct need, and that one does not obviate the other. In any case, string.Template will not be discussed here. Specification The specification will consist of 4 parts: - Specification of a set of methods to be added to the built-in string class. - Specification of a new syntax for format strings. - Specification of a new set of class methods to control the formatting and conversion of objects. - Specification of an API for user-defined formatting classes. 
String Methods The built-in string class will gain a new method, 'format', which takes an arbitrary number of positional and keyword arguments: 'The story of {0}, {1}, and {c}'.format(a, b, c=d) Within a format string, each positional argument is identified with a number, starting from zero, so in the above example, 'a' is argument 0 and 'b' is argument 1. Each keyword argument is identified by its keyword name, so in the above example, 'c' is used to refer to the third argument. The result of the format call is an object of the same type (string or unicode) as the format string. Format Strings Brace characters ('curly braces') are used to indicate a replacement field within the string: 'My name is {0}'.format('Fred') The result of this is the string: My name is Fred Braces can be escaped using a backslash: 'My name is {0} :-\{\}'.format('Fred') Which would produce: My name is Fred :-{} The element within the braces is called a 'field'. Fields consist of a name, which can either be simple or compound, and an optional 'conversion specifier'. Simple names are either names or numbers. If numbers, they must be valid decimal numbers; if names, they must be valid Python identifiers. A number is used to identify a positional argument, while a name is used to identify a keyword argument. Compound names are a sequence of simple names separated by periods: 'My name is {0.name} :-\{\}'.format(dict(name='Fred')) Compound names can be used to access specific dictionary entries, array elements, or object attributes. In the above example, the '{0.name}' field refers to the dictionary entry 'name' within positional argument 0. Each field can also specify an optional set of 'conversion specifiers'. 
Conversion specifiers follow the field name, with a colon (':') character separating the two: 'My name is {0:8}'.format('Fred') The meaning and syntax of the conversion specifiers depend on the type of object that is being formatted; however, many of the built-in types will recognize a standard set of conversion specifiers. The conversion specifier consists of a sequence of zero or more characters, each of which can consist of any printable character except for a non-escaped '}'. The format() method does not attempt to interpret the conversion specifiers in any way; it merely passes all of the characters between the first colon ':' and the matching right brace ('}') to the various underlying formatters (described later.) Standard Conversion Specifiers For most built-in types, the conversion specifiers will be the same or similar to the existing conversion specifiers used with the '%' operator. Thus, instead of '%02.2x', you will say '{0:2.2x}'. There are a few
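A sketch of how '%'-style specifiers map onto the str.format forms that were finally adopted. (The draft's '{0:2.2x}' form, precision on an integer, did not survive into the released language.)

```python
# '%'-style specifiers alongside their str.format counterparts:
print('%04x' % 255, '{0:04x}'.format(255))          # → '00ff 00ff'
print('%8.3f' % 3.14159, '{0:8.3f}'.format(3.14159))  # → '   3.142    3.142'
```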
Re: [Python-Dev] PEP 3102: Keyword-only arguments
Thomas Wouters thomas at python.org writes: Pfft, implementation is easy. I have the impression Talin wants to implement it himself, but even if he doesn't, I'm sure I'll have a free week somewhere in the next year and a half in which I can implement it :) It's not that hard a problem, it just requires a lot of reading of the AST and function-call code (if you haven't read it already.) If someone wants to implement this, feel free - I have no particular feelings of ownership over this idea. If someone can do it better than I can, then that's the implementation that should be chosen. One suggestion I would have would be to implement the two parts of the PEP (keyword-only arguments vs. the 'naked star' syntax) as two separate patches; while there seems to be relatively widespread support for the former, the latter is still somewhat controversial and will require further discussion. -- Talin
Re: [Python-Dev] PEP 3102: Keyword-only arguments
Zachary Pincus zpincus at stanford.edu writes: That seems a bit odd, as my natural expectation wouldn't be to see kw1 and kw2 as required, no-default keyword args, but as misplaced positional args. Perhaps this might be a little better? def foo(*args, kw1=, kw2=): I'm rather not sure. At least it makes it clear that kw1 and kw2 are keyword arguments, and that they have no default values. Though, I'm kind of neutral on the whole bit -- in my mind keyword args and default-value args are pretty conflated and I can't think of any compelling reasons why they shouldn't be. It's visually handy when looking at some code to see keywords and be able to say ok, those are the optional args for changing the handling of the main args. I'm not sure where the big win with required keyword args is. I think a lot of people conflate the two, because of the similarity in syntax; however, they aren't really the same thing. In a function definition, any argument can be a keyword argument, and the presence of '=' means 'default value', not 'keyword'. I have to admit that the primary reason for including required keyword arguments in the PEP is because they fall out naturally from the implementation. If we remove this from the PEP, then the implementation becomes more complicated - because now you really are assigning a meaning to '=' other than 'default value'. From a user standpoint, I admit the benefits are small, but I don't see that it hurts either. The only use case that I can think of is when you have a function that has some required and some optional keyword arguments, and there is some confusion about the ordering of such arguments. By disallowing the arguments to be positional, you eliminate the possibility of having an argument be assigned to the incorrect parameter slot. For that matter, why not default positional args? I think everyone will agree that seems a bit odd, but it isn't too much odder than required keyword args. (Not that I'm for the former! 
I'm just pointing out that if the latter is OK, there's no huge reason why the former wouldn't be, and that is in my mind a flaw.) Second: def compare(a, b, *, key=None): This syntax seems a bit odd to me, as well. I always understood the *args syntax by analogy to globbing -- the asterisk means to take all the rest, in some sense in both a shell glob and *args. In this syntax, the asterisk, though given a position in the comma-separated list, doesn't mean take the rest and put it in this position. It means stop taking things before this position, which is a bit odd, in terms of items *in* an argument list. I grant that it makes sense as a derivation from *ignore-type solutions, but as a standalone syntax it feels off. How about something like: def compare(a, b; key=None): I wanted the semicolon as well, but was overruled. The current proposal is merely a summary of the output of the discussions so far. -- Talin
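The required-keyword-argument idea debated above did make it into Python 3, though with a plain parameter name rather than the 'kw1=' spelling floated here. A minimal sketch:

```python
# Required keyword-only arguments as ultimately adopted: a keyword-only
# parameter with no default must be supplied by name.
def move(*, dx, dy):
    return (dx, dy)

print(move(dx=1, dy=2))   # → (1, 2)
try:
    move(1, 2)            # positional values are rejected
except TypeError as exc:
    print('TypeError:', exc)
```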
Re: [Python-Dev] PEP 3101: Advanced String Formatting
Zachary Pincus wrote: I'm not sure about introducing a special syntax for accessing dictionary entries, array elements and/or object attributes *within a string formatter*... much less an overloaded one that differs from how these elements are accessed in regular python. Compound names are a sequence of simple names separated by periods: 'My name is {0.name} :-\{\}'.format(dict(name='Fred')) Compound names can be used to access specific dictionary entries, array elements, or object attributes. In the above example, the '{0.name}' field refers to the dictionary entry 'name' within positional argument 0. Barring ambiguity about whether .name would mean the name attribute or the name dictionary entry if both were defined, I'm not sure I really see the point. How is: d = {'last': 'foo', 'first': 'bar'} 'My last name is {0.last}, my first name is {0.first}.'.format(d) really that big a win over: d = {'last': 'foo', 'first': 'bar'} 'My last name is {0}, my first name is {1}.'.format(d['last'], d['first']) At one point I had intended to abandon the compound-name syntax, until I realized that it had one beneficial side-effect, which is that it offers a way around the 'dict-copying' problem. There are a lot of cases where you want to pass an entire dict as the format args using the **kwargs syntax. One common use pattern is for debugging code, where you want to print out a bunch of variables that are in the local scope: print 'Source file: {file}, line: {line}, column: {col}'.format(**locals()) The problem with this is one of efficiency - the interpreter handles ** by copying the entire dictionary and merging it with any keyword arguments. Under most situations this is fine; however, if the dictionary is particularly large, it might be a problem. 
So the intent of the compound name syntax is to allow something very similar: print 'Source file: {0.file}, line: {0.line}, column: {0.col}'.format(locals()) Now, it's true that you could also do this by passing in the 3 parameters as individual arguments; however, there have been some strong proponents of being able to pass in a single dict, and rather than restating their points I'll let them argue their own positions (so as not to accidentally mis-state them.) Plus, the in-string syntax is limited -- e.g. what if I want to call a function on an attribute? Unless you want to re-implement all python syntax within the formatters, someone will always be able to level these sort of complaints. Better, IMO, to provide none of that than a restricted subset of the language -- especially if the syntax looks and works differently from real python. The in-string syntax is limited deliberately for security reasons. Allowing arbitrary executable code within a string is supported by a number of other scripting languages, and we've seen a good number of exploits as a result. I chose to support only __getitem__ and __getattr__ because I felt that they would be relatively safe; usually (but not always) those functions are written in a way that has no side effects. -- Talin
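In the released str.format, dictionary entries are reached with index syntax ('{0[key]}') rather than the dotted form discussed here, which became attribute access only. A sketch of the dict-passing pattern as it ended up working:

```python
# Accessing dict entries of a single positional argument, as released:
# index syntax in brackets, no dict copy required.
record = {'file': 'main.py', 'line': 42, 'col': 7}
msg = 'Source file: {0[file]}, line: {0[line]}, column: {0[col]}'.format(record)
print(msg)  # → 'Source file: main.py, line: 42, column: 7'
```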
[Python-Dev] PEP 3102: Keyword-only arguments
PEP: 3102 Title: Keyword-Only Arguments Version: $Revision$ Last-Modified: $Date$ Author: Talin talin at acm.org Status: Draft Type: Standards Content-Type: text/plain Created: 22-Apr-2006 Python-Version: 3.0 Post-History: Abstract This PEP proposes a change to the way that function arguments are assigned to named parameter slots. In particular, it enables the declaration of keyword-only arguments: arguments that can only be supplied by keyword and which will never be automatically filled in by a positional argument. Rationale The current Python function-calling paradigm allows arguments to be specified either by position or by keyword. An argument can be filled in either explicitly by name, or implicitly by position. There are often cases where it is desirable for a function to take a variable number of arguments. The Python language supports this using the 'varargs' syntax ('*name'), which specifies that any 'left over' arguments be passed into the varargs parameter as a tuple. One limitation on this is that currently, all of the regular argument slots must be filled before the vararg slot can be. This is not always desirable. One can easily envision a function which takes a variable number of arguments, but also takes one or more 'options' in the form of keyword arguments. Currently, the only way to do this is to define both a varargs argument, and a 'keywords' argument (**kwargs), and then manually extract the desired keywords from the dictionary. Specification Syntactically, the proposed changes are fairly simple. The first change is to allow regular arguments to appear after a varargs argument: def sortwords(*wordlist, case_sensitive=False): ... This function accepts any number of positional arguments, and it also accepts a keyword option called 'case_sensitive'. This option will never be filled in by a positional argument, but must be explicitly specified by name. Keyword-only arguments are not required to have a default value. 
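The sortwords signature above can be sketched with a hypothetical body (the sorting logic is illustrative, not part of the PEP):

```python
# Any number of positional words, plus a keyword-only option that can
# never be filled in positionally.
def sortwords(*wordlist, case_sensitive=False):
    key = None if case_sensitive else str.lower
    return sorted(wordlist, key=key)

print(sortwords('Banana', 'apple'))                       # → ['apple', 'Banana']
print(sortwords('Banana', 'apple', case_sensitive=True))  # → ['Banana', 'apple']
```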
Since Python requires that all arguments be bound to a value, and since the only way to bind a value to a keyword-only argument is via keyword, such arguments are therefore 'required keyword' arguments. Such arguments must be supplied by the caller, and they must be supplied via keyword. The second syntactical change is to allow the argument name to be omitted for a varargs argument: def compare(a, b, *, key=None): ... The reasoning behind this change is as follows. Imagine for a moment a function which takes several positional arguments, as well as a keyword argument: def compare(a, b, key=None): ... Now, suppose you wanted to have 'key' be a keyword-only argument. Under the above syntax, you could accomplish this by adding a varargs argument immediately before the keyword argument: def compare(a, b, *ignore, key=None): ... Unfortunately, the 'ignore' argument will also suck up any erroneous positional arguments that may have been supplied by the caller. Given that we'd prefer any unwanted arguments to raise an error, we could do this: def compare(a, b, *ignore, key=None): if ignore: # If ignore is not empty raise TypeError As a convenient shortcut, we can simply omit the 'ignore' name, meaning 'don't allow any positional arguments beyond this point'. Function Calling Behavior The previous section describes the difference between the old behavior and the new. However, it is also useful to have a description of the new behavior that stands by itself, without reference to the previous model. So this next section will attempt to provide such a description. When a function is called, the input arguments are assigned to formal parameters as follows: - For each formal parameter, there is a slot which will be used to contain the value of the argument assigned to that parameter. - Slots which have had values assigned to them are marked as 'filled'. Slots which have no value assigned to them yet are considered 'empty'. - Initially, all slots are marked as empty. 
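The 'naked star' shortcut described above behaves as follows in the syntax that was eventually adopted (the comparison body is a hypothetical illustration):

```python
# Bare '*': equivalent to the *ignore form, but surplus positional
# arguments raise TypeError automatically instead of being collected.
def compare(a, b, *, key=None):
    if key is not None:
        a, b = key(a), key(b)
    return (a > b) - (a < b)   # -1, 0, or 1

print(compare(1, -2, key=abs))   # → -1, since abs(1) < abs(-2)
try:
    compare(1, -2, abs)          # key passed positionally
except TypeError:
    print('rejected')
```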
- Positional arguments are assigned first, followed by keyword arguments. - For each positional argument: o Attempt to bind the argument to the first unfilled parameter slot. If the slot is not a vararg slot, then mark the slot as 'filled'. o If the next unfilled slot is a vararg slot, and it does not have a name, then it is an error. o Otherwise, if the next unfilled slot is a vararg slot then all remaining non-keyword
Re: [Python-Dev] Dropping __init__.py requirement for subpackages
Guido van Rossum guido at python.org writes: The requirement that a directory must contain an __init__.py file before it is considered a valid package has always been controversial. It's designed to prevent the existence of a directory with a common name like time or string from preventing fundamental imports to work. But the feature often trips newbie Python programmers (of which there are a lot at Google; at our current growth rate we're probably training more new Python programmers than any other organization worldwide). Might I suggest an alternative? I too find it cumbersome to have to litter my directory tree with __init__.py files. However, rather than making modules be implicit (Explicit is better than implicit), I would rather see a more powerful syntax for declaring modules. Specifically, what I would like to see is a way for a top-level __init__.py to explicitly list which subdirectories are modules. Rather than spreading that information over a bunch of different __init__.py files, I would much rather have the information be centralized in a single text file for the whole package. Just as we have an __all__ variable that indicates which symbols are to be exported, we could have a '__submodules__' array which explicitly calls out the list of submodule directory names. Or perhaps more simply, just have some code in the top-level __init__.py that creates (but does not load) the module objects for the various sub-modules. The presence of __init__.py could perhaps also indicate the root of a *standalone* module tree; submodules that don't have their own __init__.py, but which are declared indirectly via an ancestor, are assumed by convention to be 'component' modules which are not intended to operate outside of their parent. (In other words, the presence of the __init__.py signals that that tree is separately exportable.) I'm sure that someone who is familiar with the import machinery could whip up something like this in a matter of minutes. 
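A hypothetical sketch of the '__submodules__' idea: a top-level __init__.py registering empty module objects for declared subdirectories. The package name 'mypkg' and the whole mechanism are illustrative only; nothing like this was adopted (PEP 420 namespace packages later removed the __init__.py requirement by a different route):

```python
# Hypothetical: in a real package this would live in mypkg/__init__.py
# and use __name__ instead of the hard-coded PACKAGE.
import sys
import types

PACKAGE = 'mypkg'                      # stand-in for __name__
__submodules__ = ['parser', 'codegen']  # declared submodule directories

for name in __submodules__:
    fullname = PACKAGE + '.' + name
    if fullname not in sys.modules:
        # Create the module object, but do not load any code for it.
        sys.modules[fullname] = types.ModuleType(fullname)
```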
As far as the compatibility with tools argument goes, I say, break em :) Those tool programmers need a hobby anyway :) -- Talin
[Python-Dev] PEP 355 (object-oriented paths)
I didn't have a chance to comment earlier on the Path class PEP, but I'm dealing with an analogous situation at work and I'd like to weigh in on it. The issue is - should paths be specialized objects or regular strings? PEP 355 does an excellent job, I think, of presenting the case for paths being objects. And I certainly think that it would be a good idea to clean up some of the path-related APIs that are in the stdlib now. However, all that being said, I'd like to make some counter-arguments. There are a lot of programming languages out there today that have custom path classes, while many others use strings. A particularly useful comparison is Java vs. C#, two languages which have many aspects in common, but take diametrically opposite approaches to the handling of paths. Java uses a Path class, while C# uses strings as paths, and supplies a set of function-based APIs for manipulating them. Having used both languages extensively, I think that I prefer strings. One of the main reasons for this is that the majority of path manipulations are single operations - such as, take a base path and a relative path and combine them. Although you do occasionally see cases where a complex series of operations is performed on a path, such cases tend to be (a) rare, and (b) not handled well by the standard API, in the sense that what is being done to the path is not something that was anticipated by the person who wrote the path API in the first place. Given that the ultimate producers and consumers of paths (that is, the filesystem APIs, the input fields of dialog boxes, the argv array) know nothing about Path objects, the question is, is it worth converting to a special object and back again just to do a simple concatenate? I think that I would prefer to have a nice orthogonal set of path manipulation functions. Another reason why I am a bit dubious about a class-based approach is that it tends to take anything that is related to a filepath and lump them into a single module. 
I think that there are some fairly good reasons why different path-related functions should be in different modules. For example, one thing that irks me (and others) about the Path class in Java is that it makes no distinction between methods that are merely textual conversions, and methods which actually go out and touch the disk. I would rather that functions that invoke filesystem activity be partitioned away from functions that merely involve string manipulation. Creating a tempfile, for example, or determining whether a file is writeable should not be in the same bucket as determining the file extension, or whether a path is relative or absolute. What I would like to see, instead, is for the various path-related functions to be organized into a clear set of categories. For example, if os.path is the module for pure operations on paths, without reference to the filesystem, then the current path separator character should be a member of that module, not the os module. -- Talin
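The string-based, purely textual operations argued for above can be sketched with the stdlib's posixpath module (used here instead of os.path so the results don't depend on the host platform's separator):

```python
# Pure string manipulations on paths: none of these touch the filesystem.
import posixpath

p = posixpath.join('/usr/local', 'etc/mime.types')  # simple combine
print(p)                                            # → '/usr/local/etc/mime.types'
print(posixpath.isabs(p))                           # → True
print(posixpath.splitext('archive.tar.gz'))         # → ('archive.tar', '.gz')
```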
[Python-Dev] Slightly OT: Replying to posts
Just a quick question about the mechanics of replying to this list. I am a subscriber to the list, however I much prefer reading the list archives on the web instead of having the postings delivered to my email account. Because of this, I have delivery turned off in the mailing list preferences. I particularly dislike the idea of wasting bandwidth and disk space on something that I am not going to read. However, I would like to be able to reply to posts in such a way as to have them appear in the appropriate place in the thread hierarchy. Since I don't have the original email, I cannot reply to it directly; instead I have to create a new, non-reply email and send it to the list. Simply editing the subject line to put Re: subject would seem to be insufficient. Does anyone have a trick for managing this? Or is there a FAQ that someone can point me to that addresses this issue? -- Talin
Re: [Python-Dev] Adventures with ASTs - Inline Lambda
[EMAIL PROTECTED] wrote: talin ... whereas with 'given' you can't be certain when to stop talin parsing the argument list. So require parens around the arglist: (x*y given (x, y)) Skip I would not be opposed to mandating the parens, and it's an easy enough change to make. The patch on SF lets you do it both ways, which will give people who are interested a chance to get a feel for the various alternatives. I realize of course that this is a moot point. But perhaps I can help to winnow down the dozens of rejected lambda replacement proposals to just a few rejected lambda proposals :) -- Talin
Re: [Python-Dev] Adventures with ASTs - Inline Lambda
All right, the patch is up on SF. Sorry for the delay, I accidentally left my powerbook about an hour's drive away from home, and had to drive to go get it this morning :) To those who were asking what advantage the new syntax has - well, from a technical perspective there are none, since the underlying implementation is identical. The only (minor) difference is in the syntactical ambiguity, which both forms have - with lambda you can't be certain when to stop parsing the result expression, whereas with 'given' you can't be certain when to stop parsing the argument list. I see the primary advantage of the inline syntax as pedagogical - given a choice, I would rather explain the given syntax to a novice programmer than try to explain lambda. This is especially true given the similarity in form to generator expressions - in other words, once you've gone through the effort of explaining generator expressions, you can re-use most of that explanation when explaining function expressions; whereas with lambda, which looks like nothing else in Python, you have to start from scratch. -- Talin
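For reference, the two spellings under discussion side by side; the 'given' form is hypothetical and was never adopted, so it appears only as a comment:

```python
# scale = (x * y given (x, y))   # proposed inline syntax (never adopted)
scale = lambda x, y: x * y       # the existing lambda spelling
print(scale(3, 4))               # → 12

# The pedagogical parallel drawn with generator expressions:
squares = (n * n for n in range(4))
print(list(squares))             # → [0, 1, 4, 9]
```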