Re: [Python-Dev] Documentation idea
From: "Doug Hellmann" <[EMAIL PROTECTED]>

> This seems like a large undertaking.

Not necessarily. It can be done incrementally, starting with things like
str.split() that almost no one understands completely. It should be put
here and there where it adds some clarity.

> I'm sure you're not underestimating the effort, but I have the sense that
> you may be overestimating the usefulness of the results (or maybe I'm
> underestimating them through some lack of understanding). Would it be
> more optimal (in terms of both effort and results) to extend the existing
> documentation and/or docstrings with examples that use all of the
> functions so developers can see how to call them and what results to
> expect?

The idea includes pure Python code augmented by doctestable docstrings with
enough examples. So, we're almost talking about the same thing. There is
one difference: since the new attribute is guaranteed to be executable, it
can be reliably run through doctest. The same is *not* true for arbitrary
docstrings.

Raymond
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
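A minimal sketch of the point about executability, assuming the proposed attribute is simply a string of source code. The names PYTHON_SOURCE and check_source are hypothetical and not part of any proposal in this thread; the source text is the all() example posted at the start of the thread. Because the string can be exec'd, its doctests can be run mechanically:

```python
import doctest

# Hypothetical content of an exec-able attribute such as all.python.
PYTHON_SOURCE = '''\
def all(iterable):
    """Return True if all elements of the iterable are true.

    >>> all(isinstance(x, int) for x in [2, 4, 6.13, 8])
    False
    >>> all(isinstance(x, int) for x in [2, 4, 6, 8])
    True
    """
    for element in iterable:
        if not element:
            return False
    return True
'''


def check_source(source, name):
    """Exec the source, then run the doctests in the docstring of `name`."""
    namespace = {}
    exec(source, namespace)      # only possible because the string is exec-able
    func = namespace[name]
    # Parse the docstring directly; the examples run against the exec'd
    # namespace, so they exercise the pure-Python version, not the builtin.
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, namespace, name, None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    results = runner.run(test)
    return results.failed, results.attempted


failed, attempted = check_source(PYTHON_SOURCE, "all")
print(failed, attempted)
```

An arbitrary docstring gives no such guarantee: it may contain prose that merely looks like a session, or examples that depend on names never defined.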
Re: [Python-Dev] Documentation idea
On Oct 16, 2008, at 5:11 PM, Raymond Hettinger wrote:

>> Raymond Hettinger wrote:
>>> * It will assist pypy style projects and other python implementations
>>>   when they have to build equivalents to CPython.
>>>
>>> * Will eliminate confusion about what functions were exactly intended
>>>   to do.
>>>
>>> * Will confer benefits similar to test driven development where the
>>>   documentation and pure python version are developed first and
>>>   doctests gotten to pass, then the C version is created to match.
>>
>> I haven't seen anyone comment about this assertion of "equivalence".
>> Doesn't it strike you as difficult to maintain *two* versions of every
>> function, and ensure they match *exactly*?
>
> Glad you brought this up. My idea is to present rough equivalence in
> unoptimized python that is simple and clear. The goal is to provide
> better documentation where code is more precise than English prose. That
> being said, some subset of the existing tests should be runnable against
> the rough equivalent and the python code should incorporate doctests.
> Running both sets of tests should suffice to maintain the rough
> equivalence.

This seems like a large undertaking. I'm sure you're not underestimating
the effort, but I have the sense that you may be overestimating the
usefulness of the results (or maybe I'm underestimating them through some
lack of understanding). Would it be more optimal (in terms of both effort
and results) to extend the existing documentation and/or docstrings with
examples that use all of the functions so developers can see how to call
them and what results to expect?

Doug
Re: [Python-Dev] Documentation idea
> Raymond Hettinger wrote:
>> * It will assist pypy style projects and other python implementations
>>   when they have to build equivalents to CPython.
>>
>> * Will eliminate confusion about what functions were exactly intended
>>   to do.
>>
>> * Will confer benefits similar to test driven development where the
>>   documentation and pure python version are developed first and
>>   doctests gotten to pass, then the C version is created to match.
>
> I haven't seen anyone comment about this assertion of "equivalence".
> Doesn't it strike you as difficult to maintain *two* versions of every
> function, and ensure they match *exactly*?

Glad you brought this up. My idea is to present rough equivalence in
unoptimized python that is simple and clear. The goal is to provide better
documentation where code is more precise than English prose. That being
said, some subset of the existing tests should be runnable against the
rough equivalent and the python code should incorporate doctests. Running
both sets of tests should suffice to maintain the rough equivalence.

The notion of exact equivalence should be left to PyPy folks who can attest
that the code can get convoluted when you try to simulate exactly when
error checking is performed, read-only behavior for attributes, and making
the stacktraces look the same when there are errors. In contrast, my goal
is an approximation that is executable but highly readable and expository.

My thought is to do this only with tools where it really does enhance the
documentation. The exercise is worthwhile in and of itself. For example,
I'm working on a pure python version of str.split() and quickly determined
that the docs are *still* in error even after many revisions over the years
(the whitespace version does not, in fact, start by stripping whitespace
from both ends). Here's what I have so far:

    def split(s, sep=None, maxsplit=-1):
        """split(S, [sep [,maxsplit]]) -> list of strings

        Return a list of the words in the string S, using sep as the
        delimiter string. If maxsplit is given, at most maxsplit splits
        are done. If sep is not specified or is None, any whitespace
        string is a separator and empty strings are removed from the
        result.

        >>> from collections import namedtuple
        >>> from itertools import product
        >>> s = ' 11 2 333 4 '
        >>> split(s, None)
        ['11', '2', '333', '4']
        >>> n = 8
        >>> for s in product('ab ', repeat=n):
        ...     for maxsplit in range(-2, len(s)+2):
        ...         s = ''.join(s)
        ...         assert s.split(None, maxsplit) == split(s, None, maxsplit), namedtuple('Err', 'str maxsplit result target')(repr(s), maxsplit, split(s,None,maxsplit), s.split(None, maxsplit))
        """
        result = []
        spmode = True       # True while scanning a run of whitespace
        start = 0
        if maxsplit != 0:
            for i, c in enumerate(s):
                if spmode:
                    if not c.isspace():
                        start = i
                        spmode = False
                elif c.isspace():
                    result.append(s[start:i])
                    start = i
                    spmode = True
                    if len(result) == maxsplit:
                        break
        rest = s[start:].lstrip()
        return (result + [rest]) if rest else result

Once I have the cleanest possible, self-explanatory code that passes tests,
I'll improve the variable names and make a more sensible docstring with
readable examples. Surprisingly, it hasn't been a trivial exercise to come
up with an equivalent that corresponds more closely to the way we think
instead of corresponding to the C code -- I want to show *what* it does
more than *how* it does it.

Raymond
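The documentation error mentioned above can be seen with a short check. This is only an illustration of the behavior Raymond describes, not his test; the sample string is an assumption chosen to make the difference visible. If whitespace splitting really began by stripping both ends, the two results below would be equal:

```python
s = "  a b  c  "

# What a "strip both ends, then split" description would predict.
strip_then_split = s.strip().split(None, 1)

# What str.split() does once maxsplit is reached: leading whitespace is
# skipped, but whitespace at the end of the unsplit remainder is kept.
actual = s.split(None, 1)

print(strip_then_split)   # ['a', 'b  c']
print(actual)             # ['a', 'b  c  ']
```

The pure-Python version above reproduces this by applying only lstrip() to the remainder.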
Re: [Python-Dev] Documentation idea
On Thu, Oct 16, 2008 at 11:13 AM, Scott Dial <[EMAIL PROTECTED]> wrote:
> Raymond Hettinger wrote:
>> * It will assist pypy style projects and other python implementations
>>   when they have to build equivalents to CPython.
>>
>> * Will eliminate confusion about what functions were exactly intended
>>   to do.
>>
>> * Will confer benefits similar to test driven development where the
>>   documentation and pure python version are developed first and
>>   doctests gotten to pass, then the C version is created to match.
>
> I haven't seen anyone comment about this assertion of "equivalence".
> Doesn't it strike you as difficult to maintain *two* versions of every
> function, and ensure they match *exactly*?

More time-consuming than difficult. Raymond is currently talking about
things like built-ins and methods on types, which do not change very often.

> The utility to PyPy-style projects is minimized if the two versions
> aren't identical. And while it's possible to say, "the tests say they are
> equivalent, so they are," history is quite clear about people depending
> on "features" that are untested and were unintended side-effects of the
> manner in which something was implemented.

Right, and when we find out that there is a difference, we typically
standardize on a specific version and developers using the bogus semantics
switch.

> I think it would be a dilution of developer man-hours to force them to
> maintain two versions in lock-step, and it significantly adds to the
> burden of writing and reviewing potential bugfixes.

Well, I don't see this applying to every extension module in the stdlib
that does not already have a pure Python equivalent. This view also assumes
that if this position were taken, people would continue to write extension
modules when they are not necessarily needed. If this actually gets people
to write more pure Python code instead of extension modules, I think that
is a plus. And Raymond, more than probably anyone, can address the overhead
he has faced in maintaining both the pure Python version of itertools in
the docs and the extension module.

> While I applaud the idea of documenting C functions in this manner,
> let's not confuse documentation with equivalence. If the standard
> distribution of Python exports the C version, then all bets are off
> whether the Python version is a drop-in replacement (even if the
> buildbots regularly test them).

Well, considering we have not even gotten far enough to actually do this
for the documentation case, I think worrying about equivalence might be
jumping the gun slightly, as it is more work, as you point out, Scott. But
one thing about doing this is it might draw in the various alternative VM
folks to help maintain the Python code. If Jython, IronPython, and/or PyPy
actually use the Python code for themselves then I suspect they would help
with maintenance.

-Brett
Re: [Python-Dev] Documentation idea
Raymond Hettinger wrote:
> * It will assist pypy style projects and other python implementations
>   when they have to build equivalents to CPython.
>
> * Will eliminate confusion about what functions were exactly intended
>   to do.
>
> * Will confer benefits similar to test driven development where the
>   documentation and pure python version are developed first and
>   doctests gotten to pass, then the C version is created to match.

I haven't seen anyone comment about this assertion of "equivalence".
Doesn't it strike you as difficult to maintain *two* versions of every
function, and ensure they match *exactly*? The utility to PyPy-style
projects is minimized if the two versions aren't identical. And while it's
possible to say, "the tests say they are equivalent, so they are," history
is quite clear about people depending on "features" that are untested and
were unintended side-effects of the manner in which something was
implemented.

I think it would be a dilution of developer man-hours to force them to
maintain two versions in lock-step, and it significantly adds to the burden
of writing and reviewing potential bugfixes.

While I applaud the idea of documenting C functions in this manner, let's
not confuse documentation with equivalence. If the standard distribution of
Python exports the C version, then all bets are off whether the Python
version is a drop-in replacement (even if the buildbots regularly test
them). I feel so strongly about this that I think that the consideration of
adding this should be framed /solely/ as a documentation tool and nothing
more.

-Scott

--
Scott Dial
[EMAIL PROTECTED]
[EMAIL PROTECTED]
Re: [Python-Dev] Documentation idea
Raymond Hettinger wrote:
> Bright idea
> -----------
> Let's go one step further and do this just about everywhere and instead
> of putting it in the docs, attach an exec-able string as an attribute to
> our C functions. Further, those pure python examples should include
> doctests so that the user can see a typical invocation and calling
> pattern.
>
> Say we decide to call the attribute something like ".python", then you
> could write something like:
>
>     >>> print(all.python)
>     def all(iterable):
>         '''Return True if all elements of the iterable are true.
>     [...]

+1 from the peanut gallery, with a note: since ipython is a common way for
many to use/learn python interactively, if this is adopted, we'd
*immediately* add to ipython's '?' introspection machinery the ability to
automatically find this information. This way, when people type "all?" or
"all??" we'd fetch the doc and source code.

A minor question inspired by this: would it make sense to split the
docstring part from the code of this .python object? I say this because in
principle, the docstring should be the same as that of the 'parent', and it
would simplify our implementation to eliminate the duplicate printout. The
.python object could always be a special string-like object made from
combining the pure python code with a single docstring, common to the C and
the Python versions, that would remain exec-able.

In any case, details aside, I think this is great and if it comes to pass,
we'll be happy to make it readily accessible to interactive users via
ipython.

Cheers,

f
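One way the "special string-like object" could look, purely as a sketch: the class name PythonSource and its code/doc attributes are hypothetical, and the splice assumes the code starts with a single-line "def ...:" header. The object is a str, so it stays exec-able, while a tool such as ipython could print the docstring and the code separately:

```python
class PythonSource(str):
    """Hypothetical exec-able string that keeps code and docstring apart."""

    def __new__(cls, code, doc):
        # Splice the docstring in right after the "def ...:" header line.
        header, _, body = code.partition("\n")
        combined = "\n".join([header, "    " + repr(doc), body])
        self = super().__new__(cls, combined)
        self.code = code    # pure-Python code without the docstring
        self.doc = doc      # the single docstring shared with the C version
        return self


DOC = "Return True if all elements of the iterable are true."
CODE = (
    "def all(iterable):\n"
    "    for element in iterable:\n"
    "        if not element:\n"
    "            return False\n"
    "    return True\n"
)

source = PythonSource(CODE, DOC)

namespace = {}
exec(source, namespace)          # the combined string is still exec-able
py_all = namespace["all"]
print(py_all.__doc__)
```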
Re: [Python-Dev] Documentation idea
On Fri, Oct 10, 2008 at 9:46 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
> Brett Cannon wrote:
>>
>> On Fri, Oct 10, 2008 at 1:45 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
>
>>> The advantage of the decorator version is that the compiler or module
>>> loader could be special cased to recognize the 'C' decorator and try
>>> it first *before* using the Python version, which would serve as a
>>> backup. There could be a standard version in builtins that people
>>> could replace to implement non-standard loading on a particular
>>> system. To cater to other implementations, the name could be something
>>> other than 'C', or we could define 'C' to be the initial of "Code" (in
>>> the implementation language). Either way, other implementations could
>>> start with a do-nothing "C" decorator and run the file as is, then
>>> gradually replace with lower-level code.
>>
>> The decorator doesn't have to require any special casing at all
>> (changing the parameters to keep the code short)::
>>
>>     def C(module_name, want):
>>         def choose_version(ob):
>>             try:
>>                 module = __import__(module_name, fromlist=[want])
>>                 return getattr(module, want)
>>             except (ImportError, AttributeError):
>>                 return ob
>>         return choose_version
>>
>> The cost is purely during importation of the module and does nothing
>> fancy at all and relies on stuff already available in all Python VMs.
>
> If I understand correctly, this decorator would only be applied *after*
> the useless Python level function object was created.

Yes.

> I was proposing bypassing that step when not necessary, and I believe
> special casing *would* be required for that.

Yes, that would.

-Brett
Re: [Python-Dev] Documentation idea
Brett Cannon wrote:
> On Fri, Oct 10, 2008 at 1:45 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
>> The advantage of the decorator version is that the compiler or module
>> loader could be special cased to recognize the 'C' decorator and try it
>> first *before* using the Python version, which would serve as a backup.
>> There could be a standard version in builtins that people could replace
>> to implement non-standard loading on a particular system. To cater to
>> other implementations, the name could be something other than 'C', or
>> we could define 'C' to be the initial of "Code" (in the implementation
>> language). Either way, other implementations could start with a
>> do-nothing "C" decorator and run the file as is, then gradually replace
>> with lower-level code.
>
> The decorator doesn't have to require any special casing at all
> (changing the parameters to keep the code short)::
>
>     def C(module_name, want):
>         def choose_version(ob):
>             try:
>                 module = __import__(module_name, fromlist=[want])
>                 return getattr(module, want)
>             except (ImportError, AttributeError):
>                 return ob
>         return choose_version
>
> The cost is purely during importation of the module and does nothing
> fancy at all and relies on stuff already available in all Python VMs.

If I understand correctly, this decorator would only be applied *after* the
useless Python level function object was created. I was proposing bypassing
that step when not necessary, and I believe special casing *would* be
required for that.
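A small runnable sketch of both points, using Brett's decorator as posted. The `built` list is bookkeeping added only for this demonstration, and the module name `_hypothetical_c_module` is made up so that the import fails. It shows the fallback working and also that the Python-level object already exists when the decorator runs, which is the cost Terry is describing:

```python
import itertools

built = []   # names of Python-level objects handed to the decorator


def C(module_name, want):
    """Brett's decorator from this thread, plus bookkeeping for the demo."""
    def choose_version(ob):
        built.append(ob.__name__)   # ob has already been created at this point
        try:
            module = __import__(module_name, fromlist=[want])
            return getattr(module, want)
        except (ImportError, AttributeError):
            return ob
    return choose_version


@C("itertools", "repeat")
def repeat(obj, times=None):
    """Pure-Python stand-in; replaced because itertools.repeat exists."""
    if times is None:
        while True:
            yield obj
    else:
        for _ in range(times):
            yield obj


@C("_hypothetical_c_module", "countdown")
def countdown(n):
    """No accelerated version can be imported, so this one is kept."""
    while n > 0:
        yield n
        n -= 1


print(repeat is itertools.repeat)   # the C version won
print(list(countdown(3)))           # the Python fallback is used
print(built)                        # both Python objects were built first
```

Skipping the creation of the unused Python object would need support from the compiler or loader, as both posters agree.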
Re: [Python-Dev] Documentation idea
On Fri, Oct 10, 2008 at 1:45 PM, Terry Reedy <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
>>
>> On 9 Oct, 11:12 pm, [EMAIL PROTECTED] wrote:
>>>
>>> Background
>>> ----------
>>> In the itertools module docs, I included pure python equivalents for
>>> each of the C functions. Necessarily, some of those equivalents are
>>> only approximate but they seem to have greatly enhanced the docs.
>>
>> Why not go the other direction?
>>
>> Ostensibly the reason for writing a module like 'itertools' in C is
>> purely for performance. There's nothing that I'm aware of in that
>> module which couldn't be in Python.
>>
>> Similarly, cStringIO, cPickle, etc. Everywhere these diverge, it is (if
>> not a flat-out bug) not optimal. External projects are encouraged by a
>> wealth of documentation to solve performance problems in a similar way:
>> implement in Python, once you've got the interface right, optimize into
>> C.
>>
>> So rather than have a C implementation, which points to Python, why not
>> have a Python implementation that points at C? 'itertools' (and
>> similar) can actually be Python modules, and use a decorator, let's
>> call it "C", to do this:
>>
>>     @C("_c_itertools.count")
>>     class count(object):
>>         """
>>         This is the documentation for both the C version of
>>         itertools.count and the Python version - since they should be
>>         the same, right?
>>         """
>
> The ancient string module did something like this, except that the
> rebinding of function names was done at the end by 'from _string import
> *' where _string had C versions of some but not all of the functions in
> string. (And the list of replacements could vary by version and platform
> and compiler switches.) This was great for documenting the string
> module. It was some of the first Python code I studied after the
> tutorial.
>
> The problem with that and the above (with modification, see below) is
> the creation and discarding of unused function objects and the time
> required to do so.
>
> The advantage of the decorator version is that the compiler or module
> loader could be special cased to recognize the 'C' decorator and try it
> first *before* using the Python version, which would serve as a backup.
> There could be a standard version in builtins that people could replace
> to implement non-standard loading on a particular system. To cater to
> other implementations, the name could be something other than 'C', or we
> could define 'C' to be the initial of "Code" (in the implementation
> language). Either way, other implementations could start with a
> do-nothing "C" decorator and run the file as is, then gradually replace
> with lower-level code.

The decorator doesn't have to require any special casing at all (changing
the parameters to keep the code short)::

    def C(module_name, want):
        def choose_version(ob):
            try:
                module = __import__(module_name, fromlist=[want])
                return getattr(module, want)
            except (ImportError, AttributeError):
                return ob
        return choose_version

The cost is purely during importation of the module and does nothing fancy
at all and relies on stuff already available in all Python VMs.

-Brett
Re: [Python-Dev] Documentation idea
[EMAIL PROTECTED] wrote:
> On 9 Oct, 11:12 pm, [EMAIL PROTECTED] wrote:
>> Background
>> ----------
>> In the itertools module docs, I included pure python equivalents for
>> each of the C functions. Necessarily, some of those equivalents are
>> only approximate but they seem to have greatly enhanced the docs.
>
> Why not go the other direction?
>
> Ostensibly the reason for writing a module like 'itertools' in C is
> purely for performance. There's nothing that I'm aware of in that module
> which couldn't be in Python.
>
> Similarly, cStringIO, cPickle, etc. Everywhere these diverge, it is (if
> not a flat-out bug) not optimal. External projects are encouraged by a
> wealth of documentation to solve performance problems in a similar way:
> implement in Python, once you've got the interface right, optimize into
> C.
>
> So rather than have a C implementation, which points to Python, why not
> have a Python implementation that points at C? 'itertools' (and similar)
> can actually be Python modules, and use a decorator, let's call it "C",
> to do this:
>
>     @C("_c_itertools.count")
>     class count(object):
>         """
>         This is the documentation for both the C version of
>         itertools.count and the Python version - since they should be
>         the same, right?
>         """

The ancient string module did something like this, except that the
rebinding of function names was done at the end by 'from _string import *'
where _string had C versions of some but not all of the functions in
string. (And the list of replacements could vary by version and platform
and compiler switches.) This was great for documenting the string module.
It was some of the first Python code I studied after the tutorial.

The problem with that and the above (with modification, see below) is the
creation and discarding of unused function objects and the time required to
do so.

The advantage of the decorator version is that the compiler or module
loader could be special cased to recognize the 'C' decorator and try it
first *before* using the Python version, which would serve as a backup.
There could be a standard version in builtins that people could replace to
implement non-standard loading on a particular system. To cater to other
implementations, the name could be something other than 'C', or we could
define 'C' to be the initial of "Code" (in the implementation language).
Either way, other implementations could start with a do-nothing "C"
decorator and run the file as is, then gradually replace with lower-level
code.

Terry Jan Reedy
Re: [Python-Dev] Documentation idea
On Thu, Oct 9, 2008 at 8:37 PM, <[EMAIL PROTECTED]> wrote:
>
> On 9 Oct, 11:12 pm, [EMAIL PROTECTED] wrote:
>>
>> Background
>> ----------
>> In the itertools module docs, I included pure python equivalents for
>> each of the C functions. Necessarily, some of those equivalents are
>> only approximate but they seem to have greatly enhanced the docs.
>
> Why not go the other direction?
>
> Ostensibly the reason for writing a module like 'itertools' in C is
> purely for performance. There's nothing that I'm aware of in that module
> which couldn't be in Python.
>
> Similarly, cStringIO, cPickle, etc. Everywhere these diverge, it is (if
> not a flat-out bug) not optimal. External projects are encouraged by a
> wealth of documentation to solve performance problems in a similar way:
> implement in Python, once you've got the interface right, optimize into
> C.
>
> So rather than have a C implementation, which points to Python, why not
> have a Python implementation that points at C? 'itertools' (and similar)
> can actually be Python modules, and use a decorator, let's call it "C",
> to do this:
>
>     @C("_c_itertools.count")
>     class count(object):
>         """
>         This is the documentation for both the C version of
>         itertools.count and the Python version - since they should be
>         the same, right?
>         """

And that decorator is generic enough to work for both classes and
functions.

> In Python itself, the Python module would mostly be for documentation,
> and therefore solve the problem that Raymond is talking about, but it
> could also be a handy fallback for sanity checking, testing, and use in
> other Python runtimes (ironpython, jython, pypy).

Which is why I would love to make this almost a policy for new modules
that do not have any C dependency.

> Many third-party projects already use ad-hoc mechanisms for doing this
> same thing, but an officially-supported way of saying "this works, but
> the optimized version is over here" would be a very useful convention.
>
> In those modules which absolutely require some C stuff to work, the
> python module could still serve as documentation.

Add to this some function to help test both the pure Python and C
implementation, like ``py_version, c_version = import_versions('itertools',
'_c_itertools')``, so you can run the test suite against both versions, and
you then have pretty much everything covered for writing the code in Python
to start and optimizing as needed in C.

-Brett
Re: [Python-Dev] Documentation idea
This is a really interesting idea. If extra memory/lookup overhead is a
concern, you could enable this new feature by default when the interactive
interpreter is started (where it's more likely to be invoked), and turn it
off by default when running scripts/modules.

Jared

On 9 Oct 2008, at 20:37, [EMAIL PROTECTED] wrote:

> On 9 Oct, 11:12 pm, [EMAIL PROTECTED] wrote:
>> Background
>> ----------
>> In the itertools module docs, I included pure python equivalents for
>> each of the C functions. Necessarily, some of those equivalents are
>> only approximate but they seem to have greatly enhanced the docs.
>
> Why not go the other direction?
>
> Ostensibly the reason for writing a module like 'itertools' in C is
> purely for performance. There's nothing that I'm aware of in that module
> which couldn't be in Python.
>
> Similarly, cStringIO, cPickle, etc. Everywhere these diverge, it is (if
> not a flat-out bug) not optimal. External projects are encouraged by a
> wealth of documentation to solve performance problems in a similar way:
> implement in Python, once you've got the interface right, optimize into
> C.
>
> So rather than have a C implementation, which points to Python, why not
> have a Python implementation that points at C? 'itertools' (and similar)
> can actually be Python modules, and use a decorator, let's call it "C",
> to do this:
>
>     @C("_c_itertools.count")
>     class count(object):
>         """
>         This is the documentation for both the C version of
>         itertools.count and the Python version - since they should be
>         the same, right?
>         """
>
> In Python itself, the Python module would mostly be for documentation,
> and therefore solve the problem that Raymond is talking about, but it
> could also be a handy fallback for sanity checking, testing, and use in
> other Python runtimes (ironpython, jython, pypy). Many third-party
> projects already use ad-hoc mechanisms for doing this same thing, but an
> officially-supported way of saying "this works, but the optimized
> version is over here" would be a very useful convention.
>
> In those modules which absolutely require some C stuff to work, the
> python module could still serve as documentation.
Re: [Python-Dev] Documentation idea
On 9 Oct, 11:12 pm, [EMAIL PROTECTED] wrote:
> Background
> ----------
> In the itertools module docs, I included pure python equivalents for
> each of the C functions. Necessarily, some of those equivalents are only
> approximate but they seem to have greatly enhanced the docs.

Why not go the other direction?

Ostensibly the reason for writing a module like 'itertools' in C is purely
for performance. There's nothing that I'm aware of in that module which
couldn't be in Python.

Similarly, cStringIO, cPickle, etc. Everywhere these diverge, it is (if
not a flat-out bug) not optimal. External projects are encouraged by a
wealth of documentation to solve performance problems in a similar way:
implement in Python, once you've got the interface right, optimize into C.

So rather than have a C implementation, which points to Python, why not
have a Python implementation that points at C? 'itertools' (and similar)
can actually be Python modules, and use a decorator, let's call it "C", to
do this:

    @C("_c_itertools.count")
    class count(object):
        """
        This is the documentation for both the C version of
        itertools.count and the Python version - since they should be the
        same, right?
        """

In Python itself, the Python module would mostly be for documentation, and
therefore solve the problem that Raymond is talking about, but it could
also be a handy fallback for sanity checking, testing, and use in other
Python runtimes (ironpython, jython, pypy). Many third-party projects
already use ad-hoc mechanisms for doing this same thing, but an
officially-supported way of saying "this works, but the optimized version
is over here" would be a very useful convention.

In those modules which absolutely require some C stuff to work, the python
module could still serve as documentation.
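For the "handy fallback for testing" use, here is a rough sketch of a helper in the spirit of the import_versions() call suggested in Brett's reply. The implementation is an assumption, not anything agreed in the thread. It relies on the fact that a None entry in sys.modules makes an import of that name raise ImportError, and it is demonstrated on heapq, which already follows the pattern of a Python module that pulls in C code from _heapq (assuming a CPython build where _heapq is available):

```python
import importlib
import sys

_MISSING = object()


def import_versions(py_name, c_name):
    """Return (pure_python_module, accelerated_module) for py_name.

    Hypothetical helper: py_name is a Python module that tries to import
    its accelerated pieces from c_name and falls back when that fails.
    """
    saved = {name: sys.modules.get(name, _MISSING)
             for name in (py_name, c_name)}
    try:
        # Block the accelerator so the pure-Python definitions are kept.
        sys.modules.pop(py_name, None)
        sys.modules[c_name] = None
        py_version = importlib.import_module(py_name)

        # Unblock it and import a second, fresh copy of the module.
        sys.modules.pop(py_name, None)
        sys.modules.pop(c_name, None)
        c_version = importlib.import_module(py_name)
    finally:
        # Put sys.modules back the way it was found.
        for name, module in saved.items():
            if module is _MISSING:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = module
    return py_version, c_version


py_heapq, c_heapq = import_versions("heapq", "_heapq")

# The same test can now be run against both versions.
for module in (py_heapq, c_heapq):
    heap = []
    for item in (3, 1, 2):
        module.heappush(heap, item)
    assert [module.heappop(heap) for _ in range(3)] == [1, 2, 3]
```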
Re: [Python-Dev] Documentation idea
Yes, I'm looking at a couple of different approaches to loading the
strings. For now though, I want to focus on the idea itself, not the
implementation. The important thing is to gather widespread support before
getting into the details of how the strings get loaded.

Raymond

----- Original Message -----
From: "Lisandro Dalcin" <[EMAIL PROTECTED]>

Have you ever considered the same approach for docstrings in C code? As
reference, NumPy already has some trickery for maintaining docstrings
outside C sources. Of course, descriptors would be far better for
implementing and supporting this in core Python and other projects...

This keeps the C build from getting fat. More
Re: [Python-Dev] Documentation idea
On Thu, Oct 9, 2008 at 8:50 PM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> [Christian Heimes]
>>
>> The idea sounds great!
>>
>> Are you planning to embed the pure python code in C code?
>
> Am experimenting with a descriptor that fetches the attribute string
> from a separate text file.

Have you ever considered the same approach for docstrings in C code? As
reference, NumPy already has some trickery for maintaining docstrings
outside C sources. Of course, descriptors would be far better for
implementing and supporting this in core Python and other projects...

> This keeps the C build from getting fat. More importantly, it lets us
> write the exec-able string in a more natural way (it bites to write C
> style docstrings using \n and trailing backslashes). The best part is
> that people without C compilers can still submit patches to the text
> files.
>
>
> Raymond

--
Lisandro Dalcín
---
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
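The descriptor Raymond mentions is not shown anywhere in the thread, so the following is only a guess at the general shape. The names TextFileAttribute and any_wrapper are hypothetical, and a temporary file stands in for the text file that would ship alongside the C module. The point it illustrates is that the source text lives outside the C code and is read only when the attribute is accessed:

```python
import os
import tempfile


class TextFileAttribute:
    """Hypothetical descriptor: read the attribute's text from a file."""

    def __init__(self, path):
        self.path = path

    def __get__(self, obj, objtype=None):
        # Nothing is loaded until someone actually asks for the attribute.
        with open(self.path, encoding="utf-8") as f:
            return f.read()


# Stand-in for the separate, plainly written text file.
source_text = (
    "def any(iterable):\n"
    "    for element in iterable:\n"
    "        if element:\n"
    "            return True\n"
    "    return False\n"
)
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w", encoding="utf-8") as f:
    f.write(source_text)


class any_wrapper:
    """Stand-in for the C-level type that would carry the attribute."""
    python = TextFileAttribute(path)


loaded = any_wrapper.python      # the file is read here, not at definition
namespace = {}
exec(loaded, namespace)
os.remove(path)

print(namespace["any"]([0, "", 3]))
```

Because the file is ordinary Python text, it needs no \n escapes or trailing backslashes and can be patched without a C compiler, which are the advantages listed above.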
Re: [Python-Dev] Documentation idea
On Thu, Oct 9, 2008 at 4:12 PM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
[SNIP]
> Bright idea
> -----------
> Let's go one step further and do this just about everywhere and instead of
> putting it in the docs, attach an exec-able string as an attribute to our C
> functions. Further, those pure python examples should include doctests so
> that the user can see a typical invocation and calling pattern.
>
> Say we decide to call the attribute something like ".python", then you could
> write something like:
>
>     >>> print(all.python)
>     def all(iterable):
>         '''Return True if all elements of the iterable are true.
>
>         >>> all(isinstance(x, int) for x in [2, 4, 6.13, 8])
>         False
>         >>> all(isinstance(x, int) for x in [2, 4, 6, 8])
>         True
>         '''
>
>         for element in iterable:
>             if not element:
>                 return False
>         return True
>
> There you have it, a docstring, doctestable examples, and pure python
> equivalent all in one place. And since the attribute is distinguished from
> __doc__, we can insist that the string be exec-able (something we can't
> insist on for arbitrary docstrings).
>

The idea is great. I assume the special file support is mostly for the built-ins, since extension modules can do what heapq does: have a pure Python version that people import, and have that code pull in any supporting C code.

As for an implementation, you could go as far as to have a flag in the extension module that says, "look for Python equivalents" and during module initialization find the file and pull it in. Doing it that way, though, would not necessarily encourage people as much to start with the pure Python version and only write C equivalents when performance or design requires it.

> Benefits
> --------
>
> * I think this will greatly improve the understanding of tools like
> str.split() which have proven to be difficult to document with straight
> prose. Even with simple things like any() and all(), it makes it
> self-evident that the functions have early-out behavior upon hitting the
> first mismatch.
>
> * The exec-able definitions and docstrings will be testable
>

And have some way to test both the Python and C version with the same tests (when possible)?

-Brett
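One way to picture "the same tests for both versions" is a shared test mixin that is bound once to the pure Python function and once to the C builtin. This is only a sketch: py_all, AllTests, PyAllTest and BuiltinAllTest are hypothetical names invented for illustration, not part of any proposal in this thread.

```python
import unittest


def py_all(iterable):
    """Rough pure Python equivalent of the builtin all()."""
    for element in iterable:
        if not element:
            return False
    return True


class AllTests:
    """Shared checks; subclasses supply the implementation as `func`."""

    func = None

    def test_empty_iterable_is_true(self):
        self.assertTrue(self.func([]))

    def test_all_true(self):
        self.assertTrue(self.func([1, "a", [0]]))

    def test_one_false(self):
        self.assertFalse(self.func([1, 0, 3]))


class PyAllTest(AllTests, unittest.TestCase):
    func = staticmethod(py_all)  # pure Python version


class BuiltinAllTest(AllTests, unittest.TestCase):
    func = staticmethod(all)  # C version


def run_shared_tests():
    """Run the same three checks against both implementations."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (PyAllTest, BuiltinAllTest):
        suite.addTests(loader.loadTestsFromTestCase(case))
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_shared_tests()
```

Any check that only one implementation can pass (an approximate equivalent, say) would have to live in the subclass rather than in the mixin.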
Re: [Python-Dev] Documentation idea
[Christian Heimes]
> The idea sounds great!
>
> Are you planning to embed the pure python code in C code?

Am experimenting with a descriptor that fetches the attribute string from a separate text file. This keeps the C build from getting fat. More importantly, it lets us write the exec-able string in a more natural way (it bites to write C style docstrings using \n and trailing backslashes). The best part is that people without C compilers can still submit patches to the text files.

Raymond
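A minimal sketch of what such a descriptor might look like, assuming the text lives in a plain UTF-8 file and is read lazily on first access. The names TextFileAttribute, AnyDocs and any.txt are made up for this example; the actual experiment is not shown in the thread and may work quite differently.

```python
import os
import tempfile


class TextFileAttribute:
    """Descriptor that reads its value from a text file on first access."""

    def __init__(self, path):
        self.path = path
        self._cache = None

    def __get__(self, instance, owner=None):
        # Load lazily so that nothing is read unless someone asks for it.
        if self._cache is None:
            with open(self.path, encoding="utf-8") as f:
                self._cache = f.read()
        return self._cache


SOURCE = (
    "def any(iterable):\n"
    "    for element in iterable:\n"
    "        if element:\n"
    "            return True\n"
    "    return False\n"
)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical text file holding the pure Python equivalent.
    path = os.path.join(tmp, "any.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(SOURCE)

    class AnyDocs:
        # Stand-in for a C function object carrying a ".python" attribute.
        python = TextFileAttribute(path)

    loaded = AnyDocs.python  # the first access reads the file

namespace = {}
exec(loaded, namespace)  # the loaded text is exec-able
```

Because the text is cached after the first read, later accesses do not touch the file system again.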
Re: [Python-Dev] Documentation idea
Raymond Hettinger wrote:
> lots of cool stuff!

The idea sounds great!

Are you planning to embed the pure python code in C code? That's going to increase the data segment of the executable. It should be possible to disable and remove the pure python example with a simple ./configure option and some macro magic. File size and in-memory size are still critical for embedders.

Christian
[Python-Dev] Documentation idea
Background
----------
In the itertools module docs, I included pure python equivalents for each of the C functions. Necessarily, some of those equivalents are only approximate but they seem to have greatly enhanced the docs. Something similar is in the builtin docs for any() and all(). The new collections.namedtuple() factory function also includes a verbose option that prints a pure python equivalent for the generated class. And in the decimal module, I took examples directly from the spec and included them in doc-testable docstrings. This assured compliance with the spec while providing clear examples to anyone who bothers to look at the docstrings.

For itertools docs, I combined those best practices and included sample calls in the pure-python code (see the current docs for itertools to see what I mean -- perhaps look at the docs for a new tool like itertools.product() or itertools.izip_longest() to see how useful it is).

Bright idea
-----------
Let's go one step further and do this just about everywhere and instead of putting it in the docs, attach an exec-able string as an attribute to our C functions. Further, those pure python examples should include doctests so that the user can see a typical invocation and calling pattern.

Say we decide to call the attribute something like ".python", then you could write something like:

    >>> print(all.python)
    def all(iterable):
        '''Return True if all elements of the iterable are true.

        >>> all(isinstance(x, int) for x in [2, 4, 6.13, 8])
        False
        >>> all(isinstance(x, int) for x in [2, 4, 6, 8])
        True
        '''

        for element in iterable:
            if not element:
                return False
        return True

There you have it, a docstring, doctestable examples, and pure python equivalent all in one place. And since the attribute is distinguished from __doc__, we can insist that the string be exec-able (something we can't insist on for arbitrary docstrings).
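To make "exec-able and doctestable" concrete, here is a small sketch of how such a string could be checked. The ".python" attribute does not exist, so the constant ALL_PYTHON stands in for it, and check_equivalent is a hypothetical helper name.

```python
import doctest

# Stand-in for the text that a hypothetical all.python would return.
ALL_PYTHON = '''\
def all(iterable):
    """Return True if all elements of the iterable are true.

    >>> all(isinstance(x, int) for x in [2, 4, 6.13, 8])
    False
    >>> all(isinstance(x, int) for x in [2, 4, 6, 8])
    True
    """
    for element in iterable:
        if not element:
            return False
    return True
'''


def check_equivalent(source, name):
    """Exec the source, then run the doctests of the function it defines."""
    namespace = {}
    exec(source, namespace)
    func = namespace[name]
    # Run the examples against a copy of the namespace, so the name
    # resolves to the pure Python function rather than the builtin and
    # the original namespace survives the runner's cleanup.
    test = doctest.DocTestParser().get_doctest(
        func.__doc__, dict(namespace), name, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return func, runner.run(test)


pure_all, results = check_equivalent(ALL_PYTHON, "all")
```

The returned results value carries the failed and attempted counts, so a test suite could simply require that failed is zero for every such string.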
Benefits
--------

* I think this will greatly improve the understanding of tools like str.split() which have proven to be difficult to document with straight prose. Even with simple things like any() and all(), it makes it self-evident that the functions have early-out behavior upon hitting the first mismatch.

* The exec-able definitions and docstrings will be testable.

* It will assist pypy style projects and other python implementations when they have to build equivalents to CPython.

* We've gotten good benefits from doctests for pure python functions, so why not extend this best practice to our C functions?

* The whole language will become more self-explanatory and self-documenting.

* Will eliminate confusion about what functions were exactly intended to do.

* Will confer benefits similar to test driven development where the documentation and pure python version are developed first and doctests gotten to pass, then the C version is created to match.

* For existing code, this is a perfect project for people who want to start contributing to the language but aren't ready to start writing C (they should be able to read C however so that the equivalent really does match the C code).

Limits
------

* In some cases, there may be no pure python equivalent (e.g. sys.getsizeof()).

* Sometimes the equivalent can only be approximate because the actual C function is too complex (e.g. itertools.tee()).

* Some cases, like int(), are useful as a type, have multiple functions, and are hard to write as pure python equivalents.

* For starters, it probably only makes sense to do this for things that are more "algorithmic" like any() and all() or things that have a straightforward equivalent like property() written using descriptors.

Premise
-------
Sometimes pure python is more expressive, precise, and easy to read than English prose.
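The early-out behavior mentioned under Benefits can be observed directly by handing each function an iterator and looking at what is left unread afterwards. This is an illustrative sketch only; py_any and leftover_after are hypothetical names.

```python
def py_any(iterable):
    """Rough pure Python equivalent of the builtin any()."""
    for element in iterable:
        if element:
            return True  # early out on the first true element
    return False


def leftover_after(func, values):
    """Call func on an iterator and report what it left unconsumed."""
    iterator = iter(values)
    result = func(iterator)
    return result, list(iterator)


pure = leftover_after(py_any, [0, 0, 5, 0, 7])
builtin = leftover_after(any, [0, 0, 5, 0, 7])
```

Both calls stop at the 5 and leave [0, 7] unread, which is exactly the detail that the loop in the pure Python version makes visible at a glance.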