Re: [Python-Dev] Moving the developer docs?

2010-09-23 Thread Steven Elliott Jr
Hello All,


I am new to this list, but I have been lurking around getting a feel for the
environment and processes. I had some discussion yesterday about the
developer documentation as well, since it’s what I do professionally. I am a
technical writer but also work in the web development arena (using Django).
In fact one of my projects now is to develop a comprehensive platform for
distributing online help, user documentation, etc. which I am just about to
put up on BitBucket (winter ’10). Anyway, that said, with regard to Wikis. I
have worked in several organizations where almost all of the development
documentation was maintained on a wiki. This can be great for getting up and
running with something quickly, but over time it becomes very unmanageable
and confusing.


What I have done in various organizations has been to create a system where
an official repository is kept with all of the *official* documentation and
a way for users (developers) to submit their proposals as to what they would
like to add and change. These proposals are kept in a tracker where they are
read and evaluated. Generally, some discussion ensues and the choices are
made as to what stays published or changed. This is what the system I am
writing is all about as well. It maintains the documentation, and allows for
users to comment on various parts of that documentation and submit requests
to change or add. The admins can then change or deny the documentation based
on community response. Anyway, I am not pitching my idea or trying to hype
my system, but I will be releasing it before winter on BitBucket for anyone
to try and distribute freely.


I do, however, discourage the use of wikis at all costs. It has been said
that they feel loose and unofficial, and although that may not be the intent,
over time this becomes reality.


Anyway, thank you for your time.


Warmest Regards,

Steve

On Thu, Sep 23, 2010 at 11:06 AM, Dirkjan Ochtman dirk...@ochtman.nl wrote:

 On Thu, Sep 23, 2010 at 16:56, Guido van Rossum gu...@python.org wrote:
  I want to believe your theory (since I also have a feeling that some
  wiki pages feel less trustworthy than others) but my own use of
  Wikipedia makes me skeptical that this is all there is -- on many
  pages on important topics you can clearly tell that a lot of effort
  went into the article, and then I trust it more. On other places you
  can tell that almost nobody cared. But I never look at the names of
  the authors.

 Right -- I feel like wiki quality varies with the amount of attention
 spent on maintaining it. Wikis that get a lot of maintenance (or have
 someone devoted to wiki gardening) will be good (consistent and up
 to date), while wikis that are only occasionally updated, or updated
 without much consistency or added to without editing get to feel bad.
 Seems like a variation of the broken window theory.

 So what we really need is a way to make editing the developer docs
 more rewarding (or less hard) for potential authors (i.e. python
 committers). If putting it in a proper VCS so they can use their
 editor of choice would help that, that seems like a good solution.

 Cheers,

 Dirkjan
 ___
 Python-Dev mailing list
 Python-Dev@python.org
 http://mail.python.org/mailman/listinfo/python-dev
 Unsubscribe:
 http://mail.python.org/mailman/options/python-dev/stevenrelliottjr1%40gmail.com

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Moving the developer docs?

2010-09-23 Thread Steven Elliott Jr
   If we can recruit a bunch of somebodies who *do* care, then the wiki
 would be much more useful.  But I still don't want to edit the
 dev docs there, if I have a choice :)  There's a reason I stopped
 updating the wiki as soon as I moved to a code repository.


I think that there are plenty that do care; I for one would be more than
happy to work on whatever documentation needs might arise for this group. I
am a bit of a documentation nut, since it's what I do; I also come from the
Django camp where people are obsessive over documentation. I still think
that wikis are not the best solution but if that is something that needs to
be tightened up then it would be something that I personally would have no
problem working on.


Re: [Python-Dev] Making builtins more efficient

2007-04-14 Thread Steven Elliott
On Thu, 2007-02-22 at 01:26 +0100, Giovanni Bajo wrote:
 On 20/02/2007 16.07, Steven Elliott wrote:
 
  I'm finally getting back into this.  I'd like to take one more shot at
  it with a revised version of what I proposed before.  
  
  For those of you that did not see the original thread it was about ways
  that accessing builtins could be more efficient.  It's a bit much to
  summarize again now, but you should be able to find it in the archive
  with this subject and a date of 2006-03-08.  
 
 Are you aware of this patch, which is still awaiting review?
 https://sourceforge.net/tracker/?func=detail&atid=305470&aid=1616125&group_id=5470

I was not aware of your patch.  I've since downloaded it, applied it,
and played with it a bit.

I find the cached module lookups (cached lookups when loading attributes
in modules via LOAD_ATTR) to be particularly interesting since it
addresses a case where PEP 280 leaves off.  

Your idea is to have an indexable array of objects that is only used
when the hash table has not been changed, which can be determined by the
timestamps you added.  That may be the best way of handling attributes
in modules (LOAD_ATTR).  For global variables (LOAD_GLOBAL) I'm curious
how it compares to PEP 280 and/or Greg Ewing's idea.
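
A minimal Python sketch of the "only trust the cache while the table is
unchanged" idea described above. It is not the patch's actual data layout:
the class names, the integer stamp, and the per-site cache are all
assumptions made for illustration.

```python
class StampedNamespace:
    """Dict-backed namespace with a version stamp bumped on every write."""

    def __init__(self, **names):
        self._table = dict(names)
        self.stamp = 0

    def store(self, name, value):
        self._table[name] = value
        self.stamp += 1  # any write invalidates caches built earlier

    def lookup(self, name):
        return self._table[name]


class CachedLoad:
    """One cached load site, valid only while the stamp is unchanged."""

    def __init__(self, namespace, name):
        self.namespace = namespace
        self.name = name
        self.cached_stamp = None
        self.cached_value = None
        self.misses = 0

    def load(self):
        ns = self.namespace
        if self.cached_stamp != ns.stamp:
            # Slow path: hash lookup, then remember the result and the stamp.
            self.cached_value = ns.lookup(self.name)
            self.cached_stamp = ns.stamp
            self.misses += 1
        return self.cached_value


mod = StampedNamespace(answer=41)
site = CachedLoad(mod, "answer")
first = site.load()      # miss: fills the cache
second = site.load()     # hit: no hash lookup
mod.store("answer", 42)  # bumps the stamp
third = site.load()      # miss again, sees the new value
```

The second load never touches the hash table; only a write to the namespace
forces the slow path again.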

-- 
---
|  Steven Elliott  |  [EMAIL PROTECTED] |
---




Re: [Python-Dev] Making builtins more efficient

2007-02-21 Thread Steven Elliott
On Tue, 2007-02-20 at 07:48 -0800, Guido van Rossum wrote:
 If this is not a replay of an old message, please move the discussion
 to python-ideas.

It's a modified version of an old idea, so I wasn't sure where to post
it since previously it was discussed here.  I'll look into python-ideas.

-- 
---
|  Steven Elliott  |  [EMAIL PROTECTED] |
---




Re: [Python-Dev] Making builtins more efficient

2007-02-20 Thread Steven Elliott
I'm finally getting back into this.  I'd like to take one more shot at
it with a revised version of what I proposed before.  

For those of you that did not see the original thread it was about ways
that accessing builtins could be more efficient.  It's a bit much to
summarize again now, but you should be able to find it in the archive
with this subject and a date of 2006-03-08.  

On Fri, 2006-03-10 at 12:46 +1300, Greg Ewing wrote: 
 Steven Elliott wrote:
  One way of handling it is to
  alter STORE_ATTR (op code for assigning to mod.str) to always check to
  see if the key being assigned is one of the default builtins.  If it is,
  then the module's indexed array of builtins is assigned to.
 
 As long as you're going to all that trouble, it
 doesn't seem like it would be much harder to treat
 all global names that way, instead of just a predefined
 set. The compiler already knows all of the names that
 are used as globals in the module's code.

What I have in mind may be close to what you are suggesting above.  My
thought now is that builtins are a set of tokens that typically, but
don't necessarily, point to the same objects in all modules.  Such
tokens, which I'll refer to as global tokens, can be roughly broken
into two sets:
1) Global tokens that typically point to the same object in all
modules.
2) Global tokens that are likely to point to different
objects (or be undefined) in different modules.
Set 1) is pretty much the builtins.  True and len are likely to
point to the same objects in all modules, but not necessarily.  Set 2)
might be things like os and sys which are often defined (imported)
in modules, but not necessarily.

Access to the globals of a module, including the current module, is done
with one of three opcodes (LOAD_GLOBAL, LOAD_ATTR and LOAD_NAME).  For
each of these opcodes the following snippet of code from ceval.c (for
LOAD_GLOBAL) is relevant to this discussion:
    /* This is the un-inlined version of the code above */
    x = PyDict_GetItem(f->f_globals, w);
    if (x == NULL) {
        x = PyDict_GetItem(f->f_builtins, w);
        if (x == NULL) {
          load_global_error:
            format_exc_check_arg(
                PyExc_NameError,
                GLOBAL_NAME_ERROR_MSG, w);
            break;
        }
    }

So, to avoid the hash table lookups above maybe the global tokens could
be assigned an index value that is fixed for any given version of the
interpreter and that is the same for all modules (that True is always
index 7, len is always index 3, etc.)

Once a set of indexes has been determined, a new opcode, that I'll call
LOAD_GTOKEN, could be created that avoids the hash table lookup by
functioning in a way that is similar to LOAD_FAST (pull a local variable
value out of an array).  For example, static references to True could
always be compiled to
  LOAD_GTOKEN 7 (True)
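
A rough Python model of what such an opcode might do. The table contents,
the index values, and the helper names (GTOKEN_NAMES, load_gtoken,
compile_reference) are hypothetical; nothing here is an existing CPython
API, and the real thing would live in the compiler and ceval.c.

```python
import builtins

# Hypothetical fixed table: position in this tuple is the token index,
# frozen for a given interpreter version.
GTOKEN_NAMES = ("abs", "bool", "dict", "len", "list", "str")
GTOKEN_INDEX = {name: i for i, name in enumerate(GTOKEN_NAMES)}


def new_token_array():
    """Build the array of objects that the token indexes point into."""
    return [getattr(builtins, name) for name in GTOKEN_NAMES]


def load_gtoken(tokens, index):
    """Model of LOAD_GTOKEN: a plain array index, no hashing."""
    return tokens[index]


def compile_reference(name):
    """What a compiler might emit for a static reference to name."""
    if name in GTOKEN_INDEX:
        return ("LOAD_GTOKEN", GTOKEN_INDEX[name])
    return ("LOAD_GLOBAL", name)


tokens = new_token_array()
op, arg = compile_reference("len")
length = load_gtoken(tokens, arg)([1, 2, 3])
fallback = compile_reference("os")
```

Names outside the fixed table fall back to the ordinary LOAD_GLOBAL path, so
only the predetermined tokens get the array treatment.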

As to set 1) and set 2) that I mentioned above - there is only a need to
distinguish between the two sets if a copy-on-write mechanism is used.
That way global tokens that are likely to have their value changed
(group 2) ) can all be together in one group so that only that group
needs to be copied when one of the global tokens is written to.  For
example code such as:
    True = 1
    print True
would be compiled into something like:
  1   LOAD_CONST      1 (1)
      STORE_GTOKEN1   7 (True)
  2   LOAD_GTOKEN1    7 (True)
      PRINT_ITEM
      PRINT_NEWLINE
Note that 1 has been appended to STORE_GTOKEN to indicate that group
1) is being worked with.  The store command will copy the array of
pointers once, the first time it is called.

Just as a new opcode is needed for LOAD_GLOBAL one would be needed for
LOAD_ATTR.  Perhaps LOAD_ATOKEN would work.  For example:
    amodule.len = my_len
    print amodule.len
would be compiled into something like:
  1   LOAD_GLOBAL     0 (my_len)
      LOAD_GLOBAL     1 (amodule)
      STORE_ATOKEN1   3 (len)

  2   LOAD_GLOBAL     1 (amodule)
      LOAD_ATOKEN1    3 (len)
      PRINT_ITEM
      PRINT_NEWLINE
      LOAD_CONST      0 (None)
      RETURN_VALUE

Note that it looks almost identical to the code that is currently
generated, but the oparg 3 shown for the LOAD_ATOKEN1 above indexes
into an array (like LOAD_FAST) to get at the attribute directly whereas
the oparg that would be shown for LOAD_ATTR is an index into an array of
constants/strings which is then used to retrieve the attribute from the
module's global hash table.

  That's great, but I'm curious if additional gains can be
  made by focusing just on builtins.
 
 As long as builtins can be shadowed, I can't see how
 to make any extra use of the fact that it's a builtin.
 A semantic change would be needed, such as forbidding
 shadowing of builtins, or at least forbidding this
 from outside the module.

I now think that it is best not to think of builtins as being a special
case.  What really matters

Re: [Python-Dev] Making builtins more efficient

2006-03-14 Thread Steven Elliott
On Thu, 2006-03-09 at 08:51 -0800, Raymond Hettinger wrote:
 [Steven Elliott]
  As you probably know each access of a builtin requires two hash table
  lookups.  First, the builtin is not found in the list of globals.  It is
  then found in the list of builtins.
 
 If someone really cared about the double lookup, they could flatten a level
 by starting their modules with:
 
from __builtin__ import *
 
 However, we don't see people writing this kind of code.  That could mean that 
 the double lookup hasn't been a big concern.

It could mean that.  I think what you are suggesting is sufficiently
clever that the average Python coder may not have thought of it.

In any case, many people are willing to do while 1 instead of while
True to avoid the double lookup.  And the from __builtin__ import *
additionally imposes a startup cost and memory cost (at least a word per
builtin, I would guess).
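
To make the cost concrete, here is a small sketch of what the star import
amounts to, written for Python 3 (where the module is named builtins rather
than __builtin__). The flatten_builtins helper is made up for illustration;
it mimics a star import by skipping names that start with an underscore.

```python
import builtins


def flatten_builtins(namespace):
    """Copy every public builtin into namespace, as a star import would."""
    public = {
        name: value
        for name, value in vars(builtins).items()
        if not name.startswith("_")
    }
    namespace.update(public)
    return len(public)


module_globals = {}
copied = flatten_builtins(module_globals)
```

Every module that does this pays for one dictionary entry per public builtin,
which is the startup and memory cost mentioned above.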

  Why not have a means of referencing the default builtins with some sort
  of index the way the LOAD_FAST op code currently works?
 
 FWIW, there was a PEP proposing a roughly similar idea, but the PEP
 encountered a great deal of resistance:
 
   http://www.python.org/doc/peps/pep-0329/
 
 When it comes time to write your PEP, it would be helpful to highlight how it 
 differs from PEP 329 (i.e. implemented through the compiler rather than as a 
 bytecode hack, etc.).

I'm flattered that you think it might be worthy of a PEP.  I'll look
into doing that.

  Perhaps what I'm suggesting isn't feasible for reasons that have already
  been discussed.  But it seems like it should be possible to make while
  True as efficient as while 1.
 
 That is going to be difficult as long as it is legal to write:
 
 True = 0

LOAD_BUILTIN (or whatever we want to call it) should be as fast as
LOAD_FAST (locals) or LOAD_CONST in that they each index into an
array where the index is the argument to the opcode.  

I'll look into writing a PEP.

-- 
---
|  Steven Elliott  |  [EMAIL PROTECTED] |
---




Re: [Python-Dev] Making builtins more efficient

2006-03-11 Thread Steven Elliott
On Fri, 2006-03-10 at 12:46 +1300, Greg Ewing wrote:
 Steven Elliott wrote:
  One way of handling it is to
  alter STORE_ATTR (op code for assigning to mod.str) to always check to
  see if the key being assigned is one of the default builtins.  If it is,
  then the module's indexed array of builtins is assigned to.
 
 As long as you're going to all that trouble, it
 doesn't seem like it would be much harder to treat
 all global names that way, instead of just a predefined
 set. The compiler already knows all of the names that
 are used as globals in the module's code.

The important difference between builtins and globals is that with
builtins both the compiler and the runtime can enumerate all references
to builtins in a single consistent way.  That is True can always be
builtin #3 and len can always be builtin #17, or whatever.  This isn't
true of globals in that a pyc file referencing a global in a module may
have been compiled with a different version of that module (that is
some_module.some_global can't be compiled to a single fixed index since
stuff may shift around in some_module).  With globals you have the
same kind of problem that you have with operating systems that use
ordinals to refer to symbols in shared libraries.
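
The shifting-index problem can be shown with a toy model. The module
versions, the names alpha/beta/added, and the build_slots helper are all
invented for this sketch.

```python
def build_slots(names, values):
    """Lay a module's globals out as an array, in definition order."""
    return list(names), [values[n] for n in names]


# Version 1 of a hypothetical module defines two globals.
v1_names, v1_slots = build_slots(
    ["alpha", "beta"], {"alpha": "A", "beta": "B"}
)

# A client compiled against version 1 remembers only the index of "beta".
compiled_index = v1_names.index("beta")

# Version 2 inserts a new global ahead of the old ones.
v2_names, v2_slots = build_slots(
    ["added", "alpha", "beta"], {"added": "NEW", "alpha": "A", "beta": "B"}
)

stale = v2_slots[compiled_index]          # index-based: silently wrong
fresh = v2_slots[v2_names.index("beta")]  # name-based: still right
```

The stale index now lands on alpha's slot. Builtins avoid this only because
their order can be pinned to the interpreter version.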

So in the case of a static reference to a builtin (while True, or
whatever) the compiler would generate something that refers to it with
that builtin's index (such as a new BUILTIN_OP opcode, as Philip
suggested).  Ordinary globals (non-builtins) would continue to be
generated as the same code (the LOAD_GLOBAL opcode (I'll only refer to
the loading opcodes in this email)).

In the case of a dynamic reference to a builtin (eval('True = 7') or
from foo import * or whatever) the compiler would generate the opcode that
indicates that the runtime needs to figure out what to do (the same
LOAD_NAME opcode).  The second part of the LOAD_NAME opcode is
similar to the current LOAD_GLOBAL opcode - it first checks the hash
tables of globals and then checks the hash table of builtins.  However,
the second part of the LOAD_NAME opcode could be implemented such that
it first checks against a list of default builtins (which could be a
hash table that returns the index of that builtin) and then indexes into
the array of builtins if it is found, or retrieves it from the single
hash table of globals otherwise.  So the LOAD_NAME opcode (or similar
attempts to dynamically get a name) would be almost as efficient as it
currently is.
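
A sketch of that modified dynamic lookup, modelled in Python. The names
DEFAULT_BUILTINS, BUILTIN_INDEX, ModuleState and load_name are hypothetical,
and the four-entry builtin list is only a stand-in for the real set.

```python
import builtins

DEFAULT_BUILTINS = ("len", "max", "min", "str")  # hypothetical fixed order
BUILTIN_INDEX = {name: i for i, name in enumerate(DEFAULT_BUILTINS)}


class ModuleState:
    """Per-module storage: an array for builtins, a dict for other globals."""

    def __init__(self):
        self.builtin_slots = [getattr(builtins, n) for n in DEFAULT_BUILTINS]
        self.globals = {}


def load_name(state, name):
    """Dynamic lookup: name -> builtin index first, else the globals dict."""
    index = BUILTIN_INDEX.get(name)
    if index is not None:
        return state.builtin_slots[index]
    try:
        return state.globals[name]
    except KeyError:
        raise NameError(f"name {name!r} is not defined") from None


state = ModuleState()
state.globals["x"] = 10
found_builtin = load_name(state, "len")
found_global = load_name(state, "x")
```

A dynamic lookup still costs one hash probe (name to index, or name to
global), which is why it comes out roughly level with the current scheme
rather than faster.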

  That's great, but I'm curious if additional gains can be
  made by focusing just on builtins.
 
 As long as builtins can be shadowed, I can't see how
 to make any extra use of the fact that it's a builtin.
 A semantic change would be needed, such as forbidding
 shadowing of builtins, or at least forbidding this
 from outside the module.

One way of looking at it is that, rather than having a clear distinction
between builtins and globals (as there currently is), there would be a single
global name space that internally in Python is implemented in two data
structures.  An array for frequently used names and a hash table for
infrequently used names.  And the division between the two wouldn't even
have to be between globals and builtins like we've been talking about
so far.

What distinguishes the builtins is you get them for free (initialized on
startup).  So, it would be possible to insert infrequently used builtins
into the hash table of infrequently used names only when the module
refers to it.  Conversely, names that aren't builtins, but that are used
frequently in many different modules, such as sys or os, could have
indexes set aside for them in the array of frequently used names.
Later, when it gets a value (because sys is imported, or
whatever) it just gets stuck into the predetermined slot in the array of
frequently used names.

Since builtins can be shadowed, as you point out, there would have to be
one array of frequently used names per module.  But often it would be
the same as other modules.  So internally, as a matter of efficiency,
the interpreter could use a copy on write strategy where a global array
of frequently used names is used by the module until it assigns to
True, or something like that, at which point it gets its own copy.
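
The copy-on-write strategy might look something like this in Python. The
three-entry FREQUENT_NAMES table and the ModuleNames class are assumptions
for the sketch, not a proposal for the actual layout.

```python
import builtins

FREQUENT_NAMES = ("len", "str", "sum")  # hypothetical fixed order
SHARED_SLOTS = [getattr(builtins, n) for n in FREQUENT_NAMES]


class ModuleNames:
    """Per-module view of the frequently used names, copied on first write."""

    def __init__(self):
        self.slots = SHARED_SLOTS  # shared until the module assigns

    def load(self, index):
        return self.slots[index]

    def store(self, index, value):
        if self.slots is SHARED_SLOTS:
            self.slots = list(SHARED_SLOTS)  # private copy, made once
        self.slots[index] = value


def fake_len(obj):
    return 99


mod_a = ModuleNames()
mod_b = ModuleNames()
mod_a.store(0, fake_len)  # mod_a shadows "len"; mod_b is unaffected
```

Modules that never assign to one of these names keep pointing at the single
shared array, so the common case costs no extra memory.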

-- 
---
|  Steven Elliott  |  [EMAIL PROTECTED] |
---




Re: [Python-Dev] Making builtins more efficient

2006-03-09 Thread Steven Elliott
On Thu, 2006-03-09 at 12:00 +, Paul Moore wrote:
 On 3/9/06, Nick Coghlan [EMAIL PROTECTED] wrote:
  Steven Elliott wrote:
   I'm interested in how builtins could be more efficient.  I've read over
   some of the PEPs having to do with making global variables more
   efficient (search for global):
   http://www.python.org/doc/essays/pepparade.html
   But I think the problem can be simplified by focusing strictly on
   builtins.
 
  Unfortunately, builtins can currently be shadowed in the module global
  namespace from outside the module (via constructs like import mod;
  mod.str = my_str). Unless/until that becomes illegal, focusing solely on
  builtins doesn't help - the difficulties lie in optimising builtin access
  while preserving the existing name shadowing semantics.
 
 Is there any practical way of detecting and flagging constructs like
 the above (remotely shadowing a builtin in another module)? I can't
 see a way of doing it (but I know very little about this area...).

It may be possible to flag it, or it may be possible to make it work.

In my post I mentioned one special case that needs to be addressed
(assigning to __builtins__).  What Nick mentioned in his post (import
mod; mod.str = my_str) is another special case that needs to be
addressed.  If we can assume that all pyc files are compiled with the
same set of default builtins (which should be assured by the
version in the pyc file) then there are two ways that things like
mod.str = my_str could be handled.

I believe that currently mod.str = my_str alters the module's global
hash table (f->f_globals in the code).  One way of handling it is to
alter STORE_ATTR (op code for assigning to mod.str) to always check to
see if the key being assigned is one of the default builtins.  If it is,
then the module's indexed array of builtins is assigned to.
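
The existing behaviour that any such change has to preserve can be observed
from Python itself. This sketch uses Python 3 and builds a throwaway module
with types.ModuleType; the module name "mod" and the functions describe and
my_str are placeholders.

```python
import types

# Build a throwaway module whose function refers to the builtin str.
mod = types.ModuleType("mod")
exec("def describe(x):\n    return str(x)\n", mod.__dict__)


def my_str(obj):
    return "shadowed"


before = mod.describe(5)   # resolves str through the builtins
mod.str = my_str           # remote shadowing: writes into mod's globals
after = mod.describe(5)    # now finds my_str in the module's globals
del mod.str
restored = mod.describe(5)  # builtin is visible again
```

The assignment lands in the module's global dictionary, so the function
picks up the replacement on its very next call without being recompiled.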

Alternatively if we also wanted to optimize mod.str = my_str then
there could be a new opcode like STORE_ATTR that would take an index
into the array of builtins instead of an index into the names.

PEP 280, which Nick mentioned, talks about cells, a hybrid data
structure that can do both hash table lookups and lookups by index
efficiently.  That's great, but I'm curious if additional gains can be
made by focusing just on builtins.

-- 
---
|  Steven Elliott  |  [EMAIL PROTECTED] |
---




[Python-Dev] Making builtins more efficient

2006-03-08 Thread Steven Elliott
I'm interested in how builtins could be more efficient.  I've read over
some of the PEPs having to do with making global variables more
efficient (search for global):
http://www.python.org/doc/essays/pepparade.html
But I think the problem can be simplified by focusing strictly on
builtins.

One of my assumptions is that only a small fraction of modules override
the default builtins with something like:
import mybuiltins
__builtins__ = mybuiltins

As you probably know each access of a builtin requires two hash table
lookups.  First, the builtin is not found in the list of globals.  It is
then found in the list of builtins.
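
The lookup order can be modelled in a few lines of Python 3. The
lookup_global helper and its probe counter are illustrative only; newer
CPython releases add caching on top of this, so the sketch shows the order
of the lookups rather than current performance.

```python
import builtins
import dis


def uses_len(items):
    return len(items)


# The compiler emits LOAD_GLOBAL for the builtin; it cannot know at compile
# time whether len will be found in the globals or in the builtins.
global_loads = [
    ins.argval
    for ins in dis.get_instructions(uses_len)
    if ins.opname == "LOAD_GLOBAL"
]


def lookup_global(name, module_globals):
    """Model of the lookup order: module globals first, then builtins."""
    probes = 1
    if name in module_globals:
        return module_globals[name], probes
    probes += 1
    return vars(builtins)[name], probes


value, probes = lookup_global("len", {"uses_len": uses_len})
shadowed, shadow_probes = lookup_global("len", {"len": max})
```

An unshadowed builtin takes two probes, while a module-level name of the
same spelling is found on the first.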

Why not have a means of referencing the default builtins with some sort
of index the way the LOAD_FAST op code currently works?  In other words,
by default each module gets the default set of builtins indexed (where
the index indexes into an array) in a certain order.  The version stored
in the pyc file would be bumped each time the set of default builtins
is changed.

I don't have very strong feelings whether things like True = (1 == 1)
would be a syntax error, but assigning to a builtin could just do the
equivalent of STORE_FAST.  I also don't have very strong feelings about
whether the array of default builtins would be shared between modules.
To simulate the current behavior where attempting to assign to a builtin
actually alters that module's global hashtable a separate array of
builtins could be used for each module.

As to assigning to __builtins__ (like I mentioned at the beginning of
this post) perhaps it could assign to the builtin array for those items
that have a name that matches a default builtin (such as True or
len).  Those items that don't match a default builtin would just
create global variables.

Perhaps what I'm suggesting isn't feasible for reasons that have already
been discussed.  But it seems like it should be possible to make while
True as efficient as while 1.

-- 
---
|  Steven Elliott  |  [EMAIL PROTECTED] |
---

