Re: [Python-Dev] (no subject)

2005-11-28 Thread Guido van Rossum
On 11/24/05, Duncan Grisby [EMAIL PROTECTED] wrote:
 Hi,

 I posted this to comp.lang.python, but got no response, so I thought I
 would consult the wise people here...

 I have encountered a problem with the re module. I have a
 multi-threaded program that does lots of regular expression searching,
 with some relatively complex regular expressions. Occasionally, events
 can conspire to mean that the re search takes minutes. That's bad
 enough in and of itself, but the real problem is that the re engine
 does not release the interpreter lock while it is running. All the
 other threads are therefore blocked for the entire time it takes to do
 the regular expression search.

Rather than trying to fight the GIL, I suggest that you let a regex
expert look at your regex(es) and the input that causes the long
running times. As Fredrik suggested, certain patterns are just
inefficient but can be rewritten more efficiently. There are plenty of
regex experts on c.l.py.

Unless you have a multi-CPU box, the performance of your app isn't
going to improve by releasing the GIL -- it only affects the
responsiveness of other threads.
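As an illustrative sketch (these patterns are invented, not the ones from the thread): a nested quantifier can make re backtrack exponentially on a failing match, while an equivalent un-nested pattern runs in linear time:

```python
import re

# Pathological: (a+)+ followed by a required 'b' backtracks
# exponentially when the input contains no 'b' at all.
slow = re.compile(r"(a+)+b")

# Equivalent but linear: the nesting adds nothing, since (a+)+
# matches exactly the same strings as a+.
fast = re.compile(r"a+b")

assert slow.match("aaab") is not None
assert fast.match("aaab") is not None

text = "a" * 22 + "c"            # no 'b' anywhere
assert fast.match(text) is None  # fails immediately
# slow.match(text) also returns None, but only after roughly
# 2**22 backtracking steps.
```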

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Guido van Rossum
On 11/20/05, Martin v. Löwis [EMAIL PROTECTED] wrote:
 [EMAIL PROTECTED] wrote:
  The local python community here in Sydney indicated that python.org is
  only upset when groups port the source to 'obscure' systems and *don't*
  submit patches... It is possible that I was misinformed.

 I never heard such concerns. I personally wouldn't notice if somebody
 ported Python, and did not feed back the patches.

I guess that I'm the source of that sentiment.

My reason for wanting people to contribute ports back is that if they
don't, the port is more likely to stick on some ancient version of
Python (e.g. I believe Nokia is still at 2.2.2). Then, assuming the
port remains popular, its users are going to pressure developers of
general Python packages to provide support for old versions of Python.

While I agree that maintaining port-specific code is a pain whenever
Python is upgraded, I still think that accepting patches for
odd-platform ports is the better alternative. Even if the patches
deteriorate as Python evolves, they should still (in principle) make a
re-port easier.

Perhaps the following compromise can be made: the PSF accepts patches
from reputable platform maintainers. (Of course, like all
contributions, they must be of high quality and not break anything,
etc., before they are accepted.) If such patches cause problems with
later Python versions, the PSF won't maintain them, but instead invite
the original contributors (or other developers who are interested in
that particular port) to fix them. If there is insufficient response,
or if it comes too late given the PSF release schedule, the PSF
developers may decide to break or remove support for the affected
platform.

There's a subtle balance between keeping too much old cruft and being
too aggressive in removing cruft that still serves a purpose for
someone. I bet that we've erred in both directions at times.

 Sometimes, people ask "there is this and that port, why isn't it
 integrated?", to which the answer is in most cases "because authors
 didn't contribute". This is not being upset - it is merely a fact.
 This port (djgcc) is the first one in a long time (IIRC) where
 anybody proposed rejecting it.

  I am not sure about the future myself. DJGPP 2.04 has been parked at beta
  for two years now. It might be fair to say that the *general* DJGPP
  developer base has shrunk a little bit. But the PythonD userbase has
  actually grown since the first release three years ago. For the time
  being, people get very angry when the servers go down here :-)

 It's not so much the availability of the platform that I worry about, but the
 commitment of the Python porter. We need somebody to forward bug
 reports to, and somebody to intervene if incompatible changes are made.
 This person would also indicate that the platform is no longer
 available, and hence the port can be removed.

It sounds like Ben Decker is for the time being volunteering to
provide patches and to maintain them. (I hope I'm reading you right,
Ben.) I'm +1 on accepting his patches, *provided* as always they pass
muster in terms of general Python development standards. (Jeff Epler's
comments should be taken to heart.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] SRE should release the GIL (was: no subject)

2005-11-28 Thread Duncan Grisby
On Monday 28 November, Guido van Rossum wrote:

 On 11/24/05, Duncan Grisby [EMAIL PROTECTED] wrote:

  I have encountered a problem with the re module. I have a
  multi-threaded program that does lots of regular expression searching,
  with some relatively complex regular expressions. Occasionally, events
  can conspire to mean that the re search takes minutes. That's bad
  enough in and of itself, but the real problem is that the re engine
  does not release the interpreter lock while it is running. All the
  other threads are therefore blocked for the entire time it takes to do
  the regular expression search.
 
 Rather than trying to fight the GIL, I suggest that you let a regex
 expert look at your regex(es) and the input that causes the long
 running times. As Fredrik suggested, certain patterns are just
 inefficient but can be rewritten more efficiently. There are plenty of
 regex experts on c.l.py.

Part of the problem is certainly inefficient regexes, and we have
improved things to some extent by changing some of them. Unfortunately,
the regexes come from user input, so we can't be certain that our users
aren't going to do stupid things. It's not too bad if a stupid regex
slows things down for a bit, but it is bad if it causes the whole
application to freeze for minutes at a time.

 Unless you have a multi-CPU box, the performance of your app isn't
 going to improve by releasing the GIL -- it only affects the
 responsiveness of other threads.

We do have a multi-CPU box. Even with good regexes, regex matching takes
up a significant proportion of the time spent processing in our
application, so being able to release the GIL will hopefully increase
performance overall as well as increasing responsiveness.

We are currently testing our application with the patch to sre that Eric
posted. Once we get on to some performance tests, we'll post the results
of whether releasing the GIL does make a measurable difference for us.

Cheers,

Duncan.

-- 
 -- Duncan Grisby --
  -- [EMAIL PROTECTED] --
   -- http://www.grisby.org --


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Martin v. Löwis
Guido van Rossum wrote:
 Perhaps the following compromise can be made: the PSF accepts patches
 from reputable platform maintainers. (Of course, like all
 contributions, they must be of high quality and not break anything,
 etc., before they are accepted.) If such patches cause problems with
 later Python versions, the PSF won't maintain them, but instead invite
 the original contributors (or other developers who are interested in
 that particular port) to fix them. If there is insufficient response,
 or if it comes too late given the PSF release schedule, the PSF
 developers may decide to break or remove support for the affected
 platform.

This is indeed the compromise I was after. If the contributors indicate
that they will maintain it for some time (which happened in this case),
then I can happily accept any port (and did indeed in the past).

In the specific case, there is an additional twist that we deliberately
removed DOS support some time ago, and listed that as officially removed
in a PEP. I understand that djgpp somehow isn't quite the same as DOS,
although I don't understand the differences (anymore).

But if it's fine with you, it is fine with me.

Regards,
Martin


[Python-Dev] Bug day this Sunday?

2005-11-28 Thread A.M. Kuchling
Is anyone interested in joining a Python bug day this Sunday?

A useful task might be to prepare for the python-core sprint at PyCon
by going through the bug and patch managers, and listing bugs/patches
that would be good candidates for working on at PyCon.

We'd meet in the usual location: #python-dev on irc.freenode.net, from
roughly 9AM to 3PM Eastern (2PM to 8PM UTC) on Sunday Dec. 4.

--amk


Re: [Python-Dev] Proposed additional keyword argument in logging calls

2005-11-28 Thread Guido van Rossum
On 11/22/05, Vinay Sajip [EMAIL PROTECTED] wrote:
 On numerous occasions, requests have been made for the ability to easily add
 user-defined data to logging events. For example, a multi-threaded server
 application may want to output specific information to a particular server
 thread (e.g. the identity of the client, specific protocol options for the
 client connection, etc.)

 This is currently possible, but you have to subclass the Logger class and
 override its makeRecord method to put custom attributes in the LogRecord.
 These can then be output using a customised format string containing e.g.
 %(foo)s %(bar)d. The approach is usable but requires more work than
 necessary.

 I'd like to propose a simpler way of achieving the same result, which
 requires use of an additional optional keyword argument in logging calls.
 The signature of the (internal) Logger._log method would change from

   def _log(self, level, msg, args, exc_info=None)

 to

   def _log(self, level, msg, args, exc_info=None, extra_info=None)

 The extra_info argument will be passed to Logger.makeRecord, whose signature
 will change from

   def makeRecord(self, name, level, fn, lno, msg, args, exc_info):

 to

   def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
 extra_info)

 makeRecord will, after doing what it does now, use the extra_info argument
 as follows:

 If type(extra_info) != types.DictType, it will be ignored.

 Otherwise, any entries in extra_info whose keys are not already in the
 LogRecord's __dict__ will be added to the LogRecord's __dict__.

 Can anyone see any problems with this approach? If not, I propose to post
 the approach on python-list and then if there are no strong objections,
 check it in to the trunk. (Since it could break existing code, I'm assuming
 (please correct me if I'm wrong) that it shouldn't go into the
 release24-maint branch.)

This looks like a good clean solution to me. I agree with Paul Moore's
suggestion that if extra_info is not None you should just go ahead and
use it as a dict and let the errors propagate.

What's the rationale for not letting it override existing fields?
(There may be a good one, I just don't see it without turning on my
thinking cap, which would cost extra. :-)

Perhaps it makes sense to call it 'extra' instead of 'extra_info'?

As a new feature it should definitely not go into 2.4; but I don't see
how it could break existing code.
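A minimal sketch of the proposed merge semantics (the class and the simplified method signature here are illustrative, not the actual patch; in later releases this landed as the 'extra' keyword argument):

```python
import logging

class ExtraLogger(logging.Logger):
    """Illustrative sketch of the proposed extra_info merge; the
    signature is simplified and does not match logging's real API."""

    def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
                   extra_info=None):
        rv = logging.LogRecord(name, level, fn, lno, msg, args, exc_info)
        if extra_info is not None:
            for key in extra_info:
                if key not in rv.__dict__:   # existing fields win
                    rv.__dict__[key] = extra_info[key]
        return rv

logger = ExtraLogger("demo")
rec = logger.makeRecord("demo", logging.INFO, "app.py", 1, "hi %s",
                        ("world",), None,
                        extra_info={"client": "abc", "msg": "ignored"})
assert rec.client == "abc"   # custom attribute added to the record
assert rec.msg == "hi %s"    # existing field is not overridden
```

The custom attribute can then be emitted with a format string such as "%(client)s %(message)s".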

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Guido van Rossum
On 11/18/05, Neil Schemenauer [EMAIL PROTECTED] wrote:
 Perhaps we should use the memory management technique that the rest
 of Python uses: reference counting.  I don't see why the AST
 structures couldn't be PyObjects.

Me neither. Adding yet another memory allocation scheme to Python's
already staggering number of memory allocation strategies sounds like
a bad idea.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] something is wrong with test___all__

2005-11-28 Thread Guido van Rossum
Has this been handled yet? If not, perhaps showing the good and bad
bytecode here would help trigger someone's brain into understanding
the problem.

On 11/22/05, Reinhold Birkenfeld [EMAIL PROTECTED] wrote:
 Hi,

 on my machine, "make test" hangs at test_colorsys.

 Careful investigation shows that when the bytecode is freshly generated
 by "make all" (precisely in test___all__), the .pyc file is different
 from what a direct call to "regrtest.py test_colorsys" produces.

 Curiously, a call to "regrtest.py test___all__" instead of "make test"
 produces the correct bytecode.

 I can only suspect some AST bug here.

 Reinhold

 --
 Mail address is perfectly valid!




--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Martin v. Löwis
Guido van Rossum wrote:
  I don't recall why DOS support was removed (PEP 11 doesn't say)

The PEP was actually created after the removal, so you added (or
asked me to add) this entry:

 Name: MS-DOS, MS-Windows 3.x
 Unsupported in:   Python 2.0
 Code removed in:  Python 2.1

Regards,
Martin


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Guido van Rossum
On 11/28/05, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Guido van Rossum wrote:
  Perhaps the following compromise can be made: the PSF accepts patches
  from reputable platform maintainers. (Of course, like all
  contributions, they must be of high quality and not break anything,
  etc., before they are accepted.) If such patches cause problems with
  later Python versions, the PSF won't maintain them, but instead invite
  the original contributors (or other developers who are interested in
  that particular port) to fix them. If there is insufficient response,
  or if it comes too late given the PSF release schedule, the PSF
  developers may decide to break or remove support for the affected
  platform.

 This is indeed the compromise I was after. If the contributors indicate
 that they will maintain it for some time (which happened in this case),
 then I can happily accept any port (and did indeed in the past).

 In the specific case, there is an additional twist that we deliberately
 removed DOS support some time ago, and listed that as officially removed
 in a PEP. I understand that djgpp somehow isn't quite the same as DOS,
 although I don't understand the differences (anymore).

 But if it's fine with you, it is fine with me.

Thanks. :-) I say, the more platforms the merrier.

I don't recall why DOS support was removed (PEP 11 doesn't say) but I
presume it was just because nobody volunteered to maintain it, not
because we have a particular dislike for DOS. So now that we have a
volunteer let's deal with his patches without prejudice.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Guido van Rossum
On 11/28/05, Jeremy Hylton [EMAIL PROTECTED] wrote:
 On 11/28/05, Guido van Rossum [EMAIL PROTECTED] wrote:
  On 11/18/05, Neil Schemenauer [EMAIL PROTECTED] wrote:
   Perhaps we should use the memory management technique that the rest
   of Python uses: reference counting.  I don't see why the AST
   structures couldn't be PyObjects.
 
  Me neither. Adding yet another memory allocation scheme to Python's
  already staggering number of memory allocation strategies sounds like
  a bad idea.

 The reason this thread started was the complaint that reference
 counting in the compiler is really difficult.  Almost every line of
 code can lead to an error exit.

Sorry, I forgot that (I've been off-line for a week of quality time
with Orlijn, and am now digging myself out from under several hundred
emails :-).

 The code becomes quite cluttered when
 it uses reference counting.  Right now, the AST is created with
 malloc/free, but that makes it hard to free the ast at the right time.

Would fixing the code to add free() calls in all the error exits make
it more or less cluttered than using reference counting?

  It would be fairly complex to convert the ast nodes to pyobjects.
 They're just simple discriminated unions right now.

Are they all the same size?

 If they were
 allocated from an arena, the entire arena could be freed when the
 compilation pass ends.

Then I don't understand why there was discussion of alloca() earlier
on -- surely the lifetime of a node should not be limited by the stack
frame that allocated it?

I'm not in principle against having an arena for this purpose, but I
worry that this will make it really hard to provide a Python API for
the AST, which has already been requested and whose feasibility
(unless I'm mistaken) also was touted as an argument for switching to
the AST compiler in the first place. I hope we'll never have to deal
with an API like the parser module provides...

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Martin v. Löwis
Jeremy Hylton wrote:
  The reason this thread started was the complaint that reference
  counting in the compiler is really difficult.  Almost every line of
  code can lead to an error exit.  The code becomes quite cluttered when
  it uses reference counting.  Right now, the AST is created with
  malloc/free, but that makes it hard to free the ast at the right time.
   It would be fairly complex to convert the ast nodes to pyobjects.
  They're just simple discriminated unions right now.  If they were
  allocated from an arena, the entire arena could be freed when the
  compilation pass ends.

I haven't looked at the AST code at all so far, but my experience
with gcc is that such an approach is fundamentally flawed: you
would always have memory that ought to survive the parsing, so
you will have to copy it out of the arena. This will either lead
to dangling pointers, or garbage memory. So in gcc, they eventually
moved to a full garbage collector (after several iterations).

Reference counting has the advantage that you can always DECREF
at the end of the function. So if you put all local variables
at the beginning of the function, and all DECREFs at the end,
getting clean memory management should be doable, IMO. Plus,
contributors would be familiar with the scheme in place.

I don't know if details have already been proposed, but I would
update asdl to generate a hierarchy of classes: i.e.

class mod(object): pass
class stmt(object): pass

class Module(mod):
    def __init__(self, body):
        self.body = body  # List of stmt

#...

class Expression(mod):
    def __init__(self, body):
        self.body = body  # expr

# ...
class Raise(stmt):
    def __init__(self, dest, values, nl):
        self.dest = dest      # expr or None
        self.values = values  # List of expr
        self.nl = nl          # bool (True or False)

There would be convenience functions, like

   PyObject *mod_Module(PyObject* body);
   enum mod_kind mod_kind(PyObject* mod);
   // Module, Interactive, Expression, or mod_INVALID
   PyObject *mod_Expression_body(PyObject*);
   //...
   PyObject *stmt_Raise_dest(PyObject*);

(whether the accessors return new or borrowed reference
  could be debated; plain C struct accesses would also
  be possible)

Regards,
Martin


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Guido van Rossum
[Guido]
  Then I don't understand why there was discussion of alloca() earlier
  on -- surely the lifetime of a node should not be limited by the stack
  frame that allocated it?

[Jeremy]
 Actually this is a pretty good limit, because all these data
 structures are temporaries used by the compiler.  Once compilation has
 finished, there's no need for the AST or the compiler state.

Are you really saying that there is one function which is called only
once (per compilation) which allocates *all* the AST nodes? That's the
only situation where I'd see alloca() working -- unless your alloca()
doesn't allocate memory on the stack. I was somehow assuming that the
tree would be built piecemeal by parser callbacks or some such
mechanism. There's still a stack frame whose lifetime limits the AST
lifetime, but it is not usually the current stackframe when a new node
is allocated, so alloca() can't be used.

I guess I don't understand the AST compiler code enough to participate
in this discussion. Or perhaps we are agreeing violently?

  I'm not in principle against having an arena for this purpose, but I
  worry that this will make it really hard to provide a Python API for
  the AST, which has already been requested and whose feasibility
  (unless I'm mistaken) also was touted as an argument for switching to
  the AST compiler in the first place. I hope we'll never have to deal
  with an API like the parser module provides...

 My preference would be to have the ast shared by value.  We generate
 code to serialize it to and from a byte stream and share that between
 Python and C.  It is less efficient, but it is also very simple.

So there would still be a Python-objects version of the AST but the
compiler itself doesn't use it.

At least by-value makes sense to me -- if you're making tree
transformations you don't want accidental sharing to cause unexpected
side effects.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] reference leaks

2005-11-28 Thread Walter Dörwald
Neal Norwitz wrote:

 On 11/25/05, Walter Dörwald [EMAIL PROTECTED] wrote:
 Can you move the call to codecs.register_error() out of test_callbacks()
 and retry?
 
 It then leaks 3 refs on each call to test_callbacks().

This should be fixed now in r41555 and r41556.

Bye,
Walter Dörwald



Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Greg Ewing
Here's a somewhat radical idea:

Why not write the parser and bytecode compiler in Python?

A .pyc could be bootstrapped from it and frozen into
the executable.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
[EMAIL PROTECTED]                  +--------------------------------------+


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Brett Cannon
On 11/28/05, Greg Ewing [EMAIL PROTECTED] wrote:
 Here's a somewhat radical idea:

 Why not write the parser and bytecode compiler in Python?

 A .pyc could be bootstrapped from it and frozen into
 the executable.


Is there a specific reason you are leaving out the AST, Greg, or do
you count that as part of the bytecode compiler (I think of that as
the AST-bytecode step handled by Python/compile.c)?

While ease of maintenance would be fantastic and would probably lead
to much more language experimentation if more of the core parts of
Python were written in Python, I would worry about performance.  While
generating bytecode is not necessarily an everytime thing, I know
Guido has said he doesn't like punishing the performance of small
scripts in the name of large-scale apps (reason why interpreter
startup time has always been an issue) which tend not to have a .pyc
file.

-Brett


Re: [Python-Dev] CVS repository mostly closed now

2005-11-28 Thread 장혜식
On 11/27/05, Martin v. Löwis [EMAIL PROTECTED] wrote:
 I tried removing the CVS repository from SF; it turns
 out that this operation is not supported. Instead, it
 is only possible to remove it from the project page;
 pserver and ssh access remain indefinitely, as does
 viewcvs.

There's a hacky trick to remove them:
 put "rm -rf $CVSROOT/src" into CVSROOT/loginfo
and remove the line then and commit again. :)


Hye-Shik


Re: [Python-Dev] CVS repository mostly closed now

2005-11-28 Thread Fred L. Drake, Jr.
On Monday 28 November 2005 20:14, 장혜식 wrote:
  There's a hacky trick to remove them:
   put "rm -rf $CVSROOT/src" into CVSROOT/loginfo
  and remove the line then and commit again. :)

Wow, that is tricky!  Glad it wasn't me who thought of this one.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   fdrake at acm.org


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Neal Norwitz
On 11/28/05, Martin v. Löwis [EMAIL PROTECTED] wrote:
 Neal Norwitz wrote:
  Hope this helps explain a bit.  Please speak up with how this can be
  improved.  Gotta run.

 I would rewrite it as

[code snipped]

For those watching, Greg's and Martin's version were almost the same. 
However, Greg's version left in the memory leak, while Martin fixed it
by letting the result fall through.  Martin added some helpful rules
about dealing with the memory.  Martin also gets bonus points for
talking about developing a checker. :-)

In both cases, their modified code is similar to the existing AST
code, but all deallocation is done with Py_[X]DECREFs rather than a
type specific deallocator.  Definitely nicer than the current
situation.  It's also the same as the rest of the python code.

With arenas the code would presumably look something like this:

static stmt_ty
ast_for_funcdef(struct compiling *c, const node *n)
{
/* funcdef: 'def' [decorators] NAME parameters ':' suite */
identifier name;
arguments_ty args;
asdl_seq *body;
asdl_seq *decorator_seq = NULL;
int name_i;

REQ(n, funcdef);

if (NCH(n) == 6) { /* decorators are present */
decorator_seq = ast_for_decorators(c, CHILD(n, 0));
if (!decorator_seq)
return NULL;
name_i = 2;
}
else {
name_i = 1;
}

name = NEW_IDENTIFIER(CHILD(n, name_i));
if (!name)
return NULL;
Py_AST_Register(name);
    if (!strcmp(STR(CHILD(n, name_i)), "None")) {
        ast_error(CHILD(n, name_i), "assignment to None");
return NULL;
}
args = ast_for_arguments(c, CHILD(n, name_i + 1));
body = ast_for_suite(c, CHILD(n, name_i + 3));
if (!args || !body)
return NULL;

return FunctionDef(name, args, body, decorator_seq, LINENO(n));
}

All the goto's become return NULLs.  After allocating a PyObject, it
would need to be registered (ie, the mythical Py_AST_Register(name)). 
This is easier than using all PyObjects in that when an error occurs,
there's nothing to think about, just return.  Only optional values
(like decorator_seq) need to be initialized.  It's harder in that one
must remember to register any PyObject so it can be Py_DECREFed at the
end.  Since the arena is allocated in big hunk(s), it would presumably
be faster than using PyObjects since there would be less memory
allocation (and fragmentation).  It should be possible to get rid of
some of the conditionals too (I joined body and args above).

Using all PyObjects has another benefit that may have been mentioned
elsewhere, ie that the rest of Python uses the same techniques for
handling deallocation.

I'm not really advocating any particular approach.  I *think* arenas
would be easiest, but it's not a clear winner.  I think Martin's note
about GCC using GC is interesting.  AFAIK GCC is a lot more complex
than the Python code, so I'm not sure it's 100% relevant.  OTOH, we
need to weigh that experience.

n


Re: [Python-Dev] Memory management in the AST parser compiler

2005-11-28 Thread Martin v. Löwis
Neal Norwitz wrote:
 For those watching, Greg's and Martin's version were almost the same. 
 However, Greg's version left in the memory leak, while Martin fixed it
 by letting the result fall through.

Actually, Greg said (correctly) that his version also fixes the
leak: he assumed that FunctionDef would *consume* the references
being passed (whether it is successful or not).

I don't think this is a good convention, though.

Regards,
Martin