Hi,

On Sun, May 08, 2005 at 11:26:59AM +0200, Armin Rigo wrote:
> Hi Ludovic, hi all,
> 
> I had a look at recparser, and how to integrate it into PyPy.  Ideally,
> it can be exported as the 'parser' module by adding a line to
> interpreter/baseobjspace.py (see the commented-out line about the other
> 'parser'). 
Yes, we've been experimenting with that already.

> A few comments about the interface file pyparser.py (this
> should be put in some documentation...):
> 
> * applevel() requires obscure tweaking about the 'import compiler'
>   statement, to prevent the whole compiler package from being dragged in
>   and compiled by PyPy (which may be what we want later, but for now it
>   just doesn't work, I expect).  I checked that in.
Maybe this will solve the problem we're seeing when trying to compile
parse trees generated by either parser.


> * the interpleveldef exports a class, 'STType'.  I added another hack in
>   lazymodule.py to make that work.  Basically, the interp-level exports
>   had to be wrapped objects, or functions -- which get wrapped
>   automatically.  Types now also get wrapped automatically.  Previously,
>   you'd have needed an interpleveldef like
>   
>      'STType': 'space.gettypeobject(pyparser.STType.typedef)'
> 
>   which fishes the typedef (i.e. the definition of the app-level type)
>   corresponding to the class STType, and asks the space to build a real
>   app-level type object for it.
ok

> At the moment, with the above changes, it appears to work rather nicely
> (at least the few exported methods).  But we cannot feed the parse
> tuples to the pure Python compiler package because the latter expects
> tuples with line number information, and as far as I see you're always
> generating tuples without.  It seems that you're collecting the
> information already so it should not be difficult to fix.
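To make the difference concrete, here is a small sketch. It assumes the
tuple layout produced by CPython's parser.st2tuple(): a terminal is
(type, string) without line information and (type, string, lineno) with
it, and node types below 256 are terminals. The helper name
has_line_info and the nonterminal number 300 are made up for
illustration; they are not part of recparser.

```python
NT_OFFSET = 256  # same convention as CPython's token.NT_OFFSET


def is_terminal(node):
    """Terminals (tokens) have a type number below NT_OFFSET."""
    return node[0] < NT_OFFSET


def has_line_info(tree):
    """Return True if every terminal in the tuple tree carries a line number."""
    if is_terminal(tree):
        # (type, string) lacks line info; (type, string, lineno) has it
        return len(tree) == 3 and isinstance(tree[2], int)
    # nonterminal: (type, child, child, ...)
    return all(has_line_info(child) for child in tree[1:])


# Illustrative trees only; 300 is a hypothetical nonterminal number.
without_lines = (300, (1, 'x'), (4, ''))
with_lines = (300, (1, 'x', 1), (4, '', 1))
```

A check like this could be run on whatever the parser returns before
handing it to the compiler package.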


> 
> The next step would be to integrate it so that it is used by the
> built-ins, like compile().  There is a new abstraction, class Compiler,
> in pypy.interpreter.compiler.  Its purpose is to be subclassed by
> concrete compilers; currently there is only CPythonCompiler, which
> cheats and calls compile() at interpreter-level.  I guess that it should
> be possible to create another subclass that uses recparser and the pure
> Python compiler package to do its job, or even a generic PythonCompiler
> that uses whatever built-in 'parser' module is available, and then the
> pure Python compiler package.
> 
> All of PyPy ends up using the compiler instance that is stored in the
> current execution context whenever it needs to compile source code
> (including at the interactive prompt).
> 
> 
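A rough sketch of how such a subclass could be laid out. The classes
below are stand-ins written for this mail, not the real
pypy.interpreter.compiler code: the compile(source, filename, mode)
signature and the parse/codegen callables are assumptions.

```python
class Compiler(object):
    """Stand-in for the abstract base class; the real signature may differ."""

    def compile(self, source, filename, mode):
        raise NotImplementedError


class CPythonCompiler(Compiler):
    def compile(self, source, filename, mode):
        # "cheats" by delegating to the host interpreter's compile()
        return compile(source, filename, mode)


class PythonCompiler(Compiler):
    """Generic variant: any parser combined with any code generator."""

    def __init__(self, parse, codegen):
        self.parse = parse      # hypothetical: something backed by recparser
        self.codegen = codegen  # hypothetical: backed by the compiler package

    def compile(self, source, filename, mode):
        tree = self.parse(source, mode)
        return self.codegen(tree, filename, mode)
```

The point of passing the parser and the code generator in is that the
same class would work with whichever 'parser' module happens to be
available.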
> Finally, a quick look over the recparser sources shows a few constructs
> that are clearly not "RPython", i.e. too dynamic.  We need to think a
> bit and see how to address the issue.  About RPython:
> http://codespeak.net/pypy/index.cgi?doc/coding-style.html#restricted-python
> 
> Before we actually try to perform type inference on recparser, it's a
> bit hard to know if there are type problems or not.  It is often the
> case that even when we write code knowing that it should be RPython we
> overlook some subtle typing problem.  I'll give it a try, I guess (this
> is done by enabling the recparser module in baseobjspace as hinted
> above, running "dist/goal/translate_pypy.py targetpypy", and trying to
> make sense out of the obscure assertion errors and enormous flow graphs
> we get...)

> For now, a problematic feature that is obvious is the visitor pattern
> that you use extensively.  It's definitely a great pattern, but not one
> that immediately applies to C- or Java-like languages.  I'm not saying
> that you should rewrite all of recparser; more that we need to find a
> trick to implement visitor patterns without the getattr() with a
> computed attribute name.  Possibly something along these lines:
> 
>     class MyVisitor:
>         def visit_name1(self, node):
>             ...
>         def visit_name2(self, node):
>             ...
> 
>         # this can be computed by a for loop instead:
>         VISIT_MAP = {'name1': visit_name1,
>                      'name2': visit_name2,
>                     }
>     
>     class Node:
>         def visit(self, visitor):
>             visit_meth = visitor.VISIT_MAP[self.name]
>             visit_meth(visitor, self)
> 
> The difference with the getattr() case is that the operation that
> replaces it, a getitem on a constant dictionary, has a reasonable
> C-level equivalent, namely a (precomputed) hash table lookup.
Sure, I discussed that with Holger already. The thing is, the visitor
isn't used for parsing but only by the EBNFParser, which parses the
Python grammar file and turns it into a tree of grammar objects.
This should be called only at startup time.
I must say I am not sure whether the following call in recparser/__init__.py:
PYTHON_PARSER = pythonutil.python_grammar()
really is called at bootstrap time?
Anyway, at this point PYTHON_PARSER is a static tree of objects
representing the grammar, and for now the parsing is done by providing a
'builder' object to the match method of the tree (in fact there are
several subtrees, one for each grammar target).
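If the visitor does need to become RPython-friendly, the VISIT_MAP idea
quoted above could be filled in by a loop along these lines. This is
only a sketch: the visit_ prefix convention, the return values and the
Node constructor are assumptions for illustration, not recparser code.

```python
class MyVisitor(object):
    def visit_name1(self, node):
        return 'saw name1'

    def visit_name2(self, node):
        return 'saw name2'


# Build the dispatch table once, right after the class is defined, so that
# no lookup by computed attribute name happens during a visit.
MyVisitor.VISIT_MAP = {}
for _name, _func in list(vars(MyVisitor).items()):
    if _name.startswith('visit_'):
        MyVisitor.VISIT_MAP[_name[len('visit_'):]] = _func


class Node(object):
    def __init__(self, name):
        self.name = name

    def visit(self, visitor):
        # getitem on a constant dictionary instead of getattr()
        visit_meth = visitor.VISIT_MAP[self.name]
        return visit_meth(visitor, self)
```

With this, Node('name1').visit(MyVisitor()) goes through the
precomputed table and ends up in visit_name1.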


> 
> That's it for now.  Don't hesitate to ask if I'm not making sense, or
> for more help about integration issues.  I am aware that it is some kind
> of guesswork at the moment.  Just feel free to post to pypy-dev.
> 
> 
> A bientot,
> 
> Armin.
> 

-- 
Ludovic Aubry                                 LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev