On Sat, Mar 26, 2005 at 06:49:58PM +1100, Adam Kennedy wrote:
: >Er, I'm not sure you will want to--I'm using PPI's evil twin brother,
: >"PPD" (the actual Perl parser).  I've just modified it so it doesn't
: >forget anything I want it to remember.  (As you know, the standard
: >parser throws away gobs of useful information, everything from
: >whitespace and comments to pruned opcode subtrees.  I have a version
: >that doesn't do that, by and large, though I'm still finding fiddly
: >spots.) 
: 
: So I'm presuming that you don't intend this as a tool that can do mass 
: porting of code (due to the dependency issues), but rather as something 
: for helping individual module authors port individual files/modules.

With the existence of Ponie, my hope is that people can port things
piecemeal and retest for regressions at every stage along the way,
presuming they have something that actually has regression tests.
I think "translate everything and hope for the best" is a recipe
for disaster on any project larger than one person's head.

That being said, there's nothing that says the translator has to
support only one kind of output, which means there's no reason
you can't have some kind of overall policy driving the individual
translations, so I don't see why dependency mapping should be a
big problem.  It just forces your translation granularity to chunks
of modules that require the same support, when that support is of a
nature that can't be split between Ponie and Perl 6.  Only in the limit
does that mean you have to translate everything all at once, and you'd
still probably want some kind of overall policy file to control it,
if only so you can tweak it and try the whole mess some other way.

: Also curious how you handle BEGIN and friends... I take it they are 
: executed and then pruned, and end up unpruned in your XML?

I just intercept the op_free() routine with another routine that knows
where to store the op tree that was about to be freed, to the first
approximation.  I also install null nodes in the tree as "pegs" to hang
the exact location of declarations like BEGIN, use, subs, etc.
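
To make the idea concrete, here is a toy sketch in Python rather than the real C internals.  Everything in it (the Op class, saving_op_free, the peg_for attribute) is a hypothetical stand-in and not Perl's actual data structures; it only shows the shape of the trick: the routine that would have freed a subtree stashes it instead, and a null node is left behind as a peg marking where the declaration was.

```python
class Op:
    """A toy op-tree node (hypothetical; not Perl's real OP struct)."""
    def __init__(self, name, kids=()):
        self.name = name
        self.kids = list(kids)
        self.peg_for = None   # set only on null "peg" nodes

saved = []  # pruned subtrees end up here instead of being discarded

def op_free(op):
    """Stand-in for the normal free routine: the subtree is simply dropped."""
    op.kids.clear()

def saving_op_free(op):
    """Interceptor: remember the subtree that was about to be freed."""
    saved.append(op)

def run_and_prune(parent, index, free=op_free):
    """Pretend the child was executed at compile time, then prune it,
    leaving a null peg that records what was declared at this spot."""
    victim = parent.kids[index]
    peg = Op("null")
    peg.peg_for = victim.name
    parent.kids[index] = peg
    free(victim)

program = Op("program", [Op("BEGIN", [Op("print")]), Op("say")])
run_and_prune(program, 0, free=saving_op_free)
```

After the call, the BEGIN subtree is still intact in saved, and the peg in program.kids[0] remembers where it came from.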

: Also curious if you have managed to keep comments, POD etc...

Certainly.  It takes MAD skills, where MAD stands for Miscellaneous
Attribute Decorations.  (Doing anything with toke.c requires madness.)

Well, actually, speaking of doing things piecemeal, I haven't tested
the POD part yet, just the comments.  And I'm quite sure I haven't
captured the __DATA__ yet, but that'll have to happen too.  But conceptually
it's all there.  :-)

The thing is that these MAD props are hung on whatever node is handy at
the time, which might be the token before, but usually is the token after,
but usually *wants* to be somewhere up higher in the tree that doesn't
exist yet.  The changes to Perl internals are intentionally very minimal
so as not to influence parsing behavior more than .5 iota, so I don't
try to do any tree rearrangement in the parser.  The XML is just the
raw dump of the tree with its misplaced madprops.  That's the main
reason for the first pass of the translator, to reattach the madprops
at a more appropriate place in the tree.

Interesting issues arise, such as deciding when a comment goes with the
previous code and when it goes with the next code, or when you just
stick it into the interstices for now.  At the moment my tendency is
to hoist leading and trailing whitespace into the interstices of the
higher list when that's practical.  But with comments you'd like them
to travel with the code they're commenting, in cases where refactoring
moves code around.
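
A rough sketch of that hoisting policy, in Python for brevity.  The Node and ListNode classes and the gaps list are assumptions made up for illustration, not the translator's real representation.  Pure whitespace moves up into the parent's interstices, while anything containing a comment stays attached to the code it sits on.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str
    leading: str = ""    # whitespace or comments found before the node
    trailing: str = ""   # whitespace or comments found after the node

@dataclass
class ListNode:
    kids: list
    gaps: list = field(default_factory=list)  # len(kids) + 1 interstices

def hoist_whitespace(lst):
    """Move pure whitespace from each child into the parent's gaps;
    leave comments on the child so they travel with the code."""
    lst.gaps = [""] * (len(lst.kids) + 1)
    for i, kid in enumerate(lst.kids):
        if kid.leading.strip() == "":
            lst.gaps[i] += kid.leading
            kid.leading = ""
        if kid.trailing.strip() == "":
            lst.gaps[i + 1] += kid.trailing
            kid.trailing = ""
    return lst

def render(lst):
    """Reassemble the source text from the gaps and the children."""
    out = [lst.gaps[0]]
    for i, kid in enumerate(lst.kids):
        out.append(kid.leading + kid.text + kid.trailing)
        out.append(lst.gaps[i + 1])
    return "".join(out)

stmts = ListNode([
    Node("my $x = 1;", leading="  ", trailing="\n"),
    Node("print $x;", leading="# show it\n", trailing="\n"),
])
original = "".join(k.leading + k.text + k.trailing for k in stmts.kids)
hoist_whitespace(stmts)
```

Rendering the hoisted tree gives back the original text, but now reordering stmts.kids carries the "# show it" comment along with its print statement while the indentation and newlines stay put in the list.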

The basic problem is that there's no one level that's right to do
the translation.  You have to take into account both shallow and
deep information and everything in between simultaneously, because
all of those things are important to the programmer at some point.
I'm aiming for a deeply correct translation that tries to preserve
as much surface detail as possible, but when push comes to shove,
it's the surface detail that has to get shoved, even if that screws
up their pretty formatting.  The nice thing about a deep translation
is that you can know when you're guessing, and at least mark it
so the programmer can double-check the translation.  A surface-level
translator is always guessing, and doesn't always know it.  I dare
say most Perl 5 could be translated to Perl 6 with a series of s///,
but it would always be getting stupid just when you want it to be smart.
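
A small illustration of the surface-level problem, using Python's re module in place of s///.  The only Perl 6 fact assumed is that method calls are written with "." where Perl 5 uses "->"; the function name is made up for the example.

```python
import re

def naive_translate(perl5_source):
    """A surface-level rewrite: every '->' becomes '.', with no idea
    whether it is looking at code or at the inside of a string."""
    return re.sub(r"->", ".", perl5_source)

line = 'print $obj->name, " -> ", $obj->value;'
print(naive_translate(line))
# prints: print $obj.name, " . ", $obj.value;
```

The two method calls come out right, but the arrow inside the string literal gets rewritten too, and nothing in the substitution knows that it just guessed wrong.  A translator working from the parse tree would know that the middle arrow is string data and leave it alone.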

Gee, it looks like you found my hot button, or at least my warm button.
Maybe I should work up a talk about all this someday...

Larry
