On Sat, Mar 26, 2005 at 06:49:58PM +1100, Adam Kennedy wrote:
: >Er, I'm not sure you will want to--I'm using PPI's evil twin brother,
: >"PPD" (the actual Perl parser). I've just modified it so it doesn't
: >forget anything I want it to remember. (As you know, the standard
: >parser throws away gobs of useful information, everything from
: >whitespace and comments to pruned opcode subtrees. I have a version
: >that doesn't do that, by and large, though I'm still finding fiddly
: >spots.)
:
: So I'm presuming that you don't intend this as a tool that can do mass
: porting of code (due to the dependency issues), but rather as something
: for helping individual module authors port individual files/modules.

With the existence of Ponie, my hope is that people can port things piecemeal and retest for regressions at every stage along the way, presuming they have something that actually has regression tests. I think "translate everything and hope for the best" is a recipe for disaster on any project larger than one person's head.

That being said, there's nothing that says the translator has to support only one kind of output, which means there's no reason you can't have some kind of overall policy driving the individual translations, so I don't see why dependency mapping should be a big problem. It just forces your translation granularity to chunks of modules that require the same support, when that support is of a nature that can't be split between Ponie and Perl 6. Only in the limit does that mean you have to translate everything all at once, and you'd still probably want some kind of overall policy file to control it, if only so you can tweak it and try the whole mess some other way.

: Also curious how you handle BEGIN and friends... I take they are
: executed and then pruned, and end up unpruned in your XML?

I just intercept the op_free() routine with another routine that knows where to store the op tree that was about to be freed, to the first approximation. I also install null nodes in the tree as "pegs" to hang the exact location of declarations like BEGIN, use, subs, etc.

: Also curious if you have managed to keep comments, POD etc...

Certainly. It takes MAD skills, where MAD stands for Miscellaneous Attribute Decorations. (Doing anything with toke.c requires madness.) Well, actually, speaking of doing things piecemeal, I haven't tested the POD part yet, just the comments. And I'm quite sure I haven't captured the __DATA__ yet, but that'll have to happen too. But conceptually it's all there. :-)

The thing is that these MAD props are hung on whatever node is handy at the time, which might be the token before, but usually is the token after, but usually *wants* to be somewhere up higher in the tree that doesn't exist yet. The changes to Perl internals are intentionally very minimal so as not to influence parsing behavior more than .5 iota, so I don't try to do any tree rearrangement in the parser. The XML is just the raw dump of the tree with its misplaced madprops. That's the main reason for the first pass of the translator, to reattach the madprops at a more appropriate place in the tree.

Interesting issues arise, such as deciding when a comment goes with the previous code and when it goes with the next code, or when you just stick it into the interstices for now. At the moment my tendency is to hoist leading and trailing whitespace into the interstices of the higher list when that's practical. But with comments you'd like them to travel with the code they're commenting, in cases where refactoring moves code around.

The basic problem is that there's no one level that's right to do the translation. You have to take into account both shallow and deep information and everything in between simultaneously, because all of those things are important to the programmer at some point. I'm aiming for a deeply correct translation that tries to preserve as much surface detail as possible, but when push comes to shove, it's the surface detail that has to get shoved, even if that screws up their pretty formatting.

The nice thing about a deep translation is that you can know when you're guessing, and at least mark it so the programmer can double-check the translation. A surface-level translator is always guessing, and doesn't always know it. I dare say most Perl 5 could be translated to Perl 6 with a series of s///, but it would always be getting stupid just when you want it to be smart.

Gee, it looks like you found my hot button, or at least my warm button.
Maybe I should work up a talk about all this someday...

Larry
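
To make the s/// point a bit more concrete, here is a minimal sketch, written in Python rather than Perl and not taken from the actual translator. It assumes a single hypothetical rewrite rule (the Perl 5 method-call arrow `->` becomes the Perl 6 dot) and a toy notion of "structure" that only knows about double-quoted strings. The function names `surface_translate` and `structured_translate` are made up for illustration.

```python
import re

# Hypothetical surface rule: Perl 5 "->" becomes Perl 6 ".".
ARROW = re.compile(r"->")

# Toy structure: double-quoted string literals, allowing backslash escapes.
STRING = re.compile(r'"(?:\\.|[^"\\])*"')


def surface_translate(src: str) -> str:
    """Blind substitution over the whole text, like a bare s/->/./g."""
    return ARROW.sub(".", src)


def structured_translate(src: str) -> str:
    """Apply the same rule, but only to text outside string literals."""
    out = []
    pos = 0
    for m in STRING.finditer(src):
        out.append(ARROW.sub(".", src[pos:m.start()]))  # code: rewrite
        out.append(m.group(0))                           # string: keep verbatim
        pos = m.end()
    out.append(ARROW.sub(".", src[pos:]))                # trailing code
    return "".join(out)


perl5 = 'print $obj->name, "a -> b";'
print(surface_translate(perl5))     # print $obj.name, "a . b";
print(structured_translate(perl5))  # print $obj.name, "a -> b";
```

The blind version quietly damages the string contents and has no way of knowing it did so. The second version is still nowhere near a real parse (it ignores single quotes, heredocs, regexes, comments and everything else toke.c worries about), but it shows why even a little structural knowledge changes what the translator can get right.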