So, Tom and I have wondered on and off about how to represent and handle $-vars better. Last week, when Charlie, Tom, and I met, we talked about it some more, and it occurred to us that $-vars are not that different from local variables. Tom had also pointed out earlier that $1 .. $n are just a function of the match data ($~), and so they are. So are $`, $', and $+.
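To make the "function of the match data" point concrete, here is a small sketch in plain Ruby that pairs each $-var with the value read off $~ directly. The method name dollar_vars_from_match and the sample string are made up for illustration; only the standard MatchData methods are used.

```ruby
# Hypothetical helper (name invented for this example): returns each
# regexp-related $-var next to the same value computed from $~.
def dollar_vars_from_match(str, re)
  return nil unless str =~ re   # a successful match sets $~ for this method
  m = $~                        # the MatchData object behind all the $-vars
  {
    "$1" => [$1, m[1]],                     # first capture group
    "$2" => [$2, m[2]],                     # second capture group
    "$`" => [$`, m.pre_match],              # text before the match
    "$'" => [$', m.post_match],             # text after the match
    "$+" => [$+, m.captures.compact.last],  # last group that actually matched
  }
end

pairs = dollar_vars_from_match("JRuby 1.6 release", /(\d+)\.(\d+)/)
pairs.each do |name, (dollar, derived)|
  puts "#{name}: #{dollar.inspect} / #{derived.inspect} same=#{dollar == derived}"
end
```

Every pair comes out equal, which is the sense in which only $~ needs to be stored; the rest can be computed on demand.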
Today, on my flight to San Francisco, I started poking around some more and realized that we could encode $-vars better. Here is a proposal, which also has some ramifications (for the good, I think) on the JRuby implementation of Regexp.

We create a special operand for $~, say LastMatch. Once we figure out more details, LastMatch could derive from LocalVariable; in all analyses it would behave as if the programmer had declared it, and it would get a fixed slot in the binding (say zero, or -1 to represent the last slot, or n+1 where n is the number of incoming arg slots) instead of a special field in DynamicScope. One significant difference from a regular local-var is that there won't be any scope-visible definitions of this variable. Clearly, regexp calls will set it (updateBackref in RubyRegexp and RubyString) -- but, in the absence of any other info, that could be pretty much any call. So that information has to be surfaced somehow in the IR so that the variable can be subject to regular analysis like any other var. In any case, I don't know yet how it will work as an explicit local-var, but until then we could treat LastMatch as a special operand type.

So, we create LastMatch, get rid of the Backref and NthRef operands, and convert them into special calls on the LastMatch operand. Here are the two significant changes. First, if there are no uses of LastMatch in a method (and in all descendant scopes -- blocks passed into calls), then there is no reason for RubyString/RubyRegexp to do anything with updateBackref. Second, this also means that scopes can contain regexp calls and still get by without allocating a heap binding for those scopes.

The question is how to pass this information into RubyString/RubyRegexp. One way is a special flag set somewhere on the call stack ... another is to use special-purpose calls in Regexp. The existing AST implementations might also be able to take advantage of this, I think ...
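For reference, here is a small Ruby sketch of the scoping behavior that the LastMatch-as-local-var idea leans on: $~ is private to a method activation, and blocks share it with their enclosing method. The method names are invented for the example, and the comments describe what I would expect from standard Ruby semantics.

```ruby
# All method names below are hypothetical, chosen only for this illustration.

def inner_match
  "xyz" =~ /y/            # sets $~ in inner_match's own activation
  $~[0]
end

def outer_match
  "abc" =~ /b/            # sets $~ for outer_match
  inner = inner_match     # the callee's match does not overwrite our $~
  [inner, $~[0]]
end

def block_match
  "abc" =~ /b/
  [1].each { "xyz" =~ /y/ }   # the block shares the enclosing method's $~
  $~[0]
end

def fresh_scope
  $~                      # no match has run in this activation yet
end

p outer_match   # callee's result, then the caller's own match
p block_match   # the match made inside the block
p fresh_scope   # nil
```

This is why "descendant scopes" matters in the proposal: a use of $~ (or $1 etc.) inside a block counts as a use in the enclosing method, exactly as it would for an ordinary local variable captured by that block.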
Anyway, this is just a broad outline, and not all details are worked out. But insofar as $~ is effectively just a regular local-var, it has exactly the same behavior as a method-level local var used by a piece of Ruby code. As far as I can see, the current implementation of backRef as a special field in DynamicScope behaves this way as well -- so why not just make it explicit and take advantage of it? Am I missing anything?

Subbu.