On Thu, May 3, 2012 at 3:17 PM, Charles Oliver Nutter <head...@headius.com> wrote:
> Inline...
>
> On Thu, May 3, 2012 at 3:52 AM, Subramanya Sastry <sss.li...@gmail.com> wrote:
>> So, Tom and I have wondered on and off about how to represent/handle $-vars
>> better. And last week, when Charlie, Tom, and I met, we talked about it
>> some more, where it occurred to us that $-vars are not that different from
>> local variables. Tom had also indicated earlier that $1 .. $n are just a
>> function of the match-data ($~), and so they are. So are $`, $', $+.
>
> Let's be clear here first of all...we're talking about the
> method-local (frame-local) $-vars $_, $~, and derivatives of the
> latter like you list. Other globals have different scoping.
>
>> We create a special operand for $~, say LastMatch. Once we figure out more
>> details, LastMatch could derive off LocalVariable, and in all analyses will
>> behave as if the programmer declared it, and will get a fixed slot (say
>> zero, or -1 which could represent the last slot, or n+1 where n is #
>> incoming arg slots) in the binding (instead of a special field in
>> DynamicScope). One significant difference from a regular local-var is that
>> there won't be any scope-visible definitions into this local variable. But,
>> clearly, reg-exp calls will set this var (updateBackref in RubyRegexp and
>> RubyString) -- but this could be pretty much any call (in the absence of any
>> other info). So, that information has to be somehow surfaced in the IR
>> representation so it can be subject to regular analysis like any other var.
>> In any case, I don't know how it will work as an explicit local-var yet, but
>> till then, we could treat LastMatch as a special operand type.
>>
>> So, we create LastMatch, and get rid of Backref and NthRef operands and
>> convert them into special calls on the LastMatch operand.
>>
>> And, here are the two significant changes.
>> If there are no uses of LastMatch in a method (and all descendent scopes --
>> blocks passed into calls), then there is no reason for RubyString/RubyRegexp
>> to do anything with updateBackref. This also means that scopes can have
>> reg-exp calls and can still get by without allocating a heap binding for
>> those scopes. The question is how to pass this information into
>> RubyString/RubyRegexp. One is by way of a special flag set somewhere on the
>> call stack ... or by using special-purpose calls in Regexp. The existing AST
>> implementations might also be able to take advantage of it, I think ...
>
> I don't see here how two methods -- one that writes backref and one
> that reads it -- would be satisfied. Not all uses of $~ (for example)
> read $~ in the body of the Ruby code; it's possible to use it across
> calls without ever using any of the specially-named globals.
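
To make the frame-local behaviour concrete, here is a minimal plain-Ruby sketch. The method names inner_match and outer_match are made up for illustration; the only things it relies on are the standard behaviour of String#[] with a Regexp argument, =~, and Regexp.last_match. The callee sets and reads the match data without ever naming a $-var, and the caller's own $~ is left alone.

```ruby
# Hypothetical helper names; only String#[], =~, $~ and Regexp.last_match
# are real Ruby.

# Sets the match data of its own frame via String#[] and reads it back
# through Regexp.last_match, without naming any $-var in the source.
def inner_match(str)
  str[/l+/]              # String#[] with a Regexp updates this frame's $~
  Regexp.last_match(0)   # reads this frame's $~ through a method call
end

# Does its own match, calls inner_match, then checks that its own $~
# still refers to the match it made itself.
def outer_match
  "abc" =~ /b/
  inner = inner_match("hello")
  [inner, $~[0]]
end

p outer_match  # => ["ll", "b"]
```

Nothing in the body of inner_match mentions $~, which is why scanning the source for the special globals is not enough to prove backref is unused.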

I think the trick here is that those two methods must be written in
terms of a Ruby-ish (not really valid) method which uses the captured
$~ variable:

  some_method:
    $~ = nil  # LocalVar (special one)
    # This was Regexp.search (and has to be seen as it -- perhaps biggest challenge)
    Regexp.search_internal do |internal|
      if internal.good_I_made_this_up
        $~ = Match.new(some_data)
      end
    end

So for this to go away, search_internal and the block MUST inline.
Another option would be to write these methods in pure IR and replace
the known callsite with that chunk of IR. The main challenge in this
is that it is impossible to statically determine whether the callsite
is actually one of these special methods (except things like /foo/).
So we would still need some mechanism outside of this opt until we
could determine this case. Once we do know this case we can do this
and eliminate the bookkeeping of $~ on a special stack (assuming it is
actually unused and can be removed via DCE).

>
>> Anyway, this is just a broad outline, and not all details are worked out,
>> but insofar as $~ is effectively just a regular local-var, it has the exact
>> same behavior as a method-level local var used by a piece of Ruby code. So
>> far, in the current implementation of backRef in DynamicScope as a special
>> field, I see that it behaves this way as well -- so why not just make it
>> explicit and take advantage of it?
>>
>> Am I missing anything?
>
> Maybe not?
>
> Our goal with all these "extra" frame-local pieces of data is
> obviously to make them free when not used (and as cheap as possible
> when used). Backref/lastline violate that perhaps more than any other
> feature right now in that if there's any hint that they might be used
> we construct a full heap-based scope for the method *and* force all
> locals into that scope too. That sounds bad enough...and then realize
> that "hint of use" includes any method *names* in core that are known
> to access backref/lastline, like #[].
> So every method that calls #[] on any object deoptimizes to use a
> full heap scope every time. *Awful*
>
> I've had several ideas for eliminating this gross over-compensation,
> and some of them will be helped by IR treating backref/lastline as
> though they're "always heap-based" locals:
>
> * When a method *may* access backref/lastline, increment/decrement an
> index into a per-thread array of values. This avoids all allocation
> but has the cost of read+modify of a field twice and try/finally logic
> (both of which would be *drastically* cheaper than a heap scope). It's
> not free when not used, but it has the same drastically reduced cost
> whether it is used or not.
>
> * Add an additional call path for methods that *may* access
> backref/lastline that preserves a single thread-local value for each.
> Only one would be set at a time, and the call stack would preserve
> context. In this case, the dynamic call logic looks up a target method
> to inspect whether it *actually* accesses backref/lastline, and only
> does the thread-local logic in that case. It would make the non-used
> case nearly free (on invokedynamic) and the used case cheaper (but
> more expensive than the index and pre-allocated array)
>
> The former I can do now even with the current compiler by adding
> another "CallConfiguration" and compiler logic around it (currently
> CallConfiguration can only reflect frame:yes|no,
> scope:full|dummy|none), and I may even do that for JRuby 1.7. The
> latter really needs IR and invokedynamic to be efficient, since I
> would want to bind the proper logic into each call site.
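
The first idea in that list (bump an index into a per-thread array on entry, restore it in try/finally on exit) can be sketched roughly as below. This is only an illustration in Ruby, not JRuby code: BackrefStack, with_frame and backref_stack are hypothetical names, and a real version would presumably live in the Java runtime rather than in Ruby.

```ruby
# Hypothetical sketch; BackrefStack is not a JRuby class.
class BackrefStack
  def initialize(size = 1024)
    @slots = Array.new(size)  # preallocated, so entering a frame allocates nothing
    @index = -1
  end

  # Claims a fresh slot for the duration of the block. The ensure clause
  # plays the role of try/finally: the index is restored even on raise.
  def with_frame
    @index += 1
    @slots[@index] = nil
    begin
      yield self
    ensure
      @slots[@index] = nil
      @index -= 1
    end
  end

  # Match data of the innermost active frame.
  def backref
    @slots[@index]
  end

  def backref=(value)
    @slots[@index] = value
  end

  # Number of frames currently holding a slot.
  def depth
    @index + 1
  end
end

# One stack per thread, created lazily on first use.
def backref_stack
  stack = Thread.current.thread_variable_get(:backref_stack)
  unless stack
    stack = BackrefStack.new
    Thread.current.thread_variable_set(:backref_stack, stack)
  end
  stack
end

stack = backref_stack
result = stack.with_frame do |outer|
  outer.backref = "outer match"
  stack.with_frame { |inner| inner.backref = "inner match" }
  outer.backref  # the nested frame wrote to its own slot only
end
```

The cost per call is the two index updates plus the ensure, which is the "same reduced cost whether used or not" trade-off described in the first bullet.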
>
> - Charlie
>
> ---------------------------------------------------------------------
> To unsubscribe from this list, please visit:
>
> http://xircles.codehaus.org/manage_email

--
blog: http://blog.enebo.com twitter: tom_enebo
mail: tom.en...@gmail.com