Re: Looping construction using only MethodHandles
I do think I remember that (I've been subscribed to the list since 2009-01-21), this was mostly an attempt to see how far it would go (and to get my brain wired for the MethodHandle API). However, are the discounted code size budgets the only reason why the MethodHandle 'loop' can recurse 2x to 6x more than a simpler POJM? Obviously it doesn't generate stacktraces (at least not that show up), yet it does consume stack. One of the other things I'm going to attempt is to create my own While adapter, with a static field, a bootstrap method and an invoke method that does an invoke dynamic on the static field. I wonder how small I can get the class file. I presume that the anonymous classloader in the sun package will, at some point, be moved/finalized into a java.invoke or java.lang form? Thank you for taking a look, --Maarten Daalder On 6 December 2011 06:54, Charles Oliver Nutter wrote: > This actually came up early this past summer, when I tried to do the > same thing (after Ola Bini showed me the MutableCallSite invoker > trick. You can probably find the thread in July some time. > > The bottom line was that even if the handles could get folded into > tail recursion, the call site ends up interfering with the > MethodHandle optimizer's visibility into the graph. > > I was as disappointed as you are (or as you will be, once you've read > this). I wish I'd been more involved early in the JSR-292 process and > pushed for a looping construct. Unfortunately that train has sailed, > and I've found no way to do loops (or any sort of backward branching) > using the current set of method handle adapters :( > > Here's the tease, for future work on invokedynamic: I believe that if > we had a way to do backward branches, method handles could become a > turing-complete, user-accessible IR for the JVM's JIT. > > - Charlie > > On Mon, Dec 5, 2011 at 1:06 PM, MLeo wrote: > > Hello all, > > > > Over the past few days I've been thinkering with a bit of code that folds > > over the values of some container using some MethodHandle and a 'zero' > > value. > > > > https://gist.github.com/1425706 > > > > It's actually an implementation of a strategy to encode higher order > > functions (HOF) without introducing a plethoramorphic callsite (if I > > remember the term correctly). If you squint at it right sort of resembles > > inversion of control. Or it's more like turning a function inside out. > > > > And it mostly works, with suprising result. > > It's not entirely tail call optimized, it still eats stack, but not the > same > > amount as an naive unequivalent Java static method that is hand optimized > > for the task (so no HOF). > > > > I'm using the x64 update 2 jdk on Windows Vista with -Xss10m. And the > naive > > version bails after a depth of ~11 and after some optimization it > will > > get to ~31 (after the first invocation!). The fold, on the other > hand, > > only bails out after recursing for ~65 times and gets there nearly > > immediately, the first run is ~1 lower, but the remainder of > invocations > > is consistent (or seems that way, rather, I think it's because not enough > > optimizations have kicked in yet). > > > > Execution time is another story, the POJM (Plain Old Java Method) takes > > ~5ms, while the fold takes ~30ms. Which, now I think some more about it, > is > > more or less in line with the previous initial result, but still seems > to be > > a bit slow to me. > > After some optimizations have kicked in (when the POJM reaches a depth of > > 31) the execution times stay roughly the same. So it would appear the > > JVM inlined the POJM 2 or 3 times. Creating the stacktraces doesn't seem > to > > impact performance, if I make them recurse 10 times (so it doesn't > > overflow) then the execution time is roughly the same. > > > > Both ways are tested by invoking them through invokeWithArguments (I > haven't > > yet managed to get ASM to produce a test class for me), and I let both > try > > to sum an array (turned into an Iterator through > > Arrays.asList(...).iterator()) of a million elements, so both will cause > a > > StackOverflowException. The test is to invoke a methods 1000 times (I > got a > > bit tired of waiting for a million times) and that I do 6 times (so I > take 6 > > samples and average across them). > > > > > > Now my question, is there something I can do to make it completely tail > call > > optimized? I've tried to 'rotate' the call graph/tree but that obviously > > wouldn't work (it's still a direct recursion, no matter if you do it > > directly or in the combiner of a foldArguments). It seems that it is > almost > > completely TCO already, but I haven't found where it's leaking stack. > It's > > definitely leaking less stack than a simple recursion. > > Or will we need a special looping combinator here? I initially tried to > > create a while combinator, however it seems that guardWithTest does not > > accept MethodHandles returning void (for the targ
Re: Looping construction using only MethodHandles
This actually came up early this past summer, when I tried to do the same thing (after Ola Bini showed me the MutableCallSite invoker trick. You can probably find the thread in July some time. The bottom line was that even if the handles could get folded into tail recursion, the call site ends up interfering with the MethodHandle optimizer's visibility into the graph. I was as disappointed as you are (or as you will be, once you've read this). I wish I'd been more involved early in the JSR-292 process and pushed for a looping construct. Unfortunately that train has sailed, and I've found no way to do loops (or any sort of backward branching) using the current set of method handle adapters :( Here's the tease, for future work on invokedynamic: I believe that if we had a way to do backward branches, method handles could become a turing-complete, user-accessible IR for the JVM's JIT. - Charlie On Mon, Dec 5, 2011 at 1:06 PM, MLeo wrote: > Hello all, > > Over the past few days I've been thinkering with a bit of code that folds > over the values of some container using some MethodHandle and a 'zero' > value. > > https://gist.github.com/1425706 > > It's actually an implementation of a strategy to encode higher order > functions (HOF) without introducing a plethoramorphic callsite (if I > remember the term correctly). If you squint at it right sort of resembles > inversion of control. Or it's more like turning a function inside out. > > And it mostly works, with suprising result. > It's not entirely tail call optimized, it still eats stack, but not the same > amount as an naive unequivalent Java static method that is hand optimized > for the task (so no HOF). > > I'm using the x64 update 2 jdk on Windows Vista with -Xss10m. And the naive > version bails after a depth of ~11 and after some optimization it will > get to ~31 (after the first invocation!). The fold, on the other hand, > only bails out after recursing for ~65 times and gets there nearly > immediately, the first run is ~1 lower, but the remainder of invocations > is consistent (or seems that way, rather, I think it's because not enough > optimizations have kicked in yet). > > Execution time is another story, the POJM (Plain Old Java Method) takes > ~5ms, while the fold takes ~30ms. Which, now I think some more about it, is > more or less in line with the previous initial result, but still seems to be > a bit slow to me. > After some optimizations have kicked in (when the POJM reaches a depth of > 31) the execution times stay roughly the same. So it would appear the > JVM inlined the POJM 2 or 3 times. Creating the stacktraces doesn't seem to > impact performance, if I make them recurse 10 times (so it doesn't > overflow) then the execution time is roughly the same. > > Both ways are tested by invoking them through invokeWithArguments (I haven't > yet managed to get ASM to produce a test class for me), and I let both try > to sum an array (turned into an Iterator through > Arrays.asList(...).iterator()) of a million elements, so both will cause a > StackOverflowException. The test is to invoke a methods 1000 times (I got a > bit tired of waiting for a million times) and that I do 6 times (so I take 6 > samples and average across them). > > > Now my question, is there something I can do to make it completely tail call > optimized? I've tried to 'rotate' the call graph/tree but that obviously > wouldn't work (it's still a direct recursion, no matter if you do it > directly or in the combiner of a foldArguments). It seems that it is almost > completely TCO already, but I haven't found where it's leaking stack. It's > definitely leaking less stack than a simple recursion. > Or will we need a special looping combinator here? I initially tried to > create a while combinator, however it seems that guardWithTest does not > accept MethodHandles returning void (for the target and fallback). > > If this can be made to work (so fully TCO'd, and maybe some massaging from > HotSpot engineers), then in theory it would allow for functional languages a > way to monomorphize HOFs. > For something like map (or fold) it will probably require a guard/PIC for > the implementations of that method, and those methods need to be compiled in > such a way so that they become bootstrap methods/factories (see the > constructor of MethodHandleFoldBuilder.java in the gist) which can then get > installed in the invocation of that fold. > > Thank you for your time, > --Maarten Daalder > > PS. > > In case you haven't noticed, I'm not a statistics wizard, so the numbers > mentioned should be taken only as an early indication and not of any > statistical significance. YMWV. > > Btw, if you were to implement 'map' with this then it would (if I noticed it > correctly) also reverse the list, or you're going to need to append each > element to the list, which will probably be slow. This is something I'm > going to try in the next few days or so. > > __
Looping construction using only MethodHandles
Hello all, Over the past few days I've been thinkering with a bit of code that folds over the values of some container using some MethodHandle and a 'zero' value. https://gist.github.com/1425706 It's actually an implementation of a strategy to encode higher order functions (HOF) without introducing a plethoramorphic callsite (if I remember the term correctly). If you squint at it right sort of resembles inversion of control. Or it's more like turning a function inside out. And it mostly works, with suprising result. It's not entirely tail call optimized, it still eats stack, but not the same amount as an naive unequivalent Java static method that is hand optimized for the task (so no HOF). I'm using the x64 update 2 jdk on Windows Vista with -Xss10m. And the naive version bails after a depth of ~11 and after some optimization it will get to ~31 (after the first invocation!). The fold, on the other hand, only bails out after recursing for ~65 times and gets there nearly immediately, the first run is ~1 lower, but the remainder of invocations is consistent (or seems that way, rather, I think it's because not enough optimizations have kicked in yet). Execution time is another story, the POJM (Plain Old Java Method) takes ~5ms, while the fold takes ~30ms. Which, now I think some more about it, is more or less in line with the previous initial result, but still seems to be a bit slow to me. After some optimizations have kicked in (when the POJM reaches a depth of 31) the execution times stay roughly the same. So it would appear the JVM inlined the POJM 2 or 3 times. Creating the stacktraces doesn't seem to impact performance, if I make them recurse 10 times (so it doesn't overflow) then the execution time is roughly the same. Both ways are tested by invoking them through invokeWithArguments (I haven't yet managed to get ASM to produce a test class for me), and I let both try to sum an array (turned into an Iterator through Arrays.asList(...).iterator()) of a million elements, so both will cause a StackOverflowException. The test is to invoke a methods 1000 times (I got a bit tired of waiting for a million times) and that I do 6 times (so I take 6 samples and average across them). Now my question, is there something I can do to make it completely tail call optimized? I've tried to 'rotate' the call graph/tree but that obviously wouldn't work (it's still a direct recursion, no matter if you do it directly or in the combiner of a foldArguments). It seems that it is almost completely TCO already, but I haven't found where it's leaking stack. It's definitely leaking less stack than a simple recursion. Or will we need a special looping combinator here? I initially tried to create a while combinator, however it seems that guardWithTest does not accept MethodHandles returning void (for the target and fallback). If this can be made to work (so fully TCO'd, and maybe some massaging from HotSpot engineers), then in theory it would allow for functional languages a way to monomorphize HOFs. For something like map (or fold) it will probably require a guard/PIC for the implementations of that method, and those methods need to be compiled in such a way so that they become bootstrap methods/factories (see the constructor of MethodHandleFoldBuilder.java in the gist) which can then get installed in the invocation of that fold. Thank you for your time, --Maarten Daalder PS. In case you haven't noticed, I'm not a statistics wizard, so the numbers mentioned should be taken only as an early indication and not of any statistical significance. YMWV. Btw, if you were to implement 'map' with this then it would (if I noticed it correctly) also reverse the list, or you're going to need to append each element to the list, which will probably be slow. This is something I'm going to try in the next few days or so. ___ mlvm-dev mailing list mlvm-dev@openjdk.java.net http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev