Re: Looping construction using only MethodHandles

2011-12-08 Thread MLeo
I do think I remember that (I've been subscribed to the list since
2009-01-21), this was mostly an attempt to see how far it would go (and to
get my brain wired for the MethodHandle API).

However, are the discounted code size budgets the only reason why the
MethodHandle 'loop' can recurse 2x to 6x more than a simpler POJM?
Obviously it doesn't generate stacktraces (at least not that show up), yet
it does consume stack.

One of the other things I'm going to attempt is to create my own While
adapter, with a static field, a bootstrap method and an invoke method that
does an invoke dynamic on the static field. I wonder how small I can get
the class file. I presume that the anonymous classloader in the sun package
will, at some point, be moved/finalized into a java.invoke or java.lang
form?

Thank you for taking a look,

--Maarten Daalder
On 6 December 2011 06:54, Charles Oliver Nutter  wrote:

> This actually came up early this past summer, when I tried to do the
> same thing (after Ola Bini showed me the MutableCallSite invoker
> trick. You can probably find the thread in July some time.
>
> The bottom line was that even if the handles could get folded into
> tail recursion, the call site ends up interfering with the
> MethodHandle optimizer's visibility into the graph.
>
> I was as disappointed as you are (or as you will be, once you've read
> this). I wish I'd been more involved early in the JSR-292 process and
> pushed for a looping construct. Unfortunately that train has sailed,
> and I've found no way to do loops (or any sort of backward branching)
> using the current set of method handle adapters :(
>
> Here's the tease, for future work on invokedynamic: I believe that if
> we had a way to do backward branches, method handles could become a
> turing-complete, user-accessible IR for the JVM's JIT.
>
> - Charlie
>
> On Mon, Dec 5, 2011 at 1:06 PM, MLeo  wrote:
> > Hello all,
> >
> > Over the past few days I've been thinkering with a bit of code that folds
> > over the values of some container using some MethodHandle and a 'zero'
> > value.
> >
> > https://gist.github.com/1425706
> >
> > It's actually an implementation of a strategy to encode higher order
> > functions (HOF) without introducing a plethoramorphic callsite (if I
> > remember the term correctly). If you squint at it right sort of resembles
> > inversion of control. Or it's more like turning a function inside out.
> >
> > And it mostly works, with suprising result.
> > It's not entirely tail call optimized, it still eats stack, but not the
> same
> > amount as an naive unequivalent Java static method that is hand optimized
> > for the task (so no HOF).
> >
> > I'm using the x64 update 2 jdk on Windows Vista with -Xss10m. And the
> naive
> > version bails after a depth of ~11 and after some optimization it
> will
> > get to ~31 (after the first invocation!). The fold, on the other
> hand,
> > only bails out after recursing for ~65 times and gets there nearly
> > immediately, the first run is ~1 lower, but the remainder of
> invocations
> > is consistent (or seems that way, rather, I think it's because not enough
> > optimizations have kicked in yet).
> >
> > Execution time is another story, the POJM (Plain Old Java Method) takes
> > ~5ms, while the fold takes ~30ms. Which, now I think some more about it,
> is
> > more or less in line with the previous initial result, but still seems
> to be
> > a bit slow to me.
> > After some optimizations have kicked in (when the POJM reaches a depth of
> > 31) the execution times stay roughly the same. So it would appear the
> > JVM inlined the POJM 2 or 3 times. Creating the stacktraces doesn't seem
> to
> > impact performance, if I make them recurse 10 times (so it doesn't
> > overflow) then the execution time is roughly the same.
> >
> > Both ways are tested by invoking them through invokeWithArguments (I
> haven't
> > yet managed to get ASM to produce a test class for me), and I let both
> try
> > to sum an array (turned into an Iterator through
> > Arrays.asList(...).iterator()) of a million elements, so both will cause
> a
> > StackOverflowException. The test is to invoke a methods 1000 times (I
> got a
> > bit tired of waiting for a million times) and that I do 6 times (so I
> take 6
> > samples and average across them).
> >
> >
> > Now my question, is there something I can do to make it completely tail
> call
> > optimized? I've tried to 'rotate' the call graph/tree but that obviously
> > wouldn't work (it's still a direct recursion, no matter if you do it
> > directly or in the combiner of a foldArguments). It seems that it is
> almost
> > completely TCO already, but I haven't found where it's leaking stack.
> It's
> > definitely leaking less stack than a simple recursion.
> > Or will we need a special looping combinator here? I initially tried to
> > create a while combinator, however it seems that guardWithTest does not
> > accept MethodHandles returning void (for the targ

Re: Looping construction using only MethodHandles

2011-12-05 Thread Charles Oliver Nutter
This actually came up early this past summer, when I tried to do the
same thing (after Ola Bini showed me the MutableCallSite invoker
trick. You can probably find the thread in July some time.

The bottom line was that even if the handles could get folded into
tail recursion, the call site ends up interfering with the
MethodHandle optimizer's visibility into the graph.

I was as disappointed as you are (or as you will be, once you've read
this). I wish I'd been more involved early in the JSR-292 process and
pushed for a looping construct. Unfortunately that train has sailed,
and I've found no way to do loops (or any sort of backward branching)
using the current set of method handle adapters :(

Here's the tease, for future work on invokedynamic: I believe that if
we had a way to do backward branches, method handles could become a
turing-complete, user-accessible IR for the JVM's JIT.

- Charlie

On Mon, Dec 5, 2011 at 1:06 PM, MLeo  wrote:
> Hello all,
>
> Over the past few days I've been thinkering with a bit of code that folds
> over the values of some container using some MethodHandle and a 'zero'
> value.
>
> https://gist.github.com/1425706
>
> It's actually an implementation of a strategy to encode higher order
> functions (HOF) without introducing a plethoramorphic callsite (if I
> remember the term correctly). If you squint at it right sort of resembles
> inversion of control. Or it's more like turning a function inside out.
>
> And it mostly works, with suprising result.
> It's not entirely tail call optimized, it still eats stack, but not the same
> amount as an naive unequivalent Java static method that is hand optimized
> for the task (so no HOF).
>
> I'm using the x64 update 2 jdk on Windows Vista with -Xss10m. And the naive
> version bails after a depth of ~11 and after some optimization it will
> get to ~31 (after the first invocation!). The fold, on the other hand,
> only bails out after recursing for ~65 times and gets there nearly
> immediately, the first run is ~1 lower, but the remainder of invocations
> is consistent (or seems that way, rather, I think it's because not enough
> optimizations have kicked in yet).
>
> Execution time is another story, the POJM (Plain Old Java Method) takes
> ~5ms, while the fold takes ~30ms. Which, now I think some more about it, is
> more or less in line with the previous initial result, but still seems to be
> a bit slow to me.
> After some optimizations have kicked in (when the POJM reaches a depth of
> 31) the execution times stay roughly the same. So it would appear the
> JVM inlined the POJM 2 or 3 times. Creating the stacktraces doesn't seem to
> impact performance, if I make them recurse 10 times (so it doesn't
> overflow) then the execution time is roughly the same.
>
> Both ways are tested by invoking them through invokeWithArguments (I haven't
> yet managed to get ASM to produce a test class for me), and I let both try
> to sum an array (turned into an Iterator through
> Arrays.asList(...).iterator()) of a million elements, so both will cause a
> StackOverflowException. The test is to invoke a methods 1000 times (I got a
> bit tired of waiting for a million times) and that I do 6 times (so I take 6
> samples and average across them).
>
>
> Now my question, is there something I can do to make it completely tail call
> optimized? I've tried to 'rotate' the call graph/tree but that obviously
> wouldn't work (it's still a direct recursion, no matter if you do it
> directly or in the combiner of a foldArguments). It seems that it is almost
> completely TCO already, but I haven't found where it's leaking stack. It's
> definitely leaking less stack than a simple recursion.
> Or will we need a special looping combinator here? I initially tried to
> create a while combinator, however it seems that guardWithTest does not
> accept MethodHandles returning void (for the target and fallback).
>
> If this can be made to work (so fully TCO'd, and maybe some massaging from
> HotSpot engineers), then in theory it would allow for functional languages a
> way to monomorphize HOFs.
> For something like map (or fold) it will probably require a guard/PIC for
> the implementations of that method, and those methods need to be compiled in
> such a way so that they become bootstrap methods/factories (see the
> constructor of MethodHandleFoldBuilder.java in the gist) which can then get
> installed in the invocation of that fold.
>
> Thank you for your time,
> --Maarten Daalder
>
> PS.
>
> In case you haven't noticed, I'm not a statistics wizard, so the numbers
> mentioned should be taken only as an early indication and not of any
> statistical significance. YMWV.
>
> Btw, if you were to implement 'map' with this then it would (if I noticed it
> correctly) also reverse the list, or you're going to need to append each
> element to the list, which will probably be slow. This is something I'm
> going to try in the next few days or so.
>
> __

Looping construction using only MethodHandles

2011-12-05 Thread MLeo
Hello all,

Over the past few days I've been thinkering with a bit of code that folds
over the values of some container using some MethodHandle and a 'zero'
value.

https://gist.github.com/1425706

It's actually an implementation of a strategy to encode higher order
functions (HOF) without introducing a plethoramorphic callsite (if I
remember the term correctly). If you squint at it right sort of resembles
inversion of control. Or it's more like turning a function inside out.

And it mostly works, with suprising result.
It's not entirely tail call optimized, it still eats stack, but not the
same amount as an naive unequivalent Java static method that is hand
optimized for the task (so no HOF).

I'm using the x64 update 2 jdk on Windows Vista with -Xss10m. And the naive
version bails after a depth of ~11 and after some optimization it will
get to ~31 (after the first invocation!). The fold, on the other hand,
only bails out after recursing for ~65 times and gets there nearly
immediately, the first run is ~1 lower, but the remainder of
invocations is consistent (or seems that way, rather, I think it's because
not enough optimizations have kicked in yet).

Execution time is another story, the POJM (Plain Old Java Method) takes
~5ms, while the fold takes ~30ms. Which, now I think some more about it, is
more or less in line with the previous initial result, but still seems to
be a bit slow to me.
After some optimizations have kicked in (when the POJM reaches a depth of
31) the execution times stay roughly the same. So it would appear the
JVM inlined the POJM 2 or 3 times. Creating the stacktraces doesn't seem to
impact performance, if I make them recurse 10 times (so it doesn't
overflow) then the execution time is roughly the same.

Both ways are tested by invoking them through invokeWithArguments (I
haven't yet managed to get ASM to produce a test class for me), and I let
both try to sum an array (turned into an Iterator through
Arrays.asList(...).iterator()) of a million elements, so both will cause a
StackOverflowException. The test is to invoke a methods 1000 times (I got a
bit tired of waiting for a million times) and that I do 6 times (so I take
6 samples and average across them).


Now my question, is there something I can do to make it completely tail
call optimized? I've tried to 'rotate' the call graph/tree but that
obviously wouldn't work (it's still a direct recursion, no matter if you do
it directly or in the combiner of a foldArguments). It seems that it is
almost completely TCO already, but I haven't found where it's leaking
stack. It's definitely leaking less stack than a simple recursion.
Or will we need a special looping combinator here? I initially tried to
create a while combinator, however it seems that guardWithTest does not
accept MethodHandles returning void (for the target and fallback).

If this can be made to work (so fully TCO'd, and maybe some massaging from
HotSpot engineers), then in theory it would allow for functional languages
a way to monomorphize HOFs.
For something like map (or fold) it will probably require a guard/PIC for
the implementations of that method, and those methods need to be compiled
in such a way so that they become bootstrap methods/factories (see the
constructor of MethodHandleFoldBuilder.java in the gist) which can then get
installed in the invocation of that fold.

Thank you for your time,
--Maarten Daalder

PS.

In case you haven't noticed, I'm not a statistics wizard, so the numbers
mentioned should be taken only as an early indication and not of any
statistical significance. YMWV.

Btw, if you were to implement 'map' with this then it would (if I noticed
it correctly) also reverse the list, or you're going to need to append each
element to the list, which will probably be slow. This is something I'm
going to try in the next few days or so.
___
mlvm-dev mailing list
mlvm-dev@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/mlvm-dev