On 4/29/08, Jochen Theodorou <[EMAIL PROTECTED]> wrote:
>
> Hi all,
>
> I wanted to collect a bit of data on how you avoid boxing in your
> language implementations. I am asking because Groovy currently makes
> lots of calls via reflection, and that means creating an Object[] for
> each call, containing boxed values of ints and all the other primitive
> types. Not only that... for 2+3 we actually box the numbers, make a
> dynamic method call to Integer.plus, which then unboxes the values,
> performs the plus operation and boxes the result. Usually we keep the
> boxed value, so there is no need to unbox again, but still... for a
> simple plus you need to do quite a lot. And even if the call were
> native... my measurements show that such code is 20-30 times slower,
> even with a direct method call.
>
> The best thing, of course, would be to call the method with primitive
> types... but keeping such type information isn't the easiest thing to
> do. Reflective method calls do not return primitives, and f(a)+g(b) might
> or might not be something where two ints are added. On the other hand...
> keeping the values on the stack isn't easy either. There are just too
> many primitive types to provide a path for each and every primitive
> type. Also longs and doubles take up two slots instead of one, making
> stack manipulation more difficult.
>
> I see an advantage here for interpreted languages, since they don't
> have to care about such things and can do whatever they need to do. And
> static languages usually know the resulting types of method calls, so
> they won't have these problems either, I guess...
>
> John Rose was talking about tuples... but I am not sure they can be used
> to resolve the general problem. What do others think?
Like Groovy, I use a MetaClass to implement the dynamic behaviour of a
class. MetaClasses are immutably associated with a Class but the
MetaClass itself is mutable. I support Monkey Patching by mutating the
MetaClass.
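As a minimal sketch of that arrangement (all names here are illustrative, not the actual Ng classes): the Class-to-MetaClass association is fixed once made, while the MetaClass itself is a mutable bag of behaviour, so Monkey Patching is simply mutation of the MetaClass.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

// Illustrative sketch only: MetaClass/SimpleMetaClass are hypothetical names.
public class MetaClassSketch {
    // A MetaClass holds the (mutable) dynamic behaviour of one Class.
    interface MetaClass {
        Object invoke(Object receiver, String name, Object... args);
        void define(String name, BiFunction<Object, Object[], Object> body);
    }

    static class SimpleMetaClass implements MetaClass {
        private final Map<String, BiFunction<Object, Object[], Object>> methods =
                new ConcurrentHashMap<>();
        public Object invoke(Object receiver, String name, Object... args) {
            BiFunction<Object, Object[], Object> body = methods.get(name);
            if (body == null) throw new UnsupportedOperationException(name);
            return body.apply(receiver, args);
        }
        public void define(String name, BiFunction<Object, Object[], Object> body) {
            methods.put(name, body);  // Monkey Patching = mutating the MetaClass
        }
    }

    // The Class -> MetaClass association is immutable once established.
    static final Map<Class<?>, MetaClass> REGISTRY = new ConcurrentHashMap<>();

    static MetaClass metaClassOf(Object o) {
        return REGISTRY.computeIfAbsent(o.getClass(), c -> new SimpleMetaClass());
    }

    public static void main(String[] args) {
        // add a method to all Strings by mutating the String MetaClass
        metaClassOf("").define("shout", (self, a) -> ((String) self).toUpperCase());
        System.out.println(metaClassOf("hello").invoke("hello", "shout"));
    }
}
```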
Like Groovy (kind of) I make calls to a dispatcher object (called
ThreadContext) which, in turn, makes calls to the MetaClass to actually
perform the operation/execute the method. This object is unlike the
Groovy approach in two respects: firstly, it is thread specific;
secondly, it knows absolutely nothing about the default semantics of
the language, it just orchestrates the call to the MetaClass. In Ng the
MetaClass has absolute control over the behaviour of the Class it
represents.
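The thread-specific part could be sketched as follows (names are hypothetical, not the actual Ng implementation): each thread owns its own ThreadContext, fetched from a ThreadLocal, so per-thread Monkey-Patching state needs no synchronisation.

```java
// Hypothetical sketch of per-thread dispatch state; ThreadContextHolder and
// ThreadContext are illustrative names, not the actual Ng classes.
public class ThreadContextHolder {
    public static class ThreadContext {
        // per-thread state, e.g. which Monkey Patches are active on this thread
        public final long ownerId = Thread.currentThread().getId();
    }

    private static final ThreadLocal<ThreadContext> CURRENT =
            ThreadLocal.withInitial(ThreadContext::new);

    // every call site does the equivalent of: tc = ThreadContextHolder.current()
    public static ThreadContext current() { return CURRENT.get(); }
}
```

Because each thread always gets the same ThreadContext back, thread-specific patch information can be consulted on every dispatch without locking.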
The Ng runtime system does not implement operators as method calls. My
view is that doing this throws away information which is useful in
improving the performance of the system. So a + b results in a call to
a method on ThreadContext which looks like tc.add().apply(a, b) (tc is
the instance of ThreadContext for the current thread). ThreadContext
will find the correct MetaClass for a and route the call to the
MetaClass, passing along some information (about thread-specific
Monkey Patching) which will help the MetaClass decide what to do.
The ThreadContext API for operators is *very* rich (there are about
250 methods which implement addition, for example). The reason for the
richness of the API is the combinatorial explosion caused by the need to
support all the combinations of the primitive arithmetic types plus
BigDecimal, BigInteger and Object. This is compounded by the fact that
there are methods which return primitive results.
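To make that explosion concrete, here is a hedged sketch of one slice of such a wide operator API (the apply/intApply names follow the message, but the bodies, and the overflow rule in particular, are my assumptions): each operand-type pairing gets its own overload, and each primitive result type gets its own "fast" flavour on top of that.

```java
// Illustrative sketch of a "wide, shallow" operator API. A real
// ThreadContext-style API would enumerate every operand-type pairing:
// (int,int), (int,long), (long,double), (BigInteger,Object), ... and, for
// each, both a boxing flavour and primitive-returning flavours -- hence the
// roughly 250 methods for addition alone.
public class AddSketch {
    static class NotPerformed extends Exception {}

    // "Slow" flavours: always succeed, return a boxed result.
    Object apply(int lhs, int rhs)       { return lhs + rhs; }
    Object apply(long lhs, long rhs)     { return lhs + rhs; }
    Object apply(double lhs, double rhs) { return lhs + rhs; }
    Object apply(Object lhs, Object rhs) {
        // simplified stand-in for the fully dynamic MetaClass dispatch
        return ((Number) lhs).doubleValue() + ((Number) rhs).doubleValue();
    }

    // "Fast" flavours: return an unboxed primitive, or refuse via NotPerformed
    // when the (possibly monkey-patched) semantics need another result type.
    int intApply(int lhs, int rhs) throws NotPerformed {
        long r = (long) lhs + rhs;
        if (r != (int) r) throw new NotPerformed();  // assumed rule: overflow refuses
        return (int) r;
    }
    long longApply(long lhs, long rhs) throws NotPerformed { return lhs + rhs; }
    double doubleApply(double lhs, double rhs)             { return lhs + rhs; }
}
```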
For example, we have the following two methods for add:
Object apply(int lhs, int rhs)
and
int intApply(int lhs, int rhs) throws NotPerformed
The first flavour is a "slow" implementation which, in the standard
implementation, returns a boxed int. The second is a "fast"
implementation which returns an unboxed int.
Now the compiler can never just use the "fast" implementation. The
user can, at any time, change the semantics of addition (for example,
to return a long if the operation overflows). If this happens the
"fast" implementation will throw the NotPerformed exception if it
wants to return a result which is not an int.
So the compiler generates code which "speculatively" executes the
"fast" calls and falls back to the "slow" calls if one of the "fast"
calls fails.
e.g.
int a, b, c;
...
a = a + b * c
generates the equivalent of
try {
    a = tc.add().intApply(a, tc.multiply().intApply(b, c));
} catch (NotPerformed e) {
    a = tc.convert().asInt(tc.add().apply(a, tc.multiply().apply(b, c)));
}
Obviously the implementations of intApply(), etc. must not have side
effects and calls to user methods must not be made in the try/catch
block.
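Putting the pieces together, here is a small, self-contained sketch of the speculative pattern (class names echo the message; the overflow-promotes-to-long behaviour is an assumed example of a user-visible semantics change, not necessarily what Ng does):

```java
// Runnable sketch of the speculative fast-path/slow-path dispatch pattern.
public class Speculate {
    static class NotPerformed extends Exception {}

    static class Add {
        // "fast" flavour: primitive in, primitive out, refuses via exception
        int intApply(int lhs, int rhs) throws NotPerformed {
            long r = (long) lhs + rhs;
            if (r != (int) r) throw new NotPerformed();  // result no longer fits an int
            return (int) r;
        }
        // "slow" flavour: boxed, fully dynamic; here it promotes on overflow
        Object apply(Object lhs, Object rhs) {
            long r = ((Number) lhs).longValue() + ((Number) rhs).longValue();
            return (r == (int) r) ? (Object) (int) r : (Object) r;
        }
    }

    static Object evaluate(int a, int b) {
        Add add = new Add();
        try {
            return add.intApply(a, b);  // fast path: unboxed arithmetic
        } catch (NotPerformed e) {
            return add.apply(a, b);     // slow path: boxed fallback
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2, 3));                  // fast path succeeds
        System.out.println(evaluate(Integer.MAX_VALUE, 1));  // falls back to slow path
    }
}
```

Note that, as in the generated code above, the fast flavour must be side-effect free so that re-executing the expression via the slow path is safe.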
My initial tests show that this leads to very good performance (the
"fast" path taking less than twice as long as the equivalent Java). The
fact that the path from the call site to the code performing the
operation is very short and very simple means that the JIT seems to be
able to work wonders. The "slow" path is no slouch either (taking less
than four times as long as Java).
This has convinced me that the approach of having a very wide but
shallow API (as opposed to Groovy's narrow but deep API) really has
something to offer in improving the performance of Dynamic languages
on the JVM. (I see that Alex Tkachman's work on improving Groovy
performance involves widening the API which tends to validate this
approach).
John Wilson
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "JVM
Languages" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at
http://groups.google.com/group/jvm-languages?hl=en
-~----------~----~----~----~------~----~------~--~---