On Sun, 2006-09-03 at 10:02 +0200, Jeroen Frijters wrote:
> Raif S. Naffah wrote:
> > the attached patch adds support for GNU MP in BigInteger 
> > if/when configured.
> 
> How/why is the native version better? Is it really worthwhile to
> complicate the code this way? Where are the benchmarks that prove the
> native code is faster?

Valid questions, indeed. I don't purport to have the answers to them,
but IIRC, Kaffe has (or formerly had) a GNU MP-based implementation
which was supposedly faster. However, I wouldn't automatically assume
that to be the case, given the large overhead of JNI calls.
(GOTO [1] for a longer ramble about that)

What I'd like to propose here is that in any case, the choice of
implementation should be strictly a build-time option, and the
two implementations kept entirely separate.

I'm not very happy with either of the alternatives: having two
implementations in one class (as in the proposed patch), or moving the
actual implementation into yet another VM* class. (Indeed, I've been
increasingly critical of that part recently. I still haven't worked out
an exact solution, but suffice it to say that I'll be proposing some
changes in that area soon.)


[1] Consider a typical use: multiplying two 1024-bit numbers while
generating a 2048-bit crypto key or such. Each number is 32 ints, and
ordinary long multiplication (which is what we do currently) then means
32*32 = 1024 int multiplications. That might sound like a lot, but
consider that a JNI call has an overhead of at least a few hundred
instructions or so; the native routine would then have to be at least
20% faster than JIT-compiled code just to break even. Which may be
the case, but not necessarily.
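
(If anyone wants to put actual numbers on that, something as naive as
the sketch below would do for a first estimate. It only times
BigInteger.multiply() as it exists today, so comparing against a
GMP-backed build would mean running the same loop on both.)

  import java.math.BigInteger;
  import java.util.Random;

  public class MulBench
  {
    public static void main(String[] args)
    {
      Random r = new Random(42);
      BigInteger a = new BigInteger(1024, r);  // two random 1024-bit values
      BigInteger b = new BigInteger(1024, r);
      BigInteger sink = BigInteger.ZERO;

      // Warm-up pass so a JIT (if any) gets to compile multiply().
      for (int i = 0; i < 100000; i++)
        sink = sink.xor(a.multiply(b));

      long start = System.currentTimeMillis();
      for (int i = 0; i < 1000000; i++)
        sink = sink.xor(a.multiply(b));
      long ms = System.currentTimeMillis() - start;

      System.out.println("1M multiplies: " + ms + " ms (ignore: "
                         + sink.bitLength() + ")");
    }
  }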

An example I have real numbers for is the NIO charset converters (a
similar type of job: fast individual operations performed in bulk
quantities). I benchmarked the GNU iconv-based converter against the
pure Java one on JamVM, which is certainly the best possible scenario
for the native code, since JamVM has excellent JNI speed and interprets
its Java code.

Even in that case the Java-based converters are still faster for small
inputs, due to the JNI overhead; the break-even point on my machine was
about 2k characters. With a VM like Cacao that break-even will be much
higher (more than twice the JNI overhead, and Java code that runs 5x or
so faster).

Another drawback of native code IMHO (which Jeroen doesn't mention) is
that you also create an undesirable situation where better VMs perform
relatively worse, since VMs in general can't optimize native code (e.g.
through method inlining). (Creating a VM that can inline methods from
dynamically loaded native libraries is left as an exercise for the
reader ;))

Anyway, enough digression; I hope I didn't sound harsh, Raif! :)
If it's faster we should definitely have an implementation. My main
point is just that I'd like to see the implementations kept separate.

(Too bad there's no SPI for this. Which brings me back to my old gripe
that I really wish you could make constructors return subclasses.)
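
(To spell out the workaround I mean: a static factory can hand back
whichever implementation subclass was selected, which is exactly what a
constructor can never do. All names in this little sketch are made up.)

  public class MPInt
  {
    final int[] magnitude;

    MPInt(int[] magnitude) { this.magnitude = magnitude; }

    // `new MPInt(...)' can only ever yield an MPInt, but a factory
    // may return a subclass picked at build or run time.
    public static MPInt valueOf(int[] magnitude)
    {
      if (Boolean.getBoolean("gnu.math.usegmp"))  // imaginary flag
        return new GmpMPInt(magnitude);
      return new MPInt(magnitude);
    }
  }

  class GmpMPInt extends MPInt
  {
    GmpMPInt(int[] magnitude) { super(magnitude); }
    // GMP/JNI-backed operations would override the pure Java ones here.
  }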

/Sven

