On Sun, 2006-09-03 at 10:02 +0200, Jeroen Frijters wrote:
> Raif S. Naffah wrote:
> > the attached patch adds support for GNU MP in BigInteger
> > if/when configured.
>
> How/why is the native version better? Is it really worthwhile to
> complicate the code this way? Where are the benchmarks that prove the
> native code is faster?
Valid questions, indeed. I don't purport to have the answers to them, but IIRC Kaffe has (or formerly had) a GNU MP-based implementation which supposedly was faster. However, I wouldn't automatically assume that's still the case, given the large overhead of JNI calls. (GOTO [1] for a longer rambling about that.)

What I'd like to propose here is that in any case the choice of implementation should be strictly a build-time option, and the two implementations kept entirely separate. I'm not very happy about the alternatives: having two implementations in one class (as in the proposed patch), or moving the actual impl into yet another VM* class. (Indeed, I've been increasingly critical recently of that part. I still haven't worked out an exact solution, but suffice it to say that I'll be proposing some changes in that area soon.)

[1] Consider a typical usage: multiplying two 1024-bit numbers to generate a 2048-bit crypto key or such. Each number is 32 ints, and using ordinary long (schoolbook) multiplication, which is what we do currently, that means 32*32 = 1024 int multiplications (see the P.S. below for a rough sketch of that loop). That might sound like a lot, but a JNI call has an overhead of at least a few hundred instructions, so the native routine would have to be at least 20% or so faster than JIT-compiled code just to break even. Which may be the case, but not necessarily.

An example I have real numbers for is the NIO charset converters, a similar kind of job: fast individual operations performed in bulk quantities. I benchmarked the GNU iconv-based converter against the pure Java one on JamVM, which is just about the best possible scenario for the native code, since JamVM has excellent JNI speed and interpreted Java code. Even there the Java-based converters are still faster, due to the JNI overhead; the break-even point on my machine was about 2k characters. With a VM like Cacao the break-even point will be much higher (roughly twice the JNI overhead, and Java code that runs 5x or so faster).

Another drawback of native code IMHO (which Jeroen doesn't mention) is that it creates an undesirable situation where better VMs perform worse, since VMs in general can't optimize across native code (e.g. no method inlining). (Creating a VM that can inline methods from dynamically loaded native libraries is left as an exercise for the reader ;))

Anyway, enough digression. I hope I didn't sound harsh, Raif! :) If it's faster we should definitely have an implementation; my main point is just that I'd like to see the implementations kept separate. (Too bad there's no SPI for this. It brings me back to my old gripe that I really wish you could make constructors create subclasses.)

/Sven
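
P.S. For concreteness, here's roughly what the schoolbook loop from [1] looks like over 32-bit words. This is just an illustrative sketch, not the actual java.math.BigInteger code in Classpath; for two 1024-bit operands both loops run 32 times, which is where the 32*32 = 1024 int multiplications above come from.

  // Sketch of schoolbook multiplication of two little-endian int[]
  // magnitudes, treating each int as an unsigned 32-bit word.
  // Illustrative only, not the real Classpath implementation.
  static int[] multiply(int[] a, int[] b)
  {
    int[] result = new int[a.length + b.length];
    for (int i = 0; i < a.length; i++)
      {
        long ai = a[i] & 0xFFFFFFFFL;
        long carry = 0;
        for (int j = 0; j < b.length; j++)
          {
            // One word multiplication per inner iteration:
            // 32 * 32 = 1024 of them for two 1024-bit numbers.
            long t = ai * (b[j] & 0xFFFFFFFFL)
                     + (result[i + j] & 0xFFFFFFFFL) + carry;
            result[i + j] = (int) t;
            carry = t >>> 32;
          }
        result[i + b.length] = (int) carry;
      }
    return result;
  }

The point being that a native GMP call has to amortize its fixed JNI cost over a loop that small before it can win.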
