This is described as being to help RSA, etc., but it will be very hard to use for that purpose without an add with carry. There are many ways to do the product, but a simple version of the core is like this:
for i=0 to s-1 C := 0 for j=0 to s-1 (C,S) := t[i+j] + a[j] * b[i] + C t[i+j] := S t[i+s] := C for i=0 to s-1 C := 0 m := t i *n' 0 mod W for j=0 to s-1 (C,S) := t[i+j] + m*n[j] + C t[i+j] := S ADD(t[i+s],C) ... the result is in the carry flag and t . The logic in the x86 version of SharedRuntime::montgomery_multiply uses a primitive which multiplies two longs and accumulates the result into a triple-length sum. x86 can do this in four instructions. I guess a primitive like this will fit nicely with value types, but I'm not sure how it's possible to do this with Java today. (My apologies: I'm sure you know this already, but I didn't think it was wise not to say anything.) Andrew. [Algorithm from http://koclab.cs.ucsb.edu/docs/koc/j37.pdf]